YaCy-Bugtracker

ID: 0000083
Project: YaCy
Category: [All Projects] General
View Status: public
Date Submitted: 2011-12-02 00:34
Last Update: 2013-10-25 15:58
Reporter: lolcatis
Assigned To: Orbiter
Priority: immediate
Severity: crash
Reproducibility: sometimes
Status: resolved
Resolution: fixed
ETA: none
Platform: Linux
OS: Mint
OS Version: 11 amd64
Product Version:
Target Version:
Fixed in Version:
Summary: 0000083: Yacy crashes after a few searches; log shows java.lang.OutOfMemoryError
Description:
*) After a few searches, yacy hangs while executing a search. The search request (made via the web interface) never finishes.
*) After a while, the log shows java.lang.OutOfMemoryError.
*) CPU usage rises to a steady 100% (apparently on one core only).
*) yacy has to be killed and restarted.

I marked this for immediate resolution, as yacy is not fit for daily use if it hangs permanently.
Steps To Reproduce:
1.) Search a few times.
2.) At some point yacy hangs and no longer responds to any commands in the web interface.
Additional Information:
Error from the log file:

.
.
.
.
I 2011/12/02 00:20:24 YACY SEARCH failed, Peer: 4ie0hmIJRmwR:_anonufe-11042661-116 (Client can't execute: Connect to 186.105.69.142:8090 timed out)
I 2011/12/02 00:20:24 YACY SEARCH failed, Peer: sqzwkW6ZEi13:_anonufe-59124102-34 (Client can't execute: Read timed out)
I 2011/12/02 00:20:24 YACY SEARCH failed, Peer: cc_ychbFh_iu:Machiventa_Melchizedek (Client can't execute: Connect to 216.166.10.225:8090 timed out)
I 2011/12/02 00:20:24 YACY SEARCH failed, Peer: YE8D4hntQiAg:Plest (Client can't execute: Connect to 85.141.96.46:8090 timed out)
I 2011/12/02 00:20:24 YACY SEARCH failed, Peer: egH5jnD3QxV0:_anonw-14711442-62 (Client can't execute: Connect to 85.99.57.189:8090 timed out)
E 2011/12/02 00:20:24 UNCAUGHT-EXCEPTION Thread yacySearch_ICSY-TU-KAISERSLAUTERN: Java heap space
java.lang.OutOfMemoryError: Java heap space
        at java.util.Arrays.copyOf(Arrays.java:2882)
        at java.lang.StringCoding.encode(StringCoding.java:277)
        at java.lang.String.getBytes(String.java:969)
        at net.yacy.cora.document.UTF8.getBytes(UTF8.java:154)
        at net.yacy.peers.Protocol$SearchResult.<init>(Protocol.java:742)
        at net.yacy.peers.Protocol.search(Protocol.java:473)
        at net.yacy.peers.RemoteSearch.run(RemoteSearch.java:117)

java.lang.OutOfMemoryError: Java heap space
        at java.util.Arrays.copyOf(Arrays.java:2882)
        at java.lang.StringCoding.encode(StringCoding.java:277)
        at java.lang.String.getBytes(String.java:969)
        at net.yacy.cora.document.UTF8.getBytes(UTF8.java:154)
        at net.yacy.peers.Protocol$SearchResult.<init>(Protocol.java:742)
        at net.yacy.peers.Protocol.search(Protocol.java:473)
        at net.yacy.peers.RemoteSearch.run(RemoteSearch.java:117)
I 2011/12/02 00:20:21 YACY SEARCH failed, Peer: ujwC5004fHs-:yalxccy (Client can't execute: Connect to 78.105.243.209:8090 timed out)
I 2011/12/02 00:20:21 YACY SEARCH failed, Peer: IUUv10wV-qAK:GabisMac (Client can't execute: Connect to 92.228.230.200:8080 timed out)
I 2011/12/02 00:20:21 YACY SEARCH failed, Peer: IQ_kCkIGbe9T:_anonufe-54469255-107 (Client can't execute: Connect to 218.22.21.23:8090 timed out)
I 2011/12/02 00:20:21 YACY SEARCH failed, Peer: 4kKPq0Ih027z:_anonw-54361596-67 (Client can't execute: Connect to 77.95.61.157:8090 timed out)
.
.
.
.
.
.
Tags: No tags attached.
Attached Files:
yacy00.log (688,545 bytes) 2011-12-02 00:52
yacy00.log_marco_peereboom (334,184 bytes) 2012-02-11 17:13
yacy00.log_marco_peereboom_2_1 (1,048,675 bytes) 2012-02-11 17:39
yacy00.log_marco_peereboom_2_2 (988,419 bytes) 2012-02-11 17:40

-  Notes
(0000158)
lolcatis (reporter)
2011-12-02 00:40
edited on: 2011-12-02 00:56

This probably is related to issue 0000080

Edit: This issue also appeared without the OutOfMemoryError shown above. I added the log file from a run where I freshly started yacy and did some searches until it hung. After a few minutes I killed yacy with kill -9 (as it was no longer reacting to a plain kill).

Used yacy_v1.0_20111127_8121.tar.gz.

(0000159)
sixcooler (developer)
2011-12-02 03:19

please have a look at http://forum.yacy-websuche.de/viewtopic.php?f=5&t=3411&p=23550#p23550
(0000228)
Quix0r (updater)
2012-01-06 08:37

A quick (slightly OT) note on this: if your peer still reports OutOfMemoryError while it tries to start a thread, you should increase the maximum number of allowed processes in limits.conf (nproc soft/hard).
(0000303)
marco_peereboom (reporter)
2012-02-11 17:12
edited on: 2012-02-11 18:08

I hate to be a "me too" but, me too!

I have a beefy enough machine and yacy isn't complaining about memory or other resources; however, if I search for multiple words and repeat that while other searches are still ongoing, yacy hangs, pegging a CPU. I see the hangs even when yacy has plenty of resources available. I attached a log and pasted some details, and I am capable and willing to do additional work on this. Just let me know what else would be of use.

Version is: yacy_v1.01_20111207_9000.tar.gz

My ulimit is as follows:
$ ulimit -a
time(cpu-seconds) unlimited
file(blocks) unlimited
coredump(blocks) unlimited
data(kbytes) 8388608
stack(kbytes) 32768
lockedmem(kbytes) 4056476
memory(kbytes) 4056476
nofiles(descriptors) 16384
processes 1024

top output:
load averages: 1.06, 0.91, 0.56 yacy.conformal.com 10:09:24
34 processes: 32 idle, 2 on processor
CPU0 states: 0.0% user, 0.0% nice, 0.0% system, 0.0% interrupt, 100% idle
CPU1 states: 0.0% user, 0.0% nice, 0.0% system, 0.0% interrupt, 100% idle
CPU2 states: 99.9% user, 0.0% nice, 0.1% system, 0.0% interrupt, 0.0% idle
CPU3 states: 0.1% user, 0.0% nice, 0.1% system, 0.0% interrupt, 99.7% idle
Memory: Real: 2160M/2916M act/tot Free: 1048M Cache: 547M Swap: 0K/8189M

  PID USERNAME PRI NICE SIZE RES STATE WAIT TIME CPU COMMAND
 8470 marco 64 0 3456M 2149M onproc/2 - 84:16 99.02% java

(0000304)
marco_peereboom (reporter)
2012-02-11 17:39

I can confirm that this still happens with the following version (took a little longer to reproduce):
S 2012/02/11 10:18:55 STARTUP YaCy version: 1.01/9233
S 2012/02/11 10:18:55 STARTUP Java version: 1.7.0

This time I'll attach 2 logs that should show the entire startup + hang process.
(0000326)
marco_peereboom (reporter)
2012-02-24 03:29

Anything else I can do to help this along?
(0000327)
Orbiter (manager)
2012-02-24 03:49

Well, I guess this must be a memory leak. I can confirm that I see similar effects, but never when the peer runs in Robinson mode. I therefore believe it must be connected to the p2p processes that happen during the search. I still don't know how to narrow down the problem, so all ideas are welcome.
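
One way to narrow down a suspected leak is to log heap usage after each search and watch whether the baseline keeps rising. The following is only a minimal sketch under stated assumptions: HeapSampler is a hypothetical helper, not part of YaCy, and where it would be called from in the search code is left open. It uses only the standard java.lang.management API.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

/** Hypothetical helper, not part of YaCy: samples heap usage so growth across searches becomes visible. */
final class HeapSampler {
    private static final MemoryMXBean MEMORY = ManagementFactory.getMemoryMXBean();

    /** Returns the heap bytes in use after requesting a GC, so short-lived garbage is less likely to be counted. */
    static long usedHeapAfterGc() {
        MEMORY.gc(); // equivalent to System.gc(): only a request, the JVM may ignore it
        MemoryUsage heap = MEMORY.getHeapMemoryUsage();
        return heap.getUsed();
    }

    /** Formats one log line; the label names the step, e.g. "after search 17". */
    static String sample(String label) {
        long usedMb = usedHeapAfterGc() / (1024 * 1024);
        long maxBytes = MEMORY.getHeapMemoryUsage().getMax(); // -1 if the maximum is undefined
        String max = maxBytes < 0 ? "unknown" : (maxBytes / (1024 * 1024)) + " MB";
        return label + ": heap used " + usedMb + " MB of " + max;
    }

    public static void main(String[] args) {
        System.out.println(sample("baseline"));
        byte[][] retained = new byte[16][];
        for (int i = 0; i < retained.length; i++) {
            retained[i] = new byte[1024 * 1024]; // simulates data that a search forgot to release
            System.out.println(sample("after fake search " + (i + 1)));
        }
        System.out.println("retained blocks: " + retained.length);
    }
}
```

If the reported value climbs steadily over many searches and never drops back, that points to retained objects rather than ordinary garbage; a heap dump taken at that point would then show what is holding them.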
(0000328)
marco_peereboom (reporter)
2012-02-24 03:57

I don't see the out of memory errors. In fact yacy usually runs in like 300M and has 2.5G still available. This seems to be a deadlock of sorts. Some threads hang and some continue; for example I see some crawling still going but all website stuff halts. Got a debug thing I can throw at it?

Oh, I am not running in robinson mode, I run it in peer-to-peer.

I can add another full log of this hanging, up until I kill it. Would that be of any help?
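
For a hang like this, a thread dump usually says more than the log. From outside the process, the JDK's jstack tool prints one for a given PID. From inside the JVM, the standard ThreadMXBean can report lock deadlocks and the thread that has used the most CPU time, which helps tell a true deadlock apart from a busy loop pegging one core. The following is a minimal sketch: HangDiagnostics is a hypothetical helper, not part of YaCy, and it assumes it can be run inside the affected JVM.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical diagnostic helper, not part of YaCy. */
final class HangDiagnostics {
    private static final ThreadMXBean THREADS = ManagementFactory.getThreadMXBean();

    /** Describes every thread the JVM reports as deadlocked; the list is empty if there is none. */
    static List<String> deadlockedThreads() {
        List<String> result = new ArrayList<>();
        long[] ids = THREADS.findDeadlockedThreads(); // null when no deadlock is detected
        if (ids == null) {
            return result;
        }
        for (ThreadInfo info : THREADS.getThreadInfo(ids)) {
            if (info != null) { // a null entry means the thread has already exited
                result.add(info.getThreadName() + " blocked on " + info.getLockName());
            }
        }
        return result;
    }

    /** Returns name, state and top stack frames of the live thread with the most accumulated CPU time. */
    static String busiestThread() {
        if (!THREADS.isThreadCpuTimeSupported() || !THREADS.isThreadCpuTimeEnabled()) {
            return "per-thread CPU time is not available on this JVM";
        }
        long busiestId = -1;
        long busiestNanos = -1;
        for (long id : THREADS.getAllThreadIds()) {
            long nanos = THREADS.getThreadCpuTime(id); // -1 if the thread is no longer alive
            if (nanos > busiestNanos) {
                busiestNanos = nanos;
                busiestId = id;
            }
        }
        ThreadInfo info = busiestId < 0 ? null : THREADS.getThreadInfo(busiestId, 10);
        if (info == null) {
            return "no live thread found";
        }
        StringBuilder report = new StringBuilder(info.getThreadName())
                .append(" [").append(info.getThreadState()).append("] cpu=")
                .append(busiestNanos / 1_000_000).append(" ms");
        for (StackTraceElement frame : info.getStackTrace()) {
            report.append(System.lineSeparator()).append("    at ").append(frame);
        }
        return report.toString();
    }

    public static void main(String[] args) {
        System.out.println("deadlocked: " + deadlockedThreads());
        System.out.println("busiest: " + busiestThread());
    }
}
```

An empty deadlock list combined with one RUNNABLE thread that keeps accumulating CPU time would suggest a spinning loop rather than a lock cycle, which matches the single pegged core in the top output.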
(0000333)
laima (reporter)
2012-02-25 21:09
edited on: 2012-02-25 21:10

I run YaCy on Ubuntu Linux 11.10, with OpenJDK 7 installed.

The crawling and the peer-to-peer interaction (I don't use Robinson mode) seem to trundle along fine. When I attempt to search the web through the YaCy interface, I get one search, then nothing. It hangs. Is it more stable on Windows? Is the problem OpenJDK?

May I just say thank you for this great program. P2P search should out-wit the big search companies in the end.

(0000334)
laima (reporter)
2012-02-26 13:33

My problem vanished when I installed 1.02/9000. I think I had been using 1.01/9-something.
(0000578)
Quix0r (updater)
2013-07-27 13:36
edited on: 2013-07-27 13:47

I have it back with latest GIT:
W 2013/07/27 13:24:17 StackTrace unable to create new native thread
java.lang.OutOfMemoryError: unable to create new native thread
    at java.lang.Thread.start0(Native Method)
    at java.lang.Thread.start(Thread.java:693)
    at net.yacy.crawler.robots.RobotsTxt.ensureExist(RobotsTxt.java:239)
    at net.yacy.crawler.Balancer.push(Balancer.java:297)
    at net.yacy.crawler.data.NoticedURL.push(NoticedURL.java:176)
    at net.yacy.crawler.CrawlStacker.stackCrawl(CrawlStacker.java:358)
    at net.yacy.crawler.CrawlStacker.job(CrawlStacker.java:148)
    at sun.reflect.GeneratedMethodAccessor13.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at net.yacy.kelondro.workflow.InstantBlockingThread.job(InstantBlockingThread.java:99)
    at net.yacy.kelondro.workflow.AbstractBlockingThread.run(AbstractBlockingThread.java:78)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:724)

Is it really necessary to start a thread for every robots.txt? Wouldn't a FIFO queue be enough?

(0000582)
Orbiter (manager)
2013-07-27 16:33

In the third week of August I will rewrite the crawler; in that context I will also consider a different strategy for loading robots.txt.
(0000583)
Quix0r (updater)
2013-07-27 17:13

Good to hear that. Well, my OOME happened in a different place than the reporter's, so it may be better to track this one in a new issue ticket.

My idea here is (rudimentary):
- One FIFO-based "worker" thread handles all robots.txt requests.
- Other threads can queue new entries and are called back once the file has been loaded or found in the cache.
- The robots.txt can be cached (e.g. in RAM and/or in the existing API table: RAM for users with a lot of RAM, on disk for low-memory systems).
- A cached robots.txt can be revalidated by sending a HEAD request to the server (won't hurt much) with an "If-Modified-Since" (!!!) header that carries the date of the last fetch.
- Other threads must then wait (or continue with other crawler entries) until the robots.txt worker thread has finished the download.
- robots.txt files that are already downloaded and up to date (the check is necessary, as these files can change too) can be returned to the crawler thread immediately.
- A changed robots.txt must be flagged so that other threads (including cleanup/crawler) become aware of newly disallowed or allowed entries.
- If, e.g., a new disallowed entry has been discovered, it must be checked and the corresponding URLs must be removed from the index.
- As this step costs a lot of I/O (on big indexes), the affected URLs could instead be "remembered" somehow so that they are neither shown in local results nor returned to remote searches.
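
The first three points of the list above can be sketched roughly as follows. This is an illustration only, not YaCy code: RobotsTxtQueue, its loader function and the host-keyed cache are hypothetical names, the cache is RAM-only, and revalidation and index cleanup are left out.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/** Hypothetical sketch, not YaCy code: one FIFO worker thread loads every robots.txt. */
final class RobotsTxtQueue implements AutoCloseable {
    // A single-thread executor runs its queued tasks in submission order (FIFO).
    private final ExecutorService worker = Executors.newSingleThreadExecutor();
    // host -> pending or finished load; this map doubles as the RAM cache.
    private final Map<String, CompletableFuture<String>> cache = new ConcurrentHashMap<>();
    // Fetches the robots.txt body for a host; supplied by the caller (an assumption of this sketch).
    private final Function<String, String> loader;

    RobotsTxtQueue(Function<String, String> loader) {
        this.loader = loader;
    }

    /** Queues a load for the host unless one is cached or pending; callers attach callbacks to the future. */
    CompletableFuture<String> request(String host) {
        return cache.computeIfAbsent(host,
                h -> CompletableFuture.supplyAsync(() -> loader.apply(h), worker));
    }

    /** Drops the cached entry, e.g. after a revalidation showed that the file changed. */
    void invalidate(String host) {
        cache.remove(host);
    }

    @Override
    public void close() {
        worker.shutdown(); // already queued loads still run, new ones are rejected
    }

    public static void main(String[] args) {
        AtomicInteger loads = new AtomicInteger();
        try (RobotsTxtQueue queue = new RobotsTxtQueue(host -> {
            loads.incrementAndGet();
            return "User-agent: *\nDisallow: /private/";
        })) {
            // Callback style: the crawler thread does not block while the worker loads the file.
            queue.request("example.org")
                    .thenAccept(body -> System.out.println("loaded " + body.length() + " chars"));
            // A second request for the same host reuses the cached future.
            queue.request("example.org").join();
            System.out.println("loader calls: " + loads.get());
        }
    }
}
```

However many crawler threads ask, only one extra thread exists, which avoids the "unable to create new native thread" failure mode. Note that in this sketch a failed load stays cached as a failed future until invalidate is called.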

Please feel free to ask questions about this complex (but really good) idea.
(0000584)
Quix0r (updater)
2013-07-27 17:59

$ ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 63943
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1024
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 1024
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited
(0000628)
g4jc (reporter)
2013-10-24 00:49

Any updates on this issue? I experienced the bug as well. If needed I can try to gain support for memory leak finding.
(0000631)
Shnatsel (reporter)
2013-10-24 12:35

I'm experiencing this bug as well in version 1.4/9000
(0000633)
g4jc (reporter)
2013-10-25 01:23

Hi everyone.
I've made a new meta-bug to address this issue for the next release. There are several bugs all saying the same thing: we NEED to fix the memory issues.
http://bugs.yacy.net/view.php?id=305

This issue is a big deal for me, so I'm willing to pay USD 100.00 for it.
This offer is registered on FreedomSponsors (http://www.freedomsponsors.org/core/issue/369/perfomance-issues-fix-memory-leaks-minimize-memory-usage-and-optimize-stability-for-next-release).
If you solve it (according to the acceptance criteria described there), please register on FreedomSponsors and mark it as resolved there.
I'll then check it out and gladly pay up!

Oh, and if anyone else also wants to throw in a few bucks on this, you should check out FreedomSponsors!
(0000634)
Orbiter (manager)
2013-10-25 15:58

The specific bug was not reproducible, but it was indeed reproducible that a very heavy search load (tested with a script) causes OOMs and deadlocks. This was provoked (at least) by a mixture of too-late clearing of temporary data during search, inefficient methods during the temporary storage time, and an overall memory leak which was fixed yesterday.
After extensive testing, the following commit seems to fix the problem:
https://gitorious.org/yacy/rc1/commit/9bb7eab389ff997a42180b4273726b341aa80312
Please re-test.
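
The script used for that load test is not part of this report. For re-testing, a comparable burst of concurrent searches could be generated along the following lines. Treat this as a sketch under assumptions: SearchLoad is a hypothetical class, and the base URL (port 8090, as seen in the log above) and the yacysearch.html?query= path are assumptions about the peer under test that should be checked against the actual installation.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

/** Hypothetical load generator, not the script mentioned in the note above. */
final class SearchLoad {
    /** Builds the search URL; path and parameter name are assumptions about the peer under test. */
    static URI searchUri(String base, String query) {
        return URI.create(base + "/yacysearch.html?query="
                + URLEncoder.encode(query, StandardCharsets.UTF_8));
    }

    /** Sends all queries concurrently and returns how many came back with HTTP 200. */
    static long run(HttpClient client, String base, List<String> queries) {
        List<CompletableFuture<Integer>> pending = new ArrayList<>();
        for (String query : queries) {
            HttpRequest request = HttpRequest.newBuilder(searchUri(base, query))
                    .timeout(Duration.ofSeconds(30)) // a hung peer shows up as a timeout, not a stuck client
                    .GET()
                    .build();
            pending.add(client.sendAsync(request, HttpResponse.BodyHandlers.discarding())
                    .thenApply(HttpResponse::statusCode)
                    .exceptionally(error -> -1)); // connection errors and timeouts count as failures
        }
        return pending.stream().map(CompletableFuture::join).filter(status -> status == 200).count();
    }

    public static void main(String[] args) {
        List<String> queries = List.of("yacy", "peer to peer", "search engine");
        long ok = run(HttpClient.newHttpClient(), "http://localhost:8090", queries);
        System.out.println(ok + " of " + queries.size() + " searches returned HTTP 200");
    }
}
```

Running this repeatedly with a longer query list while watching heap usage and CPU should show whether the hang still occurs after the fix.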

- Issue History
Date Modified Username Field Change
2011-12-02 00:34 lolcatis New Issue
2011-12-02 00:40 lolcatis Note Added: 0000158
2011-12-02 00:52 lolcatis File Added: yacy00.log
2011-12-02 00:52 lolcatis Note Edited: 0000158
2011-12-02 00:56 lolcatis Note Edited: 0000158
2011-12-02 03:19 sixcooler Note Added: 0000159
2012-01-06 08:37 Quix0r Note Added: 0000228
2012-02-11 17:12 marco_peereboom Note Added: 0000303
2012-02-11 17:13 marco_peereboom File Added: yacy00.log_marco_peereboom
2012-02-11 17:39 marco_peereboom Note Added: 0000304
2012-02-11 17:39 marco_peereboom File Added: yacy00.log_marco_peereboom_2_1
2012-02-11 17:40 marco_peereboom File Added: yacy00.log_marco_peereboom_2_2
2012-02-11 18:08 marco_peereboom Note Edited: 0000303
2012-02-24 03:29 marco_peereboom Note Added: 0000326
2012-02-24 03:49 Orbiter Note Added: 0000327
2012-02-24 03:57 marco_peereboom Note Added: 0000328
2012-02-25 21:09 laima Note Added: 0000333
2012-02-25 21:10 laima Note Edited: 0000333
2012-02-26 13:33 laima Note Added: 0000334
2013-07-27 13:36 Quix0r Note Added: 0000578
2013-07-27 13:47 Quix0r Note Edited: 0000578
2013-07-27 16:33 Orbiter Note Added: 0000582
2013-07-27 17:13 Quix0r Note Added: 0000583
2013-07-27 17:59 Quix0r Note Added: 0000584
2013-10-24 00:49 g4jc Note Added: 0000628
2013-10-24 12:35 Shnatsel Note Added: 0000631
2013-10-25 01:23 g4jc Note Added: 0000633
2013-10-25 15:58 Orbiter Note Added: 0000634
2013-10-25 15:58 Orbiter Status new => resolved
2013-10-25 15:58 Orbiter Resolution open => fixed
2013-10-25 15:58 Orbiter Assigned To => Orbiter

