YaCy-Bugtracker - YaCy
View Issue Details
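ID: 0000376
Project: YaCy
Category: [All Projects] General
View Status: public
Date Submitted: 2014-02-22 05:46
Last Update: 2014-02-22 05:46
Reporter: korvin
Priority: normal
Severity: crash
Reproducibility: always
Status: new
Resolution: open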
none 
Platform: Linux
OS: Kubuntu
OS Version: 12.04
Product Version: YaCy 1.6
 
0000376: OutOfMemory during indexing of ru.wikipedia.org on a large database of 60 GB
YaCy version
Heap memory was set to 2048 MB.
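
To confirm that a heap setting like the one above is actually in effect, the running JVM can be asked for its own limits. The following is a minimal sketch using only the standard `java.lang.management` API; the class name `HeapCheck` and its methods are hypothetical helpers for illustration and are not part of YaCy.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

/** Hypothetical helper, not part of YaCy: reports the heap limits of the running JVM. */
final class HeapCheck {
    private static final long MIB = 1024L * 1024L;

    /** Converts a byte count to whole mebibytes (rounded down). */
    static long toMiB(long bytes) {
        return bytes / MIB;
    }

    /** Returns a one-line summary of the current heap usage and limits. */
    static String heapSummary() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        // getMax() returns -1 when the maximum heap size is undefined.
        String max = heap.getMax() < 0 ? "undefined" : toMiB(heap.getMax()) + " MiB";
        return "heap used=" + toMiB(heap.getUsed()) + " MiB, committed="
                + toMiB(heap.getCommitted()) + " MiB, max=" + max;
    }

    public static void main(String[] args) {
        System.out.println(heapSummary());
    }
}
```

When started with `-Xmx2048m`, the reported maximum may come out somewhat below 2048 MiB depending on the garbage collector in use, so a slightly lower figure does not necessarily mean the setting was ignored.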

At the time, the database was fairly large: 60 GB, with around 3 million documents and approximately 100 million edges in the webgraph. Everything had worked fine until then, and the crawler was in the middle of indexing ru.wikipedia.org when the error occurred. The local crawler queue held around 500,000 entries.

At that point, uptime was 2 days. Many crawls had been set up, but the Wikipedia crawl was the only one active during those days.

E 2014/02/22 02:53:43 BLOCKINGTHREAD Internal Error in serverInstantThread.job: null
E 2014/02/22 02:53:43 BLOCKINGTHREAD shutting down thread 'java.lang.reflect.Method.storeDocumentIndex.7'
W 2014/02/22 02:53:43 ConcurrentLog null
java.lang.reflect.InvocationTargetException
    at sun.reflect.GeneratedMethodAccessor24.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at net.yacy.kelondro.workflow.InstantBlockingThread.job(InstantBlockingThread.java:99)
    at net.yacy.kelondro.workflow.AbstractBlockingThread.run(AbstractBlockingThread.java:78)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:744)
Caused by: java.lang.OutOfMemoryError: GC overhead limit exceeded
    at java.util.Arrays.copyOfRange(Arrays.java:2694)
    at java.lang.String.<init>(String.java:203)
    at java.lang.StringBuilder.toString(StringBuilder.java:405)
    at net.yacy.search.schema.WebgraphConfiguration.getEdge(WebgraphConfiguration.java:158)
    at net.yacy.search.schema.WebgraphConfiguration.addEdges(WebgraphConfiguration.java:125)
    at net.yacy.search.schema.CollectionConfiguration.yacy2solr(CollectionConfiguration.java:849)
    at net.yacy.search.index.Segment.storeDocument(Segment.java:661)
    at net.yacy.search.Switchboard.storeDocumentIndex(Switchboard.java:2826)
    at net.yacy.search.Switchboard.storeDocumentIndex(Switchboard.java:2767)
    ... 10 more
W 2014/02/22 02:53:43 ConcurrentLog GC overhead limit exceeded
java.lang.OutOfMemoryError: GC overhead limit exceeded
    at java.util.Arrays.copyOfRange(Arrays.java:2694)
    at java.lang.String.<init>(String.java:203)
    at java.lang.StringBuilder.toString(StringBuilder.java:405)
    at net.yacy.search.schema.WebgraphConfiguration.getEdge(WebgraphConfiguration.java:158)
    at net.yacy.search.schema.WebgraphConfiguration.addEdges(WebgraphConfiguration.java:125)
    at net.yacy.search.schema.CollectionConfiguration.yacy2solr(CollectionConfiguration.java:849)
    at net.yacy.search.index.Segment.storeDocument(Segment.java:661)
    at net.yacy.search.Switchboard.storeDocumentIndex(Switchboard.java:2826)
    at net.yacy.search.Switchboard.storeDocumentIndex(Switchboard.java:2767)
    at sun.reflect.GeneratedMethodAccessor24.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at net.yacy.kelondro.workflow.InstantBlockingThread.job(InstantBlockingThread.java:99)
    at net.yacy.kelondro.workflow.AbstractBlockingThread.run(AbstractBlockingThread.java:78)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:744)
W 2014/02/22 02:53:43 ConcurrentLog GC overhead limit exceeded
java.lang.OutOfMemoryError: GC overhead limit exceeded
    at java.util.Arrays.copyOfRange(Arrays.java:2694)
    at java.lang.String.<init>(String.java:203)
    at java.lang.StringBuilder.toString(StringBuilder.java:405)
    at net.yacy.search.schema.WebgraphConfiguration.getEdge(WebgraphConfiguration.java:158)
    at net.yacy.search.schema.WebgraphConfiguration.addEdges(WebgraphConfiguration.java:125)
    at net.yacy.search.schema.CollectionConfiguration.yacy2solr(CollectionConfiguration.java:849)
    at net.yacy.search.index.Segment.storeDocument(Segment.java:661)
    at net.yacy.search.Switchboard.storeDocumentIndex(Switchboard.java:2826)
    at net.yacy.search.Switchboard.storeDocumentIndex(Switchboard.java:2767)
    at sun.reflect.GeneratedMethodAccessor24.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at net.yacy.kelondro.workflow.InstantBlockingThread.job(InstantBlockingThread.java:99)
    at net.yacy.kelondro.workflow.AbstractBlockingThread.run(AbstractBlockingThread.java:78)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:744)
E 2014/02/22 02:53:43 BLOCKINGTHREAD Runtime Error in serverInstantThread.job, thread 'java.lang.reflect.Method.storeDocumentIndex.7': null

Start a crawl and wait until the heap fills up.
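
While waiting for the error to reproduce, it may help to watch how much time the JVM spends in garbage collection. As far as the HotSpot documentation describes it, "GC overhead limit exceeded" is raised when roughly more than 98% of the time goes to garbage collection while less than 2% of the heap is recovered. The sketch below estimates the GC share of a sampling interval using the standard `GarbageCollectorMXBean` API; the class `GcOverheadSampler` and its method names are assumptions made for illustration, not YaCy code, and the one-second interval is arbitrary.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

/** Hypothetical helper, not part of YaCy: estimates the share of time spent in GC. */
final class GcOverheadSampler {

    /** Sums the accumulated collection time (in ms) over all collectors of this JVM. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 if the collector does not report it
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    /** Returns the GC share of an interval, clamped to the range [0, 1]. */
    static double gcFraction(long gcDeltaMs, long wallDeltaMs) {
        if (wallDeltaMs <= 0) {
            return 0.0;
        }
        double fraction = (double) gcDeltaMs / wallDeltaMs;
        return Math.max(0.0, Math.min(1.0, fraction));
    }

    public static void main(String[] args) throws InterruptedException {
        long gcBefore = totalGcMillis();
        long wallBefore = System.nanoTime();
        Thread.sleep(1000); // sampling interval, chosen arbitrarily
        long gcDelta = totalGcMillis() - gcBefore;
        long wallDelta = (System.nanoTime() - wallBefore) / 1_000_000L;
        System.out.printf("GC share of last interval: %.1f%%%n",
                100.0 * gcFraction(gcDelta, wallDelta));
    }
}
```

The reported collection times are approximate, so the result is only a rough indicator. A share that stays close to 100% over several intervals would be consistent with the error in the log above.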
No tags attached.
Attached file: yacy04.log (1,048,588), 2014-02-22 05:46
http://mantis.tokeek.de/file_download.php?file_id=106&type=bug
Issue History
2014-02-22 05:46 | korvin | New Issue
2014-02-22 05:46 | korvin | File Added: yacy04.log

There are no notes attached to this issue.