YaCy-Bugtracker

ID: 0000761
Project: YaCy
Category: [All Projects] General
View Status: public
Date Submitted: 2017-07-02 23:51
Last Update: 2017-07-05 19:28
Reporter: smokingwheels
Assigned To:
Priority: normal
Severity: crash
Reproducibility: sometimes
Status: new
Resolution: open
ETA: none
Platform: Linux Low End machine
OS: Ubuntu 16.04
OS Version:
Product Version: YaCy 1.9
Target Version:
Fixed in Version:
Summary: 0000761: Remote crawler causes JVM GC overhead limit exceeded
Description: Sites with some or a lot of .js seem to cause this error.
After the crash, pkill java is unable to terminate Java; a system reboot is needed.

I tried a few times and reduced the speed of remote indexing, but it did not make much difference.

I have had this problem in the past but did not look into it.
Additional Information:
W 2017/07/03 04:38:08 ConcurrentLog java.io.IOException: org.apache.solr.common.SolrException: Exception writing document id 0gtV3CeMd4oB to the index; possible analysis error.
java.io.IOException: org.apache.solr.common.SolrException: Exception writing document id 0gtV3CeMd4oB to the index; possible analysis error.
    at net.yacy.cora.federate.solr.connector.SolrServerConnector.add(SolrServerConnector.java:212)
    at net.yacy.cora.federate.solr.connector.MirrorSolrConnector.add(MirrorSolrConnector.java:171)
    at net.yacy.search.index.Fulltext.putDocument(Fulltext.java:279)
    at net.yacy.search.index.Segment.putDocument(Segment.java:467)
    at net.yacy.search.index.Segment.storeDocument(Segment.java:510)
    at net.yacy.search.Switchboard.storeDocumentIndex(Switchboard.java:2601)
    at net.yacy.search.Switchboard.storeDocumentIndex(Switchboard.java:2551)
    at sun.reflect.GeneratedMethodAccessor17.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at net.yacy.kelondro.workflow.InstantBlockingThread.job(InstantBlockingThread.java:56)
    at net.yacy.kelondro.workflow.AbstractBlockingThread.run(AbstractBlockingThread.java:44)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.solr.common.SolrException: Exception writing document id 0gtV3CeMd4oB to the index; possible analysis error.
    at org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:206)
    at org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:67)
    at org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:55)
    at org.apache.solr.update.processor.DistributedUpdateProcessor.doLocalAdd(DistributedUpdateProcessor.java:979)
    at org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(DistributedUpdateProcessor.java:1192)
    at org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:748)
    at org.apache.solr.update.processor.LogUpdateProcessorFactory$LogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:103)
    at org.apache.solr.handler.loader.XMLLoader.processUpdate(XMLLoader.java:261)
    at org.apache.solr.handler.loader.XMLLoader.load(XMLLoader.java:188)
    at org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:97)
    at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:68)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:173)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:2477)
    at org.apache.solr.client.solrj.embedded.EmbeddedSolrServer.request(EmbeddedSolrServer.java:179)
    at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:160)
    at org.apache.solr.client.solrj.SolrClient.add(SolrClient.java:173)
    at org.apache.solr.client.solrj.SolrClient.add(SolrClient.java:190)
    at net.yacy.cora.federate.solr.connector.SolrServerConnector.add(SolrServerConnector.java:209)
    ... 16 more
Caused by: org.apache.lucene.store.AlreadyClosedException: this IndexWriter is closed
    at org.apache.lucene.index.IndexWriter.ensureOpen(IndexWriter.java:749)
    at org.apache.lucene.index.IndexWriter.ensureOpen(IndexWriter.java:763)
    at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1567)
    at org.apache.solr.update.DirectUpdateHandler2.updateDocument(DirectUpdateHandler2.java:924)
    at org.apache.solr.update.DirectUpdateHandler2.updateDocOrDocValues(DirectUpdateHandler2.java:913)
    at org.apache.solr.update.DirectUpdateHandler2.doNormalUpdate(DirectUpdateHandler2.java:302)
    at org.apache.solr.update.DirectUpdateHandler2.addDoc0(DirectUpdateHandler2.java:239)
    at org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:194)
    ... 33 more
Caused by: java.lang.OutOfMemoryError: GC overhead limit exceeded
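
In a nested trace like the one above, the last "Caused by" entry is the underlying failure, which here is the OutOfMemoryError. A minimal Python sketch for pulling out just the exception headers, assuming the trace is available as plain text; the function name `cause_chain` is made up for illustration and is not part of YaCy:

```python
def cause_chain(trace_text):
    """Return the exception headers of a Java stack trace, outermost first."""
    chain = []
    for raw in trace_text.splitlines():
        line = raw.strip()
        # Skip blank lines, stack frames ("at ...") and elision markers ("... 16 more").
        if not line or line.startswith("at ") or line.startswith("..."):
            continue
        if line.startswith("Caused by: "):
            chain.append(line[len("Caused by: "):])
        elif not chain:
            # The first non-frame line is the outermost exception (or the log line carrying it).
            chain.append(line)
    return chain
```

The last element of the returned list is the root cause to look at first.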
Tags: No tags attached.
Attached Files:
Screenshot from 2017-07-03 06-00-08.png (171,912 bytes) 2017-07-03 00:03
Threads.txt.tar.gz (34,946 bytes) 2017-07-04 10:03
crashhostdump.txt.tar.gz (902,073 bytes) 2017-07-05 07:19

-  Notes
(0001458)
smokingwheels (reporter)
2017-07-03 00:05
edited on: 2017-07-03 01:59

Screenshot from 2017-07-03 06-00-08.png
Remote crawler speed spikes; I am unsure why.
Related to http://mantis.tokeek.de/view.php?id=656
YaCy will start but is swapping on my PC.

(0001459)
luc (reporter)
2017-07-03 15:08

If you can launch your peer again with the JVM option "-XX:+HeapDumpOnOutOfMemoryError" enabled (this generates a heap memory dump when an OutOfMemoryError is thrown), this could greatly help the analysis.

The generated hprof files can be quite large, so you may not want to share them directly, but instead open them with JVisualVM and check some key points:
 - "Threads at the heap dump"
 - Find "N" biggest objects by retained size
 - Check the "References" to the biggest objects in the "Instances" view
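
As a rough illustration of the first step, the flag can be appended to an existing Java argument string. This is only a sketch: the helper name `with_heap_dump_flags` and any dump path passed to it are hypothetical, and where YaCy reads its Java arguments from depends on the start script in use. `-XX:HeapDumpPath` is the HotSpot option that controls where the hprof file is written.

```python
def with_heap_dump_flags(java_args, dump_path=None):
    """Return java_args with the heap-dump-on-OOM flag (and optional dump path) added once."""
    flags = java_args.split()
    wanted = ["-XX:+HeapDumpOnOutOfMemoryError"]
    if dump_path is not None:
        # Assumption: the caller picks a location with enough free disk space for the dump.
        wanted.append("-XX:HeapDumpPath=" + dump_path)
    for flag in wanted:
        if flag not in flags:
            flags.append(flag)
    return " ".join(flags)
```

Calling it twice on the same string leaves the arguments unchanged, so it can be applied to a config value without checking first.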
(0001460)
smokingwheels (reporter)
2017-07-03 15:08
edited on: 2017-07-03 23:29

"MEMORY performed necessary GC, freed 138355 KB (requested/available/average: 125000 / 261476 / 195498 KB)" is showing up in the logs now.
I now run two parallel GC threads.
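
To follow these MEMORY entries over time, the numbers can be pulled out of the log with a regular expression. A minimal Python sketch, assuming the log lines have the shape quoted in this report; the names `MEMORY_RE` and `parse_memory_line` are made up for illustration:

```python
import re

# Matches the message part of a MEMORY log line as quoted in this report, e.g.
# "performed necessary GC, freed 138355 KB (requested/available/average: 125000 / 261476 / 195498 KB)"
MEMORY_RE = re.compile(
    r"performed (?P<kind>\w+) GC, freed (?P<freed>\d+) KB "
    r"\(requested/available/average: "
    r"(?P<requested>\d+) / (?P<available>\d+) / (?P<average>\d+) KB\)"
)

def parse_memory_line(line):
    """Return the GC kind and the KB figures from one MEMORY log line, or None if it does not match."""
    m = MEMORY_RE.search(line)
    if m is None:
        return None
    fields = m.groupdict()
    kind = fields.pop("kind")
    result = {name: int(value) for name, value in fields.items()}
    result["kind"] = kind
    return result
```

Feeding every log line through `parse_memory_line` and keeping the non-None results gives a series of freed and available KB values that can be compared before and after a settings change.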

#get javastart args
JAVA_ARGS=" -server -Djava.awt.headless=true -Dfile.encoding=UTF-8 -Djsse.enableSNIExtension=false -XX:+UseParNewGC -XX:ParallelGCThreads=2 -d64 ";

If you use a 32-bit system you must remove -d64.

I also added a space after the opening quote in JAVA_ARGS=" -server. The Java JDK must be installed to use this option.

My system has 4 GB of RAM in total and I am trying these settings:
javastart_Xmx=Xmx1920m
javastart_Xms=Xms384m

One little Tweet.
W 2017/07/03 21:19:08 IODispatcher Could not add merge job to queue: Queue full

-XX:ParallelGCThreads=4
Seems to be OK (it was 2); it did not crash during 5.5 hours of crawling.

(0001461)
smokingwheels (reporter)
2017-07-04 10:07

Threads.txt.tar.gz
Just a bunch of thread dumps from June to now.

Peers with IPv6 only cause a problem for my IPv4 peers; I am not too sure what to do about it.
(0001462)
smokingwheels (reporter)
2017-07-04 14:11

OK, the dumps are around 2 GB.

It gets stuck on this:

I 2017/07/04 19:16:20 HeapReader * close HeapFile citation.index.20170703130353782.blob; trace: net.yacy.kelondro.blob.HeapModifier.close(HeapModifier.java:50) -> net.yacy.kelondro.blob.HeapModifier.close(HeapModifier.java:54) -> net.yacy.kelondro.blob.HeapModifier.finalize(HeapModifier.java:58) -> java.lang.System$2.invokeFinalize(System.java:1270) -> java.lang.ref.Finalizer.runFinalizer(Finalizer.java:98) -> java.lang.ref.Finalizer.access$100(Finalizer.java:34) -> java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:210)
I 2017/07/04 19:17:19 MEMORY * performed explicit GC, freed 1595366 KB (requested/available/average: 93851 / 1664166 / 350964 KB)

I will have a look at the dumps with QB64 and see what I find, going on what you said. They are too big to transfer anywhere.
(0001463)
smokingwheels (reporter)
2017-07-04 21:34
edited on: 2017-07-04 22:04

I tried various options, but on an old PC. One program tells me the dump files are no good.

Short video of Java 8 VisualVM running; I was not sure which buttons to push, so I tried a few.

https://youtu.be/OQIx_XxU9v8

I started YaCy, then VisualVM.

(0001464)
smokingwheels (reporter)
2017-07-05 07:32

crashhostdump.txt.tar.gz
It is a very small part of an XML dump from my peer. I increased the JVM memory and was able to start YaCy long enough to get a 3 GB XML dump file.

If you use the search function in a word processor, look for the string zzzz; this marks the end of each XML record defined by each of the <doc><str name codes.

The last number indicates the length of one input line. Some are very long. I have no idea what I am looking at, so I will leave that for someone else. Values/text are over 500 000 in length in some cases.
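
Lines of that size can also be found without a word processor. A minimal Python sketch, assuming the dump is a text file read line by line; the name `long_lines` and the default threshold of 500 000 characters (taken from the figure above) are illustrative choices:

```python
def long_lines(lines, threshold=500_000):
    """Yield (line_number, length) for every line longer than threshold characters."""
    for number, line in enumerate(lines, start=1):
        # Do not count the trailing newline as part of the line.
        length = len(line.rstrip("\n"))
        if length > threshold:
            yield number, length
```

Passing an open file object (for example one opened with `errors="replace"` so that odd bytes do not stop the scan) reads the dump one line at a time, so a 3 GB file does not have to fit in memory.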
(0001466)
smokingwheels (reporter)
2017-07-05 19:28

https://youtu.be/AvRUntG9j30
This is with 50 GC threads and lower memory settings, but on a new install. I tried an upgrade on the crashing server and could not reduce the memory settings.
Currently running Xmx = 1024, Xms = 512.

With 4 GC threads, I noticed that Survivor 0 and 1 had timing gaps on the trace.

I know 50 may be overkill. I have tried 16 and upwards.

The average load of the system is higher when crawling, but the actual Java load is normal and my system still responds OK.

- Issue History
Date Modified Username Field Change
2017-07-02 23:51 smokingwheels New Issue
2017-07-03 00:03 smokingwheels File Added: Screenshot from 2017-07-03 06-00-08.png
2017-07-03 00:05 smokingwheels Note Added: 0001458
2017-07-03 01:59 smokingwheels Note Edited: 0001458
2017-07-03 15:08 luc Note Added: 0001459
2017-07-03 15:08 smokingwheels Note Added: 0001460
2017-07-03 15:10 smokingwheels Note Edited: 0001460
2017-07-03 15:20 smokingwheels Note Edited: 0001460
2017-07-03 23:29 smokingwheels Note Edited: 0001460
2017-07-04 10:03 smokingwheels File Added: Threads.txt.tar.gz
2017-07-04 10:07 smokingwheels Note Added: 0001461
2017-07-04 14:11 smokingwheels Note Added: 0001462
2017-07-04 21:34 smokingwheels Note Added: 0001463
2017-07-04 22:04 smokingwheels Note Edited: 0001463
2017-07-05 07:19 smokingwheels File Added: crashhostdump.txt.tar.gz
2017-07-05 07:32 smokingwheels Note Added: 0001464
2017-07-05 19:28 smokingwheels Note Added: 0001466

