YaCy-Bugtracker - YaCy
View Issue Details
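ID: 0000769
Project: YaCy
Category: [All Projects] General
View Status: public
Date Submitted: 2017-09-30 03:14
Last Update: 2020-07-22 15:03
Reporter: smokingwheels
Priority: normal
Severity: minor
Reproducibility: always
Status: new
Resolution: open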
none 
YaCy 1.9 
 
0000769: http interface is not responding after crawling for a while.
I set my peer to crawl at approx 10% of peak speed overnight (500 PPM).
Log files indicate it is still working.

I downloaded VisualVM and ran it, but I am not sure of the command to get a heap dump.
I took a few screenshots, a thread dump and 2 YaCy log files.
I also have a hosts file.
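
On the heap dump question: a minimal sketch, assuming a HotSpot-based JVM (the kind YaCy normally runs on). It uses the JDK's HotSpotDiagnosticMXBean to write an .hprof file. The class name HeapDumpSketch and the file name are made up for illustration, and the code dumps the heap of the JVM it runs in, so as written it does not reach into a separately running YaCy process. For an already running peer, the JDK's jmap tool (jmap -dump:live,format=b,file=heap.hprof <pid>) or the Heap Dump button in VisualVM should do the same job from outside.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.IOException;
import java.lang.management.ManagementFactory;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of YaCy.
public class HeapDumpSketch {

    /**
     * Writes a heap dump of the current JVM to the given file.
     * liveOnly = true dumps only objects that are still reachable.
     */
    public static Path dumpHeap(Path target, boolean liveOnly) throws IOException {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        // dumpHeap fails if the file already exists, so remove any old dump first.
        Files.deleteIfExists(target);
        bean.dumpHeap(target.toString(), liveOnly);
        return target;
    }

    public static void main(String[] args) throws IOException {
        // Assumed output name; recent JDKs require the .hprof extension.
        Path out = dumpHeap(Paths.get("heap-sketch.hprof"), true);
        System.out.println("Heap dump written to " + out.toAbsolutePath());
    }
}
```

The resulting .hprof file can be opened in VisualVM for analysis.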

http://45.77.100.168/yacylogs30am.tar.gz 6139421 bytes
http://45.77.100.168/hosts 15157 bytes

This is a temporary site; it may run out or disappear in the future.
No tags attached.
gz Screenshot from 2017-10-01 09-11-37.png.tar.gz (122,278) 2017-10-01 03:39
http://mantis.tokeek.de/file_download.php?file_id=299&type=bug
log hs_err_pid24830.log (179,765) 2017-10-01 12:05
http://mantis.tokeek.de/file_download.php?file_id=300&type=bug
pdf installing-an-easy-http-proxy-cache-polipo-805-lznv54.pdf (36,239) 2017-10-02 01:55
http://mantis.tokeek.de/file_download.php?file_id=301&type=bug
Issue History
2017-09-30 03:14 smokingwheels New Issue
2017-09-30 03:24 smokingwheels Note Added: 0001473
2017-09-30 06:24 smokingwheels Note Edited: 0001473 (bug_revision_view_page.php?bugnote_id=1473#r451)
2017-10-01 03:31 smokingwheels Note Added: 0001474
2017-10-01 03:32 smokingwheels Note Edited: 0001474 (bug_revision_view_page.php?bugnote_id=1474#r453)
2017-10-01 03:39 smokingwheels File Added: Screenshot from 2017-10-01 09-11-37.png.tar.gz
2017-10-01 03:41 smokingwheels Note Edited: 0001474 (bug_revision_view_page.php?bugnote_id=1474#r454)
2017-10-01 12:04 smokingwheels Note Added: 0001475
2017-10-01 12:05 smokingwheels File Added: hs_err_pid24830.log
2017-10-01 12:06 smokingwheels Note Edited: 0001475 (bug_revision_view_page.php?bugnote_id=1475#r456)
2017-10-02 01:55 smokingwheels File Added: installing-an-easy-http-proxy-cache-polipo-805-lznv54.pdf
2017-10-02 02:00 smokingwheels Note Added: 0001476
2017-10-03 15:38 smokingwheels Note Added: 0001479
2017-10-05 10:07 luc Note Added: 0001481
2017-10-05 14:37 smokingwheels Note Added: 0001482
2017-10-05 14:40 smokingwheels Note Edited: 0001482 (bug_revision_view_page.php?bugnote_id=1482#r462)

Notes
(0001473)
smokingwheels   
2017-09-30 03:24   
(edited on: 2017-09-30 06:24)
I have noticed that after a forced kill (pkill), YaCy starts again with the crawler stacks at 0 URLs.
Also, Pi config: -XX:+UseParNewGC -XX:ParallelGCThreads=2
Old PC: -XX:+UseParNewGC -XX:ParallelGCThreads=8 -d64

(0001474)
smokingwheels   
2017-10-01 03:31   
(edited on: 2017-10-01 03:41)
I cloned and set up a fresh peer after running e4defrag, then started to crawl a few sites; it was still crawling this morning. 500 PPM was the top crawl speed.
Using -XX:+UseParNewGC -XX:ParallelGCThreads=7 -d64
Why does the fresh install show version 1.921/9000?
All the other upgrades were 9388.
I did a heap dump and found the file.
http://45.77.100.168/heapdumps.tar.gz 116304045 bytes
I will try the other peer after the upgrade sometime today.
The screenshot is of VisualVM: Screenshot from 2017-10-01 09-11-37.png.tar.gz

(0001475)
smokingwheels   
2017-10-01 12:04   
(edited on: 2017-10-01 12:06)
I enabled the jsresort and it crashed my peer.
I did an emergency restore of /DATA/index/*.* to a new folder on a YaCy server, because of disk performance issues in the particular folder I was using.
I tested with dd if=/dev/zero of=/mnt/md0/test bs=512k count=20
and got over 600 MB/s, but during a hot upgrade with YaCy stopped I was getting less than 1 MB/s transfer.
I modified GRUB, following this advice: "EDIT: to solve this, either remove enough RAM, or add "mem=8G" as kernel boot parameter (e.g. in /etc/default/grub on Ubuntu; don't forget to run update-grub!)"
https://fhackts.wordpress.com/2014/03/10/very-slow-disk-write-performance-linux/
I have no idea really, but there is an hs_err_pid24830.log in the old YaCy folder. #ShiftDel

(0001476)
smokingwheels   
2017-10-02 02:00   
I am running a trial Polipo Proxy DNS cache server to see if that helps.

I ran DNS Benchmark (https://www.grc.com/dns/benchmark.htm) with PlayOnLinux and rebuilt the custom resolvers list; this may help.
(0001479)
smokingwheels   
2017-10-03 15:38   
Postprocessing Progress
busy:collecting 8087 documents from the collection for harvestkey null, partitioned by responsetime_i
http://45.77.100.168/heapdumps2.tar.gz
http://45.77.100.168/heapdumps3.tar.gz
http://45.77.100.168/heapdumps4.tar.gz
http://45.77.100.168/heapdumps5.tar.gz
http://45.77.100.168/heapdumps6.tar.gz
Something about long string.
I removed some html files later on.
(0001481)
luc   
2017-10-05 10:07   
Hi smokingwheels, quite a lot of data to analyze here :)

In the log files you provided in yacylogs30am.tar.gz, apart from the traced OutOfMemory error, there is an interesting recurring exception:

org.eclipse.jetty.server.AbstractConnector
java.nio.channels.ClosedSelectorException
    at sun.nio.ch.SelectorImpl.keys(SelectorImpl.java:68)
    at org.eclipse.jetty.io.ManagedSelector.size(ManagedSelector.java:104)
        ...

There may be something to dig into here. This kind of error also seems to be mentioned in some other projects using Jetty as a web server...
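
For reference, the top frame of that trace can be reproduced with plain JDK classes: calling keys() on a java.nio Selector that has already been closed throws ClosedSelectorException. The sketch below only shows that JDK-level behavior. The class name ClosedSelectorSketch is made up for illustration, and it says nothing about why the selector inside Jetty ended up closed on this peer.

```java
import java.io.IOException;
import java.nio.channels.ClosedSelectorException;
import java.nio.channels.Selector;

// Hypothetical demo class, not taken from YaCy or Jetty.
public class ClosedSelectorSketch {

    /** Returns true if keys() on a closed selector throws ClosedSelectorException. */
    public static boolean keysThrowsAfterClose() throws IOException {
        Selector selector = Selector.open();
        selector.close();
        try {
            // Same JDK call that appears at the top of the logged stack trace.
            selector.keys();
            return false;
        } catch (ClosedSelectorException e) {
            return true;
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("keys() throws after close: " + keysThrowsAfterClose());
    }
}
```

So the exception by itself only says that something queried the selector's size after it had been shut down; whether that is a cause or a side effect of the unresponsive interface would still need to be checked.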
(0001482)
smokingwheels   
2017-10-05 14:37   
(edited on: 2017-10-05 14:40)
Hi Luc,
Don't worry too much; I think my file system permissions were set too high (775).
My Limit Crawler sits at 0 now.

Fixed now, but that's the sort of thing that can happen if you leave the door open nowadays.

If you could run the Jetty server thread in a controlled background process, that may help. Sort of like a silent 503, but keeping the caller online.
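
One way to read the "silent 503" idea is a small admission gate in front of the request handlers: when every slot is busy, the caller gets an immediate 503 instead of waiting on a stalled server. The sketch below is only an illustration of that idea. BusyGate is a made-up class, not a YaCy or Jetty API, and the status codes are plain ints rather than a real HTTP response.

```java
import java.util.concurrent.Semaphore;

// Hypothetical admission gate; not part of YaCy or Jetty.
public class BusyGate {
    private final Semaphore slots;

    public BusyGate(int maxConcurrent) {
        this.slots = new Semaphore(maxConcurrent);
    }

    /**
     * Runs the handler if a slot is free and returns 200.
     * If all slots are taken, returns 503 right away without blocking.
     */
    public int handle(Runnable handler) {
        if (!slots.tryAcquire()) {
            return 503; // busy: answer immediately, keep the caller connected
        }
        try {
            handler.run();
            return 200;
        } finally {
            slots.release(); // free the slot even if the handler throws
        }
    }

    public static void main(String[] args) {
        BusyGate gate = new BusyGate(1);
        int[] inner = new int[1];
        // A second request arriving while the first is still being handled.
        int outer = gate.handle(() -> inner[0] = gate.handle(() -> { }));
        System.out.println("outer=" + outer + " inner=" + inner[0]);
    }
}
```

Whether something like this would have helped here is an open question, since it only covers overload and not a crashed or closed connector.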