Summary: 0000769: HTTP interface is not responding after crawling for a while.
Description: I set my peer to crawl at approx. 10% of peak speed overnight (500 PPM).
Log files indicate it is still working.

I downloaded VisualVM and ran it, but I am not sure of the command to get a heap dump.
I took a few screenshots, a thread dump and 2 YaCy log files.
I also have a host file. Linked files: 6139421 bytes, 15157 bytes
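
On the heap dump question: besides the Heap Dump button on VisualVM's Monitor tab, the JDK ships a command-line tool, `jmap`, that can write a dump of a running JVM. The following is only a minimal sketch, assuming a full JDK (not just a JRE) is installed and the script runs as the same user as the YaCy process. The helper names, the PID and the output path are placeholders, not anything from this report.

```python
import subprocess


def heap_dump_command(pid, out_path, live_only=True):
    """Build the jmap command line for dumping the heap of a running JVM.

    pid and out_path are supplied by the caller; nothing here is YaCy-specific.
    """
    options = ["format=b", f"file={out_path}"]
    if live_only:
        # "live" asks jmap to dump only reachable objects (forces a full GC first).
        options.insert(0, "live")
    return ["jmap", "-dump:" + ",".join(options), str(pid)]


def dump_heap(pid, out_path):
    """Run jmap; raises CalledProcessError if jmap exits with an error."""
    subprocess.run(heap_dump_command(pid, out_path), check=True)


# Example (hypothetical PID and path):
#   dump_heap(12345, "/tmp/yacy-heap.hprof")
# which is equivalent to typing:
#   jmap -dump:live,format=b,file=/tmp/yacy-heap.hprof 12345
```

The resulting `.hprof` file can then be opened in VisualVM for analysis.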

This is a temporary site; it may run out or disappear in the future.
Screenshot from 2017-10-01 09-11-37.png.tar.gz (122,278 bytes) 2017-10-01 03:39
hs_err_pid24830.log (179,765 bytes) 2017-10-01 12:05
installing-an-easy-http-proxy-cache-polipo-805-lznv54.pdf (36,239 bytes) 2017-10-02 01:55

smokingwheels (reporter)
2017-09-30 03:24
edited on: 2017-09-30 06:24

I have noticed that after a forced kill (pkill), when starting again it opens the crawler stacks with 0 URLs.
Also, Pi config: -XX:+UseParNewGC -XX:ParallelGCThreads=2
Old PC: -XX:+UseParNewGC -XX:ParallelGCThreads=8 -d64

smokingwheels (reporter)
2017-10-01 03:31
edited on: 2017-10-01 03:41

I cloned and set up a fresh peer after running e4defrag and started to crawl a few sites; it was still crawling this morning. 500 PPM was the top crawl speed.
Using -XX:+UseParNewGC -XX:ParallelGCThreads=7 -d64
Why does the fresh install version show up as 1.921/9000?
All the other upgrades were 9388.
I did a heap dump and found the file (116304045 bytes).
I will try the other peer after the upgrade sometime today.
The screenshot is of JVisualVM: Screenshot from 2017-10-01 09-11-37.png.tar.gz

smokingwheels (reporter)
2017-10-01 12:04
edited on: 2017-10-01 12:06

I enabled the jsresort and it crashed my peer.
I did an emergency restore of /DATA/index/*.* to another new folder of a YaCy server because of disk performance issues in the particular folder I was using.
I tested with dd if=/dev/zero of=/mnt/md0/test bs=512k count=20
and got over 600 MB/s, but on a hot upgrade with YaCy stopped I was getting less than 1 MB/s transfer.
I modified GRUB, following this advice: "EDIT: to solve this, either remove enough RAM, or add "mem=8G" as kernel boot parameter (e.g. in /etc/default/grub on Ubuntu; don't forget to run update-grub!)" https://fhackts.wordpress.com/2014/03/10/very-slow-disk-write-performance-linux/
I have no idea really, but there is a hs_err_pid24830.log in the old yacy folder. #ShiftDel
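
A note on the dd test above: with bs=512k count=20 it writes only 20 × 512 KiB = 10 MiB and does not force the data to disk, so the 600 MB/s figure may mostly reflect the page cache rather than the array. A rough sketch of a sustained write test that calls fsync before stopping the clock is below. The function name, the default size and the file path are assumptions for illustration, and a real benchmark tool would be more rigorous.

```python
import os
import time


def measure_write_throughput(path, total_bytes=256 * 1024 * 1024,
                             block_size=512 * 1024):
    """Write zero-filled blocks to `path`, fsync, and return MB/s (10^6 bytes/s)."""
    block = b"\0" * block_size
    written = 0
    start = time.perf_counter()
    with open(path, "wb") as f:
        while written < total_bytes:
            f.write(block)
            written += block_size
        f.flush()
        # Force the kernel to push the data to the device before timing ends.
        os.fsync(f.fileno())
    elapsed = max(time.perf_counter() - start, 1e-9)
    return written / elapsed / 1e6


# Example (hypothetical path on the array under test):
#   print(measure_write_throughput("/mnt/md0/test"))
# Remove the test file afterwards.
```

With dd itself, adding conv=fdatasync and a larger count should give a comparable sustained figure.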

smokingwheels (reporter)
2017-10-02 02:00

I am running a trial Polipo Proxy DNS cache server to see if that helps.

I ran DNS Benchmark (https://www.grc.com/dns/benchmark.htm) with PlayOnLinux and rebuilt the custom resolvers list; this may help.

smokingwheels (reporter)
2017-10-03 15:38

Postprocessing Progress
busy:collecting 8087 documents from the collection for harvestkey null, partitioned by responsetime_i
Something about a long string.
I removed some HTML files later on.

luc (reporter)
2017-10-05 10:07

Hi smokingwheels, quite a lot of data to analyze here :)

In the log files you provided in yacylogs30am.tar.gz, apart from the traced OutOfMemory error, there is an interesting recurrent exception:

    at sun.nio.ch.SelectorImpl.keys(SelectorImpl.java:68)
    at org.eclipse.jetty.io.ManagedSelector.size(ManagedSelector.java:104)

There may be something to dig into here. This kind of error also seems to be mentioned in some other projects using Jetty as a web server...

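
To check how often such frames recur across a log file, a small script can tally the stack frames. This is only a sketch: it assumes the usual Java stack trace layout ("at package.Class.method(File.java:line)"), and the function names and the log path in the example are hypothetical.

```python
import re
from collections import Counter

# Matches Java stack frames such as
#   at sun.nio.ch.SelectorImpl.keys(SelectorImpl.java:68)
FRAME = re.compile(r"^\s*at\s+([\w.$<>]+)\(([^)]*)\)")


def count_frames(lines):
    """Count how often each fully qualified method appears in stack traces."""
    counts = Counter()
    for line in lines:
        match = FRAME.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def top_frames(log_path, n=10):
    """Return the n most frequent frames found in the given log file."""
    with open(log_path, encoding="utf-8", errors="replace") as log:
        return count_frames(log).most_common(n)


# Example (hypothetical path to an extracted log file):
#   for method, hits in top_frames("yacy00.log"):
#       print(hits, method)
```

A frame that dominates the counts, such as the SelectorImpl.keys one above, is a reasonable place to start digging.
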
smokingwheels (reporter)
2017-10-05 14:37
edited on: 2017-10-05 14:40

Hi Luc,
Don't worry too much; I think my file system permissions were set too high (775).
My Limit Crawler sits at 0 now.

Fixed now, but that's the sort of thing that can happen if you leave the door open nowadays.

If you could run the Jetty server thread in a controlled background process, that may help. Sort of like a silent 503, but keeping the caller online.

