0000751: Thread Dumps for anyone's PDF's screen dumps and what was happening at the time.
2017-06-07 20:08
Windows Linux MAC
A sort of torture-test thread record for anyone to use and add to.

E.g. the mouse movement becomes laggy or slow to respond.

yacy-Debug.tar.gz (1,132,522) 2017-06-07 20:16
YaCy 'anonufe-11011755-15'_ Crawler 600 mb cloud.pdf (217,731) 2017-06-07 20:40
YaCy 'webportal'_ Console Status result of hosts file.pdf (141,988) 2017-06-07 22:41
Debug 06-08-2017 8-35.tar.gz (1,979,731) 2017-06-08 03:49
Debug Threads 9-07--53 slow crawling.tar.gz (2,591,553) 2017-06-08 03:52
YaCy 'anonufe-11011755-15'_ Console Status crawler very slow9-07-53.pdf (115,044) 2017-06-08 03:58
Debug 06-08-17 10-06-42.tar.gz (3,461,256) 2017-06-08 16:28
Threads.txt.tar.gz (10,345) 2017-06-12 04:15
Yacy_java_memory.ods (17,664) 2017-06-12 23:40
Yacy_java_calculator.png (176,507) 2017-06-12 23:45

Quad-cloud-and-home-using-trial-memory-settings.tar.gz (403,556) 2017-06-13 11:45
Local and cloud debug threads.
A suffix like -18 in a file name corresponds to renice -18 -p 1234, e.g. on the java process.
Note that the YaCy code is modified, though.

2017-06-07 20:41
YaCy 'anonufe-11011755-15'_ Crawler 600 mb cloud.pdf
1 CPU 1 GB

Crawler has paused not enough memory. pending: collection=191059

YaCy version: 1.92/9231
Uptime: 2 days 03:04
Java version: 1.8.0_131
Processors: 1
Load: 1.19
Threads: 251/20, peak:526, total:55345

Was crawling at approx 300 PPM
YaCy 'webportal'_ Console Status result of hosts file.pdf (141,988 bytes) 2017-06-07 22:41

http://winhelp2002.mvps.org/HOSTS.HTM There are many other sites to choose from.

Blocked threads occur, but not so often.
2017-06-08 03:51
Debug 06-08-2017 8-35.tar.gz

Crawling webportal running ok
20 dumps
2017-06-08 03:57   
(edited on: 2017-06-08 04:51)
Debug Threads 9-07--53 slow crawling.tar.gz (2,591,553 bytes) 2017-06-08 03:52

Noticed the crawler speed dropped to 0-500 PPM.
Started logger recording.
20 dumps.
Shut down YaCy.
190 k in collection.

YaCy 'anonufe-11011755-15'_ Console Status crawler very slow9-07-53.pdf (115,044 bytes) 2017-06-08 03:58
Status page during the event.

Wow great job!

Not easy to analyze but very valuable.

The "Debug Threads 9-07--53 slow crawling.tar.gz" traces seem to show clearly that some threads are blocking each other on crawler cache operations, likely all waiting for this one:
"Thread= ArrayStack.DELETE_EXECUTOR_pool-1-thread-3821 id=31967 TIMED_WAITING
at java.io.RandomAccessFile.readFully(RandomAccessFile.java:436)".

I guess this won't be easy to solve. Not sure for now why that read operation on a heap file could be blocking so long...
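
The thread headers quoted in this issue follow a "Thread= name id=N STATE" pattern. A minimal Python sketch for tallying thread states in one dump, assuming every thread header in the attached traces uses that same pattern (the function names here are hypothetical and not part of YaCy):

```python
import re
from collections import Counter

# Assumed header shape, taken from the lines quoted in this issue:
#   Thread= <name> id=<number> <STATE>
# Leading quotes or other punctuation are tolerated.
THREAD_LINE = re.compile(
    r'^\W*Thread= (?P<name>.+?) id=(?P<id>\d+) (?P<state>[A-Z_]+)\s*$'
)


def parse_threads(dump_text):
    """Return a list of (name, id, state) tuples found in a dump."""
    threads = []
    for line in dump_text.splitlines():
        match = THREAD_LINE.match(line.strip())
        if match:
            threads.append(
                (match.group("name"), int(match.group("id")), match.group("state"))
            )
    return threads


def count_thread_states(dump_text):
    """Count how many threads are in each state (BLOCKED, TIMED_WAITING, ...)."""
    return Counter(state for _, _, state in parse_threads(dump_text))


if __name__ == "__main__":
    import sys

    # Usage: python count_states.py Threads.txt
    with open(sys.argv[1], encoding="utf-8", errors="replace") as handle:
        for state, count in count_thread_states(handle.read()).most_common():
            print(state, count)
```

Running this over each of the 20 dumps in an archive would show whether the number of BLOCKED or TIMED_WAITING threads grows while the crawler slows down.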
(edited on: 2017-06-08 18:27)
Does this link work? It's a file access log recorded while crawling.

https://www.dropbox.com/s/5vi8lzqoee1vwty/disk.txt.tar.gz?dl=0

2017-06-08 16:31   
(edited on: 2017-06-08 16:42)
Debug 06-08-17 10-06-42.tar.gz

Taken while crawling slowly, before I terminated it.

Webportal is running OK in the cloud, but that is on a different class of hardware.

The particular DNS servers you use may also affect it, because my router was hanging after 12 hours or so of crawling.

I guess the Dropbox link should work, but personally I don't want to install Dropbox on my computer... Anyway, you have already provided plenty to analyze, which should inspire some improvements to try.
Thanks for the feedback @Luc, that is good news to hear.
It took 4-5 hours to work out that a TAB key on my Linux PC needs 100 ms per code and a double release code to make it work, then send 10 at a time...

I have just upgraded my home setup to 1.921/9236.
I guess I will keep my cloud at the version I created this post with, while testing the new version from home.

Just a minor issue: the crawler briefly slowed down.
I was using the mouse to refresh the output.

Thread= Finalizer daemon id=3 BLOCKED
at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:209)

Thread= Java2D Disposer daemon id=124 BLOCKED
at java.lang.Thread.run(Thread.java:748)
I have room in my Dropbox to store the modified version. I am thinking along the lines of trying eight $5.00 modules in the next experiment.

I think I can generate a share link that does not show up in the public folder. I will look into that first.
2017-06-10 01:17   
(edited on: 2017-06-10 12:49)
I ran out of Java heap space (HTTP 500) after crawling for a while.
1.921/9236 600 MB

From memory, I think the startup config javastart_Xms=Xms is normally about 130 MB.
I have found around 200 MB is more stable, but it depends on the memory available on the particular system, e.g. a Raspberry Pi or a $5.00 cloud instance.
On Linux, if Java can shell out to a Bash prompt and read the result back, it could adjust this semi-automatically as a one-time configuration of a peer. https://www.cyberciti.biz/faq/ram-size-linux/
You may already have the figure in the system somewhere (I think I looked up the Java shell command back in version 1.69X).
$ less /proc/meminfo > memory.txt
$ cat /proc/meminfo > memory.txt
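
The memory.txt produced by the commands above can be read back to get the installed RAM. A minimal Python sketch, assuming the usual Linux /proc/meminfo layout of "Key: value kB" lines (the function names are hypothetical, and no heap size is derived here, since the right figure depends on the system):

```python
def parse_meminfo(text):
    """Parse /proc/meminfo text into a dict of integer values.

    Most entries are reported in kB; counters without a unit
    (e.g. HugePages_Total) are stored as plain integers.
    """
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Key: value" line
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def total_ram_mb(text):
    """Return MemTotal converted from kB to whole MB."""
    return parse_meminfo(text)["MemTotal"] // 1024


if __name__ == "__main__":
    # memory.txt is the file written by: cat /proc/meminfo > memory.txt
    with open("memory.txt", encoding="utf-8") as handle:
        print("Total RAM (MB):", total_ram_mb(handle.read()))
```

The reported total could then be compared by hand against the Xms/Xmx values being trialled on each peer.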

I think the startup Linux nice setting was 0 and is now 19. I will look into that.

In my new cloud install the Java nice setting is 0. https://twitter.com/smokingwheels/status/873488505099370497

After restart, nice is 10.

After increasing JVM, nice is 19.
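
To compare nice values between restarts without reading them off a process monitor, a short Python sketch can query them directly. This assumes a Unix system and that the PID of the java process is looked up separately and passed in (the function name is hypothetical):

```python
import os
import sys


def nice_of(pid=0):
    """Return the nice value of a process; pid 0 means the calling process."""
    return os.getpriority(os.PRIO_PROCESS, pid)


if __name__ == "__main__":
    # Usage: python nice_of.py 1234   (1234 being the java PID, as in renice -p 1234)
    target = int(sys.argv[1]) if len(sys.argv) > 1 else 0
    print("nice:", nice_of(target))
```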

Just installed a spare PC, will try:
ssh address-of-B 'wget -O - http://server-C/whatever' >> whatever
I will also use it for backups.
I guess, in the rush or with the chances I take, I did not start a new copy after the usual few mods.

It's great: my internet has very little bandwidth left with the default 6000 crawl; I was setting it to over 8000 before.

My little keyboard is now doing a few speed tests. If you would like to see 1 or 2 I can post them, just let me know.
Uploading on my ADSL2 connection takes 2 hours. I could install the desktop version of Ubuntu in my cloud, because there is room for 2 custom ISOs, and run this code modifier remotely.
https://youtu.be/3hOkcVbE0yQ

At the end of this month I will be hit by 10% GST from my cloud provider.
HTTP Error 408 Request timeout
I got this when checking a new site to crawl in the browser.

Is there any way to drop a site from the crawler queue if this happens too often, say 3 times?
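
To make the request concrete, here is a minimal Python sketch of the idea: count timeouts per host and flag the host for removal once a limit is reached. This is purely illustrative and not how YaCy's crawler is implemented; the class name, method names and the limit of 3 are assumptions taken from the question above.

```python
from collections import defaultdict
from urllib.parse import urlsplit


class TimeoutTracker:
    """Hypothetical helper: tracks HTTP 408 / timeout events per host."""

    def __init__(self, max_timeouts=3):
        self.max_timeouts = max_timeouts
        self._counts = defaultdict(int)

    def record_timeout(self, url):
        """Register one timeout; return True once the host should be dropped."""
        host = urlsplit(url).hostname
        self._counts[host] += 1
        return self._counts[host] >= self.max_timeouts

    def record_success(self, url):
        """A successful fetch resets the counter for that host."""
        self._counts.pop(urlsplit(url).hostname, None)


if __name__ == "__main__":
    tracker = TimeoutTracker()
    for attempt in range(3):
        drop = tracker.record_timeout("http://example.org/page%d" % attempt)
        print("attempt", attempt + 1, "drop host:", drop)
```

Counting per host rather than per URL means one slow page does not matter, but a site that keeps timing out gets removed as a whole.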
2017-06-12 04:19   
(edited on: 2017-06-12 21:03)
Threads.txt.tar.gz (10,345 bytes) 2017-06-12 04:15

at net.yacy.kelondro.index.RAMIndex.has(RAMIndex.java:137) [assert (key != null);]
Could be %3Bamp% in the URL, not too sure.

I added it to the blacklist and the error seems to have gone.

(edited on: 2017-06-12 23:49)

I made a Java memory calculator based on the default settings. I am testing it at the moment on a 1 GB P4 doing a crawl on ADSL2.

It is just a quick reference guide to try for non-Linux users. The Java settings are all in increments of e.g. 1024, 256, 128, etc.

Contains 2 pics of memory usage; the cloud has 8 GB.
The GC seems to be working differently. Really too early to tell, though.
Improved by @luc:
https://github.com/yacy/yacy_search_server/commit/a7394b479b4a2ecb9bd0c9696355e5d1008b8742