YaCy-Bugtracker - YaCy
View Issue Details
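ID: 0000751
Project: YaCy
Category: Wishlist - Wunschliste
View Status: public
Date Submitted: 2017-06-07 20:08
Last Update: 2017-06-15 22:59
Reporter: smokingwheels
Assigned To: administrator
Priority: none
Severity: text
Reproducibility: N/A
Status: resolved
Resolution: fixed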
none 
Windows Linux MAC
 
 
0000751: Thread dumps, PDF screen dumps, and notes on what was happening at the time, for anyone to use.
A sort of torture-test thread record that anyone can use and add to.

E.g. the mouse movement becomes laggy or slow to respond.

  
No tags attached.
gz yacy-Debug.tar.gz (1,132,522) 2017-06-07 20:16
http://mantis.tokeek.de/file_download.php?file_id=272&type=bug
pdf YaCy 'anonufe-11011755-15'_ Crawler 600 mb cloud.pdf (217,731) 2017-06-07 20:40
http://mantis.tokeek.de/file_download.php?file_id=274&type=bug
pdf YaCy 'webportal'_ Console Status result of hosts file.pdf (141,988) 2017-06-07 22:41
http://mantis.tokeek.de/file_download.php?file_id=275&type=bug
gz Debug 06-08-2017 8-35.tar.gz (1,979,731) 2017-06-08 03:49
http://mantis.tokeek.de/file_download.php?file_id=276&type=bug
gz Debug Threads 9-07--53 slow crawling.tar.gz (2,591,553) 2017-06-08 03:52
http://mantis.tokeek.de/file_download.php?file_id=277&type=bug
pdf YaCy 'anonufe-11011755-15'_ Console Status crawler very slow9-07-53.pdf (115,044) 2017-06-08 03:58
http://mantis.tokeek.de/file_download.php?file_id=278&type=bug
gz Debug 06-08-17 10-06-42.tar.gz (3,461,256) 2017-06-08 16:28
http://mantis.tokeek.de/file_download.php?file_id=279&type=bug
gz Threads.txt.tar.gz (10,345) 2017-06-12 04:15
http://mantis.tokeek.de/file_download.php?file_id=283&type=bug
ods Yacy_java_memory.ods (17,664) 2017-06-12 23:40
http://mantis.tokeek.de/file_download.php?file_id=284&type=bug
png Yacy_java_calculator.png (176,507) 2017-06-12 23:45
http://mantis.tokeek.de/file_download.php?file_id=285&type=bug
gz Quad-cloud-and-home-using-trial-memory-settings.tar.gz (403,556) 2017-06-13 11:45
http://mantis.tokeek.de/file_download.php?file_id=286&type=bug
Issue History
2017-06-07 20:08 | smokingwheels | New Issue
2017-06-07 20:15 | smokingwheels | Note Added: 0001412
2017-06-07 20:16 | smokingwheels | File Added: yacy-Debug.tar.gz
2017-06-07 20:17 | smokingwheels | Note Edited: 0001412 (bug_revision_view_page.php?bugnote_id=1412#r407)
2017-06-07 20:18 | smokingwheels | Note Edited: 0001412 (bug_revision_view_page.php?bugnote_id=1412#r408)
2017-06-07 20:40 | smokingwheels | File Added: YaCy 'anonufe-11011755-15'_ Crawler 600 mb cloud.pdf
2017-06-07 20:41 | smokingwheels | Note Added: 0001413
2017-06-07 22:41 | smokingwheels | File Added: YaCy 'webportal'_ Console Status result of hosts file.pdf
2017-06-07 22:44 | smokingwheels | Note Added: 0001414
2017-06-08 03:49 | smokingwheels | File Added: Debug 06-08-2017 8-35.tar.gz
2017-06-08 03:51 | smokingwheels | Note Added: 0001415
2017-06-08 03:52 | smokingwheels | File Added: Debug Threads 9-07--53 slow crawling.tar.gz
2017-06-08 03:57 | smokingwheels | Note Added: 0001416
2017-06-08 03:58 | smokingwheels | File Added: YaCy 'anonufe-11011755-15'_ Console Status crawler very slow9-07-53.pdf
2017-06-08 03:59 | smokingwheels | Note Edited: 0001416 (bug_revision_view_page.php?bugnote_id=1416#r410)
2017-06-08 04:51 | smokingwheels | Note Edited: 0001416 (bug_revision_view_page.php?bugnote_id=1416#r411)
2017-06-08 09:35 | luc | Note Added: 0001417
2017-06-08 16:20 | smokingwheels | Note Added: 0001418
2017-06-08 16:28 | smokingwheels | File Added: Debug 06-08-17 10-06-42.tar.gz
2017-06-08 16:31 | smokingwheels | Note Added: 0001419
2017-06-08 16:42 | smokingwheels | Note Edited: 0001419 (bug_revision_view_page.php?bugnote_id=1419#r413)
2017-06-08 18:27 | smokingwheels | Note Edited: 0001418 (bug_revision_view_page.php?bugnote_id=1418#r415)
2017-06-08 23:41 | luc | Note Added: 0001421
2017-06-09 23:42 | smokingwheels | Note Added: 0001423
2017-06-09 23:54 | smokingwheels | Note Added: 0001424
2017-06-10 01:17 | smokingwheels | Note Added: 0001425
2017-06-10 08:34 | smokingwheels | Note Added: 0001426
2017-06-10 11:38 | smokingwheels | Note Added: 0001427
2017-06-10 12:44 | smokingwheels | Note Edited: 0001425 (bug_revision_view_page.php?bugnote_id=1425#r420)
2017-06-10 12:49 | smokingwheels | Note Edited: 0001425 (bug_revision_view_page.php?bugnote_id=1425#r421)
2017-06-12 04:14 | smokingwheels | Note Added: 0001429
2017-06-12 04:15 | smokingwheels | File Added: Threads.txt.tar.gz
2017-06-12 04:19 | smokingwheels | Note Added: 0001430
2017-06-12 21:03 | smokingwheels | Note Edited: 0001430 (bug_revision_view_page.php?bugnote_id=1430#r423)
2017-06-12 23:40 | smokingwheels | File Added: Yacy_java_memory.ods
2017-06-12 23:44 | smokingwheels | Note Added: 0001431
2017-06-12 23:45 | smokingwheels | File Added: Yacy_java_calculator.png
2017-06-12 23:49 | smokingwheels | Note Edited: 0001431 (bug_revision_view_page.php?bugnote_id=1431#r425)
2017-06-13 11:45 | smokingwheels | File Added: Quad-cloud-and-home-using-trial-memory-settings.tar.gz
2017-06-13 11:56 | smokingwheels | Note Added: 0001432
2017-06-15 22:59 | BuBu | Note Added: 0001442
2017-06-15 22:59 | BuBu | Status: new => resolved
2017-06-15 22:59 | BuBu | Resolution: open => fixed
2017-06-15 22:59 | BuBu | Assigned To: => administrator

Notes
(0001412)
smokingwheels   
2017-06-07 20:15   
(edited on: 2017-06-07 20:18)
Local and cloud debug threads.
A file name such as -18 corresponds to renice -18 -p 1234, where 1234 is the process ID of e.g. java.
Note that the YaCy code is modified, though.

yacy-Debug.tar.gz (1,132,522 bytes) 2017-06-07 20:16

Upload the file first, then reference it in the post.

(0001413)
smokingwheels   
2017-06-07 20:41   
YaCy 'anonufe-11011755-15'_ Crawler 600 mb cloud.pdf (217,731 bytes) 2017-06-07 20:33
1 CPU 1 GB

Crawler has paused not enough memory. pending: collection=191059


YaCy version: 1.92/9231
Uptime: 2 days 03:04
Java version: 1.8.0_131
Processors: 1
Load: 1.19
Threads: 251/20, peak:526, total:55345

Was crawling at approx. 300 PPM (pages per minute).
(0001414)
smokingwheels   
2017-06-07 22:44   
YaCy 'webportal'_ Console Status result of hosts file.pdf (141,988 bytes) 2017-06-07 22:41

http://winhelp2002.mvps.org/HOSTS.HTM There are many other sites to choose from.

Blocked threads occur less often.
(0001415)
smokingwheels   
2017-06-08 03:51   
Debug 06-08-2017 8-35.tar.gz (1,979,731 bytes) 2017-06-08 03:49

Crawling; webportal is running OK.
20 dumps.
(0001416)
smokingwheels   
2017-06-08 03:57   
(edited on: 2017-06-08 04:51)
Debug Threads 9-07--53 slow crawling.tar.gz (2,591,553 bytes) 2017-06-08 03:52

Noticed the crawler speed had dropped to 0-500 PPM.
Started logger recording.
20 dumps.
Shut down YaCy.
190 k in collection.

YaCy 'anonufe-11011755-15'_ Console Status crawler very slow9-07-53.pdf (115,044 bytes) 2017-06-08 03:58
Status page during the event.

(0001417)
luc   
2017-06-08 09:35   
Wow great job!

Not easy to analyze but very valuable.

The "Debug Threads 9-07--53 slow crawling.tar.gz" traces seem to clearly show that some threads are blocking each other on operations on the crawler Cache, likely all waiting for this one:
"Thread= ArrayStack.DELETE_EXECUTOR_pool-1-thread-3821 id=31967 TIMED_WAITING
at java.io.RandomAccessFile.readFully(RandomAccessFile.java:436)".

I guess this won't be easy to solve. Not sure for now why that read operation on a heap file could be blocking so long...
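
For anyone who wants to repeat this kind of analysis on the attached dumps, here is a minimal sketch that counts threads per state and tallies the top stack frame of the threads in one state. It assumes the dump layout seen in the excerpts quoted in this issue (a `Thread= <name> id=<n> <STATE>` header followed by `at ...` frames); the function names are made up for illustration and are not part of YaCy.

```python
import re
from collections import Counter

# Matches header lines such as:
#   Thread= Finalizer daemon id=3 BLOCKED
# The layout is assumed from the excerpts quoted in this issue.
THREAD_LINE = re.compile(
    r'^"?Thread= (?P<name>.+?) id=(?P<id>\d+) (?P<state>[A-Z_]+)\s*$'
)


def parse_threads(dump_text):
    """Return a list of (name, thread_id, state, top_frame) tuples."""
    threads = []
    lines = dump_text.splitlines()
    for i, line in enumerate(lines):
        match = THREAD_LINE.match(line.strip())
        if not match:
            continue
        # The first "at ..." line after the header is the top stack frame.
        top_frame = ""
        for following in lines[i + 1:]:
            stripped = following.strip()
            if stripped.startswith("at "):
                top_frame = stripped[3:]
                break
            if THREAD_LINE.match(stripped):
                break  # next thread starts; this one had no frames
        threads.append(
            (match["name"], int(match["id"]), match["state"], top_frame)
        )
    return threads


def count_states(dump_text):
    """Count threads per state (BLOCKED, TIMED_WAITING, ...)."""
    return Counter(state for _, _, state, _ in parse_threads(dump_text))


def count_top_frames(dump_text, state):
    """Count the top stack frame of every thread in the given state."""
    return Counter(
        frame for _, _, s, frame in parse_threads(dump_text) if s == state
    )
```

Running `count_top_frames(text, "TIMED_WAITING")` over each of the 20 dumps in an archive would show whether the same frame, such as the `RandomAccessFile.readFully` call above, keeps turning up.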
(0001418)
smokingwheels   
2017-06-08 16:20   
(edited on: 2017-06-08 18:27)
Does this link work? It's a file access log taken while crawling.

https://www.dropbox.com/s/5vi8lzqoee1vwty/disk.txt.tar.gz?dl=0

(0001419)
smokingwheels   
2017-06-08 16:31   
(edited on: 2017-06-08 16:42)
Debug 06-08-17 10-06-42.tar.gz

Taken while crawling slowly, before I terminated it.

Webportal is running OK in the cloud, but that is on a different class of hardware.

The particular DNS servers you use may also affect it, because my router was hanging after 12 hours or so of crawling.

(0001421)
luc   
2017-06-08 23:41   
I guess the Dropbox link should work, but personally I don't want to install Dropbox on my computer... By the way, you have already provided plenty to analyze, which should inspire some improvements to try.
(0001423)
smokingwheels   
2017-06-09 23:42   
Thanks for the feedback @Luc, that is good news to hear.
It took 4-5 hours to work out that a TAB key on my Linux PC needs 100 ms per code and a double release code to make it work, then send 10 at a time...

I have just upgraded my home setup to 1.921/9236.
I guess I will keep my cloud at the version I created this post with, while testing the new version from home.


Just a minor issue: the crawler briefly slowed down.
I was using the mouse to refresh the output.

THREADS WITH STATES: BLOCKED
 
Thread= Finalizer daemon id=3 BLOCKED
at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:209)

 
Thread= Java2D Disposer daemon id=124 BLOCKED
at java.lang.Thread.run(Thread.java:748)
(0001424)
smokingwheels   
2017-06-09 23:54   
I have room in my Dropbox to store the modified version. I am thinking along the lines of trying eight $5.00 modules in the next experiment.

I think I can use or share or generate a link that does not show up in the public folder. I will look into that first.
(0001425)
smokingwheels   
2017-06-10 01:17   
(edited on: 2017-06-10 12:49)
I ran out of Java heap space (HTTP 500) after crawling for a while.
1.921/9236 600 MB

From memory, I think the startup config javastart_Xms=Xms is normally about 130 MB.
I have found that around 200 MB is more stable, but it depends on the memory available on the particular system, e.g. a Raspberry Pi or a $5.00 cloud instance.
On Linux, if Java can shell out to a Bash prompt and read the result back, it could adjust this semi-automatically as a one-time configuration of a peer. https://www.cyberciti.biz/faq/ram-size-linux/
You may already have the figure in the system somewhere (I think I looked up the Java shell command back in version 1.69X).
$ less /proc/meminfo > memory.txt
OR
$ cat /proc/meminfo > memory.txt
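
As a rough sketch of that one-time adjustment: the snippet below reads the `MemTotal` line from `/proc/meminfo`-style text and proposes a heap size rounded down to a fixed increment. The 25% fraction, the 128 MB step and the 128 MB floor are arbitrary placeholders chosen for illustration, not values taken from YaCy or from the calculator spreadsheet, and the function names are hypothetical.

```python
import re


def mem_total_mb(meminfo_text):
    """Return MemTotal from /proc/meminfo-style text, in whole megabytes."""
    match = re.search(r"^MemTotal:\s+(\d+)\s+kB", meminfo_text, re.MULTILINE)
    if match is None:
        raise ValueError("MemTotal line not found")
    return int(match.group(1)) // 1024


def suggest_heap_mb(total_mb, fraction=0.25, step_mb=128, floor_mb=128):
    """Suggest a heap size: a fraction of RAM, rounded down to a step.

    fraction, step_mb and floor_mb are illustrative placeholders,
    not values recommended by YaCy.
    """
    raw = int(total_mb * fraction)
    rounded = (raw // step_mb) * step_mb
    return max(rounded, floor_mb)
```

On a Linux peer this could be fed with the contents of `/proc/meminfo` (or the `memory.txt` file produced above), and the result tried as the heap setting before deciding whether it is stable.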

I think the startup Linux nice setting was 0 and is now 19; I will look into that.

In my new cloud install the Java nice setting is set to 0. https://twitter.com/smokingwheels/status/873488505099370497

After restart Nice is 10.

After Increasing JVM Nice is 19.

(0001426)
smokingwheels   
2017-06-10 08:34   
Just installed a spare PC; I will try:
ssh address-of-B 'wget -O - http://server-C/whatever' >> whatever
I will also use it for backups.
I guess that, in the rush or with the chances I take, I did not start a new copy after the usual few mods.

It's great: my internet has very little bandwidth left with the default 6000 crawl; I was setting it to over 8000 before.

My little keyboard is now doing a few speed tests. If you would like to see one or two, I can post them; just let me know.
(0001427)
smokingwheels   
2017-06-10 11:38   
Uploading over my ADSL2 connection takes 2 hours. I could install the desktop version of Ubuntu in my cloud, because there is room for 2 custom ISOs, and run this code modifier remotely.
https://youtu.be/3hOkcVbE0yQ

At the end of this month I will be hit by 10% GST from my cloud provider.
(0001429)
smokingwheels   
2017-06-12 04:14   
1.921/236
HTTP Error 408 Request timeout
I got this when checking a new site to crawl in the browser.

Is there any way to drop the site from the crawler queue if it happens too often, say 3 times?
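
To make the request concrete, here is a minimal sketch of the idea: count timeouts per host and filter a queue once a host reaches the limit. Everything in it (`TimeoutTracker`, `prune_queue`, the queue as a list of host and URL pairs) is hypothetical and only illustrates the behaviour asked for; it does not describe how the YaCy crawler is implemented.

```python
from collections import defaultdict


class TimeoutTracker:
    """Track request timeouts per host and flag hosts that exceed a limit.

    Hypothetical helper for illustration; not part of YaCy.
    """

    def __init__(self, max_timeouts=3):
        self.max_timeouts = max_timeouts
        self._counts = defaultdict(int)

    def record_timeout(self, host):
        """Register one timeout; return True once the host should be dropped."""
        self._counts[host] += 1
        return self._counts[host] >= self.max_timeouts

    def record_success(self, host):
        """A successful fetch resets the counter for that host."""
        self._counts.pop(host, None)

    def should_drop(self, host):
        """True if the host has reached the timeout limit."""
        return self._counts.get(host, 0) >= self.max_timeouts


def prune_queue(queue, tracker):
    """Return the queued (host, url) pairs whose host has not hit the limit."""
    return [(host, url) for host, url in queue if not tracker.should_drop(host)]
```

Resetting the counter on success keeps a site that only times out occasionally from being dropped for good.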
(0001430)
smokingwheels   
2017-06-12 04:19   
(edited on: 2017-06-12 21:03)
1.921/236
Threads.txt.tar.gz (10,345 bytes) 2017-06-12 04:15

at net.yacy.kelondro.index.RAMIndex.has(RAMIndex.java:137) [assert (key != null);]
Could be %3Bamp% in the URL, not too sure.

I added it to the blacklist and the error seems to have gone.

(0001431)
smokingwheels   
2017-06-12 23:44   
(edited on: 2017-06-12 23:49)
1.921/236
Yacy_java_memory.ods

I made a Java memory calculator based on the default settings. I am testing it at the moment on a 1 GB P4 doing a crawl on ADSL2.

Yacy_java_calculator.png
Just a quick reference guide for non-Linux users to try. The Java settings are all in increments of e.g. 1024, 256, 128, etc.

(0001432)
smokingwheels   
2017-06-13 11:56   
1.921/236
Quad-cloud-and-home-using-trial-memory-settings.tar.gz
Contains 2 pictures of memory usage; the cloud has 8 GB.
The GC seems to be working differently. It is really too early to tell, though.
(0001442)
BuBu   
2017-06-15 22:59   
Improved by @luc:
https://github.com/yacy/yacy_search_server/commit/a7394b479b4a2ecb9bd0c9696355e5d1008b8742