YaCy-Bugtracker - YaCy
View Issue Details
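ID: 0000762
Project: YaCy
Category: Wishlist - Wunschliste
View Status: public
Date Submitted: 2017-07-09 03:58
Last Update: 2019-07-28 08:38
Reporter: smokingwheels
Priority: low
Severity: tweak
Reproducibility: sometimes
Status: new
Resolution: open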
none 
Product Version: YaCy 1.9
 
0000762: Have a separate crawler queue for domains that have a large robots crawl delay.
I have noticed that when crawling I come across sites that have a crawl delay of 10 to 60 seconds. During this time the PPM drops to 0, and the site causing it is not the one I intended to crawl; it is just background noise.

The effect is that crawling the site of interest takes much longer.
A temporary workaround is to add the slow site to the blacklist.

There are some sites you do want to crawl that have a robots delay, and there you accept that it will take some time to gather the information.
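
To make the request concrete, here is a rough sketch of what I mean. It is not YaCy code: the class name `TwoLaneFrontier`, the 10 second threshold and the `crawl_delays` table are all made up for illustration, and the clock is passed in by hand so the behaviour is easy to follow. Hosts with a large crawl delay go into their own time-ordered queue, so waiting for them never stalls the other hosts.

```python
import heapq
from collections import deque
from urllib.parse import urlsplit

SLOW_DELAY_THRESHOLD = 10.0  # seconds; hypothetical cutoff, not a YaCy setting


class TwoLaneFrontier:
    """Hypothetical crawl frontier with a separate lane for high-delay hosts."""

    def __init__(self, crawl_delays, threshold=SLOW_DELAY_THRESHOLD):
        self.crawl_delays = crawl_delays  # host -> robots.txt crawl delay (s)
        self.threshold = threshold
        self.fast = deque()               # URLs on hosts with a small delay
        self.slow = []                    # heap of (ready_time, seq, url)
        self.next_ready = {}              # slow host -> next free fetch slot
        self._seq = 0                     # tie-breaker for equal ready times

    def push(self, url, now=0.0):
        host = urlsplit(url).hostname
        delay = self.crawl_delays.get(host, 0.0)
        if delay < self.threshold:
            # Simplification: politeness for small delays is not modelled here.
            self.fast.append(url)
            return
        # Reserve the next free slot for this host, one crawl delay apart.
        ready = max(now, self.next_ready.get(host, now))
        self.next_ready[host] = ready + delay
        heapq.heappush(self.slow, (ready, self._seq, url))
        self._seq += 1

    def pop(self, now):
        """Return the next URL to fetch at time `now`, or None if none is due."""
        if self.slow and self.slow[0][0] <= now:
            return heapq.heappop(self.slow)[2]  # a slow host's slot has come up
        if self.fast:
            return self.fast.popleft()          # never blocked by slow hosts
        return None


frontier = TwoLaneFrontier({"slow.example": 30.0})
for u in ("http://slow.example/a", "http://slow.example/b",
          "http://fast.example/1", "http://fast.example/2"):
    frontier.push(u, now=0.0)

print(frontier.pop(now=0.0))   # http://slow.example/a
print(frontier.pop(now=1.0))   # http://fast.example/1
print(frontier.pop(now=30.0))  # http://slow.example/b
```

With something like this the slow sites still get crawled at their own pace, but the PPM for the site of interest is limited only by that site, so the blacklist workaround would not be needed.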

No tags attached.
Issue History
2017-07-09 03:58 smokingwheels New Issue

There are no notes attached to this issue.