|View Issue Details [ Jump to Notes ] ||[ Issue History ] [ Print ] |
|ID||Project||Category||View Status||Date Submitted||Last Update|
|0000762||YaCy||Wishlist - Wunschliste||public||2017-07-09 03:58||2017-07-09 12:28|
|Assigned To|| |
|Product Version||YaCy 1.9|| |
|Target Version||Fixed in Version|| |
|Summary||0000762: Have crawler queue for domains that have large robots delay time.|
|Description||I have noticed when crawling I come across sites that have a crawl delay of 10 to 60 seconds, during this time the PPM drops to 0 and it is not the particular site of crawl I intended to look at its just background noise.|
The effect is it slows the site of interest being crawled takes much longer.
A temporary fix is just to add the site to the black list to get around the problem.
There are some sites you want to crawl and there is a robots delay there but you accept it will take some time to gather the information.
|Tags||No tags attached.|