YaCy-Bugtracker - YaCy
View Issue Details
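ID: 0000730
Project: YaCy
Category: Wishlist - Wunschliste
View Status: public
Date Submitted: 2017-03-28 19:34
Last Update: 2019-07-28 08:38
Reporter: smokingwheels
Priority: normal
Severity: minor
Reproducibility: always
Status: new
Resolution: open
none
Platform: X86
OS: Linux
OS Version: Debian + Ubuntu
Product Version: YaCy 1.9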
 
0000730: Some websites have URLs with repeated ;amp;amp suffixes, hundreds of them.
While crawling, I have noticed a few sites whose URLs carry a repeated suffix, for example www.domain.com/somepage.html;amp;amp;amp and so on, with the repetition filling at least 3 to 4 lines in the crawler monitor.

When I find them now, I terminate the crawl or blacklist the site, because it just slows down my slow PC.

Could there be an option to bypass such sites if one wishes?
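
To make the request concrete, here is a rough sketch of the kind of check such an option might apply. It is not YaCy code and does not use any YaCy API. The names `MIN_REPEATS`, `has_runaway_amp` and `filter_urls` are made up for illustration, and the threshold of 3 repeats is only an assumption taken from the example URL above.

```python
import re

# Assumed threshold: this many back-to-back ";amp" tokens mark a URL as runaway.
MIN_REPEATS = 3

# Matches MIN_REPEATS or more consecutive ";amp" tokens anywhere in the URL.
RUNAWAY_AMP = re.compile(r"(?:;amp){%d,}" % MIN_REPEATS, re.IGNORECASE)


def has_runaway_amp(url: str) -> bool:
    """Return True if the URL contains a run of repeated ';amp' tokens."""
    return RUNAWAY_AMP.search(url) is not None


def filter_urls(urls):
    """Keep only the URLs that do not show the repeated ';amp' pattern."""
    return [u for u in urls if not has_runaway_amp(u)]
```

A crawler could run something like `filter_urls` on newly discovered links before queueing them, so the affected pages are skipped without blacklisting the whole site.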


 
No tags attached.
Issue History
2017-03-28 19:34 | smokingwheels | New Issue

There are no notes attached to this issue.