YaCy-Bugtracker - YaCy
View Issue Details
ID: 0000699 | Project: YaCy | Category: [All Projects] General | View Status: public | Date Submitted: 2016-10-20 12:09 | Last Update: 2016-10-22 00:16
0000699: Crawl start from a redirected URL fails
Starting a crawl from a redirected URL fails: the link list is successfully loaded and displayed in the /CrawlStartSite.html or /CrawlStartExpert.html page, but once the crawl is started, nothing is crawled.

The "Rejected URLs" page (/IndexCreateParserErrors_p.html) does display a message such as "TEMPORARY_NETWORK_FAILURE cannot load: load error - java.io.IOException: CRAWLER Redirect of URL=http://wikipedia.org/ to https://wikipedia.org/ placed on crawler queue for double-check", but in fact nothing happens afterwards.
Steps To Reproduce
 - Go to the /CrawlStartSite.html or /CrawlStartExpert.html page
 - Choose a redirected URL as the starting point, such as "http://wikipedia.org/" (redirected to https://wikipedia.org/ and then to https://www.wikipedia.org/)
 - The link list is successfully loaded and displayed
 - Start the new crawl job with default options, or any other options
 - The job is started, but nothing is actually crawled, even though the "Rejected URLs" page claims the redirection will be handled
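
For illustration only, here is a minimal sketch of what following such a redirect chain to its final URL could look like before the crawl is queued. This is not the YaCy code and not the committed fix: the class `RedirectResolver`, the method `resolve` and the `maxHops` parameter are hypothetical names, and the redirects are simulated with an in-memory map instead of real HTTP requests so that the example runs offline.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of YaCy: resolves a start URL through a
// simulated redirect table until a URL with no further redirect is reached.
public class RedirectResolver {

    /**
     * @param start     the URL entered as crawl start point
     * @param redirects simulated redirect table (source URL -> target URL)
     * @param maxHops   maximum number of redirects to follow
     * @return the final URL, i.e. the first one without a redirect entry
     */
    public static String resolve(String start, Map<String, String> redirects, int maxHops) {
        Set<String> seen = new HashSet<>();
        String current = start;
        for (int hop = 0; hop <= maxHops; hop++) {
            // A URL visited twice means the redirects form a loop.
            if (!seen.add(current)) {
                throw new IllegalStateException("redirect loop at " + current);
            }
            String next = redirects.get(current);
            if (next == null) {
                return current; // no redirect: this is the URL to crawl
            }
            current = next;
        }
        throw new IllegalStateException("too many redirects from " + start);
    }

    public static void main(String[] args) {
        // The chain described in this report.
        Map<String, String> redirects = Map.of(
                "http://wikipedia.org/", "https://wikipedia.org/",
                "https://wikipedia.org/", "https://www.wikipedia.org/");
        System.out.println(resolve("http://wikipedia.org/", redirects, 5));
        // prints https://www.wikipedia.org/
    }
}
```

The hop limit and the loop check are there so that a misconfigured site cannot keep the start URL bouncing between redirect targets forever.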
No tags attached.
Issue History
2016-10-20 12:09 | luc | New Issue
2016-10-21 08:20 | luc | Note Added: 0001332
2016-10-22 00:16 | BuBu | Status | new => resolved
2016-10-22 00:16 | BuBu | Resolution | open => fixed
2016-10-22 00:16 | BuBu | Assigned To | => administrator

(0001332) luc, 2016-10-21 08:20
I committed a fix: https://github.com/yacy/yacy_search_server/commit/6f49ece22f910c94066ff5736d3dd75577c0cc37