ID: 0000699
Project: YaCy
Category: [All Projects] General
View Status: public
Date Submitted: 2016-10-20 12:09
Last Update: 2016-10-22 00:16
Assigned To: administrator
Summary: 0000699: Crawl start from a redirected URL fails
Description: Starting a crawl from a redirected URL fails: the Link-list is successfully loaded and displayed on the /CrawlStartSite.html or /CrawlStartExpert.html page, but once the crawl is started nothing is crawled.

The "Rejected URLs" page (/IndexCreateParserErrors_p.html) does display a message such as "TEMPORARY_NETWORK_FAILURE cannot load: load error - java.io.IOException: CRAWLER Redirect of URL=http://wikipedia.org/ to https://wikipedia.org/ placed on crawler queue for double-check", but in fact nothing happens.
Steps To Reproduce:
 - Go to the /CrawlStartSite.html or /CrawlStartExpert.html page
 - Choose a redirected URL as the starting point, such as "http://wikipedia.org/" (redirected to https://wikipedia.org/ and then to https://www.wikipedia.org/)
 - The Link-list is successfully loaded and displayed
 - Start the new crawl job with default or any other options
 - The job is started, but nothing is actually crawled, even though the "Rejected URLs" page claims the redirection will be handled successfully
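
To make the redirect chain in the steps above concrete, here is a minimal Java sketch that walks a chain of redirects until it reaches a URL that no longer redirects. It is not YaCy code and not the committed fix: the class name `RedirectChain`, the method `resolve`, and the `maxHops` limit are hypothetical, and a plain lookup table stands in for real HTTP requests so the example runs without network access.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper for illustration only; not part of YaCy.
public class RedirectChain {

    /**
     * Follows redirects starting at startUrl. The redirects map simulates
     * HTTP 3xx responses: key = requested URL, value = redirect target.
     * Returns every URL visited, in order; the last entry is the final target.
     */
    public static List<String> resolve(String startUrl, Map<String, String> redirects, int maxHops) {
        List<String> chain = new ArrayList<>();
        Set<String> seen = new HashSet<>();
        String current = startUrl;
        chain.add(current);
        seen.add(current);
        while (redirects.containsKey(current)) {
            // chain.size() - 1 is the number of hops already followed.
            if (chain.size() > maxHops) {
                throw new IllegalStateException("Too many redirects from " + startUrl);
            }
            current = redirects.get(current);
            // Set.add returns false if the URL was already visited: a loop.
            if (!seen.add(current)) {
                throw new IllegalStateException("Redirect loop at " + current);
            }
            chain.add(current);
        }
        return chain;
    }

    public static void main(String[] args) {
        // The chain described in this report, hard-coded as an assumption.
        Map<String, String> redirects = new HashMap<>();
        redirects.put("http://wikipedia.org/", "https://wikipedia.org/");
        redirects.put("https://wikipedia.org/", "https://www.wikipedia.org/");

        List<String> chain = resolve("http://wikipedia.org/", redirects, 5);
        System.out.println("Visited: " + chain);
        System.out.println("Final crawl start URL: " + chain.get(chain.size() - 1));
    }
}
```

In this sketch the crawl would be started from the last URL in the returned list. The hop limit and the loop check are there so that a misconfigured site cannot keep the resolver running forever.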
Tags: No tags attached.
-  Notes
luc (reporter)
2016-10-21 08:20

I committed a fix: https://github.com/yacy/yacy_search_server/commit/6f49ece22f910c94066ff5736d3dd75577c0cc37

- Issue History
Date Modified Username Field Change
2016-10-20 12:09 luc New Issue
2016-10-21 08:20 luc Note Added: 0001332
2016-10-22 00:16 BuBu Status new => resolved
2016-10-22 00:16 BuBu Resolution open => fixed
2016-10-22 00:16 BuBu Assigned To => administrator
