ID: 0000630
Project: YaCy
Category: [All Projects] General
View Status: public
Date Submitted: 2016-01-12 22:22
Last Update: 2016-01-19 09:03
Assigned To:
Platform:
OS: GNU/Linux
OS Version: Debian Jessie
Product Version: YaCy 1.8
Target Version:
Fixed in Version:
Summary: 0000630: Access to Crawling MediaWiki and phpBB3 Forums fails in Robinson mode
Description: When a YaCy node is configured with the 'Search portal' or 'Intranet indexing' use case, access to /Load_MediawikiWiki.html and /Load_PHPBB3.html fails with an HTTP 500 error.

Additional Information: Error details as displayed in the browser:

Problem accessing /Load_PHPBB3.html. Reason:

    Server Error

Caused by:

javax.servlet.ServletException: /home/luc/git/yacy_search_server/htroot/Load_PHPBB3.html
    at net.yacy.http.servlets.YaCyDefaultServlet.handleTemplate(YaCyDefaultServlet.java:844)
    at net.yacy.http.servlets.YaCyDefaultServlet.doGet(YaCyDefaultServlet.java:319)
Tags: No tags attached.
Attached Files:

Relationships:

Notes
luc (reporter)
2016-01-12 22:23

This was a NullPointerException case.
I propose a fix: https://github.com/luccioman/yacy_search_server/commit/231be83eb65e7289ad56a3544fc9029dda656009
BuBu (developer)
2016-01-17 01:01

In a quick attempt to reproduce the behavior, I saw the null pointer exception only in "Intranet" mode.
True, the exception shouldn't happen. On the other hand, by definition of Intranet mode, external MediaWiki URLs shouldn't be accepted.
luc (reporter)
2016-01-19 09:03

- On my peer, in Intranet or Portal mode, sb.peers.mySeed().getIPs() and sb.peers.mySeed().getIP() always return an empty result or null. I am behind a router and have no static IP.
So when SeedDB.initMySeed runs (https://github.com/yacy/yacy_search_server/blob/7d0d19cb8eb0817db290fc60555b9262ccb253a7/source/net/yacy/peers/SeedDB.java#L213), serverSwitch.myPublicIPs returns an empty result because the only addresses found are local network or loopback addresses.
In P2P mode, my public IP is found when Protocol.hello(...) is processed.
I guess everything here is normal, as my peer is reported as senior.

- Shouldn't crawling of MediaWiki URLs be accepted even in Intranet mode (the default proposed URL is http://localhost:8090/repository/)? We want to be able to crawl a wiki or phpBB instance running on the local machine or the local network. A test shows that external URLs are correctly rejected in Intranet mode. For example: "Crawling of "https://fr.wikipedia.org/" failed. Reason: denied_(the host 'fr.wikipedia.org' is global, but global addresses are not accepted"
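
Both points above rest on the same distinction between local and global addresses, so here is a minimal sketch of that distinction using only java.net. It is not YaCy code: the class LocalAddressSketch and all of its method names are hypothetical, and the real checks in SeedDB and the crawler may differ. It shows why a peer behind a router can end up with no public IP at all, how a caller can avoid dereferencing a missing IP, and how an intranet-only check could accept a local wiki while rejecting a global host.

```java
import java.net.InetAddress;
import java.net.URI;
import java.net.UnknownHostException;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper for illustration only; not part of YaCy.
class LocalAddressSketch {

    /** True for wildcard, loopback, private-range (site-local) and link-local addresses. */
    static boolean isLocal(InetAddress addr) {
        return addr.isAnyLocalAddress() || addr.isLoopbackAddress()
                || addr.isSiteLocalAddress() || addr.isLinkLocalAddress();
    }

    /** Keeps only non-local addresses. An empty result is a normal outcome behind a router. */
    static Set<String> publicOnly(List<String> candidates) {
        Set<String> result = new LinkedHashSet<>();
        for (String candidate : candidates) {
            try {
                InetAddress addr = InetAddress.getByName(candidate);
                if (!isLocal(addr)) {
                    result.add(addr.getHostAddress());
                }
            } catch (UnknownHostException e) {
                // Entries that cannot be resolved are skipped.
            }
        }
        return result;
    }

    /** Null-safe pick, so callers never dereference a missing IP. */
    static String firstOrFallback(Set<String> ips, String fallback) {
        if (ips == null || ips.isEmpty()) {
            return fallback;
        }
        return ips.iterator().next();
    }

    /** Returns null when the URL points to a local host, otherwise a rejection reason. */
    static String intranetRejectReason(String url) {
        String host;
        try {
            host = URI.create(url).getHost();
        } catch (IllegalArgumentException e) {
            return "malformed url";
        }
        if (host == null) {
            return "url has no host";
        }
        try {
            if (isLocal(InetAddress.getByName(host))) {
                return null;
            }
            return "host '" + host + "' is global";
        } catch (UnknownHostException e) {
            return "host '" + host + "' cannot be resolved";
        }
    }
}
```

With this sketch, a list containing only 127.0.0.1 and 192.168.x.x addresses yields an empty set from publicOnly, and firstOrFallback then returns the fallback (for example "localhost") instead of null. intranetRejectReason accepts http://127.0.0.1:8090/repository/ and rejects a URL whose host is a non-private address.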

Issue History
Date Modified      Username  Field                Change
2016-01-12 22:22   luc       New Issue
2016-01-12 22:23   luc       Note Added: 0001202
2016-01-17 01:01   BuBu      Note Added: 0001203
2016-01-19 09:03   luc       Note Added: 0001205
