YaCy-Bugtracker

ID: 0000591
Project: YaCy
Category: [All Projects] General
View Status: public
Date Submitted: 2015-06-21 21:13
Last Update: 2015-12-13 03:41
Reporter: Scarfmonster
Assigned To: BuBu
Priority: normal
Severity: minor
Reproducibility: always
Status: resolved
Resolution: fixed
ETA: none
Platform:
OS:
OS Version:
Product Version: YaCy 1.8
Target Version:
Fixed in Version:
Summary: 0000591: MediaWiki import error
Description: I've tried this with various dumps from various wikis and languages. YaCy seems to process the dump normally, but then throws this error (seen here with the Polish Wikivoyage dump):

I 2015/06/21 21:03:06 WIKITRANSLATION [CONSUME] Title: Hessen
I 2015/06/21 21:03:06 WIKITRANSLATION [CONSUME] Title: Moduł:Wikidane/format/string
I 2015/06/21 21:03:06 WIKITRANSLATION [CONSUME] Title: Leipzig
I 2015/06/21 21:03:06 WIKITRANSLATION [CONSUME] Title: Frýdek-Místek
I 2015/06/21 21:03:06 WIKITRANSLATION [CONSUME] Title: Trójmiasto
I 2015/06/21 21:03:06 WIKITRANSLATION [CONSUME] Title: Meksyk (ujednoznacznienie)
I 2015/06/21 21:03:06 WIKITRANSLATION [CONSUME] Title: Panama (ujednoznacznienie)
I 2015/06/21 21:03:06 WIKITRANSLATION [CONSUME] Title: Tejpej
I 2015/06/21 21:03:06 WIKITRANSLATION [CONSUME] Title: Stara Andora
I 2015/06/21 21:03:06 WIKITRANSLATION [CONSUME] Title: Andora (ujednoznacznienie)
I 2015/06/21 21:03:06 WIKITRANSLATION [CONSUME] Title: MediaWiki:Gadget-heading-icons.js
I 2015/06/21 21:03:06 WIKITRANSLATION [CONSUME] Title: Gwatemala (ujednoznacznienie)
I 2015/06/21 21:03:06 WIKITRANSLATION [CONSUME] Title: Georgetown
I 2015/06/21 21:03:06 WIKITRANSLATION [CONSUME] Title: Balti
I 2015/06/21 21:03:06 WIKITRANSLATION [CONSUME] Title: Wyspy Normandzkie
I 2015/06/21 21:03:06 WIKITRANSLATION [CONSUME] Title: Pucice
I 2015/06/21 21:03:06 WIKITRANSLATION [CONSUME] Title: Szablon:Witaj/opis
I 2015/06/21 21:03:06 WIKITRANSLATION [CONSUME] Title: Wyspa Man
I 2015/06/21 21:03:06 WIKITRANSLATION [CONSUME] Title: Pays de la Loire
I 2015/06/21 21:03:06 WIKITRANSLATION [CONSUME] Title: Słowacki Raj
I 2015/06/21 21:03:06 WIKITRANSLATION [CONSUME] Title: Oberwesel
I 2015/06/21 21:03:06 WIKITRANSLATION [CONSUME] Title: Szablon:Quickbar empty
I 2015/06/21 21:03:06 WIKITRANSLATION [CONSUME] Title: Szablon:Quickbar item
I 2015/06/21 21:03:06 WIKITRANSLATION [CONSUME] Title: Lusowo
I 2015/06/21 21:03:06 WIKITRANSLATION convertConsumer / got poison
I 2015/06/21 21:03:06 WIKITRANSLATION *** convertConsumer has terminated
I 2015/06/21 21:03:06 WIKITRANSLATION [CONSUME] Title: Lusowo
I 2015/06/21 21:03:06 WIKITRANSLATION convertConsumer / got poison
I 2015/06/21 21:03:06 WIKITRANSLATION *** convertConsumer has terminated
I 2015/06/21 21:03:06 WIKITRANSLATION [CONSUME] Title: Wertheim
I 2015/06/21 21:03:06 WIKITRANSLATION convertConsumer / got poison
I 2015/06/21 21:03:06 WIKITRANSLATION *** convertConsumer has terminated
I 2015/06/21 21:03:06 WIKITRANSLATION [CONSUME] Title: Kłodzko
I 2015/06/21 21:03:06 WIKITRANSLATION convertConsumer / got poison
I 2015/06/21 21:03:06 WIKITRANSLATION *** convertConsumer has terminated
I 2015/06/21 21:03:06 WIKITRANSLATION [CONSUME] Title: Moduł:Wikidane/data
I 2015/06/21 21:03:06 WIKITRANSLATION convertConsumer / got poison
I 2015/06/21 21:03:06 WIKITRANSLATION *** convertConsumer has terminated
I 2015/06/21 21:03:06 WIKITRANSLATION [CONSUME] Title: Wikipodróże:Pub podróżnika/Archiwum002
I 2015/06/21 21:03:06 WIKITRANSLATION convertConsumer / got poison
I 2015/06/21 21:03:06 WIKITRANSLATION *** convertConsumer has terminated
I 2015/06/21 21:03:06 WIKITRANSLATION [CONSUME] Title: Pasawa
I 2015/06/21 21:03:06 WIKITRANSLATION convertConsumer / got poison
I 2015/06/21 21:03:06 WIKITRANSLATION *** convertConsumer has terminated
I 2015/06/21 21:03:06 WIKITRANSLATION [CONSUME] Title: Moduł:Lang/data
I 2015/06/21 21:03:06 WIKITRANSLATION convertConsumer / got poison
I 2015/06/21 21:03:06 WIKITRANSLATION *** convertWriter has terminated
I 2015/06/21 21:03:07 SWITCHBOARD processed surrogate C:\Users\scarf\Desktop\yacy\DATA\SURROGATES\in\plwikivoyage-latest-pages-articles.xml.0.xml
W 2015/06/21 21:03:07 ConcurrentLog org.xml.sax.SAXParseException; lineNumber: 1; columnNumber: 1; Content is not allowed in prolog.
org.xml.sax.SAXParseException; lineNumber: 1; columnNumber: 1; Content is not allowed in prolog.
    at org.apache.xerces.util.ErrorHandlerWrapper.createSAXParseException(Unknown Source)
    at org.apache.xerces.util.ErrorHandlerWrapper.fatalError(Unknown Source)
    at org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source)
    at org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source)
    at org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source)
    at org.apache.xerces.impl.XMLScanner.reportFatalError(Unknown Source)
    at org.apache.xerces.impl.XMLDocumentScannerImpl$PrologDispatcher.dispatch(Unknown Source)
    at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
    at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
    at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
    at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
    at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
    at org.apache.xerces.jaxp.SAXParserImpl$JAXPSAXParser.parse(Unknown Source)
    at org.apache.xerces.jaxp.SAXParserImpl.parse(Unknown Source)
    at net.yacy.document.content.SurrogateReader.run(SurrogateReader.java:157)
    at java.lang.Thread.run(Unknown Source)
I 2015/06/21 21:03:08 SWITCHBOARD processed surrogate [pathToFile]\plwikivoyage-latest-pages-articles.xml.0.xml

This error repeats several times and always reports the same file. Only this one file exists in SURROGATES/out, while
plwikivoyage-latest-pages-articles.xml.0.xml
and
plwikivoyage-latest-pages-articles.xml.0.xml.prt
appear in SURROGATES/in from time to time.
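
For context, "Content is not allowed in prolog" at line 1, column 1 is the generic error a Java SAX parser raises when character content appears before the root element of the input it was given. The following is a minimal sketch that reproduces the same class of error with the JDK's built-in parser. The class name `PrologCheck` and the method `firstParseError` are made up for illustration and are not part of YaCy; the sketch does not claim to show what was actually wrong with the surrogate file.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not YaCy code.
final class PrologCheck {

    /**
     * Parses the given text as XML and returns the first fatal parse error,
     * or null if the text is well-formed.
     */
    static SAXParseException firstParseError(String xml)
            throws ParserConfigurationException, SAXException, IOException {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        try {
            // DefaultHandler ignores all content; we only care about errors.
            parser.parse(new InputSource(new StringReader(xml)), new DefaultHandler());
            return null;
        } catch (SAXParseException e) {
            return e;
        }
    }

    public static void main(String[] args) throws Exception {
        // Well-formed input: no error is reported.
        System.out.println(firstParseError("<page><title>Lusowo</title></page>"));

        // Stray text before the root element: the parser fails on line 1.
        SAXParseException e = firstParseError("not xml<page/>");
        System.out.println("line " + e.getLineNumber() + ": " + e.getMessage());
    }
}
```

Running a check like this against the first bytes of the file named in the log would show whether the file itself is rejected by the parser, or whether the parser is being handed something else.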
Steps To Reproduce: Import a MediaWiki dump of any wiki in any language
Tags: No tags attached.
Attached Files

- Relationships
related to 0000625 (resolved, Orbiter): MediaWiki import still failing

-  Notes
(0001079)
Scarfmonster (reporter)
2015-06-24 02:01

Okay, so I found out that if you start an import and stay on the page that opens immediately afterwards, the import restarts on every refresh of that page as it tries to update the progress. Closing this page and going to the import page shows the progress normally.
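
One common way to make a "start" request safe against repeated page refreshes is to guard it with an atomic flag, so that a second request is refused while a job is still running. The sketch below only illustrates that idea. `ImportGuard`, `tryStart`, and `isRunning` are hypothetical names, and this is not the change made in YaCy.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical guard, not YaCy code.
final class ImportGuard {
    private final AtomicBoolean running = new AtomicBoolean(false);

    /**
     * Starts the job on a background thread unless one is already running.
     * Returns true if the job was started, false if the request was ignored.
     */
    boolean tryStart(Runnable importJob) {
        // compareAndSet lets exactly one caller flip the flag from false to true.
        if (!running.compareAndSet(false, true)) {
            return false; // a refresh arrived while the import is in progress
        }
        Thread worker = new Thread(() -> {
            try {
                importJob.run();
            } finally {
                running.set(false); // allow a new import once this one ends
            }
        }, "import-worker");
        worker.start();
        return true;
    }

    boolean isRunning() {
        return running.get();
    }

    public static void main(String[] args) throws InterruptedException {
        ImportGuard guard = new ImportGuard();
        Runnable slowJob = () -> {
            try {
                Thread.sleep(500); // stands in for a long-running import
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        System.out.println("first request started: " + guard.tryStart(slowJob));
        System.out.println("refresh started again: " + guard.tryStart(slowJob));
    }
}
```

With a guard like this, a progress page can call `isRunning()` on every refresh without triggering a new import.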
(0001121)
BuBu (developer)
2015-10-25 05:44

Following your comment, the page refresh behavior was changed in v1.83/9450:

https://github.com/yacy/yacy_search_server/commit/a2dcf6403953476bc996a73178fe5ff153a800a4

- Issue History
Date Modified     Username      Field                Change
2015-06-21 21:13  Scarfmonster  New Issue
2015-06-24 02:01  Scarfmonster  Note Added: 0001079
2015-10-25 05:44  BuBu          Note Added: 0001121
2015-10-25 05:44  BuBu          Status               new => resolved
2015-10-25 05:44  BuBu          Resolution           open => fixed
2015-10-25 05:44  BuBu          Assigned To          => BuBu
2015-12-13 03:41  BuBu          Relationship added   related to 0000625

