YaCy-Bugtracker

ID: 0000724
Project: YaCy
Category: [All Projects] General
View Status: public
Date Submitted: 2017-01-30 16:01
Last Update: 2017-02-02 08:59
Reporter: DNcrawler
Assigned To:
Priority: normal
Severity: minor
Reproducibility: always
Status: new
Resolution: open
ETA: none
Platform: x86-64
OS: FreeBSD
OS Version: 10.3
Product Version:
Target Version:
Fixed in Version:
Summary: 0000724: HostBrowser.xml does not contain the site queried
Description: HostBrowser.xml output files do not contain the original site. See https://github.com/yacy/yacy_search_server/blob/ff6589fc0f4332bc83f89f875b62e7670762e4ee/htroot/HostBrowser.xml for an example.

By comparison, webstructure.xml does contain the queried site, in XML attributes.
Steps To Reproduce: Query "http://localhost:8090/HostBrowser.xml?hosts=example.com".

The resulting HostBrowser.xml output does not contain the domain example.com anywhere in the file.
Tags: No tags attached.
Attached Files

Relationships

Notes
(0001378)
luc (reporter)
2017-01-31 08:55

Hi, you are not using the right parameter. To restrict results to a specific hostname, the "path" parameter must be used instead of "hosts".

Thus it would be: http://localhost:8090/HostBrowser.xml?path=example.com
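
As a minimal sketch of building that request from a script, assuming a local YaCy peer on port 8090 as in the URL above: the helper name `hostbrowser_url` and its `base` argument are hypothetical and only serve the illustration.

```python
from urllib.parse import urlencode


def hostbrowser_url(path, base="http://localhost:8090"):
    """Build a HostBrowser.xml query URL restricted to `path`.

    Hypothetical helper: it uses the "path" parameter named in this
    note (not "hosts") and assumes the peer address given in `base`.
    """
    query = urlencode({"path": path})
    return f"{base}/HostBrowser.xml?{query}"


print(hostbrowser_url("example.com"))
```

Using `urlencode` keeps the query string valid if the path ever contains characters that need escaping.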
(0001379)
DNcrawler (reporter)
2017-02-02 08:07

I am calling it correctly; I just copy/pasted the wrong line.

Here's the line from the script I have:

http://localhost:8090/HostBrowser.xml?path=example.com

The resulting XML file does not contain the domain specified in "path". Here's the output XML file after running the command above:

<?xml version="1.0"?>
<hostbrowser>
 <files>
  <root />
 </files>
 <inbound>
  <host name="example.org" count="4116" />
  <host name="example.net" count="62" />
 </inbound>
</hostbrowser>
(0001380)
luc (reporter)
2017-02-02 08:59

Ok, but is it really a problem?
To my mind it is needed in the webstructure.xml result because, when queried without the "about" parameter, the "out" and "in" <references> tags can contain multiple host reference lists, so each <domain> tag has to specify which host it is about.

In the HostBrowser.xml structure, the <inbound> and <outbound> lists are only produced for the path you specified, so I guess we can assume that the path is known when parsing the result...
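
A minimal sketch of that approach: the caller already knows the path it queried, so it can attach it to the parsed result itself. The sample document is the output quoted earlier in this thread; the function name `parse_hostbrowser` is hypothetical, and the assumption that <outbound> holds <host> elements shaped like those in <inbound> is not confirmed by the output shown here.

```python
import xml.etree.ElementTree as ET

# Output quoted in note 0001379 for path=example.com.
SAMPLE = """<?xml version="1.0"?>
<hostbrowser>
 <files>
  <root />
 </files>
 <inbound>
  <host name="example.org" count="4116" />
  <host name="example.net" count="62" />
 </inbound>
</hostbrowser>"""


def parse_hostbrowser(xml_text, queried_path):
    """Parse HostBrowser.xml output and record the path that was queried.

    Hypothetical helper. The XML does not name the queried host, so the
    caller passes it in. <outbound> is assumed to mirror <inbound>.
    """
    root = ET.fromstring(xml_text.strip())

    def hosts(tag):
        # Map host name -> link count for one list; empty if the list is absent.
        return {
            h.get("name"): int(h.get("count"))
            for h in root.findall(f"./{tag}/host")
        }

    return {
        "path": queried_path,
        "inbound": hosts("inbound"),
        "outbound": hosts("outbound"),
    }


result = parse_hostbrowser(SAMPLE, "example.com")
print(result)
```

A script that queries many hosts in a loop can keep the results apart this way without any change to the XML format.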

Issue History
Date Modified Username Field Change
2017-01-30 16:01 DNcrawler New Issue
2017-01-31 08:55 luc Note Added: 0001378
2017-02-02 08:07 DNcrawler Note Added: 0001379
2017-02-02 08:59 luc Note Added: 0001380

