View Issue Details
ID: 0000627
Project: YaCy
Category: [All Projects] General
View Status: public
Date Submitted: 2015-12-13 22:54
Last Update: 2016-01-17 01:59
Assigned To: BuBu
Platform:
OS:
OS Version:
Product Version: YaCy 1.8
Target Version:
Fixed in Version:
Summary: 0000627: 20 search results is not 100 or 50 search results
Description:
1) Configure the default number of search results to be 50 or 100 in the administration interface and go to the search page, OR go to the search page, click "more options..." and select 50 or 100 for "Results per page".
2) Do a search.

Now you get .. voilà .. *20* search results. And you get 10 if you select ten there.

YaCy isn't "quite" as good as Duck Duck Go when it comes to sorting search results and what you are looking for is never among the first 10 or 20 results. It would be good if a request for 50 results gave.. more than 20.

It should also be noted that if you do click to page 2 or 3 then you mostly just get page 1 mixed with some other random results. I know it'd take some more time to load 100 results but damnit if I ask for 100 then I want 100 not 20.
Tags: No tags attached.
Attached Files:

Relationships:

Notes
luc (reporter)
2015-12-16 21:36

Not reproduced with YaCy 1.83/9586: you get the maximum number of results you asked for.
I do not know which commit corrected this.
BuBu (developer)
2015-12-16 23:38

I also cannot reproduce it: with or without the option set, I get the requested number of items per page.
oyvinds (reporter)
2016-01-04 02:54

This is _still_ a bug on YaCy 1.83/9616.

1) Have 50 search results per page selected in the administration interface.
2) Go search.
3) Get the 50 results as expected.

4) Clear your cookies or visit your YaCy node with another browser or device (and don't log in).
5) Search.
6) Enjoy 20 results, not 50.

Feel free to try this on an image search: try to zoom in on the thumbnail you get and enjoy no image, only a white blank square, when you're not logged in.

This bug is not fixed, just ignored. idk if nobody ever tests YaCy without logging in (there's no "logout button", but there should be one next to re-start and shutdown imho) but there are quite a few bugs when you're not.
luc (reporter)
2016-01-04 13:57

OK, I can reproduce it with YaCy 1.83/9635 when requesting from an unauthenticated external browser.
I reported it as working because I tested against my local peer from a local browser (for YaCy, local requests have the same credentials as an authenticated user). It is my main use case, and I think the easiest to test.
But you are right, we must be careful to also test from an external client.

By the way, I cannot reproduce the image zoom-in issue (my external browser is Firefox 43 on Win32).
oyvinds (reporter)
2016-01-04 17:59

Regarding the image thing, you can try it at http://yoona.everdot.org/

The image links generated (what you should be able to view when you zoom in but can't) are:

http://yoona.everdot.org:8090/ViewImage.png?code=SUcIOadYkdPY&isStatic=true&url=http://www.soompi.com/es/files/2015/10/YoonA.jpg

..and these links return a 500 server error for some reason. It works just fine when I log in.

I do not run YaCy on my desktop, it's on the NAS and I would like to be able to "just use it" without logging into it on any device used at any location. Then again I'd also like it to return actual useful results & a cute Korean girlfriend too but one can't have it all.
oyvinds (reporter)
2016-01-04 18:06

When I look at it, I'm wondering why YaCy would proxy pull the image to begin with.

Couldn't it just link directly to the image in question?

Why does it put itself in the middle (and not actually deliver)?

I am guessing this ?code= part is there to prevent YaCy nodes from being abused as random proxies (for images, anyway). Doesn't this just solve a problem that's there because it's in the middle/in the way?
luc (reporter)
2016-01-06 23:30

OK, I reproduced the image problem on your nice search page, and also on my peer (my previous network configuration was wrong).
YaCy ViewImage is used to render image previews for 2 main reasons:
 - extending image format support without relying on possible browser plugins (see http://forum.yacy-websuche.de/viewtopic.php?f=8&t=5689)
 - preventing copyright issues for non-authenticated users: embedding unauthorized full third-party images on a web page is not a problem for local or personal use but might be illegal for a public peer (https://en.wikipedia.org/wiki/Copyright_aspects_of_hyperlinking_and_framing#Inline_link).
The code parameter ensures you are coming from a search request and that the generated image link can only be used once. As far as I understand it, it is intentional that non-authenticated users cannot display full image previews. That said, for them at least the same thumbnail could be displayed...
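
To illustrate the single-use idea behind the code parameter, here is a minimal Python sketch. It is not YaCy's implementation (YaCy is written in Java); the class `OneTimeCodes` and its methods are hypothetical, and the only behavior taken from the note is that a code is issued with a search result and can be used once.

```python
import secrets


class OneTimeCodes:
    """Hypothetical registry of single-use codes for image links."""

    def __init__(self):
        # Maps each issued code to the image URL it was issued for.
        self._issued = {}

    def issue(self, image_url):
        """Create a code while rendering a search result."""
        code = secrets.token_urlsafe(9)
        self._issued[code] = image_url
        return code

    def redeem(self, code, image_url):
        """Accept a code once, and only for the URL it was issued for."""
        # pop() removes the code on any attempt, so a second request fails.
        return self._issued.pop(code, None) == image_url
```

In this model a link copied from a results page stops working after its first use, which is one way a peer could avoid serving as a general image proxy.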
luc (reporter)
2016-01-08 23:37

I made some improvements concerning the image issue: https://github.com/yacy/yacy_search_server/pull/39

Note it doesn't fix the pagination issue.
luc (reporter)
2016-01-09 00:35

The root of the pagination problem is here: https://github.com/yacy/yacy_search_server/blob/e8256bb3b1de7a943a81ddd987eeabf09b7fa315/htroot/yacysearch.java#L229

To summarize:
 - the maximum number of results per page is the lowest value among any specified parameter, the configuration... and 20 or 1000 when not authenticated, depending on the cache strategy
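
As an illustration only, the rule summarized above can be sketched in Python (YaCy itself is written in Java, and this is not its code). The function `items_per_page` and its parameters are hypothetical. The note does not say which cache strategy yields 20 and which yields 1000, nor how authenticated requests are capped, so both are assumptions here. The sketch also assumes that a request parameter, when present, takes precedence over the configured default before the cap is applied.

```python
from typing import Optional

# Caps for unauthenticated requests, taken from the note above.
UNAUTH_CAP_LOW = 20
UNAUTH_CAP_HIGH = 1000


def items_per_page(requested: Optional[int], configured_default: int,
                   authenticated: bool, low_cap_strategy: bool) -> int:
    """Return the number of results a page would show (hypothetical model)."""
    # Assumption: an explicit request parameter overrides the configured default.
    wanted = requested if requested is not None else configured_default
    if authenticated:
        # Assumption: no additional cap is modeled for authenticated users.
        return wanted
    # Assumption: a boolean stands in for "depending on the cache strategy".
    cap = UNAUTH_CAP_LOW if low_cap_strategy else UNAUTH_CAP_HIGH
    return min(wanted, cap)


# Configured for 50, not logged in, low-cap strategy: the page shows 20.
print(items_per_page(None, 50, authenticated=False, low_cap_strategy=True))
```

Under these assumptions the reported symptom follows directly: a peer configured for 50 results returns 50 to a logged-in (or local) browser and 20 to an anonymous one.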

I see it as a way of protecting against DoS attacks from non-authenticated users who may otherwise search for terms with many results and set the maximumRecords parameter to a large value. But:
 - 20 is quite low: shouldn't the owner of the peer be able to choose whatever maximum records value he wants? I suggest two values should be offered on the config page: one for authenticated users, and one for non-authenticated users.
 - the API search has no such protection: in my opinion the same rule should apply.

Any suggestions?
oyvinds (reporter)
2016-01-09 15:00

- I personally see no good reason to have any configuration options for the maximum number of search results allowed unless there are real use cases for grabbing more than 1000, in which case there should be 1 configuration option for this. If someone visits my peer and selects 100 in the search interface then they should get 100. As for more than 100, like 1000: I'm not sure what the real-world use cases for wanting 1000 on a single page would be. Perhaps there is one.
- If DoS is a concern then there are other solutions for this, like not being able to do a search more than once every five seconds or something like that.
- The _default_ number of search results returned can already be configured once, and that option should just work as expected. If I set it to 50 then I want 50 from my tablet and laptop (where I never log in, why would I, do those stupid people who use Google log in just to ensure that Google does additional tracking and spying before they search? if they watch more than an hour of TV per day.. probably?).
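
The "one search every five seconds" idea from the list above could look roughly like the following Python sketch. Nothing here is taken from YaCy: the class `SearchThrottle`, the choice of a per-client key (for example an IP address) and the injectable clock are all assumptions made for illustration.

```python
import time


class SearchThrottle:
    """Hypothetical per-client limiter: one accepted search per interval."""

    def __init__(self, min_interval_s=5.0, clock=time.monotonic):
        self.min_interval_s = min_interval_s
        self._clock = clock  # injectable so the behavior can be tested
        self._last_search = {}  # client id -> time of last accepted search

    def allow(self, client_id):
        """Return True if this client may search now, and record the search."""
        now = self._clock()
        last = self._last_search.get(client_id)
        if last is not None and now - last < self.min_interval_s:
            return False  # too soon; the caller would reject or delay the search
        self._last_search[client_id] = now
        return True
```

A limiter like this bounds how often an anonymous client can trigger a search, independently of how many results each search may return.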

Regarding the image fix: Thank you but I'm not happy with it at all. Showing a thumbnail when you click on a thumbnail? That's a joke. It does not change or fix image search being broken unless you log in and I don't log in to the administration interface unless I want to .. well, administer something.

I am not entirely sure why there should be so many restrictions and handcuffs unless you login. A configuration option like "[ ] Allow public usage of peer for search", perhaps? And if I click that and go to the search page then it just works regardless of where I am and what device I'm on? And if someone doesn't click that then the search page just shows a login box? ofc if you don't want people to use your yacy peer then you can simply not open a port for it..

ofc I probably see things a bit differently since I fundamentally see ever-slightly-increasing search-engine censorship as a problem and think even those who couldn't install YaCy if their life depended on it should be able to do uncensored searches.
luc (reporter)
2016-01-11 10:21

- It is easy to search for terms returning more than 1000 results: simply search for "wikipedia", "twitter" or "chicago"... A user may in that case (intentionally or by mistake) type a value over 1000 for the maximumRecords parameter.
- Search requests already have a timeout. Limiting the maximum number of results was maybe not added out of DoS concerns, but to have consistent results: I did some more tests, and on a medium-performance computer it currently appears difficult to reach 1000 results within the timeout.
- Is it really a problem to log in to your own YaCy peer? It is not comparable to logging in to Google or any other search engine company to perform searches.

- Showing a thumbnail when you click on a thumbnail: not a joke. It is better than having a blank square, and you still have links if you want to see the full-size image. Displaying full-size third-party images without authorization may be considered copyright infringement in many countries. The pull request I made is not intended to break the current YaCy copyright policy.

But I agree a peer owner should be able to configure it more finely, for example to choose between full access, thumbnails only, or links only for image previews. In my opinion this needs some feedback from other developers and users. I think we are not talking about censorship here, only trying to ensure that the default YaCy behavior is compatible with most countries' copyright laws.
oyvinds (reporter)
2016-01-11 14:20

I did not mean to imply that YaCy or devs do censorship.
Everyone else (Google/Microsoft/"news outlets") do it and this is why
I see YaCy as important.

I live in a country where the police come and steal your hardware
regularly when you run Tor/Bitcoin/YaCy. Devices that can't be
locked up securely (tablets etc) therefore don't/can't
store passwords (or anything) and my YaCy password is a very
long string of random. This makes logging into YaCy a problem.

A login requirement poses a lot of other problems. It means
you can only use YaCy on your own devices. You can't visit
someone and show them how it works without giving them your
password. If the person you visit uses Windows then you would
hand the password over to Microsoft and NSA and GCHQ and the
world as well.

Regarding Copyright, I do not see anyone going after
Google Image search https://www.google.com/imghp for
violating Copyright. This does not mean it can't happen
to YaCy operators since fascist regimes like the ones most
of the EU are made up of tend to selectively use laws against
individuals and let corporations do as they see profitable.

"We must do more to reduce the immense volume of material
that is online and so easily accessible to our citizens."
Perhaps it is wise to not give them excuses to go after
YaCy operators. I personally don't care, though, I have to
replace my hardware regularly anyway.
BuBu (developer)
2016-01-11 22:41

Hi guys, nice discussion, maybe something to move over to the forum.
FYI: I'll tackle the main topic, "20 is not 50" (in view of the points made: config consistency versus performance/DoS risk).
luc (reporter)
2016-01-12 00:56

Thank you BuBu, I started a new forum topic: http://forum.yacy-websuche.de/viewtopic.php?f=23&t=5735
BuBu (developer)
2016-01-17 01:58

Search now accepts a parameter for up to 100 results per page, regardless of login status.
Commit: https://github.com/yacy/yacy_search_server/commit/4765e374e65d1a8dd264ff0bb6e3eaa7dc1973f5
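
One way to check this from a client that is not logged in is to request a search with an explicit page size. The Python sketch below only builds such a URL. `maximumRecords` is the parameter named earlier in this thread; the `yacysearch.html` path, the `query` parameter name and the example host and port are assumptions to verify against your own peer.

```python
from urllib.parse import urlencode


def search_url(peer_base, query, max_records):
    """Build a search URL with an explicit page size (hypothetical helper)."""
    # Assumption: the search page is /yacysearch.html and takes "query".
    params = {"query": query, "maximumRecords": max_records}
    return f"{peer_base.rstrip('/')}/yacysearch.html?{urlencode(params)}"


print(search_url("http://localhost:8090", "chicago", 100))
```

Opening the resulting URL in a browser with no session cookies should, if the fix behaves as described, return up to 100 results instead of 20.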

Issue History
Date Modified | Username | Field | Change
2015-12-13 22:54 | oyvinds | New Issue |
2015-12-16 21:36 | luc | Note Added: 0001178 |
2015-12-16 23:38 | BuBu | Note Added: 0001179 |
2015-12-16 23:38 | BuBu | Status | new => closed
2015-12-16 23:38 | BuBu | Resolution | open => unable to reproduce
2016-01-04 02:54 | oyvinds | Note Added: 0001187 |
2016-01-04 02:54 | oyvinds | Status | closed => feedback
2016-01-04 02:54 | oyvinds | Resolution | unable to reproduce => reopened
2016-01-04 13:57 | luc | Note Added: 0001188 |
2016-01-04 17:59 | oyvinds | Note Added: 0001189 |
2016-01-04 17:59 | oyvinds | Status | feedback => new
2016-01-04 18:06 | oyvinds | Note Added: 0001190 |
2016-01-06 23:30 | luc | Note Added: 0001194 |
2016-01-08 23:37 | luc | Note Added: 0001195 |
2016-01-09 00:35 | luc | Note Added: 0001196 |
2016-01-09 15:00 | oyvinds | Note Added: 0001197 |
2016-01-11 04:28 | BuBu | Assigned To | => BuBu
2016-01-11 04:28 | BuBu | Status | new => assigned
2016-01-11 10:21 | luc | Note Added: 0001198 |
2016-01-11 14:20 | oyvinds | Note Added: 0001199 |
2016-01-11 22:41 | BuBu | Note Added: 0001200 |
2016-01-12 00:56 | luc | Note Added: 0001201 |
2016-01-17 01:58 | BuBu | Note Added: 0001204 |
2016-01-17 01:58 | BuBu | Status | assigned => resolved
2016-01-17 01:58 | BuBu | Resolution | reopened => fixed
