View Issue Details
ID | Project | Category | View Status | Date Submitted | Last Update
0000648 | YaCy | Wishlist - Wunschliste | public | 2016-03-24 13:36 | 2016-03-24 13:37
Reporter | b0b3r
Assigned To |
Priority | normal | Severity | feature | Reproducibility | N/A
Status | new | Resolution | open
ETA | none
Platform | | OS | | OS Version |
Product Version | YaCy 1.8
Target Version | | Fixed in Version |
Summary | 0000648: Black/white-list entries and priorities.
Description | For some use cases it would be useful to support whitelisting and to prioritize rules. A rule could be written in CSV format:

<PRIO>,<B/W>,<RULE EXPRESSION TEXT>

The rules would be checked in descending order of priority. The first rule that matches wins and decides whether the URL is whitelisted (W) or blacklisted (B).

Example scenario: I want only pages from ".pl" domains to be added to the index. So I start by blacklisting everything:

10,B,".*"

Then I unblock the ".pl" domains with a whitelist rule that has a higher priority than the block-all rule:

20,W,"*.pl/.*"

But I do not want domains containing the substring 'porn', even under ".pl". I blacklist them with a higher-priority rule:

30,B,".*porn.*\/.*"

However, I also want the index to include the pages of the music band "Pidżama Porno", whose address "http://pidzamaporno.art.pl/" contains the substring 'porn'. So I add a whitelist rule with an even higher priority:

40,W,"pidzamaporno.art.pl/.*"

This approach makes even the most complex scenarios easy to implement. It should also be relatively inexpensive in terms of CPU usage: it only requires sorting the ruleset by its numeric priority values, and after the first hit there is no need to check the rest of the ruleset.
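
The proposed evaluation order (sort by descending priority, first match decides) can be illustrated with a minimal Python sketch. This is not YaCy code and not an existing YaCy API. The names `Rule`, `load_rules` and `decide` are hypothetical. The issue does not specify the pattern syntax, so this sketch assumes each rule expression is a Python regular expression matched against the whole "host/path" string, and the example rules are adapted to that syntax accordingly. Returning `None` when no rule matches is also an assumption.

```python
import csv
import io
import re
from typing import NamedTuple, Optional


class Rule(NamedTuple):
    priority: int
    action: str          # "B" = blacklist, "W" = whitelist
    pattern: re.Pattern  # assumption: rule text is a Python regex


def load_rules(text: str) -> list[Rule]:
    """Parse <PRIO>,<B/W>,<RULE EXPRESSION TEXT> lines, highest priority first."""
    rules = []
    for row in csv.reader(io.StringIO(text)):
        if not row:  # skip blank lines
            continue
        prio, action, expr = row
        rules.append(Rule(int(prio), action.strip().upper(), re.compile(expr)))
    rules.sort(key=lambda r: r.priority, reverse=True)
    return rules


def decide(rules: list[Rule], target: str) -> Optional[str]:
    """Return the action of the first matching rule, or None if nothing matches."""
    for rule in rules:
        if rule.pattern.fullmatch(target):
            return rule.action  # first hit wins; remaining rules are not checked
    return None


# The scenario from the description, rewritten as Python regexes (assumption).
RULESET = r'''
10,B,".*"
20,W,".*\.pl/.*"
30,B,".*porn.*/.*"
40,W,"pidzamaporno\.art\.pl/.*"
'''

if __name__ == "__main__":
    rules = load_rules(RULESET)
    for url in ("example.com/index.html", "example.pl/",
                "freeporn.pl/x", "pidzamaporno.art.pl/"):
        print(url, "->", decide(rules, url))
```

With these rules, only the ".pl" hosts without 'porn' and the explicitly whitelisted band site come out as "W". Everything else falls through to the block-all rule at priority 10.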
Tags | No tags attached.
Attached Files |
Date Modified | Username | Field | Change |
2016-03-24 13:36 | b0b3r | New Issue |