reger
4c7e515769
correct Collection navigator - search servlet modifier parameter
...
(navigator entries are single collection names, spaces are removed by crawlstart)
preparation for abstraction of navigators
8 years ago
reger
af39a76bf6
Reduce number of default max. search navigator lines (from 10000)
...
to 100 + make it configurable
8 years ago
Sudheesh Singanamalla
065bcfba75
Merge pull request #88 from sudheesh001/Patch16
...
Fixes #16 Updates documentation about cloning and build from source
8 years ago
sudheesh001
d97da1ddb7
Fixes #16 Updates documentation about cloning and build from source
8 years ago
reger
20a1b29ed3
add simple test case for ReferenceContainer helpful for debugging
...
calculated ranking parameter
8 years ago
reger
3cc2af8f92
reduce the mix of absolute and relative internal html page links
...
(prefer relative for same pg or neighbors) to ease proxied access
e.g. http://mantis.tokeek.de/view.php?id=701
8 years ago
reger
3c7220bc7b
Refactor rwi reference word position and word distance calculation
...
used for rwi ranking.
Main changes:
- introduce a posintext() to access the stored value. This also reduces memory allocation of the position array for WordReferenceRow (index access)
- use the positions() array for joined references on multi-word queries if needed (otherwise allow positions() to be null)
- adjust assignments and the min() max() and distance() calculation accordingly
8 years ago
luccioman
f0639d810c
Customized name for Threads still using the default "Thread-n" pattern.
...
This makes threads monitoring easier to read.
8 years ago
luccioman
c0379c3cd3
Merge branch 'master' of https://github.com/yacy/yacy_search_server.git
8 years ago
luccioman
db3b9db9c2
Crawl from local file : faster task end when manually terminating crawl.
8 years ago
luccioman
78085fad8d
Fixed NullPointerException case.
...
As reported by @reger24 , search in Intranet mode was failing due to
this error.
8 years ago
reger
4c67ed3f8d
catch rwi ranking div by zero exception
...
during rwi search result processing, the word distance calculation is affected
by concurrent update (normalization) of the min/max ranking parameter for
word positions. On update of min/max the exception is raised in the distance calculation
and is now caught.
This concurrent update and change of ranking results is needed for speed
but should be further checked for optimization
8 years ago
luccioman
47af33a04c
Advanced Crawl from local file : better processing of large files.
...
Applied strategy : when there is no restriction on domains or
sub-path(s), stack anchor links once discovered by the content scraper
instead of waiting for the complete parsing of the file.
This makes it possible to handle a crawling start file with thousands of
links in a reasonable amount of time.
Performance limitation : even if the crawl starts faster with a large
file, the content of the parsed file is still fully loaded in memory.
8 years ago
luccioman
ee92082a3b
Updated javadocs : warning about closing stream responsibility.
8 years ago
luccioman
6f49ece22f
Fixed redirected URLs processing as crawl start point.
...
See mantis 699 (http://mantis.tokeek.de/view.php?id=699 ) for details.
8 years ago
reger
68217465fe
div by zero in word distance calculation
...
(again, description in http://mantis.tokeek.de/view.php?id=698 )
as root cause was not seen, added just workaround reducing in favour over a
try catch (for easier followup).
8 years ago
luccioman
7263d17436
Removed mentions of deprecated LURL-db.
...
Thanks to LA_FORGE asking about it on the YaCy forum (
http://forum.yacy-websuche.de/viewtopic.php?f=5&t=5895 )
8 years ago
luccioman
c3c4a52408
Added more examples in Blacklist JUnit test.
8 years ago
reger
8b74a6bf57
fix min/max calculation of WordReferenceVars.distance()
...
Issue was the calculation in AbstractReference with positions.clear() call,
this made distance result always 0 (distance needs min 2 positions) and created concurrency issues.
+ unit test of changes
8 years ago
luccioman
da362628fb
Added fine log level for too long blacklist matching processing.
8 years ago
reger
aaae7c6462
adjust ConcurrentScoreMap internal value map to interface and use parameter
...
Long -> Integer (saves some bytes)
8 years ago
reger
31d2a5645e
remove obsolete query variable
...
leftover from 8fb370d9f8 (diff-1d4259005ebfddc11083387857a86175)
harmonize ranking shift parameter to 0xFF
correct addresult weight parameter to long
8 years ago
luccioman
93ea366778
Updated license header file name
8 years ago
luccioman
4c0be4d5d4
Fixed maven compilation error
...
Removed unit test yacysearchitemTest from default maven Junit tests
path, as yacysearchitem class is not in maven build classpath.
8 years ago
reger
ba77e8f8ec
upd to Jetty 9.2.19
8 years ago
luccioman
a588ed7628
Applied image headers customization to the new ViewFavicon servlet.
8 years ago
luccioman
d16e57b41e
Merge pull request #39 from luccioman/master
...
Favicon retrieval and image preview enhancements.
More details on mantis 629 (http://mantis.tokeek.de/view.php?id=629 )
8 years ago
luccioman
7717a3d43d
Fixed license headers on files created to improve favicon management.
8 years ago
luccioman
6e1959f469
Merge branch 'master' of https://github.com/yacy/yacy_search_server.git
...
Conflicts:
htroot/yacysearchitem.java
source/net/yacy/cora/federate/solr/responsewriter/YJsonResponseWriter.java
source/net/yacy/search/schema/CollectionConfiguration.java
source/net/yacy/server/serverObjects.java
8 years ago
luccioman
7136b1ad60
HTML validation : fixed URL encoding of Pictures link.
8 years ago
reger
407563b9f0
add lock symbol to messages UI Trans menu item
8 years ago
reger
685d8e86bf
Avoid frequent data type casting (float/long) for rwi score
...
refactor to using long in URIMetadataNode too (and related call parameters)
As remote rwi scores are not used (since v1.83), skip reading float-score,
but keep in toString() for communication with older versions.
8 years ago
luccioman
3ccd89e274
Fixed MultiProtocolURL.resolveBackpath to handle remaining '..' segments
8 years ago
luccioman
f1f4459f88
Added some unit tests for Blacklist.isListed()
8 years ago
luccioman
4b699c469a
Blacklist refactoring : extracted a function for easier unit testing
8 years ago
luccioman
54cfcc3f56
CrawlCheck_p.html : also display info about disallowed URLs.
8 years ago
luccioman
8b341e9818
Robots : properly handle URLs including non ASCII characters
...
This fixes GitHub issue 80 (
https://github.com/yacy/yacy_search_server/issues/80 ) reported by
Lord-Protector.
8 years ago
luccioman
75bb77f0cb
Refactoring : extracted a method to handle authorized action links.
8 years ago
luccioman
c996b04741
HTML validation : fixed URL encoding of search results action links.
8 years ago
luccioman
2b81703828
Refactored search result action links construction.
...
These are long URLs with common parts : it is valuable to build the
common parts only once.
8 years ago
reger
e68b00678e
prevent negative score on URIMetadataNode - in the special case where no
...
solr score is supplied.
+ assert before use & test case
8 years ago
luccioman
242707f9b4
Fixed loadFromCache with strategy IFFRESH.
...
This fixes mantis 695 ( http://mantis.tokeek.de/view.php?id=695 ) :
crawl start with 'Link-List of URL' option on websites using cookies.
8 years ago
reger
c778219768
remove module for swfparser from maven parent pom
...
no longer required for the build
see a4465c97d6
8 years ago
luccioman
094aed8664
Merge branch 'master' of https://github.com/yacy/yacy_search_server.git
8 years ago
luccioman
c7402a2f89
Removed invalid empty form action.
...
A form action URL must not be empty (see
https://www.w3.org/TR/html/sec-forms.html#element-attrdef-form-action ).
No action attribute has the same effect (relaunching the same GET
action) but is valid HTML.
8 years ago
luccioman
37df2e19fd
Removed xmlns attribute which no longer makes sense in HTML5 pages.
8 years ago
luccioman
94924e288f
Added some accessibility improvements to the main interface.
...
Tested with NVDA screen reader.
8 years ago
luccioman
dd86f7c44e
Fixed HTML validation errors and grouped radio options in fieldsets
8 years ago
luccioman
fc0c72c84b
Switched to the short HTML Doctype
...
These pages were already no longer XHTML 1.0 because they made use of the HTML5
syntax and elements.
Applied current (2016) HTML standard recommended Doctype declaration
(see https://www.w3.org/TR/html/syntax.html#the-doctype ).
8 years ago
reger
7c81160f45
correct blacklist export as text url to blacklists_p.txt
...
was using servlet for network access and missing network.unit.name
fix for http://mantis.tokeek.de/view.php?id=694
+ prevent unresolved_pattern in yacy/list servlet
8 years ago