reger
66f6797f52
make config search page layout closer to actual page appearance
11 years ago
sixcooler
5b1c4ef191
Monitor and limit the connection count for Jetty
11 years ago
orbiter
ce1dbfeb0f
fix appearance of image search thumbnails.
11 years ago
orbiter
6daae59479
switch on core.service.rwi when switching back from portal mode to p2p mode
11 years ago
Michael Peter Christen
f0db501630
better handling of ranking parameters and new default values for date navigation which is done using ranking in solr.
11 years ago
Michael Peter Christen
2520590b45
migrated from pdfbox 1.8.4 to 1.8.5. They have a very long bugfix list for that update:
http://www.apache.org/dist/pdfbox/1.8.5/RELEASE-NOTES.txt
11 years ago
Michael Peter Christen
6634b5b737
debug code for index distribution testing
11 years ago
Michael Peter Christen
89e13fa34e
fixed bug in test function
11 years ago
Marc Nause
4723329e29
Improved blacklist XML/JSON API.
11 years ago
reger
f91b2f51ae
fix: Load_RSS remove feed has too many parameters for GET; use form POST method instead
11 years ago
orbiter
c028ae9b09
Merge branch 'master' of git@gitorious.org:yacy/rc1.git
11 years ago
reger
e31493e139
"Use remote proxy for yacy" has no function, remove option and related config item
see/fix bug http://mantis.tokeek.de/view.php?id=23
http://mantis.tokeek.de/view.php?id=189
11 years ago
reger
89e2c5e884
fix: allow enabling of CrawlStartExpert.html #file
11 years ago
reger
1b37b12998
fix: CrawlStartExpert.html # From File with missing filename
- crawlName must not be empty
- crawlingFile must not be empty
11 years ago
orbiter
0d8072aa99
removed warnings
11 years ago
orbiter
be7c99dbe8
switched menu position of ConfigPortal.html and ConfigSearchBox.html
11 years ago
Michael Peter Christen
a1ac4c3b76
automatically clear graphics cache
11 years ago
reger
f87ac716f3
improve IndexDeletion by query
transparently adding text_t as pseudo default search field if no fieldname (no ':') is included.
addressing bug report http://mantis.tokeek.de/view.php?id=274
11 years ago
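For illustration, a minimal Java sketch of the fallback described above (hypothetical helper and method names, not the actual IndexDeletion servlet code): if the delete query names no Solr field, i.e. contains no ':', the input is transparently treated as a full-text query on text_t.

public final class DeleteQueryHelper {

    // Hypothetical helper: prepend the pseudo default field text_t
    // when the query contains no explicit field name (no ':').
    static String withDefaultField(String query) {
        if (query.indexOf(':') >= 0) return query;   // field already given, e.g. "host_s:example.org"
        return "text_t:" + query;                    // fall back to the full-text field
    }

    public static void main(String[] args) {
        System.out.println(withDefaultField("yacy"));                // text_t:yacy
        System.out.println(withDefaultField("host_s:example.org"));  // unchanged
    }
}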
reger
e9060d31bd
update to Jetty 9
Besides adjustments in the code, this makes the servlet settings in web.xml significant.
This applies to the solr, gsa and proxy servlets. There is no longer a default setup in code during init (Jetty 9 checks for double definitions).
11 years ago
orbiter
b9c1a61814
added a peername=<peername> property in the seedlist API
11 years ago
orbiter
c637955e67
fix for navigation steering / p2p mode
see also:
http://forum.yacy-websuche.de/viewtopic.php?f=5&t=5198&p=29958#p29958
11 years ago
Marc Nause
f98ccf952f
Improved Blacklist API:
*) added JSON support
*) fixed Exception in case of missing parameters
*) renamed parameter for items in "add entry" and "delete entry" from
"entry" to "item" to match term in XML
11 years ago
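For illustration, a hedged Java usage sketch of the improved blacklist API. The servlet path Blacklist_p.json, the action value add_entry and the blacklist name url.default.black are assumptions; the parameter keys list and item follow the commit messages above (item renamed from entry, blacklist name passed as list).

import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

public final class BlacklistApiExample {
    public static void main(String[] args) throws Exception {
        // assumed servlet path, action name and blacklist name; "list" and "item" are the documented parameter keys
        String url = "http://localhost:8090/Blacklist_p.json?action=add_entry"
                + "&list=" + URLEncoder.encode("url.default.black", StandardCharsets.UTF_8)
                + "&item=" + URLEncoder.encode("example.org/.*", StandardCharsets.UTF_8);
        HttpResponse<String> response = HttpClient.newHttpClient().send(
                HttpRequest.newBuilder(URI.create(url)).GET().build(),
                HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body()); // JSON output, added by this change
    }
}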
reger
91bd384cf6
fix input-group layout on index.html
see bug http://mantis.tokeek.de/view.php?id=391
11 years ago
Marc Nause
0d88f292dc
Key for parameter "blacklist name" is "list" in all servlets now.
11 years ago
reger
80e0ee92e5
adjust search page layout - search box to current style
11 years ago
reger
a81dfc27eb
remove obsolete CSS class bookmarkfieldset
11 years ago
Michael Peter Christen
0898f0be17
input-group for main search input window
11 years ago
Michael Peter Christen
9bb616d778
enhanced HostBrowser buttons and fixed text input alignment
11 years ago
Michael Peter Christen
4a818ad72c
fix for strange fail reason
11 years ago
Michael Peter Christen
a2fba6584f
use submitted default userAgent if cloning a crawl
11 years ago
Marc Nause
e0822fa008
Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
11 years ago
Marc Nause
c97da1a0d8
First draft of a blacklist API.
11 years ago
reger
312972c586
add display filter (active/disabled) to IndexSchema_p.html config
for easier overview of schema fields
11 years ago
Michael Peter Christen
d79d7dde55
fix for result display
11 years ago
Michael Peter Christen
362c988c05
design fixes to better use the new colours
11 years ago
Michael Peter Christen
bbadccbd8d
better buttons
11 years ago
Michael Peter Christen
a9963d5c95
bootstrap update
11 years ago
reger
4e57000a40
remove redundant javascript & id in index.html
to set focus to query field in IE11
11 years ago
reger
121d25be38
recover from SAX fatal error on OAI-PMH import of XML with entity errors
This allows loading of the next resumptionToken to continue even if an import file caused a SAX parser error.
fixes http://mantis.tokeek.de/view.php?id=63
11 years ago
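A minimal Java sketch of the recovery idea (standalone, not the actual OAI-PMH importer code): a fatal SAX error raised by one XML chunk, for example an undefined entity, is caught per chunk so that the loop continues with the next resumptionToken instead of aborting the whole import.

import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.helpers.DefaultHandler;

public final class OaiImportSketch {
    public static void main(String[] args) {
        List<String> chunks = List.of(                       // one chunk per resumptionToken
                "<records><r>ok</r></records>",
                "<records><r>&badentity;</r></records>",     // triggers a fatal SAX error
                "<records><r>also ok</r></records>");
        for (String chunk : chunks) {
            try {
                SAXParserFactory.newInstance().newSAXParser().parse(
                        new ByteArrayInputStream(chunk.getBytes(StandardCharsets.UTF_8)),
                        new DefaultHandler());
                System.out.println("imported chunk");
            } catch (Exception e) {
                // previously a fatal entity/parser error aborted the import; now only this chunk is skipped
                System.err.println("skipping broken chunk: " + e.getMessage());
            }
        }
    }
}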
reger
81dc2aa536
add current css to HTMLResponseWriter to fix metadata view
(using css from metas.template except js links)
11 years ago
orbiter
c6f0bd05f8
better removal of stored urls when doing a crawl start
11 years ago
orbiter
469e0a62f1
added new button to terminate all crawls
11 years ago
orbiter
4ee4ba1576
fix for NPE in IndexCreateParserErrors_p.html caused by bad handling of lazy value instantiation of 0-value in crawldepth_i
11 years ago
reger
727dfb5875
refactor URIMetadataNode to further unify interaction with the index
- URIMetadataNode extending SolrDocument
- use language as stored (String), reducing conversion to string
- optimize debug code in transferIndex
11 years ago
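A rough Java illustration of the refactoring listed above (heavily simplified, not the actual YaCy class): URIMetadataNode extends SolrDocument so index results can be used directly, and the language stays the stored String instead of being converted back and forth. The field name language_s is an assumption.

import org.apache.solr.common.SolrDocument;

public class URIMetadataNode extends SolrDocument {

    public URIMetadataNode(SolrDocument doc) {
        // adopt all stored fields of the Solr result document
        for (String name : doc.getFieldNames()) {
            this.setField(name, doc.getFieldValue(name));
        }
    }

    // language as stored in the index, e.g. "en"; no conversion to another type
    public String language() {
        Object lang = this.getFieldValue("language_s"); // field name assumed
        return lang == null ? "" : lang.toString();
    }
}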
reger
2dabe2009d
- remove unused manual http KeepAlive config
(reducing references to obsolete httpdemon)
- add port info to settings_http
11 years ago
Michael Peter Christen
10cf8215bd
added crawl depth for failed documents
11 years ago
Michael Peter Christen
b4b0d14c04
fix for display bug
11 years ago
Michael Peter Christen
9a5ab4e2c1
removed clickdepth_i field and related postprocessing. This information is now available in the crawldepth_i field which is identical to clickdepth_i because of a specific crawler strategy.
11 years ago
Michael Peter Christen
da86f150ab
- added a new Crawler Balancer: HostBalancer and HostQueues:
This organizes all URLs to be loaded in separate queues for each host.
Each host separates the crawl depth into its own queue. The primary
rule for URLs taken from any queue is that the crawl depth is minimal.
This produces a crawl depth which is identical to the clickdepth.
Furthermore, the crawl is able to create a much better balancing over
all hosts which is fair to all hosts that are in the queue.
This process will create a very large number of files for wide crawls in
the QUEUES folder: for each host a directory, for each crawl depth a
file inside the directory. A crawl with maxdepth = 4 will be able to
create tens of thousands of files. To be able to use that many file readers, it
was necessary to implement a new index data structure which opens the
file only if access is wanted (OnDemandOpenFileIndex). The usage of
such an on-demand file reader shall prevent the number of file
pointers from exceeding the system limit, which is usually about 10,000 open
files. Some parts of YaCy had to be adapted to handle the crawl depth
number correctly. The logging and the IndexCreateQueues servlet had to
be adapted to show the crawl queues differently, because the host name
is attached to the port on the host to differentiate between http,
https, and ftp services.
11 years ago
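A condensed Java sketch of the balancing rule described above, using in-memory maps instead of the on-demand file-backed queues of the real HostBalancer/HostQueues classes: every host keeps one queue per crawl depth, and the next URL is always taken from the smallest non-empty depth, so the effective crawl depth stays minimal.

import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;
import java.util.TreeMap;

public final class HostBalancerSketch {

    // host (name + port) -> (crawl depth -> queue of URLs); TreeMap keeps depths sorted
    private final Map<String, TreeMap<Integer, Queue<String>>> hosts = new HashMap<>();

    public void push(String host, int depth, String url) {
        hosts.computeIfAbsent(host, h -> new TreeMap<>())
             .computeIfAbsent(depth, d -> new ArrayDeque<>())
             .add(url);
    }

    // takes the URL with the minimal crawl depth from the given host, or null if empty
    public String pop(String host) {
        TreeMap<Integer, Queue<String>> depths = hosts.get(host);
        if (depths == null) return null;
        while (!depths.isEmpty()) {
            Map.Entry<Integer, Queue<String>> lowest = depths.firstEntry();
            String url = lowest.getValue().poll();
            if (url != null) return url;
            depths.remove(lowest.getKey()); // drop exhausted depth queue
        }
        return null;
    }

    public static void main(String[] args) {
        HostBalancerSketch balancer = new HostBalancerSketch();
        balancer.push("example.org:80", 2, "http://example.org/deep");
        balancer.push("example.org:80", 0, "http://example.org/");
        System.out.println(balancer.pop("example.org:80")); // depth 0 comes out first
    }
}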
Michael Peter Christen
dd12dd392f
introduction of a data structure for HyperlinkEdges which should use less memory, as it does not store the source link twice for each edge of the graph.
11 years ago
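A minimal Java sketch of the memory-saving idea (simplified, not the actual HyperlinkEdges class): instead of repeating the source URL in every (source, target) edge object, each source is stored once as a map key and only the target links are kept in its value set.

import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

public final class HyperlinkEdgesSketch {

    // source URL -> set of target URLs; the source is stored only once
    private final Map<String, Set<String>> edges = new HashMap<>();

    public void addEdge(String source, String target) {
        edges.computeIfAbsent(source, s -> new LinkedHashSet<>()).add(target);
    }

    public Set<String> targets(String source) {
        return edges.getOrDefault(source, Set.of());
    }

    public static void main(String[] args) {
        HyperlinkEdgesSketch graph = new HyperlinkEdgesSketch();
        graph.addEdge("http://example.org/", "http://example.org/a");
        graph.addEdge("http://example.org/", "http://example.org/b");
        System.out.println(graph.targets("http://example.org/")); // [http://example.org/a, http://example.org/b]
    }
}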