Michael Peter Christen
046d7de95b
Merge remote branch 'reger/master'
13 years ago
reger
a95f645a61
Bugfix class repository.Loaddispatcher fixed download file limit of 10000
line 355: final Response response = this.load(request, cachePolicy, 10000, true);
13 years ago
Michael Peter Christen
ef78f22ee1
performance hack
13 years ago
Michael Peter Christen
41536eb4a2
performance hack
13 years ago
Michael Peter Christen
f91487fc50
added delete-button for host navigation
13 years ago
Michael Peter Christen
e8d24fd802
author navigator can be switched off
13 years ago
Michael Peter Christen
558ab7bd4e
made the protocol navigator reversible
13 years ago
Michael Peter Christen
96cb75f1d4
made the filetype navigator be able to deselect the search constraint
13 years ago
Michael Peter Christen
9ebcae2fbc
enhanced url parser to understand urls with &amp; instead of & in post
urls
13 years ago
Michael Peter Christen
1f4f60654a
Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
Conflicts:
source/net/yacy/document/parser/pdfParser.java
13 years ago
reger
32104360ce
PDFParser - return at least first 3 pages of PDF
fix for pdf parsing without returning parsed text due to interruption by
time out.
13 years ago
Michael Peter Christen
ef5192f8c9
using the generic document parser for crawl starts instead of the html
parser. This makes it possible that every type of document can be a
crawl start point, not only text documents or html documents. Tested
this with a pdf document.
13 years ago
Michael Peter Christen
a02fdf8625
better error messages
13 years ago
Michael Peter Christen
eadb58dd87
small enhancements in pdf parser
13 years ago
Michael Peter Christen
c6ba44468e
timeout = 5000 instead of 3000
13 years ago
reger
b616de5973
PDFParser - return at least first 3 pages of PDF
fix for pdf parsing without returning parsed text due to interruption by time out.
13 years ago
Michael Peter Christen
e6d26a023f
fix for bookmark crash with possible side-effects on crawl start after
the crash
13 years ago
Lotus
c73af39e54
refactoring of tray icon class,
now uses Java 6 methods natively
13 years ago
Michael Peter Christen
4eff0e26f1
npe bugfix
13 years ago
low012
8776b84c10
*) small fix to make password change function of reconfigureYACY.sh work
again
13 years ago
Michael Peter Christen
190b77c55e
added Ukrainian translation
13 years ago
Michael Peter Christen
1a0b6b3913
get more navigation details to search results
13 years ago
Michael Peter Christen
7f9b6b7a0c
added switches to ConfigParser to accept/deny documents by their
extension
13 years ago
Michael Peter Christen
4901cee3cc
suppress auto-tagged subject entries when sending out or receiving
metadata from other peers
13 years ago
Michael Peter Christen
83009d86f7
added the vocabulary navigator. It can be very simply tested by
switching on the locale dictionaries.
13 years ago
sixcooler
985b78cf89
correct 'available()' to use max of young / eden
13 years ago
sixcooler
4da8746275
Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
13 years ago
sixcooler
c9aaa9e00a
respect non-reserved Memory in GenerationMemoryStrategy
and enable it again
13 years ago
Michael Peter Christen
37f2d1b3e9
replaced Thread initialization with ExecutorService pool for delete
method. This is much faster and produces less blocking when using the
Compressor class which is used by the HTCache. I.e. picture search is
much faster now.
13 years ago
Michael Peter Christen
a58dc4a91f
added autotagging to document condenser:
- tags that are automatically generated now enrich the dc:subject
- auto-generated tags have a '$' at the beginning of the tag
- auto-generated tags lead the tag name with a vocabulary name
each tag has the form
$<vocabulary-name>:<tag-printname-space-replaced-by-'_'>
13 years ago
Michael Peter Christen
0d6176804b
emergency disabling of GenerationMemoryStrategy because of non-working
available-method
13 years ago
Lotus
411aab02e3
Windows installer now detects reliably whether YaCy runs. A file lock on
the yacy.running file has been implemented.
13 years ago
Michael Peter Christen
87f0210480
enriched log output to find NPE in HeapReader
13 years ago
Michael Peter Christen
987b412491
updated solr scheme: generic declaration of solr schemes
13 years ago
Michael Peter Christen
254adea51c
small fixes
13 years ago
Michael Peter Christen
49be60a7c8
WorkflowProcess is forced to make small pauses if shortMemoryStatus is
reached.
13 years ago
Michael Peter Christen
b7bb84c0bb
set a limit to CharBuffer object size to fight against bad/too large
content
13 years ago
Michael Peter Christen
c602eaaf46
enhanced search process
13 years ago
Michael Peter Christen
087f97d4c0
less noise if a browser cannot be opened
13 years ago
Michael Christen
eff966f396
fix for search process (it was aborted too early during remote search)
13 years ago
Michael Christen
e6d51363ee
Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
13 years ago
Marek Otahal
a231d0eeb9
Run from Java the whole app YACY
start for java webStart
allow for better integration with IDE
Conflicts:
source/net/yacy/gui/framework/Browser.java
13 years ago
Marek Otahal
72adbeae90
!Important: move from Hashtable to HashMap
Hashtable is an obsolete v1 collection; the v2 collections now offer HashMap with the same or better
functionality. Please review; almost all code was already moved, so there are only a few changes. That is not the issue,
but I found notes that some (ugly, big) helper classes had to be created in the past
to compensate for missing Hashtable functionality. I'd like input on whether we can remove some of them.
Look for //FIX: in these commits
Signed-off-by: Marek Otahal <markotahal@gmail.com>
13 years ago
Marek Otahal
c1af123ddd
just a little faster toString
Signed-off-by: Marek Otahal <markotahal@gmail.com>
13 years ago
Marek Otahal
64e4bcee82
serverSwitch get(App/Data)Path() use common helper method
Signed-off-by: Marek Otahal <markotahal@gmail.com>
13 years ago
Marek Otahal
371fbb4deb
just comment + shorter code in serverSwitch
Signed-off-by: Marek Otahal <markotahal@gmail.com>
13 years ago
Marek Otahal
ed253b7aff
update javadoc, does not throw IOException
Signed-off-by: Marek Otahal <markotahal@gmail.com>
13 years ago
Marek Otahal
f40efb39af
Blacklist loadList() remove duplicates by using Set
Signed-off-by: Marek Otahal <markotahal@gmail.com>
13 years ago
Marek Otahal
f75b5e40e0
little fix in copy()
Signed-off-by: Marek Otahal <markotahal@gmail.com>
13 years ago
Marek Otahal
1dc5d9f0f3
make ConnectionInfo comparable and sort list of connections in Connections_p
ConnectionInfo compare by initTime
Connections_p implement wish to sort connections, descending
Signed-off-by: Marek Otahal <markotahal@gmail.com>
13 years ago
Michael Christen
fa8da7f89d
vocabularies are now also used as source for a did-you-mean computation
13 years ago
Michael Christen
eaec14ecc4
Dictionaries from words caches can now be used as autotagging vocabulary
13 years ago
Michael Peter Christen
91940fdf56
redesign of WordCache to be prepared to hold multiple
independent dictionaries. Such dictionaries can then be also used as
simplified vocabularies.
13 years ago
Michael Christen
bd40a10230
added autotagging stub .. only reading and parsing of vocabularies at
this time
13 years ago
Michael Peter Christen
2ee8cbeb2c
Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
Conflicts:
source/net/yacy/search/Switchboard.java
13 years ago
Michael Peter Christen
992dbdf4bb
added noload statistic to servlets
13 years ago
Michael Christen
354b976110
fix for concurrency problem and endless loop in /suggest.json
13 years ago
Michael Christen
c21966bb43
fix
13 years ago
Michael Christen
13b05f9c08
fix
13 years ago
Michael Christen
e5d878c59e
Merge branch 'master' of ssh://gitorious.org/yacy/rc1
Conflicts:
source/de/anomic/crawler/CrawlQueues.java
13 years ago
Michael Christen
ec26b2bea4
Merge commit 'fa08ed5ae5d72bddc3cc6a662b23103579e86109' into quix0r
Conflicts:
source/de/anomic/crawler/CrawlQueues.java
13 years ago
Michael Christen
eebc02f5c1
fix
13 years ago
Michael Christen
216a287a85
Merge commit '6d4e08ed06c5cd28c45981b2ebe31c7f7ec6fd83' into quix0r
Conflicts:
source/de/anomic/crawler/CrawlQueues.java
13 years ago
stbrumm
d18095dc48
Patch for Issue 0000102
and fixes to Patch (private peer status is a property of a peer, not a
status)
13 years ago
stbrumm
9f1b1b4604
Type for Robinson-Mode/Private Peer added
13 years ago
Michael Christen
20962a4ed7
added metadata node stub for metadata from blobs
13 years ago
Michael Christen
575dbbaa93
enhancements in Blob retrieval: try to use less CPU resources by testing
a blob first that most certainly has wanted entries.
13 years ago
Michael Christen
585a8f3c44
fixed a bug in search sequence (caused empty results)
13 years ago
Michael Christen
361146dd7a
better error handling for file loader
13 years ago
Roland 'Quix0r' Haeder
6d4e08ed06
Rewrote filesize() to (hopefully) avoid a NPE, rewrote Blacklist class to concurrent classes to avoid a CME
13 years ago
Roland 'Quix0r' Haeder
901f37d608
Also this ... :( #2
13 years ago
Roland 'Quix0r' Haeder
a985717ed2
Also this ... :(
13 years ago
Roland 'Quix0r' Haeder
5f490de554
Fix for ported fix from my old days ...
13 years ago
Roland 'Quix0r' Haeder
fa08ed5ae5
Fixed a lot of CHMOD rights (no need for execute flag on *.java/*.html) and introduced local/remote crawl size ratio based check
13 years ago
Roland Haeder
319fd1f4aa
A concurrent access can happen on the blacklist (with latest introduced blacklist check in media snippet computation)
13 years ago
Roland 'Quix0r' Haeder
a3083d13bf
Blacklist checks are now always turned on, in media searches (e.g. image search) images matching blacklist entries are no longer shown to the user
13 years ago
Michael Christen
52184a1170
fix for search process
13 years ago
Michael Christen
85bd4cc8bc
better lookup for peer names
13 years ago
Michael Christen
20e3084bd4
redesign of finding of peers by ip: more lightweight method to read the
seed databases
13 years ago
Michael Christen
0797b0de99
new handling of remote search processes: looking for seeds will now not
block the whole search process any more. A deadlock with a DHT selection
process may have been the cause for interface lockings in the past.
13 years ago
Michael Christen
ee9aae5cc0
more about CreativeCommons license vocabulary
13 years ago
Michael Christen
ecd74fe34f
less dramatic upnp failures
13 years ago
Michael Christen
c75e1a3125
Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
13 years ago
Michael Christen
13f5b5f80d
the component part in the YaCy Metadata is filled using the Dublin Core
vocabulary
13 years ago
Michael Peter Christen
8d2cbfb685
more vocabularies and more semantics for lod data structures
13 years ago
Michael Christen
9cd36b4c44
added vocabulary for geolocalization as used in georss
13 years ago
Michael Christen
9e5894c784
Removed handling of components objects for URIMetadataRows.
This is a preparation to replace these rows with nodes from the node
store.
13 years ago
Michael Christen
66ab51f89d
added rdf vocabulary
13 years ago
Michael Christen
c04bfaa51b
refactoring
13 years ago
Michael Peter Christen
136b514f52
added a Triple Store based on Nodes that fit to the new storage classes.
Added also a first Vocabulary for the node store - Dublin Core.
13 years ago
Michael Peter Christen
613ab6a69d
added BEncodedHeapBag and BEncodedHeapShard which are storage containers
for a new metadata store. An abstraction of the content for this storage
is defined with MapStore. A MapStore is an abstraction of an RDF Node
store.
13 years ago
Michael Christen
6fecd0db88
one more performance hack to prevent costly md5 computation
13 years ago
Michael Christen
e13441b069
better digest pool size (smaller by default but unlimited)
13 years ago
Michael Christen
1f4afb4dc0
performance hacks
13 years ago
Michael Christen
675d557e88
removed debug logging
13 years ago
Michael Christen
e9dc99fe15
added rules to set specific RWIs as private RWIs which are not
transmitted to remote peers. This will be used for private index copies
and phonetic indexes.
13 years ago
Michael Peter Christen
4243ace863
added phonetic classes
13 years ago
Michael Peter Christen
0bcef2d156
added feature as requested in
http://forum.yacy-websuche.de/viewtopic.php?f=18&t=3461
The search can now be configured with a non-display host list.
The search will always exclude the given list of hosts unless they are
requested directly using the host navigation
13 years ago
Michael Christen
204c29f010
small bugfixes for search result display and cache display
13 years ago
Michael Christen
17f962fceb
translator updates:
- config string for chinese
- do not copy the language file to DATA/LOCALE any more (and do not use
them there, this is really confusing for new translators)
13 years ago
Michael Christen
752b092b8a
Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
13 years ago
Michael Christen
078fcde0dd
bad initialization
13 years ago
admin
391fc9bd57
Merge branch 'master' of gitorious.org:yacy/rc1
13 years ago
admin
23afee58fe
Merge branch 'master' of git://github.com/f1ori/yacy
13 years ago
Michael Christen
14e45e90fd
patch for a bug that I don't understand by now.
13 years ago
Michael Christen
3eccdca63c
protection against too long running snippet fetch processes
13 years ago
Michael Christen
86b3385847
fixed a deadlock during secondary remote search
13 years ago
apfelmaennchen
ff19fcdb28
bugfix for YMarks XBEL import and export; thanks to Dominic
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8138 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
Michael Christen
c715d19c09
fixes for dependency on svn
13 years ago
Michael Christen
404758698a
less io operations
13 years ago
Michael Christen
0bc5d76bee
ups
13 years ago
Michael Christen
044f83feed
added some pauses into the search process which shall produce
better-ranked search results. Without those pauses the result page will
only contain links from the peer that answers first which is not a good
average picture of all the peers that provided results
13 years ago
Michael Christen
943b670738
less terrible warning if uPnP fails
13 years ago
sixcooler
448656087a
probable fix for http://bugs.yacy.net/view.php?id=94
(don't know how to force this exception)
13 years ago
Michael Christen
f14faf503b
better ranking because we wait a little longer during the search
process to get better remote search results into the ranking priority
stack
13 years ago
Michael Christen
762e0ecfb6
fixed localization dictionaries, see
http://forum.yacy-websuche.de/viewtopic.php?f=5&t=3418&view=next
13 years ago
Michael Christen
6e66c9d7f1
fix for http://bugs.yacy.net/view.php?id=87
13 years ago
Michael Christen
d35bdc2df6
removed npe
13 years ago
Michael Christen
e7e429705a
- less automatic indexing after a search (needs to reset the default
crawl profiles)
- fix for concurrency problem in storage of serverSwitch Properties
- markup update
13 years ago
admin
a4ac051029
Merge branch 'master' of git://github.com/f1ori/yacy
13 years ago
low012
7cfdc2c092
Improved CGI capabilities:
*) CGI respects shebang now (should solve problems with MS Windows)
*) better error handling (more correct HTTP error codes)
*) logging
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8136 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
Michael Christen
9cd469e6d6
added pull request from als plus an NPE fix
13 years ago
admin
484c4ad339
Merge branch 'master' of git://github.com/f1ori/yacy
13 years ago
orbiter
402e9d71ef
changed ordering of release files: the main criterion is no longer the svn number; releases are now ordered by
- release number
- date
- svn number
additionally there is a new option to remove the svn number completely
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8135 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
admin
29f07fea33
Merge branch 'master' of git://github.com/f1ori/yacy
13 years ago
orbiter
11729061f2
added an option in the bookmark import process to put everything into the crawler
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8134 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
admin
b9c5ce6cae
Merge branch 'master' of git://github.com/f1ori/yacy
13 years ago
apfelmaennchen
70bcfc150a
- small bug fix to ymarks html importer
- import of delicious.com exports has successfully been tested
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8132 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
admin
56ce8488e4
Merge branch 'master' of git://github.com/f1ori/yacy
13 years ago
orbiter
4b8ff84705
- search bugfixes (page counter and number of results per page; recognition of new search)
- experiments to speed-up the network image production (commented out)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8130 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
sixcooler
aeeae75b8a
the timeout of httpclient is not absolute, but applies until a connection is
established or between bytes sent
trying this to reduce count of client-connections to /yacy/search.html
of other peers
13 years ago
hermens
2ac272cfbf
Fix for PeerSelection.seedsByAge() for big networks (>1000 Peers)
To get the most(least) recent peers search those with highest(lowest) LastSeen instead of the first by peerhash
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8129 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
apfelmaennchen
b5d9f631e3
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8128 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
orbiter
83335c3b09
fix for http://bugs.yacy.net/view.php?id=78
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8127 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
Al Sutton
39898cb94a
Added try/finally protection to ensure streams are closed. Added initial size guess for the CharBuffer
13 years ago
Al Sutton
4c67a964a1
Added try/finally protection to ensure streams are closed. Added initial size guess for the CharBuffer
13 years ago
Al Sutton
3f9b9f953f
Added close() to ensure buffer close actions are invoked
13 years ago
Al Sutton
d73c84f9a0
Allow initial buffer size definition in TransformWriter, and use available() method to set it in htmlParser. In this situation a ByteArrayInputStream is used so the available() method gives a good size estimation and avoids the buffer needing to be continually grown
13 years ago
Al Sutton
f02ea27b31
Added missing closure of ByteArrayInputStream
13 years ago
orbiter
0796b54601
- some speed hacks for network image
- panic patch for 'AD' hashes until it is clear where the problem comes from
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8126 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
orbiter
f9216e388c
- faster ping to clean up old peers faster
- clean up more news
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8125 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
orbiter
35a9e8f307
- fixed network graphic
- debugged evaluation tables
- changed cache settings in template engine
- some speed hacks
- changed int angles for peer positions in network graphic to double angles
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8124 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
Al Sutton
8993cac4d8
Initial performance improvements
13 years ago
orbiter
d9c066227a
fix for npe
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8122 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
orbiter
8895d8c1cd
removed unnecessary log entries
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8117 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
orbiter
550c881d80
remove more news (all older than one day) because they can be a performance problem if we have too many peers sending news
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8112 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
orbiter
ebd840ebf6
- enhanced description on search front page
- fixed language and heuristic modifier
- added hint to crawl start that we can do also ftp and smb crawls
- added a protocol extension to remote crawls to transport all search modifiers to remote peers
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8108 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
apfelmaennchen
77a080ced9
smaller fixes for YMarks
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8105 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
orbiter
e22f8497c9
- tested the ARC methods
- removed strict authentication (if password is empty; this was buggy and not useful; can be switched on if necessary globally and not for each interface method)
- increased speed of CrawlResults page (no dns lookup any more)
- increased speed of favicon display (removed dns lookup)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8104 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
orbiter
bc5df0eef5
updated ranking tables (fresh computation)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8103 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago