orbiter
415b92bb07
fix for http://forum.yacy-websuche.de/viewtopic.php?f=6&t=1148&hilit=&p=7711#p7711
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4795 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
apfelmaennchen
2113672bf2
small fix on tag comparator functions
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4794 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
orbiter
dd75b3cabc
- patch for bad profiles
...
- time-out when deleting profiles
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4793 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
danielr
2080ff72b7
ftpc fix for npe
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4789 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
lotus
e021278bf0
unescape link display in search results
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4788 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
danielr
3f1721b827
informational comment
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4787 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
danielr
04a51b775a
changed .org/.net back to America
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4786 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
danielr
74b1a60043
fixed "java.lang.NoClassDefFoundError: org/a"
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4784 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
danielr
f44d5d302b
updated TLDs
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4782 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
hermens
5bfc02ccfb
Repair publishThread
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4781 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
orbiter
f42c8cf69c
updated terminal and dynamic webstructure applet: can now change when crawl is running
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4780 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
orbiter
906c144799
- design update to new terminal and rssTerminal
...
- added terminal to main menu
- removed transfer size limitation in server
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4779 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
orbiter
7ec01d444a
fix for npe
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4778 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
orbiter
ad0f905124
fix for npe in crawler
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4777 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
danielr
ae03a54d23
pdfParser: updated lib, fixed ClassNotFoundException: CMSError
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4776 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
danielr
0d3808bd9e
minor refactoring
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4775 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
orbiter
719f5defb1
updated some graphics at new terminal_p
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4774 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
lotus
9bc56a9edc
xss protection
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4772 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
orbiter
b32736762c
enhanced rssTerminal
...
- 3 lines possible
- distinguishing of private and public data, if not authorized only public data is shown
- shows now more events, including local searches in clear text if user is logged in
- simplified peer events
- better recognition of 'real' new peers
- presentation of peer pings from other peers
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4771 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
orbiter
fbb712c669
refactoring:
...
moved importer classes to crawler and plasma package
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4770 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
orbiter
1689030ee8
refactoring: moved all crawler classes into their own package
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4768 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
orbiter
d2ba1fd2ab
major step forward to network switching (target is easy switch to intranet or other networks .. and back)
...
This change is inspired by the need to see a network connected to the index it creates in an indexing team.
It is not possible to divide the network and the index. Therefore all control files for the network were moved to the network within the INDEX/<network-name> subfolder.
The remaining YACYDB is superfluous and can be deleted.
The yacyDB and yacyNews data structures are now part of plasmaWordIndex. Therefore all methods, using static access to yacySeedDB had to be rewritten. A special problem had been all the port forwarding methods which had been tightly mixed with seed construction. It was not possible to move the port forwarding functions to the place, meaning and usage of plasmaWordIndex. Therefore the port forwarding had been deleted (I guess nobody used it and it can be simulated by methods outside of YaCy).
The mySeed.txt is automatically moved to the current network position. A new effect causes that every network will create a different local seed file, which is ok, since the seed identifies the peer only against the network (it is the purpose of the seed hash to give a peer a location within the DHT).
No other functional change has been made. The next steps to enable network switching are:
- shift of crawler tables from PLASMADB into the network (crawls are also network-specific)
- possibly shift of plasmaWordIndex code into yacy package (index management is network-specific)
- servlet to switch networks
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4765 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
danielr
d70a472460
added file for previous commit
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4764 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
danielr
d32fe84472
added default User-Agent
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4763 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
danielr
8c5f062e0b
corrected YaCy version in HTTP User-Agent
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4762 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
danielr
d7b21bc90c
re-added gzip POST for transferRWI/URL (HTTP/1.1 compliant)
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4761 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
low012
8d83febb95
*) BlacklistCleaner_p.java reports exception to log instead of System.err
...
*) changes in formatting for better readability in BlacklistCleaner_p.java
*) replaced test for necessary Java version (was 1.4.2, is 1.5 now)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4756 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
danielr
d4bce6affd
refactoring (initialized static fields, removed empty if/else, serialized some fields in serializable classes)
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4755 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
danielr
be2c9c07ff
escape some unescaped characters in URLs (fixes problems with proxy)
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4753 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
orbiter
d0678f7ab9
refactoring as result of
...
http://forum.yacy-websuche.de/viewtopic.php?f=6&t=959&p=7560#p7560
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4752 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
orbiter
483e9a2066
- shifted tld recognition methods from yacyURL to serverDomains
...
- changed isLocal Property in such a way that it is possible to see if a domain is in the internet (and not intranet)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4751 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
orbiter
a3df23659c
re-implementation of charset checking
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4750 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
orbiter
75a1702133
- fix for ConcurrentModificationException during shutdown
...
- fix for Ranking distribution problem (suma-lab peer does not exist any more)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4749 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
orbiter
32b5b057b9
- modified, simplified old kelondroHTCache object; I believe it should be replaced by something completely new
...
- removed tree data type in kelondroHTCache
- added new class kelondroHeap; may be the core for a storage object that will once replace the many-files strategy of kelondroHTCache
- removed compatibility mode in indexRAMRI
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4747 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
danielr
ec84a52adb
change for problem with NPE (seen as "PROXY Unknown Error while processing request")
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4745 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
orbiter
5813cc149f
fix for bad rssTerminal behavior
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4744 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
orbiter
88216c1f1f
fix for http://forum.yacy-websuche.de/viewtopic.php?f=6&t=1103&hilit=&p=7362#p7362
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4743 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
orbiter
d0b893523e
- protection against RAM overflow caused by new peer rss news
...
- more XSS protection
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4742 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
orbiter
685794e7e7
fix for parser/encoding Exception
...
see http://forum.yacy-websuche.de/viewtopic.php?f=5&t=1111&hilit=&sid=55a320b54e1e3bda9410e7c50b5147f1&p=7431#p7431
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4741 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
orbiter
cf042e6957
reverted change by mistake in yacyVersion
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4740 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
orbiter
9935e83c86
added new news window into the status page. At this moment it is just a test.
...
The news inside the window are about peer arrivals and departures, remote search accesses and crawls
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4739 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
orbiter
bac38cfa18
added very rudimentary peer news as rss feed. An example can be retrieved with
...
http://localhost:8080/xml/feed.rss?channel=PEERNEWS
to be extended and integrated in interface ...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4738 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
orbiter
6495227ad6
the class rssReader is replaced by RSSReader, RSSFeed and RSSMessage
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4737 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
orbiter
724bbdf9b2
refactoring of RSS reader
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4736 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
orbiter
b9a2a2d287
more search performance hacks
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4735 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
orbiter
ff755fb858
small corrections and enhancements after search timing profiling
...
search should be a little bit faster now
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4734 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
orbiter
e024e3b9cf
added new default profiles to distinguish snippet fetch for local and global search
...
the difference is that a local search will not cause a re-indexing of loaded pages
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4731 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
danielr
9b03310f8a
I'm awake now :/
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4729 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
danielr
7bd8601f04
delete old releases compatible with java 1.5 ;)
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4728 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
danielr
da386a1924
fixed deleteOldDownloads if there are no downloads
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4726 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
danielr
21418a22a3
removed DEBUG output
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4725 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
danielr
79a3edeeef
deleting downloaded releases after x days (default 30)
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4724 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
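The age-based cleanup described in the entry above ("deleting downloaded releases after x days (default 30)") could be sketched roughly as follows. This is a minimal illustration, not the YaCy implementation: the class `ReleaseCleanupSketch`, both method names, and all parameters are hypothetical. The null check covers a missing or unreadable download directory, which may or may not be the case the later "no downloads" fix addressed.

```java
import java.io.File;

// Hypothetical sketch; class, method, and parameter names are illustrative
// and are not taken from the YaCy source.
class ReleaseCleanupSketch {

    // True if a file last modified at lastModifiedMillis is older than
    // maxAgeDays when observed at nowMillis.
    static boolean isExpired(long lastModifiedMillis, long nowMillis, int maxAgeDays) {
        long maxAgeMillis = maxAgeDays * 24L * 60L * 60L * 1000L;
        return nowMillis - lastModifiedMillis > maxAgeMillis;
    }

    // Deletes expired regular files in dir and returns how many were deleted.
    static int deleteOldDownloads(File dir, long nowMillis, int maxAgeDays) {
        File[] files = dir.listFiles();
        if (files == null) {
            return 0; // directory missing or unreadable: nothing to delete
        }
        int deleted = 0;
        for (File f : files) {
            if (f.isFile() && isExpired(f.lastModified(), nowMillis, maxAgeDays) && f.delete()) {
                deleted++;
            }
        }
        return deleted;
    }
}
```

A caller would pass `System.currentTimeMillis()` as `nowMillis` and 30 as `maxAgeDays` to match the default named in the commit message.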
danielr
763f9d4f5d
serverCore: setting timeout for new connection before SSLDetect
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4723 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
orbiter
1995faef8d
- refactoring of Collage back-end: move to plasma package
...
- renamed also the plasmaCrawlResults to have a consistent naming for url and image queues
- added a double-check for the images
- added additional queues for the images: all worse-quality images go there, so the queue can be used also if no sizes are given; no image is lost
- added a cleanup for the stacks so they cannot flood the memory
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4722 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
orbiter
d7e89c2aca
fixed near-deadlock situation when deleting crawl profiles
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4721 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
orbiter
5e3ce46339
- better logging when rejecting a url because it is not in declared domain
...
- more XSS attack protection
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4720 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
danielr
48ffd61e6a
changed "patched wrong" to warning, so it goes to the logfile
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4716 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
orbiter
2f629d20a7
- tried to fix the '4217666-problem'
...
- removed more unused code
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4715 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
orbiter
512f48e7d6
- removed unused methods
...
- fixed xss attack on peer list in CrawlStartSimple
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4714 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
orbiter
3c76342619
- added servlet to configure the search page greeting line
...
- added information output about the current network definition in the network servlet
- better description and usage of profile entries in User Profile servlet regarding FOAF format
- reformatting of menus at status page
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4710 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
danielr
d1ee231866
HTTPC close more unused connections
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4702 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
danielr
181796cffb
- HTTPC: remove ConnectionInfo on exceptions, removed unnecessary code
...
- FTPC: always close (GET) connections on errors
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4701 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
orbiter
04c1226c80
added/fixed missing integrity-test else-case during deploy in case that we update with a tar file
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4700 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
orbiter
45ae3da7e7
another patch to prevent NPE in EcoTable
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4698 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
orbiter
cb93ded5c6
applied configuration path patch
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4697 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
danielr
96e39b297a
reduced StackTraces (by connect timed out)
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4696 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
orbiter
93376acdca
fixed a bad chunkcache limit check which could have caused ArrayIndexOutOfBoundsExceptions
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4695 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
orbiter
1cab240198
patch for possible NPE in EcoTable iterator
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4694 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
orbiter
9a32a4c328
fixed concurrentModificationException during hello-process
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4693 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
danielr
64c33e717f
caught ConcurrentModificationException in ConnectionInfo.cleanUp so cleanUp is not interrupted
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4692 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
danielr
d8677ba611
fixed ConcurrentModificationException in HttpConnectionInfos
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4690 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
orbiter
c7021c14bb
patch for ArrayIndexOutOfBoundsException in BMP parser
...
(may occur in case of malformed BMPs)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4689 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
orbiter
8dd35f74c8
fixed redirect problem (does not work for POST)
...
see http://forum.yacy-websuche.de/viewtopic.php?f=6&t=1068&hilit=
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4687 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
orbiter
8313d58ae7
- integrated the collage into the Web Visualization menu
...
- added a counter for the public and private queue on the page (testing..)
- fixed wrong public/private categorization
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4686 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
danielr
2617f4dcdb
Connections_p.html: better formatting and remove very old entries
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4684 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
orbiter
82bf9ac1c8
- added Collage servlet from datengrab and modified it:
...
* all images are queued
* private/public is respected
* inserted into switchboard
* added collageQueue class that stores all the queued images
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4683 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
danielr
959f448e5f
- disabled redirects in proxy (so client sees real path)
...
- added connection stats (only connections currently in use)
- remove "old" connections (closed or idle for some time)
- synchronized shared parts of proxyHandler
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4682 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
orbiter
8fe39ebd74
-fixed file transmission with POST. The only usage was in ranking transmission, therefore:
...
-fixed ranking transmission
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4681 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
orbiter
82a9861779
fix for last commit
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4680 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
orbiter
5d1fbb25e7
fix for bad deploy:
...
- the name of downloaded release files is adapted if the httpc delivers uncompressed tar.gz files (the .gz is removed from the file name)
- the deploy method is able to handle tar-file (not tar.gz-files)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4679 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
orbiter
202a3adb3e
refactoring of HttpClient Writer processes
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4678 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
danielr
8aa9fd8f24
HTTPC with only 1 retry
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4677 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
orbiter
444dce7e81
more performance hacks
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4676 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
orbiter
2c2dcd12a2
- enhanced performance of Eco-Tables: less time-consuming size() - operations
...
- will increase speed of indexing and collection.index creation
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4675 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
orbiter
e356625b22
- refactoring of stream copy handling to support time-consuming operations
...
- made usage of BufferedStreams explicit to distinguish the different copy methods in serverFileUtils (byte-by-byte and using an own buffer)
- introduced another timeout setting (java internal property)
- more restrictions to clients accessing a single host (a security setting to prevent DoS by mistake)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4674 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
danielr
f01c50cf8d
Proxy logging error (first step to resolution!?)
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4673 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
orbiter
c3342e1178
- removed class with only one static method
...
- removed connection method with too long time-out
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4672 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
orbiter
f97971b63b
fixed NPE problems doing a shutdown from command-line
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4671 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
danielr
7a35126e91
restored the HTTP timeouts of the old httpc
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4670 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
orbiter
2c1c3bb6eb
- some refactoring (sorry Daniel, I rampaged through your code)
...
- fixed broken downloads (flush was missing)
- different problem handling when download is corrupted
- different default values in yacy.init
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4669 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
danielr
d96e2badc7
- fixed POST in proxy
...
- prepared http connection tracking
- refactoring (mainly moving StreamTools to serverFileUtils)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4668 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
orbiter
14404d31a8
- enhanced performance graph (more info)
...
- added conditions for rarely used logging lines to prevent unnecessary CPU usage for non-printed info
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4667 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
orbiter
696b8ee3f5
fix for http://forum.yacy-websuche.de/viewtopic.php?p=6806#p6806
...
- removed all InputStream.available() because this does not work for files > 2GB
- iterators terminate when an IOException occurs
- added handling of non-executing index.add methods to enhance assert usage
- added index for file indexes > 2GB, to be used in new indexHeap
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4666 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
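The entry above removes `InputStream.available()` because it does not work for files larger than 2 GB. `available()` returns an `int`, so it cannot report more than `Integer.MAX_VALUE` bytes, and it is only an estimate of what can be read without blocking. One possible replacement pattern, shown here as a hedged sketch and not as the change actually made in YaCy, is to read until end-of-stream and count in a `long`. The class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical sketch; names are illustrative, not from the YaCy source.
class StreamLengthSketch {

    // Counts the bytes of a stream by reading to end-of-stream. The long
    // counter is not capped at Integer.MAX_VALUE the way the int result of
    // InputStream.available() is.
    static long countBytes(InputStream in) {
        byte[] buffer = new byte[8192];
        long total = 0L;
        try {
            int n;
            while ((n = in.read(buffer)) != -1) {
                total += n;
            }
        } catch (IOException e) {
            // Stop on I/O failure and report it unchecked.
            throw new UncheckedIOException(e);
        }
        return total;
    }
}
```

For a plain file, `java.io.File.length()` also returns a `long` and avoids reading the content at all.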
danielr
94d3d3a86f
fixed Proxy (for GET, POST still does not work!)
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4665 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
danielr
081ed1d3ec
HTTPLoader: reduced stackTraces
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4664 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
danielr
8b2efb6f8c
fixed garbage in HTCACHE
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4663 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
orbiter
225f9fd429
various fixes
...
- shutdown behavior (killing of client sessions)
- EcoFS reading better
- another synchronization in balancer.size()
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4662 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
orbiter
6e36c156e8
added more logging to EcoFS
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4661 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
danielr
fb541f9162
HTTPC: default timeout half-hour
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4660 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
danielr
a94f6cdca4
HTTPC: allowed self-signed certs
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4659 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
danielr
ab330cfdca
Network.html: removed ; from location
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4658 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
orbiter
319144f4b2
fix for out-of-bounds exception in EcoFS chunk iterator
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4657 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
orbiter
a9cf6cf2f4
generalization of index container-heap class.
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4654 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
orbiter
f099061944
protection against bad dht-flush word selection
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4653 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
orbiter
5e4fddc1e6
more logging for new EcoFS.ChunkIterator to find bug for
...
http://forum.yacy-websuche.de/viewtopic.php?f=6&t=1024&hilit=&p=6806#p6806
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4652 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
orbiter
117ae78001
speed enhancement for reading of eco-table indexes
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4647 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
danielr
7c149a4ee8
- undo less 'binary data found'
...
- removed duplicate stackTrace
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4643 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
danielr
96cce8bed9
reduced 'Binary data found' errors
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4642 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
danielr
2aef1414f5
removed test (in yacy.init)
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4641 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
danielr
5c3c1fdf41
replaced httpc with Apache Jakarta Commons HttpClient (includes some refactoring ;)
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4640 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
orbiter
daa04f5db9
added additional check in file handler to prevent that url attacks are hidden in url path encodings
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4637 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
orbiter
783a4c9edb
strong speed enhancements for the index cache dump and restore:
...
storage and loading is 30 times faster! a cache of 100000 RWIs needed 180 seconds
to store and 100 seconds to restore; now the same cache needs only 6 seconds to store and
3 seconds to restore. The cache size has decreased now by 30% (95 MB instead of 150 MB).
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4634 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
orbiter
442204a1c8
fix for concurrentModificationException
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4633 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
orbiter
d2f4926951
- more logging for balancer to get a hint where the problem is
...
- fix for new concurrency method in kelondroSplitTable
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4631 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
orbiter
20dadba426
- added a deadlock prevention function in cache flushing
...
- removed unused methods in collection index
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4630 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
orbiter
764a40e37d
speed enhancements for crawler and url retrieval (affects also search speed)
...
- concurrency for LURL-fetching: this can be done using a concurrent lookup into the separated url databases. Concurrency is possible because there is no IO during lookup. The more LURL-Tables are present, the better is the speedup. More CPUs will increase speed
- because a large number of LURL lookups are made during crawling (for double-check), the LURL-lookup speed enhancement also enhances crawling speed
- search speed also profits from LURL-lookup enhancement
- changed some flushing parameters in word index caching which should make better use of large word index caches and should speed up indexing
- removed flush chunksize parameter, because this was only useful for IO path enhancement feature which was removed some weeks ago to prevent blocking and deadlocks during search requests
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4628 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
orbiter
3ce3a4a3a1
added stub for new index container heap data structure (purpose: index folding)
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4627 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
orbiter
2c34038912
addition/correction to last commit: usage of concurrent-classes
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4626 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
orbiter
b2150057d2
removed unnecessary cleanup method
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4625 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
lulabad
c4c0d54b22
* added regex extended blacklistengine
...
* removed my own engines
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4618 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
orbiter
368593e449
enhanced the concurrency handling of indexing process (better queue size control, better data concept, better shutdown behavior)
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4617 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
orbiter
be58135b3e
possible fix for deadlock in search execution
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4612 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
orbiter
0241d070bc
added concurrency to indexing process:
...
- the methods {parsing, semantic analysis (condensing), structure analysis (web structure)} in the serialized indexing path had been made concurrent.
- four BlockingQueues handle concurrency and hand-over of the indexing objects, the last object in the queue is stored into a blockingQueue of maximum size 1 to serialize the process for storage (which uses IO and therefore here should not be deserialized)
- a concurrency of (CPUs + 1) is default. Single-CPU users will profit from the change because large files cannot block the indexing process any more.
- removed the secondary indexing thread, which is superfluous now. Concurrency is default for all users.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4609 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
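The hand-over scheme in the entry above (concurrent workers fed by BlockingQueues, with a final queue of capacity 1 that serializes storage) could look roughly like the sketch below. It is a simplified illustration under stated assumptions: it collapses the three concurrent steps into a single stand-in step, uses two queues where the commit uses four, and shuts down with a poison-pill object, which the commit message does not mention. All names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch; names are illustrative, not from the YaCy source.
class IndexingPipelineSketch {

    // Unique marker object that tells a consumer thread to stop.
    private static final String POISON = new String("poison");

    static List<String> run(List<String> documents) {
        final int workers = Runtime.getRuntime().availableProcessors() + 1; // (CPUs + 1)
        final BlockingQueue<String> parseQueue = new LinkedBlockingQueue<>();
        // Capacity 1: workers block until the single storage thread is ready.
        final BlockingQueue<String> storeQueue = new ArrayBlockingQueue<>(1);
        final List<String> stored = new ArrayList<>();

        List<Thread> parsers = new ArrayList<>();
        for (int i = 0; i < workers; i++) {
            Thread t = new Thread(() -> {
                try {
                    while (true) {
                        String doc = parseQueue.take();
                        if (doc == POISON) break;          // identity check on the marker
                        storeQueue.put(doc.toLowerCase()); // stand-in for parsing/condensing
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            t.start();
            parsers.add(t);
        }

        Thread storer = new Thread(() -> {
            try {
                while (true) {
                    String doc = storeQueue.take();
                    if (doc == POISON) break;
                    stored.add(doc); // only this thread writes to 'stored'
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        storer.start();

        try {
            for (String d : documents) parseQueue.put(d);
            for (int i = 0; i < workers; i++) parseQueue.put(POISON); // one per worker
            for (Thread t : parsers) t.join();
            storeQueue.put(POISON); // all workers are done, stop the storage thread
            storer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("pipeline interrupted", e);
        }
        return stored;
    }
}
```

Because several workers run at once, the order of the returned list is not guaranteed to match the input order; only the storage step is sequential.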
lulabad
9fb5d661f2
added my Blacklistengines
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4608 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
orbiter
bca87f1e38
- refactoring of serverThreads: renaming to distinguish busy-threads and blocking-threads
...
- added blockingThreads which are threads that are not driven by pause times but by BlockingQueue lookup
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4606 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
orbiter
968c775025
- preparation of parsing/indexing queue for concurrent execution
...
- remote crawl receipts are now transmitted concurrently in separate threads (makes remote crawls much faster!)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4605 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
orbiter
9b0e20fb06
next refactoring step in document indexing to prepare concurrency environment for document parsing
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4604 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
orbiter
7f9f639d20
- refactoring and abstraction of index reference (urls) handling: blacklisting is part of reference filtering
...
- refactoring of word/phrase handling: word abstraction from condenser becomes part of index element handling
- removed unused code parts from condenser
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4603 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
orbiter
d6050b9ffb
- separated the LURL data storage and Crawl result stack for process supervision.
...
this is another step to enable multiple, concurrent fulltext-indexes
- another try to make the yacy-httpc more stable
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4602 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
orbiter
8d6a13bc07
refactoring of parsing-condensing-indexing process:
...
- separated parts
- removed storagePeer function
next step will be parallelization of processes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4600 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
orbiter
d3b06913ec
protection against seed-db failure during enumeration
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4598 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
orbiter
5aa96dbc36
fix for shutdown configuration
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4596 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
orbiter
93633abed8
- removed some debugging code from search process - should speed up now
...
- added some profiling code to search event - more time details in PerformanceSearch_p.html
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4594 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
orbiter
fba46c51d7
fixed non-termination bug in qsort
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4593 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
orbiter
541b817502
refactoring of switchboard queueing
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4591 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
orbiter
fc94fbe224
another improvement to the collection sorting
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4589 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
orbiter
11270d450e
better quicksort-pivot computation: 30% faster (measured with test program)
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4588 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
orbiter
3e44293f07
- fixed a problem with thread pools in row collection
...
- added a line-viewing feature in threaddump
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4587 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
danielr
e43051b125
- fixed Threaddump output (html-escaped, e.g. <init>)
...
- in EcoFS converted comments to javadoc
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4586 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
orbiter
433ff855f7
- fixed another concurrency problem in collection sorting
...
- fixed a typing problem that was introduced in svn 4579 and caused the crawl monitor to fail
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4585 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
orbiter
19286fa2d1
tried to fix seed2.old.db-problem
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4584 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
orbiter
f3996e63b8
tried to fix more deadlocks:
...
- changed connection modes in ftpc
- replaced the sort thread pool in row collections by a new one using util.concurrent; the old pool had caused blocking
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4582 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
danielr
7008a218b3
avoid ConcurrentModificationException in plasmaCrawlerQueues
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4579 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
orbiter
7150b463ff
changed handling of default values and database paths:
...
- the default files (yacy.init and the network definition) are now moved to the path defaults
- the httpProxy.conf is renamed to yacy.conf
- the DATA/INDEX/PUBLIC is renamed to the actual network nickname, which should be freeworld or sciencenet
more menu entries
- added apfelmaennchen's alternative search page to the menu
- added the new thread dump page to the server log menu point as submenu
modifications
- modified the thread dump page: sorting by thread type
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4575 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
lulabad
25f5035f23
typo
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4571 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
orbiter
7fd094fcbe
small bug in ftpc: did not compile in Java 1.5
...
Please set compiler to Java 1.5-compliance
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4570 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
danielr
f51bad8ae5
FTP:
...
- report connection status (to break if no connection possible)
- fixed isFolder()
- additional error output
- fixed paths with encoded symbols (e.g. a%20file.txt)
- refactoring
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4567 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
danielr
820641938e
ftpc: fixed date parsing, some refactoring
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4566 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
orbiter
4c584dff87
disabled soLinger to prevent too many connections from staying open (it's a TEST!)
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4565 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago
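For reference, disabling the linger option on a plain `java.net.Socket` can be sketched as below. This only demonstrates the standard JDK call; the class name `SoLingerSketch` is hypothetical, and where or how YaCy applies the setting is not shown in the commit.

```java
import java.io.IOException;
import java.net.Socket;

public class SoLingerSketch {
    /** Disables SO_LINGER on a fresh, unconnected socket and returns the resulting setting. */
    public static int disableLinger() {
        try (Socket socket = new Socket()) {
            // With 'false' the linger value is ignored; close() then returns
            // immediately instead of waiting for unsent data.
            socket.setSoLinger(false, 0);
            return socket.getSoLinger(); // the JDK reports -1 when the option is disabled
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }
}
```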
orbiter
9c989fe5f7
fixed deadlock
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4562 6c8d7289-2bf4-0310-a012-ef5d649a1542
17 years ago