Al Sutton
39898cb94a
Added try/finally protection to ensure streams are closed. Added initial size guess for the CharBuffer
13 years ago
Al Sutton
4c67a964a1
Added try/finally protection to ensure streams are closed. Added initial size guess for the CharBuffer
13 years ago
Al Sutton
3f9b9f953f
Added close() to ensure buffer close actions are invoked
13 years ago
Al Sutton
d73c84f9a0
Allow initial buffer size definition in TransformWriter, and use available() method to set it in htmlParser. In this situation a ByteArrayInputStream is used so the available() method gives a good size estimation and avoid the buffer needing to be continually grown
13 years ago
Al Sutton
f02ea27b31
Added missing closure of ByteArrayInputSteam
13 years ago
orbiter
0796b54601
- some speed hacks for network image
...
- panic patch for 'AD' hashes until it is clear where the problem comes from
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8126 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
orbiter
f9216e388c
- faster ping to clean up old peers faster
...
- clean up more news
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8125 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
orbiter
35a9e8f307
- fixed network graphic
...
- debuged evaluation tables
- changed cache settings in template engine
- some speed hacks
- changed int angles for peer positions in network graphic to double angles
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8124 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
Al Sutton
8993cac4d8
Initial performance improvements
13 years ago
orbiter
d9c066227a
fix for npe
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8122 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
orbiter
8895d8c1cd
removed unnecessary log entries
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8117 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
orbiter
550c881d80
remove more news (all older than one day) because they can be a performance problem if we have too many peers sending news
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8112 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
orbiter
ebd840ebf6
- enhanced description on search front page
...
- fixed language and heuristic modifier
- added hint to crawl start that we can do also ftp and smb crawls
- added a protocol extension to remote crawls to transport all search modifiers to remote peers
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8108 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
apfelmaennchen
77a080ced9
smaller fixes for YMarks
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8105 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
orbiter
e22f8497c9
- tested the ARC methods
...
- removed strict authentication (if password is empty; this was buggy and not useful; can be switched on if necessary globally and not for each interface method)
- increased speed of CrawlResults page (no dns lookup any more)
- increased speed of favicon display (removed dns lookup)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8104 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
orbiter
bc5df0eef5
updated ranking tables (fresh computation)
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8103 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
orbiter
5a55397f99
some last-minute performance hacks
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8101 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
apfelmaennchen
dd1482aaf5
further update to YMarks
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8100 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
orbiter
c9216d5adf
fixed secondary remote search (the process that finds distributed join situations)
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8098 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
orbiter
64fd20b857
new default ranking profile
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8097 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
orbiter
0cf9ebc3b0
speed enhancements when parsing RWI rows (makes search slightly faster)
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8096 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
orbiter
c9a0dbd25a
added a security check
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8094 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
orbiter
ee8b1d4de1
fixed unresolved pattern and unwanted local/global switch when using votes on search results
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8093 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
orbiter
c584db991f
creating a bookmark from the search results now works again .. with new YMarks
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8092 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
orbiter
1120f0c93c
update to network graphics: slightly less crawling activity, slightly stronger color for query activity
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8089 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
orbiter
6cd27473f5
- better default values for caching and cache usage
...
- set new caching and verification behavior according to use case automatically
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8087 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
orbiter
709013385a
fix for language fix
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8086 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
orbiter
1019c36dad
bug fixes and speed enhancements for search
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8085 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
orbiter
507c9d478d
much better timing when search globally; less blocking; more results earlier!
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8084 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
orbiter
8e0b2c5832
fixed cluster search
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8083 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
orbiter
c0c6e9e7a5
fix for bad language encoding
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8082 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
apfelmaennchen
564374d1fe
- included YMarks in addition to old bookmarks in yacysearchitem.html; don't get confused by the old bookmark dialog, the ymark is automatically added silently beforehand.
...
- reworked bookmark creation on crawlstart
- many smaller adjustments to ymarks
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8072 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
orbiter
05f34a3fa7
added a full, complete, database insert, update and delete API for the tables.
...
Please see this example:
list all database tables:
http://localhost:8090/api/table_p.xml
now create a new table and insert some values into 'mytable'
http://localhost:8090/api/table_p.xml?table=mytable&pk=&commitrow=&col_termin=Release%20Machen&col_datum=24.11.2011&col_status=ongoing
list the table content:
http://localhost:8090/api/table_p.xml?table=mytable&pk=
update the table and change a single value inside. You must refer to the row using a primary key 'pk'
http://localhost:8090/api/table_p.xml?table=mytable&pk=000000000001&commitrow=&col_datum=29.11.2011
you can also select rows using a search operator
http://localhost:8090/api/table_p.xml?table=mytable&pk=&count=10&search=
now lets delete the row:
http://localhost:8090/api/table_p.xml?table=mytable&pk=&deleterows=pk_000000000001
and we can also delete the complete table:
http://localhost:8090/api/table_p.xml?table=mytable&deletetable=
You can use this to administrate the robots, bookmarks and API steering using an outside application!
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8071 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
lotus
3cc93325f0
temporary remove compare search from tray
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8070 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
orbiter
c93f10417a
add a bookmark automatically each time a new crawl is started
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8063 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
orbiter
e4a82ddd8b
produce a bookmark entry from every crawl start. these bookmarks are always private.
...
these bookmarks will be used to get a source reference for the search in case of intranet or portal searches.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8062 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
apfelmaennchen
6287c2b4a9
YMarks:
...
- introduced tag manager - a quite powerful tool (still not 100% stable, so be careful)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8060 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
cominch
2236e01137
Minor correction to prevent useless comma at beginning of string, created from list
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8059 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
apfelmaennchen
5581be12fb
YMarks:
...
- added backend and api for tag management
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8058 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
orbiter
804e48888b
smaller bug fixes for search behavior; should produce less unnecessary removals and an exact number of results as shown in counter
...
should also be a little bit faster
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8057 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
apfelmaennchen
a3eebfdcba
YMarks:
...
- show active/running crawls
- execute crawls (works currently only if API entry is available)
- various smaller fixes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8056 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
orbiter
c50f8f9a06
code cleanup
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8055 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
orbiter
84c3fc9d97
local/global fixes in search, better abstraction
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8054 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
apfelmaennchen
4f95f72124
YMarks:
...
- working direct importer for YaCy Crawl Starts
- working direct import for old bookmarks.db
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8052 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
orbiter
aa322bc6d0
fix
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8050 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
orbiter
97d1347adb
added also a default accept field to robots.txt downloads
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8049 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
orbiter
f183d3822c
added a default accept header in http requests since some http fraud detection functions check that this header field exist
...
see also: http://bad-behavior.ioerror.us/ in source file browser.inc.php
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8048 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
orbiter
06352b8d6b
more logging
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8047 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
orbiter
a99934226e
more logging for debugging of robots.txt
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8046 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
orbiter
7a5841e061
fix for robot parser
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8045 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
orbiter
458c20ff72
fix for robot parser
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8044 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
orbiter
017a01714d
- enhanced logging in robots.txt parser for remote debugging
...
- robots.txt is now more robust against database operations
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8043 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
apfelmaennchen
a8dfe787ed
- updated to jquery flexigrid 1.1
...
- YMarks.html automatically recognizes if a bookmark is a crawl start
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8040 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
orbiter
eb1c7c041d
write info about robots.txt evaluation into getpageinfo_p.xml
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8038 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
apfelmaennchen
abba31f02e
- bugfix for correctly sorting ymarks
...
- some tuning for the autotagger (still not perfect)
- /api/ymarks/get_metadata.xml now provides info for crawlstarts
- removed unused code
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8036 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
orbiter
3a15e58e28
- increased stability when opening the robots table
...
- increased stability when deleting tables
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8034 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
orbiter
775b44017e
refactoring
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8033 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
orbiter
e914a30099
fix for npe
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8032 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
apfelmaennchen
5f7dbe1c42
- some refactoring (ymarks)
...
- improvement for autotagger (is now able to create/detect multi word tags e.g. 'open source')
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8031 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
orbiter
78ce3b13be
typo
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8027 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
orbiter
85d6bf4ac4
fixed urls to media content during indexing
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8021 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
orbiter
0d858d48ec
replaced String with StringBuilder in suggestion process
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8020 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
orbiter
3a807e10cf
- added a cache for active crawl profiles to the crawl switchboard
...
- moved the domain cache for domain counter from the crawl switchboard to the crawl profiles. the crawl domain counter is now therefore relative for each crawl start, not for the whole crawler.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8018 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
orbiter
37e35f2741
normalization of url using urlencoding/decoding
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8017 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
orbiter
e58438c01c
- added a new retry connector for solr (for cases where solr responses are slow)
...
- added a new exist property into the metadataRepository which includes solr entries
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8016 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
orbiter
d8d9735b4f
stability bugfix
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8012 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
orbiter
c31564ef08
stability bugfixes
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8011 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
orbiter
f121f4bb45
fix for link in Supporter and Suftipps page
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8010 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
orbiter
94eab08794
- updated opensearchdescription text and icon
...
- removed automatic setting of maxitems during search (can be set now elsewhere)
- updated RSSMessage.java
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8009 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
orbiter
279482a76d
fix for npe
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8007 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
orbiter
1b86d06d1e
fix for http://bugs.yacy.net/view.php?id=62
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8004 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
orbiter
9e4875230f
performance hacks
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8001 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
orbiter
eb9c9edb01
enhanced table method (used by almost all yacy api interfaces)
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8000 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
orbiter
4ad9fc2bff
new snippet strategy for search hits in metadata: show beginning of text instead of hit position
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7999 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
orbiter
a9838f8b99
fix for http://bugs.yacy.net/view.php?id=59
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7997 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
hermens
d3df03838a
make sure myself-target is always inserted at its appropriate position
...
this was previously omitted if the own peer should have been the first target
or the peer was the last peer before the rotation to AAAAAAAAAAAA
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7996 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
hermens
c3e7efa846
added sender side prevention of rwi flooding as mentioned in SVN 7993
...
saves memory and speeds up enqueueContainers by limiting the size of transfer.Chunk
saves network bandwidth by not transmitting RWIs that would get discarded at the target anyway
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7995 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
orbiter
5af9598bd1
enhanced exported row parsing during row import
...
this affects the search and dht receive speed
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7994 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
orbiter
7598a9e26b
fix for thread dump
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7992 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
orbiter
8eef8722d1
update to ThreadDump analysis: freerunner and thread state recognition
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7990 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
orbiter
1df43b137d
another performance hack
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7989 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
orbiter
7df0643f0e
performance hacks
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7988 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
orbiter
a7df70221e
refactoring
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7987 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
orbiter
1b45e33f04
added robots tag parser to solr scheme
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7986 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
orbiter
cf4fd525ee
added directDocByURL attribute in crawl profile
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7985 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
orbiter
c61e4cfd78
- fix for incomplete clear() in balancer
...
- renamed Parser Errors to Rejected URLs
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7984 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
orbiter
813f297a95
another performance hack: re-use of known host addresses for isLocal property; avoids look-up in local hash
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7983 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
orbiter
035ebfbf3b
- performance hacks (should affect the crawl balancer and reduce CPU load during crawl stack re-fill)
...
- this may have also (good) performance side effects on other parts of YaCy
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7982 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
orbiter
b250e6466d
implemented crawl restrictions for IP pattern and country lists
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7980 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
f1ori
e207c41c8e
* fix urlproxy for urls containing dolar signs
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7979 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
orbiter
57d5529a01
performance hacks
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7977 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
orbiter
5ad7f9612b
added crawl settings for three new filters for each crawl:
...
must-match for IPs (IPs that are known after DNS resolving for each URL in the crawl queue)
must-not-match for IPs
must-match against a list of country codes (allows only loading from hosts that are hostet in given countries)
note: the settings and input environment is there with that commit, but the values are not yet evaluated
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7976 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
orbiter
47a8c69745
added a new feature to MultiProtocolURIs to get the locale for each url:
...
This is done using a new library InetAddressLocator.jar which is NOT added by default to YaCy because it is very old and with that library we will never get a debian package. However, some people want that functionality and it can be made available if the library is taken from http://javainetlocator.sourceforge.net/ and placed into the /lib directory where it will be found using reflection.
The new feature will be used to extend the crawler steering.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7975 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
orbiter
2c3161b4ac
refactoring:
...
RankingProcess -> RWIProcess
ResultFetcher -> SnippetProcess
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7974 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
orbiter
d2ea250d99
refactoring:
...
- moved many classes from de.anomic to net.yacy
- made more sub-packages for search classes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7973 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
low012
42b5f09f68
*) this should fix a bug in snippet creation (also cleaned up a little bit)
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7972 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
low012
277b454a62
*) added comments
...
*) minor refactoring
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7971 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
orbiter
6b22865dbc
- removed some warinings
...
- removed a dead update location
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7970 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
orbiter
0c6d95e57b
- more tolerance against failure of table opening
...
- more connections for solrj
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7968 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
orbiter
4f31869c5a
enhanced search result timing
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7966 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago