Michael Peter Christen
4de50fe808
adding more principal peers for bootstrapping
13 years ago
reger
067728bccc
add search result heuristic. adding a crawl job with depth-1 for every displayed search result (crawling every external linked page of displayed search result pages)
13 years ago
Michael Peter Christen
508a81b86c
added solr field 'refresh_s' which stores the refresh url contained in
the meta-refresh html header field.
13 years ago
Michael Peter Christen
9116013c64
- allow lazy initialization of solr values (if using 'lazy', then no
0-values and no empty strings are written). This may save a lot of
memory (in RAM and on disk) if excessive 0-values or empty strings
appear.
- do not allow default boolean values for checkboxes because that does
not make sense: browsers may omit the checkbox attribute name if the box
is not checked. A default value 'true' would not comply with the
semantics of the browser's response.
- add a checkbox in IndexFederated_p for the lazy initialization of solr
fields.
13 years ago
Michael Peter Christen
c03d306afa
shorter autocommit time (now: 1 second) so that users can see
results in solr the first time they try it out. The value can now be
easily set to a higher number using the IndexFederated_p interface.
13 years ago
Michael Peter Christen
3fd4a01286
added option to record urls that are forwarded to the solr index
13 years ago
Michael Peter Christen
8dd469b9dd
added option to configure the autocommit delay time of solr on-the-fly
13 years ago
Michael Peter Christen
b9dfca4b0a
- fixed IndexFederated Servlet / an embedded Solr can now be selected
- added code stub for an embedded Solr but generation of Solr store is
still commented out (it works but is not yet ready for usage)
13 years ago
Michael Peter Christen
1be0025a9c
- added test for EmbeddedSolrConnector
- added needed libraries for this test
this includes most (all) files needed for an embedded solr
13 years ago
Michael Peter Christen
dbdd697f4d
moved RDFaParser.xsl configuration file to defaults
13 years ago
Michael Peter Christen
8738336408
set Xms lower than Xmx
13 years ago
Michael Peter Christen
96f6a5869f
more robust OAI-PMH client (large time-out, three retries). OAI-PMH
servers appear to be very slow sometimes
13 years ago
Michael Peter Christen
6d17686258
made triplestore persistent by default
added a size display in triplestore servlet
13 years ago
cominch
3c255c025b
Show tags in search results (if activated in ConfigPortal_p.html)
13 years ago
Michael Peter Christen
a5cdfb91de
- fixed Cache link (below snippet)
- added 'Augmented Proxy' link below snippet
- added configuration options for augmented proxy
13 years ago
Roland 'Quix0r' Haeder
af5a597e47
Scroogle is not coming back, remove dead code
Conflicts:
source/net/yacy/search/Switchboard.java
13 years ago
cominch
90512640bf
Added config switches for custom parser
Conflicts:
source/net/yacy/document/TextParser.java
13 years ago
cominch
5d20cd324a
Add Triplestore and RDF query interface
Conflicts:
build.xml
defaults/yacy.init
source/net/yacy/interaction/AugmentHtmlStream.java
13 years ago
cominch
a32943b382
add json mimetype
13 years ago
Michael Peter Christen
41c02cb10e
- less restrictions for usage of Table RAM copy
- new limit to use the table copy (instead of flag): 400MB available. If
less is available, then a copy is never used. If more is available, then
it can be used if there is a remaining space of at least 200MB
- flush caches more often: flush the Digest cache
13 years ago
Michael Peter Christen
8002fd2578
use less cache space since a large cache would cause more memory usage
in index files.
13 years ago
Michael Peter Christen
5aee19daa4
added show from cache in search results (not yet finished)
13 years ago
Michael Peter Christen
0d32a766ed
relax verify attribute for search widget to make it faster:
set to "cacheonly"
13 years ago
Michael Peter Christen
7eece0256f
moved yacy.logging to defaults according to request in
http://bugs.yacy.net/view.php?id=55
13 years ago
Michael Peter Christen
db9d81cb7a
ups
13 years ago
Michael Peter Christen
e7e381d110
added configuration to switch off redirection following in crawler
13 years ago
Michael Peter Christen
2be327b5ab
update location update
13 years ago
Michael Peter Christen
99c74699de
removed scroogle (scroogle is dead)
13 years ago
Michael Peter Christen
8bee1472c9
there is no noindex, only nofollow in links
13 years ago
Michael Peter Christen
4c5edab1ec
added option to have exception search result windows
13 years ago
Michael Peter Christen
696ee5fc16
removed pdf from default parser deny list
13 years ago
Lotus
c73af39e54
refactoring of tray icon class,
now uses Java 6 methods natively
13 years ago
Michael Peter Christen
987b412491
updated solr scheme: generic declaration of solr schemes
13 years ago
Michael Peter Christen
0bcef2d156
added feature as requested in
http://forum.yacy-websuche.de/viewtopic.php?f=18&t=3461
The search can now be configured with a non-display host list.
The search will always exclude the given list of hosts unless they are
requested directly using the host navigation
13 years ago
Michael Christen
17f962fceb
translator updates:
- config string for chinese
- do not copy the language file to DATA/LOCALE any more (and do not use
them there, this is really confusing for new translators)
13 years ago
Michael Christen
c715d19c09
fixes for dependency on svn
13 years ago
Michael Christen
f62e6fb438
less frequent DHT distribution to reduce the load a bit on every peer
13 years ago
Michael Christen
9dbc93613e
now that the whole world knows that we actually do p2p and not
metasearch we can support a default look-up to scroogle to gain more
attention to people who say that your search results are incomplete
13 years ago
orbiter
f9216e388c
- faster ping to clean up old peers faster
- clean up more news
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8125 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
orbiter
ac5bda205f
- removed lower page navigation (it never looks nice)
- added visibility of metadata and parser in search results since that shows what YaCy can do in a nice way
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8091 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
orbiter
c659310e89
- removed option to search for audio, video and applications. These things are still experimental and should not be shown to new users since this would cause them to argue that YaCy does not work. The functions are still available, because:
- added a configuration option in ConfigPortal to switch the search media types on or off
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8090 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
orbiter
6cd27473f5
- better default values for caching and cache usage
- set new caching and verification behavior according to use case automatically
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8087 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
orbiter
5866c73a09
fix for compare search: use scroogle instead of bing and get a default search if configured search engine is not available
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8074 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
orbiter
e4a82ddd8b
produce a bookmark entry from every crawl start. these bookmarks are always private.
these bookmarks will be used to get a source reference for the search in case of intranet or portal searches.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8062 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
orbiter
f183d3822c
added a default accept header in http requests since some http fraud detection functions check that this header field exists
see also: http://bad-behavior.ioerror.us/ in source file browser.inc.php
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8048 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
orbiter
78ce3b13be
typo
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8027 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
suessthomas
887f088dad
The IP address of the YaCy-Demo portal added to Whitelist.
This is only a temporary workaround.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8013 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
orbiter
1b45e33f04
added robots tag parser to solr scheme
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7986 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
orbiter
cf4fd525ee
added directDocByURL attribute in crawl profile
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7985 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago
orbiter
5ad7f9612b
added crawl settings for three new filters for each crawl:
must-match for IPs (IPs that are known after DNS resolving for each URL in the crawl queue)
must-not-match for IPs
must-match against a list of country codes (allows loading only from hosts that are hosted in the given countries)
note: the settings and input environment are there with this commit, but the values are not yet evaluated
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7976 6c8d7289-2bf4-0310-a012-ef5d649a1542
13 years ago