Michael Peter Christen
25499eead5
- added a new field for the regular expression in crawl start
- added the field in crawl profile
- adapted logging and error management
- adapted duplicate document detection
- added a new rule to the indexing process to reject non-matching
content
- full redesign of the expert crawl start servlet
The new filter field can now be seen in /CrawlStartExpert_p.html in
section "Document Filter", subsection item "Filter on Content of
Document"
12 years ago
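The content-filter rule described in the commit above can be illustrated with a minimal Python sketch. It is not YaCy's implementation: the function name `accept_document`, the full-match semantics, and the sample patterns are assumptions made purely for illustration.

```python
import re

def accept_document(text: str, content_must_match: str) -> bool:
    """Return True if the document text passes the content filter.

    Assumption: the filter is a regular expression that the whole
    document text must match; YaCy's actual matching rule may differ.
    """
    # DOTALL lets '.' span line breaks so multi-line documents can match.
    return re.fullmatch(content_must_match, text, flags=re.DOTALL) is not None

# A catch-all pattern accepts every document; a narrower pattern
# causes non-matching content to be rejected before indexing.
print(accept_document("release notes for YaCy", ".*"))
print(accept_document("release notes for YaCy", ".*YaCy.*"))
print(accept_document("unrelated page", ".*YaCy.*"))
```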
reger
0a9b0992f3
RankingSolr_p: include warning if boost field not in local index
12 years ago
orbiter
e1bfe9d07a
- reduction of the concurrently running processes to make YaCy better
suited to smaller and 1-core devices.
- the workflow processor now starts no process at all. these are started
as soon as parser/condenser/indexing queues are filled.
- better abstraction
12 years ago
Michael Peter Christen
c091000165
added collection attribute also to the rss feed reader
12 years ago
Michael Peter Christen
43ca359e24
Merge branch 'master' of ssh://gitorious.org/yacy/rc1
12 years ago
Michael Peter Christen
2d60dfb3e1
Merge branch 'master' of git://gitorious.org/~saranshupscale/yacy/yacy-india-rc1
12 years ago
orbiter
f7571386a3
added a 'collection' property attribute in yacysearch.html which can be
used to select between different collections as defined during a crawl
start with the 'collection' attribute. This actually implements the
ability to prepare search tenants which restrict their search results to
a specific collection. The main use for this is to provide tenants to
the yaml4 interface (at this time).
12 years ago
Saransh Sharma
04b61e08c8
More Translation
12 years ago
orbiter
3e79bd4b1f
Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
12 years ago
orbiter
d571e739b6
increased row limitation for authorized users from 10000 to 100000000 in
solr interface
12 years ago
Michael Peter Christen
d937c55204
extended limitation of dom export size from 100000 to 100000000
12 years ago
Michael Peter Christen
fc2095ac67
some extensions to raster plotter to transform a RGB picture to an
indexed color scheme. This is needed for gif animations
12 years ago
Michael Peter Christen
c1a2175fbc
added transparency to gif image animation and the integration to the
YaCy httpd for on-the-fly generated gifs (including animated gifs)
12 years ago
Michael Peter Christen
a1fffe8e86
fixed default ranking values
12 years ago
orbiter
5d442dad82
avoid NPE in regex checker
12 years ago
Michael Peter Christen
24bcf54100
Merge branch 'master' of git://gitorious.org/~saranshupscale/yacy/yacy-india-rc1
12 years ago
Saransh Sharma
b31793f5d6
Hello world
12 years ago
Michael Peter Christen
50421171c3
added new schema fields:
hreflang_url_sxt and hreflang_cc_sxt
for
http://support.google.com/webmasters/bin/answer.py?hl=de&answer=189077
navigation_url_sxt and navigation_type_sxt
for
http://googlewebmastercentral.blogspot.de/2011/09/pagination-with-relnext-and-relprev.html
publisher_url_s
for http://support.google.com/plus/answer/1713826?hl=de
all fields are disabled by default and not written to the index.
12 years ago
Michael Peter Christen
566d6c980c
checking of document signature for a double-document check now refers
only to documents within the same domain
12 years ago
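A minimal sketch of a per-domain duplicate check, as described in the commit above. The names `is_duplicate` and `seen`, the use of the URL host name as the "domain", and the integer signature are assumptions for illustration, not YaCy's actual data structures.

```python
from urllib.parse import urlparse

def is_duplicate(url: str, signature: int, seen: set) -> bool:
    """Report whether this signature was already seen on the same host.

    Assumption: 'same domain' is approximated by the URL's host name.
    """
    host = urlparse(url).hostname
    key = (host, signature)
    if key in seen:
        return True
    # Remember the signature only under its own host, so an identical
    # document on a different host is not flagged as a duplicate.
    seen.add(key)
    return False

seen = set()
print(is_duplicate("http://a.example/page1", 42, seen))  # first occurrence
print(is_duplicate("http://a.example/page2", 42, seen))  # same host, same signature
print(is_duplicate("http://b.example/page1", 42, seen))  # other host, same signature
```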
Michael Peter Christen
1d30082446
added hindi translation configuration
12 years ago
Saransh Sharma
ee9d50e4b8
Hindi Some parts only
12 years ago
Michael Peter Christen
d05dc07cff
setting of new default values for ranking
12 years ago
Michael Peter Christen
97775fbebc
fixed ranking for add-function queries: this did not work, so the option
was removed. All function queries are now boosts (they multiply the score
according to a function). This is also the recommended way to boost
rankings based on functions, as explained in
http://nolanlawson.com/2012/06/02/comparing-boost-methods-in-solr/
12 years ago
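The difference between an additive function query and a multiplicative boost, as mentioned in the commit above, can be sketched in a few lines of Python. The function names and sample numbers are hypothetical; this only illustrates the arithmetic and is not Solr or YaCy code.

```python
import math

def additive_boost(score: float, function_value: float) -> float:
    """Add the function value to the relevance score."""
    return score + function_value

def multiplicative_boost(score: float, function_value: float) -> float:
    """Multiply the relevance score by the function value."""
    return score * function_value

# Two documents with different relevance scores and the same function value.
weak, strong, f = 0.2, 0.8, 10.0

# Added: the function value dominates and the two documents end up close together.
print(additive_boost(weak, f), additive_boost(strong, f))

# Multiplied: the original 1:4 relevance ratio is preserved.
print(multiplicative_boost(weak, f), multiplicative_boost(strong, f))
```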
Michael Peter Christen
ac5fa9fe48
fix for result counter logging
12 years ago
Michael Peter Christen
298bf2deb5
fix to ranking configuration servlet
12 years ago
Michael Peter Christen
2db058b551
added in RankingSolr_p.html a select box to switch between different
ranking situations. By default, four situations can be configured.
12 years ago
Michael Peter Christen
6fbca35215
fixed api table navigation
12 years ago
Michael Peter Christen
7ab5093321
added new solr title_exact_signature_l and
description_exact_signature_l to be able to identify unique title and
unique description fields.
12 years ago
Michael Peter Christen
f24ac518e6
redesign of exists()-query (can now be called with query) and the
CachedSolrConnector which based its cache on the key value. This will be
used to correct the title_unique_b and description_unique_b field.
12 years ago
Michael Peter Christen
27d6222880
added new field host_extent_i which, after a crawl and postprocessing,
holds the number of documents for the host where the document is hosted.
This is necessary for ranking and the norming of references per local
host in the ranking computation.
12 years ago
Michael Peter Christen
579eb01a49
showing now the details of references count in host browser:
external (ext), internal (int) and external hosts (hosts) for each
indexed document.
12 years ago
reger
0f4237d8e5
add admin option to delete load errors from index
12 years ago
reger
518b20147c
skip postprocessing during document.store if no citation index connected (prevent null pointer exception)
12 years ago
Marc Nause
ac478384d3
*) did some long overdue refactoring
12 years ago
Marc Nause
e99c8789ff
*) fixed encoding of query in link to map (in case geolocalization is
enabled, "Show search results for "köln" on map")
*) applied suggestions of Checkstyle plugin
12 years ago
Michael Peter Christen
ada3f27de7
added three new fields for a better ranking: references_internal_i,
references_external_i and references_exthosts_i. These can be used to
count and evaluate the number of external links to every web page. An
experimental ranking function could be, e.g.:
div(add(references_internal_i,product(references_external_i,references_exthosts_i)),add(clickdepth_i,1))
12 years ago
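To make the experimental function query from the commit above easier to read, here is the same arithmetic as a small Python sketch. The function name `experimental_rank` and the sample values are hypothetical; only the formula itself comes from the commit message.

```python
def experimental_rank(references_internal: int,
                      references_external: int,
                      references_exthosts: int,
                      clickdepth: int) -> float:
    """Evaluate
    div(add(references_internal_i,
            product(references_external_i, references_exthosts_i)),
        add(clickdepth_i, 1))
    with plain arithmetic.
    """
    numerator = references_internal + references_external * references_exthosts
    # Adding 1 keeps the divisor non-zero for pages at click depth 0.
    return numerator / (clickdepth + 1)

# Sample values chosen for illustration: 2 internal links, 3 external links
# from 4 distinct external hosts, click depth 1.
print(experimental_rank(2, 3, 4, 1))
```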
Michael Peter Christen
082e3274d6
- setting the same default ranking in the solr interface as for YaCy
search interfaces if no other ranking attributes are given
- using the YaCy ranking in the GSA interface only if there was not
given a GSA-style sort attribute
- to avoid confusion about correct ranking attributes, only the default
'0'-ranking profile is used and not the scenario-adapted ones (site, date)
because those should be configurable in the web interface before they are
actually used for ranking.
12 years ago
Michael Peter Christen
a20941c067
resume paused crawls on startup; user expects that restarts 'heal'
everything
12 years ago
Michael Peter Christen
edc0b33f6d
- showing references count and clickdepth in host browser
- fixed generation and presentation of both values
12 years ago
orbiter
2c3b024196
if the crawl was paused (automatically), show the reason for pausing in
the Crawler_p servlet.
12 years ago
reger
566a3b0294
fix: Index Administration > Reverse Word Index (IndexControlRWIs_p) corrected use of word search to word-hash search
- removed duplicate QueryParams.hashes2Handles, redundant with .hashes2Set
12 years ago
reger
989575b447
Merge branch 'master' of git://gitorious.org/yacy/rc1.git
12 years ago
Michael Peter Christen
27907c9739
added missing library after solr upgrade
12 years ago
reger
f37b4c984c
adjust Netbeans IDE project.xml classpath for Solr 4.2.1 jars
12 years ago
Michael Peter Christen
c6c01a3ca2
Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
12 years ago
Michael Peter Christen
cf0acd2cb4
upgrade to solr 4.2.1
12 years ago
reger
40b3f2c5fe
comment out dead menu link
12 years ago
reger
bf1e1ddca1
fix typo in prev commit
12 years ago
reger
d4d93be779
uncomment "used time" calculation for remote search log
12 years ago
reger
36202f27b0
improve remote search log, set "Returned Results" to transmitcount (instead of no value)
12 years ago