Michael Peter Christen
6ec02deec6
added new crawl attributes in crawl profile (not active yet)
12 years ago
Michael Peter Christen
a13e5153ac
- added the possibility to have not one but a list of crawl start urls
- the list of urls is entered in the expert crawl start in a textfield;
the one-line input field was replaced with a text box
- start urls can also be given in one single line where the urls are
separated by a '|'-character
- as an effect, the crawl profile cannot carry a single start url for
identification because it is possible to have more. Therefore the url was
removed from the crawl profile
- this affects all servlets which display a crawl profile: removed the
url field from all these servlets
- to work consistently with several start urls and the other crawl
starts which computed crawl start url lists from sitelists or sitemaps,
the crawl start servlet was restructured completely
- new rules for must-match patterns were created to make it possible
that site crawl starts also work with several crawl starts at once
12 years ago
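
The start-url input described in the commit above (a text box with one url per line, or a single line with urls separated by '|') could be split roughly as follows. This is a minimal sketch and not the code from the commit: the class `StartUrlSketch` and the method `parseStartUrls` are hypothetical names, and it assumes that line breaks and '|' may be mixed freely in one input.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not taken from the YaCy source tree.
final class StartUrlSketch {

    // Splits the content of the crawl start text box into individual start urls.
    // Assumption: urls are separated by line breaks, by '|', or by both.
    static List<String> parseStartUrls(String input) {
        List<String> urls = new ArrayList<>();
        if (input == null) return urls;
        // Inside a character class '|' is a literal, so this splits on CR, LF and '|'.
        for (String part : input.split("[\\r\\n|]+")) {
            String url = part.trim();
            if (!url.isEmpty()) urls.add(url); // skip blank lines and empty segments
        }
        return urls;
    }

    public static void main(String[] args) {
        String input = "http://a.example/|http://b.example/\nhttp://c.example/";
        System.out.println(parseStartUrls(input));
    }
}
```

Returning a list instead of a single string reflects the point made in the commit message: once several start urls are possible, no single url can identify the crawl profile.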
Michael Peter Christen
975bc95ddf
added default facet fields for json response format (stub)
12 years ago
Michael Peter Christen
2f218df55d
added missing license headers
12 years ago
Michael Peter Christen
a30653a864
added a regular expression test servlet which is linked within the
parser/crawler error page whenever a problem with regular expression
occurs.
This makes it easy to correct and enhance the must-match and
must-not-match patterns just by trying out which pattern could be
correct.
12 years ago
Michael Peter Christen
0504b01bdc
Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
12 years ago
orbiter
9413f77b65
Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
12 years ago
orbiter
a55e77a115
added twitter search heuristic
12 years ago
Michael Peter Christen
e54ac38095
- some corrections in usage of getFile() and getFileName()
- added more attributes in json response writer according to yacy
servlet
12 years ago
Michael Peter Christen
62add1d564
added the protocol and the file name extension to the solr fields since
these fields are probably facets in file search
12 years ago
Michael Peter Christen
e072632a54
no complaints about memory if the database is empty
12 years ago
Michael Peter Christen
b846f585fa
fixed a bug with size_i field usage
12 years ago
Michael Peter Christen
9db032664e
activate two solr fields which will be used by administration interface
(later)
12 years ago
orbiter
fcd5c7eec3
Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
12 years ago
orbiter
6171143b4a
added facet stub in JsonResponseWriter
12 years ago
Michael Peter Christen
e6330f648a
Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
12 years ago
Michael Peter Christen
e84ffdb4f3
enhanced solr writers
12 years ago
Michael Peter Christen
9644c186a4
added search functionality to ViewFile.html servlet
12 years ago
Marc Nause
03f3a8b647
*) fix for http://www.yacy-forum.org/viewtopic.php?f=2&t=759
12 years ago
Michael Peter Christen
b69ed96f0b
- added collections to yacydoc
- changed yacydoc.htm to yacydoc.json
- added query logging in solr and gsa search result
12 years ago
Michael Peter Christen
5df553c152
- added a json writer for solr (yes there was one using xslt but this
one writes the same way as yacysearch.json)
- using the new json solr result to change the ajax search in
IndexControlURLs to the new solr search
12 years ago
Michael Peter Christen
4634f0e626
fix for images_withalt
12 years ago
Michael Peter Christen
e65cecc419
- updated lucene libraries to 3.6.1
- added lucene-grouping which enables faceted search; try this:
http://localhost:8090/solr/select?q=*:*&start=0&rows=3&facet=true&facet.field=host_s
12 years ago
Michael Peter Christen
1754fbb6d9
Merge remote-tracking branch 'reger/master'
12 years ago
Michael Peter Christen
4d29f59a27
removed warnings
12 years ago
Michael Peter Christen
8c099d2106
Merge remote-tracking branch 'origin/master'
Conflicts:
htroot/api/ymarks/import_ymark.java
source/de/anomic/data/ymark/YMarkEntry.java
source/de/anomic/data/ymark/YMarkTables.java
12 years ago
apfelmaennchen
59bd478ed1
Added more sophisticated RDF output for YMarks, including the folder
structure (b:Topic) and support for multiple tags (dc:subject) and
folders (b:hasTopic) via rdf:Bag container.
12 years ago
apfelmaennchen
d31a632951
- added dmoz RDF dump importer
- added indexing to Tables columns to support larger bookmark
collections
- added RDF output (HTTP) for public bookmarks at /YMarks.rdf
- YMarkRDF also provides a Jena RDF Model as "internal" API
- various other changes/fixes for YMarks (mainly backend)
12 years ago
reger
40d8086bf7
keep input order of translation entries within one file section.
On translation conflicts (a key whose words are contained in another sentence), this allows the shorter key to be put at the end of the translation list.
12 years ago
Michael Peter Christen
10b911eed4
Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
12 years ago
Michael Peter Christen
be67c70a47
added Solr fields:
inboundlinks_text_chars_val
inboundlinks_text_words_val
inboundlinks_alttag_txt
outboundlinks_text_chars_val
outboundlinks_text_words_val
outboundlinks_alttag_txt
12 years ago
orbiter
d73fff0e0e
added solr field images_withalt_i
12 years ago
orbiter
66ac4076c2
added disjunction '|' option to site parameter in GSA API
12 years ago
sixcooler
a975bcffcb
clear fulltext-cache and stop crawling if running out of memory
12 years ago
sixcooler
e78fe3f477
also do a clearcache on the solr-connector-caches
12 years ago
sixcooler
9ee2e09983
statistics for solr-cache
12 years ago
Michael Peter Christen
d8425e6809
added collections to crawl monitor
12 years ago
Michael Peter Christen
ee23fc7a32
added h1..h6 counter fields
12 years ago
Michael Peter Christen
4b36a2c3b4
small style changes
12 years ago
Michael Peter Christen
8ca842b137
added new button design to more buttons
12 years ago
Michael Peter Christen
04709e91d7
add nice submit buttons to pdblue skin
12 years ago
Michael Peter Christen
ef6de52ab5
dependency is java6 only
12 years ago
Michael Peter Christen
b2b516cc3e
added a collection attribute to crawls and searches:
- a solr field collection_sxt can be used to store a set of crawl tags
- when this field is activated, a crawl tag can be assigned when crawls
are started
- the content of the collection field can be a comma-separated list of
tags; all of them are assigned to the documents when they are indexed as
result of such a crawl start
- a search result can be drilled down to a specific collection; this is
currently only available in the solr interface and also in the gsa
interface using the 'site' option
- this adds a mandatory field for gsa queries (the google api demands
that field all the time)
12 years ago
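
The comma-separated collection attribute from the commit above could be turned into a set of crawl tags along these lines. This is a sketch under stated assumptions, not the commit's implementation: `CollectionSketch` and `parseCollections` are hypothetical names, and trimming whitespace and dropping duplicate or empty tags are assumptions made for the illustration.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical helper for illustration only; not taken from the YaCy source tree.
final class CollectionSketch {

    // Turns a comma-separated collection attribute into a set of crawl tags.
    // Assumptions: surrounding whitespace is ignored, empty tags are dropped,
    // and duplicates collapse into one entry (input order is preserved).
    static Set<String> parseCollections(String attribute) {
        Set<String> tags = new LinkedHashSet<>();
        if (attribute == null) return tags;
        for (String part : attribute.split(",")) {
            String tag = part.trim();
            if (!tag.isEmpty()) tags.add(tag);
        }
        return tags;
    }

    public static void main(String[] args) {
        // Every tag in the resulting set would be assigned to each document
        // indexed as a result of the crawl start.
        System.out.println(parseCollections("news, blogs,news,,  wiki "));
    }
}
```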
Michael Peter Christen
174530a9e0
Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
12 years ago
apfelmaennchen
43f3a932fd
removed jquery.slider as it is already included as part of jquery-ui
package
12 years ago
apfelmaennchen
a01eb1b7fe
removed unused jquery plugin slider as it is part of jquery-ui package
12 years ago
Michael Peter Christen
4815713ec7
added synchronization to solr server requests since lucene is not
thread-safe. We experienced problems as described in
http://stackoverflow.com/questions/5327978/lockobtainfailedexception-updating-lucene-search-index-using-solr
12 years ago
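
The synchronization mentioned in the commit above can be pictured as a gate that lets only one request at a time reach the embedded index. The following is a generic sketch of that idea, not the code from the commit: `SerializedRequestsSketch`, `execute` and `runDemo` are hypothetical names, and the counter merely stands in for shared index state.

```java
import java.util.concurrent.Callable;

// Hypothetical illustration of serializing requests; not taken from the YaCy source tree.
final class SerializedRequestsSketch {
    private final Object lock = new Object();
    private int handled = 0; // stands in for shared state that must not be touched concurrently

    // Runs the request while holding the lock, so requests never overlap.
    <T> T execute(Callable<T> request) throws Exception {
        synchronized (lock) {
            handled++;
            return request.call();
        }
    }

    int handled() {
        synchronized (lock) {
            return handled;
        }
    }

    // Starts several threads that all send requests through one gate
    // and returns the number of requests the gate has handled.
    static int runDemo(int threadCount, int requestsPerThread) {
        SerializedRequestsSketch gate = new SerializedRequestsSketch();
        Thread[] threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++) {
            threads[i] = new Thread(() -> {
                for (int j = 0; j < requestsPerThread; j++) {
                    try {
                        gate.execute(() -> "ok");
                    } catch (Exception e) {
                        throw new RuntimeException(e);
                    }
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new RuntimeException(e);
            }
        }
        return gate.handled();
    }

    public static void main(String[] args) {
        System.out.println(runDemo(4, 1000));
    }
}
```

A single coarse lock trades throughput for simplicity: concurrent requests queue up instead of running in parallel.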
Michael Peter Christen
f75b3f8a47
added more patches to work without RWI data structure
12 years ago
Michael Peter Christen
a427a68bac
removed many warnings
12 years ago
Michael Peter Christen
c72c435517
- moved the gsa search interface from /gsa/searchresult? to /gsa/search?
- fixed the NB field data
12 years ago