Michael Peter Christen
3bcd9d622b
cleaned up classes and methods which are either superfluous at this time
...
or will be superfluous or subject of complete redesign after the
migration to solr. Removing these things now will make the transition to
solr more simple.
13 years ago
Michael Peter Christen
6f1ddb2519
Moved solr index-add method to the same method where the YaCy index is
...
written. Also done some code-cleanup.
13 years ago
Michael Peter Christen
315d83cfa0
cleanup
13 years ago
Michael Peter Christen
76202f068e
extended abstraction of local and remote solr index using one front-end
...
for index administration and querying.
13 years ago
Michael Peter Christen
7ec7341f60
added user-authentication protection to solr search (same as implemented
...
for yacysearch)
13 years ago
Michael Peter Christen
e2a97ef8f6
better explain how to access the embedded solr
13 years ago
Michael Peter Christen
826967513b
changed options in IndexFederated_p to switch on/off parts of the index
...
individually. The settings are experimental and the values of the
settings will be overwritten when an index migration from urldb to solr
starts.
13 years ago
Michael Peter Christen
cba4ab862e
fix for http://bugs.yacy.net/view.php?id=202
13 years ago
reger
36c9875b6e
removed localized number formatting from num-results_totalcount response (this is only used in xml and json where localized format is not valid)
13 years ago
orbiter
69e743d9e3
- more abstraction for the RWI index as preparation for solr integration
...
- added options in search index to switch parts of the index on or off
13 years ago
orbiter
6cc5d1094e
Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
13 years ago
orbiter
05a3ffd03a
patches to ensure that solr connectors are active only if they have a
...
solr object assigned and vice versa
13 years ago
orbiter
5a3c829872
embedded solr is only initiated if it is activated with
...
IndexFederated_p.html
13 years ago
Lotus
3a350a2f83
partial html fix for
...
http://forum.yacy-websuche.de/viewtopic.php?f=5&t=4454
13 years ago
Michael Peter Christen
97b7bcf2a6
added a solr search index
...
- by default, a (empty) solr storage instance is created at
SEGMENTS/solr_36
- the index is written if in /IndexFederated_p.html the flag "embedded
solr search index" is switched on
- a standard solr query interface is available now with a new servlet at
http://127.0.0.1:8090/solr/select
To test this, do the following:
- switch to webportal mode
- switch on the feature as described
- do a crawl. this fills the solr index. The normal YaCy search will NOT
work now!
- do a solr query, like:
http://127.0.0.1:8090/solr/select?q=*:*
http://127.0.0.1:8090/solr/select?q=text_t:Help
play with different search fields as you can see in
/IndexFederated_p.html
You can use the standard solr query attributes as described in
http://wiki.apache.org/solr/SearchHandler
13 years ago
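The query URLs in the entry above can also be built programmatically. The following is a minimal sketch, assuming the servlet address given in the commit message; the class and method names (SolrSelectUrl, build) are hypothetical and not part of YaCy. It only shows how a query such as text_t:Help is URL-encoded into the q parameter.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper, not taken from the YaCy sources.
final class SolrSelectUrl {

    // Servlet address as stated in the commit message.
    static final String BASE = "http://127.0.0.1:8090/solr/select";

    // Builds a select URL with the query URL-encoded into the q parameter.
    static String build(String query) {
        try {
            return BASE + "?q=" + URLEncoder.encode(query, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(build("*:*"));
        System.out.println(build("text_t:Help"));
    }
}
```

Further standard solr query attributes could be appended in the same way, each as an encoded key=value pair.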
Michael Peter Christen
f78ce93a80
collection of speed and memory saving hacks
13 years ago
orbiter
c00a3cf74d
less usage of generic logger to avoid logger generation overhead
13 years ago
orbiter
e76159040b
Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
13 years ago
orbiter
bbfa497a3c
replaced more size() > 0 by !isEmpty()
13 years ago
Michael Peter Christen
e3aa05b9dd
added creation of subpath pattern when crawl start is 'from file'
13 years ago
orbiter
0cbda0b2b8
- replaced all length() == 0 and size() == 0 with isEmpty()
...
- replaced some length() > 0 and size() > 0 with !isEmpty() - cannot be
done automatically
- implemented some isEmpty() methods
13 years ago
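The kind of replacement described in the entry above can be illustrated as follows. This is a sketch only; the class and method names are hypothetical and do not come from the YaCy code. It shows that both forms give the same result, with isEmpty() stating the intent directly. A purely textual replacement presumably fails where the receiver type has no isEmpty() method, which would explain why some isEmpty() methods had to be implemented.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical example class, not taken from the YaCy sources.
final class IsEmptyExample {

    // Before the cleanup: emptiness expressed through a size comparison.
    static boolean hasEntriesOld(List<String> urls) {
        return urls.size() > 0;
    }

    // After the cleanup: same result, intent stated directly.
    static boolean hasEntriesNew(List<String> urls) {
        return !urls.isEmpty();
    }

    // For strings, length() == 0 becomes isEmpty().
    static boolean isEmptyOld(String s) {
        return s.length() == 0;
    }

    static boolean isEmptyNew(String s) {
        return s.isEmpty();
    }

    public static void main(String[] args) {
        List<String> urls = new ArrayList<>();
        System.out.println(hasEntriesOld(urls) == hasEntriesNew(urls));
        urls.add("http://example.org/");
        System.out.println(hasEntriesOld(urls) == hasEntriesNew(urls));
    }
}
```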
Roland 'Quix0r' Haeder
aef9dd0350
- removed cleaning of blacklist cache on startup
...
- added cleaning of blacklist cache if cache is modified in interface
- extended cache saving to all cache types
- moved cache location to DATA/LISTS
- fixed static file path which was relative to the application path but
should be relative to data path - which is different in debian and mac
implementations
13 years ago
orbiter
c7afa8bc48
using SwitchboardConstants for solr attributes
13 years ago
orbiter
62202e2d71
refactoring of query attribute variable names for better consistency
...
with (next) stored query words
13 years ago
Michael Peter Christen
91f14ea38e
fix to solr configuration (case where the external solr was not online)
13 years ago
sixcooler
2c5b68d932
more abstraction of error message
13 years ago
Michael Peter Christen
9758c521ab
abstraction of error message
13 years ago
sixcooler
9b6e4e46ca
fix for http://forum.yacy-websuche.de/viewtopic.php?f=5&t=4430
13 years ago
Michael Peter Christen
b0c408788b
made class methods static where possible
13 years ago
Michael Peter Christen
5bd3c90907
- removed unnecessary semicolons
...
- added default case for switch
13 years ago
Michael Peter Christen
7c1ba99755
removed more unused method parameters
13 years ago
Michael Peter Christen
0301aba1e9
removed unused method parameters
13 years ago
Michael Peter Christen
241dd8410a
removed snippet pattern filter - it was not used
13 years ago
Michael Peter Christen
d3964253ae
- added @SuppressWarnings to unused servlet method parameters
...
- removed unnecessary casts
- removed unnecessary throw statements
13 years ago
Michael Peter Christen
ea10766bfd
cleaned unnecessary nested code
13 years ago
orbiter
78fc3cf8f8
refactoring and new usage of SentenceReader: this class appeared as one
...
of the major CPU users during snippet verification. The class was not
efficient for two reasons:
- it used a too complex input stream; generated from sources and UTF8
byte-conversions. The BufferedReader applied a strong overhead.
- to feed data into the SentenceReader, multiple toString/getBytes had
been applied until a buffered Reader from an input stream was possible.
These superfluous conversions had been removed.
- the best source for the Sentence Reader is a String. Therefore the
production of Strings had been forced inside the Document class.
13 years ago
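The idea in the entry above, reading sentences directly from a String instead of going through byte conversions and a BufferedReader, might look roughly like the sketch below. This is not the SentenceReader implementation; the class name and the simple rule of splitting at '.', '!' and '?' are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration, not the YaCy SentenceReader.
final class StringSentenceSplitter {

    // Walks the String once and cuts after '.', '!' or '?'.
    // No getBytes(), no input stream and no Reader are involved.
    static List<String> sentences(String text) {
        List<String> result = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            current.append(c);
            if (c == '.' || c == '!' || c == '?') {
                String s = current.toString().trim();
                if (!s.isEmpty()) result.add(s);
                current.setLength(0);
            }
        }
        // Keep trailing text that has no terminating punctuation.
        String tail = current.toString().trim();
        if (!tail.isEmpty()) result.add(tail);
        return result;
    }

    public static void main(String[] args) {
        for (String s : sentences("YaCy is a search engine. It is written in Java! Really")) {
            System.out.println(s);
        }
    }
}
```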
Michael Peter Christen
276a66a793
Adding a limit of 1000 links that a parser shall store during indexing.
...
A limit was necessary because some web pages have such huge numbers of
links that they can easily cause an OOM just by the number of links.
The question of whether the number of 1000 links is sufficient or too weak must
be answered with the result of testing this feature.
13 years ago
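A link limit like the one described in the entry above could be enforced as in the following sketch. The value 1000 is from the commit message; the class BoundedLinkCollector and its methods are hypothetical, and the actual parser code may handle the limit differently.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration, not taken from the YaCy parser.
final class BoundedLinkCollector {

    // Limit named in the commit message.
    static final int MAX_LINKS = 1000;

    private final List<String> links = new ArrayList<>();

    // Stores the link and returns true, or returns false once the limit is reached.
    boolean add(String url) {
        if (links.size() >= MAX_LINKS) return false;
        links.add(url);
        return true;
    }

    int size() {
        return links.size();
    }

    public static void main(String[] args) {
        BoundedLinkCollector collector = new BoundedLinkCollector();
        for (int i = 0; i < 1500; i++) {
            collector.add("http://example.org/page" + i);
        }
        System.out.println(collector.size());
    }
}
```

Dropping the surplus links silently keeps memory bounded per document, at the cost of an incomplete link list for very large pages.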
Michael Peter Christen
1825f165b8
better integration of blacklist according to use case
13 years ago
Michael Peter Christen
c18fa9fa75
Merge branch 'master' of git://gitorious.org/~reger/yacy/bbyacy-rc1
13 years ago
Michael Peter Christen
ce8d4b87d9
fixes for new eclipse 'Juno' warning 'Resource leak'.
13 years ago
reger
067728bccc
add search result heuristic. adding a crawl job with depth-1 for every displayed search result (crawling every external linked page of displayed search result pages)
13 years ago
Michael Peter Christen
03280fb161
removed segments-concept and the Segments class:
...
the segments had been there to create a tenant-infrastructure but were
never used since that was all much too complex. There will be a
replacement using a solr navigation using a segment field in the search
index.
13 years ago
Michael Peter Christen
9116013c64
- allow lazy initialization of solr value (if using 'lazy', then no
...
0-values and no empty strings are written). This may save a lot of
memory (in RAM and on disk) if excessive 0-values or empty strings
appear
- do not allow default boolean values for checkboxes because that does
not make sense: browsers may omit the checkbox attribute name if the box
is not checked. A default value 'true' would not comply with the
semantics of the browser's response.
- add a checkbox in IndexFederated_p for the lazy initialization of solr
fields.
13 years ago
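The lazy initialization described in the entry above can be pictured with the sketch below. It is an illustration under assumptions: the class LazyFieldWriter and the field name imagescount_i are hypothetical, and the real solr document writer in YaCy may be structured differently. In lazy mode, 0-values and empty strings are simply not written.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration, not taken from the YaCy sources.
final class LazyFieldWriter {

    private final boolean lazy;
    private final Map<String, Object> doc = new LinkedHashMap<>();

    LazyFieldWriter(boolean lazy) {
        this.lazy = lazy;
    }

    // In lazy mode a 0-value is skipped; otherwise it is stored.
    void put(String field, int value) {
        if (lazy && value == 0) return;
        doc.put(field, value);
    }

    // In lazy mode a null or empty string is skipped; otherwise it is stored.
    void put(String field, String value) {
        if (lazy && (value == null || value.isEmpty())) return;
        doc.put(field, value);
    }

    Map<String, Object> fields() {
        return doc;
    }

    public static void main(String[] args) {
        LazyFieldWriter writer = new LazyFieldWriter(true);
        writer.put("text_t", "Help");
        writer.put("imagescount_i", 0); // hypothetical field name, skipped in lazy mode
        System.out.println(writer.fields().keySet());
    }
}
```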
cominch
e6792ed37d
Merge remote-tracking branch 'original yacy/master'
13 years ago
Michael Peter Christen
96aeb127e3
generalized localhost naming.
...
this is also a preparation for a better IPv6 implementation.
13 years ago
Michael Peter Christen
77f795756c
fixing redirects and status codes: storing of status code in
...
ResponseHeader to make it available for late evaluations, like storage
in solr.
13 years ago
Michael Peter Christen
8dd469b9dd
added option to configure the autocommit delay time of solr on-the-fly
13 years ago
Michael Peter Christen
b9dfca4b0a
- fixed IndexFederated Servlet / an embedded Solr can now be selected
...
- added code stub for an embedded Solr but generation of Solr store is
still commented out (it works but is not yet ready for usage)
13 years ago
Michael Peter Christen
fad3b14813
added jetty libraries, needed for future use as web server and as
...
application server for the solr search interface
13 years ago
Michael Peter Christen
b9d42fd9c8
using com.google.common.io.Files instead of homebrew methods
13 years ago
Michael Peter Christen
a5eb91fa60
refactoring
13 years ago
cominch
c1ba58ae51
Augmented browsing: Small CSS fix
13 years ago
cominch
b2b205aa38
Augmented browsing: small js fix
13 years ago
cominch
dc9ee0cdb3
Augmented browsing: CSS fix
13 years ago
cominch
74fcc6f8c5
Augmented browsing: small UI modifications
13 years ago
cominch
c63c3a4495
Show additional interaction elements in footer section on each page, if
...
activated in ConfigPortal.html.
This footer is also visible in augmented browsing proxy mode.
13 years ago
cominch
fa98657bb3
Augmented Browsing: changed the settings page
13 years ago
cominch
751eeade0d
Merge remote-tracking branch 'original yacy/master'
13 years ago
cominch
84a11ec48c
Corrected loading of default page settings on ConfigPortal.html
13 years ago
sixcooler
bea002dc15
correct table in new look of Crawler_p
13 years ago
Michael Peter Christen
8738336408
set Xms lower than Xmx
13 years ago
cominch
6b4545d6b0
Only load tag information if necessary
13 years ago
cominch
011f8a5818
Auto Tagging: Add hyperlinks to tags (provisional)
13 years ago
Michael Peter Christen
1d4e206b2b
bugfix in vocabulary generation
13 years ago
cominch
2c89975378
Merge remote-tracking branch 'original yacy/master'
13 years ago
cominch
71047fe63a
Augmented browsing: CSS fix
13 years ago
Michael Peter Christen
52f5d40043
better abstraction of document model generation
13 years ago
Michael Peter Christen
8b7c4d3144
produce a rdf output containing the triplestore with yacydoc; ie:
...
http://localhost:8090/api/yacydoc.rdf?urlhash=yOiCM7Fh1hyQ
13 years ago
cominch
f7160dae5c
Merge remote-tracking branch 'original yacy/master'
13 years ago
cominch
e4555cbee3
Augmented browsing: Pass on additional action parameter
13 years ago
Michael Peter Christen
24bbe359ca
also integrate geonames library files with fewer cities. these are more
...
useful for tagging since fewer normal words are falsely identified as
locations
13 years ago
Michael Peter Christen
5a41e739b4
better apilink description
13 years ago
Michael Peter Christen
e16e4bd2ba
added ontology extraction in xml as api call for vocabularies
13 years ago
cominch
8cf47a8335
Merge remote-tracking branch 'original yacy/master'
13 years ago
cominch
b85f01a14e
Augmented browsing: small UI fix
13 years ago
Michael Peter Christen
26cb1c65c2
Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
...
Conflicts:
source/net/yacy/document/importer/OAIPMHLoader.java
13 years ago
Michael Peter Christen
963f92ed9a
- merged files
...
- changed behaviour of delete button in vocabulary edit
- fixed size number in vocabulary listing
13 years ago
cominch
d8815db877
Merge remote-tracking branch 'original yacy/master'
13 years ago
cominch
e4dab19045
Augmented Browsing: added template for document info bar
13 years ago
Michael Peter Christen
743b0ec89f
- added size of vocabulary to vocabulary view
...
- fixed bad terms in vocabulary-from-titles autogeneration
13 years ago
Michael Peter Christen
22d5e33c5e
added more methods to vocabulary generation: scrape document title and
...
document author to vocabulary
13 years ago
Michael Peter Christen
b2d1c25ebb
removed warnings/unused entities
13 years ago
Michael Peter Christen
f1aa4c4390
- accept only location names with a minimum length
...
- remove comma from synonym terms
13 years ago
Michael Peter Christen
cc9ad7198a
- use only names which consist of at least two parts
...
- remove word from derewo from locations
13 years ago
Michael Peter Christen
9264d8b4af
removed old navigation practice using subject tags in favor of
...
triplestore-tags
13 years ago
Michael Peter Christen
eeb4fd8b8c
refactoring (geolocalzation -> geolocation)
13 years ago
Michael Peter Christen
64c0268b2b
show triplestore metadata in yacydoc and viewfile
13 years ago
Michael Peter Christen
c2f0d16d2c
fixed vocabulary initialization
13 years ago
Michael Peter Christen
fbded1f466
Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
13 years ago
Michael Peter Christen
df3531f8d5
added the generation of virtual vocabularies using the pnd
13 years ago
Michael Peter Christen
e806106b10
jquery bugfix
13 years ago
Michael Peter Christen
a0f1decd82
- added loading of the dbpedia pnd triplestore in the dictionary loader
...
- renamed the dictionary loader to knowledge loader
- some refactoring in the library provider method names
13 years ago
Michael Peter Christen
6d17686258
made triplestore persistent by default
...
added a size display in triplestore servlet
13 years ago
Michael Peter Christen
8d6e77ad0c
Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
13 years ago
cominch
2ac7a5c1f2
Augmented browsing: Add overlay bar which shows the vocabulary tags
13 years ago
Michael Peter Christen
777d22e145
renamed "augmented proxy" to "augmented browsing"
13 years ago
cominch
bddac2839e
add missing files for tag display
13 years ago
cominch
441430f507
Merge remote-tracking branch 'original yacy/master'
13 years ago
cominch
3c255c025b
Show tags in search results (if activated in ConfigPortal_p.html)
13 years ago
Michael Peter Christen
1f9120d189
create new vocabularies also without an objectspace. this creates an
...
empty vocabulary
13 years ago