Commit Graph

178 Commits (f13c0b2abd2be59fb48cc09e61440f966f924363)

Author SHA1 Message Date
orbiter 0f7ea7ad9f - enhanced solr.add procedure for mass adds
12 years ago
Michael Peter Christen 840fa22135 disabled clickdepth computation during crawling since that is repeated
12 years ago
Michael Peter Christen d957739441 removed size request
12 years ago
Michael Peter Christen c95a84103a complete redesign of search process:
12 years ago
Michael Peter Christen 089dee1770 - generalized SchemaConfiguration into super-class Configuration and
12 years ago
Michael Peter Christen 56d5946a59 - added flags in IndexFederated_p.html to switch on or off the webgraph
12 years ago
Michael Peter Christen 788288eb9e added the generation of 50 (!!) new solr fields in the core 'webgraph'.
12 years ago
Michael Peter Christen 91a0401d59 introduced a second core named 'webgraph'. This core will hold the link
12 years ago
Michael Peter Christen 33bc255e85 prevent that crawl starts with very large url lists cause a time-out in
12 years ago
Michael Peter Christen b6de1f42dc Full redesign of solr connection architecture. This was done to support
12 years ago
Michael Peter Christen 4111606654 removed the commitWithin attribute because that is not the way how the
12 years ago
Michael Peter Christen de58043205 Added image license generation for solr image search results when
12 years ago
Michael Peter Christen 6f6ddaf7e7 A robinson peer does not need to write RWI data if such peers are only
12 years ago
Michael Peter Christen 7806680ab8 fixed a problem with re-feeding of already indexed documents with
12 years ago
Michael Peter Christen eb80405a16 added a disable function in RemoteCrawl_p servlet which prevents setting
12 years ago
Michael Peter Christen 4735bd47f4 - changed solr commit call and added an optimize option. Since Solr
12 years ago
Michael Peter Christen becd52a984 added also a re-calculation of reference counts during the
12 years ago
Michael Peter Christen 6f0baaa309 added the clickdepth post-processing: some links may have 'shortcuts' to
12 years ago
reger 0148f1bb8c fix: exception if default work files don't exist
12 years ago
Michael Peter Christen 9e4033f229 fix for event starter: delete start time when event is removed
12 years ago
Michael Peter Christen 99271ffd13 copy work tables from defaults/data/work if exist there and not in
12 years ago
Michael Peter Christen 24c9bb35f7 extended the Scheduler: introduced scheduled events
12 years ago
Michael Peter Christen cb5cbec14d distinguishing modified query string and original query string
12 years ago
orbiter 1f33c30d7b re-integrating useForHost method (lost sometime?) to get the noProxy
12 years ago
Michael Peter Christen 10527e28ae fix for wrong display of error urls in HostBrowser
12 years ago
Michael Peter Christen 8aa08261a7 update to Solr Boost handling
12 years ago
Michael Peter Christen 72f165d58b added a Boost class which stores solr query boost values. The class can
12 years ago
Michael Peter Christen 4eab3aae60 removed overhead by preventing generation of full search results when
12 years ago
Michael Peter Christen d6b82840f8 added a feature to find similarities in documents.
12 years ago
Michael Peter Christen f5ca5cea44 - added field options to all solr queries. This can be used to restrict
12 years ago
cominch 2bb8f045cc content control: use up-to-date definitions
12 years ago
cominch d2a94cc55e refactor package
12 years ago
cominch 21df1ad9e0 update and generalization of the SMW import and content control routines
13 years ago
Michael Peter Christen 71ed8e5e07 bugfixes for crawler
13 years ago
Michael Peter Christen 158732af37 automatically delete entries from the crawl profile list if crawl is
13 years ago
Michael Peter Christen 15d1460b40 added information about the reason of pausing of crawls
13 years ago
Michael Peter Christen 791e1dcfdf when a new crawl is started, delete all entries about error-urls for
13 years ago
Michael Peter Christen 8fb370d9f8 renovated the way how search results are counted. should be correct now...
13 years ago
Michael Peter Christen 6629e37685 tried to clean up the search process mess
13 years ago
Michael Peter Christen f8f05ecba7 - added a delete button in host browser to delete a complete subpath
13 years ago
Michael Peter Christen 4a14122ba7 in case that a crawl profile has a collection assigned, use the
13 years ago
Michael Peter Christen 0833937c1c better balancing and duetime-computation also for no-delay intranet
13 years ago
Michael Peter Christen c25d7bcb80 - added concurrency for robots.txt loading
13 years ago
Michael Peter Christen 2d9e577ad0 replaced the custom robots.txt loader by the standard http loader
13 years ago
Michael Peter Christen 799d71bc67 enhanced solr caching:
13 years ago
Michael Peter Christen a33e2742cb - removed unnecessary synchronized and deadlock in crawler
13 years ago
sixcooler 47ae7e322e smaller dhtDispatcher.cloudSize
13 years ago
Michael Peter Christen ccc3760a47 Refactoring and redesign of data architecture to make URIMetadataRow
13 years ago
Michael Peter Christen e5b3c172ff removed hack which translated Solr documents to virtual RWI entries
13 years ago
Michael Peter Christen 43f3345c90 - removed dependencies from URIMetadataRow and made direct access to
13 years ago