sixcooler
480e4a6a5c
Update to Jetty-9.2.11 - a bugfix release that did not solve my
problems, but does not harm anything
10 years ago
reger
72f6a0b0b2
enhance recrawl job
- allow to modify the query to select documents to process (after job has started)
- allow to include failed urls (httpstatus <> 200)
10 years ago
Michael Peter Christen
e0a23c56c7
Merge branch 'master' of git@github.com:yacy/yacy_search_server.git
10 years ago
Michael Peter Christen
fb9e1dd3f5
servlet for latest commit
10 years ago
reger
5183ad718d
upd to poi-3.12.jar
10 years ago
reger
7478338a40
remove augmented parsing activation from frontend
experimental implementation is not used and is based on the error-prone experimental rdfaparser
10 years ago
reger
11aa2edfe1
remove RDFa parser activation from frontend
reason: the experimental implementation of the RDFa parser is not executed (limited to special urls) but may cause errors in normal html parsing due to an inputstream.reset
10 years ago
Michael Peter Christen
ff11ac89f7
Merge branch 'master' of git@github.com:yacy/yacy_search_server.git
10 years ago
Michael Peter Christen
5e2d23b7a0
removed the new index export method from the IndexControlURLs_p.html
servlet and moved it to a new /IndexExport_p.html servlet. This servlet
is now more prominently linked in the main menu under Production -> Index
Export/Import
10 years ago
reger
64a7b0b140
Merge origin/master
10 years ago
reger
49b79987c9
remove obsolete searchfl work table
was used to register urls with incomplete words in the snippet but is never accessed
10 years ago
sixcooler
4533f392b0
correct the dark themes to also show a dark navbar on search results
10 years ago
Michael Peter Christen
d0aff91f23
fix for index import
10 years ago
Michael Peter Christen
34de1e8cbc
gzip compression will perform more efficiently and with a better compression
level
10 years ago
Michael Peter Christen
98be59ce9c
full solr xml exports will now be automatically compressed during
export. That makes it possible to export a solr xml dump even if disc
space is low.
10 years ago
Michael Peter Christen
a1a8edfc0a
wrap HeapReader close() in a catch Throwable block to prevent an
exception during close from blocking the whole shutdown process
10 years ago
Michael Peter Christen
b43811d38c
added surrogate import process for exported solr dumps.
Just throw your solr dump file into DATA/SURROGATES/in/ and it will be
imported!
10 years ago
Michael Peter Christen
b77537294d
prevent disc usage when showing tray animation
10 years ago
Michael Peter Christen
eec78e1b0c
added intensity option to graphics
10 years ago
Michael Peter Christen
a5007f345e
re-licensing some of my old visualization classes under LGPL 2.1
10 years ago
Michael Peter Christen
c99a665593
adding a 3-pixel font generator made some time ago..
10 years ago
Michael Peter Christen
c7576d6028
added a full solr export to the IndexControlURLs_p.html servlet. The
export function is also now the default export option. The export file
format for a full solr export is very similar to a solr search result
xml, only the <lst name="responseHeader"> tag is missing.
The exported xml has a special line termination feature: all documents
will be exported into a single line without any CR in between. That
means that every document is completely inside a single line. While this
is not readable at all for humans, it is very useful for linux line
processing scripts, like grep. Using grep it will be easy to select
single documents which match a given pattern.
Such dumps shall be importable with the DATA/SURROGATES/in import
function, but that import is not yet adapted to the new file format.
10 years ago
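The one-document-per-line layout described in the commit above means a dump can be filtered with a plain line scan, much like grep. A minimal Python sketch, assuming the dump is gzip-compressed and every document sits on its own line; the function name `matching_docs`, the field name and the sample content are hypothetical and not taken from YaCy.

```python
import gzip
import io
import re
from typing import BinaryIO, Iterator


def matching_docs(stream: BinaryIO, pattern: str) -> Iterator[str]:
    """Yield every line of a gzip-compressed dump that matches `pattern`."""
    regex = re.compile(pattern)
    # gzip.open accepts an existing binary file object as well as a filename.
    with gzip.open(stream, mode="rt", encoding="utf-8") as lines:
        for line in lines:
            if regex.search(line):
                yield line.rstrip("\n")


# Hypothetical two-document dump, built in memory for illustration only.
sample = (
    '<doc><str name="sku">http://example.org/a</str></doc>\n'
    '<doc><str name="sku">http://example.net/b</str></doc>\n'
)
compressed = io.BytesIO(gzip.compress(sample.encode("utf-8")))
hits = list(matching_docs(compressed, r"example\.net"))
```

Because each hit is a complete document, the selected lines can be written to a new file without any XML parsing.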
Michael Peter Christen
47682bf467
fix for unresolved pattern
10 years ago
Michael Peter Christen
197f7449e5
All entities of crawl profiles are now editable in the crawl profile
editor.
10 years ago
reger
1d8e1e4bac
- Image search expand box, adjust javascript hs padtominsize parameter to make sure the expand box doesn't shrink on small images
- ensure ImageResult.imagetext has a value for the link text (use filename if no alt text given)
10 years ago
reger
8b35656007
remove hard throw exception in makeResultEntry
remove unused "share." peername.yacy url rewrite
10 years ago
reger
af57fbefad
use available mime (instead of null) on imageresult from metadatanode
10 years ago
reger
dd7782bac0
revert deletion of BinSearch
(accident)
10 years ago
reger
000dde9511
Eliminate duplication of values for search ResultEntry
by instantiation from URIMetadataNode, by eliminating the differentiation of ResultEntry/URIMetadataNode.
- moved remaining ResultEntry functionality to URIMetadataNode
- for 1:1 functionality added a function makeResultEntry()
- removed ResultEntry
- refactored related code
Main difference is that after makeResultEntry the text_t content is removed and alternative title/url strings for display are calculated.
Main difference left is, that
10 years ago
reger
29c4aa3991
fix compiler notification of missing serialID
from last commit
10 years ago
reger
3d53da8236
refactor ResultEntry to be based on MetadataNode/SolrDocument
to share/reuse common access routines
10 years ago
reger
d882991bc5
Implement sharing of ioDispatcher for term & citation index
as proposed in ioDispatcher description
10 years ago
reger
17e820cfd7
use doctype() in ViewFile to choose display routines
in preference to getFileExtension()
10 years ago
reger
370ba9da71
On imageSearch prefer mime to sort out non-image documents
Generalize the hack to prevent urls with just an img extension being returned
improving http://mantis.tokeek.de/view.php?id=528
10 years ago
reger
cd31633369
improve MultiprotocolURL.getFileExtension()
prevent string OOB when the query part contains a dot (return just "")
see log snippet in http://mantis.tokeek.de/view.php?id=533
10 years ago
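The failure mode named in the commit above (a dot in the query part confusing extension extraction) can be avoided by looking at the path component only. A minimal Python sketch of that idea, not YaCy's Java implementation; the function name `file_extension` and the lower-casing of the result are assumptions made for illustration.

```python
import posixpath
from urllib.parse import urlsplit


def file_extension(url: str) -> str:
    """Return the extension of the last path segment, or "" if there is none."""
    # urlsplit separates the query string, so dots in it are never inspected.
    name = posixpath.basename(urlsplit(url).path)
    dot = name.rfind(".")
    if dot < 0 or dot == len(name) - 1:
        return ""  # no dot at all, or a trailing dot with nothing after it
    return name[dot + 1:].lower()
```

For example, `file_extension("http://example.org/view?file=a.b")` yields an empty string because the only dot sits in the query part.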
reger
c60ccdfbcf
Increase IODispatcher dumpQueue size to 2 to reduce the risk of a concurrent emergency dump,
skip concurrent emergency merge
dealing with/see http://mantis.tokeek.de/view.php?id=566
10 years ago
reger
8a9622c31c
fix string OoB on getImagelinks with long alttext
in description calculation
10 years ago
reger
aa83931765
Convert content charset for display via CacheResource_p
Cached resource charset encoding might not match the internal handling (using utf-8),
convert resource to utf-8
see http://mantis.tokeek.de/view.php?id=576
10 years ago
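The conversion described in the commit above amounts to decoding the cached bytes with their declared charset and re-encoding them as utf-8. A minimal Python sketch of that step, not the CacheResource_p code; the function name `to_utf8` and the fallback behaviour for unknown or wrong charsets are assumptions.

```python
from typing import Optional


def to_utf8(raw: bytes, declared_charset: Optional[str]) -> bytes:
    """Re-encode `raw` from its declared charset to utf-8."""
    charset = declared_charset or "utf-8"
    try:
        text = raw.decode(charset)
    except (LookupError, UnicodeDecodeError):
        # Assumed fallback: unknown charset name or undecodable bytes.
        text = raw.decode("utf-8", errors="replace")
    return text.encode("utf-8")
```

A resource stored as iso-8859-1, for instance, comes back as utf-8 bytes that the display code can handle uniformly.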
reger
3e742d1e34
Init remote crawler on demand
If the remote crawl option is not activated, skip init of remoteCrawlJob to save the resources of the queue and the idling thread.
Deployment of the remoteCrawlJob is deferred until activation of the option.
10 years ago
Michael Peter Christen
dbf9e3503d
Merge branch 'master' of git@github.com:yacy/yacy_search_server.git
10 years ago
Michael Peter Christen
8b1a30be50
removed a -UNRESOLVED_PATTERN-
10 years ago
Michael Peter Christen
9938c81378
fix for division by zero
10 years ago
reger
13f013f64a
Limit extra sleep of BusyThread on LowMemCycle
10 years ago
reger
cd7c0e0aae
detail optimization of RecrawlThread
10 years ago
reger
ace71a8877
Initial (experimental) implementation of index update/re-crawl job
added to IndexReIndexMonitor_p.html
Selects existing documents from the index and feeds them to the crawler.
Currently only the field fresh_date_dt is used to determine documents for recrawl (fresh_date_dt:[* TO NOW-1DAY])
Documents are added in small chunks (200) to the crawler, only if no other crawl is running.
10 years ago
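The feeding behaviour described in the commit above (chunks of 200, only while no other crawl is running) can be sketched as follows. This is an illustrative Python outline, not the RecrawlThread code; `chunks`, `feed_crawler`, `crawler_busy` and `enqueue` are hypothetical names, and stopping at the first busy check (rather than waiting and retrying) is a simplifying assumption.

```python
from typing import Callable, Iterable, Iterator, List

CHUNK_SIZE = 200  # chunk size stated in the commit message
RECRAWL_QUERY = "fresh_date_dt:[* TO NOW-1DAY]"  # selection query from the commit message


def chunks(urls: Iterable[str], size: int = CHUNK_SIZE) -> Iterator[List[str]]:
    """Group urls into lists of at most `size` entries."""
    batch: List[str] = []
    for url in urls:
        batch.append(url)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def feed_crawler(
    urls: Iterable[str],
    crawler_busy: Callable[[], bool],
    enqueue: Callable[[List[str]], None],
) -> int:
    """Hand chunks to `enqueue` while no other crawl runs; return urls fed."""
    fed = 0
    for batch in chunks(urls):
        if crawler_busy():
            break  # assumption: give up instead of sleeping and retrying
        enqueue(batch)
        fed += len(batch)
    return fed
```

In a real job the url iterable would come from a Solr query such as `RECRAWL_QUERY`; here it is left to the caller.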
reger
141cd80456
correct log msg text
10 years ago
reger
f3ce99bfb8
fix extract of inboundlinks_protocol_sxt
url counter may be > 999
10 years ago
reger
2bc9cb5828
fix early return in addToCrawler
check / handle all supplied urls after error url
10 years ago
Michael Peter Christen
f5f88272e4
Merge branch 'master' of git@github.com:yacy/yacy_search_server.git
10 years ago
Michael Peter Christen
5c67c4d460
fix for latest commit, see
f810915717 (commitcomment-11145880)
10 years ago