41 Commits (78e7aadb26ad38c30daa1a845b2d9cee3843c853)

Author SHA1 Message Date
Michael Peter Christen 765943a4b7 Redesign of crawler identification and robots steering. A non-p2p user
11 years ago
Michael Peter Christen 57ffdfad4c added a crawl option to obey html-meta-robots-noindex. This is on by
12 years ago
Michael Peter Christen 25499eead5 - added a new field for the regular expression in crawl start
12 years ago
Michael Peter Christen 0b6566a389 optimizations when starting large crawl requests with many start urls in
12 years ago
Michael Peter Christen fb0fa9a102 - fixed 'delete from subpath' during crawl start which deleted nothing;
12 years ago
orbiter b55ea2197f - redesign of crawl start servlet
12 years ago
orbiter 1c66de4bd4 - removed scheduled crawling options in crawl start because it is
12 years ago
Michael Peter Christen 5e77801aac update to web interface structure
12 years ago
orbiter 354ef8000d - added 'deleteold' option to crawler which causes that documents are
12 years ago
Michael Peter Christen ac9540dfb6 removed options for stopwords which are not used
12 years ago
orbiter 60b1e23f05 added new crawl options:
12 years ago
Michael Peter Christen a13e5153ac - added the possibility to have not one but a list of crawl start urls
12 years ago
Michael Peter Christen b2b516cc3e added a collection attribute to crawls and searches:
12 years ago
Michael Peter Christen d7eb18cdf2 accept also file names beginning with "file://" for crawl start from
13 years ago
Michael Peter Christen 8bfc987374 enhanced hint how to enter file:// urls
13 years ago
orbiter ebd840ebf6 - enhanced description on search front page
13 years ago
orbiter e4a82ddd8b produce a bookmark entry from every crawl start. these bookmarks are always private.
13 years ago
orbiter ff32469272 added a link to /api/util/getpageinfo_p.xml as API to crawl start info and to ViewFile.html
13 years ago
low012 1b8b989744 *) set maxlength of input field for country code filter to value > default text length (old value caused warning in Opera)
13 years ago
orbiter cf4fd525ee added directDocByURL attribute in crawl profile
13 years ago
orbiter b250e6466d implemented crawl restrictions for IP pattern and country lists
13 years ago
orbiter 5ad7f9612b added crawl settings for three new filters for each crawl:
13 years ago
orbiter af63aa1d0e added fresh links to java regular expression api-doc
14 years ago
orbiter 7962d35425 - removed file upload function in crawl start and replaced it with an input field for a file path where the crawl start file is loaded. This was necessary to support the API steering for file crawl starts, for two reasons:
14 years ago
orbiter 11bebe356b fixed crawl start: with SVN 7225 the name of the crawl start url was not given in input field and therefore all crawl starts had contained the empty string as crawl start url
14 years ago
mikeworks 70576e88d2 de.lng: Added some more untranslated strings I found and uncommented old ones that were removed
14 years ago
orbiter f6eebb6f99 replaced auto-dom filter with easy-to-understand Site Link-List crawler option
14 years ago
mikeworks b019426811 de.lng: Added German translations for new Index Creation pages RSS Feeds and adapted text in Tables_p.html and CrawlStartExpert_p.html to match some typos, also changed one name tag to id to conform with XHTML 1.0 Strict
14 years ago
orbiter 58b7417a59 - added a new 'easy' crawl start menu which can be used for the special case of loading a complete domain
14 years ago
orbiter 2f381b8d7a - fixed at least two causes for a NPE after a use case switch.
17 years ago
lulabad fc54d4519e some more XHTML strict errors
17 years ago
daburna 3636526bd6 replaced re-crawl/min-age as suggested here: http://forum.yacy-websuche.de/viewtopic.php?f=9&t=198
17 years ago
daburna a047e7f830 replaced irritating "re-crawl"
17 years ago
orbiter b183bf6f42 - fixed opensearch bugs
17 years ago
low012 51800539b2 *) changed regex that is created for crawling filter (see http://forum.yacy-websuche.de/viewtopic.php?t=83)
18 years ago
orbiter 5009695537 fix for double-entries of crawl tasks.
18 years ago
orbiter c7a614830a several bugfixes
18 years ago
allo b2a9080a14 fix for when the user hits cancel
18 years ago
allo b68fb8a0ba one \ more
18 years ago
allo e24b54301e RegEx, not Blacklist-style RegEx ;/
18 years ago
orbiter 3f49cd516b split the index create page into two pages:
18 years ago