luc
f7b854465b
Merge branch 'master' of https://github.com/yacy/yacy_search_server
9 years ago
reger
a6617ad887
expand initRemoteCrawler() to terminate worker threads if called to deactivate remote crawl.
...
On startup, the resources for the remote crawler are saved if it is disabled. Once started,
however, the threads kept running idle after remote crawl was disabled. Now the threads are
terminated, so the resources are also saved when disabling during runtime.
+ remove empty class Channels
9 years ago
reger
2048b7e057
support scraping start-/enddate from html tag with property "datetime"
...
This may be used in html5 <time> tag (which we don't explicitly support yet for date in content scraping).
9 years ago
reger
900d4584ba
complete resource cleanup of lists in contentscraper's close()
9 years ago
reger
06e5cd6164
add support parsing swf-metadata to swfparser
...
flash supports metadata tag in swf file with metadata in xmp (xml) format.
parse some common data to include it in the head section of the html string
of converttohtml.
9 years ago
luc
e0ac26d63e
Merge branch 'master' of https://github.com/yacy/yacy_search_server
9 years ago
reger
11b1587067
replace remaining use of java.util.Vector by ArrayList (WebCat-swf)
9 years ago
reger
9331acdb18
add support for DEFINEFONT3 (swf8) to webcat parser
...
experienced an issue with the JPEGTABLE tag (with length=0) causing parsing to abort (IOException);
as we don't use/need it for text parsing, this tag is now skipped.
9 years ago
reger
bf5fca5d99
add missing swf tag constants according to latest spec
...
reduce use of synced vector in webcat parser
9 years ago
luc
aa60ad1dbc
Merge branch 'master' of https://github.com/yacy/yacy_search_server
9 years ago
reger
1f18653de0
pass parsed swf content through htmlscraper
...
Swf may contain a subset of html tags which shouldn't appear as text.
Especially <font> tag may totally screw up metadata servlet if not filtered out.
9 years ago
reger
18ecf57792
add support of compressed swf to swfParser
...
from JavaSWF2 (source compatible to WebCat).
Moved swf file signature check to parser
Changed use of synced vector to list swf InStream
9 years ago
sixcooler
5cb7ba0dc4
fix for connections not getting closed to get favicon.ico during search
9 years ago
sixcooler
e1dd808e1c
fix for 'move test classes to test/java'
9 years ago
luc
ef83e34b8a
Merge branch 'master' of https://github.com/yacy/yacy_search_server
9 years ago
reger
6c25710a34
replace bugfixed webcat-swf.jar
9 years ago
reger
4213ff84d4
import WebCat swf parser custom source package
...
This package is not available as jar (used jar is a custom compile as we
use just a portion of the package)
WebCat package is not maintained. To be able to fix bugs, source extract
of swf parser imported here.
9 years ago
reger
bceb779414
refactor libbuild/GitRevMavenTask (mavenize)
...
to be able to add additional modules to build
9 years ago
reger
730fb43ab1
add translation DE,FR submenuRanking.template
...
upd translation DE RankingSolr_p
9 years ago
reger
84c970eaec
move test classes to test/java (subdirectory as in Maven standard subdir layout)
...
because ViewImage*Test.java breaks test run
9 years ago
reger
9f91e6124f
add DE translation for submenuCrawler.template
...
+ upd submenuIndexControl.template
9 years ago
reger
ed3e16e092
apply remote result count config value to Bookmark Autosearch
...
+ prepare to make the widely unused Bookmark feature optional
9 years ago
Michael Peter Christen
5d635879f8
Merge pull request #40 from Scarfmonster/autocrawl
...
Automatic crawling
9 years ago
Ryszard Goń
7d6e0d8470
Add missing settings to autocrawl settings page
9 years ago
Ryszard Goń
7a7a1277bd
Add autocrawl settings page
9 years ago
Ryszard Goń
a98c395023
Add the Autocrawl thread
9 years ago
reger
4765e374e6
altered calc. of search result items per page to display
...
taking the existing limits into account but make it consistent with search option screen for admin and public user
changes:
- configured default number of items per page (ConfigPortal_p.html) is used as is (no hardcoded limit)
- otherwise requests are limited to 100 results per page ( = search option, index.html)
(this basically is the major change, inc. limit from 20 to 100 for public user)
P.S. - the older grant of more (1000), if no online snippet calculation, is kept (for the time being)
see http://mantis.tokeek.de/view.php?id=627
9 years ago
luc
231be83eb6
Corrected access to Load_MediawikiWiki.html and Load_PHPBB3.html
...
A NullPointerException occurred when trying to access these pages in
Robinson (Search portal) mode
9 years ago
Ryszard Goń
1728cd30c6
Create autocrawl profiles
9 years ago
luc
85a9363012
Merge branch 'master' of https://github.com/yacy/yacy_search_server
9 years ago
reger
abd8ecb503
remove contentdom-dependent override of search result items per page
...
initially introduced e4570bffaf (diff-ae6c130fc11088c830b00ed9256ab56b)
(as one part of unexpected difference in actual vs requested results, partial bugfix for http://mantis.tokeek.de/view.php?id=627 )
9 years ago
luc
41767a01c2
Merge branch 'master' of https://github.com/yacy/yacy_search_server
9 years ago
reger
8271f783ca
upd pom javadoc goal
...
to not fail a build on javadoc errors
9 years ago
reger
ff27824964
fix swfParser reading file signature
...
before passing to library (current version expects data w/o signature)
9 years ago
reger
b29db4640c
update Maven pom - add release-profile
...
to create the release archive only if profile is activated (speeding up normal compilation)
- bind install of the 2 jars not available in repository to validate phase (was clean)
to automatically add these to local repository (with disadvantage it's done every time)
9 years ago
reger
04161912a5
fix tray icon switch
...
(using predefined/correct config name)
9 years ago
luc
7aa1a29e33
Return more accurate HTTP status 400 with detail message when some error occurs on ViewImage:
...
- missing required parameters
- url licence invalid
9 years ago
luc
bd9dc2f32b
Corrected NullPointerException cases occurring in YJsonResponseWriter
...
when no description is available.
9 years ago
luc
0076f9f97d
Updated documented sample url
9 years ago
luc
cfdbc2b487
Improved URLLicence reliability for use by concurrent non-authorized users.
...
Removed URLLicence generation when unnecessary (authorized users)
9 years ago
reger
e3d53f0248
add de translation for IndexExport_p
9 years ago
reger
9f5b768d84
fix typo in translation (de,hi) for AccessTracker_p
...
- rem some not translated in ru (-> currently best maintained translation)
9 years ago
reger
c91e712178
further refactor using standard java / (one) utf-8 charset variable
...
extending initiative of commit 9a25751850
9 years ago
Michael Peter Christen
e3e8015306
Merge pull request #28 from Stepanov-Sergey/patch-1
...
fixed typos
9 years ago
Michael Peter Christen
3dbd3caecf
Merge pull request #37 from sudheesh001/LogFix
...
Log files are commitable and shouldn't be
9 years ago
Michael Peter Christen
9a25751850
Merge pull request #38 from luccioman/master
...
Refactoring : use StandardCharsets instead of hardcoded charset names
9 years ago
reger
bfcca6bfee
update on translation files
...
- delete removed servlets
AugmentedBrowsingFilters_p.html (de)
CrawlStartIntranet_p.html (de)
IndexCreateWWW***Queue_p.html (de)
Ranking_p.html (de)
- add
IndexCreateQueues_p.html
- rename
Settings_Http.inc -> Settings_ProxyAccess.inc
Language_p.html -> ConfigLanguage_p.html
9 years ago
reger
c283efdd6d
remove obsolete css style for removed file CacheAdmin_p.html
...
and remove from translations
9 years ago
luc
571bc55937
Refactoring: use StandardCharsets constants instead of hard-coded charset names.
9 years ago
reger
218061752e
add missing quote chars in sk.lng translation file
...
+ minor: del one redundancy
9 years ago