Otherwise on a malformed getpageinfo_p XML response (from the browser
point of view), JavaScript errors where thrown and the ajax status
steering wheel remained displayed indefinitely.
This prevent rendering a big and inconvenient scrollbar on resources
containing many links.
If really needed, preview of all links is still available with a "Show
all links" button.
Doesn't affect the number of links used once the crawl is effectively
started, as the list is then loaded again server-side.
Replaced by shortcuts defined by the HTML "accesskey" attribute which
has the advantage to be advertised by screen readers when focusing the
corresponding buttons, contrary to custom JavasScript key handlers.
Now With Firefox :
- "Alt + Shift + n" for next page
- "Alt + Shift + p" for previous page
Following ARIA recommendation : "keyboard shortcuts enhance, not
replace, standard keyboard access." ( see
https://www.w3.org/TR/wai-aria-practices/#kbd_shortcuts_behavior_design)
Fix for mantis 711 (http://mantis.tokeek.de/view.php?id=711)
Added as an additional icon with title in the search progress bar, to
inform about background search feeder threads terminated or still
running. While giving a bit more information to users about the p2p
search process, this can help choosing whether or not wait a little bit
more time before going to the next page, in order to get results from
various sources sorted as best as possible (see #91 for a discussion
about sorting accuracy and network latency).
Other related modifications included :
- regular updates to statistics in the progress bar until the
background feeders are completely terminated.
- removed some uses of unsecure and discouraged JavaScript elements
Fixes issue #90 for local queries only: Stealth mode, Portal mode or
Intranet mode.
For P2p mode, the issue would probably be difficult to solve with
reasonable performance. This is still to dig.
Also switched some InterreputedException catch log messages to warn
level as this is normal behavior when shutting down a peer.
Fixed yacysearch buttons navbar behavior to deal correctly with total
results count or offset over 1000. Also improved the buttons navbar to
be able to navigate over 10th page for local queries.
bayesian filters. This can be used to classify documents during
indexing-time using a pre-definied bayesian filter.
New wordings:
- a context is a class where different categories are possible. The
context name is equal to a facet name.
- a category is a facet type within a facet navigation. Each context
must have several categories, at least one custom name (things you want
to discover) and one with the exact name "negative".
To use this, you must do:
- for each context, you must create a directory within
DATA/CLASSIFICATION with the name of the context (the facet name)
- within each context directory, you must create text files with one
document each per line for every categroy. One of these categories MUST
have the name 'negative.txt'.
Then, each new document is classified to match within one of the given
categories for each context.
- date navigation
The date is taken from the CONTENT of the documents / web pages, NOT
from a date submitted in the context of metadata (i.e. http header or
html head form). This makes it possible to search for documents in the
future, i.e. when documents contain event descriptions for future
events.
The date is written to an index field which is now enabled by default.
All documents are scanned for contained date mentions.
To visualize the dates for a specific search results, a histogram
showing the number of documents for each day is displayed. To render
these histograms the morris.js library is used. Morris.js requires also
raphael.js which is now also integrated in YaCy.
The histogram is now also displayed in the index browser by default.
To select a specific range from a search result, the following modifiers
had been introduced:
from:<date>
to:<date>
These modifiers can be used separately (i.e. only 'from' or only 'to')
to describe an open interval or combined to have a closed interval. Both
dates are inclusive. To select a specific single date only, use the
'to:' - modifier.
The histogram shows blue and green lines; the green lines denot weekend
days (saturday and sunday).
Clicking on bars in the histogram has the following reaction:
1st click: add a from:<date> modifier for the date of the bar
2nd click: add a to:<date> modifier for the date of the bar
3rd click: remove from and date modifier and set a on:<date> for the bar
When the on:<date> modifier is used, the histogram shows an unlimited
time period. This makes it possible to click again (4th click) which is
then interpreted as a 1st click again (sets a from modifier).
The display feature is NOT switched on by default; to switch it on use
the /ConfigSearchPage_p.html servlet.
tested with IE11 and Firefox 32 (change worked for both to show 2nd line without cutting off height)
+fix charset parameter in metadataImageParser
+update start errMsgTxt to "java 1.7"
- added order option to solr queries to be able to retrieve document
lists in specific order, here: link length
- added HyperlinkEdge class which manages the link structure
- integrated the HyperlinkEdge class into clickdepth computation
- extended the linkstructure.json servlet to show also the clickdepth
and other statistic information
- added a servlet api/linkstructure.json which generates a link graph
information in json
- added a javascript link graph renderer hypertree.js using d3 and the
new servlet linkstructure.json
- embedded the new link graph in the crawler monitor and the host
browser
- removed 'worker' processes
- no internal time-out behaviour: methods either are successful or
return null
- waiting is only done on top-level
- removed snippet-production; this is replaced by solr snippets
- removed statistics based on solr size queries (they had been VERY
long); the statistics (like suggestions or tag cloud) are now again
based on the old but very fast RWI index. In portal or intranet mode the
RWI index is usually switched off; if you like to have statistics again
then you must switch on the rwis again in this mode.
- fixed many bugs regarding correct page counter