principles:
- provide an easy tutorial-like "what should I do first" menu
- provide all elements which are subject to most first questions to YaCy
exibition people on top level: Resource limitation, Parser and Ranking
settings
I apologize to everyone who are used to the old style and need to find
the menu items (again) after this change. I hope that this will make the
interface more usable for new users who see a web indexer/crawler the
first time.
which is the location for the search input frame. The new embedded
header is a clone of the simpleHeader with removed menus and without a
link to the index administration.
possible
- re-implemented formerly jquery-based tagcloud with simple html
- fixed minimized menu for mobile/minimized view (try that: make window
very narrow...)
- adopted all YaCy administration pages to new framework
- created new search page layout (working, but still work in progress)
- old skin files are fully appliable! (and looking good)
- target is a new style based on bootstrap examples, see /test.html
- icons in YaCy may be replaced by glyphicons (to be done)
config settings frome the page also removed from yacy.init
augmentation.reflect
augmentation.addDoctype
augmentation.reparse
interaction.overlayinteraction.enabled
introduced, it was also used for search facets. The generic search
facets are now deduced from generic solr fields which makes jena as tool
for facet semantics superfluous.
*) image links will not be marked (if they have class "yacylogo" or
"forceNoExternalIcon")
*) external links in menu on left (and "fork me"-banner) will open in
new tab/window now
- an existing ranking servlet for solr was extended. It is now possible
to set boost values for fields, boost functions and boost queries.
- The ranking can have different instances, but currently only the first
one is used
- added an abstraction layer for fields which can be used for search and
those fields can be edited in the solr ranking configruation
- the ranking value from solr within the field score is used to combine
remote search requests, which all are created using the same locally
defined boost values
- reduced the number of fields which are used for search (makes it
faster)
- replaced some text fields by string fields (makes indexing faster)
- removed classes which had no use
- made a large number of experiments for a better ranking and created a
temporary setting which prefers hits inside titles
- adjusted also the RWI-based ranking computation to 'prefer title'
- made special cases like for portal search where no post-processing and
post-ranking is wanted: this keeps the original ranking order as done by
Solr
- fixed many bugs with old settings for ranking
index (new solr core webgraph) .. this is now off by default
- completely redesigned this servlet
- added description how to attach a remote solr
- adjusted naming of servlet and menues
- moved 'lazy initialization' attribut from IndexSchema to
IndexFederated (this is a general option) back again.
The default schema uses only some of them and the resting search index
has now the following properties:
- webgraph size will have about 40 times as much entries as default
index
- the complete index size will increase and may be about the double size
of current amount
As testing showed, not much indexing performance is lost. The default
index will be smaller (moved fields out of it); thus searching
can be faster.
The new index will cause that some old parts in YaCy can be removed,
i.e. specialized webgraph data and the noload crawler. The new index
will make it possible to:
- search within link texts of linked but not indexed documents (about 20
times of document index in size!!)
- get a very detailed link graph
- enhance ranking using a complete link graph
To get the full access to the new index, the API to solr has now two
access points: one with attribute core=collection1 for the default
search index and core=webgraph to the new webgraph search index. This is
also avaiable for p2p operation but client access is not yet
implemented.
multiple solr cores instead of just one. Therefore it is now necessary
to distuingish between solr server connections (called an 'Instance')
and a connection to a single solr core. One Instance may now have
multiple connector classes assigned to it, each connecting to a single
core.
To support multiple cores it is also necessary to distinguish between
the connection configuration and the configuration of the index schema.
We will have multiple schema configurations in the future, each for
every solr core. This caused that the IndexFederated servlet had to be
split into two parts, the new Servlet for the Schema editor is now in
the IndexSchema Servlet.
- added additional config page (ConfigSearchPage_p) for easy setup of search page layout (to not overload ConfigPortal page)
- currently redundant setting with part of ConfigPortal page
- added missing config for filetype and protocol navigator
- adjusted init of SearchEvent to check navigation config setting
- renamed RankigProcess.getTopicNavigator to getTopics (to distiguish between added SearchEvent.getTopicNavigator)