Yacy Search Server
Michael Peter Christen bb7d836985
code freeze for release
5 years ago
.github added patreon 5 years ago
.settings migrated Solr 5.5 -> Solr 6.6 and from Java 1.7 -> 1.8 8 years ago
addon Added a stop command using the existing stop script to the snap package 6 years ago
bin Updated the down.sh script, fixing the same kind of issue as in PR #260 6 years ago
debian Updated Debian optional dependencies with the ones used for snapshots 7 years ago
defaults more space for sponsoring 5 years ago
dictionaries changed way to integrate dictionary files: 16 years ago
docker Fixed Alpine flavour Docker image build on already existing /opt folder 6 years ago
examples move example code SearchClient out of yacycore package 9 years ago
htroot switched url and snippet position 5 years ago
langdetect Use language-detection library for increased accuracy 10 years ago
lib Upgraded Lucene/Solr dependencies from 6.6.5 to 6.6.6 6 years ago
libbuild Use standard Java annotation syntax instead of custom Javadoc tag 7 years ago
libt update to JUnit 4.11 11 years ago
locales removed comment line 6 years ago
skins switched url and snippet position 5 years ago
snap Added a stop command using the existing stop script to the snap package 6 years ago
source removed some warnings 5 years ago
test fixed many links to old forum, now https://searchlab.eu 6 years ago
vocabularies moved yacy.logging to defaults according to request in 13 years ago
.checkstyle problems with code style 16 years ago
.classpath Upgraded Lucene/Solr dependencies from 6.6.5 to 6.6.6 6 years ago
.env Removed unnecessary variables for local heroku run 9 years ago
.gitignore Ignore generated Javadoc with git SCM. 8 years ago
.project Configuration projet eclipse : ajout nature et validation javascript 9 years ago
.travis.yml Fixed Travis configuration for Debian package building task 7 years ago
AUTHORS added flori and me 17 years ago
CONTRIBUTING.md Add contributor guidelines; closes #214 7 years ago
COPYRIGHT updated copyright message; included LGPL for 'cora' and a warranty 12 years ago
Heroku.md Simplified Heroku variables configuration 9 years ago
NOTICE * updated jxpath to latest v1.3 16 years ago
Procfile Simplified Heroku variables configuration 9 years ago
README.md fixed many links to old forum, now https://searchlab.eu 6 years ago
app.json Updated YaCy home page embedded links from http to https scheme 7 years ago
assembly.xml Fixed maven assembly base directory to match last main YaCy binaries. 8 years ago
build.nsi Updated the JRE URL from 8u181 to 8u191 for the MS Windows installer 6 years ago
build.properties code freeze for release 5 years ago
build.xml Upgraded Lucene/Solr dependencies from 6.6.5 to 6.6.6 6 years ago
getWin32MaxHeap.bat update for memory observer algorithm 15 years ago
gpl.txt initial load with yacy 0.36 20 years ago
installYaCyWindowsService.bat upd classpath in batches (remove not necessary htroot) 9 years ago
killYACY.sh Fix for http://mantis.tokeek.de/view.php?id=432 11 years ago
lgpl21.txt migrated all my LGPL 3 -licensed files to the LGPL 2.1 because LGPL 3 is not compatible to the GPL 2 15 years ago
pom.xml Upgraded Lucene/Solr dependencies from 6.6.5 to 6.6.6 6 years ago
reconfigureYACY.sh Fix for http://mantis.tokeek.de/view.php?id=432 11 years ago
startYACY.bat Made SNI extension user configurable without the need for server restart 6 years ago
startYACY.sh fixed many links to old forum, now https://searchlab.eu 6 years ago
startYACY_debug.bat fixed many links to old forum, now https://searchlab.eu 6 years ago
stopYACY.bat upd classpath in batches (remove not necessary htroot) 9 years ago
stopYACY.sh Relevant message when using the stop script while YaCy is not running 6 years ago
uninstallYaCyWindowsService.bat added Windows Service installer 11 years ago
updateYACY.sh Fix for http://mantis.tokeek.de/view.php?id=432 11 years ago
yacy-packages.readme added documentation for new yacy package structure 15 years ago
yacy.yellow performance setting for remote indexing configuration and latest changes for 0.39 20 years ago

README.md

YaCy


Deploy

What is this?

YaCy is search engine software. It takes a new approach to search because it does not use a central server. Instead, its search results come from a network of independent peers. In such a distributed network, no single entity decides what gets listed, or in which order results appear.

The YaCy search engine runs on each user's own computer. Search terms are hashed before they leave the user's computer. Unlike conventional search engines, YaCy is designed to protect users' privacy. With YaCy, a user's computer can build its own search indexes and rankings, so that results better match what the user is looking for over time. YaCy also makes it easy to create a customized search portal with a few clicks.

Each YaCy user is either part of a large search network (YaCy contains a peer-to-peer network protocol to exchange search indexes with other YaCy search engine installations) or the user runs YaCy to produce a personal search portal that can be either public or private.

YaCy search portals can also be placed in an intranet environment, which makes YaCy a replacement for commercial enterprise search solutions. A network scanner makes it easy to discover all available http, ftp and smb servers.

To create a web index, YaCy has a web crawler for everybody, without censorship and central data retention:

  • search the web (automatically using all other YaCy peers)
  • co-operative crawling; support for other crawlers
  • intranet indexing and search
  • set up your own search portal
  • all users have equal rights
  • comprehensive concept to anonymise the users' index

To perform a search using the YaCy network, every user has to set up their own node. More users lead to higher index capacity and better distributed indexing performance.

License

YaCy is published under the GPL v2. The source code is inside the release package (see /source and /htroot).

Where is the documentation?

Documentation can be found at:

Each of these locations has a (YaCy) search function that combines all of them into one search result.

Dependencies? What other software do I need?

You need Java 1.8 or later to run YaCy, nothing else. (Java 1.7 can still be used to run the main 1.92/9000 release.) Please download it from https://www.java.com

YaCy also runs on IcedTea 3. See https://icedtea.classpath.org

NO OTHER SOFTWARE IS REQUIRED! (you don't need apache, tomcat or mysql or whatever)

How do I start this software?

Startup and Shutdown of YaCy:

  • on GNU/Linux and OpenBSD:

    • to start: execute ./startYACY.sh
    • to stop : execute ./stopYACY.sh
  • on Windows:

    • to start: double-click startYACY.bat
    • to stop : double-click stopYACY.bat
  • on Mac OS X: please use the Mac Application and start or stop it like any other Mac Application (double-click to start)

How do I use this software, where is the administration interface?

YaCy is built on a web server. After you have started YaCy, start your browser and open

http://localhost:8090

There you can see your personal search and administration interface.
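YaCy may need a moment after start-up before the interface answers. The following is a minimal sketch of how a script could wait for it; the function names `yacy_base_url` and `wait_for_yacy` are hypothetical helpers, not part of YaCy, and the sketch assumes `curl` is installed.

```shell
# Hypothetical helper: build the base URL of the interface.
# Host and port default to the values used in this README.
yacy_base_url() {
  printf 'http://%s:%s\n' "${1:-localhost}" "${2:-8090}"
}

# Hypothetical helper: poll the interface until it answers.
# Arguments: number of tries (default 30), delay in seconds (default 2).
# Returns 0 as soon as a request succeeds, 1 if all tries fail.
wait_for_yacy() {
  tries="${1:-30}"
  delay="${2:-2}"
  url="$(yacy_base_url)"
  i=0
  while [ "$i" -lt "$tries" ]; do
    # -s: silent, -f: treat HTTP errors as failure, -o: discard the body
    if curl -sf -o /dev/null "$url/"; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}
```

For example, `./startYACY.sh && wait_for_yacy && echo "YaCy is up"` would print the message once the interface responds.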

What if I install YaCy (headless) on a server?

You can do that, but YaCy authorizes users automatically if they access the server from localhost. After about 10 minutes a random password is generated, and from then on it is not possible to log in from a remote location. If you install YaCy on a server that is not your workstation, you must therefore set an administration account immediately after the first start-up. Open:

http://:8090/ConfigAccounts_p.html

and set an administration account.
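One way to reach the account page on a headless server is an SSH tunnel, so that the browser on your workstation talks to the server's local port. This is only a sketch: the user name and host name below are hypothetical placeholders, and it assumes that YaCy treats tunneled requests as localhost access.

```shell
# Hypothetical values: replace with your own login and server name.
REMOTE_USER="admin"
REMOTE_HOST="server.example.org"
YACY_PORT=8090

# -N: do not run a remote command
# -L: forward the local port to the same port on the server's localhost
TUNNEL_CMD="ssh -N -L ${YACY_PORT}:localhost:${YACY_PORT} ${REMOTE_USER}@${REMOTE_HOST}"

# Print the command for review; run it yourself once the values are correct.
echo "$TUNNEL_CMD"
```

While the tunnel is open, http://localhost:8090/ConfigAccounts_p.html in your local browser is served by the remote YaCy instance.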

Can I run YaCy in a virtual machine or a container?

YaCy runs fine in virtual machines managed by software such as VirtualBox or VMware.

Container technology may be more flexible and lightweight and also works fine with YaCy.

These technologies can be deployed locally, on remote machines you own, or in the 'cloud'. Decide which option best fits your privacy requirements.

Docker

Easily deploy YaCy on a Docker cloud provider of your choice (this can be a machine you own) with the deploy button at the top of this page.

More details for YaCy with Docker in docker/Readme.md.

Heroku

Easily deploy on the Heroku PaaS (Platform as a Service) provider using the deploy button at the top.

More details for YaCy on Heroku in Heroku.md.

Port 8090 is bad, people are not allowed to access that port

You can forward port 80 to 8090 with iptables:

iptables -t nat -A PREROUTING -p tcp --dport 80 -j REDIRECT --to-port 8090

On some operating systems, you must first enable access to the ports you are using, for example:

iptables -I INPUT -m tcp -p tcp --dport 8090 -j ACCEPT
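The two rules can be generated together for other port numbers. The sketch below only prints the commands so they can be reviewed before anything is changed; the function name `print_yacy_iptables_rules` is a hypothetical helper. Because the redirect happens in the nat PREROUTING chain, the packet already carries the YaCy port when it reaches the INPUT chain, which is why the ACCEPT rule names 8090 rather than 80.

```shell
# Hypothetical helper: print (do not execute) the two rules from this README.
# Arguments: public port (default 80), YaCy port (default 8090).
print_yacy_iptables_rules() {
  public_port="${1:-80}"
  yacy_port="${2:-8090}"
  # Allow incoming TCP traffic on the YaCy port.
  echo "iptables -I INPUT -m tcp -p tcp --dport ${yacy_port} -j ACCEPT"
  # Redirect traffic arriving on the public port to the YaCy port.
  echo "iptables -t nat -A PREROUTING -p tcp --dport ${public_port} -j REDIRECT --to-port ${yacy_port}"
}
```

After checking the output, the rules could be applied with `print_yacy_iptables_rules | sudo sh`. Rules added this way are not kept across a reboot unless you save them with your distribution's tools.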

How can I scale this; how much ram is needed; disk space?

YaCy can scale up to many millions of web pages in your own search index. By default, 600MB of RAM is assigned to the Java process, but the process does not use all of it permanently; the garbage collector frees memory from time to time. If you have a small index (e.g. about 100000 pages), you may assign less memory (e.g. 200MB), but once your index grows beyond 1 million web pages you should start to increase the memory assignment. Open http://localhost:8090/Performance_p.html to set a higher or lower memory assignment. If you have millions of web pages in your search index, it may occupy gigabytes of disk space. You can reduce disk usage, for example by setting the htcache space to a different size; to do that, open http://localhost:8090/ConfigHTCache_p.html and set a new size.
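The memory rule of thumb can be written down as a small lookup. This is only a sketch of the guidance in the paragraph above: the function name `suggest_yacy_memory` is hypothetical, and treating 100000 and 1 million pages as hard boundaries is an assumption made for illustration.

```shell
# Hypothetical helper: print a memory hint for a given number of indexed pages.
# The boundaries follow the rough figures in this README and are not exact limits.
suggest_yacy_memory() {
  pages="$1"
  if [ "$pages" -le 100000 ]; then
    echo "200MB may be enough"
  elif [ "$pages" -le 1000000 ]; then
    echo "600MB (default)"
  else
    echo "more than 600MB"
  fi
}
```

For example, `suggest_yacy_memory 5000000` prints `more than 600MB`; the actual value is then set on the Performance_p.html page.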

Join the development!

YaCy was created with the love of a community. A large number of programmers have helped; please join us!

Here is a rough guide to start developing YaCy in Eclipse:

  • Clone https://github.com/yacy/yacy_search_server.git
  • File -> Import as Git -> Projects from Git -> Existing local repository
  • -> add -> your git clone of yacy_search_server
  • "Import existing Eclipse projects" -> finish
  • Run -> External Tools -> External Tools Configuration -> double-click Ant Build
  • -> Name: "YaCy Build" -> Buildfile: Browse Workspace -> build.xml -> Run
  • In Package Explorer, right-click on yacy -> Run as -> Java Application -> Select "yacy - net.yacy" -> Ok

To join our development community, go to https://searchlab.eu

If you have implemented something amazing, we welcome your pull request at https://github.com/yacy/yacy_search_server

How to get the source code and how to compile YaCy yourself?

The source code is inside every YaCy release. You can also get YaCy from https://github.com/yacy/yacy_search_server by cloning the repository:

git clone https://github.com/yacy/yacy_search_server

Please clone our code and help with development! The code is licensed under the GPL v2.

Compiling YaCy:

  • you need Java 1.8 or later and Apache Ant
  • just compile: "ant clean all" - then you can "./startYACY.sh" or "./startYACY.bat"
  • create a release tarball: "ant dist"
  • create a Mac OS release: "ant distMacApp" (works only on a Mac)
  • create a debian release: "ant deb"
  • work with eclipse: within eclipse you also need to start the ant build process because the servlet pages are not compiled by the eclipse build process

After the dist procedure, the release can be found in the RELEASE subdirectory.
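The clone and compile steps can be combined in one script. The following is a minimal sketch, assuming `git`, Java and Apache Ant are installed; the helper names `run` and `build_yacy` and the `DRY_RUN` switch are hypothetical and only serve to preview the commands.

```shell
# Hypothetical helper: execute a command, or only print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Clone the repository and compile it with Ant, stopping at the first failure.
build_yacy() {
  run git clone https://github.com/yacy/yacy_search_server &&
  run cd yacy_search_server &&
  run ant clean all
}
```

Calling `DRY_RUN=1 build_yacy` lists the three commands without running them; a plain `build_yacy` executes them and leaves you in the source directory, where `./startYACY.sh` can be started.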

Build with Maven:

  • the first time, go to the subdirectory libbuild (which contains the maven parent pom)
  • compile with "mvn clean install -DskipTests"; this will create all needed modules
  • after that, you can use just the pom in the main directory to build YaCy with maven

Are there any APIs or how can I attach software to YaCy?

There are many interfaces built into YaCy, and they are all based on http/xml and http/json. You can discover these interfaces by looking for the orange "API" icon in the upper right of some web pages in the YaCy web interface. Just click on it and you will see the xml/json version of the information you have just seen on the web page. A different approach is to use the shell scripts provided in the /bin subdirectory. These shell scripts also call the YaCy web interface. By cloning some of those scripts you can easily create more shell api access methods.
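As an illustration of the http/json style, here is a minimal sketch of a search call from the shell. The endpoint name `yacysearch.json` and its `query` parameter are assumptions not stated in this README, so verify them through the "API" icon on your search page; the function names are hypothetical, and `curl` must be installed.

```shell
# Hypothetical helper: build the search endpoint URL.
# Assumed endpoint name: yacysearch.json (check it via the "API" icon).
yacy_search_endpoint() {
  printf 'http://%s:%s/yacysearch.json\n' "${1:-localhost}" "${2:-8090}"
}

# Hypothetical helper: send a query and print the raw JSON response.
# -G makes curl append the url-encoded pair to the URL as a query string.
yacy_search() {
  curl -sG "$(yacy_search_endpoint)" --data-urlencode "query=$1"
}
```

With a running local peer, `yacy_search "free software"` would print the JSON response for that query.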

Contact

Our primary point of contact is the international YaCy forum at https://searchlab.eu. We encourage you to start a discussion there in your own language.

If you have any questions, please do not hesitate to contact the maintainer: send an email to Michael Christen (mc@yacy.net) with a meaningful subject including the word 'yacy' so that your email does not get stuck in my anti-spam filter.

If you would like a customized version for special needs, feel free to ask the author for a business proposal to customize YaCy according to your needs. We also provide integration solutions if the software is to be integrated into your enterprise application.

Germany, Frankfurt a.M., 26.11.2011 Michael Peter Christen