*) adding content parsers for
- pdf (using the pdf-box library)
- doc (using the textmining.org library)
*) adding an interface for content parsers
*) adding a configuration file that maps each mimeType to the parser used for it
*) Semaphore class was moved and renamed to serverSemaphore
*) Changing yacy shutdown behaviour
The busy-waiting loop for shutdown was removed and replaced with a blocking call (using the semaphore class mentioned above) to the new switchboard.waitForShutdown method.
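
A minimal sketch of the blocking shutdown idea, not the actual YaCy code. It assumes java.util.concurrent.Semaphore as a stand-in for serverSemaphore; the class ShutdownSwitch and the method requestShutdown are hypothetical names, only waitForShutdown is taken from the note above.

```java
import java.util.concurrent.Semaphore;

// Hypothetical stand-in for the switchboard; only the shutdown handshake is shown.
final class ShutdownSwitch {
    // Zero initial permits: waitForShutdown() blocks until a permit is released.
    private final Semaphore shutdownSignal = new Semaphore(0);

    // Blocks the caller without polling until shutdown has been requested.
    void waitForShutdown() throws InterruptedException {
        shutdownSignal.acquire();
    }

    // Hypothetical name: called by whichever thread requests the shutdown.
    void requestShutdown() {
        shutdownSignal.release();
    }
}

final class ShutdownDemo {
    public static void main(String[] args) throws InterruptedException {
        ShutdownSwitch sb = new ShutdownSwitch();
        Thread requester = new Thread(() -> {
            try {
                Thread.sleep(100); // simulate work before the shutdown request
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            sb.requestShutdown();
        });
        requester.start();
        sb.waitForShutdown(); // sleeps here instead of spinning in a loop
        requester.join();
        System.out.println("shutdown complete");
    }
}
```

The main thread consumes no CPU while it waits, which is the point of replacing the polling loop.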
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@46 6c8d7289-2bf4-0310-a012-ef5d649a1542
- introduction of a threadpool for crawling
- introduction of a job queue to avoid busy waiting for a free crawler slot
*) New classes added
- queue for receiving crawler jobs
- semaphore class to do reader/writer synchronization (mutual exclusion)
- message object to hold all needed data about a crawler job
*) Trying to solve session-thread shutdown problem
- the session thread's stopped variable is now set from outside before the
session thread is interrupted.
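
A rough sketch of how a job queue can replace busy waiting, under stated assumptions: it uses plain Java monitors (wait/notify) rather than the semaphore class from this commit, and the names CrawlJob, CrawlJobQueue, enqueue and dequeue are hypothetical.

```java
import java.util.LinkedList;

// Hypothetical message object; the single field is an assumption for illustration.
final class CrawlJob {
    final String url;
    CrawlJob(String url) { this.url = url; }
}

// Hypothetical queue: workers block in dequeue() instead of polling for work.
final class CrawlJobQueue {
    private final LinkedList<CrawlJob> jobs = new LinkedList<>();

    synchronized void enqueue(CrawlJob job) {
        jobs.addLast(job);
        notify(); // wake one waiting worker, if any
    }

    synchronized CrawlJob dequeue() throws InterruptedException {
        while (jobs.isEmpty()) {
            wait(); // releases the monitor while the worker sleeps
        }
        return jobs.removeFirst();
    }

    synchronized int size() {
        return jobs.size();
    }
}

final class CrawlQueueDemo {
    public static void main(String[] args) throws InterruptedException {
        CrawlJobQueue queue = new CrawlJobQueue();
        Thread worker = new Thread(() -> {
            try {
                CrawlJob job = queue.dequeue(); // blocks until a job arrives
                System.out.println("crawling " + job.url);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        worker.start();
        queue.enqueue(new CrawlJob("http://example.org/"));
        worker.join();
    }
}
```

The while loop around wait() guards against spurious wakeups and against another worker taking the job first.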
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@39 6c8d7289-2bf4-0310-a012-ef5d649a1542
can be used instead of a ByteArrayOutputStream
*) Using a serverByteBuffer for lineBuffering in class httpc
instead of a ByteArrayOutputStream
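
To illustrate the line-buffering idea, here is a simplified sketch. LineBuffer is a hypothetical stand-in and does not reproduce the real serverByteBuffer API; all method names are assumptions. The buffer is reset and refilled for each line, so the backing array is allocated once and reused.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical, simplified stand-in for a reusable byte buffer.
final class LineBuffer {
    private byte[] buf = new byte[80];
    private int len = 0;

    void append(int b) {
        if (len == buf.length) {
            buf = Arrays.copyOf(buf, len * 2); // grow only when a line is too long
        }
        buf[len++] = (byte) b;
    }

    // Forgets the content but keeps the backing array for the next line.
    void reset() { len = 0; }

    String asString() {
        return new String(buf, 0, len, StandardCharsets.ISO_8859_1);
    }

    // Reads one LF-terminated line into this buffer and drops a trailing CR.
    // Returns false if the stream was already at its end.
    boolean readLine(InputStream in) throws IOException {
        reset();
        boolean readAny = false;
        int b;
        while ((b = in.read()) != -1) {
            readAny = true;
            if (b == '\n') break;
            append(b);
        }
        if (len > 0 && buf[len - 1] == '\r') len--;
        return readAny;
    }
}

final class LineBufferDemo {
    public static void main(String[] args) throws IOException {
        byte[] request = "GET / HTTP/1.0\r\nHost: example.org\r\n"
                .getBytes(StandardCharsets.ISO_8859_1);
        InputStream in = new ByteArrayInputStream(request);
        LineBuffer line = new LineBuffer(); // one buffer for all lines
        while (line.readLine(in)) {
            System.out.println(line.asString());
        }
    }
}
```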
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@35 6c8d7289-2bf4-0310-a012-ef5d649a1542
- many classes set to final
- implementation of a session-thread pool
- reuse of the server handler class (normally the httpd object)
within the session thread
- implementation of a httpc object pool
- introduction of a linebuffer in httpd which can be reused
- reusing the properties table in the httpc
- added two apache libs (commons-collections, commons-pool) which
are needed for the object/thread pool implementation
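
The pooling idea behind these changes can be sketched as follows. This is a hand-rolled illustration that uses only the Java standard library; it is not the commons-pool API and not the actual YaCy implementation. SimplePool, borrow and giveBack are hypothetical names.

```java
import java.util.ArrayDeque;
import java.util.function.Supplier;

// Hypothetical minimal object pool: hands out idle instances before creating new ones.
final class SimplePool<T> {
    private final ArrayDeque<T> idle = new ArrayDeque<>();
    private final Supplier<T> factory;
    private final int maxIdle;

    SimplePool(Supplier<T> factory, int maxIdle) {
        this.factory = factory;
        this.maxIdle = maxIdle;
    }

    // Returns an idle instance if one exists, otherwise creates a new one.
    synchronized T borrow() {
        T obj = idle.pollFirst();
        return obj != null ? obj : factory.get();
    }

    // Keeps the instance for reuse unless the idle limit is already reached.
    synchronized void giveBack(T obj) {
        if (idle.size() < maxIdle) {
            idle.addFirst(obj);
        }
    }

    synchronized int idleCount() {
        return idle.size();
    }
}

final class SimplePoolDemo {
    public static void main(String[] args) {
        SimplePool<StringBuilder> pool = new SimplePool<>(StringBuilder::new, 2);
        StringBuilder first = pool.borrow();
        first.append("request 1");
        first.setLength(0);          // the caller clears state before returning it
        pool.giveBack(first);
        StringBuilder second = pool.borrow();
        System.out.println("same instance reused: " + (first == second));
    }
}
```

A real pool also has to validate or reset objects on return and bound the number of active instances; the sketch leaves both out.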
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@26 6c8d7289-2bf4-0310-a012-ef5d649a1542