Hounder downloads
Current Version: 2.0.1
Download: binaries - sources
Changes in Hounder 2.0.1
- Fixed installation issues.
Changes in Hounder 2.0
- Now uses Lucene 2.4.1. Indexes generated by this version are not backwards compatible, which is the main reason we decided to change the major version number.
- Improved memory handling while indexing.
- Several modifications to the traffic limiting policies in the searcher. On overload, the quality of the results degrades gracefully to keep up with the traffic.
- Added protection for cases where the pagedb in the next cycle would be empty. This can be turned off by setting the parameter "protect.against.empty.pagedb" to "off" in the crawler.properties file.
- Added a LoggingSearcher class that writes the queries to a rolling log file called "queries.log", for use by other classes that need query stats.
- Fixed the crawler stop process; there is no longer any need to kill the crawler.
- A stopped crawl can now be resumed at the point it was stopped. To restart a crawl cycle from the beginning, remove the progress*.* files and start the crawler.
- Added progress report to the sort and trim stages of the crawler.
- Added fetchlist retry capability in case a whole fetchlist fails (symptom of network problems). It can be turned on by setting the "retry.fetch.on.disconnect" crawler property to true.
- Eliminated unnecessary work done by the pagedb trimmer, making it much faster.
- Fixed a bug in the freshness factor in the IndexerModule that could cause negative factor values.
- Added a scalar payload scorer.
- Added a ShortDatePayloadScorer that takes a payload in the 'yyyymmdd' format and boosts newer documents.
- Added the payload parameter for opensearch queries.
- Added range queries to the searcher.
- Rewrote the opensearch output as an XSLT transformation.
- The xmlSearchHandler can now apply a different XSLT stylesheet for different URI paths.
- Changed the OpenSearchSearcher to allow it to use XSL transformations.
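
The two crawler settings named in the list above ("protect.against.empty.pagedb" and "retry.fetch.on.disconnect") can be illustrated with a small sketch. It is not taken from the Hounder sources: the class and method names are hypothetical, and the defaults (protection on, retry off) are assumptions inferred from the wording "can be turned off" and "can be turned on". It only shows how such entries would be read using the standard java.util.Properties API.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical helper, not part of Hounder: reads the two crawler
// properties mentioned in the 2.0 change list.
public class CrawlerPropertiesSketch {

    // Parses text in .properties format. In a real setup the text would
    // come from the crawler.properties file.
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    // Assumption: the protection is enabled unless the value is "off".
    static boolean emptyPagedbProtectionEnabled(Properties props) {
        String value = props.getProperty("protect.against.empty.pagedb", "on");
        return !"off".equalsIgnoreCase(value.trim());
    }

    // Assumption: fetchlist retry is disabled unless the value is "true".
    static boolean retryFetchOnDisconnect(Properties props) {
        String value = props.getProperty("retry.fetch.on.disconnect", "false");
        return Boolean.parseBoolean(value.trim());
    }

    public static void main(String[] args) {
        // Example crawler.properties content using the values from the change list.
        String text = "protect.against.empty.pagedb=off\n"
                    + "retry.fetch.on.disconnect=true\n";
        Properties props = parse(text);
        System.out.println("empty pagedb protection: " + emptyPagedbProtectionEnabled(props));
        System.out.println("retry fetch on disconnect: " + retryFetchOnDisconnect(props));
    }
}
```

In an actual installation the same two lines would simply be placed in the crawler.properties file; how Hounder itself parses them may differ from this sketch.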
Changes in Hounder 1.1.0
- Switched the fetcher from Nutch 0.7.2 to Nutch 0.9
- The Nutch9 http plugin can now work in distributed mode
- The crawler IndexerModule can now talk to multiple indexers
- The crawler now makes progress reports
- There is now spam detection support
- The MultiSearcher returns stats about each searcher
- There is a new indexer for batch re-indexing on multi-core hardware