Hounder downloads

Current Version: 2.0.1

Download: binaries - sources

Changes in Hounder 2.0.1

  • Fixed installation issues.

Changes in Hounder 2.0

  • Now uses Lucene 2.4.1. Indexes generated by this version are not backwards compatible, which is the main reason we decided to change the major version number.
  • Improved memory handling while indexing.
  • Several modifications to the traffic limiting policies in the searcher. On overload, the quality of the results degrades gracefully to keep up with the traffic.
  • Added protection for cases where the pagedb in the next cycle would be empty. This can be turned off by setting the parameter "protect.against.empty.pagedb" to "off" in the crawler.properties file.
  • Added a LoggingSearcher class that writes the queries to a rolling log file called "queries.log", to be used by other classes that need query stats.
  • Fixed the crawler stop process; there is no longer any need to kill the crawler.
  • A stopped crawl can now be resumed at the point it was stopped. To restart a crawl cycle from the beginning, remove the progress*.* files and start the crawler.
  • Added progress report to the sort and trim stages of the crawler.
  • Added fetchlist retry capability in case a whole fetchlist fails (a symptom of network problems). It can be turned on by setting the "retry.fetch.on.disconnect" crawler property to "true".
  • Eliminated unnecessary work done by the pagedb trimmer, making it much faster.
  • Fixed a bug in the freshness factor in the IndexerModule that could cause negative factor values.
  • Added a scalar payload scorer.
  • Added a ShortDatePayloadScorer that takes a payload in the 'yyyymmdd' format and boosts newer documents.
  • Added the payload parameter for opensearch queries.
  • Added range queries to the searcher.
  • Rewrote the opensearch output as an XSLT transformation.
  • The xmlSearchHandler can now apply a different XSLT for different URI paths.
  • Changed the OpenSearchSearcher to allow it to use XSL transformations.
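
The two crawler settings named in the list above could look roughly like this in the crawler.properties file. This is a minimal sketch: the property names and values are the ones quoted in the changelog, while the key=value layout is assumed from the standard Java properties format, and any other settings already in the file are left out.

```properties
# Sketch only; assumes the standard Java .properties key=value syntax.

# Turn off the protection against an empty pagedb in the next cycle
# (value "off" as stated in the changelog).
protect.against.empty.pagedb=off

# Retry a fetchlist when the whole fetchlist fails, which is usually
# a symptom of network problems (value "true" as stated in the changelog).
retry.fetch.on.disconnect=true
```

Leaving either line out keeps whatever default the crawler ships with; the changelog does not state those defaults beyond the protection being something that "can be turned off" and the retry being something that "can be turned on".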

Changes in Hounder 1.1.0

  • Switched the fetcher from Nutch 0.7.2 to Nutch 0.9
  • The Nutch9 http plugin can now work distributed
  • The crawler IndexerModule can now talk to multiple indexers
  • The crawler now makes progress reports
  • There is now spam detection support
  • The MultiSearcher returns stats about each searcher
  • There is a new indexer for batch re-indexing on multi-core hardware