Apache Mahout 0.6 released

2012-02-08 21:33 More posts about Mahout Apache mongodb Cassandra
As of Monday, February 6th a new Apache Mahout version was released. The new package features

Lots of performance improvments:


  • A new LDA implementation using Collapsed Variational Bayes 0th Derivative Approximation - try that out if you have been bothered by the way less than optimal performance of the old version.
  • Improved Decision Tree performance and added support for regression problems
  • Reduced runtime of dot product between vectors - many algorithms in Mahout rely on that, so these performance improvements will affect anyone using them.
  • Reduced runtime of LanczosSolver tests - make modifications to Mahout more easily and have faster development cycles by faster testing.
  • Increased efficiency of parallel ALS matrix factorization
  • Performance improvements in RowSimilarityJob, TransposeJob - helpful for anyone trying to find similar items or running the Hadoop based recommender


New features:

  • K-Trusses, Top-Down and Bottom-Up clustering, Random Walk with Restarts implementation
  • SSVD enhancements


Better integration:

  • Added MongoDB and Cassandra DataModel support
  • Added numerous clustering display examples


Many bug fixes, refactorings, and other small improvements. More information is available in the Release Notes.

Overall great improvements towards better performance, better stability and integration. However there are still quite some outstanding issues and issues in need for review. Come join the project, help us improve existing patches, improve performance and in particular integration and streamlining of how to use the different parts of the project.