Apache Mahout 0.6 released
On Monday, February 6th, a new version of Apache Mahout was released. The new package features:
Lots of performance improvements:
- A new LDA implementation using Collapsed Variational Bayes 0th Derivative Approximation - try it out if you have been bothered by the far-from-optimal performance of the old version.
- Improved Decision Tree performance and added support for regression problems
- Reduced runtime of the dot product between vectors - many algorithms in Mahout rely on it, so this improvement benefits anyone using them.
- Reduced runtime of LanczosSolver tests - faster tests mean shorter development cycles and make it easier to modify Mahout.
- Increased efficiency of parallel ALS matrix factorization
- Performance improvements in RowSimilarityJob and TransposeJob - helpful for anyone trying to find similar items or running the Hadoop-based recommender
New features:
- K-Trusses, Top-Down and Bottom-Up clustering, Random Walk with Restarts implementation
- SSVD enhancements
Better integration:
- Added MongoDB and Cassandra DataModel support
- Added numerous clustering display examples
Many bug fixes, refactorings, and other small improvements. More information is available in the Release Notes.
Overall, this release brings great improvements in performance, stability, and integration. However, there are still quite a few outstanding issues, as well as issues in need of review. Come join the project and help us improve existing patches, performance, and in particular the integration and streamlining of how the different parts of the project are used.