Apache Mahout

Open source projects are on the rise, and Apache Mahout is no exception. Started in 2008 as a subproject of Apache Lucene, Mahout proved so broad in scope and application that its developers spun it off as a project of its own within Apache's machine learning arsenal.

So what’s Apache Mahout?

Mahout is fundamentally a Java machine learning library that, at its core, uses Hadoop's distributed computation to provide much-needed scalability. It is used extensively to build recommendation engines and clustering and classification systems.

Recommender Engines

Been on an e-commerce or dating website and been recommended something? There is a good chance that a recommendation engine such as Mahout's is running in the background. Mahout recommendation engines run on a simple principle: similar users have similar tastes, so an item preferred by one person is likely to be preferred by another, similar user. Internally, Mahout uses a number of sophisticated algorithms to build a model of similar users and takes their activity into account to generate recommendations.
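To make the "similar users, similar tastes" principle concrete, here is a minimal plain-Java sketch of user-based recommendation. It does not use Mahout's API, and the class name, the sample ratings and the choice of cosine similarity are all assumptions made for illustration. Each unseen item is scored by the ratings other users gave it, weighted by how similar those users are to the target user.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Illustrative sketch only: not Mahout's API, names and data are hypothetical.
class UserBasedRecommenderSketch {

    // Cosine similarity between two users' rating vectors; unrated items count as 0.
    static double similarity(Map<String, Double> a, Map<String, Double> b) {
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (Map.Entry<String, Double> e : a.entrySet()) {
            normA += e.getValue() * e.getValue();
            Double other = b.get(e.getKey());
            if (other != null) dot += e.getValue() * other;
        }
        for (double v : b.values()) normB += v * v;
        if (normA == 0.0 || normB == 0.0) return 0.0;
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Scores every item the user has not rated by summing other users'
    // ratings weighted by similarity, then returns the top-scoring items.
    static List<String> recommend(Map<String, Map<String, Double>> ratings,
                                  String user, int howMany) {
        Map<String, Double> mine = ratings.get(user);
        Map<String, Double> scores = new HashMap<>();
        for (Map.Entry<String, Map<String, Double>> other : ratings.entrySet()) {
            if (other.getKey().equals(user)) continue;
            double sim = similarity(mine, other.getValue());
            for (Map.Entry<String, Double> r : other.getValue().entrySet()) {
                if (mine.containsKey(r.getKey())) continue; // already known to the user
                scores.merge(r.getKey(), sim * r.getValue(), Double::sum);
            }
        }
        return scores.keySet().stream()
                .sorted((x, y) -> Double.compare(scores.get(y), scores.get(x)))
                .limit(howMany)
                .collect(Collectors.toList());
    }

    // Hypothetical ratings on a 1-5 scale.
    static Map<String, Map<String, Double>> sampleRatings() {
        return Map.of(
                "alice", Map.of("book", 5.0, "camera", 4.0),
                "bob", Map.of("book", 5.0, "camera", 4.0, "tripod", 5.0, "novel", 1.0),
                "carol", Map.of("book", 1.0, "blender", 4.0, "novel", 5.0));
    }

    public static void main(String[] args) {
        System.out.println(recommend(sampleRatings(), "alice", 2));
    }
}
```

With this sample data, bob's tastes are much closer to alice's than carol's are, so bob's favourite unseen item (the tripod) comes out on top for alice. A production engine would work on far larger data sets and distribute the computation, which is where Mahout and Hadoop come in.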

Clustering

Put simply, clustering is the grouping of closely associated data on the basis of some similarity measure. Mahout’s clustering algorithms look for similarity in unstructured data and build highly cohesive clusters based on the degree of similarity. A well-known example of clustering is Google News, which groups news articles by topic in order to present news grouped by logical story rather than as a raw listing of all articles. In a Mahout-based system of this kind, a click on an “Entertainment” tab could very well present the user with an entertainment-related cluster.
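As a rough illustration of grouping by similarity, the following plain-Java sketch runs k-means, one of the better-known clustering algorithms, on a handful of 2D points. It is not Mahout's API: the class name, the sample points and the simplistic choice of the first k points as starting centroids are assumptions made to keep the example short and deterministic.

```java
import java.util.Arrays;

// Illustrative sketch only: not Mahout's API, names and data are hypothetical.
class KMeansSketch {

    // Returns, for each point, the index of the cluster it ends up in.
    static int[] cluster(double[][] points, int k, int maxIterations) {
        int dims = points[0].length;
        // Assumption: seed the centroids with the first k points.
        double[][] centroids = new double[k][];
        for (int c = 0; c < k; c++) centroids[c] = points[c].clone();

        int[] assignment = new int[points.length];
        Arrays.fill(assignment, -1);

        for (int iter = 0; iter < maxIterations; iter++) {
            // Step 1: assign each point to its nearest centroid.
            boolean changed = false;
            for (int i = 0; i < points.length; i++) {
                int best = nearest(points[i], centroids);
                if (best != assignment[i]) {
                    assignment[i] = best;
                    changed = true;
                }
            }
            if (!changed) break; // converged

            // Step 2: move each centroid to the mean of its assigned points.
            double[][] sums = new double[k][dims];
            int[] counts = new int[k];
            for (int i = 0; i < points.length; i++) {
                counts[assignment[i]]++;
                for (int d = 0; d < dims; d++) sums[assignment[i]][d] += points[i][d];
            }
            for (int c = 0; c < k; c++) {
                if (counts[c] == 0) continue; // leave an empty cluster where it is
                for (int d = 0; d < dims; d++) centroids[c][d] = sums[c][d] / counts[c];
            }
        }
        return assignment;
    }

    // Index of the centroid with the smallest squared Euclidean distance to p.
    static int nearest(double[] p, double[][] centroids) {
        int best = 0;
        double bestDist = Double.MAX_VALUE;
        for (int c = 0; c < centroids.length; c++) {
            double dist = 0.0;
            for (int d = 0; d < p.length; d++) {
                double diff = p[d] - centroids[c][d];
                dist += diff * diff;
            }
            if (dist < bestDist) {
                bestDist = dist;
                best = c;
            }
        }
        return best;
    }

    // Hypothetical data: two visibly separate groups of points.
    static double[][] samplePoints() {
        return new double[][] {
                {1.0, 1.0}, {8.0, 8.0}, {1.5, 2.0}, {9.0, 9.0}, {1.0, 0.5}, {8.5, 9.5}};
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(cluster(samplePoints(), 2, 10)));
    }
}
```

Here the points near (1, 1) land in one cluster and the points near (9, 9) in the other. For news articles, the "points" would instead be numeric vectors derived from the text, and the similarity measure would be chosen to suit that representation.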

Classification

Want to predict something based on past experience? Then Mahout's classification techniques fit the requirement well. Classification is the process of using specific information (the input) to choose a single selection (the output) from a short list of predetermined potential responses. Mahout’s classification algorithms learn from previous examples and build a model that is then used to make intelligent predictions. Ever wondered how your mailbox detects spam without you having to go through every message? The mailbox provider builds a model based on prior emails and spam reports from users, as well as on characteristics of the email itself. Each new incoming email is then scrutinized by this model, which decides whether it is spam.
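The spam example can be sketched in a few lines of plain Java with a tiny Naive Bayes classifier: learn word frequencies from labelled messages, then pick the label under which a new message is most probable. This is not Mahout's API and not any mail provider's actual filter; the class name, the training messages and the use of add-one smoothing are assumptions for illustration.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Illustrative sketch only: not Mahout's API, names and data are hypothetical.
class NaiveBayesSpamSketch {
    private final Map<String, Map<String, Integer>> wordCounts = new HashMap<>();
    private final Map<String, Integer> totalWords = new HashMap<>();
    private final Map<String, Integer> docCounts = new HashMap<>();
    private final Set<String> vocabulary = new HashSet<>();
    private int totalDocs = 0;

    // Learn from one labelled message ("prior emails and spam reports").
    void train(String label, String text) {
        docCounts.merge(label, 1, Integer::sum);
        totalDocs++;
        Map<String, Integer> counts = wordCounts.computeIfAbsent(label, k -> new HashMap<>());
        for (String word : tokenize(text)) {
            if (word.isEmpty()) continue;
            counts.merge(word, 1, Integer::sum);
            totalWords.merge(label, 1, Integer::sum);
            vocabulary.add(word);
        }
    }

    // Choose the label with the highest log-probability for the message.
    String classify(String text) {
        String best = null;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (String label : docCounts.keySet()) {
            // Prior: how common this label was in the training data.
            double score = Math.log(docCounts.get(label) / (double) totalDocs);
            Map<String, Integer> counts = wordCounts.get(label);
            int total = totalWords.getOrDefault(label, 0);
            for (String word : tokenize(text)) {
                if (word.isEmpty()) continue;
                // Add-one smoothing so unseen words do not zero out the score.
                int count = counts.getOrDefault(word, 0);
                score += Math.log((count + 1.0) / (total + vocabulary.size()));
            }
            if (score > bestScore) {
                bestScore = score;
                best = label;
            }
        }
        return best;
    }

    static String[] tokenize(String text) {
        return text.toLowerCase().split("\\W+");
    }

    // Hypothetical training set of four short messages.
    static NaiveBayesSpamSketch trainedOnSamples() {
        NaiveBayesSpamSketch model = new NaiveBayesSpamSketch();
        model.train("spam", "win money now");
        model.train("spam", "cheap money offer now");
        model.train("ham", "meeting schedule for monday");
        model.train("ham", "lunch with the project team");
        return model;
    }

    public static void main(String[] args) {
        NaiveBayesSpamSketch model = trainedOnSamples();
        System.out.println(model.classify("win cheap money"));
        System.out.println(model.classify("project meeting monday"));
    }
}
```

With this training set, the first message is labelled spam and the second ham. A real filter would learn from vastly more mail and would also look at characteristics beyond the words themselves, which is the kind of scale a distributed library is meant to handle.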

Mahout is an excellent solution for the aforementioned scenarios. And with its ever-growing open source community, it is well supported for a scalable machine learning project. Good luck with Mahout.
