Content, Content Everywhere and not an Item to Consume

I Feel Bloated

You have spent a great deal of time building up the catalog of things that you want to make available for consumers to purchase or consume.  In fact, you have done such a great job that you now have (gasp) too much stuff, and folks are not getting the value of the deep and rich service you provide.  You might even offer your service or catalog through multiple, complementary platforms, which makes the choice of what to select even more complicated.

Let’s face it: users are confounded by the myriad choices they have today.  “I have 999 channels to surf”, “There are 350,000 apps to look at”, “What other kinds of music tracks or artists might I like?”  Users are frustrated because search isn’t the answer.


The Wilson Confounded Search Conjecture:  You don’t know what to search for if you don’t know what you can search for.  


Search doesn’t solve the challenge of Discovery.  In order for Discovery to be a part of the user experience, the stuff they might like necessarily needs to find them!  It is precisely this challenge that we in the recommender system (recsys) space are aiming to solve.

I Think I Need A Recsys

The first step to solving an issue is to recognize that one exists.  If you are reading this post then you may have come to the realization that your stuff just isn’t performing and your users aren’t engaging. Don’t worry; there is a way through.

Some recsys questions to consider:

  • What are your goals, and which KPIs will measure the success of a recsys approach?
  • How broad is the catalog of items you express in your service?
  • Is that catalog of items subject to frequent change or is it static (long tail) in nature?
  • Do you believe in social and what does social mean to you?
  • Do you want content-based recommendations (how content items relate to each other) or are you going for a more personal approach (recommendations based on users and their behavior)?
  • Do you have good event data to work with (click path histories, purchases, downloads, viewership)?
  • Is speed important?  How quickly do you need a request for a recommendation returned to you?
  • Is scale important?  Do you have “web scale” usage: many millions of users and thousands or millions of items to recommend?
  • Are you comfortable with Big Data?
  • What are your time to market needs?
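
To make the content-based vs. personal question above a little more concrete, here is a minimal sketch in Python.  Everything in it is an assumption made up for illustration: the tiny catalog, the tags, the user histories and the function names are hypothetical, and the Jaccard overlap score is just one simple similarity measure among many.  It is a toy, not a production design.

```python
from collections import defaultdict

# Hypothetical catalog: item -> descriptive tags (input for content-based recs).
ITEM_TAGS = {
    "space_doc": {"documentary", "science", "space"},
    "mars_movie": {"drama", "science", "space"},
    "cooking_show": {"food", "reality"},
    "baking_show": {"food", "reality", "competition"},
}

# Hypothetical event data: user -> items consumed (input for behavior-based recs).
USER_EVENTS = {
    "ann": {"space_doc", "mars_movie"},
    "bob": {"space_doc", "cooking_show"},
    "cat": {"cooking_show", "baking_show"},
}


def jaccard(a, b):
    """Overlap between two sets: 0.0 (nothing shared) to 1.0 (identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def content_based(item, k=2):
    """Rank other items by how many tags they share with `item`."""
    scores = [
        (jaccard(ITEM_TAGS[item], tags), other)
        for other, tags in ITEM_TAGS.items()
        if other != item
    ]
    scores.sort(key=lambda s: (-s[0], s[1]))  # best score first, then by name
    return [other for score, other in scores[:k] if score > 0]


def behavior_based(user, k=2):
    """Rank items the user has not seen, weighted by similar users' histories."""
    seen = USER_EVENTS[user]
    scores = defaultdict(float)
    for other, items in USER_EVENTS.items():
        if other == user:
            continue
        similarity = jaccard(seen, items)
        for item in items - seen:
            scores[item] += similarity
    ranked = sorted(scores.items(), key=lambda s: (-s[1], s[0]))
    return [item for item, score in ranked[:k] if score > 0]


print(content_based("space_doc"))  # items that look like space_doc
print(behavior_based("ann"))       # items that users similar to ann consumed
```

Note that the first function never looks at users and the second never looks at tags.  That is the practical difference behind the question: a content-based approach needs good item metadata, while a personal approach needs good event data.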

Build vs. Buy?

This is an interesting and often challenging question to answer honestly.  Sure, you have some sharp engineers who are invigorated by the proposition of building this kind of tech.  It’s cool.  Super geeky, but cool. The tough question is: do they really have the experience and horsepower needed to get it done?  

Any amount of research into the recsys area will certainly reveal several Open Source Software (OSS) projects that aim to bring some general-purpose solutions for this heady problem down to earth.  Here are a few of the more promising projects:


Apache Mahout is a scalable machine learning library that supports large data sets.


Apache Lucene(TM) is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.


Apache Solr is an open source enterprise search platform from the Apache Lucene project. Its major features include powerful full-text search, hit highlighting, faceted search, dynamic clustering, database integration, rich document (e.g., Word, PDF) handling, and geospatial search.

The availability of these kinds of OSS options can mask how serious and complicated the engineering behind a successful recsys solution really is.  Be aware that there is much work to be done before the above projects become practical for your endeavors.

Also, make certain to answer the recsys questions listed above before embarking on your recsys project.  The answers will shape the success criteria for a buy option or the product definition for a build decision.