Kaggle: Join the global machine learning and AI community

About half a year ago I stumbled upon Kaggle.com, a lively community portal for artificial intelligence and machine learning experts. Kaggle not only encourages people around the world to share ideas and example data sets for popular machine learning tasks, it also hosts great AI challenges.

Since joining the Kaggle community six months ago, I have been fascinated by the individual challenges that have been published. They range from predicting Mercari product prices and detecting icebergs in radar data to speech recognition tasks.

Many companies such as Google, Mercari or Zillow host challenges in which more than a thousand teams compete to produce the best predictions. It is often astonishing how those teams solve these complex machine learning tasks.

Besides providing the challenges and the data sets needed to attract the interest of leading members of the machine learning and AI community, Kaggle also offers a powerful kernel execution environment. This environment consists of preconfigured Docker containers that were specifically designed for training models. To design and execute a machine learning kernel you simply edit the code online (Python, R, Notebook) and run it within the Kaggle infrastructure.

As the Kaggle Docker containers are completely preconfigured, you save a lot of time that you would otherwise spend downloading and preparing your environment.



Kaggle really pushes the AI community forward by offering a flexible and open platform for executing kernels and for quickly getting your hands on interesting data sets. The platform also does a pretty good job of bringing the global community together and stimulates a broader, practical discussion outside the theoretical scientific community.

If you need a quick-start tutorial on how to train your first neural network, my eBook is available on Amazon.

Facebook’s Social Graph Search – Semantic Search for People, Places and Things

image source: Facebook.com

Facebook recently announced Social Graph Search, a new kind of semantic search engine that lets you query Facebook’s social graph data structure using natural language. Facebook gives some interesting but harmless sample queries, such as ‘People who like Cycling’, ‘Photos I like’, ‘Photos before 1970’, ‘Restaurants in London my friends have been to’ and so on. These queries seem to offer real potential for natural language processing in the area of social graphs. Using natural language processing to query large data graphs is not new: WolframAlpha already introduced it for all kinds of general knowledge, from local weather to mathematical questions.
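To make the idea a bit more concrete, here is a minimal Python sketch of how such a phrase could reduce to a traversal over labeled edges. Everything in it is hypothetical: the toy graph, the edge labels (`likes`, `friend`, `visited`, `located_in`), the restaurant names and the two helper functions are my own illustration and say nothing about how Facebook actually implements Graph Search.

```python
from collections import defaultdict

# Hypothetical toy graph as (source, edge label, target) triples.
EDGES = [
    ("alice", "likes", "Cycling"),
    ("bob", "likes", "Cycling"),
    ("carol", "likes", "Chess"),
    ("me", "friend", "alice"),
    ("me", "friend", "carol"),
    ("alice", "visited", "Trattoria Uno"),
    ("carol", "visited", "Cafe Nord"),
    ("bob", "visited", "Pub Central"),
    ("Trattoria Uno", "located_in", "London"),
    ("Pub Central", "located_in", "London"),
    ("Cafe Nord", "located_in", "Copenhagen"),
]

# Index the edges in both directions so queries can walk either way.
OUT = defaultdict(set)  # (source, label) -> targets
IN = defaultdict(set)   # (target, label) -> sources
for src, label, dst in EDGES:
    OUT[(src, label)].add(dst)
    IN[(dst, label)].add(src)


def people_who_like(topic):
    """'People who like <topic>': follow 'likes' edges backwards."""
    return sorted(IN[(topic, "likes")])


def places_friends_visited(person, city):
    """'Restaurants in <city> my friends have been to' as a three-hop walk."""
    places = set()
    for friend in OUT[(person, "friend")]:
        for place in OUT[(friend, "visited")]:
            if city in OUT[(place, "located_in")]:
                places.add(place)
    return sorted(places)


print(people_who_like("Cycling"))
print(places_friends_visited("me", "London"))
```

The hard part of the real system is of course the step this sketch leaves out, namely turning free text into such a structured traversal.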

Facebook goes a little further by introducing a natural query language for searching quite personal information within a globally available social graph. Whether Facebook’s social graph search becomes an informative personal tool for everyday life or a nightmare for every privacy activist depends, of course, on the types of queries it allows. Some already published queries, such as ‘People working for Facebook’ and similar searches, are quite scary and will surely lead to misuse and dubious apps.

From a business perspective, social graph queries allow advertisers to select their target groups very specifically, even more precisely than existing context- and location-sensitive solutions allow today. Facebook’s social graph search will lead to a new level and quality of advertising, whether that means more or less advertising annoyance in the future.



Talk ‘How to build a Recommendation Engine’

Watch this really interesting presentation by Coen Stevens, who explains how to build a recommendation engine for Wakoopa. Coen, who earned his master’s degree in Knowledge Based Systems at TU Delft and propedeuse diplomas in Psychology and Philosophy at Leiden University, became the Lead Recommendations Engineer at Wakoopa. He gave this talk at recked.org.

Wakoopa is working on methods to understand what people do in their digital lives. In a privacy-conscious way, their technology tracks which websites customers visit, which ads they see, and which apps they use. Users of Wakoopa can analyze the collected data in an online dashboard and optimize their digital marketing strategy accordingly.

Recommendation Engine for Everything


Today, customers are overwhelmed by the number of available products and choices. As busy customers do not have much time to spend on searching and comparing products, recommendation engines are the current hype among ICT startups. Tipflare is such a recommendation engine, built for recommending everything from food to clothes. It supports its customers by selecting related products they might be interested in, analyzing ratings and purchase history to suggest other products. All of this produces a huge amount of data, which is why Big Data appears to be the next big thing in the business intelligence market for handling and analyzing such volumes.

image source: tipflare website

ZAINOO – Personalize your Italy travel guide!

Zainoo, a brand-new startup located in Austria, lets you plan your trip to Italy
by selecting a list of interesting regions, cities and sights. Independent
travelers can plan their trip to Italy and download their finished selection to a smartphone
or tablet. Compared to TripWolf, one of the big players in the community travel guide
market, Zainoo offers first-hand insider tips, descriptions and photos from the authors
themselves. At this stage Zainoo contains detailed information about Venice, Florence, Milan, Sicily and Rome.

eBay and product recommendation strategies

eBay Research Labs

Last week I read a really interesting scientific article by Neel Sundaresan, Sr. Director & Head of eBay Research Labs in San Jose. The article was published at the 2011 conference on recommender systems and deals with the different strategies that eBay has implemented to help customers find relevant items within a huge collection of available items. It gives an excellent overview of the challenges, opportunities, and approaches in building recommender systems for huge marketplaces such as eBay. Sundaresan also states that the last decade has seen explosive growth in the research and use of recommender systems across all major e-commerce and content platforms, such as Netflix, eBay, Amazon, Last.fm, Spotify or YouTube. Besides traditional recommendations such as ‘People who performed action X also performed action Y’, these engines also have to account for the diversity of users, which includes collectors, value shoppers and resellers, and for the complexities in shopping caused by the gap between buyer and seller languages. For e-commerce this means thinking in terms of substitutes and complements, and about how to increase the size of a customer’s shopping cart. While substitutes are equivalent products that differ in price or offer additional attributes, complements are additional products that go well with the items the customer already has in the basket.

In detail, Sundaresan discusses the top-level questions of recommendation engines, the five Ws and an H: What, When, Why, Who, Where, and How.

The What identifies and analyzes how the seller’s language differs from the description of an item that the customer could find attractive. The Where defines the different contexts in which a seller has the opportunity to offer items to a customer who is in a specific mindset at that moment. The Why addresses the transparency of recommendation engines: customers feel better, and the approach is more successful, when the system gives a reason for proposing an item. (That is why most platforms tell you ‘People who bought this also bought that’, even when the analysis in the back end is more complex.) The Who reflects who needs recommendations: a power seller or power buyer does not need the same recommendations as other customers. The When defines the point in time at which a customer needs, or is open to, suggestions, while the How targets the technological aspects, including the algorithms used to build recommendation engines. Modern approaches use content models, neighborhood models, matrix factorization models or hybrid approaches to fill in the user-item matrix.
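To illustrate what filling in the user-item matrix with matrix factorization can look like, here is a small sketch in plain Python. It is not taken from Sundaresan's article: the toy ratings, the function names `factorize` and `predict`, and all hyperparameters (two latent factors, learning rate, regularization, number of epochs) are assumptions chosen for illustration. The model learns one small vector per user and per item with stochastic gradient descent on the known ratings, and predicts a missing rating as the dot product of the two vectors.

```python
import random

# Hypothetical known ratings as (user index, item index, rating).
# Users 0 and 1 prefer items 0 and 1; users 2 and 3 prefer items 2 and 3.
RATINGS = [
    (0, 0, 5), (0, 1, 5), (0, 2, 1),
    (1, 0, 5), (1, 2, 1), (1, 3, 1),
    (2, 0, 1), (2, 2, 5), (2, 3, 5),
    (3, 0, 1), (3, 1, 1), (3, 2, 5),
]


def factorize(ratings, n_users, n_items, n_factors=2,
              epochs=2000, lr=0.01, reg=0.02, seed=0):
    """Learn user factors P and item factors Q by SGD on squared error."""
    rng = random.Random(seed)
    P = [[rng.uniform(-0.1, 0.1) for _ in range(n_factors)]
         for _ in range(n_users)]
    Q = [[rng.uniform(-0.1, 0.1) for _ in range(n_factors)]
         for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - sum(pu * qi for pu, qi in zip(P[u], Q[i]))
            for f in range(n_factors):
                pu, qi = P[u][f], Q[i][f]
                # Gradient step with L2 regularization on both factors.
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q


def predict(P, Q, user, item):
    """Predicted rating = dot product of user and item factor vectors."""
    return sum(pu * qi for pu, qi in zip(P[user], Q[item]))


P, Q = factorize(RATINGS, n_users=4, n_items=4)
# Entries (1, 1) and (0, 3) were never observed; the model fills them in.
print(round(predict(P, Q, 1, 1), 2), round(predict(P, Q, 0, 3), 2))
```

In this toy data user 1 behaves like user 0, who liked item 1, so the model should score the unseen pair (1, 1) well above the pair (0, 3).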

Altogether, an excellent article, definitely worth reading!

The Journal of Personalization Research (UMUAI)


I just read about an upcoming special issue of ‘The Journal of Personalization Research (UMUAI)’ on the topic of Context‐Aware Recommender Systems.
The special issue will cover the following topics:

  • Context modeling techniques for recommender systems
  • Context‐aware user modeling for recommender systems
  • Acquisition, prediction, and mining of contextual information in recommender systems
  • Algorithms for context‐aware recommender systems
  • Interacting with context‐aware recommender systems
  • Novel applications for context‐aware recommender systems
  • Large‐scale context‐aware recommender systems
  • Context‐aware recommendation to groups
  • Evaluation and user studies of context‐aware recommender systems
  • Privacy issues in context‐aware recommender systems

Recommender systems are a popular area of personalization technologies, which has
enjoyed a tremendous amount of research and development activity in both academia and
industry in the last 10‐15 years. Recommender systems research typically explores and
develops techniques and applications for recommending various products or services to
individual users based on the knowledge of users’ tastes and preferences as well as users’ past activities (such as previous purchases), which are applicable in a variety of domains and settings.

Recommender Systems and Context-Awareness


Over the last few months I have noticed a new trend emerging among the fancy mobile apps that are published in the global marketplaces. These apps make heavy use of recommendation engines in combination with context-awareness. Such systems try to use all the highly dynamic information gathered from smartphones to guide a user through a huge collection of options. These options can be items from a webshop, product placements, advertisements and many other categories.

While the traditional approaches to building recommendation systems are mainly based on content-based and collaborative filtering methods, modern approaches also evaluate context information.

Generally, recommendation systems are based on one of two major strategies:

Content Filtering (or Content-based recommendation)

In the content filtering approach, explicit profiles of users and objects are defined to characterize their nature. Each profile contains a specific set of attributes that can be used to compare the similarity of one object to another. For example, a restaurant could have a cuisine attribute describing the type of food it offers, a location attribute, a vegetarian tag, and so on. A recommendation function f chooses items that are similar to items the user has already chosen or rated. The utility function compares the user’s profile with the available items and calculates their similarity.
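The restaurant example can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the restaurants and their attribute tags are invented, the user profile is simply the sum of the attribute vectors of the liked items, and cosine similarity stands in for the utility function (other similarity measures would work as well).

```python
import math

# Hypothetical item profiles: restaurant -> attribute weights.
ITEMS = {
    "Trattoria Roma": {"italian": 1, "vegetarian": 1, "city_center": 1},
    "Steakhouse Grill": {"steak": 1, "city_center": 1},
    "Green Garden": {"vegan": 1, "vegetarian": 1, "suburb": 1},
    "Pasta Bar": {"italian": 1, "vegetarian": 1, "suburb": 1},
}


def cosine(a, b):
    """Cosine similarity between two sparse attribute vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def build_profile(liked):
    """User profile = sum of the attribute vectors of the liked items."""
    profile = {}
    for name in liked:
        for attr, weight in ITEMS[name].items():
            profile[attr] = profile.get(attr, 0) + weight
    return profile


def recommend(liked, top_n=2):
    """Rank the items the user has not chosen yet by similarity to the profile."""
    profile = build_profile(liked)
    scores = {name: cosine(profile, attrs)
              for name, attrs in ITEMS.items() if name not in liked}
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


print(recommend(["Trattoria Roma"]))
```

A user who liked the Italian, vegetarian-friendly restaurant gets the other Italian, vegetarian-friendly place ranked first, without any data from other users being needed.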

Collaborative Filtering
The recommendation function chooses items that were preferred by other users with similar attributes. Collaborative filtering is domain-free, which means that it can be applied to any application area and to aspects of the data that would be hard to capture in an explicit profile. Collaborative filtering is generally considered more accurate than content filtering, but it has to cope with starting without any initial data (the cold start problem): it cannot, for example, handle new users or objects for which the system has no data yet. The different collaborative filtering approaches can be classified into two major directions: neighborhood methods and latent factor models.

  1. k-nearest neighbor: the similarity between the target user u and a neighbor v can be calculated using Pearson’s correlation coefficient
  2. association rules: buy A + buy B = recommend buying C
  3. matrix factorization
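The k-nearest-neighbor variant from the list above can be sketched as follows. Note the assumptions: the ratings are made up, the Pearson correlation is computed over the items both users have rated (with means taken over those co-rated items), only positively correlated neighbors are kept, and the prediction uses a common mean-centered weighted average. Other formulations exist, so treat this as one possible reading rather than the definitive formula.

```python
import math

# Hypothetical ratings on a 1-5 scale: user -> {item: rating}.
RATINGS = {
    "u": {"A": 5, "B": 3, "C": 4},
    "v1": {"A": 4, "B": 2, "C": 5, "D": 5},
    "v2": {"A": 1, "B": 5, "C": 2, "D": 1},
    "v3": {"A": 5, "B": 3, "D": 4},
}


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def pearson(ru, rv):
    """Pearson correlation of two users over the items both have rated."""
    common = ru.keys() & rv.keys()
    if len(common) < 2:
        return 0.0  # not enough overlap to measure a correlation
    mu_u = mean(ru[i] for i in common)
    mu_v = mean(rv[i] for i in common)
    num = sum((ru[i] - mu_u) * (rv[i] - mu_v) for i in common)
    den = (math.sqrt(sum((ru[i] - mu_u) ** 2 for i in common))
           * math.sqrt(sum((rv[i] - mu_v) ** 2 for i in common)))
    return num / den if den else 0.0


def predict(user, item, k=2):
    """Predict a rating from the k most similar users who rated the item."""
    candidates = [
        (pearson(RATINGS[user], ratings), other)
        for other, ratings in RATINGS.items()
        if other != user and item in ratings
    ]
    # Keep only positively correlated neighbors, most similar first.
    neighbors = sorted((c for c in candidates if c[0] > 0), reverse=True)[:k]
    base = mean(RATINGS[user].values())
    if not neighbors:
        return base  # fall back to the user's own average rating
    num = sum(sim * (RATINGS[o][item] - mean(RATINGS[o].values()))
              for sim, o in neighbors)
    den = sum(sim for sim, _ in neighbors)
    return base + num / den


print(round(pearson(RATINGS["u"], RATINGS["v2"]), 3))
print(round(predict("u", "D"), 3))
```

User v2 rates almost exactly the opposite way to u, so the correlation is negative and v2 is ignored when predicting u's rating for item D.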

Explicit feedback means that a user rates items explicitly (which results in a sparse matrix, because a user can rate only a small percentage of all available items).
Implicit feedback, on the other hand, means that the user’s preferences are reflected indirectly by observed behaviour, such as previous purchases, navigation paths, search terms and so on.
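The difference between the two kinds of feedback is easy to see in a small Python sketch. All data, the event types and in particular the event weights are hypothetical; in practice such weights would have to be tuned for the application.

```python
from collections import defaultdict

# Explicit feedback: only a few (user, item) pairs carry a rating.
EXPLICIT = {
    ("ann", "book"): 5,
    ("bob", "lamp"): 2,
    ("cat", "book"): 4,
}


def sparsity(ratings, n_users, n_items):
    """Fraction of the user-item matrix that has no rating."""
    return 1.0 - len(ratings) / (n_users * n_items)


# Implicit feedback: observed behaviour, weighted by how strong a signal it is.
# These weights are assumptions for illustration only.
EVENT_WEIGHTS = {"view": 1.0, "search": 0.5, "purchase": 5.0}

EVENTS = [
    ("ann", "lamp", "view"),
    ("ann", "lamp", "view"),
    ("ann", "lamp", "purchase"),
    ("bob", "book", "search"),
    ("cat", "desk", "view"),
]


def implicit_scores(events):
    """Aggregate behaviour events into one preference score per (user, item)."""
    scores = defaultdict(float)
    for user, item, kind in events:
        scores[(user, item)] += EVENT_WEIGHTS[kind]
    return dict(scores)


print(sparsity(EXPLICIT, n_users=3, n_items=4))
print(implicit_scores(EVENTS))
```

Even in this tiny example three quarters of the explicit matrix is empty, while the implicit events provide a signal for pairs that were never rated at all.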

Today most implementations are so-called hybrid approaches, which means that they combine content and collaborative filtering to optimize the recommendation results.
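One simple way to build such a hybrid is a weighted blend of the two scores. The sketch below is just one of many possible combinations: the function name `hybrid_scores`, the weight `alpha` and the example scores are assumptions, and falling back to the content score when no collaborative score exists is my own choice for handling cold-start items.

```python
def hybrid_scores(content, collaborative, alpha=0.5):
    """Blend content-based and collaborative scores per item.

    Items without a collaborative score (e.g. brand-new items) fall back
    to their content score alone.
    """
    blended = {}
    for item, c_score in content.items():
        if item in collaborative:
            blended[item] = alpha * c_score + (1 - alpha) * collaborative[item]
        else:
            blended[item] = c_score
    return blended


# Hypothetical scores in [0, 1]; item "N" is new and has no collaborative data.
content = {"A": 0.9, "B": 0.2, "N": 0.8}
collaborative = {"A": 0.4, "B": 0.8}

scores = hybrid_scores(content, collaborative)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

The new item can still be recommended on the basis of its attributes, while items with enough history benefit from both signals.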

A very interesting scientific workshop (CARS: Workshop on Context-Aware Recommender Systems, http://cars-workshop.org/) deals with the current trends in combining traditional approaches to recommendation engines with context-awareness.