The answer to the challenge can be found here.
This post contains the winning solution for the Stratio Challenge 2015, developed by Marco Piva, Leonardo Biagioli, Fabio Fantoni and Andrea De Marco (BitBang).
The work describes the data model and architecture of a Big Data analytics solution that helps online advertisers get fast answers to their analytical questions about impressions, clicks and purchases, based on the Stratio Challenge requirements.
The emphasis is placed on the technologies that have been used, with particular focus on Spark and Spray.
Stratio has just added support for top-k queries to its Lucene-based implementation of Cassandra's secondary indexes. This implementation was originally designed to allow embedded full-text and multivariable search in Apache Cassandra. The previous release included an ad-hoc mechanism to perform distributed relevance queries based on Lucene's scoring algorithm. The current release generalizes this mechanism to allow several types of top-k queries.
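To make the idea of a distributed top-k query concrete, here is a minimal sketch of the generic scatter-gather pattern: each node returns its own k best rows, and a coordinator merges those partial results into the global top k. This is not Stratio's implementation. The `(row key, score)` row shape, the function names `local_top_k` and `merge_top_k`, and the sample data are all assumptions made for illustration.

```python
import heapq
from typing import Iterable, List, Tuple

# Hypothetical row shape: (row key, score). Not taken from Stratio's code.
Row = Tuple[str, float]


def local_top_k(rows: Iterable[Row], k: int) -> List[Row]:
    """Return the k highest-scoring rows held by one node, best first."""
    return heapq.nlargest(k, rows, key=lambda row: row[1])


def merge_top_k(partials: Iterable[List[Row]], k: int) -> List[Row]:
    """Combine per-node partial results into the global top k, best first."""
    candidates = (row for partial in partials for row in partial)
    return heapq.nlargest(k, candidates, key=lambda row: row[1])


# Made-up sample data standing in for three nodes.
node_a = [("a1", 0.91), ("a2", 0.15), ("a3", 0.64)]
node_b = [("b1", 0.77), ("b2", 0.98), ("b3", 0.02)]
node_c = [("c1", 0.40), ("c2", 0.55)]

k = 3
partials = [local_top_k(node, k) for node in (node_a, node_b, node_c)]
top = merge_top_k(partials, k)
print(top)
```

The merge is correct because any row in the global top k must also be in the top k of the node that holds it, so the coordinator only ever needs k candidates per node rather than every matching row.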
The Cassandra Summit 2014 took place on September 10-12 in San Francisco, with more than 2000 attendees. The summit was split into 3 days, with training sessions on the first and third days and conference sessions across 6 tracks on the second. The first thing we noticed after passing through the registration desk was how amazing the organization was.