We would like to welcome Chuanhow Technologies as our newest partner

Chuanhow Technologies is a specialist in cloud computing/big data, software-defined networking, network security, data analysis, data loss prevention, application delivery management, business continuity, enterprise software, e-commerce, mobile computing, asset management, social media, and innovation strategy for next-generation products. Chuanhow helps vendors in the IT market develop a profitable business, offering a distinctive pathway for development and a new way of thinking that helps reseller partners and enterprise customers reduce costs, optimize operations, improve efficiency, and build a profitable portfolio.

We are proud to announce that Everis has become our partner

Everis is a multinational consulting firm providing business and strategy solutions, application development, maintenance, and outsourcing services. Established in 1996, everis has averaged 20% annual growth in revenues and became part of NTT Data in January 2014.
Being part of the NTT Data group enables everis to offer a wider range of solutions and services through increased capacity as well as technological, geographical, and financial resources.

We were at the Cassandra Summit, from 10 to 12 September in San Francisco

The Cassandra Summit 2014 took place on September 10-12 in San Francisco with more than 2000 attendees. The summit ran over 3 days, with training sessions on the first and third days and conference sessions across 6 tracks on the second. The first thing we noticed after passing through the registration desk was how well organized the event was: a smooth registration process, clear track and room directions, and clear demarcation of themes (sponsors, tracks, etc.). In summary, it was one of the best-organized events, held in an appropriate venue and with huge social network activity.

Paper of the week: “BlinkDB: Queries with Bounded Errors and Bounded Response Times on Very Large Data” [1]

This paper was presented at the EuroSys 2013 conference and is available for download at the conference website. It presents BlinkDB which, despite its name, is not a database but a query engine built on top of Hive and Shark that runs interactive SQL queries on large volumes of data using data samples. BlinkDB is built on two key ideas: an adaptive optimization framework that builds and maintains stratified samples, and a dynamic sample selection strategy that selects an appropriately sized sample based on a query’s accuracy or response time requirements.
This paper offers an interesting introduction to applying statistical inference techniques to Big Data and makes clear that there is always a trade-off between accuracy and performance. In that regard, BlinkDB reports information about query accuracy so the user can make informed decisions. Although it is not clear what the cost of maintaining stratified samples is, the paper provides a good seed for future work in the area.
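To make the idea of answering queries from stratified samples more concrete, here is a minimal Python sketch. It is not BlinkDB's implementation: the function names, the `cap` parameter, the toy `city`/`latency` data, and the normal-approximation error bar are all assumptions chosen for illustration. It keeps at most `cap` rows per group, so rare groups are not lost, and reports an approximate average with a 95% error bar for each group.

```python
import math
import random
from collections import defaultdict


def stratified_sample(rows, key, cap, rng):
    """Keep at most `cap` rows per distinct value of `key` (hypothetical helper).

    Returns {key_value: (sampled_rows, original_group_size)}.
    """
    if cap < 2:
        raise ValueError("cap must be at least 2 to estimate a variance")
    strata = defaultdict(list)
    for row in rows:
        strata[row[key]].append(row)
    return {
        k: (rng.sample(v, cap) if len(v) > cap else list(v), len(v))
        for k, v in strata.items()
    }


def approx_avg(sample, column, z=1.96):
    """Estimate AVG(column) per group as {key_value: (mean, half_width)}.

    The half width is a normal-approximation error bar (z=1.96 ~ 95%).
    """
    result = {}
    for k, (rows, population) in sample.items():
        values = [r[column] for r in rows]
        n = len(values)
        mean = sum(values) / n
        if n == population:
            half_width = 0.0  # the whole group was kept: no sampling error
        else:
            variance = sum((v - mean) ** 2 for v in values) / (n - 1)
            half_width = z * math.sqrt(variance / n)
        result[k] = (mean, half_width)
    return result


# Toy data (assumed): one very large group and one very small group.
rng = random.Random(0)
rows = [{"city": "NYC", "latency": rng.uniform(0, 100)} for _ in range(10000)]
rows += [{"city": "Galway", "latency": 10.0 * i} for i in range(5)]

sample = stratified_sample(rows, key="city", cap=500, rng=rng)
estimates = approx_avg(sample, column="latency")
for city, (mean, half_width) in estimates.items():
    print(f"{city}: {mean:.2f} +/- {half_width:.2f}")
```

A uniform sample of the same total size could easily miss the small group altogether, which is the motivation for stratifying. Raising `cap` tightens the error bars at the cost of scanning more rows, which is the accuracy versus performance trade-off the paper highlights.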
[1] Agarwal, Sameer, et al. “BlinkDB: queries with bounded errors and bounded response times on very large data.” Proceedings of the 8th ACM European Conference on Computer Systems. ACM, 2013.