Wild Data Part 3: Transfer Learning

Did you know that the word “hippopotamus” is a word of Greek origin? Hippos- comes from “horse” and -potamos means “river”. The funny thing here would be to imagine when Greeks run into this animal for the very first time. There was not a word for every single animal around the world, so they probably thought something like “what a strange horse…!!! Maybe the river has something to do with it. Got it! It will be a hippo-potamus!”

Read More

Wild Data Part 2: Transfer Style

This is the second post of our Wild Data series. In this post, we are going to expose how to transfer style from one image to another. Here, the most interesting point is to know that we won’t use a neural network to classify a set of classes as usual. I mean, we don’t need to train a network for a specific approach. Transfer style is based on pre-trained networks such as it could be a VGG19 trained with ImageNet (one million of images). Thus, a good understanding of transfer style will help you to better understand how convolutional neural networks works for vision. Let’s go on!

Read More

Swarm Intelligence Metaheuristics, part 1: Ant Colony Optimization

In a previous post, we reviewed the taxonomy of metaheuristic algorithms for optimization within the context of feature selection in machine learning problems. We explained how feature selection can be tackled as a combinatorial optimization problem in a huge search space, and how heuristic algorithms (or simply metaheuristics) are able to find good solutions -although not necessarily optimal- in a reasonable amount of time by exploring such space in an intelligent manner. Recall that metaheuristics are especially well fitted when the function being optimized is non-differentiable or does not have an analytical expression at all (for instance, the magnitude being optimized is the result of a randomized complex simulation under a parameter set that constitutes a candidate solution). Note that maths cannot help us in such cases and metaheuristics can be the only way to go.

Read More

Correlation does not imply… sluggishness

When the Father of Statistics Ronald Fisher started to witness mounting evidence in favor of the association between smoking and lung cancer, he was quick to fall back into the maxim he helped coin “Correlation does not imply causation”, discounting the evidence as spurious and continuing his habit as a heavy smoker of cigarettes. While his command of mathematics was well above his contemporaries at the time, this speaks volumes for the fact that everybody is prone to bias, and that we should attempt not to fall in love with our hypotheses too early in our decision making process. While Fisher may have had a point, the fact that he was a smoker himself certainly clouded his thinking and led him to not consider fairly all possible explanations for the available evidence.

Read More

Cooking ML Models

Have you ever watched the cooking teaching shows? You have probably noticed that chefs have usually already all the ingredients separated and chopped. A chef probably will be more useful and creative cooking rather than spending time peeling and chopping potatoes, even though it is still important in the recipe. Likewise, a data scientist will be more useful and creative building models rather than spending time with data preprocessing. In this way, where a chef would prepare exquisite delicacies a data scientist prepares succulent models.

Read More

Statistical Comparison of Machine Learning Algorithms (Part 1)

This is the first of a two-part series dealing with the application of statistical tests for the formal comparison of several Machine Learning (ML) algorithms in order to determine whether one generally outperforms the rest or not. In this first chapter, we explain the fundamentals of statistical tests, while in the second part, we examine how they are applied to ML algorithm performance data with the aim of comparing them from a statistical point of view.

Read More