Tag Archives: python

Ecological Inference

As a customer, I am happy when companies offer special promotions. Companies know this, so they try to learn more about their customers in order to design promotions that suit them. On the other hand, as a customer I do not want to give my personal information to companies. Since most citizens feel the same way about their personal information, there are regulations on the matter, such as GDPR in the EU and KVKK in Turkey. Although these two customer attitudes seem contradictory, the companies that believe the customer is always right and work to resolve this paradox will be the winners.

Ecological inference is a promising approach to this problem. The method tries to infer individual-level behavior from aggregated data, and several statistical approaches have been proposed for it. Since different individual behaviors can produce the same aggregate data, the problem cannot be solved exactly; the results are estimates. In the literature, the problem is mostly framed in terms of the voting behavior of different groups. Gary King has done notable work on the topic and has published several books and papers.
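To make the "same aggregate, different individual behavior" point concrete, here is a minimal sketch of one of the simplest approaches from the literature, the deterministic method of bounds. It is not the model from any particular paper or library; the function name and the store numbers are hypothetical and only serve as an illustration.

```python
def method_of_bounds(group_share, outcome_share):
    """Return (lower, upper) bounds on the outcome rate inside one group.

    group_share:   fraction of the population that belongs to the group
    outcome_share: fraction of the whole population showing the outcome
    Only these two aggregate numbers are known; individual data is not.
    """
    if not 0.0 < group_share <= 1.0:
        raise ValueError("group_share must be in (0, 1]")
    if not 0.0 <= outcome_share <= 1.0:
        raise ValueError("outcome_share must be in [0, 1]")
    # Lowest rate: everyone outside the group shows the outcome first.
    lower = max(0.0, (outcome_share - (1.0 - group_share)) / group_share)
    # Highest rate: the outcome is concentrated inside the group.
    upper = min(1.0, outcome_share / group_share)
    return lower, upper


# Hypothetical store: 60% of customers are loyalty members and
# 50% of all customers used a promotion.
low, high = method_of_bounds(0.6, 0.5)
print(f"Members' promotion rate lies between {low:.2f} and {high:.2f}")
```

Any rate inside the returned interval is consistent with the aggregate figures, which is why the statistical models can only narrow the range down to an estimate rather than give an exact answer.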

There is a library called PyEI available on GitHub. It lets you apply ecological inference easily in Python. The library includes several inference models, so you can select the one most suitable for your needs. The output of the inference is expressed as percentages. You can then compute flow amounts by applying those percentages to your original data. My favorite visualisation technique for presenting ecological inference results is the Sankey diagram. It clearly shows the amount flowing from each origin to each destination.
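The step from percentages to flow amounts can be sketched as follows. This does not call the library; the group names, totals, and shares are made-up placeholders standing in for your original data and for whatever percentages the inference returns, and `shares_to_flows` is a hypothetical helper.

```python
def shares_to_flows(origin_totals, inferred_shares):
    """Turn per-origin percentages into (origin, destination, amount) rows.

    origin_totals:   {origin: total count in the original data}
    inferred_shares: {origin: {destination: share between 0 and 1}}
    """
    flows = []
    for origin, total in origin_totals.items():
        for destination, share in inferred_shares[origin].items():
            # Flow amount = size of the origin group * inferred share.
            flows.append((origin, destination, total * share))
    return flows


# Placeholder inputs (assumptions, not real results).
origin_totals = {"Group A": 1000, "Group B": 500}
inferred_shares = {
    "Group A": {"Option X": 0.7, "Option Y": 0.3},
    "Group B": {"Option X": 0.2, "Option Y": 0.8},
}

flows = shares_to_flows(origin_totals, inferred_shares)
for origin, destination, amount in flows:
    print(f"{origin} -> {destination}: {amount:.0f}")
```

Each row is one link of a Sankey diagram: the origin and destination are the nodes, and the amount is the width of the band between them. Most plotting libraries that offer Sankey charts accept data in roughly this source, target, value shape.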

I have not been able to find an open-source dataset for applying ecological inference in a marketing context. I hope to share a small study when I find a suitable one.

New Online Course: Intro to ML with TensorFlow by Udacity

I have started a new online course on Udacity called Intro to ML with TensorFlow. I applied to the Bertelsmann Technology scholarship program in October. In December they announced that I had been accepted to the program.

I have just started the course, and my experience so far is good. The welcome part covers how to install Anaconda to set up a working environment. After that, the supervised machine learning section starts. It covers various ML techniques such as decision trees and support vector machines. Since this is a challenge course that determines who will get a scholarship for the rest of the program, it only covers supervised learning. At the end of the challenge course there is a final assessment to select the 1,600 top-performing candidates for the full scholarship.

I hope the course gives me a new perspective on ML, and I am looking forward to attending the next phase.