Kaggle Competitions and Data Science Portfolios

Understanding Kaggle Competitions

Kaggle is a crowdsourcing platform on which companies post their real-world data science problems for the community to solve. Kaggle hosts these competitions publicly and ranks the participants against one another. When large companies need help with their data science challenges, they turn to Kaggle and its community. The famous $1 million Netflix Prize, which Netflix ran itself before Kaggle was founded, helped popularize this kind of open competition. Companies that have used Kaggle competitions to tackle business problems include Allstate, Bosch, State Farm, Red Hat, Facebook, Expedia, Home Depot, Yelp, Airbnb, Walmart, and Liberty Mutual.

Image: Kaggle competition winner holding trophies

The Future of Hiring in Data Science

To be hired as an artist or an architect, you need to present a portfolio that showcases years of your work. Programmers need a GitHub or Stack Overflow account to showcase their contributions. Kaggle is positioning itself to play the same role for any data vocation, which adds one more hoop to jump through before you are hired as a data scientist. Kaggle has spent the last half decade establishing itself as the premier platform for hiring, recruiting, and screening data science talent. Companies and recruiters get a transparent record of your performance, a paper trail of your successes and failures, and most importantly, your growth, all of which is tracked in a quantifiable manner. Employers want to see whether a candidate has been battle-tested in data science through these competitions.

For you as an individual, this is a chance to compete on Kaggle against others like you from around the world, to prove to employers and recruiters that you have some hands-on experience under your belt, and to show that you are worthy of the title of data scientist.

How Do I Compete? Where Do I Start?

The best way is to dive right in. Be warned that you are competing against people from all over the world, so your first submission scores will likely be demoralizing. Do not give up, though: each failed attempt will shape you into a better data scientist. Failing in Kaggle competitions is not only acceptable but desirable, because employers want to see your growth and, better yet, your potential to overcome challenges.

  1. Create an account
  2. Pick a competition
  3. Understand the data science problem
  4. Build models and continually tune to improve
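
To make step 4 concrete, here is a minimal sketch of a first "model" and the submission text it produces. It is only an illustration under stated assumptions: the column names `Id` and `Target`, the helper functions, and the tiny data set are all made up for this example, and every real competition defines its own file format and evaluation metric. The baseline simply predicts the most common training label for every test row, which gives you a score to beat as you tune.

```python
import csv
import io
from collections import Counter

# Hypothetical column names; a real competition specifies its own.
ID_COL = "Id"
TARGET_COL = "Target"


def fit_majority_baseline(train_rows):
    """Return the most common target value found in the training rows."""
    counts = Counter(row[TARGET_COL] for row in train_rows)
    return counts.most_common(1)[0][0]


def make_submission(test_rows, prediction):
    """Build submission CSV text that predicts the same value for every test row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow([ID_COL, TARGET_COL])
    for row in test_rows:
        writer.writerow([row[ID_COL], prediction])
    return buffer.getvalue()


# Made-up rows standing in for a competition's train and test files.
train_rows = [
    {"Id": "1", "Target": "0"},
    {"Id": "2", "Target": "1"},
    {"Id": "3", "Target": "0"},
]
test_rows = [{"Id": "4"}, {"Id": "5"}]

baseline = fit_majority_baseline(train_rows)
submission_csv = make_submission(test_rows, baseline)
print(submission_csv)
```

A baseline like this will not rank well, but it confirms that your submission format is right before you invest time in a real model.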

We have posted a few tutorial videos below to get you started.

How do you submit to Kaggle?

How do you build an initial model for a Kaggle competition in R?

How do you improve your model further by building a predictive model to predict missing values in R?

Competing in Kaggle using Azure Machine Learning Studio

Kaggle and Data Science Dojo

Data Science Dojo hosts a 5-day Data Science and Data Engineering bootcamp to expose people to the entire breadth of data science. Bootcamp attendees are required to participate in a Kaggle capstone project that spans all five days. The goal of the project is for each attendee to take what they learned during the day and apply it to their Kaggle model. At the end of the bootcamp, the top three performers receive a prize.

Before the end of the bootcamp, students are encouraged to form teams and enter a Kaggle competition together once the bootcamp is over. Each team is then paired with an industry mentor.