Get a step ahead of your competitors with a concise collection of smart data handling and modeling techniques
Key Features
Learn how Kaggle works and how to make the most of competitions from two expert Kagglers
Sharpen your modeling skills with ensembling, feature engineering, adversarial validation, AutoML, transfer learning, and techniques for parameter tuning
Discover tips, tricks, and best practices for winning on Kaggle and becoming a better data scientist
Book Description
Millions of data enthusiasts from around the world compete on Kaggle, the most famous data science competition platform of them all. Participating in Kaggle competitions is a surefire way to improve your data analysis skills, network with the rest of the community, and gain valuable experience to help grow your career.
The first book of its kind, Data Analysis and Machine Learning with Kaggle assembles the techniques and skills you’ll need for success in competitions, data science projects, and beyond. Two masters of Kaggle walk you through modeling strategies you won’t easily find elsewhere, and the tacit knowledge they’ve accumulated along the way. As well as Kaggle-specific tips, you’ll learn more general techniques for approaching tasks based on image data, tabular data, textual data, and reinforcement learning. You’ll design better validation schemes and work more comfortably with different evaluation metrics.
Whether you want to climb the ranks of Kaggle, build some more data science skills, or improve the accuracy of your existing models, this book is for you.
What you will learn
Get acquainted with Kaggle and other competition platforms
Make the most of Kaggle Notebooks, Datasets, and Discussion forums
Understand different modeling tasks including binary and multi-class classification, object detection, NLP (Natural Language Processing), and time series
Design good validation schemes, learning about k-fold, probabilistic, and adversarial validation
Get to grips with evaluation metrics including MSE and its variants, precision and recall, IoU, mean average precision at k, as well as never-before-seen metrics
Handle simulation and optimization competitions on Kaggle
Create a portfolio of projects and ideas to get further in your career
Who This Book Is For
This book is suitable for Kaggle users and data analysts/scientists of all experience levels who are trying to do better in Kaggle competitions and secure jobs with tech giants.
Table of Contents
Introducing Data Science competitions
Organizing Data with Datasets
Working and learning with Kaggle Notebooks
Leveraging Discussion forums
Detailing competition tasks and metrics
Designing good validation schemes
Ensembling and stacking solutions
Modeling for tabular competitions
Modeling for image classification and segmentation
Modeling for Natural Language Processing
Handling simulation and optimization competitions
Creating your portfolio of projects and ideas
Finding new professional opportunities
Konrad Banachewicz holds a PhD in statistics from Vrije Universiteit Amsterdam. He is a lead data scientist at eBay and a Kaggle Grandmaster. He has worked at a variety of financial institutions on a wide array of quantitative data analysis problems. In the process, he became an expert on the entire life cycle of a data product.
Having joined Kaggle over 10 years ago, Luca Mass...