Dataset: https://www.kaggle.com/datasets/patrickb1912/ipl-complete-dataset-20082020

In this article, I explore the IPL data from 2008 to 2024. From top scorers to team head-to-head stats, here's what I found in my analysis.

The dataset contains 2 CSV files:

1. matches.csv - This file has the details of each match.

2. deliveries.csv - This file contains the ball-by-ball data of each match.

This is how they look:

1. matches.csv:

image.png

2. deliveries.csv:

image.png

The first step for any analysis is to clean the data and remove any duplicates if present.
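As a rough illustration of the duplicate check, here is a minimal pandas sketch. It uses a tiny made-up table instead of the real matches.csv, and the column names (id, team1, team2) are assumptions for the example only.

```python
import pandas as pd

# Toy stand-in for matches.csv; the rows and column names are illustrative assumptions.
matches = pd.DataFrame({
    "id": [1, 2, 2, 3],
    "team1": ["Mumbai Indians", "Chennai Super Kings",
              "Chennai Super Kings", "Delhi Capitals"],
    "team2": ["Kolkata Knight Riders", "Rajasthan Royals",
              "Rajasthan Royals", "Punjab Kings"],
})

# Count rows that are exact copies of an earlier row.
n_duplicates = int(matches.duplicated().sum())

# Keep the first occurrence of each row and rebuild a clean index.
matches_dedup = matches.drop_duplicates().reset_index(drop=True)

print(f"duplicate rows: {n_duplicates}, rows kept: {len(matches_dedup)}")
```

On the real data, the same two calls would be applied to the DataFrame returned by pd.read_csv.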

Step-I: Cleaning/removing the unwanted data

After looking at the data in matches.csv, I checked the percentage of null/NaN values in each column. Interestingly, the method column has 98.082% null/empty values, so I dropped the column completely. Here is the null-value percentage of each column:

image.png
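The null-percentage check and the column drop can be sketched roughly as follows. This is not necessarily the exact code used for the table above; it runs on a small made-up DataFrame, and only the method column name comes from the dataset description (id and city are assumptions for the example).

```python
import numpy as np
import pandas as pd

# Toy stand-in for matches.csv; the values are invented for illustration.
matches = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "city": ["Mumbai", None, "Chennai", "Delhi"],
    "method": [np.nan, np.nan, np.nan, "D/L"],
})

# isnull() marks missing cells; the column-wise mean is the fraction missing.
null_pct = (matches.isnull().mean() * 100).round(3).sort_values(ascending=False)
print(null_pct)

# Drop the column that is almost entirely empty.
matches_clean = matches.drop(columns=["method"])
```

Taking the mean of the boolean mask gives the share of missing values per column directly, so there is no need to divide by the row count by hand.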

Step-II: Removing the duplicates or renaming as per the requirements

The unique teams initially are:

{'Kings XI Punjab', 'Rajasthan Royals', 'Delhi Capitals', 'Pune Warriors', 'Deccan Chargers', 'Rising Pune Supergiants', 'Gujarat Lions', 'Lucknow Super Giants', 'Delhi Daredevils', 'Mumbai Indians', 'Sunrisers Hyderabad', 'Punjab Kings', 'Rising Pune Supergiant', 'Gujarat Titans', 'Royal Challengers Bangalore', 'Chennai Super Kings', 'Royal Challengers Bengaluru', 'Kochi Tuskers Kerala', 'Kolkata Knight Riders'}
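One way to build a set like the one above is to combine the two team columns of the matches table. The sketch below assumes those columns are named team1 and team2 and uses a few invented rows instead of the real file.

```python
import pandas as pd

# Toy stand-in for matches.csv; team1/team2 column names are assumptions.
matches = pd.DataFrame({
    "team1": ["Royal Challengers Bangalore", "Kings XI Punjab",
              "Royal Challengers Bengaluru"],
    "team2": ["Mumbai Indians", "Chennai Super Kings", "Punjab Kings"],
})

# A team can appear on either side of a fixture, so take the union of both columns.
unique_teams = set(matches["team1"]) | set(matches["team2"])
print(sorted(unique_teams))
```

Printing the sorted set makes near-duplicate names easier to spot, since old and new spellings of the same franchise tend to land next to each other.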


As seen above, we can observe that Royal Challengers Bangalore and Royal Challengers Bengaluru