EV Growth Insights: Comprehensive Data on Registrations and Charging Stations

Introduction
The primary goal of the project is to analyze the number of Electric Vehicles (EV) registered per state in comparison to the availability of Public Charging Stations. This EV analysis aims to provide insights that can guide business decisions, such as identifying states with a higher demand for public charging infrastructure.
Method
This project begins with the process of downloading, cleaning, and preparing two datasets sourced from Kaggle and U.S. Department of Energy for Electric Vehicle (EV). The project utilizes Python and a range of libraries to conduct data cleaning and merging, culminating in creating a unified dataset tailored for analysis in Tableau. The datasets featured in this tutorial are “EV Registration Counts by State” and “EV Stations by State.”
Data Source – Kaggle
Downloaded two datasets from Kaggle and U.S. Department of Energy:
Data Cleaning and Preparation
- Imported necessary libraries:
- Pandas for data manipulation
- Loaded the “EV Registration by State” dataset using Pandas.
- Checked for missing values and filled them with zeros in the ‘Registration Count’ column.
- Capitalized state names for consistency.
- Checked summary statistics and visualized the distribution of registration counts.
- Identified states with the highest and lowest registration counts.
import pandas as pd
# Load the dataset
df_registration = pd.read_csv('C:\\Users\\EV Registration by State.csv')
# Check for missing values
missing_values = df_registration.isnull().sum()
print("Missing Values:\n", missing_values)
# Fill missing values with zeros (assuming missing values indicate zero registrations)
df_registration['Registration Count'].fillna(0, inplace=True)
# Capitalize state names for consistency
df_registration['State'] = df_registration['State'].str.capitalize()
# Explore the dataset
print("Summary Statistics:\n", df_registration.describe())- Cleaned and filtered the “EV Stations by State” dataset.
- Eliminated rows for countries other than the United States.
- Kept only the ‘State’ and ‘Access Code’ columns.
- Counted charging stations per state grouped by ‘Access Code(Public or Private).’
import pandas as pd
# Load the dataset
df_stations = pd.read_csv("C:\\Users\\Projects\\EV Stations by State.csv")
# Filter rows where the country is 'USA'
df_usa_stations = df_stations[df_stations['Country'] == 'US']
# Select only the 'State' and 'Access code' columns
df_cleaned_stations = df_usa_stations[['State', 'Access Code']]
# Check the first few rows of the cleaned DataFrame to verify the data
print(df_cleaned_stations.head())
# Write the cleaned DataFrame to a CSV file
#df_cleaned_stations.to_csv('Cleaned_EV_Stations.csv', index=False)
#the State column lists each occurrence of an EV charging station
#the Access Code column lists whether the EV charging station is public or private
#Count each charging station per state grouped by Access Code
# Group the DataFrame by 'State' and 'Access Code', and count the occurrences
charging_station_counts = df_cleaned_stations.groupby(['State', 'Access Code']).size().unstack(fill_value=0)
# Rename the columns for clarity
charging_station_counts.columns = ['Private Stations', 'Public Stations']
# Reset the index to make 'State' a regular column
charging_station_counts = charging_station_counts.reset_index()
# Display the resulting DataFrame
print(charging_station_counts)
# Save the charging station counts DataFrame to a CSV file
charging_station_counts.to_csv('EV_Stations_Counts_by_State.csv', index=False)
Merging Datasets
- Merged the cleaned “EV Registration Counts by State” and “EV Stations by State” datasets by the ‘State’ column to create a new dataset to Show State, Registration Counts, and Public Station Count.
- Filled missing values in the ‘Public Stations’ column with zeros.
import pandas as pd
# Load the EV Registration dataset
df_registration = pd.read_csv('C:\\Users\\reish\\OneDrive\\Projects\\EV_Registration_Counts_by_State.csv')
# Load the EV Stations dataset
df_stations = pd.read_csv('C:\\Users\\reish\\OneDrive\\Projects\\EV_Stations_Counts_by_State.csv')
# Merge the two datasets on the 'State' column
merged_df = pd.merge(df_registration, df_stations[['State', 'Public Stations']], on='State', how='left')
# Fill missing values in the 'Public Stations' column with 0 (assuming missing values mean 0 public stations)
merged_df['Public Stations'].fillna(0, inplace=True)
# Save the merged dataset to a new CSV file
merged_df.to_csv('Merged_EV_Data.csv', index=False, columns=['State', 'Registration Count', 'Public Stations'])Python and Pandas were used for data cleaning and merging, ensuring that the final dataset is ready for analysis in Tableau. The cleaned dataset includes state-level EV registration counts and the number of public charging stations, which will be used for in-depth analysis and visualizations in Tableau.
Visualizations in Tableau
The datasets, EV Registration Counts by State and EV Stations by State, were imported into Tableau. A worksheet displaying EV registration counts by state as a bar graph was created. A separate worksheet displaying the number of stations per state as another bar graph was created. The two worksheets were combined into a dashboard, allowing for side-by-side comparison. A state filter was added to the dashboard, enabling users to filter data by selecting a specific state of interest. The states were color-coded for visual clarity and differentiation.
Two maps were created in separate worksheets. One map displayed EV charging stations per state, and the other showed the EV per state ratio. A dashboard was formed by combining these two maps. The dashboard allowed users to filter data by selecting a specific state. The second map had an additional filter, enabling users to refine results based on EV station vs. EV registration ratio (below 20 or above 20).
The visualizations provided insights into states with varying EV registration counts and the corresponding availability of public charging stations. Users could identify states with a high demand for public charging infrastructure based on registration counts and station ratios.
Three separate bar graphs were created, each representing registration counts, public charging stations, and the ratio of stations to registrations. These bar graphs were combined into a dashboard for comprehensive analysis.
The EV analysis and visualizations allowed for a holistic view of EV registrations and charging station distribution.
Decision-makers could identify states with potential business opportunities for expanding charging infrastructure.
Insights highlighted areas where the station-to-registration ratio was either sufficient or needed improvement.
Conclusion
In this comprehensive project, we embarked on a journey to analyze and visualize Electric Vehicle (EV) data to extract meaningful insights. We started by leveraging Python and Pandas to clean, merge, and prepare two essential datasets from US Department of Energy: EV registration counts by state and Kaggle: the number of public charging stations. This crucial data cleaning and merging process ensured that we had a pristine dataset primed for analysis in Tableau.
With our data prepped and ready, we delved into the realm of Tableau, where we crafted a series of insightful visualizations that would serve as powerful tools for data-driven decision-making within the electric vehicle industry. Our visualizations included bar graphs showcasing EV registration counts and public charging station distribution, as well as interactive maps highlighting state-level EV statistics and ratios.
The significance of our work lies in the capacity to inform strategic choices and investments in the realm of public charging infrastructure. By following the steps outlined in this project, we gained access to invaluable insights. These insights can guide business decisions and aid in prioritizing states that are ripe for further development in public charging infrastructure. Whether it’s identifying states with EV interest or pinpointing regions with an urgent need for expanded charging options, our project has demonstrated how data-driven decisions are the driving force behind sustainable electric vehicle growth.
In conclusion, our collaborative effort between Python and Tableau has resulted in a comprehensive toolkit for EV analysis and visualization. These tools are essential for not only understanding current EV trends but also shaping the future of sustainable transportation infrastructure.
Datasources and Analysis Tools
- ames Lemieux. (n.d.). EV Chargers. Kaggle. https://www.kaggle.com/datasets/jameslemieux2/ev-chargers/
- Alternative Fuels Data Center. (2023). EV Registration Counts by State. https://afdc.energy.gov/files/u/data/data_source/10962/10962-ev-registration-counts-by-state_8-2-23.xlsx
- Python. (n.d.). https://www.python.org/
- Tableau. (n.d.). https://www.tableau.com/