Why I Chose the “Police Department Incident Reports: 2018 to Present” Dataset from the City of San Francisco
For my final EDA project, I did not know where to start. After discovering Kaggle, I found endless datasets covering my many hobbies and interests. However, I felt this project was my chance to show what I have learned over the semester with Jupyter Notebooks. I decided to use the “Police Department Incident Reports: 2018 to Present” data from data.sfgov.org. I had attempted to use this dataset at the beginning of the semester but decided against it because I did not yet have the tools and knowledge to tell the story I wanted to tell.
In 2020, the coronavirus swept the world and led to stay-at-home orders. My hometown of San Francisco was under such an order, so many people went out less often, which would change both how often incidents occurred and what types of incidents occurred.
Neighborhoods of San Francisco Analysis
The top two neighborhoods in both 2019 and 2020 were the Tenderloin and the Mission. Combined, the two neighborhoods made up 20% of incidents in 2019 and 19% in 2020. The Tenderloin had 14,564 incidents in 2019 and 11,428 in 2020, while the Mission had 15,399 in 2019 and 11,316 in 2020. Keep in mind that there are 36 different neighborhoods in San Francisco, which means that 2 of the 36 accounted for about a fifth of all incidents in both years. On the other hand, the bottom two neighborhoods, Seacliff and McLaren Park, stayed at a very similar number of incidents, in the 300–400 range, in both years.
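The combined share quoted above is a simple calculation: sum the counts for the top n neighborhoods and divide by the citywide total. A minimal sketch in plain Python follows. The function name `top_n_share` and the toy counts are made up for illustration and are not figures from the dataset.

```python
from collections import Counter


def top_n_share(counts, n=2):
    """Return (names of the top-n neighborhoods, their combined share)."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no incidents to summarize")
    # most_common(n) gives the n largest (name, count) pairs, largest first.
    top = Counter(counts).most_common(n)
    share = sum(count for _, count in top) / total
    return [name for name, _ in top], share


# Toy counts for illustration only (NOT the real data).
toy_counts = {"A": 50, "B": 30, "C": 15, "D": 5}
names, share = top_n_share(toy_counts, n=2)
print(names, f"{share:.0%}")
```

In a notebook, the per-neighborhood counts could come from a pandas `value_counts()` call on the neighborhood column, assuming the data has been loaded into a DataFrame.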
Even though each neighborhood's share of incidents did not change much, the number of incidents and the types of incidents did. There were 31,068 fewer incidents in 2020 than in 2019, a 6% decrease. That would be good news for any city, but a stay-at-home order was in effect for nine of the 12 months of 2020.
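A percentage decrease is the drop divided by the starting total. As a rough check, the sketch below backs out a 2019 citywide total from figures already quoted in this write-up (the top two neighborhoods' combined count and their 20% share). That implied total is an estimate based on a rounded percentage, not a count taken from the dataset, and `percent_change` is just an illustrative helper name.

```python
def percent_change(old, new):
    """Relative change from old to new as a fraction (negative = decrease)."""
    if old == 0:
        raise ValueError("old value must be non-zero")
    return (new - old) / old


# Figures quoted in the text.
top_two_2019 = 14_564 + 15_399  # Tenderloin + Mission, 2019
share_2019 = 0.20               # their stated share of 2019 incidents
decrease = 31_068               # stated drop from 2019 to 2020

# ASSUMPTION: estimate the 2019 citywide total from the rounded share.
implied_total_2019 = top_two_2019 / share_2019
implied_total_2020 = implied_total_2019 - decrease

change = percent_change(implied_total_2019, implied_total_2020)
print(f"implied 2019 total: {implied_total_2019:,.0f}")
print(f"implied change: {change:.1%}")
```

Under that assumption, a drop of 31,068 works out to roughly a fifth of 2019's incidents, so the percentage decrease is worth re-checking against the raw row counts for each year.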
The incident numbers do not tell the full story. We also have to look at the differences in size and population between the neighborhoods discussed above. The population data is from 2016, so the per capita figures are approximate. The Mission and the Tenderloin are very different in size: the Mission covers 1.48 sq mi and the Tenderloin 0.35 sq mi. That is a difference of 1.13 sq mi, and for two neighborhoods so different in size to have such similar incident counts is very concerning. The Tenderloin’s incidents per capita were 0.52 in 2019 and 0.4 in 2020, which is very high compared to the Mission’s 0.26 in 2019 and 0.19 in 2020.
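Both comparisons above are rates: incidents divided by residents, or incidents divided by area. The sketch below uses the incident counts and areas quoted in the text. The population values are round placeholders chosen only to be in line with the per capita ratios above; they are assumptions, not the 2016 source figures, and should be replaced with the real ones.

```python
# Figures quoted in the text.
incidents = {
    "Tenderloin": {2019: 14_564, 2020: 11_428},
    "Mission": {2019: 15_399, 2020: 11_316},
}
area_sq_mi = {"Tenderloin": 0.35, "Mission": 1.48}

# ASSUMPTION: placeholder populations, not the actual 2016 figures.
population = {"Tenderloin": 28_000, "Mission": 59_000}


def rate(count, denominator):
    """Incidents per unit of the denominator (resident or square mile)."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return count / denominator


for hood, by_year in incidents.items():
    for year, count in by_year.items():
        per_capita = rate(count, population[hood])
        per_sq_mi = rate(count, area_sq_mi[hood])
        print(f"{hood} {year}: {per_capita:.2f} per resident, "
              f"{per_sq_mi:,.0f} per sq mi")
```

On the quoted 2019 counts and areas alone, the Tenderloin comes out at roughly four times the Mission's incidents per square mile, which makes the size comparison concrete without relying on the population estimates.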
The most common incident is theft from a locked vehicle of more than $950, which is a wobbler crime, meaning that, depending on the prosecutor, it can be charged as either a felony or a misdemeanor. This is not surprising, because San Francisco has always had a problem with car break-ins. The incidents that follow it in frequency are more surprising. In 2019, the next most common incidents were “lost property”; “theft, other property, $50-$200”; “battery”; and “malicious mischief, vandalism to property”. In 2020, on the other hand, they were “malicious mischief, vandalism to property”; “vehicle, recovered, auto”; “vehicle, stolen, auto”; and “theft, other property, $50-$200”. Vehicle-related incidents increased, which I connect to the stay-at-home order: people were not allowed to go out and had to stay indoors, meaning that cars sat unattended for days, sometimes weeks or even months.
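A ranking like the one above comes from counting incident descriptions separately for each year and taking the most frequent. Here is a minimal sketch in plain Python. The rows are toy records for illustration only, the helper name `top_descriptions` is made up, and the keys "Incident Year" and "Incident Description" are assumed to match the column headers in the export.

```python
from collections import Counter, defaultdict

# Toy rows for illustration only (NOT real records).
# ASSUMPTION: key names mirror the dataset's column headers.
rows = (
    [{"Incident Year": 2019, "Incident Description": "Lost Property"}] * 3
    + [{"Incident Year": 2019, "Incident Description": "Battery"}] * 2
    + [{"Incident Year": 2019, "Incident Description": "Vehicle, Stolen, Auto"}]
    + [{"Incident Year": 2020, "Incident Description": "Vehicle, Stolen, Auto"}] * 3
    + [{"Incident Year": 2020, "Incident Description": "Lost Property"}] * 2
    + [{"Incident Year": 2020, "Incident Description": "Battery"}]
)


def top_descriptions(rows, k=3):
    """Map each year to its k most common (description, count) pairs."""
    by_year = defaultdict(Counter)
    for row in rows:
        by_year[row["Incident Year"]][row["Incident Description"]] += 1
    return {year: counts.most_common(k) for year, counts in by_year.items()}


result = top_descriptions(rows, k=2)
for year in sorted(result):
    print(year, result[year])
```

Comparing the per-year lists side by side is what shows vehicle-related descriptions moving up the ranking from one year to the next.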
What It All Means
We discussed the number of incidents in the Tenderloin and the Mission and which neighborhoods saw incidents most frequently. You may be asking what it all means. Based on 2016 data, the Mission has a much higher population than the Tenderloin, almost twice as much, yet the data shows about twice as many incidents per capita in the Tenderloin as in the Mission. The most common incidents involved stolen property. This data shows us that, stay-at-home order or not, incidents in San Francisco stay concentrated in the same neighborhoods and similar types of incidents occur. In a city that has many issues with its infrastructure, something needs to change to reduce these numbers.