Analysis of Crime Data in New York City in 2020

This project uses NYPD Complaint Data Current (Year To Date) to analyze crime events in New York City from January 2020 to June 2020.


As a foreign student in the United States, I was a little concerned about public safety when I first arrived in New York City. Because the city is so densely populated, I was afraid that its crime rates would be higher than in other places. However, the statistics show that New York City has become a much safer place than before. As a student of data analysis and data visualization, I want to explore the recent crime data in New York City to get a more comprehensive understanding of its public safety.

Research Question

Since this is a relatively broad topic that can be analyzed from various angles, I am going to narrow it down to two research questions for this project:

  • When do different types of crimes happen in NYC (day/night, weekday/weekend)?

Dataset Used

The dataset I used for this project is NYPD Complaint Data Current (Year To Date), published on NYC Open Data by the New York City Police Department (NYPD). This dataset includes all valid felony, misdemeanor, and violation crimes reported to the NYPD for all complete quarters so far this year (2020).

Data Pre-processing

Although the dataset contains all complaints reported to the NYPD in 2020, not all of them occurred in 2020. The latest recorded crime event occurred on 6/30/2020, so I filtered the dataset by the exact date of occurrence to keep only events that occurred from January to June 2020.
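The tool used for this filtering step is not stated, so the following is only a minimal sketch in standard-library Python. It assumes the occurrence date lives in a column named CMPLNT_FR_DT and is formatted as MM/DD/YYYY; both are assumptions about the export, and the sample rows are made up for illustration.

```python
from datetime import date, datetime

# Assumed column name and MM/DD/YYYY date format; adjust to the actual export.
DATE_COLUMN = "CMPLNT_FR_DT"
START, END = date(2020, 1, 1), date(2020, 6, 30)


def parse_occurrence_date(text):
    """Return the occurrence date, or None when it is missing or malformed."""
    try:
        return datetime.strptime(text.strip(), "%m/%d/%Y").date()
    except (AttributeError, ValueError):
        return None


def filter_first_half_2020(rows):
    """Keep rows whose occurrence date falls within January-June 2020."""
    kept = []
    for row in rows:
        occurred = parse_occurrence_date(row.get(DATE_COLUMN))
        if occurred is not None and START <= occurred <= END:
            kept.append(row)
    return kept


# Made-up sample rows, only to illustrate the filter.
sample_rows = [
    {"CMPLNT_FR_DT": "01/15/2020"},
    {"CMPLNT_FR_DT": "12/31/2019"},  # reported in 2020, occurred earlier
    {"CMPLNT_FR_DT": "06/30/2020"},
    {"CMPLNT_FR_DT": ""},            # missing date is dropped
]
filtered = filter_first_half_2020(sample_rows)
print(len(filtered), "of", len(sample_rows), "rows kept")
```

Rows with a missing or unparseable date are dropped rather than guessed, which matches filtering on the exact date of occurrence.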

  • Number of variables: 36

Basic Information of Character Variables
Basic Information of Data and Numeric Variables

Data Analysis: Overview

Crime Level and Status

Crimes are classified into three levels (felony, misdemeanor, and violation), and crime status records whether the crime was successfully completed, or was attempted but failed or was interrupted prematurely. As we can see from the visualization, far more crimes are completed than attempted. Misdemeanors have the highest number of completed and attempted crimes, followed by felonies and violations.
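The counts behind a chart like this can be produced with a simple cross-tabulation. The sketch below is one possible way to do it in Python, not necessarily how the visualization was built. The column names LAW_CAT_CD (crime level) and CRM_ATPT_CPTD_CD (crime status) are assumptions about the export, and the rows are invented sample data.

```python
from collections import Counter

# Assumed column names; adjust to the actual export.
LEVEL_COLUMN = "LAW_CAT_CD"
STATUS_COLUMN = "CRM_ATPT_CPTD_CD"


def count_by_level_and_status(rows):
    """Count complaints for every (crime level, crime status) pair."""
    return Counter((row[LEVEL_COLUMN], row[STATUS_COLUMN]) for row in rows)


# Made-up sample rows, only to illustrate the cross-tabulation.
sample_rows = [
    {"LAW_CAT_CD": "MISDEMEANOR", "CRM_ATPT_CPTD_CD": "COMPLETED"},
    {"LAW_CAT_CD": "MISDEMEANOR", "CRM_ATPT_CPTD_CD": "COMPLETED"},
    {"LAW_CAT_CD": "FELONY", "CRM_ATPT_CPTD_CD": "COMPLETED"},
    {"LAW_CAT_CD": "FELONY", "CRM_ATPT_CPTD_CD": "ATTEMPTED"},
    {"LAW_CAT_CD": "VIOLATION", "CRM_ATPT_CPTD_CD": "COMPLETED"},
]
level_status_counts = count_by_level_and_status(sample_rows)

# Print one line per (level, status) pair in alphabetical order.
for (level, status), n in sorted(level_status_counts.items()):
    print(f"{level:<12} {status:<10} {n}")
```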

Crime Level and Location

The dataset also provides the specific location of the crime in or around the premises: inside, opposite of, in front of, or to the rear of the premises. From the visualization below, we can see that over 50% of the crimes at all three levels happened inside the premises. Violations have the highest percentage of crimes happening inside (66.98%), and a larger share of misdemeanors and felonies (28%-30%) than of violations (22%) happened outside. I think the reason is that misdemeanors and felonies are more likely than violations to be carried out in the public space outside the premises.
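Percentages like these are shares within each crime level, so each level's counts are divided by that level's own total. The sketch below shows one way to compute them in Python. The column names LAW_CAT_CD and LOC_OF_OCCUR_DESC are assumptions about the export, skipping rows with a blank location is my own choice rather than something stated above, and the sample rows are invented.

```python
from collections import Counter, defaultdict

# Assumed column names; adjust to the actual export.
LEVEL_COLUMN = "LAW_CAT_CD"
LOCATION_COLUMN = "LOC_OF_OCCUR_DESC"


def location_share_by_level(rows):
    """Return {level: {location: percent}} with percentages per crime level."""
    counts = defaultdict(Counter)
    for row in rows:
        location = (row.get(LOCATION_COLUMN) or "").strip()
        if not location:
            continue  # assumption: rows without a recorded location are skipped
        counts[row[LEVEL_COLUMN]][location] += 1

    shares = {}
    for level, by_location in counts.items():
        total = sum(by_location.values())
        shares[level] = {
            location: round(100 * n / total, 2)
            for location, n in by_location.items()
        }
    return shares


# Made-up sample rows, only to illustrate the calculation.
sample_rows = [
    {"LAW_CAT_CD": "VIOLATION", "LOC_OF_OCCUR_DESC": "INSIDE"},
    {"LAW_CAT_CD": "VIOLATION", "LOC_OF_OCCUR_DESC": "INSIDE"},
    {"LAW_CAT_CD": "VIOLATION", "LOC_OF_OCCUR_DESC": "FRONT OF"},
    {"LAW_CAT_CD": "VIOLATION", "LOC_OF_OCCUR_DESC": ""},
    {"LAW_CAT_CD": "FELONY", "LOC_OF_OCCUR_DESC": "INSIDE"},
    {"LAW_CAT_CD": "FELONY", "LOC_OF_OCCUR_DESC": "FRONT OF"},
]
location_shares = location_share_by_level(sample_rows)
print(location_shares)
```

Whether blank locations are excluded or counted as their own category changes every percentage, so that choice should be made explicitly.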

Rank of the Crime Type

I further looked into the crime/offense types. Based on the NYPD’s description of the offense corresponding to each key code, there are a total of 58 types of offense. The following visualization lists the top 10 crime types. The most common crimes are petit larceny, harassment, and criminal mischief.
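A ranking like this comes down to counting rows per offense description and keeping the largest counts. The sketch below is an illustration in Python. The column name OFNS_DESC is an assumption about the export, and the offense labels and counts are made up rather than taken from the real data.

```python
from collections import Counter

# Assumed column name; adjust to the actual export.
OFFENSE_COLUMN = "OFNS_DESC"


def top_offenses(rows, n=10):
    """Return the n most frequent offense descriptions with their counts."""
    counts = Counter(
        row[OFFENSE_COLUMN] for row in rows if row.get(OFFENSE_COLUMN)
    )
    return counts.most_common(n)


# Made-up sample rows, only to illustrate the ranking.
sample_rows = (
    [{"OFNS_DESC": "PETIT LARCENY"}] * 3
    + [{"OFNS_DESC": "HARASSMENT"}] * 2
    + [{"OFNS_DESC": "CRIMINAL MISCHIEF"}]
    + [{"OFNS_DESC": ""}]  # rows without a description are ignored
)
ranking = top_offenses(sample_rows, n=2)
for offense, count in ranking:
    print(f"{offense}: {count}")
```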

Suspects’ Demographic

From the visualizations below, we can see that most of the suspects are between 25 and 44 years old, and 78.72% of suspects are male.

Victims’ Demographic

From the visualizations below, we can see that most of the victims are between 25 and 64 years old, and 51.95% of victims are female. Compared with suspects, victims are more likely to be older and female.

Data Analysis: Crime Events by Borough

Total Number of Crimes by Borough

According to the visualization, Brooklyn has the highest number of crimes, followed by Manhattan, the Bronx, and Queens. Staten Island has the lowest number of crimes.

Crime Distribution by Borough

Crimes are divided into three levels: felony, misdemeanor, and violation. Violations are the most minor offenses. Misdemeanors are the middle level: more serious than violations but less severe than felonies, they can carry up to a year in jail. Felonies are the most serious offenses and require a more thorough classification.

Mapping for Crime in Boroughs

New York 2010 Census Tract and Neighborhood Tabulation Area data are used to map the crime data in different boroughs. For this project, I chose the two boroughs with the highest number of crimes (Brooklyn and Manhattan) to show which parts of each borough have relatively higher numbers of recorded crimes.

Data Analysis: Crime Events by Time

Crimes per Month

From the graph we can see that the number of crimes generally decreased from January to June, and April has the lowest number of crimes. This may be due to the coronavirus pandemic, which forced people to stay at home and reduced the opportunity for crime.
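Monthly totals can be derived by grouping the occurrence dates by month. The sketch below shows the idea in Python. It assumes a CMPLNT_FR_DT column in MM/DD/YYYY format and data already restricted to January-June 2020; the sample rows are invented and do not reproduce the real counts.

```python
from collections import Counter
from datetime import datetime

# Assumed column name and MM/DD/YYYY date format; adjust to the actual export.
DATE_COLUMN = "CMPLNT_FR_DT"


def crimes_per_month(rows, first_month=1, last_month=6):
    """Return [(month, count), ...] for each month in the chosen range."""
    counts = Counter(
        datetime.strptime(row[DATE_COLUMN], "%m/%d/%Y").month for row in rows
    )
    # Months without any rows are reported as 0 instead of being omitted.
    return [(m, counts[m]) for m in range(first_month, last_month + 1)]


# Made-up sample rows, only to illustrate the grouping.
sample_rows = [
    {"CMPLNT_FR_DT": "01/03/2020"},
    {"CMPLNT_FR_DT": "01/20/2020"},
    {"CMPLNT_FR_DT": "02/14/2020"},
    {"CMPLNT_FR_DT": "06/30/2020"},
]
monthly = crimes_per_month(sample_rows)
lowest_month = min(monthly, key=lambda pair: pair[1])[0]
print(monthly, "lowest month:", lowest_month)
```

Note that min() returns the first month with the smallest count, so ties resolve to the earliest month.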

Hourly Crime Rate

I further analyzed the number of crimes that happened in each hour of the day. From the following visualization, we can see that most crime events happened between 3 p.m. and 6 p.m. I am a little surprised by the result, since I expected crime events to be more likely at night, but the result shows that more happen in the afternoon. I think this may be because most people are getting off school or work during this time, which increases the probability of crimes. Fewer crime events happened between 1 a.m. and 7 a.m., possibly because most people are asleep at that time.
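Hourly counts, split by weekday and weekend, can be computed from the occurrence date and time. The sketch below is one way to do it in Python. The column names CMPLNT_FR_DT and CMPLNT_FR_TM and the MM/DD/YYYY and HH:MM:SS formats are assumptions about the export, and the sample rows are made up.

```python
from collections import Counter
from datetime import datetime

# Assumed column names and formats; adjust to the actual export.
DATE_COLUMN = "CMPLNT_FR_DT"
TIME_COLUMN = "CMPLNT_FR_TM"
TIMESTAMP_FORMAT = "%m/%d/%Y %H:%M:%S"


def hourly_counts_by_day_type(rows):
    """Count complaints per (day type, hour); day type is weekday or weekend."""
    counts = Counter()
    for row in rows:
        occurred = datetime.strptime(
            f"{row[DATE_COLUMN]} {row[TIME_COLUMN]}", TIMESTAMP_FORMAT
        )
        # datetime.weekday(): Monday is 0, so Saturday is 5 and Sunday is 6.
        day_type = "weekend" if occurred.weekday() >= 5 else "weekday"
        counts[(day_type, occurred.hour)] += 1
    return counts


# Made-up sample rows, only to illustrate the grouping.
sample_rows = [
    {"CMPLNT_FR_DT": "01/06/2020", "CMPLNT_FR_TM": "15:30:00"},  # Monday
    {"CMPLNT_FR_DT": "01/06/2020", "CMPLNT_FR_TM": "15:45:00"},  # Monday
    {"CMPLNT_FR_DT": "01/07/2020", "CMPLNT_FR_TM": "17:00:00"},  # Tuesday
    {"CMPLNT_FR_DT": "01/04/2020", "CMPLNT_FR_TM": "02:10:00"},  # Saturday
]
hourly = hourly_counts_by_day_type(sample_rows)

# Collapse the day type to get the overall count per hour of the day.
per_hour = Counter()
for (_, hour), n in hourly.items():
    per_hour[hour] += n
busiest_hour = per_hour.most_common(1)[0][0]
print(hourly, "busiest hour:", busiest_hour)
```

Since there are five weekdays and two weekend days per week, comparing raw weekday and weekend counts favors weekdays; dividing by the number of days of each type gives a fairer comparison.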

Next Step

From the data analysis above, I have a general understanding of the crime levels, crime types, and the demographics of suspects and victims. To answer my research questions, I found that different boroughs do have different crime levels and total numbers of crimes. In addition, I analyzed the crime events by time and learned how the hourly crime rate differs across hours of the day and between weekdays and weekends.

