Analysis of Crime Data in New York City in 2020

Using NYPD Complaint Data Current (Year To Date) to analyze crime events in New York City from January to June 2020.

Jingjing Ge
Oct 19, 2020

Introduction

As a foreign student in the United States, I was a little concerned about public safety when I first arrived in New York City. Because the city is so densely populated, I was afraid that its crime rate would be higher than in other places. However, statistics show that New York City has become a much safer place than it used to be. As a student of data analysis and data visualization, I want to explore recent crime data in New York City to get a more comprehensive understanding of its public safety.

Research Question

Since this is a relatively broad topic that can be analyzed from various angles, I am going to narrow it down to two research questions for this project:

  • How do crime types and levels differ across the boroughs of NYC?
  • When do different types of crimes happen in NYC (day/night, weekday/weekend)?

Dataset Used

The dataset I used for this project is NYPD Complaint Data Current (Year To Date) from NYC Open Data, provided by the New York City Police Department. This dataset includes all valid felony, misdemeanor, and violation crimes reported to the New York City Police Department (NYPD) for all complete quarters so far this year (2020).

Data Pre-processing

Although the dataset contains all complaints reported to the NYPD in 2020, not all of them occurred in 2020. Since the latest recorded crime event occurred on 6/30/2020, I filtered the dataset on the exact date of occurrence, keeping only events that took place from January to June 2020.
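The filtering step described above can be sketched in a few lines of plain Python. This is only an illustration, not the code used for the analysis: it assumes the occurrence date is stored in a column named `CMPLNT_FR_DT` in `MM/DD/YYYY` format, and the sample rows and the helper name `occurred_in_range` are made up for the example.

```python
from datetime import date, datetime

# Hypothetical sample rows. CMPLNT_FR_DT is assumed to hold the
# occurrence date of the event as an MM/DD/YYYY string.
records = [
    {"CMPLNT_NUM": "1", "CMPLNT_FR_DT": "12/31/2019"},  # reported in 2020, occurred in 2019
    {"CMPLNT_NUM": "2", "CMPLNT_FR_DT": "01/01/2020"},
    {"CMPLNT_NUM": "3", "CMPLNT_FR_DT": "06/30/2020"},
    {"CMPLNT_NUM": "4", "CMPLNT_FR_DT": ""},            # missing occurrence date
]

START, END = date(2020, 1, 1), date(2020, 6, 30)


def occurred_in_range(row, start=START, end=END):
    """Return True if the row's occurrence date falls within [start, end]."""
    raw = row.get("CMPLNT_FR_DT") or ""
    try:
        occurred = datetime.strptime(raw, "%m/%d/%Y").date()
    except ValueError:
        # Empty or malformed dates cannot be placed in the range, so drop them.
        return False
    return start <= occurred <= end


filtered = [row for row in records if occurred_in_range(row)]
print([row["CMPLNT_NUM"] for row in filtered])
```

Both endpoints are inclusive here, so events on January 1 and June 30 are kept, while rows with a missing or unparseable date are dropped.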

  • Number of observations: 190,178
  • Number of variables: 36

Each record contains information about the type and level of the crime, the location and time of the event, the demographics of both suspects and victims, and other related crime descriptions.

The following is a skim of the filtered dataset:

Basic Information of Character Variable
Basic Information of Data and Numeric Variable

As we can see from skimming the data, there are a lot of empty values. In the analysis that follows, missing, unknown, and empty values are excluded unless specifically noted.

Data Analysis: Overview

Crime Level and Status

Crimes are separated into three levels, and crime status indicates whether the crime was successfully completed or was attempted but failed or was interrupted prematurely. As the visualization shows, far more crimes are completed than attempted. Misdemeanors have the highest number of completed and attempted crimes, followed by felonies and violations.

Based on this information for each complaint, I wondered what the completed/attempted rate is for each crime level. For misdemeanors and violations, more than 99% of crimes were completed, with violations having the highest completion rate. However, only 96.28% of felonies were completed, and 3.72% were attempted. I think this is because a violation is a lesser offense that is much easier to complete, whereas a felony is the most serious offense, which may involve many other factors and be more difficult to carry out.
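A rate like this is a within-group percentage: for each crime level, count the complaints per status and divide by that level's total. The sketch below shows the idea with the standard library only. The column names `LAW_CAT_CD` (crime level) and `CRM_ATPT_CPTD_CD` (status) are assumptions about the dataset's schema, and the sample rows and the function name are invented for illustration.

```python
from collections import Counter, defaultdict


def status_share_by_level(rows):
    """Percentage of each status (completed/attempted) within each crime level."""
    counts = defaultdict(Counter)
    for row in rows:
        level = row.get("LAW_CAT_CD")         # assumed column: crime level
        status = row.get("CRM_ATPT_CPTD_CD")  # assumed column: completed/attempted
        if not level or not status:
            continue  # skip missing or empty values
        counts[level][status] += 1

    shares = {}
    for level, by_status in counts.items():
        total = sum(by_status.values())
        shares[level] = {
            status: round(100 * n / total, 2) for status, n in by_status.items()
        }
    return shares


# Hypothetical sample, not real complaint data.
sample = [
    {"LAW_CAT_CD": "FELONY", "CRM_ATPT_CPTD_CD": "COMPLETED"},
    {"LAW_CAT_CD": "FELONY", "CRM_ATPT_CPTD_CD": "COMPLETED"},
    {"LAW_CAT_CD": "FELONY", "CRM_ATPT_CPTD_CD": "COMPLETED"},
    {"LAW_CAT_CD": "FELONY", "CRM_ATPT_CPTD_CD": "ATTEMPTED"},
    {"LAW_CAT_CD": "VIOLATION", "CRM_ATPT_CPTD_CD": "COMPLETED"},
    {"LAW_CAT_CD": "VIOLATION", "CRM_ATPT_CPTD_CD": "COMPLETED"},
    {"LAW_CAT_CD": "MISDEMEANOR", "CRM_ATPT_CPTD_CD": ""},  # excluded: empty status
]

shares = status_share_by_level(sample)
print(shares)
```

Dividing by the per-level total rather than the overall total is what makes the percentages comparable across levels with very different complaint counts.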

Crime Level and Location

The dataset also provides the specific location of the occurrence in or around the premises: inside, opposite of, front of, or rear of the premises. From the visualization below, we can see that over 50% of crimes at all three levels happened inside the premises. Violations have the highest percentage happening inside (66.98%), and a larger share of misdemeanors and felonies (28% to 30%) happened outside than of violations (22%). I think the reason is that misdemeanors and felonies are more likely than violations to take place in public spaces outside the premises.

Rank of the Crime Type

I looked further into the crime/offense types. Based on the NYPD's description of offenses corresponding with key codes, there are a total of 58 offense types. The following visualization lists the top 10 crime types. The most common crimes are petit larceny, harassment, and criminal mischief.
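Ranking offense types comes down to counting complaints per offense description and keeping the most frequent ones. A minimal sketch with `collections.Counter` follows. The column name `OFNS_DESC` is an assumption about the dataset's schema, and the sample values and the function name are illustrative only.

```python
from collections import Counter


def top_offenses(rows, n=10):
    """Return the n most frequent offense descriptions as (name, count) pairs."""
    counts = Counter(
        row["OFNS_DESC"] for row in rows if row.get("OFNS_DESC")  # skip missing/empty
    )
    return counts.most_common(n)


# Hypothetical sample, not real complaint data.
sample = (
    [{"OFNS_DESC": "PETIT LARCENY"}] * 3
    + [{"OFNS_DESC": "HARRASSMENT 2"}] * 2
    + [{"OFNS_DESC": "CRIMINAL MISCHIEF & RELATED OF"}]
    + [{"OFNS_DESC": ""}]
)

for name, count in top_offenses(sample, n=2):
    print(f"{name}: {count}")
```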

Suspects’ Demographic

From the visualizations below, we can see that most of the suspects are between 25 and 44 years old, and 78.72% of suspects are male.

Victims’ Demographic

From the visualizations below, we can see that most of the victims are between 25 and 64 years old, and 51.95% of victims are female. Compared with suspects, victims skew older and are more often female.

Data Analysis: Crime Events by Borough

Total Number of Crimes by Borough

According to the visualization, Brooklyn has the highest number of crimes, followed by Manhattan, the Bronx, and Queens. Staten Island has the lowest number of crimes.

Crime Distribution by Borough

Crimes are divided into three levels: felony, misdemeanor, and violation. Violations are the most minor offenses. Misdemeanors are the second type of criminal offense: more serious than violations but less severe than felonies, they can carry up to a year in jail. Felonies are the most serious offenses and require a more thorough classification.

From the visualization below, we can see that misdemeanor is the most common crime level across all boroughs, accounting for more than 50% of the crimes in each borough (50% to 55%). Felony is the second most common crime level (20% to 32%), and violation is the least common (13% to 22%).

While Brooklyn has the highest number of misdemeanors, Manhattan has the highest percentage of misdemeanors within the borough (54.85%).

For felonies, Brooklyn has the highest number of events as well as the highest percentage (32.95%). Staten Island has a substantially lower percentage of felonies, at only 24.9%.

However, to my surprise, Staten Island has the highest percentage of violations (21.54%) among the five boroughs, and Manhattan has the lowest (13.51%).

Mapping for Crime in Boroughs

New York 2010 Census Tract and Neighborhood Tabulation Area data are used to map the crime data in different boroughs. For this project, I chose the two boroughs with the highest number of crimes (Brooklyn and Manhattan) to show which parts of each borough have relatively higher crime counts.

Brooklyn

From the map, we can see that the lower-right parts of Brooklyn have a relatively higher occurrence of crime events. The East New York and Crown Heights North neighborhood tabulation areas have the highest number of crimes.

Manhattan

From the map, we can see that Midtown Manhattan has a relatively higher occurrence of crime events. The Midtown-Midtown South and East Harlem North neighborhood tabulation areas have the highest number of crime events.

Data Analysis: Crime Events by Time

Crimes per Month

From the graph, we can see that the number of crimes generally declines over the January to June period, with April having the lowest number of crimes. This may be due to the coronavirus pandemic, which forced people to stay at home and reduced the opportunity for crime.

Hourly Crime Rate

I further analyzed the number of crimes that happened in each hour of the day. From the following visualization, we can see that most crime events happened between 3 p.m. and 6 p.m. I was a little surprised by this result, since I expected crime events to be more likely at night, but the data shows that more happen in the afternoon. I think this may be because most people are getting out of school or work during this time, which increases the opportunity for crime. Fewer crime events happened between 1 a.m. and 7 a.m., which may be because most people are asleep at that time.

Additionally, I looked into the difference in hourly crime counts between weekdays and weekends. The following graph shows that the trend is broadly similar for both, but more crimes happened during the day on weekdays than on weekends, and more crimes happened at night on weekends than on weekdays.
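To make the hourly breakdown concrete, here is a small sketch of how complaints could be bucketed by hour of day and split into weekday and weekend groups. It is an illustration under assumptions rather than the code behind the graphs: it assumes the occurrence date and time are stored in columns named `CMPLNT_FR_DT` (`MM/DD/YYYY`) and `CMPLNT_FR_TM` (`HH:MM:SS`), and the sample rows and function name are hypothetical.

```python
from collections import Counter
from datetime import datetime


def hourly_counts(rows):
    """Count events per hour of day, split into weekday and weekend groups."""
    counts = {"weekday": Counter(), "weekend": Counter()}
    for row in rows:
        try:
            stamp = datetime.strptime(
                f'{row["CMPLNT_FR_DT"]} {row["CMPLNT_FR_TM"]}',
                "%m/%d/%Y %H:%M:%S",
            )
        except (KeyError, ValueError):
            continue  # skip rows with a missing or malformed date/time
        # datetime.weekday(): Monday is 0, so 5 and 6 are Saturday and Sunday.
        kind = "weekend" if stamp.weekday() >= 5 else "weekday"
        counts[kind][stamp.hour] += 1
    return counts


# Hypothetical sample, not real complaint data.
sample = [
    {"CMPLNT_FR_DT": "06/26/2020", "CMPLNT_FR_TM": "15:30:00"},  # Friday
    {"CMPLNT_FR_DT": "06/26/2020", "CMPLNT_FR_TM": "15:05:00"},  # Friday
    {"CMPLNT_FR_DT": "06/27/2020", "CMPLNT_FR_TM": "23:10:00"},  # Saturday
    {"CMPLNT_FR_DT": "06/28/2020", "CMPLNT_FR_TM": ""},          # excluded: no time
]

counts = hourly_counts(sample)
print(dict(counts["weekday"]), dict(counts["weekend"]))
```

One design point worth keeping in mind: a week has five weekdays and only two weekend days, so raw counts will naturally be higher for weekdays. Dividing each group's counts by the number of distinct dates in that group would give a per-day average that is fairer to compare.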

Next Step

From the data analysis above, I gained a general understanding of crime levels, crime types, and the demographics of suspects and victims. To answer my research questions, I found that the boroughs do differ in their crime levels and total number of crimes. In addition, I analyzed crime events by time and learned how hourly crime counts differ across the hours of the day and between weekdays and weekends.

For future analysis, I would like to explore how hourly crime counts vary by crime type, borough, and crime level. It would also be interesting to compare this year's crime data with previous years', since 2020 is a special case due to the pandemic.
