Walking through your neighborhood one day, you notice a police car parked nearby. Most days, a police car is parked in the same general area.
Crime forecasting algorithms use historical police data to calculate the probability of future police interactions within a given geography or time frame. Law enforcement officials use those algorithms to label certain locations “at-risk” or certain people as “likely” to commit or be the victim of a crime, and then they base decisions about policing and prosecution on those labels.
Basics
“Crime forecasting” refers to two separate but related processes: predictive policing and data-driven prosecution. Both use historical police data to calculate the probability of future police activity — arrests, reported crimes and other police-civilian interactions — in a given area. Police use the algorithm’s calculations as a basis for targeting particular neighborhoods or particular people. Prosecutors use the algorithm’s calculations as a basis for charging decisions and sentencing recommendations.
What crime forecasting tools are based on
There are two types of predictive policing tools: place-based, which maps so-called “hot spot” areas based on historical police activity, and person-based, which uses historical police data to generate lists of individuals supposedly at risk of committing or being a victim of crime.
Place-based policing algorithms rely primarily on police department data, including 911 calls and community or police reports of suspected crime. They may give weight to data points such as reports of property crime or vandalism, juvenile arrests, the presence of people on parole or probation, and disorderly conduct calls. Some algorithms even give weight in their calculations to things like weather patterns, the presence of liquor stores and population density.
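The weighting described above can be illustrated with a minimal sketch. The grid size, incident categories and point values below are hypothetical, chosen only for illustration — vendors do not publish their actual weighting schemes.

```python
from collections import Counter

# Hypothetical weights per incident category (illustrative only --
# real products' weights and categories are not public).
WEIGHTS = {"911_call": 1.0, "property_crime": 1.5, "vandalism": 1.2,
           "juvenile_arrest": 2.0, "disorderly_conduct": 0.8}

def hot_spot_scores(incidents, cell_size=0.01):
    """Bucket historical incidents into map-grid cells and sum their weights.

    incidents: list of (lat, lon, category) tuples from police records.
    Returns a Counter mapping grid cells to an accumulated score; the
    highest-scoring cells are the ones flagged as "hot spots."
    """
    scores = Counter()
    for lat, lon, category in incidents:
        cell = (round(lat / cell_size), round(lon / cell_size))
        scores[cell] += WEIGHTS.get(category, 1.0)
    return scores

# Toy data: two incidents fall in the same grid cell, one elsewhere.
incidents = [
    (41.8781, -87.6298, "911_call"),
    (41.8783, -87.6301, "property_crime"),
    (41.9000, -87.7000, "vandalism"),
]
scores = hot_spot_scores(incidents)
print(scores.most_common(1))  # the cell with two incidents scores 2.5
```

Note that the output depends entirely on what went into the historical record: a cell with no reported incidents scores zero no matter what actually happened there.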
Person-based algorithms rely on data collected about individual histories of interaction with the criminal legal system. Those tools create lists of names and assign accompanying risk scores based on data from arrest records, inclusion in gang databases, parole and probation records, and police reports.
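A person-based score can be sketched the same way. The point values here are invented for illustration; the structure — summing points per record type drawn from police data — is the general pattern these tools are reported to follow.

```python
# Hypothetical point values per record type (illustrative only).
POINTS = {"arrest": 10, "gang_database": 25, "parole": 15,
          "probation": 15, "police_report": 5}

def risk_score(records):
    """Sum points for each criminal-legal-system record attached to a person.

    records: list of record-type strings drawn from police data.
    A higher total places the person higher on the "at-risk" list.
    """
    return sum(POINTS.get(r, 0) for r in records)

people = {
    "person_a": ["arrest", "arrest", "gang_database"],
    "person_b": ["police_report"],
}
ranked = sorted(people, key=lambda p: risk_score(people[p]), reverse=True)
print(ranked)  # person_a (45 points) outranks person_b (5 points)
```

As with the place-based sketch, the score measures recorded contact with the system, not underlying behavior: two arrests and a gang-database entry produce a high score regardless of how those records came to exist.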
“Data-driven prosecution” refers to a set of data analysis techniques that prosecutors use when making decisions about what charges to bring, what sentences to ask for and how to dispose of cases. As a process, it is relatively understudied, but in general it describes prosecutorial reliance on data from the criminal legal system, including law enforcement data about active cases, data about the locations of previously reported crimes, lists of individuals whom police have identified as priority offenders, and data about probationers and parolees. More research is necessary to understand what specific algorithmic products prosecutors are using; how they are using them; and what policies, if any, guide or constrain these practices.
Risks and biases
Developers build crime forecasting algorithms with historical police data, meaning the algorithm’s outputs will reflect inequities in the criminal legal system. That includes data from decades of documented policing practices that are biased, corrupt and even unlawful: falsifying crime records, planting evidence, targeting Black and Brown communities and otherwise manipulating crime data. All of these biased decisions, originally made by humans, become part of the algorithm that corporations and law enforcement agencies claim can “predict” future crime. Police then spend disproportionate amounts of time targeting the same communities and individuals they have targeted in the past, creating a feedback loop.
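The feedback loop can be made concrete with a deliberately simplified, deterministic toy model — not any vendor's actual algorithm. The offense rate, patrol budget and proportional-allocation rule are all assumptions; the point is only to show how an initial disparity in the record compounds when patrols follow past arrests.

```python
# Toy model of the feedback loop. Both neighborhoods have the SAME
# underlying offense rate; only the starting arrest counts differ,
# reflecting historically heavier policing of neighborhood 0.
TRUE_RATE = 0.05          # identical underlying offense rate (assumed)
arrests = [30.0, 10.0]    # biased historical arrest record

for year in range(10):
    # The algorithm allocates 100 patrols in proportion to recorded arrests...
    total = sum(arrests)
    patrols = [100 * a / total for a in arrests]
    # ...and more patrols mean more offenses observed and recorded,
    # even though the true rate is the same in both places.
    for i in range(2):
        arrests[i] += patrols[i] * TRUE_RATE

gap = arrests[0] - arrests[1]
print(round(gap, 1))  # -> 45.0: the initial 20-arrest gap more than doubles
```

Each year the recorded gap widens, so the next year's patrol allocation is even more lopsided — the model never learns that the two neighborhoods are identical, because it only ever sees its own output.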
Historical crime and arrest data don’t provide reliable information about where the most crime is happening; they reflect where the most policing is happening. What gets measured and subsequently fed into policing algorithms are the incidents that are reported to police or that the police themselves report, and the people police arrest. But the crime that is reported to police varies by community — white and wealthier communities are less likely to report crimes to the police. And even where the data show that Black and white people commit certain offenses at roughly the same rate — for example, offenses relating to drug use and sale — in many cases Black people are significantly more likely than white people to be arrested or incarcerated. Crime forecasting tools take the data from that kind of biased policing and represent it as objective evidence of where future offenses will be committed and by whom.
See the appendix for more information.