## The Counted Part 3: Law Enforcement Officers Killed In Line Of Duty

July 22, 2015

As a follow-up to this post, I decided to look at the other side of the gun: police officers killed in the line of duty. Fortunately, the FBI collects this data here. It looks like the FBI is a bit behind on their summary reports.

So, taking the 2013 data as the closest data point to The Counted 2015 data, I spent a couple of minutes downloading the Excel spreadsheet and formatting it as a usable .csv.

After importing the data into RStudio, I did a quick summary of the data frame. The most striking thing out of the gate is how few officers are killed. There were 27 in 2013, compared to over 500 people killed by police officers in the first half of 2015:

```r
officers.killed <- read.csv("./Data/table_1_leos_fk_region_geographic_division_and_state_2013.csv")
sum(officers.killed$OfficersKilled)
```

I then added in the state population to do a similar ratio and map:

```r
library(ggplot2)  # provides qplot()

# state.population.3 and all.states are not defined in this post;
# they are assumed to carry over from the earlier parts of the series.
officers.killed.2 <- merge(x = officers.killed,
                           y = state.population.3,
                           by.x = "StateName",
                           by.y = "NAME")

# Officers killed per 10,000 residents, then times 10 (i.e. per 100,000)
officers.killed.2$AdjustedPopulation <- officers.killed.2$POPESTIMATE2014 / 10000
officers.killed.2$KilledRatio <- officers.killed.2$OfficersKilled / officers.killed.2$AdjustedPopulation
officers.killed.2$AdjKilledRatio <- officers.killed.2$KilledRatio * 10
officers.killed.2$StateName <- tolower(officers.killed.2$StateName)

# Join the ratios onto the map polygons and restore the drawing order
choropleth.3 <- merge(x = all.states,
                      y = officers.killed.2,
                      sort = FALSE,
                      by.x = "region",
                      by.y = "StateName",
                      all.x = TRUE)
choropleth.3 <- choropleth.3[order(choropleth.3$order), ]
summary(choropleth.3)

qplot(long, lat, data = choropleth.3, group = group, fill = AdjKilledRatio,
      geom = "polygon")
```

So Louisiana and West Virginia seem to have the highest number of officers killed per capita. I am not surprised, since I had no expectations about which states would have higher or lower numbers. It seems likely a case of “gee-whiz” data.
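To make the per-capita scaling concrete, here is a minimal sketch on made-up numbers. The two states, their counts, and their populations are hypothetical; only the column names and the two-step arithmetic mirror the code above.

```r
# Hypothetical toy data: the states, counts, and populations are made up.
toy <- data.frame(
  StateName = c("state a", "state b"),
  OfficersKilled = c(2, 1),
  POPESTIMATE2014 = c(4000000, 500000)
)

# Same two-step scaling as in the post: per 10,000 residents, then times 10.
toy$AdjustedPopulation <- toy$POPESTIMATE2014 / 10000
toy$KilledRatio <- toy$OfficersKilled / toy$AdjustedPopulation
toy$AdjKilledRatio <- toy$KilledRatio * 10

# Equivalent one-step version: deaths per 100,000 residents.
toy$Per100k <- toy$OfficersKilled / toy$POPESTIMATE2014 * 1e5

print(toy[, c("StateName", "AdjKilledRatio", "Per100k")])
```

In this toy example the smaller state ends up with the higher ratio despite having fewer deaths, which is one reason small states can top a per-capita map when counts are this low.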

Since there are so few instances, I decided to forgo any more analysis on police killed and instead combined this data with the data on people who were killed by police:

```r
# the.counted.state.4 is not defined in this post; it is assumed to come
# from the earlier parts of the series.
the.counted.state.5 <- merge(x = the.counted.state.4,
                             y = officers.killed.2,
                             by.x = "StateName",
                             by.y = "StateName")

# merge() adds .x/.y suffixes to column names that appear in both data frames
names(the.counted.state.5)[names(the.counted.state.5) == "AdjKilledRatio.x"] <- "NonPoliceKillRatio"
names(the.counted.state.5)[names(the.counted.state.5) == "AdjKilledRatio.y"] <- "PoliceKillRatio"

the.counted.state.6 <- data.frame(the.counted.state.5$NonPoliceKillRatio,
                                  the.counted.state.5$PoliceKillRatio,
                                  log(the.counted.state.5$NonPoliceKillRatio),
                                  log(the.counted.state.5$PoliceKillRatio))

colnames(the.counted.state.6) <- c("NonPoliceKilledRatio", "PoliceKilledRatio",
                                   "LoggedNonPoliceKilledRatio", "LoggedPoliceKilledRatio")

# Scatterplot matrix of the raw and logged ratios
plot(the.counted.state.6)
```

The log transform certainly helps, and there seems to be a relationship between states where police are killed and states where people are killed by police (my hand-drawn red lines added):

With that in mind, I created a couple of linear models:

```r
non.police <- the.counted.state.6$LoggedNonPoliceKilledRatio
police <- the.counted.state.6$LoggedPoliceKilledRatio
# log(0) is -Inf; mark states with no officer deaths as missing
police[police == -Inf] <- NA

model <- lm(non.police ~ police)
summary(model)

# Same two variables with the roles swapped
model.2 <- lm(police ~ non.police)
summary(model.2)
```
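The `-Inf` replacement matters because of how `lm()` treats missing values. Here is a minimal sketch on made-up ratios; the numbers and the `toy.model` name are hypothetical, and it assumes R's default `na.action` setting (`na.omit`) has not been changed.

```r
# Hypothetical ratios; the zero stands in for a state with no officer deaths.
police.ratio <- c(0.05, 0, 0.2, 0.1)
non.police.ratio <- c(0.3, 0.2, 0.9, 0.5)

logged.police <- log(police.ratio)  # log(0) is -Inf
logged.police[is.infinite(logged.police)] <- NA

toy.model <- lm(log(non.police.ratio) ~ logged.police)

# With the default na.action (na.omit), the row with the NA is dropped,
# so the fit uses fewer observations than the input has rows.
nobs(toy.model)
length(police.ratio)
```

So the states with zero officer deaths do not pull the fitted line in any direction; they are simply not part of the fit.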

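Since there are only 2 variables, the adjusted R-squared is the same for x ~ y and y ~ x: with a single predictor, R-squared is just the squared correlation between the two variables, which does not depend on which one is the response.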
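The symmetry can be checked on synthetic data. This is only a sketch: the seed, sample size, and variable names below are arbitrary and have nothing to do with the police data.

```r
set.seed(1)  # arbitrary seed, only for reproducibility
x <- rnorm(30)
y <- 0.5 * x + rnorm(30)

fit.yx <- summary(lm(y ~ x))
fit.xy <- summary(lm(x ~ y))

# With one predictor, R-squared equals the squared correlation,
# so it does not depend on which variable is the response.
all.equal(fit.yx$r.squared, cor(x, y)^2)
all.equal(fit.yx$adj.r.squared, fit.xy$adj.r.squared)
```

The slopes of the two fits generally differ, though, so the coefficient still depends on which way round the model is written.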

The interesting thing is that the model has to deal with the many states that had 0 police fatalities but at least 1 person killed by the police: their logged ratio is -Inf, so the code above sets it to NA and lm() leaves those states out. The next interesting thing is the value of the coefficient: in states where there was at least 1 police fatality and 1 person killed by the police, each one-unit increase in the logged police fatality ratio goes with a .96 increase in the logged ratio of people killed by police. Note that this .96 is on the log scale of the population-adjusted ratios, not a raw count. So it shows that the police are better at killing than getting killed, which makes sense.
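One way to read a slope from a model fitted on two logged variables is as a multiplier on the original scale. The sketch below uses synthetic, noise-free data built around an exponent of 0.96, borrowed from the post purely for illustration; the variable names are hypothetical and nothing here is the police data.

```r
# Hypothetical, noise-free data built so that rate.y = 2 * rate.x^0.96.
rate.x <- c(0.01, 0.05, 0.1, 0.5, 1)
rate.y <- 2 * rate.x^0.96

fit <- lm(log(rate.y) ~ log(rate.x))
slope <- unname(coef(fit)[2])
slope  # recovers the exponent used to build the data

# On the original scale, doubling rate.x multiplies the fitted rate.y
# by 2^slope, which is a little less than 2 when the slope is below 1.
2^slope
```

Under that reading, a slope just under 1 would mean the two ratios move almost proportionally across the states that enter the fit, assuming the log-log form is a reasonable description of the data.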

The full gist is found here.