## The Counted Part 3: Law Enforcement Officers Killed In Line Of Duty

As a follow-up to this post, I decided to look on the other side of the gun -> police officers killed in the line of duty.  Fortunately, the FBI collects this data here.  It looks like the FBI is a bit behind on their summary reports:

So taking the 2013 data as the closest data point to The Counted 2015 data, it took a couple of minutes to download the Excel spreadsheet and format it as a usable .csv:

to

After importing the data into RStudio, I did a quick summary on the data frame.  The most striking thing out of the gate is how few officers are killed.  There were 27 in 2013, compared to over 500 people killed by police officers in the first half of 2015:

```
officers.killed <- read.csv("./Data/table_1_leos_fk_region_geographic_division_and_state_2013.csv")
sum(officers.killed$OfficersKilled)
```

I then added in the state population to do a similar ratio and map:

```
officers.killed.2 <- merge(x=officers.killed,
                           y=state.population.3,
                           by.x="StateName",
                           by.y="NAME")

officers.killed.2$AdjKilledRatio <- officers.killed.2$KilledRatio * 10
officers.killed.2$StateName <- tolower(officers.killed.2$StateName)

choropleth.3 <- merge(x=all.states,
                      y=officers.killed.2,
                      sort = FALSE,
                      by.x = "region",
                      by.y = "StateName",
                      all.x=TRUE)
choropleth.3 <- choropleth.3[order(choropleth.3$order), ]
summary(choropleth.3)

qplot(long, lat, data = choropleth.3, group = group, fill = AdjKilledRatio,
      geom = "polygon")
```

So Louisiana and West Virginia seem to have the highest number of officers killed per capita.  I am not surprised, given that I had no expectations about which states would have higher or lower numbers.  It seems likely a case of “gee-whiz” data.

Since there are so few instances, I decided to forgo any more analysis on police killed and instead combined this data with the people who were killed by police:

```
the.counted.state.5 <- merge(x=the.counted.state.4,
                             y=officers.killed.2,
                             by.x="StateName",
                             by.y="StateName")

the.counted.state.6 <- data.frame(the.counted.state.5$NonPoliceKillRatio,
                                  the.counted.state.5$PoliceKillRatio,
                                  log(the.counted.state.5$NonPoliceKillRatio),
                                  log(the.counted.state.5$PoliceKillRatio))

colnames(the.counted.state.6) <- c("NonPoliceKilledRatio","PoliceKilledRatio","LoggedNonPoliceKilledRatio","LoggedPoliceKilledRatio")

plot(the.counted.state.6)
```

The log certainly helps, and there seems to be a relationship between states that have police killed and people being killed by police (my hand-drawn red lines added):

With that in mind, I created a couple of linear models:

```
non.police <- the.counted.state.6$LoggedNonPoliceKilledRatio
police <- the.counted.state.6$LoggedPoliceKilledRatio
police[police==-Inf] <- NA

model <- lm( non.police ~ police )
summary(model)

model.2 <- lm( police ~ non.police)
summary(model.2)
```

Since each model has only one predictor, R squared is just the squared correlation between the two variables, so the adjusted R square is the same for x~y and y~x.
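To convince myself of that symmetry, here is a small F# sketch. The numbers are made up for illustration (they are not the FBI or Counted data), and the helper names `sumOfProducts`, `correlation`, `rSquared` and `adjustedRSquared` are my own, not anything from R.

```fsharp
// Sum of products of deviations from the mean: the building block for both
// the least-squares slope and the correlation.
let sumOfProducts (xs: float list) (ys: float list) =
    let meanX = List.average xs
    let meanY = List.average ys
    List.map2 (fun x y -> (x - meanX) * (y - meanY)) xs ys |> List.sum

// Pearson correlation between two equally sized samples.
let correlation xs ys =
    sumOfProducts xs ys / sqrt (sumOfProducts xs xs * sumOfProducts ys ys)

// R squared of a one-predictor least-squares fit: 1 - SS_residual / SS_total.
let rSquared (predictor: float list) (response: float list) =
    let slope = sumOfProducts predictor response / sumOfProducts predictor predictor
    let intercept = List.average response - slope * List.average predictor
    let ssResidual =
        List.map2 (fun p r -> (r - (intercept + slope * p)) ** 2.0) predictor response
        |> List.sum
    1.0 - ssResidual / sumOfProducts response response

// Adjusted R squared for n observations and a single predictor.
let adjustedRSquared (n: int) (r2: float) =
    1.0 - (1.0 - r2) * float (n - 1) / float (n - 2)

// Made-up logged ratios, for illustration only.
let police = [ -2.1; -1.4; -0.9; -0.3; 0.2 ]
let nonPolice = [ 0.8; 1.1; 1.9; 2.2; 2.4 ]

let forward = adjustedRSquared police.Length (rSquared police nonPolice)
let reverse = adjustedRSquared police.Length (rSquared nonPolice police)
printfn "%.6f %.6f" forward reverse
```

Both directions print the same value, because `rSquared` in either direction equals `correlation` squared and the adjustment only depends on the number of rows and predictors.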

The interesting thing is that the model has to account for the fact that many states had 0 police fatalities but had at least 1 person killed by the police.  The next interesting thing is the value of the coefficient: in states where there was at least 1 police fatality and 1 person killed by the police, every police fatality increases the number of people killed by police by .96 -> and this .96 is on the log of the population ratio.  So it shows that the police are better at killing than getting killed, which makes sense.

The full gist is found here.

## Analytics in the Microsoft Stack

Disclaimer:  I really don’t know what I am talking about

I received an email from a coworker/friend yesterday with this in the body:

So, I have a friend who works for a major supermarket chain. In IT, they are straight out of the year 2000. They have tons and tons of data in SQL Server and I think Oracle. The industrial engineers (who do all of the planning) ask the IT group to run queries throughout the day, which takes hours to run. They use Excel for most of their processing. On the weekends, they run reporting queries which take hours and hours to run – all to get just basic information.

This got my wheels spinning about how I would approach the problem with the analytics toolset that I know is available.  The supermarket chain has a couple of problems:

• Lots of data that takes too long to munge through
• The planners are dependent on IT group for processing the data

I would expect the official Microsoft answer is that they should implement SQL Server Analysis Services with Power BI.  I would assume if the group threw enough resources at this solution, it would work.  I then thought of a couple of alternative paths:

The first thing that comes to mind is using HDInsight (Microsoft’s Hadoop product)  on Azure.  That way the queries can run in a distributed manner and they can provision machines as they need them -> and when they are not running their queries, they can de-allocate the machines.

The second thought is using AzureML to do their model generation.  However, depending on the size of the datasets, AzureML may not be able to scale.  I have only used Azure ML on smaller datasets.

The third thought was using R?  I don’t think R is the best answer here.  Everything I know about R is that it is designed for data exploration and analysis of datasets that comfortably fit into the local machine’s memory.  Performance on R is horrible and scaling R is a real challenge.

What about F#?  So this might be a good answer.  If you use the Hive Type Provider, you can get the benefits of HDInsight to do the processing and then have the goodness of the language syntax and REPL for data exploration.  Also, the group could look at MBrace for some kick-butt distributed processing that can scale on Azure. Finally, if they do come up with some kind of insight that lends itself to building analytics or models into an app, they can take the code out of the script file and stick it into a compilable assembly, all within Visual Studio.

What about Matlab, SAS, etc.?  No idea.  I stopped using those tools when R showed up.

What about Watson?  No idea.  I think I will have a better idea once I go to this.

## F# Record Types with Entity Framework Code-Last

So based on the experience with code-first, I decided to look at using EF code-last (OK, database first).   I considered three different possibilities

1. Use AutoMapper
2. Use Reflection
3. Hand-Roll everything

AutoMapper

If you are not familiar, AutoMapper is a library that allows you to, well, map types. The first thing I did was to create a database schema like this:

```
use FamilyDomain

CREATE TABLE Family
(
Id int NOT NULL IDENTITY(1,1) PRIMARY KEY,
LastName varchar(255) NOT NULL
)

CREATE TABLE Parent
(
Id int NOT NULL IDENTITY(1,1) PRIMARY KEY,
FamilyId int NOT NULL,
FirstName varchar(255) NOT NULL
)

CREATE TABLE Child
(
Id int NOT NULL IDENTITY(1,1) PRIMARY KEY,
FamilyId int NOT NULL,
FirstName varchar(255) NOT NULL,
Gender varchar(10) NOT NULL,
)

CREATE TABLE Pet
(
Id int NOT NULL IDENTITY(1,1) PRIMARY KEY,
ChildId int NOT NULL,
GivenName varchar(255) NOT NULL
)

(
Id int NOT NULL IDENTITY(1,1) PRIMARY KEY,
FamilyId int NOT NULL,
StateCode varchar(2) NOT NULL,
County varchar(255) NOT NULL,
City varchar(255) NOT NULL
)

ALTER TABLE Parent
FOREIGN KEY (FamilyId)
REFERENCES Family(Id)

FOREIGN KEY (FamilyId)
REFERENCES Family(Id)

ALTER TABLE Child
FOREIGN KEY (FamilyId)
REFERENCES Family(Id)

ALTER TABLE Pet
FOREIGN KEY (ChildId)
REFERENCES Child(Id)


INSERT Family VALUES
('Andersen')

INSERT Parent VALUES
(1,'Thomas'),
(1,'Mary Kay')

INSERT Child VALUES
(1,'Henriette Thaulow','Female',5)

INSERT Pet VALUES
(1,'Fluffy')

(1,'WA','King','Seattle')
```

I then installed AutoMapper and the Entity Framework type provider into an FSharp project.

```
#r @"../packages/AutoMapper.3.3.0/lib/net40/AutoMapper.dll"
#r "FSharp.Data.TypeProviders.dll"
#r "System.Data.Entity.dll"

open Microsoft.FSharp.Data.TypeProviders
open System.Data.Entity
open AutoMapper

//Entity Framework Types via Type Provider
let connectionString = @"Server=.;Initial Catalog=FamilyDomain;Integrated Security=SSPI;MultipleActiveResultSets=true"
type EntityConnection = SqlEntityConnection<ConnectionString="Server=.;Initial Catalog=FamilyDomain;Integrated Security=SSPI;MultipleActiveResultSets=true",Pluralize=true>
```

I then created some local FSharp record types that reflect the domain:

```
type Pet = {Id:int; GivenName:string}
type Child = {Id:int; FirstName:string; Gender:string; Grade:int; Pets: Pet list}
type Address = {Id:int; State:string; County:string; City:string}
type Parent = {Id:int; FirstName:string}
```

So then I was ready to start mapping.  I started with a basic GET to a single type:

```
//AutoMapper setup

//Get one from the database
let context  = EntityConnection.GetDataContext()

//map database to record type
```

And I got a fail:

Source value:

SqlEntityConnection1.HomeAddress —> System.ArgumentException: Type needs to have a constructor with 0 args or only optional args

Parameter name: type

So I added [<CLIMutable>] to the record types like so

```
[<CLIMutable>]
type Pet = {Id:int; GivenName:string}
[<CLIMutable>]
type Child = {Id:int; FirstName:string; Gender:string; Grade:int; Pets: Pet list}
[<CLIMutable>]
type Address = {Id:int; State:string; County:string; City:string}
[<CLIMutable>]
type Parent = {Id:int; FirstName:string}
[<CLIMutable>]
```

And I get the expected results

With one thing kinda interesting.  The State is null because it is defined as “StateCode” on the server and “State” in the domain.  AutoMapper is customizable to allow field name differences, so that was a small issue.  Feeling confident, I went ahead and created maps to all of the domain types and pulled down a complex type from the database:

```
//AutoMapper setup
Mapper.CreateMap<EntityConnection.ServiceTypes.Pet, Pet>()
Mapper.CreateMap<EntityConnection.ServiceTypes.Child, Child>()
Mapper.CreateMap<EntityConnection.ServiceTypes.Parent, Parent>()
Mapper.CreateMap<EntityConnection.ServiceTypes.Family, Family>()

//Get Family from the database
let context  = EntityConnection.GetDataContext()
let familyQuery = query {for family in context.Families do select family}
let family = Seq.head familyQuery

//map database to record type
let family' = Mapper.Map<Family>(family)
```

When I attempted to map it, I got a pretty ugly exception

Source value:

System.Data.Objects.DataClasses.EntityCollection`1[SqlEntityConnection1.Parent]

at AutoMapper.MappingEngine.AutoMapper.IMappingEngineRunner.Map(ResolutionContext context)

So the problem is that AutoMapper is not picking up on the foreign keys, which means I have to write the associations by hand.  Ugh!  I then tried to auto map to F# choice types like this:

`type Gender = Male | Female`

No dice.

Reflection

I quickly spun up another project that uses System.Reflection to map the types.

```
#r "System.Data.Entity.dll"
#r "FSharp.Data.TypeProviders.dll"

open System.Reflection
open System.Data.Entity
open Microsoft.FSharp.Data.TypeProviders

let connectionString = "Server=.;Database=FamilyDomain;Trusted_Connection=True;"

type entityConnection = SqlEntityConnection<ConnectionString = "Server=.;Database=FamilyDomain;Trusted_Connection=True;">

let context = entityConnection.GetDataContext()

//Local Idiomatic Types
[<CLIMutable>]
type Pet = {Id:int; ChildId:int; GivenName:string}
[<CLIMutable>]
type Child = {Id:int; FirstName:string; Gender:string; Grade:int; Pets: Pet list}
[<CLIMutable>]
type Address = {Id:int; State:string; County:string; City:string}
[<CLIMutable>]
type Parent = {Id:int; FirstName:string}
[<CLIMutable>]
type Family = {Id:int; LastName:string; Parents:Parent list; Children: Child list; Address:Address}

//Reflection
let AssignMatchingPropertyValues sourceObject targetObject =
    let sourceType = sourceObject.GetType()
    let targetType = targetObject.GetType()
    let sourcePropertyInfos = sourceType.GetProperties(BindingFlags.Public ||| BindingFlags.Instance)
    sourcePropertyInfos
        |> Seq.map(fun spi -> spi, targetObject.GetType().GetProperty(spi.Name))
        |> Seq.iter(fun (spi,tpi) -> tpi.SetValue(targetObject, spi.GetValue(sourceObject,null),null))
    targetObject


let newEfPet = entityConnection.ServiceTypes.Pet()
let newPet = {Id=0;ChildId=1;GivenName="Duke"}

AssignMatchingPropertyValues newPet newEfPet

context.DataContext.SaveChanges()
```

Sure enough, reflection does what it is supposed to do:

The problem quickly becomes that by using reflection, I have to hand roll all of the relations.  I might as well use AutoMapper (though apparently reflection is much faster than AutoMapper, even on a per-call basis).

Another problem with using reflection is that the field names in the database need to match the domain naming exactly.  Finally, like AutoMapper, there is no out-of-the-box way to map choice types.
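To make the naming problem concrete, here is a minimal, self-contained sketch. `EfAddress` is a hypothetical stand-in for the generated entity type (so no database is needed), and `assignMatching` is my own illustrative variation on the function above: instead of blowing up on a null `PropertyInfo`, it skips properties it cannot match and hands their names back.

```fsharp
open System.Reflection

// Hypothetical stand-in for the EF entity: note StateCode, not State.
type EfAddress() =
    member val Id = 0 with get, set
    member val StateCode = "" with get, set
    member val City = "" with get, set

[<CLIMutable>]
type Address = { Id: int; State: string; City: string }

// Copies each public source property onto a same-named writable target
// property and returns the names of source properties that found no match.
let assignMatching (source: obj) (target: obj) =
    let targetType = target.GetType()
    source.GetType().GetProperties(BindingFlags.Public ||| BindingFlags.Instance)
    |> Array.choose (fun sourceProperty ->
        match targetType.GetProperty(sourceProperty.Name) with
        | null -> Some sourceProperty.Name
        | targetProperty when targetProperty.CanWrite ->
            targetProperty.SetValue(target, sourceProperty.GetValue(source, null), null)
            None
        | _ -> Some sourceProperty.Name)
    |> Array.toList

let efAddress = EfAddress()
let unmatched = assignMatching (box { Id = 7; State = "WA"; City = "Seattle" }) (box efAddress)
printfn "%A" unmatched
```

Here `Id` and `City` are copied across, while `State` comes back in `unmatched` because the entity calls it `StateCode`. The sketch does not check that the property types agree, so a type mismatch would still throw at `SetValue`.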

Hand Roll

On my last stop of the Entity Framework code-last hit parade, I looked at what it would take to roll my own mappings.  This has the greatest amount of yak shaving because I would have to spin up mapping from the domain and to the domain.  The nice thing is that with that kind of detail, naming mismatches can be handled and the nested hierarchy and choice types are accounted for.  I first started with a basic script that handled the getting and setting as well as nested types:

```
#r "System.Data.Entity.dll"
#r "FSharp.Data.TypeProviders.dll"

open System.Linq
open System.Data.Entity
open Microsoft.FSharp.Data.TypeProviders

let connectionString = "Server=.;Database=FamilyDomain;Trusted_Connection=True;"
type entity = SqlEntityConnection<ConnectionString = "Server=.;Database=FamilyDomain;Trusted_Connection=True;">
let context = entity.GetDataContext()

type Pet = {Id:int; ChildId: int; GivenName:string}
type Child = {Id:int; FirstName:string; Gender:string; Grade:int; Pets: Pet list}
type Address = {Id:int; State:string; County:string; City:string}
type Parent = {Id:int; FirstName:string}
type Family = {Id:int; LastName:string; Parents:Parent list; Children: Child list; Address:Address}

let MapPet(efPet: entity.ServiceTypes.Pet) =
    {Id=efPet.Id; ChildId=efPet.ChildId; GivenName=efPet.GivenName}

let MapChild(efChild: entity.ServiceTypes.Child) =
    let pets = efChild.Pet |> Seq.map(fun p -> MapPet(p))
                           |> Seq.toList
    {Id=efChild.Id; FirstName=efChild.FirstName;

let GetPet(id: int)=
    let efPet = context.Pet.FirstOrDefault(fun p -> p.Id = id)
    MapPet(efPet)

let GetChild(id: int)=
    let efChild = context.Child.FirstOrDefault(fun c -> c.Id = id)
    MapChild(efChild)

let myPet = GetPet(1)

let myChild = GetChild(1)
```

Of all of the implementations, the hand-rolled actually made the most sense to me.  It was clean and, most importantly, it worked.

I then swapped out a Choice type for gender (was a string):

```
type Gender = Male | Female
type Pet = {Id:int; ChildId: int; GivenName:string}
type Child = {Id:int; FirstName:string; Gender:Gender; Grade:int; Pets: Pet list}
type Address = {Id:int; State:string; County:string; City:string}
type Parent = {Id:int; FirstName:string}
type Family = {Id:int; LastName:string; Parents:Parent list; Children: Child list; Address:Address}
```

I then added the choice type mapping and updated the child mapping:

```
let MapGender(efGender) =
    match efGender with
    | "Male" -> Male
    | _ -> Female

let MapChild(efChild: entity.ServiceTypes.Child) =
    let pets = efChild.Pet |> Seq.map(fun p -> MapPet(p))
                           |> Seq.toList
    {Id=efChild.Id; FirstName=efChild.FirstName;
        Gender=MapGender(efChild.Gender);
```

Sure enough, it worked like a champ
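One thing to keep in mind with the mapping above is that the wildcard case turns any unexpected string in the Gender column into Female. A stricter variation, sketched below with the hypothetical names `tryMapGender` and `genderToString` (they are not part of the script above), returns an option on the way in and gives the reverse mapping for the way out:

```fsharp
type Gender = Male | Female

// Returns None for values the domain does not recognise instead of
// silently treating them as Female.
let tryMapGender (efGender: string) =
    match efGender with
    | "Male" -> Some Male
    | "Female" -> Some Female
    | _ -> None

// Reverse direction, for writing the choice type back to the varchar column.
let genderToString gender =
    match gender with
    | Male -> "Male"
    | Female -> "Female"

printfn "%A" (tryMapGender "Female")
printfn "%A" (tryMapGender "Unknown")
```

The caller then has to decide what to do with `None`, which is more typing but keeps bad rows in the database from quietly becoming valid domain values.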

And finally, I tested the add on both the happy path and an expected exception.

```
let SavePet(pet: Pet)=
    let efPet = entity.ServiceTypes.Pet()
    efPet.ChildId <- pet.ChildId
    efPet.GivenName <- pet.GivenName
    context.DataContext.SaveChanges()

let newPet = {Id=0;ChildId=1;GivenName="Lucky Sue"}
SavePet(newPet)

let failurePet = {Id=0;ChildId=0;GivenName="Should Fail"}
SavePet(failurePet)
```

Both worked as expected.  Here is the exception case where there is not a child to be associated to a pet:

System.Data.UpdateException: An error occurred while updating the entries. See the inner exception for details. —> System.Data.SqlClient.SqlException: The INSERT statement conflicted with the FOREIGN KEY constraint "fk_Pet_Child". The conflict occurred in database "FamilyDomain", table "dbo.Child", column ‘Id’.

The statement has been terminated.

at System.Data.SqlClient.SqlConnection.OnError(SqlException exception, Boolean breakConnection, Action`1 wrapCloseInAction)

So of all three ways, hand-rolling worked the best for me.

## F# Record Types With Entity Framework Code-First

I was spinning up a data layer in a new FSharp project and I thought I would take EF Code-First out for a test drive.  I have used EF-CF in a couple of C# projects so I am familiar with the premise (and the promise) of code-first.  The FSharp project uses record types, nested record types, and choice types exclusively, so I thought of tackling each of these types for code-first in turn.  The first article that I ran across was this one, which seemed like a good start.  I went ahead and created a family record type like so, matching the example verbatim except I swapped out the class implementation with a record type:

```
#r "../packages/EntityFramework.6.1.2/lib/net45/EntityFramework.dll"

open System.Collections.Generic
open System.ComponentModel.DataAnnotations
open System.Data.Entity

type Family = {Id:int; LastName:string; IsRegistered:bool}

type CLFamily() =
    inherit DbContext()
    [<DefaultValue>]
    val mutable m_families: DbSet<Family>
    member public this.Families with get() = this.m_families
                                and set v = this.m_families <- v

let db = new CLFamily()
let family = {Id=0;LastName="New Family"; IsRegistered=true}
db.SaveChanges() |> ignore
```

But I ran into this:

So I added the Key attribute to the Record type

So I hit up Stack Overflow with this question and sure enough, I forgot to add a reference to that assembly.  Once I added it, then it compiled.  I then ran the script and I got the following error message:

```
<add name="CLFamily"
     connectionString="Server=.;Database=FamilyDomain;Trusted_Connection=True;"
     providerName="System.Data.SqlClient"/>
```

Ugh!  It was still hitting the default connection string.    I went ahead and adjusted my script to account for the connection string and I swapped out the backing values with CLIMutable:

```
#r "../packages/EntityFramework.6.1.2/lib/net45/EntityFramework.dll"
#r "C:/Program Files (x86)/Reference Assemblies/Microsoft/Framework/.NETFramework/v4.5.1/System.ComponentModel.DataAnnotations.dll"

open System.Collections.Generic
open System.ComponentModel.DataAnnotations
open System.Data.Entity

[<CLIMutable>]
type Family = {[<Key>]Id:int; LastName:string; IsRegistered:bool;}


type FamilyContext() =
    inherit DbContext()
    [<DefaultValue>] val mutable families: DbSet<Family>
    member this.Families with get() = this.families and set f = this.families <- f

let context = new FamilyContext()
let connectionString = "Server=.;Database=FamilyDomain;Trusted_Connection=True;"
context.Database.Connection.ConnectionString <- connectionString
let family = {Id=0; LastName="Test"; IsRegistered=true}
context.SaveChanges() |> ignore
```

And sure enough, the table is created in the database and the record is persisted:

And the cool thing is that even though this is a record type, the Id does adjust to the identity value given by the database.

With that out of the way, I went to tackle nested types.  I added a Child record type and a list of children to the Family record type.

```
[<CLIMutable>]
type Child = {[<Key>]Id:int; FamilyId: int; FirstName:string; Gender:string; Grade:int}

[<CLIMutable>]
type Family = {[<Key>]Id:int; LastName:string; IsRegistered:bool; Children:Child list}

type FamilyContext() =
    inherit DbContext()
    [<DefaultValue>] val mutable families: DbSet<Family>
    member this.Families with get() = this.families and set f = this.families <- f
    [<DefaultValue>] val mutable children: DbSet<Child>
    member this.Children with get() = this.children and set c = this.children <- c

let context = new FamilyContext()
let connectionString = "Server=.;Database=FamilyDomain;Trusted_Connection=True;"
context.Database.Connection.ConnectionString <- connectionString
let children = [{Id=0; FamilyId=0; FirstName="Test"; Gender="Male"; Grade=5}]
let family = {Id=0; LastName="Test"; IsRegistered=true; Children=children }
context.SaveChanges() |> ignore
```

Everything compiled and ran, but the Children table was not added to the database -> though the new record was added.

Going back to Stack Overflow, it looks like EF Code First will not auto-update the schema unless you add some more glue code.  Ugh.  At that point, I might as well give up on code-first if all it brings is not having to write SQL scripts…

## Implementing (Parts Of) ASP.NET Identity Using F#

The CSharp implementation looks like this (I did add some constructor injection b/c I am opposed to touching the Configuration (or any part of the file system for that matter) outside of Main on sanity grounds):

```
public class SendGridEmailProvider: IIdentityMessageService
{
    String _mailAccount = String.Empty;

    {
        _mailAccount = mailAccount;
    }

    {
        var sendGridMessage = new SendGridMessage();
        List<String> recipients = new List<string>();
        sendGridMessage.Subject = message.Subject;
        sendGridMessage.Html = message.Body;
        sendGridMessage.Text = message.Body;

        var credentials = new NetworkCredential(_mailAccount, _mailPassword);
        var transportWeb = new Web(credentials);
        if (transportWeb != null)
            return transportWeb.DeliverAsync(sendGridMessage);
        else
    }
}
```

I then thought, this is stupid.  I might as well use FSharp type providers to do the implementation.  Less code, fewer files, less clutter, more goodness.  I first swapped out the Email to a FSharp implementation:

```
type SendGridEmailService(account:string, password:string, fromAddress:string) =
    interface IIdentityMessageService with
        member this.SendAsync(identityMessage) =
            let sendGridMessage = new SendGridMessage()
            let recipients = new List<string>()
            sendGridMessage.Subject <- identityMessage.Subject
            sendGridMessage.Html <- identityMessage.Body
            sendGridMessage.Text <- identityMessage.Body

            let credentials = new NetworkCredential(account, password)
            let transportWeb = new Web(credentials)
            match transportWeb with
                | _ -> transportWeb.DeliverAsync(sendGridMessage)
```

I then did an SMS text provider using type providers.

```
type cDyneService = Microsoft.FSharp.Data.TypeProviders.WsdlService<"http://sms2.cdyne.com/sms.svc?wsdl">

    interface IIdentityMessageService with
        member this.SendAsync(identityMessage) =
            let cDyneClient = cDyneService.Getsms2SOAPbasicHttpBinding
            let client = cDyneService.Getsms2SOAPbasicHttpBinding()
            match client with
```

## Consuming Azure ML web api endpoint from an array

Last week, I blogged about creating an Azure ML experiment, publishing it as a web service, and then consuming it from F#.  I then wanted to consume the web service using an array – passing in several values and seeing the results.  I added on to my existing F# script with the following code:

```
let input1 = new Dictionary<string,string>()

let input2 = new Dictionary<string,string>()