July 22, 2014
I started working through the second chapter of McCaffrey’s book Neural Networks Using C# Succinctly to see if I could write the examples using F#.
McCaffrey’s code is tough to read, though, because of its emphasis on loops and global mutable variables. I read through his description, and this is how (I think) the Perceptron should be constructed.
The inputs are a series of independent variables (in this case age and income) and the output is a single dependent variable (in this case party affiliation). The values have been encoded and normalized like in this post here.
An example of the input (from page 31 of his book) is:
Or in a more abstract manner:
In terms of data structures, each row of inputs is placed into an array of floats and the output is a single float.
I call this single set of inputs, together with its output, an “observation” (my word, not McCaffrey’s).
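As a rough sketch of that data structure, an observation could be modeled as a record like the one below. The type name, field names, and sample values are my own placeholders for illustration; they are not the data from page 31 of the book.

```fsharp
// Hypothetical record: one row of encoded, normalized inputs plus its known output.
type Observation = { XValues: float[]; Y: float }

// Made-up values for illustration only (not taken from the book).
let sample = { XValues = [| 1.5; 2.0 |]; Y = -1.0 }

printfn "%d inputs, y = %.1f" sample.XValues.Length sample.Y
```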
Looking at McCaffrey’s example for a perceptron Input-Output,
not all of the variables you need are included. Here is what you need:
Where A0 and B0 are the same as X0 and X1 respectively in his diagram. Also, McCaffrey uses the word “Perceptron” to mean two different concepts: the entire system as a whole and the individual calculation for a given list of X and Bias. I am a big believer in a ubiquitous domain language, so I am calling the individual calculation a neuron.
Once you run these values through the neuron for the 1st observation, you might have to alter the Weights and Bias based on the (Y)result. Therefore, the data structure coming out of the Neuron is
These values are fed into the adjustment function to alter the weights and bias with the output as
I am calling this process of taking a single observation, the xWeights, and the bias and turning them into a new set of weights and bias a “cycle” (my word, not McCaffrey’s).
The output of a cycle is then fed with the next observation and the cycle repeats for as many observations as there are fed into the system.
I am calling the process of running a cycle for each observation in the input dataset a “rotation” (my word, not McCaffrey’s). The perceptron runs some number of rotations to train itself.
Finally, the Perceptron takes a new set of observations where the Y is not known and runs a Rotation once to predict what the Y will be.
So with that mental image in place, the coding became much easier. Basically, there was a 1 to 1 correspondence of F# functions to each step laid out. I started with an individual cycle
I used record types all over the place in this code just so I could keep things straight in my head. McCaffrey uses ambiguously-named arrays and global variables. Although this makes my code a bit more wordy (esp for functional people), I think the increased readability is worth the trade-off.
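For readers without the original listing, here is a minimal sketch of what a neuron and activation step can look like with record types. The names (`NeuronInput`, `runNeuron`, `runActivation`) are hypothetical, and I am assuming a step activation with outputs encoded as -1.0 and +1.0.

```fsharp
// Hypothetical record type; the names are illustrative only.
type NeuronInput = { XValues: float[]; XWeights: float[]; Bias: float }

// Weighted sum of the inputs, starting the accumulator at the bias.
// Array.fold2 throws if the two arrays differ in length.
let runNeuron (input: NeuronInput) =
    Array.fold2 (fun acc x w -> acc + x * w) input.Bias input.XValues input.XWeights

// Step activation, assuming the output is encoded as -1.0 or +1.0.
let runActivation (sum: float) =
    if sum >= 0.0 then 1.0 else -1.0

// Usage: pipe one set of inputs through the neuron and the activation.
let y =
    { XValues = [| 1.0; 2.0 |]; XWeights = [| 0.5; -0.25 |]; Bias = 0.1 }
    |> runNeuron
    |> runActivation
```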
In any event, with the Neuron and Activation calc out of the way, I created the functions that adjust the weights and bias:
This code is significantly different from the for loop with nested ifs that McCaffrey uses.
I maintain using this kind of pattern matching makes the intention much easier to comprehend. I also split out the adjustment of the weights and the adjustment of the bias into individual functions.
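To give a flavor of the pattern-matching style, here is a sketch that uses the textbook perceptron learning rule. That rule is my assumption here; McCaffrey’s branching may differ in its details. The function names and the `alpha` (learning rate) parameter are hypothetical.

```fsharp
// Textbook perceptron rule (an assumption, not necessarily the book's exact logic):
//   weight <- weight + alpha * (desired - computed) * x
//   bias   <- bias   + alpha * (desired - computed)

let adjustWeight (alpha: float) (computed: float) (desired: float) (x: float) (weight: float) =
    match computed, desired with
    | c, d when c = d -> weight                  // prediction was right: leave the weight alone
    | c, d -> weight + alpha * (d - c) * x       // prediction was wrong: nudge toward desired

let adjustBias (alpha: float) (computed: float) (desired: float) (bias: float) =
    match computed, desired with
    | c, d when c = d -> bias
    | c, d -> bias + alpha * (d - c)
```

Splitting the weight and bias adjustments keeps each match expression down to the two cases that matter.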
With these functions ready, I created an input and output record type and implemented the adjustment function
There is no corresponding method in McCaffrey’s code; rather, he just does some Array.Copy and mutates the global variables in the Update method. I am not a fan of programming by side effect, so I created a function that explicitly does the modification.
And to wrap up the individual cycle:
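Put together, a cycle can be sketched as a pure function from the current weights and bias plus one observation to the next weights and bias. The record and function names are hypothetical, and the update uses the textbook perceptron rule as an assumption.

```fsharp
// Hypothetical record types for illustration.
type Observation = { XValues: float[]; Y: float }
type WeightsAndBias = { XWeights: float[]; Bias: float }

// Neuron plus step activation, assuming outputs encoded as -1.0 / +1.0.
let predictY (state: WeightsAndBias) (xValues: float[]) =
    let sum = Array.fold2 (fun acc x w -> acc + x * w) state.Bias xValues state.XWeights
    if sum >= 0.0 then 1.0 else -1.0

// One cycle: predict, compare with the known Y, and return new weights and bias.
// Nothing is mutated; the caller receives a fresh record.
let runCycle (alpha: float) (state: WeightsAndBias) (obs: Observation) =
    let computed = predictY state obs.XValues
    let delta = obs.Y - computed   // 0.0 when the prediction is already right
    { XWeights = Array.map2 (fun w x -> w + alpha * delta * x) state.XWeights obs.XValues
      Bias = state.Bias + alpha * delta }

// Usage with made-up values.
let next =
    runCycle 0.5 { XWeights = [| 0.0; 0.0 |]; Bias = 0.0 } { XValues = [| 1.0; 2.0 |]; Y = -1.0 }
```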
Up next is to run the cycle for each of the observations (called a rotation)
Again, note the liberal use of records to keep the inputs and outputs clear. I also created a prediction rotation that is designed to be run only once that does not alter the weights and bias.
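A prediction rotation is the simpler of the two because the weights and bias never change, so it reduces to a map over the inputs. A sketch under the same assumptions as before (hypothetical names, step activation returning -1.0 or +1.0):

```fsharp
// Hypothetical record types for illustration.
type WeightsAndBias = { XWeights: float[]; Bias: float }
type Prediction = { XValues: float[]; PredictedY: float }

// Run every set of inputs through the neuron once, without touching the state.
let predictRotation (state: WeightsAndBias) (inputs: float[] list) =
    inputs
    |> List.map (fun xValues ->
        let sum = Array.fold2 (fun acc x w -> acc + x * w) state.Bias xValues state.XWeights
        { XValues = xValues; PredictedY = (if sum >= 0.0 then 1.0 else -1.0) })

// Usage with made-up weights and inputs.
let predictions =
    predictRotation { XWeights = [| 1.0; -1.0 |]; Bias = 0.0 } [ [| 2.0; 1.0 |]; [| 1.0; 3.0 |] ]
```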
With the rotations done, the last step was to create the Perceptron to train and then predict:
Before I go too much further, I have a big code smell: I am iterating and using the mutable keyword. I am not sure how to take the result of a function that is applied to the 1st element in a sequence and then feed that into the call for the 2nd element. I need to do that with the weights and bias data structure: each time it is used in an expression, it needs to change and feed into the next expression. I think the answer is List.reduce, so I am going to pick this up after looking at that in more detail. I also need to implement the shuffle method so that the cycles are not called in the same order across rotations.
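One possible direction, offered as a sketch rather than a finished answer: `List.reduce` requires the result to have the same type as the list elements, whereas `List.fold` (and `Array.fold`) take a separate accumulator, which fits threading weights and bias through a list of observations. Everything below is hypothetical naming, the update is the textbook perceptron rule (an assumption), and the shuffle is a plain Fisher-Yates on a copy of the array.

```fsharp
type Observation = { XValues: float[]; Y: float }
type WeightsAndBias = { XWeights: float[]; Bias: float }

// One training step (textbook perceptron rule, outputs encoded as -1.0 / +1.0).
let runCycle (alpha: float) (state: WeightsAndBias) (obs: Observation) =
    let sum = Array.fold2 (fun acc x w -> acc + x * w) state.Bias obs.XValues state.XWeights
    let delta = obs.Y - (if sum >= 0.0 then 1.0 else -1.0)
    { XWeights = Array.map2 (fun w x -> w + alpha * delta * x) state.XWeights obs.XValues
      Bias = state.Bias + alpha * delta }

// Fisher-Yates shuffle. The mutation is confined to a local copy,
// so the caller's array is left untouched.
let shuffle (rng: System.Random) (items: 'a[]) =
    let copy = Array.copy items
    for i in copy.Length - 1 .. -1 .. 1 do
        let j = rng.Next(i + 1)
        let tmp = copy.[i]
        copy.[i] <- copy.[j]
        copy.[j] <- tmp
    copy

// A rotation: the fold hands each cycle's output to the next cycle.
let runRotation alpha rng state (observations: Observation[]) =
    observations |> shuffle rng |> Array.fold (runCycle alpha) state

// Training: repeat the rotation a fixed number of times, again as a fold.
let train alpha (rng: System.Random) rotations initial (observations: Observation[]) =
    [ 1 .. rotations ]
    |> List.fold (fun state _ -> runRotation alpha rng state observations) initial
```

With this shape there is no `mutable` keyword at the top level; the only in-place updates are inside `shuffle`, on an array nothing else can see.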