For our first ever data science happy hour, we invited Sean Reed, a Lead Instructor at Galvanize, Inc., to teach us about Artificial Neural Networks (ANNs). Sean is an organizer and instructor for the Python + Data Science meetup group, and holds a Master's in Economics from NYU and a B.S. in Physics from Fordham University. He got his start in data science by plunging straight into the study of neural networks. (With hindsight, he does not recommend starting with such a complicated entry point!)
Sean took us through a discussion of current and past research in neural networks. Starting in the mid-to-late 1950s, scientists studied cats and cows, probing their brains to figure out where the electrical signals went. A lot of that early research informs the modern field of ANN study.
An ANN, now known more commonly as simply a neural network, is a machine learning technique that mimics brain systems in a scaled-down way, enabling a computer to “learn” from observational data. In a basic feedforward model, a set of inputs is passed through the network and the classifier tells the user what those inputs are (e.g. “this set of pixel inputs is classified as a cat”).
- This more detailed diagram shows that the activation function within the hidden layer determines what signal passes through to the next layer.
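To make the feedforward idea concrete, here is a minimal sketch of a single forward pass in plain numpy. The layer sizes and random weights are purely illustrative stand-ins (not anything Sean showed), but they capture the input → hidden → output flow:

```python
import numpy as np

# Minimal sketch of one forward pass through a tiny feedforward network:
# 4 inputs -> 3 hidden neurons -> 2 output classes. Weights are random
# placeholders; a real network would learn them from data.
rng = np.random.default_rng(0)

x = rng.random(4)                 # input vector (e.g. 4 pixel values)
W_hidden = rng.random((3, 4))     # weights: input -> hidden layer
W_output = rng.random((2, 3))     # weights: hidden -> output layer

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

hidden = sigmoid(W_hidden @ x)    # the activation decides what signal passes on
scores = W_output @ hidden        # raw output scores, one per class
print(scores.argmax())            # index of the predicted class
```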
Sean covered several activation functions that work better than the linear feedforward model above. He explained that what actually works better is a non-linear function, even if it behaves less like the human brain.
- The traditional activation function is the logistic (sigmoid)
- Now the two more common choices are the hyperbolic tangent (tanh) and the Rectified Linear Unit (ReLU)
According to Sean, “there’s all sorts of different hipster functions that people like to use these days.” Below are some common models.
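For reference, here is a quick numpy sketch of the three activation functions mentioned above; the sample inputs are just illustrative:

```python
import numpy as np

def logistic(z):          # the traditional "sigmoid", squashes values into (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

def tanh(z):              # hyperbolic tangent, squashes values into (-1, 1)
    return np.tanh(z)

def relu(z):              # rectified linear unit: zero out anything negative
    return np.maximum(0.0, z)

z = np.linspace(-3, 3, 7)
print(logistic(z), tanh(z), relu(z), sep="\n")
```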
Expanding on the input → hidden layer → output layer diagram above, there are now models with multiple hidden layers, each layer abstracting from the one before it. The more hidden layers and neurons, the more complex the problems the model can solve. This area of study is known as deep learning.
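As a rough sketch of what “multiple hidden layers” looks like in code, here is the earlier forward pass generalized to an arbitrary stack of layers. The layer sizes are again arbitrary assumptions made for illustration:

```python
import numpy as np

def relu(z):
    return np.maximum(0.0, z)

def forward(x, layers):
    """Pass an input through an arbitrary stack of hidden layers."""
    activation = x
    for W in layers:
        activation = relu(W @ activation)   # each layer re-abstracts the previous one
    return activation

rng = np.random.default_rng(0)
x = rng.random(8)
# Three hidden layers of made-up sizes; deeper stacks are what "deep learning" refers to.
layers = [rng.random((16, 8)), rng.random((16, 16)), rng.random((4, 16))]
print(forward(x, layers).shape)   # -> (4,)
```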
Sean also showed us a great website for playing with neural networks: https://playground.tensorflow.org/
While interesting, this type of model is wholly inappropriate for processing images: even an 80 × 80 pixel image requires too many weights and too much computing power.
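A back-of-the-envelope calculation shows why. The hidden-layer size of 1,000 below is an arbitrary illustrative choice:

```python
# Fully connected first layer on an 80 x 80 grayscale image.
inputs = 80 * 80                  # 6,400 input neurons, one per pixel
hidden = 1_000                    # assumed hidden-layer size, for illustration
weights = inputs * hidden         # 6,400,000 weights in the first layer alone
print(inputs, weights)
```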
So what kind of structure can you use to understand images?
Finally, we touched briefly on Convolutional Neural Networks (CNNs), which have a radically different structure from plain ANNs. A CNN lets the computer choose which filters to apply to the data, because the computer knows what answer it is trying to reach in the end (e.g. “this image is a cat”). By adjusting the filters to minimize its error, the network learns which features matter. CNNs are now the most widely used technique for image processing.
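To give a feel for what a “filter” is, here is a naive numpy sketch of the convolution operation at the heart of a CNN. The hand-made edge filter is only an illustration; a real CNN learns its filters during training:

```python
import numpy as np

def convolve2d(image, kernel):
    """Naive 'valid' convolution: slide a small filter over the image."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

image = np.random.default_rng(0).random((8, 8))   # stand-in for a tiny image
edge_filter = np.array([[1.0, 0.0, -1.0],          # a hand-made vertical-edge filter;
                        [1.0, 0.0, -1.0],          # a CNN would learn its filters
                        [1.0, 0.0, -1.0]])         # from data instead
print(convolve2d(image, edge_filter).shape)        # -> (6, 6)
```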
Much of that popularity is attributed to AlexNet, the CNN described in a 2012 research paper that was used to win that year's ILSVRC (ImageNet Large-Scale Visual Recognition Challenge).
To conclude, Sean brought the discussion back to practical, real-world applications of neural networks. They’re great for applications such as:
- Image classification
- Language processing (Google Translate)
- AI assistants (Alexa, Siri)
- Detecting medical anomalies (arrhythmias, skin cancers)
- And much more!
We loved having Sean come in to share his knowledge about the fascinating world of neural networks. Thank you, Sean!