Hopfield networks were first introduced in 1982 by John Hopfield [1]. They are differ from many modern neural network architectures. A few of the difference are:
Let's start with a simple example. Hopfield networks are fully connected neural networks, with bidirectional weights. The neuron states are bipolar, either -1 or 1. This differs from most neural networks where the neurons assume continuous values. The weights are calculated using the Hebbian learning rule. They are typically used to perform associative recall. This means that given a noisy or partial input, the network can recall the associated memory. [2]
The following visual shows an example of a Hopfield network through the process of recalling a memory. The network has been trained with this "memory", which is just a greyscale image with 9 pixels.
Given any corrupted or partial version of this input, the hopfield network can "recall" the input overtime through asyncronous updates to the network.
The network recalls the memory by updating the states
Like other neural networks, the hopfield network consists of nodes and weights. The weight matrix of a hopfield network is set with the following rule:
where
We can perform this exercise with our dummy example and we get the following:
Then we set the diagonal to zero:
import numpy as np
# n is the number of nodes (neurons) in the network
# also the size of the memory
X = np.array([[-1,1,-1], [-1,1,-1], [-1,1,1]])
n = X.flatten().shape[0]
weights = np.outer(sample, sample) / n
np.fill_diagonal(weights, 0)
Now our simple hopfield network is trained and ready to recall the memory. We can walk through the process of recalling the memory of the network using asyncronous updates.
The recall process starts with a partial or corrupted "memory", and slowly reconstructs the full memory. In our case, we only have one memory, so the network will converge to the same state regardless of what we start with. We will choose a randomly initialized state vector to start with.
At each time step, a node is chosen at random and its state is updated. It is updated according to the following rule [3]:
i = np.random.randint(n)
net_input = np.dot(weights[i], states)
states[i] = 1 if net_input > 0 else -1
The s vector is repeatedly updated until the network converges to a stable state. The following visual shows the network converging to the memory state.
The convergence of the network to a stable state can be thought of as a minimization of the "energy" function of the network. Throughout the training process, energy should be non increasing at each time step. The energy function is defined as:
Or in vector form:
states = states.reshape(-1,1)
energy = -0.5 * np.dot(states.T, np.dot(weights, states))
The energy of the network can be thought of as how "stable" the network is. The lower the energy, the more stable, and the closer the network is to converging on a memory state. Hopfield networks are designed to minimize this energy function.
Hopfield networks are particularly interesting to me becuase they are both non-differentiable, and as a recurrent network, they allow for variable numbers of computations before converging on a solution. This is in contrast to feedforward networks which have a fixed number of computations.
Be sure to check out the sources below to see the inspirations and resources used in this article. The code for this hopfield network is on my github.