Extra Credit: Simulated Annealing

Extra Credit: Simulated Annealing#

Extra Credit

This page is 10 points extra credit

The simulated annealing (linked from bottom) is an extra 10 points extra credit

The Ising model is useful for lots of things, one of which is helping us learn about optimization.

The goal of optimization is to extremize (maximize or minimize) a particular function O, called an objective function, which depends on a bunch of parameters \((p_1,p_2,...,p_n\)). One way to go about this is to compute the objective function’s derivatives with respect to the parameters, \(\partial O/\partial i\). When fed a particular vector of parameters \(p\), the derivatives identify the direction in which O changes most significantly. Iteratively “walking” in this direction — a procedure called gradient descent (for minimization) — leads to an extremum of \(O\). Unfortunately this procedure only finds local extrema; it can get you up the nearest hill or down the nearest valley, but there might be a mountain or gorge someplace further away that you’ve missed. Avoiding this situation requires an algorithm that finds global extrema. In addition, this only works when the optimization problem you are solving is continuous

In general it is impossible to find global extrema quickly (although the algorithm we will explore in this section can go about it slowly).

There are a number of heurestic algorithms that allow us to find global extrema, many of which are inspired by physics or nature. The most famous of these is simulated annealing. In simulated annealing you essentially run a simulation at very high temperature and then slowly cool things down. The intuition behind this is that at high temperatures you will be able to sample the entire landscape of the objective function, and then as you cool down you end up falling into the lowest minima.

In our case, our objective function will be to find the configuration of spins in an Ising model (our parameters) which minimizes the energy (our objective function). In the model that we’ve studied thus far, where all the coefficients \(J\) are the same, this isn’t very interesting (you can probably find the ground state without a computer). On the other hand, if we choose random signs on each of the coefficients, finding the ground state will be hard.

Grading

Modify your Ising model to work with arbitrary signs on each bond.

Choosing the random coupling constants

For this section, we are going to work with a set of coupling constants that have fixed but randomly selected signs. We will work on a \(27 \times 27\) lattice.

Go ahead and choose these coupling constants. For my code I selected them by choosing

Jh=np.random.choice([1,-1],(27,27))
Jv=np.random.choice([1,-1],(27,27))

and the compute the energy of these spins doing

def Energy(spins):
    rightSpins=np.roll(spins,-1,axis=0)
    downSpins=np.roll(spins,-1,axis=1)
    return np.sum(Jh*spins*rightSpins*-1)+np.sum(Jv*spins*downSpins*-1)

As a sanity check, you can also compute the energy the slow way as

def Energy_slow(spins):
  e=0
  for i in range(0,27):
    for j in range(0,27):
      e=e+spins[i,j]*spins[(i+1) % 27,j]*Jh[i,j]*-1
      e=e+spins[i,j]*spins[i,(j+1) % 27]*Jv[i,j]*-1
  return e

and should get the same thing.

In addition to working with your own coupling constants, you can use the set we’ve randomly generated here.

  • File(s) of our coupling constants:

  • A file of our initial configuration of spins (this is only for testing):

You can load these in the following way:

spins=np.loadtxt("ParallelTemperingSpins.dat")
Jh=np.loadtxt("ParallelTemperingJh.dat")
Jv=np.loadtxt("ParallelTemperingJv.dat")
#plt.matshow(spins)
#plt.show()
#plt.matshow(Jv)
#plt.show()
#plt.matshow(Jh)
#plt.show()

For these coupling constants with that initial spin configuration, we get an energy of 16 (after you check the energy you should reset the spin configuration).

Local Optimization#

Let’s start with a poor approach to optimization. Start in the configuration where all spins are up. Loop through the spins over and over again, flipping one anytime its opposite orientation reduces the energy (i.e. running at large beta). Stop after visiting N sequential spins without having performed any flips. This is a local minimum.

Then go ahead and start with various random configurations. Do you get to the same local minima each time? If not, this means that typically you’re not finding global minima.

Grading

Run your local optimization with 100 different random initial conditions.

you need to set up a set of random coupling constants and tried to find minima using local optimization. To do this, you should start with 100 different initial conditions and run your Ising model at a high temperature (say T=1/1000). You can run until you can’t decrease the energy any longer (say 1000 sweeps are constant). With our coupling constants we find this takes ~3000 sweeps to converge.

Histogram the hundred different local minima that you get.
You will potentially get different results but we get local minima between -1020 and -980 using our coupling constants.

Simulated Annealing#

Simulated annealing modifies the above algorithm in a very natural way. Instead of running at zero temperature (which always goes downhill), in simulated annealing we start by running at a high temperature. Over time, we slowly decrease the temperature until we get to zero temperature. The intuition here is that we end up in a deep minima. In fact, you can show that (in some unrealistic cases where you cool very slowly) you are guaranteed to get a global minima.

The algorithm concretely is:

  • Run Monte Carlo at high temperature \(T\) for a while

  • Every so often decrease the temperature

  • When you get to \(T=0\) which is equivalent to our local optimization) stop after your optimization doesn’t go down any more.

Grading

Now using the same random coupling constants as the \(T=0\) case, try to find a minima using simulated annealing. Start with running 1000 sweps at each beta and use a linear temperature schedule in beta - i.e betas=np.arange(0,4.0,0.01)

Do a plt.matshow of your final spins.

Using simulated annealing under this temperature schedule we find that we get an energy of -1024 for our coupling constants.

Simulated annealing is relatively sensitive to the temperature schedule that you are using. Here is a list of some temperature schedules you can use. In each case take \(T_0=1000\). Here we have that \(k\) is round \(k\) (starting at \(k=0\)) where each round is 1000 sweeps.

  • \(T_k = T_0 \alpha^k\) where \(\alpha =0.85\)

  • \(T_k=T_0/(1+\alpha \log(1+k))\). Try \(\alpha=1\) (provably will converge to the global minima if you take enough steps (but it takes too long to take enough steps)) and \(\alpha=1.5\).

  • \(T_k = T_0 /(1+\alpha k)\) for \(\alpha=0.5\)

  • \(T_k = T_0/(1+ \alpha k^2)\) for \(\alpha=0.5\)

To have a fair test, use the same coupling constants.

Grading

Use these five different temperature schedules each once. Graph (up to 100 temperatures each) the energy as a function of temperature.

Simulated annealing is used for a wide variety of tasks. Here you can use it to solve a cartoon model of protein folding for 10 more extra credit points