Accelerating simulated quantum annealing on AWS Graviton processors

9 months ago 52
News Banner

Looking for an Interim or Fractional CTO to support your business?

Read more

This post was contributed by Kohji Nishimura, CTO, Jij, Yoshitaka Haribara, Senior Startup ML/Quantum Solutions Architect and Perminder Singh, Senior Partner Solutions Architect, Quantum, at AWS

Numerical optimization is the process of minimizing or maximizing a given cost function, subject to any given constraints. Business needs across many industry verticals, like manufacturing, logistics, and telecommunication, require solving large-scale optimization problems. A scheduling plan that minimizes traffic congestion in logistics application, or a manufacturing plan that minimizes overall production costs are typical examples of combinatorial optimization – a class of some of the most computationally challenging problems.

Well-established techniques provide algorithmic performance guarantees for specific optimization problems, such as linear programming, where both the cost function and constraints are linear. However, for problems with nonlinear cost functions and constraints, these techniques aren’t suitable. In such cases, heuristics are commonly used to seek approximate solutions. Simulated Annealing (SA), Quantum Annealing (QA) — its quantum analog, and Simulated Quantum Annealing (SQA) — the classical version of QA, are heuristic algorithms designed to approximate solutions for nonlinear optimization problems.

In this post, we demonstrate how SQA can be accelerated by the Arm-based AWS Graviton processors, namely Graviton2 and Graviton3. SQA represents a good example of how we can leverage AWS-designed silicon to improve the implementation of quantum-inspired classical heuristics. We also show that the solution can be seamlessly integrated with JijZept — a cloud service for numerical optimization using quantum technology.

Quadratic unconstrained binary optimization (QUBO) and Ising model

QUBO is an example of a combinatorial optimization problem. In QUBO, one is tasked with minimizing a cost function that is quadratic in binary decision variables qi (each variable can be either 0 or 1). If Qij denotes a square matrixrepresenting a QUBO problem then solving the problem is equivalent to minimizing the cost function, which can be represented as:

The QUBO model is equivalent to the popular Ising model in statistical physics, which is used to describe the behavior of spin systems:

with the transformation si = 2 qi – 1.

Annealing: simulated and quantum

Finding the optimal spin configuration of the Ising model that minimizes the given Hamiltonian H (which also optimizes the corresponding QUBO model) is typically an intractable NP-hard problem.

One of the most popular methods for finding an approximate solution to such problems is using the SA algorithm. This algorithm starts with an initial binary variable (spin) configuration and gradually changes it by selecting a new solution, and accepting or rejecting it based on a probabilistic criterion. This allows the algorithm to escape from local minima and explore the solution space more effectively.

QA is a quantum computing analogue of the SA algorithm. In QA, a system of qubits is time evolved using a Hamiltonian which consists of two parts: the cost function and the driver Hamiltonian, as follows:

The strengths of the driver Hamiltonian and the cost function are tuned by quantum devices so that at the initial point s = 0, the term A(s) is dominant, and at the final point s = sf the term B(s) is dominant. Under the assumption of adiabaticity, the final state of the qubits reaches the ground (lowest energy) state of the cost function Hamiltonian that we want to minimize.

Simulated quantum annealing (SQA)

Currently, applying the QA algorithm to real-world optimization problems using quantum devices is challenging due to limitations in device topology, and the intrinsic noise that affects the QA process. To evaluate the algorithm performance without encountering those challenges, developers created a classical method called SQA to simulate quantum annealing using conventional computers.

SQA algorithm uses the quantum Monte Carlo approach to sample the solution. Applying the Suzuki-Trotter decomposition technique [Trotter (1959), Suzuki (1976)], the original quantum Hamiltonian can be transformed into an equivalent classical Hamiltonian with Trotter slices that characterize the quantum fluctuation. It is important to note that the total number of spins required for the calculation increases by a factor of the number of Trotter slices, and the more Trotter slices are introduced, the more accurately the computation can be performed, especially in regions of low temperature.

Parallelization technique on AWS Graviton processors

One of the main drawbacks of the SQA algorithm is that it requires redundant computational resources to express the Trotter slices. Although the calculation of each Trotter slice can be parallelized using a multi-core CPU, such processors often have a very high cost. However, Graviton processors provide an efficient parallelization solution using its multi-core feature (up to 64 cores in current configurations). Moreover, the latest processor generation, Graviton3, offers up to 2x faster computation than the previous generation, Graviton2. As such, customers can achieve faster computation for their SQA-based optimization problems by using Graviton3 processors.

Jij’s JijZept service

Even with powerful computing devices, there remain many obstacles to real-world optimization applications. These include:

  • Constructing a mathematical model that accurately represents the problem.
  • Converting the mathematical model to the QUBO formulation.
  • Transferring the QUBO model to the solver in a memory-efficient manner.
  • Tuning the hyperparameters of the QUBO-solving algorithm.

Jij’s product, JijZept, helps overcome these bottlenecks.

For instance, consider the Traveling Salesman Problem (TSP), which aims to find the optimal traveling route given distances between each city to be visited. The problem can be formulated as follows:

With the constraints:

Here, the variable xi,t ∈ {0, 1} indicates whether the salesman travels to the i-th city on the t-th route. The variable di,j represents the distance between the i-th city and the j-th city.

We can implement the problem in Python with the JijZept library:

import jijmodeling as jm # define variables d = jm.Placeholder("d", ndim=2) N = d.shape[0] N.set_latex("N") i = jm.Element("i", belong_to=(0, N)) j = jm.Element("j", belong_to=(0, N)) t = jm.Element("t", belong_to=(0, N)) x = jm.BinaryVar("x", shape=(N, N)) # set problem problem = jm.Problem("Traveling Salesman Problem") problem += jm.sum([i, j], d[i, j] * jm.sum(t, x[i, t] * x[j, (t + 1) % N])) problem += jm.Constraint("one-city", jm.sum(i, x[i, t]) == 1, forall=t) problem += jm.Constraint("one-time", jm.sum(t, x[i, t]) == 1, forall=i)

We can easily visualize the modeled object in a Jupyter notebook environment.

Fig. 1 - Graphical representation of the mathematically modeled object within a Jupyter notebook environment. This visualization feature allows users to validate the mathematical model during its construction process.

Fig. 1 – Graphical representation of the mathematically modeled object within a Jupyter notebook environment. This visualization feature allows users to validate the mathematical model during its construction process.

Once the model is constructed, we can send it to the server with only three simple lines:

import jijzept as jz instance_data = {"d": distance} # distance object type: `np.ndarray` with two dimensions. sampler = jz.JijSASampler() sampler.sample_model(problem, instance_data)

This Python code automates:

  • Transferring the mathematical model to the solver in a memory-efficient manner.
  • Converting the mathematical model to the QUBO formulation.
  • Tuning the hyperparameters in the QUBO model.

We can easily switch the solver without changing the mathematical model. Here is an example:

# Using Graviton2 processor sampler = jz.JijAWSGraviton2Sampler() # Using Graviton3 processor #sampler = jz.JijAWSGraviton3Sampler() sampler.sample_model(problem, instance_data)

The more detailed usage is described in the documentation.

Once the problem request is sent to the server, Graviton processors fetch the request and solve the problem, as shown in the architecture diagram.

Fig. 2 - Schematic diagram of the architecture of JijZept. Once a request is submitted, JijZept stores it in a queue system, from which Solver Instances retrieve it to begin the computational process. This approach aims to provide efficient support for multiple requests simultaneously and facilitate automatic scaling capabilities.

Fig. 2 – Schematic diagram of the architecture of JijZept. Once a request is submitted, JijZept stores it in a queue system, from which Solver Instances retrieve it to begin the computational process. This approach aims to provide efficient support for multiple requests simultaneously and facilitate automatic scaling capabilities.

Benchmarking SQA with an instance of the TSP

To evaluate the performance of the SQA algorithm, we experimented using a specific QUBO model to solve TSP defined earlier.

The QUBO model used for benchmarking includes constraints to ensure that all cities are visited exactly once. The QUBO matrix is transformed into a form where the constraint term is converted to penalty terms, which prevents the solution from violating the constraints. The resulting form is given by:

Where λt1 and λi2 are the coefficients of the penalty terms. Note that the TSP in QUBO formulation are chosen as a benchmark purpose; there is room for designing more efficient algorithms for this type of optimization problems [Schuetz et al. (2022)].

Solving this problem (i.e., obtaining the minimum cost function) gives the solution shown in this figure:

Fig. 3 - A typical solution to the TSP, showcasing the cities to be visited as blue dots and the corresponding path, represented by solid orange lines. As evidenced by the figure, the derived solution is nearly optimal.

Fig. 3 – A typical solution to the TSP, showcasing the cities to be visited as blue dots and the corresponding path, represented by solid orange lines. As evidenced by the figure, the derived solution is nearly optimal.

The blue dots indicate cities to be visited, and the orange solid line shows the traveling route specified by the solution xi,t. The total length of the orange solid line is the traveling distance that is to be minimized.

We compared the results of the total traveling route obtained by running the SQA algorithm on Graviton2 and Graviton3 processors with those obtained using a naive SA algorithm on a conventional CPU. The benchmark was performed on TSP instances with 40 cities, and 100 samples of annealing were taken.

To see the experiment code, please refer to the GitHub repository. The file code_with_jijmodeling_graviton.py contains the code used for the benchmark.

The benchmark results appear in the figure that follows, where the horizontal axis represents the calculation time of each sample and the vertical axis indicates the total traveling route’s distance. The dashed lines, dark-shaded areas, and light-shaded areas represent the mean value, standard deviation, and minimum and maximum values of the samples, respectively. We can clearly see that the SQA algorithm produces a smaller traveling distance compared to the SA algorithm, which gets stuck at a distance of around 190. Additionally, we observed that the Graviton3 processor computes around 1.5x faster than the Graviton2 processor even if the experimental condition and parameters are the same.

Fig. 4 Benchmark results of the SA algorithm and the SQA algorithm for the TSP on both Graviton2 and Graviton3. The horizontal axis represents the sampling time, and the vertical axis indicates the total traveling distance. The dashed line indicates the mean value, the dark shaded area represents the standard deviation, and the light shaded area indicates the minimum-maximum value. Under identical conditions, Graviton3 displays 1.5x faster sampling time in comparison to Graviton2.

Fig. 4 Benchmark results of the SA algorithm and the SQA algorithm for the TSP on both Graviton2 and Graviton3. The horizontal axis represents the sampling time, and the vertical axis indicates the total traveling distance. The dashed line indicates the mean value, the dark shaded area represents the standard deviation, and the light shaded area indicates the minimum-maximum value. Under identical conditions, Graviton3 displays 1.5x faster sampling time in comparison to Graviton2.

Conclusion

In this blog post, we have demonstrated the implementation of the SQA algorithm with TSP in QUBO formulation as a benchmark on Graviton processors. We have highlighted the benefits of using Graviton processors, which have a multicore architecture that allows efficient handling of the SQA algorithm. Specifically, we have shown that Graviton3 is 1.5 times faster than Graviton2 for our implementation of the algorithm. These findings highlight the potential of cloud platforms to harness quantum-inspired algorithms effectively. We have also showcased the cloud service JijZept, which enables customers to use the Ising solver without the need to consider the device-specific features of the underlying hardware.

Note that our approach can be applied not only to specific problems but also to various types of real-world applications such as flight scheduling, last-mile package delivery, or work scheduling of workers.

Overall, the combination of quantum-inspired heuristics, such as SQA, and powerful cloud-based platforms, such as Graviton and JijZept, holds great promise for approximating solutions of complex optimization problems in a more efficient and scalable manner. To learn more, please visit the JijZept web page or contact us through the AWS Partner page.

The content and opinions in this blog are those of the third-party author and AWS is not responsible for the content or accuracy of this blog.

Read Entire Article