K-Spin Hamiltonian for Quantum-Resolvable Markov Decision Processes

Eric Jones, Peter Graf, Eliot Kapit, Wesley Jones

Research output: Contribution to journalArticlepeer-review

1 Scopus Citations


The Markov decision process is the mathematical formalization underlying the modern field of reinforcement learning when transition and reward functions are unknown. We derive a pseudo-Boolean cost function that is equivalent to a K-spin Hamiltonian representation of the discrete, finite, discounted Markov decision process with infinite horizon. This K-spin Hamiltonian furnishes a starting point from which to solve for an optimal policy using heuristic quantum algorithms such as adiabatic quantum annealing and the quantum approximate optimization algorithm on near-term quantum hardware. In arguing that the variational minimization of our Hamiltonian is approximately equivalent to the Bellman optimality condition for a prevalent class of environments we establish an interesting analogy with classical field theory. Along with proof-of-concept calculations to corroborate our formulation by simulated and quantum annealing against classical Q-Learning, we analyze the scaling of physical resources required to solve our Hamiltonian on quantum hardware.

Original languageAmerican English
Article number12
Number of pages9
JournalQuantum Machine Intelligence
Issue number2
StatePublished - 2020

Bibliographical note

Publisher Copyright:
© 2020, Springer Nature Switzerland AG.

NREL Publication Number

  • NREL/JA-2C00-83527


  • K-spin hamiltonian
  • Markov decision process
  • Quantum annealing
  • Quantum approximate optimization algorithm
  • Quantum optimization
  • Simulated annealing


Dive into the research topics of 'K-Spin Hamiltonian for Quantum-Resolvable Markov Decision Processes'. Together they form a unique fingerprint.

Cite this