Machine Learning for Quantum Matter

Quantum matter, the research field studying phases of matter whose properties are intrinsically quantum mechanical, draws from areas as diverse as hard condensed matter physics, materials science, statistical mechanics, quantum information, quantum gravity, and large-scale numerical simulations. Recently, researchers interested in quantum matter and strongly correlated quantum systems have turned their attention to the algorithms underlying modern machine learning, with an eye on making progress in their fields. Here we provide a short review of the recent development and adaptation of machine learning ideas for the purpose of advancing research in quantum matter, including ideas ranging from algorithms that recognize conventional and topological states of matter in synthetic and experimental data, to representations of quantum states in terms of neural networks and their applications to the simulation and control of quantum systems. We discuss the outlook for future developments in areas at the intersection between machine learning and quantum many-body physics.


Introduction
Machine learning studies algorithms and statistical models that computers use to perform tasks without explicit instructions [1]. Machine learning technology currently powers an ever-growing number of aspects of our society including web search, virtual personal assistants, traffic predictions, face recognition on social networks, content filtering on commerce websites, email spam filtering, language translation, online fraud detection, and more [2,3]. The origin behind these technological advances can be largely traced back to a series of breakthroughs in artificial intelligence, in particular those based on deep learning, where data is processed through the sequential combination of multiple nonlinear layers [3]. Deep learning has accelerated the adoption of artificial intelligence with notable advances in areas ranging from computer vision [4] and natural language processing [5], to scientific applications such as drug discovery [6] and protein folding [7]. Recently, researchers in quantum matter and strongly correlated systems have turned their attention to the algorithms underlying modern machine learning, with the objective of making progress in quantum matter research. This recent resurgence of research interest at the intersection between strongly correlated systems and machine learning is shaped in part by the commonalities in the structure of the problems that these seemingly unrelated fields attack. For example, the complexity associated with the study of the collective behaviour of many-body systems is reflected in the size of the state space, which grows exponentially with the number of particles. Likewise, the "curse of dimensionality" [1,8] affects, e.g., computer vision and natural language processing, where the size of the space where images and sentences live grows exponentially with the number of pixels and words, respectively. Beyond high dimensionality, many-body systems, as well as systems of interest in machine learning, exhibit correlations and symmetries with strikingly similar structure.
A prominent example of this appears in natural language [9], natural images [10], and music [11], all of which exhibit power-law decaying correlations identical to a (classical or quantum) many-body system tuned at its critical point, which also exhibits spatial correlations decaying with a power law [12]. In the same vein, the adoption of a common set of symmetries simultaneously enriches our understanding of quantum systems [13] and simplifies the computational and sample complexity of certain learning tasks [14,15].
All these common structures suggest that the power and scalability of modern machine learning architectures and algorithms are naturally well-suited to applications in physical systems, in particular to help perform various tasks in strongly correlated systems, quantum matter, quantum information and computation, and statistical physics.
Here we review recent advances in the development and adaptation of machine learning techniques for the purpose of advancing research in these areas. This short review starts with a very brief introduction to the essential ideas in machine learning used throughout, but it does not explain the technical and methodological details. An exceptional discussion of machine learning is available in Ref. [16], where an introduction to the core concepts and tools of machine learning is presented from a physicist's viewpoint. Comprehensive and extremely clear books dealing with modern machine learning ideas include Bishop's Pattern Recognition and Machine Learning [1], The Elements of Statistical Learning [17], and Deep Learning [3]. Whereas this review focuses on machine learning applications to quantum matter and strongly correlated many-body systems, a review dealing with the recent research at the interface between machine learning and the broader physical sciences is available in Ref. [18]. Another related research domain that is not covered in this short review is quantum machine learning, which refers to the development and use of machine learning algorithms on quantum devices; this is reviewed in Refs. [19,20]. Peter Wittek, who sadly disappeared during an expedition on Mount Trishul, and Vedran Dunjko wrote a non-review of quantum machine learning [21], where they provide a perspective on the meaning of quantum machine learning, its key issues, progress, and recent trends. A very thorough review with a strong focus on the applications of machine learning in solid-state materials science can be found in Ref. [22]. A review at the intersection between the broad area of quantum physics and artificial intelligence [23] discusses a growing body of recent work at the intersection between quantum computing and machine learning and how results and techniques from one field can be used to tackle the problems of the other.
This includes quantum computing as a means to provide speed-ups for machine learning problems, machine learning for advancing quantum technology, and quantum generalizations of statistical learning concepts.
After a brief introduction to the main ideas in machine learning, this review is structured according to several intertwined research trends:
• Machine learning in simulations of strongly correlated fermions
• Machine learning phases of matter in simulated and experimental data
• Neural-network quantum states and their applications
• Machine learning acceleration of Monte Carlo simulations
• Quantum information, quantum control, and quantum computation
• Quantum physics inspired machine learning

Machine learning
The field of artificial intelligence deals with the theory and development of computational systems endowed with the ability to perform tasks that typically require human capabilities such as visual and speech recognition, language comprehension, decision making, etc. The field of artificial intelligence has already solved a wide array of problems that are laborious for human beings but straightforward for computers. The solutions to this breed of problems can be described by a list of formal rules that computers can process efficiently. Modern machine learning, instead, deals in part with the challenge of automating the solution of tasks that are in principle easy for humans but that are hard to formally describe by simple rules. Thus, machine learning can be understood as a sub-field of artificial intelligence which studies algorithms, software, and statistical models to automate tasks without explicit instructions.
Machine learning algorithms are typically divided into the categories of supervised, semi-supervised, unsupervised, and reinforcement learning. As we will discuss, algorithms belonging to all these categories have been applied to quantum systems. While for some of these machine learning categories there are no formal differences when described in the language of probability, such a division is often useful as a way to specify the details of the algorithms, the training setup, and the structure of the datasets involved in the learning task.

Supervised Learning
Machine learning problems where the training data encompasses input vectors paired with their corresponding target vectors are known as supervised learning tasks [1]. Starting with a training dataset, the learning algorithm infers a function to make predictions about the output values for unseen input vectors. The system is thus able to infer output vectors for any new input after sufficient training. Examples in this category include classification, where the aim is to assign each input vector to one of a finite number of discrete categories, and regression, where the desired output is a vector with continuous variables. Classification is useful, e.g., for the problem of recognizing images of handwritten digits and assigning them the most likely digit the images represent. Regression can be used to deal with the problem of determining the orbits of bodies around the sun from astronomical data and to extrapolate the value of observables from simulations on finite systems to the thermodynamic limit [24]. Both classification and regression are illustrated in Figs. 1a and 1b.

Figure 1. Illustrating the different categories of machine learning tasks. a. In classification, each learning example is associated with a discrete category or target value, which corresponds to a class. There can be, e.g., two classes in binary classification (red and yellow). The function separating the two classes is called the decision boundary (black curve). b. In regression, each learning example is associated with a real target value. The goal of the model (black curve in the figure) is to estimate the correct output, given an input vector. c. Clustering refers to the task of grouping unlabelled objects so that data in the same group are more similar to each other than to those in other clusters. In the figure there are two clusters identified by green and yellow ovals that naturally group the data (red circles). d. Semi-supervised classification is similar to regular classification, but some data points do not have a label (white circles) and their labels have to be inferred from the data.
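As a concrete illustration of the two supervised task types, the toy sketch below (our own illustrative example, not taken from the reviewed works; all data is synthetic) fits a regression line by least squares and a binary classifier by logistic regression, using only NumPy:

```python
import numpy as np

rng = np.random.default_rng(0)

# --- Regression: fit y = w*x + b to noisy samples by least squares ---
x = rng.uniform(-1, 1, size=100)
y = 2.0 * x + 0.5 + 0.1 * rng.normal(size=100)
A = np.column_stack([x, np.ones_like(x)])            # design matrix [x, 1]
w, b = np.linalg.lstsq(A, y, rcond=None)[0]          # closed-form fit

# --- Classification: logistic regression trained by gradient descent ---
X = rng.normal(size=(200, 2))
labels = (X[:, 0] + X[:, 1] > 0).astype(float)       # two linearly separable classes
theta = np.zeros(2)
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-X @ theta))             # predicted class probabilities
    theta -= 0.1 * X.T @ (p - labels) / len(labels)  # cross-entropy gradient step

acc = np.mean(((X @ theta) > 0) == (labels > 0.5))   # training accuracy
```

The regression part recovers the slope and intercept up to noise, while the classifier learns a linear decision boundary of the kind sketched in Fig. 1a.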

Unsupervised learning
A wide range of tasks in machine learning include situations where the training data is composed of a set of input vectors without a corresponding target output [1]. Unsupervised learning studies algorithms and their ability to infer functions to discover hidden structures in the data [1]. Examples of tasks in unsupervised learning problems include the discovery of groups of similar examples within the data, a task known as clustering (illustrated in Fig. 1c), as well as density estimation, where the objective is to estimate the underlying probability distribution associated with the data. Another useful unsupervised learning task is the low-dimensional visualization of high-dimensional data in two or three dimensions while retaining the spatial characteristics of the original data as much as possible.
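A minimal sketch of clustering, in the spirit of Fig. 1c, is the classic k-means algorithm; the toy NumPy implementation below (an illustrative example, not drawn from any of the cited references) alternates assignment and centroid-update steps on two synthetic blobs:

```python
import numpy as np

rng = np.random.default_rng(1)
# two well-separated Gaussian blobs, a stand-in for the data in Fig. 1c
data = np.vstack([rng.normal(0.0, 0.3, (50, 2)),
                  rng.normal(3.0, 0.3, (50, 2))])

# k-means with k = 2: pick two data points as initial centroids,
# then alternate nearest-centroid assignment and centroid updates
centers = data[rng.choice(len(data), 2, replace=False)]
for _ in range(20):
    d = np.linalg.norm(data[:, None] - centers[None], axis=2)
    assign = d.argmin(axis=1)                 # nearest-centroid assignment
    centers = np.array([data[assign == k].mean(axis=0) for k in range(2)])
```

After a few iterations the two centroids settle near the means of the two blobs, grouping the unlabelled points without any target outputs.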

Semi-supervised learning
Semi-supervised learning falls between supervised and unsupervised learning [25]. It refers to a machine learning approach where a small amount of labelled data is combined with a large amount of unlabelled data during training for classification, as illustrated in Fig. 1d. Semi-supervised learning is thus extremely useful in research areas where the acquisition of labelled data is expensive, e.g., when the collection of labelled data requires a skilled human (e.g., a professional translator), a physical experiment, or an expensive numerical simulation. The costs associated with the labelling process can impede the development of large labelled training datasets, while the acquisition of unlabelled data is relatively inexpensive in some settings. Semi-supervised learning has been applied to the image captioning problem and video transcription. In physics it has been applied to the classification of phases of matter from snapshots of Monte Carlo simulations without labelled data, as well as to the problem of efficiently sampling rare trajectories of stochastic systems [26].
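One simple way to realize the semi-supervised idea is a self-training loop: fit a classifier on the few labelled points, pseudo-label the unlabelled points on which the model is confident, and refit on both. The NumPy toy below is our own hedged illustration of this pattern (the data, confidence threshold, and learning rates are made up):

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(300, 2))
y_true = (X[:, 0] > 0).astype(float)               # ground truth, mostly hidden
labeled = rng.choice(300, size=10, replace=False)  # only 10 labels are revealed

def fit(Xl, yl, steps=300, lr=0.5):
    """Logistic regression (no bias term) by gradient descent."""
    w = np.zeros(2)
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-Xl @ w))
        w -= lr * Xl.T @ (p - yl) / len(yl)
    return w

# 1) supervised fit on the small labelled subset
w = fit(X[labeled], y_true[labeled])
# 2) pseudo-label the confident points and refit on labelled + pseudo-labelled data
p_all = 1.0 / (1.0 + np.exp(-X @ w))
confident = np.abs(p_all - 0.5) > 0.4              # p > 0.9 or p < 0.1
w = fit(np.vstack([X[labeled], X[confident]]),
        np.concatenate([y_true[labeled], (p_all[confident] > 0.5).astype(float)]))

acc = np.mean(((X @ w) > 0) == (y_true > 0.5))     # accuracy on all 300 points
```

The unlabelled points sharpen the decision boundary beyond what the 10 labels alone support, which is the essence of the setting sketched in Fig. 1d.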

Reinforcement learning
Reinforcement learning develops algorithms concerned with the problem of discovering a set of actions that maximize a numerical reward signal [27]. The learning algorithm is never directly exposed to examples of optimal actions to take; it must instead discover them by a process similar to trial and error. In many cases, actions affect not only the immediate reward but also subsequent actions and rewards. In some cases, the reward signal may come only after having executed multiple actions, which may be discrete or continuous and can be high dimensional. Reinforcement learning augmented by deep learning has successfully learned policies from high-dimensional sensory input for game playing, achieving human-level performance in several challenging games including Atari 2600 [28] as well as the board game Go [29]. Reinforcement learning has also been applied to the control of quantum systems as well as to the optimization of quantum error correction codes, one of the key ingredients for fault-tolerant quantum computation.
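A minimal illustration of the trial-and-error idea is tabular Q-learning on a toy chain environment (our own example, not drawn from the cited works): the agent must discover that repeatedly moving right reaches the rewarded terminal state, even though it is never shown the optimal actions.

```python
import numpy as np

rng = np.random.default_rng(3)
# 1D chain of 5 states; reward only on reaching the right end (state 4).
# Actions: 0 = move left, 1 = move right. An episode ends at state 4.
n_states, n_actions = 5, 2
Q = np.zeros((n_states, n_actions))
alpha, gamma, eps = 0.5, 0.9, 0.2      # learning rate, discount, exploration

for _ in range(500):
    s = 0
    while s != 4:
        # epsilon-greedy action selection: explore with probability eps
        a = rng.integers(2) if rng.random() < eps else int(Q[s].argmax())
        s2 = min(s + 1, 4) if a == 1 else max(s - 1, 0)
        r = 1.0 if s2 == 4 else 0.0
        # Q-learning update: bootstrap from the greedy value of s2 (0 if terminal)
        Q[s, a] += alpha * (r + gamma * Q[s2].max() * (s2 != 4) - Q[s, a])
        s = s2

policy = Q.argmax(axis=1)              # greedy policy after training
```

After training, the greedy policy moves right from every non-terminal state, and the learned values decay geometrically with the discount factor away from the reward.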

Machine learning in simulations of strongly correlated fermions
Some of the earliest applications of machine learning techniques to quantum matter arose in the context of molecular systems. Here, machine learning has been used to accurately model atomization energies [30] based on a dataset of energies computed with hybrid density-functional theory. This work demonstrated the potential applicability of supervised machine learning algorithms for the acceleration and prediction of atomization energies across the molecular space and inspired a wide variety of machine learning applications in molecular and materials science [31]. In the context of density functional theory (DFT), a proof-of-principle demonstration based on a system of free fermions showed that density functionals can be accurately approximated using kernel ridge regression [32]. These works motivated a wide array of machine learning applications in data-enabled chemistry and density functional theory [33] with the aim of predicting, accelerating [22,34,35], and improving the prediction of atomic-scale properties of materials and chemical systems (including uncertainty estimation [36]), reaching quantum chemical accuracy using DFT [37]. Proposals to bypass the solution of the Kohn-Sham equations in DFT have demonstrated acceleration of simulations of materials and molecules [38,39]. These approaches reduce the computational cost of DFT to linear in the system size, making it orders of magnitude faster, while providing a high-fidelity emulation of exact Kohn-Sham DFT. Another pioneering machine learning application to strongly correlated quantum many-body systems arose in the context of dynamical mean-field theory [40], where the authors in Ref. [41] successfully applied kernel methods [17] to find the Green's function of the Anderson impurity model.
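To give a flavour of the kernel methods mentioned above, the sketch below uses kernel ridge regression with a Gaussian kernel to learn a simple one-dimensional function from samples; it is a hedged toy stand-in for the functional-learning settings of Refs. [32,41], not a reproduction of their calculations:

```python
import numpy as np

rng = np.random.default_rng(4)
# toy stand-in for learning a functional: map x -> f(x) from sampled values
x_train = rng.uniform(-3, 3, size=40)
y_train = np.sin(x_train)                        # "exact" reference values

def rbf_kernel(p, q, sigma=1.0):
    """Gaussian (RBF) kernel matrix between two sets of 1D points."""
    return np.exp(-(p[:, None] - q[None, :]) ** 2 / (2 * sigma ** 2))

lam = 1e-6                                       # ridge regularizer
K = rbf_kernel(x_train, x_train)
alpha = np.linalg.solve(K + lam * np.eye(len(x_train)), y_train)

x_test = np.linspace(-2.5, 2.5, 50)
y_pred = rbf_kernel(x_test, x_train) @ alpha     # kernel ridge prediction
err = np.max(np.abs(y_pred - np.sin(x_test)))    # worst-case test error
```

With a few dozen samples the kernel model interpolates the target smoothly between training points, which is the basic mechanism these DFT-oriented works exploit on far richer input spaces.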

Machine learning phases of matter in synthetic and experimental data
Machine learning has been actively applied to problems in classical and quantum physics and promises to become a basic research tool with the potential for scientific discovery in the study of strongly correlated systems. One of the most elementary machine learning ideas applied to physical systems is the classification of phases of matter, either in synthetically generated data, experimental data, or a combination of the two. In particular, Ref. [42] demonstrated that neural network technology can be used to encode and discriminate phases of matter and phase transitions in classical and quantum many-body systems, including the determination of critical exponents. Beyond classification of simulated data, Ref. [42] showed that convolutional neural networks can represent ground states of quantum many-body systems, specifically the ground state of the toric code [43]. A wide variety of data-driven machine learning techniques have been applied to simulations of classical and quantum systems based upon innovative supervised, unsupervised, and semi-supervised learning techniques to discover and analyze classical and quantum phase transitions in equilibrium [44-63], including phases of matter characterized by topological order [42,64,65], and out-of-equilibrium systems [48,66-71]. Unsupervised learning has also been used to model the thermodynamic observables of physical systems in thermal equilibrium [72-75]. Additionally, machine learning techniques have been applied to the discovery of physical concepts from data and to the identification of physical theories [76-78], the discovery of symmetries and conserved quantities [79], and even to generating computer-inspired scientific ideas [80].
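As a toy illustration of phase classification (our own simplified sketch, not an implementation of Ref. [42]), one can generate synthetic "ordered" and "disordered" spin snapshots and train a one-parameter logistic classifier on the absolute magnetization per site:

```python
import numpy as np

rng = np.random.default_rng(5)
N = 16 * 16                                    # sites per snapshot

def snapshots(p_align, n):
    """n synthetic +-1 spin snapshots; each spin agrees with a random
    global orientation with probability p_align (0.5 = disordered)."""
    sign = rng.choice([-1, 1], size=(n, 1))
    agree = rng.random((n, N)) < p_align
    return np.where(agree, sign, -sign)

X = np.vstack([snapshots(0.95, 100), snapshots(0.5, 100)])
y = np.array([1.0] * 100 + [0.0] * 100)        # 1 = ordered, 0 = disordered

# physics-aware scalar feature: absolute magnetization per site
m = np.abs(X.mean(axis=1))
w, b = 0.0, 0.0
for _ in range(2000):                          # 1D logistic regression
    p = 1.0 / (1.0 + np.exp(-(w * m + b)))
    w -= 0.5 * np.mean((p - y) * m)
    b -= 0.5 * np.mean(p - y)

acc = np.mean(((w * m + b) > 0) == (y > 0.5))
```

The ordered snapshots have |m| near 0.9 while the disordered ones have |m| of order N^(-1/2), so the classifier finds a clean threshold; the neural-network approaches reviewed above extend this idea to phases with no such obvious scalar order parameter.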
While results related to the analysis and processing of simulation data have provided significant insight into the potential for machine learning techniques to impact scientific discovery, one of the most important applications where machine learning can truly excel is in the analysis of complex experimental data, which is often plagued with noise and other imperfections. In this context, machine learning techniques have been applied to the analysis of data coming from ultracold atom experiments taken with a quantum gas microscope [81], demonstrating that machine learning has the capability to distill microscopic mechanisms and hidden order in experimental data [82,83]. In particular, Ref. [82] found evidence that machine learning applied to experimental snapshots from quantum many-body states can help distill the most predictive theory among a multitude of competing theories. Additionally, artificial neural networks and deep-learning techniques have been used to identify quantum phase transitions from single-shot experimental momentum-space density images of ultracold quantum gases, with results that go beyond conventional methods in terms of accuracy in the detection of phase transitions [84]. Another notable application of machine learning to experimental data is in the analysis of complex electronic-structure images [85]. Ref. [85] reports the development and training of a set of artificial neural networks aimed at recognizing different types of order hidden in electronic quantum matter images from carrier-doped copper oxide Mott insulators. Here, the neural networks discovered the existence of a lattice-commensurate, four-unit-cell periodic, translational-symmetry-breaking state. In a similar vein, Ref. [86] demonstrated that statistical learning applied to scanning tunneling microscopy data can be used to uncover relevant electronic correlations in gold-doped BaFe2As2. Ref. [87] uses an autoencoder to extract model Hamiltonians from experimental data and to identify different magnetic regimes in the system. The system considered in Ref. [87] is the spin ice material Dy2Ti2O7, for which the authors find an optimal Hamiltonian capable of accurately predicting the temperature and field dependence of both the magnetic structure and the magnetization, among other properties. The observations in Refs. [85-87] offer a glimpse into the potential for discovery that machine learning techniques such as k-means, principal component analysis [1], neural networks, autoencoders, and other statistical learning techniques [88] can offer as a complementary view on the physical nature of complex phases of matter beyond traditional methods of analysis of experimental data.

Renormalization group and its relation to machine learning
The renormalization group (RG) refers to the mathematical infrastructure that facilitates the study of the changes of a physical system when viewed at different length scales and is currently the theoretical bedrock for our understanding of phase transitions and critical phenomena [89]. RG methods also remain key to our understanding of modern condensed-matter theory and particle physics.
Motivated by the structure and inner workings of deep neural networks, which appear to operate by extracting a hierarchy of increasingly abstract concepts across their layers, a number of studies have established a series of connections between RG and deep learning [90-97]. One of the earliest connections between RG and deep learning appeared in Ref. [90]. The author compares RG and deep learning and establishes that a foundational tensor network architecture, i.e., the multiscale entanglement renormalization ansatz (MERA) [98], can be converted into a learning algorithm based on a generative hierarchical Bayesian network model.
A very influential paper about the connection between RG and deep learning [91] describes an exact mapping between the variational RG and a specific deep learning architecture based on stacked Restricted Boltzmann Machines (RBMs) [99,100]. In addition to the exact mapping, the authors in Ref. [91] numerically explore the training of their deep architecture using data from the two-dimensional Ising model and conclude that these models are trained to approximately implement a coarse-graining procedure similar to Kadanoff's block renormalization.
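The RBMs underlying this mapping are trained as generative models; a minimal sketch of RBM training with one-step contrastive divergence (CD-1) on toy binary data, rather than on Ising samples or the stacked architecture of Ref. [91], might look like:

```python
import numpy as np

rng = np.random.default_rng(6)
n_visible, n_hidden = 8, 4
W = 0.1 * rng.normal(size=(n_visible, n_hidden))   # visible-hidden couplings
a = np.zeros(n_visible)                            # visible biases
b = np.zeros(n_hidden)                             # hidden biases
W0 = W.copy()                                      # keep the initial couplings

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_step(v0, lr=0.05):
    """One contrastive-divergence (CD-1) parameter update on a data batch."""
    global W, a, b
    ph0 = sigmoid(v0 @ W + b)                       # hidden probs given data
    h0 = (rng.random(ph0.shape) < ph0).astype(float)
    pv1 = sigmoid(h0 @ W.T + a)                     # one-step reconstruction
    v1 = (rng.random(pv1.shape) < pv1).astype(float)
    ph1 = sigmoid(v1 @ W + b)
    W += lr * (v0.T @ ph0 - v1.T @ ph1) / len(v0)   # positive minus negative phase
    a += lr * (v0 - v1).mean(axis=0)
    b += lr * (ph0 - ph1).mean(axis=0)

# toy binary data: neighbouring visible units come in identical pairs,
# a crude analogue of the short-range correlations an RBM layer can absorb
data = np.repeat((rng.random((200, n_visible // 2)) < 0.5), 2, axis=1).astype(float)
for _ in range(300):
    cd1_step(data)
```

Each hidden unit can summarize a correlated pair of visible units, which is loosely the sense in which Ref. [91] relates stacked RBM layers to coarse-graining.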
On a more practical level, Ref. [92] proposed an RG scheme based on mutual information maximization that automatically identifies the relevant degrees of freedom for the RG decimation procedure without any prior knowledge about the system. The authors apply their technique to the two-dimensional Ising model and a dimer model. For the dimer model, the algorithm subtly discovers the adequate degrees of freedom for the RG procedure; in contrast to the Ising case, where the degrees of freedom are groups of spins, the correct degrees of freedom to perform RG on are not dimers, but rather effective local electric fields. Ref. [94] introduces a variational renormalization group approach based on a reversible generative model with hierarchical architecture and applies it to the identification of mutually independent collective variables of the Ising model, as well as to the acceleration of Monte Carlo sampling. Such collective variables play an important role in the acceleration of molecular simulations based on metadynamics [101], and thus these machine-learning inspired RG techniques may potentially impact the important area of molecular dynamics.
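For reference, the classic Kadanoff block decimation that these learned schemes automate or generalize can be sketched as a majority rule over blocks of spins (a textbook toy, not the algorithm of Ref. [92]):

```python
import numpy as np

rng = np.random.default_rng(7)

def block_spin(config, b=2):
    """Majority-rule Kadanoff block decimation of an LxL Ising configuration."""
    L = config.shape[0]
    # sum spins inside each b x b block, then take the majority sign
    blocks = config.reshape(L // b, b, L // b, b).sum(axis=(1, 3))
    coarse = np.sign(blocks)
    # break ties (possible for even block sizes) at random
    ties = coarse == 0
    coarse[ties] = rng.choice([-1, 1], size=ties.sum())
    return coarse

spins = rng.choice([-1, 1], size=(8, 8))    # a disordered configuration
coarse = block_spin(spins)                  # 4 x 4 coarse-grained lattice
```

A fully ordered configuration stays ordered under this map, while a disordered one remains featureless; the machine-learning RG schemes above replace the fixed majority rule with a learned, information-preserving coarse-graining.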

Neural-network quantum states and their applications
A central concept common to problems in simulations of quantum matter and quantum technologies is the many-body wavefunction, one of the most complex mathematical objects in physics. Remarkably, the power of important quantum technologies relies on our ability to accurately control, measure, and characterize the wavefunction of large but brittle quantum systems. As a consequence, the accurate specification and control of the state of a quantum system remains a key research topic in several quantum technology labs around the world. In a series of developments, researchers have taken the fundamental viewpoint that the state of a quantum system, whose exponential complexity is reminiscent of the "curse of dimensionality" encountered in machine learning, can be understood as a generative model of phenomena at the microscopic scale. This has led to the idea of neural-network quantum states, which represent a quantum state using neural networks as powerful function approximators for the wavefunction.
The idea of using neural networks to study quantum systems has a relatively long history. Some of the early applications of neural networks in quantum physics appeared in the early 90's, surprisingly during the artificial intelligence winter [102,103]. These studies used regression and data from exact solutions to obtain accurate potential energy surfaces of two-dimensional harmonic oscillators. Despite their simplicity, it was already clear that direct application of neural networks could be used to investigate questions related to basic issues of physics and chemistry.
Some of the earliest neural-network quantum states were introduced in the 90's a well. In Ref. [104] the authors used a fully connected feed-forward neural network depicted in Fig. 2a to represent the wavefunction of a system of a single particle and found accurate solutions to the Schrödinger equation in several scenarios including muonic atoms, the Morse potential, two-dimensional potentials, three coupled unharmonic oscillators, as well as the Dirac equation for muonic atoms. The parameters of the neural network were optimized using backpropagation and the steepest descent method with an objective function similar to functions used in the local-energy method in electronic structure calculations [105]. Likewise, the author of Ref [106] used a feedforward neural network to represent the wavefunction. The parameters of the network t z 5 N 0 7 a L L T 1 w D C H c + 7 l 3 n u C V H A N j v N t r a y u r W 9 s V r a q 2 z u 7 e / v 2 w W F H J 5 m i r E 0 T k a h e Q D Q T X L I 2 c B C s l y p G 4 k C w b j C + K f z u A 1 O a J / I e J i n z Y z K U P O K U g J E G d t 1 r a e 4 J F g G u Y y 9 I R K g n s f l y T / j y M u m c N 1 y n 4 d 5 d 1 J r X Z R w V d I x O U B 2 5 6 B I 1 0 S 1 q o T a i 6 B E 9 o 1 f 0 Z j 1 Z L 9 a 7 9 T E v X b H K n i P 0 B 9 b n D + o X n Q 0 = < / l a t e x i t > < l a t e x i t s h a 1 _ b a s e 6 4 = " l / n y R L 7 Z 5 u n 7 Z T q t 3 n s T H Y m n 5 5 0 = " t z 5 N 0 7 a L L T 1 w D C H c + 7 l 3 n u C V H A N j v N t r a y u r W 9 s V r a q 2 z u 7 e / v 2 w W F H J 5 m i r E 0 T k a h e Q D Q T X L I 2 c B C s l y p G 4 k C w b j C + K f z u A 1 O a J / I e J i n z Y z K U P O K U g J E G d t 1 r a e 4 J F g G u Y y 9 I R K g n s f l y T / j y M u m c N 1 y n 4 d 5 d 1 J r X Z R w V d I x O U B 2 5 6 B I 1 0 S 1 q o T a i 6 B E 9 o 1 f 0 Z j 1 Z L 9 a 7 9 T E v X b H K n i P 0 B 9 b n D + o X n Q 0 = < / l a t e x i t > < l a t e x i t s h a 1 _ b a s e 6 4 = " l / n y R L 7 Z 5 u n 7 Z T q t 3 n s T H Y m n 5 5 0 = " t z 5 N 
0 7 a L L T 1 w D C H c + 7 l 3 n u C V H A N j v N t r a y u r W 9 s V r a q 2 z u 7 e / v 2 w W F H J 5 m i r E 0 T k a h e Q D Q T X L I 2 c B C s l y p G 4 k C w b j C + K f z u A 1 O a J / I e J i n z Y z K U P O K U g J E G d t 1 r a e 4 J F g G u Y y 9 I R K g n s f l y T / j y M u m c N 1 y n 4 d 5 d 1 J r X Z R w V d I x O U B 2 5 6 B I 1 0 S 1 q o T a i 6 B E 9 o 1 f 0 Z j 1 Z L 9 a 7 9 T E v X b H K n i P 0 B 9 b n D + o X n Q 0 = < / l a t e x i t > < l a t e x i t s h a 1 _ b a s e 6 4 = " h P + 6 L r U f 2 z a 3 q t u 1 n d 2 9 / Q P 7 8 K i r k 0 x R 1 q G J S F Q / I J o J H r M O c B C s n y p G Z C B Y L 5 j c F H 7 v g S n N k / g e p i n z J R n F P O K U g J G G d s N r a + 4 J F g F u Y C 9 I R K i n 0 n y 5 p / l I k h n 2 F B + N 4 R w P 7 g n B X T 5 5 l X Q v m q 7 T d O + c e u u 6 j K O K T t A p a i A X X a I W u k V t 1 E E U P a J n 9 I r e r C f r x X q 3 P h a l F a v s O U Z / Y H 3 + A O j X n Q k = < / l a t e x i t > < l a t e x i t s h a 1 _ b a s e 6 4 = " l / n y R L 7 Z 5 u n 7 Z T q t 3 n s T H Y m n 5 5 0 = " t z 5 N 0 7 a L L T 1 w D C H c + 7 l 3 n u C V H A N j v N t r a y u r W 9 s V r a q 2 z u 7 e / v 2 w W F H J 5 m i r E 0 T k a h e Q D Q T X L I 2 c B C s l y p G 4 k C w b j C + K f z u A 1 O a J / I e J i n z Y z K U P O K U g J E G d t 1 r a e 4 J F g G u Y y 9 I R K g n s f l y T / j y M u m c N 1 y n 4 d 5 d 1 J r X Z R w V d I x O U B 2 5 6 B I 1 0 S 1 q o T a i 6 B E 9 o 1 f 0 Z j 1 Z L 9 a 7 9 T E v X b H K n i P 0 B 9 b n D + o X n Q 0 = < / l a t e x i t > < l a t e x i t s h a 1 _ b a s e 6 4 = " l / n y R L 7 Z 5 u n 7 Z T q t 3 n s T H Y m n 5 5 0 = " t z 5 N 0 7 a L L T 1 w D C H c + 7 l 3 n u C V H A N j v N t r a y u r W 9 s V r a q 2 z u 7 e / v 2 w W F H J 5 m i r E 0 T k a h e Q D Q T X L I 2 c B C s l y p G 4 k C w b j C + K f z u A 1 O a J / I e J i n z Y z K U P O K U g J E G d t 1 r a e 4 J F g G u Y y 9 I R K g n s f l y T / j y M u m c N 1 y n 4 d 5 d 1 J r X Z R w V d I x O U B 2 5 6 B I 1 
0 S 1 q o T a i 6 B E 9 o 1 f 0 Z j 1 Z L 9 a 7 9 T E v X b H K n i P 0 B 9 b n D + o X n Q 0 = < / l a t e x i t > < l a t e x i t s h a 1 _ b a s e 6 4 = " l / n y R L 7 Z 5 u n 7 Z T q t 3 n s T H Y m n 5 5 0 = " t z 5 N 0 7 a L L T 1 w D C H c + 7 l 3 n u C V H A N j v N t r a y u r W 9 s V r a q 2 z u 7 e / v 2 w W F H J 5 m i r E 0 T k a h e Q D Q T X L I 2 c B C s l y p G 4 k C w b j C + K f z u A 1 O a J / I e J i n z Y z K U P O K U g J E G d t 1 r a e 4 J F g G u Y y 9 I R K g n s f l y T / j y M u m c N 1 y n 4 d 5 d 1 J r X Z R w V d I x O U B 2 5 6 B I 1 0 S 1 q o T a i 6 B E 9 o 1 f 0 Z j 1 Z L 9 a 7 9 T E v X b H K n i P 0 B 9 b n D + o X n Q 0 = < / l a t e x i t > < l a t e x i t s h a 1 _ b a s e 6 4 = " l / n y R L 7 Z 5 u n 7 Z T q t 3 n s T H Y m n 5 5 0 = " t z 5 N 0 7 a L L T 1 w D C H c + 7 l 3 n u C V H A N j v N t r a y u r W 9 s V r a q 2 z u 7 e / v 2 w W F H J 5 m i r E 0 T k a h e Q D Q T X L I 2 c B C s l y p G 4 k C w b j C + K f z u A 1 O a J / I e J i n z Y z K U P O K U g J E G d t 1 r a e 4 J F g G u Y y 9 I R K g n s f l y T / j y M u m c N 1 y n 4 d 5 d 1 J r X Z R w V d I x O U B 2 5 6 B I 1 0 S 1 q o T a i 6 B E 9 o 1 f 0 Z j 1 Z L 9 a 7 9 T E v X b H K n i P 0 B 9 b n D + o X n Q 0 = < / l a t e x i t > < l a t e x i t s h a 1 _ b a s e 6 4 = " l / n y R L 7 Z 5 u n 7 Z T q t 3 n s T H Y m n 5 5 0 = " t z 5 N 0 7 a L L T 1 w D C H c + 7 l 3 n u C V H A N j v N t r a y u r W 9 s V r a q 2 z u 7 e / v 2 w W F H J 5 m i r E 0 T k a h e Q D Q T X L I 2 c B C s l y p G 4 k C w b j C + K f z u A 1 O a J / I e J i n z Y z K U P O K U g J E G d t 1 r a e 4 J F g G u Y y 9 I R K g n s f l y T / j y M u m c N 1 y n 4 d 5 d 1 J r X Z R w V d I x O U B 2 5 6 B I 1 0 S 1 q o T a i 6 B E 9 o 1 f 0 Z j 1 Z L 9 a 7 9 T E v X b H K n i P 0 B 9 b n D + o X n Q 0 = < / l a t e x i t > < l a t e x i t s h a 1 _ b a s e 6 4 = " l / n y R L 7 Z 5 u n 7 Z T q t 3 n s T H Y m n 5 5 0 = " t z 5 N 0 7 a L L T 1 w D C H c + 7 l 3 n u C V H A N j v N t 
r a y u r W 9 s V r a q 2 z u 7 e / v 2 w W F H J 5 m i r E 0 T k a h e Q D Q T X L I 2 c B C s l y p G 4 k C w b j C + K f z u A 1 O a J / I e J i n z Y z K U P O K U g J E G d t 1 r a e 4 J F g G u Y y 9 I R K g n s f l y T / Convolutional layer Input layer z 1 < l a t e x i t s h a 1 _ b a s e 6 4 = " X g 2 B X I u 9 x + X X 2 n 1 8   4 2 p 0 i y S 9 2 Y S U 1 / g g W Q h I 9 h Y y e 9 q N h D 4 I X 2 a 9 m 5 L v X L F r b o z o G X i 5 a Q C O e q 9 8 l e 3 H 5 F E U G k I x 1 p 3 P D c 2 f o q V Y Y T T a a m b a B p j M s I D 2 r F U Y k G 1 n 8 6 O n q I T q / R R G C l b 0 q C Z + n s i x U L r i Q h s p 8 B m q B e 9 T P z P 6 y Q m v P R T J u P E U E n m i 8  4 2 p 0 i y S 9 2 Y S U 1 / g g W Q h I 9 h Y y e 9 q N h D 4 I X 2 a 9 m 5 L v X L F r b o z o G X i 5 a Q C O e q 9 8 l e 3 H 5 F E U G k I x 1 p 3 P D c 2 f o q V Y Y T T a a m b a B p j M s I D 2 r F U Y k G 1 n 8 6 O n q I T q / R R G C l b 0 q C Z + n s i x U L r i Q h s p 8 B m q B e 9 T P z P 6 y Q m v P R T J u P E U E n m i 8  4 2 p 0 i y S 9 2 Y S U 1 / g g W Q h I 9 h Y y e 9 q N h D 4 I X 2 a 9 m 5 L v X L F r b o z o G X i 5 a Q C O e q 9 8 l e 3 H 5 F E U G k I x 1 p 3 P D c 2 f o q V Y Y T T a a m b a B p j M s I D 2 r F U Y k G 1 n 8 6 O n q I T q / R R G C l b 0 q C Z + n s i x U L r i Q h s p 8 B m q B e 9 T P z P 6 y Q m v P R T J u P E U E n m i 8  4 2 p 0 i y S 9 2 Y S U 1 / g g W Q h I 9 h Y y e 9 q N h D 4 I X 2 a 9 m 5 L v X L F r b o z o G X i 5 a Q C O e q 9 8 l e 3 H 5 F E U G k I x 1 p 3 P D c 2 f o q V Y Y T T a a m b a B p j M s I D 2 r F U Y k G 1 n 8 6 O n q I T q / R R G C l b 0 q C Z + n s i x U L r i Q h s p 8 B m q B e 9 T P z P 6 y Q m v P R T J u P E U E n m i 8    Figure 2. Three examples of neural network architectures to build quantum states. a. A fully connected neural network with 3 layers (input, hidden, and output layer from left to right). 
In a fully connected neural network, every node in a layer is connected to all nodes of the adjacent layers but to none within the same layer. Usually, in a neural network quantum state the input layer corresponds to a spin/electron configuration and the output layer determines the amplitude of the wavefunction for the given configuration. b. A convolutional neural network with one convolutional layer fully connected to the output layer and perceptron activations. The input corresponds to a spin configuration σ (represented as a two-dimensional array of binary values) and the output quantifies the amplitude of the ground state of the toric code, Ψ(σ).
c. An RBM with M hidden neurons and a visible layer with N spins as the input. For each spin configuration σ = (σ^z_1, ..., σ^z_N), the neural network returns the value of the wavefunction Ψ(σ).
were optimized using a micro-genetic algorithm so that the neural network satisfied the Schrödinger equation for a one-dimensional harmonic oscillator. The recent resurgence of interest in machine learning in the physical sciences has motivated a new playground for variational calculations and exact representations of quantum states based on neural networks [100,107]. One of these early examples was developed in Ref. [42], where ground states of the toric code [43] were expressed in terms of a convolutional neural network with the structure depicted in Fig. 2b. Here, the idea is to impose the constraints induced by the toric code Hamiltonian H_toric = −J_p Σ_p ∏_{i∈p} σ^z_i − J_v Σ_v ∏_{i∈v} σ^x_i, in particular those imposed by the J_p term, directly in a convolutional neural network. This solution takes inspiration from the construction of the ground state of the toric code in terms of projected entangled pair states, in that local tensors project out states containing plaquettes with odd parity [108]. The convolutional layer contains sixteen 2 × 2 filters per sublattice with unit stride in both directions and periodic boundary conditions. The outcome of the convolutional layer is fully connected to a perceptron neuron in the output layer, which represents the wavefunction in the computational basis, Ψ(σ). Here σ = (σ^z_1, σ^z_2, ..., σ^z_N) represents a spin-1/2 configuration of the computational basis for N spins.
The most influential study on neural-network quantum states introduced a family of variational wavefunctions based upon a restricted Boltzmann machine (RBM) in Ref. [107]. Originally invented by Paul Smolensky [109] and popularized by Geoffrey Hinton [110], this architecture has recently been repurposed as a representation of quantum states [100,107]. In this context, the RBM has been used to approximate the ground states of prototypical systems in condensed matter physics such as the transverse field Ising and Heisenberg models in one and two dimensions [107,111,112], the Hubbard model [112], models of frustrated magnetism [111], the Bose-Hubbard model [113], and ground states of molecules [114], to model spectral properties of many-body systems [115], as well as to study non-equilibrium properties of quantum systems [107]. It has also been applied to the study of many-body open quantum systems [116][117][118][119], as well as a tool to perform approximate quantum state tomography for many-body systems [120][121][122].
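To make the RBM parametrization concrete, here is a minimal numpy sketch of the (unnormalized) RBM wavefunction amplitude. The biases and couplings below are randomly chosen for illustration, not optimized parameters from any of the cited works.

```python
import numpy as np

def rbm_amplitude(sigma, a, b, W):
    """Unnormalized RBM wavefunction amplitude Psi(sigma).

    Psi(sigma) = exp(sum_i a_i sigma_i) * prod_j 2 cosh(b_j + sum_i W_ij sigma_i),
    with sigma_i = +/-1 a spin configuration, a/b the visible/hidden biases,
    and W the visible-hidden coupling matrix (all illustrative here).
    """
    theta = b + sigma @ W                       # effective field on each hidden unit
    return np.exp(a @ sigma) * np.prod(2.0 * np.cosh(theta))

rng = np.random.default_rng(0)
N, M = 4, 8                                     # 4 visible spins, 8 hidden units
a = rng.normal(scale=0.1, size=N)
b = rng.normal(scale=0.1, size=M)
W = rng.normal(scale=0.1, size=(N, M))

sigma = np.array([1, -1, 1, -1])
amp = rbm_amplitude(sigma, a, b, W)
print(amp > 0)                                  # positive for real parameters
```

In a variational calculation the parameters (a, b, W) would be optimized to minimize the energy, with amplitudes sampled by Monte Carlo rather than evaluated one configuration at a time.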
Additionally, since RBMs are amenable to analytical treatment, they have been used to find exact representations of quantum states of a wide array of quantum many-body systems. These include the ground state of the one-dimensional symmetry-protected topological cluster state [123] and the toric code states [123,124], as well as other topologically ordered states of matter such as the ground states of the double semion and twisted quantum double models, states of the Affleck-Lieb-Kennedy-Tasaki (AKLT) model and the two-dimensional CZX model, states of stabilizer fracton models with fracton topological order [125], and fractional quantum Hall states [126], among others [125,126].
The RBM has also been characterized theoretically, including its representational power as a classical probability distribution [127] and as a quantum state [128]. Its entanglement properties and capacity have been studied in Refs. [124,129], and its relation to tensor-network states has been carefully established in Refs. [124,126,130].
Going beyond RBMs, quantum states based on convolutional neural networks have been shown to accurately model ground states of complex frustrated spin systems [131][132][133], as well as finite-temperature states [134] and bosons on the lattice [135]. Wavefunctions and other neural-network representations of the quantum state based on autoregressive models allow for uncorrelated sampling from the wavefunction, unlike traditional variational Monte Carlo methods [136], where an expensive Markov chain introduces potential biases in the calculation of observables and during the optimization of the quantum state. Examples include a recurrent neural network representation of the quantum state based on generalized measurements [122], as well as neural autoregressive quantum states [137] and recurrent neural network wavefunctions [138,139], both of which produce state-of-the-art approximations to the ground states of prototypical models in condensed matter physics.
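The key property enabling uncorrelated sampling is that autoregressive models factorize the probability into normalized conditionals, so exact samples can be drawn in a single sweep with no Markov chain. A toy sketch (illustrative logistic conditionals, not a trained network) that also checks the built-in normalization:

```python
import itertools
import numpy as np

def logprob(s, weights, biases):
    """Exact log-probability of a binary configuration under the model:
    log p(s) = sum_i log p(s_i | s_1..s_{i-1})."""
    logp = 0.0
    for i in range(len(s)):
        field = biases[i] + weights[i, :i] @ s[:i]
        p1 = 1.0 / (1.0 + np.exp(-field))       # conditional p(s_i = 1 | past)
        logp += np.log(p1) if s[i] == 1 else np.log(1.0 - p1)
    return logp

def sample(weights, biases, rng):
    """One exact, uncorrelated sample: draw each spin from its conditional."""
    n = len(biases)
    s = np.zeros(n)
    for i in range(n):
        field = biases[i] + weights[i, :i] @ s[:i]
        p1 = 1.0 / (1.0 + np.exp(-field))
        s[i] = 1.0 if rng.random() < p1 else 0.0
    return s

rng = np.random.default_rng(1)
n = 6
weights = rng.normal(scale=0.5, size=(n, n))    # illustrative parameters
biases = rng.normal(scale=0.5, size=n)

# Normalized by construction: the probabilities of all 2^n configurations
# sum to one, so the samples need no Markov chain and no reweighting.
total = sum(np.exp(logprob(np.array(c), weights, biases))
            for c in itertools.product([0.0, 1.0], repeat=n))
print(round(total, 6))                          # -> 1.0
```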
When considering wavefunction approximations for fermionic systems such as molecules, real materials, and fermionic models on the lattice such as the Hubbard model, it is fundamental to account for the essential antisymmetry of the wavefunction ansatz. This can be achieved in several different ways. The simplest approach relies on the specification of variational fermion wavefunctions based on Slater determinants arising from a mean-field Hamiltonian that best match the interacting ground state. Naturally, this approach misses some quantum fluctuations, which can be reintroduced through a two-body Jastrow factor supplementing the mean-field treatment; the result is often called a Slater-Jastrow wavefunction. The simplest neural extension of the Slater-Jastrow wavefunction consists in replacing the Jastrow factor with a neural network.
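As a minimal illustration of this neural Slater-Jastrow idea, the sketch below multiplies a determinant of mean-field orbitals from a toy hopping model by a small, randomly initialized "neural Jastrow" factor; the model, network size, and parameters are all illustrative. Because the Jastrow factor depends only on the occupations, the determinant alone carries the antisymmetry:

```python
import numpy as np

rng = np.random.default_rng(2)
L, n_f = 6, 2                                 # 6 lattice sites, 2 spinless fermions

# Mean-field orbitals: lowest eigenvectors of a toy hopping Hamiltonian on a ring.
H_mf = -(np.eye(L, k=1) + np.eye(L, k=-1))
H_mf[0, -1] = H_mf[-1, 0] = -1.0              # periodic boundary conditions
orbitals = np.linalg.eigh(H_mf)[1][:, :n_f]   # L x n_f matrix of orbitals

# Illustrative one-hidden-layer "neural Jastrow" acting on the occupation vector.
W1 = rng.normal(scale=0.1, size=(L, 4))
W2 = rng.normal(scale=0.1, size=4)

def neural_slater_jastrow(sites):
    """Psi(x) = det(Slater) * exp(J(x)); antisymmetry comes from the determinant."""
    slater = np.linalg.det(orbitals[list(sites), :])
    occ = np.zeros(L)
    occ[list(sites)] = 1.0
    jastrow = np.exp(np.tanh(occ @ W1) @ W2)  # symmetric in the particle labels
    return slater * jastrow

# Exchanging the two fermions flips the sign: the ansatz stays antisymmetric.
print(np.isclose(neural_slater_jastrow((0, 3)), -neural_slater_jastrow((3, 0))))
```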
As shown in Ref. [112], a mean-field ansatz supplemented with an RBM leads to a substantial improvement in accuracy beyond that achieved by each method separately in the Heisenberg and Hubbard models on square lattices.
Originally introduced by Feynman and Cohen [140], the backflow transformation adds correlation to a mean-field ground state by transforming the single-particle orbitals in a configuration-dependent way. Luo and Clark [141] introduced a neural-network backflow, where a neural network dresses a mean-field state endowed with Fermi-Dirac statistics, which enables a systematic improvement over mean-field states and provides excellent results on the two-dimensional Hubbard model. Ideas similar to the original backflow idea have also recently been applied to a variety of atoms and small molecules directly in the continuum [142,143]. These fermionic neural networks predict the dissociation curves of simple molecules and the hydrogen chain to significantly higher accuracy than the coupled cluster method of quantum chemistry [143]. Finally, yet another strategy to impose Fermi-Dirac statistics is to transform the original fermionic system into a spin system through a Jordan-Wigner transformation [144]. After transforming the original Hamiltonian to a bosonic one, one can use neural-network quantum states to perform electronic structure calculations, as demonstrated in Ref. [145].
To conclude, we mention a recent review of neural-network quantum states [146], which illustrates various representations for pure and mixed quantum states and discusses their physical properties, along with recent progress in the application of neural-network quantum states to tomography and the simulation of many-body quantum systems.

Machine learning acceleration of Monte Carlo simulations
Quantum Monte Carlo (QMC) refers to a wide array of computational methods based on Monte Carlo techniques aimed at studying quantum many-body systems, including many methodologies to determine ground-state, excited-state, or finite-temperature equilibrium properties, as well as non-equilibrium properties, of a variety of quantum systems. Fermi's famous suggestion of the first QMC algorithm was already acknowledged in a 1949 paper by Metropolis and Ulam [147], yet QMC methods remain a powerful and broadly applicable computational tool for finding accurate solutions of the Schrödinger equation for atoms, molecules, quantum spin systems, and materials.
While Monte Carlo (MC) simulations remain a powerful tool in the study of classical and quantum many-body systems, a key issue with MC is the lack of general and efficient update algorithms for large systems close to critical points and in other challenging statistical physics problems, where MC algorithms suffer from slow convergence [148]. Inspired by recent advances in machine learning, Refs. [149][150][151] simultaneously proposed general-purpose methods to accelerate MC simulations. Ref. [149] introduced a method dubbed self-learning Monte Carlo (SLMC), in which an efficient update algorithm is first learned from training configurations generated in a trial unaccelerated simulation and then used to speed up the simulation. The authors demonstrate their technique on a spin model at the phase transition point, achieving a 10- to 20-fold speedup. Simultaneously, Ref. [150] introduced a general strategy to overcome the MC slowdown by fitting the unnormalized probability of the physical model with a feed-forward neural network. The authors then utilize the neural network for efficient MC updates and to speed up the simulation of the original physical system. This technique was applied to the Falicov-Kimball model [152], where improved acceptance ratios and autocorrelation times near the phase transition point were observed [150].
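A minimal sketch of the SLMC logic on a 1D Ising ring, assuming for illustration that the "learned" effective model is simply a nearest-neighbour coupling with a slightly different strength (in practice the effective couplings are fitted to training configurations). Cheap local updates are performed on the effective model, and a final accept/reject step restores detailed balance with respect to the exact model:

```python
import numpy as np

rng = np.random.default_rng(3)
L, beta = 10, 0.5

def energy(s, J):                      # 1D Ising ring energy with coupling J
    return -J * np.sum(s * np.roll(s, 1))

J_exact = 1.0                          # "expensive" target model
J_eff = 0.9                            # illustrative learned effective coupling

def slmc_step(s):
    """One self-learning Monte Carlo update (sketch of the scheme of Ref. [149]).

    1) Run cheap local Metropolis moves on the *effective* model.
    2) Accept the whole proposal with a ratio that restores detailed
       balance with respect to the *exact* model.
    """
    s_new = s.copy()
    for _ in range(2 * L):             # cheap updates on the effective model
        i = rng.integers(L)
        dE = 2 * J_eff * s_new[i] * (s_new[i - 1] + s_new[(i + 1) % L])
        if rng.random() < np.exp(-beta * dE):
            s_new[i] *= -1
    log_ratio = (-beta * (energy(s_new, J_exact) - energy(s, J_exact))
                 + beta * (energy(s_new, J_eff) - energy(s, J_eff)))
    return s_new if rng.random() < np.exp(min(0.0, log_ratio)) else s

s = rng.choice([-1, 1], size=L)
for _ in range(200):
    s = slmc_step(s)
print(set(np.unique(s).tolist()) <= {-1, 1})
```

The closer the effective model tracks the exact one, the higher the final acceptance rate, which is why training the effective model well is the crux of the method.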
The SLMC scheme has been extended in several directions, including its application to interacting fermion models within the framework of determinant QMC [153]. SLMC has been augmented with deep neural networks and applied to reduce the complexity of simulating quantum impurity models [154]. Refs. [155,156] improved the efficiency of SLMC applied to the Holstein model, which represents one of the most fundamental many-body descriptions of electron-phonon coupling. The authors of Ref. [155] endow the effective description of the action of the model with physical information by defining an effective bosonic Hamiltonian for the phonon fields which incorporates a global Z_2 symmetry of the original Holstein model. This leads to an outstanding reduction of computational complexity from O(L^11) to O(L^7), enabling the evaluation of the metal-to-charge-density-wave transition temperature to an order of magnitude higher accuracy than previously available [155]. Additionally, SLMC has been extended to the framework of continuous-time MC with an auxiliary field for quantum impurity models [157], as well as to the framework of hybrid MC in the context of first-principles molecular dynamics based on density functional theory [158,159]. SLMC has also been applied to the study of the Gross-Neveu-Yukawa chiral-Ising quantum critical point with critical bosonic modes coupled to Dirac fermions [160], where the authors obtain challenging quantities, including a comprehensive set of critical exponents as well as the conductivity of the Dirac fermions of the theory.
The continuous-time QMC method remains one of the best impurity solvers for dynamical mean-field theory (DMFT). The impurity problem describes how electrons on an impurity site interact with electrons in a bath, and its solution is the most challenging and computationally expensive part of the DMFT approach. Ref. [161] utilizes a machine learning technique, specifically a convolutional autoencoder [3], to reduce the computational complexity of the DMFT procedure. While the machine learning approach is not exact, the authors demonstrate that it retains accuracy in the estimation of important correlation functions, as demonstrated through careful comparisons to the exact solution of the impurity problem.
Projector quantum Monte Carlo (PQMC) techniques are powerful computational methods to simulate properties of quantum many-body systems [136]. The success of these methods crucially relies on our ability to construct an accurate guiding wavefunction, which, in the standard formulation, is optimized in a separate simulation using variational Monte Carlo [136]. Ref. [162] investigates the use of variational wavefunctions based upon unrestricted Boltzmann machines [3] as guiding functions in PQMC simulations of quantum spin models. The authors of Ref. [162] demonstrate that using the optimized unrestricted neural-network states as guiding functions leads to an increased efficiency of the PQMC algorithms, drastically reducing the most relevant systematic bias of the algorithm, i.e., the bias due to the finite random-walker population [136]. In a similar vein, Pilati, Inack, and Pieri propose another class of SLMC which augments PQMC with a neural network [163]. Here, the authors of Ref. [163] develop PQMC simulations guided by an adaptive RBM wavefunction that is optimized along the PQMC simulation via unsupervised machine learning, avoiding the need for a separate variational optimization. Thus, beyond demonstrating an excellent convergence of the PQMC procedure, this technique provides an accurate ansatz for the ground-state wavefunction, which is obtained by minimizing the Kullback-Leibler divergence [3] with respect to the PQMC samples.
Deep reinforcement learning has also been used in conjunction with MC simulations [164]. Zhao et al. develop a deep reinforcement learning framework where a machine agent is trained to search for a policy to generate ground states of the square ice model [165], which belongs to the family of ice models used to describe the statistical properties of the hydrogen atoms in water ice. The authors' analysis of the learned policy and the state value function reveals that the ice rule and the loop-closing condition are learned without prior information other than the Hamiltonian of the system. Importantly, the authors envisage that it is possible to extend this framework to other physical models and quantum systems such as quantum spin ice [166], the toric code [43], and other models endowed with physical constraints which induce long autocorrelation times in MC simulations.
Neural autoregressive models have also been applied to the solution of classical statistical mechanics problems in a variational setting [167] and to extrapolate observables beyond the region of training [168]. The method of Ref. [167] extends the variational mean-field approach using autoregressive neural networks; it computes the variational free energy, estimates physical quantities such as the entropy, magnetization, and correlations, and generates uncorrelated samples. The authors of Ref. [167] apply their methodology to several systems, including two-dimensional Ising models, the Hopfield model, the Sherrington-Kirkpatrick model, and the inverse Ising model, where they find excellent agreement between the exact solutions of the models and the neural variational approach. Similarly, Ref. [169] explores autoregressive neural networks for the improvement of classical MC simulations of the two-dimensional Edwards-Anderson spin glass, a paradigmatic model of spin-glass theory. These two examples anticipate that neural autoregressive models have potential applications in important combinatorial optimization and constraint satisfaction problems, where finding the optimal configurations corresponds to finding ground states of glassy problems, and counting the number of solutions is equivalent to estimating the zero-temperature entropy of the system.
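The variational principle behind these neural approaches can be seen with the simplest member of the family, a product (mean-field) ansatz, for which the variational free energy F_q = E_q[E] + β^{-1} E_q[ln q] upper-bounds the exact free energy for any choice of parameters. A toy check by enumeration on a tiny Ising ring (all parameters illustrative):

```python
import itertools
import numpy as np

beta, L, J = 0.7, 6, 1.0

def energy(s):
    return -J * sum(s[i] * s[(i + 1) % L] for i in range(L))

# Exact free energy by enumeration (feasible only for tiny systems).
configs = [np.array(c) for c in itertools.product([-1, 1], repeat=L)]
Z = sum(np.exp(-beta * energy(s)) for s in configs)
F_exact = -np.log(Z) / beta

# Variational free energy of a product ansatz q(s) = prod_i q_i(s_i), the
# simplest member of the family that autoregressive networks generalize.
def variational_F(p_up):
    F = 0.0
    for s in configs:
        q = np.prod(np.where(s == 1, p_up, 1 - p_up))
        F += q * (energy(s) + np.log(q) / beta)
    return F

# F_q upper-bounds F_exact for any choice of the variational parameter.
print(variational_F(0.5) >= F_exact, variational_F(0.9) >= F_exact)
```

A richer autoregressive ansatz tightens this bound; minimizing F_q over its parameters is exactly the training objective used in Ref. [167].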
Lattice field theory has also emerged as an area where machine learning techniques can be used to advance MC simulations, which traditionally suffer from the critical slowing down problem [170,171]. Ref. [171] conceived a Markov chain update scheme using a machine-learned flow-based generative model [172] and applied it to the simulation of a φ^4 theory. Training the model systematically improves autocorrelation times in the Markov chain, including in regions where standard Markov chain MC methods, such as Hamiltonian Monte Carlo (HMC) [173], exhibit critical slowing down. The authors find that their algorithm produces ensembles of configurations that are indistinguishable from those generated using local Metropolis and HMC for a number of physical observables, but leave questions about scalability and wider applicability for future studies. Ref. [170] proposes to reduce the autocorrelation times in lattice field theory simulations via a generative adversarial network (GAN) [174], implemented as an overrelaxation step in combination with the traditional HMC algorithm. This allows the method to meet all the statistical requirements to produce correct results by using the Metropolis-Hastings accept/reject rule. The combination breaks the Markov chain but effectively reduces the autocorrelation time of observables and correctly captures the dynamics of the theory.
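Both the flow-based samplers above and the Boltzmann generators discussed below rely on an invertible map with a tractable change-of-variables density, so the generated samples can be corrected exactly, either by a Metropolis test or by reweighting. A minimal sketch of the reweighting route, using a trivially invertible linear "generator" and a harmonic toy energy so that the answer is known exactly (a real application trains a deep invertible network instead):

```python
import numpy as np

rng = np.random.default_rng(4)
beta = 1.0

def U(x):                          # toy energy; a harmonic well, so the
    return 0.5 * x**2              # reweighted answer is known exactly

# Trivially invertible "generator": x = scale * z with z ~ N(0, 1).
scale = 1.5
z = rng.normal(size=100_000)
x = scale * z
log_px = -0.5 * z**2 - 0.5 * np.log(2 * np.pi) - np.log(scale)  # change of variables

# Importance weights reweight generator samples to the Boltzmann distribution.
log_w = -beta * U(x) - log_px
w = np.exp(log_w - log_w.max())
x2 = np.sum(w * x**2) / np.sum(w)  # <x^2> under exp(-beta*U)/Z, exactly 1 here
print(round(x2, 1))                # -> 1.0
```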
Finally, we highlight the concept of Boltzmann generators introduced in Ref. [175]. Boltzmann generators combine deep learning and statistical mechanics to generate unbiased one-shot equilibrium samples of challenging condensed-matter systems and proteins described by some energy function U(X) at inverse temperature β. This is accomplished by optimizing an invertible neural network [176] to represent a coordinate transformation from the actual system configuration space to a modified space that is easy to sample. Importantly, the model is trained so that low-energy configurations in the so-called latent space are close to each other. Due to the invertibility of the neural model, samples in the latent space Z can be transformed to a system configuration X with approximately the right Boltzmann probability. This approximate sample, together with its model likelihood P_x(X), can be used in combination with a reweighting scheme [148] to produce correct samples from the Boltzmann distribution e^{−βU(X)} for challenging systems, including proteins and other difficult condensed matter systems. The Boltzmann generator is illustrated in Fig. 3.

Quantum information, quantum control, and quantum computation

Measurements
In this section we review recent applications of machine learning ideas to quantum information processing. Owing to the strong connections between machine learning and quantum information, which stem from the fact that probability and statistics lay the foundations of both areas of research, this field has a relatively long history. Thus, our aim is not to review these developments entirely, but to focus on a set of recent and illustrative studies.
Hentschel and Sanders propose to use machine learning technology to tackle the quantum measurement problem, where the aim is to infer parameters of interest from measurements on a quantum system [177]. Beyond estimating the parameters, quantum metrology is concerned with the identification of optimal measurement strategies for a given parameter estimation problem. Ref. [177] tackles the phase estimation problem through particle swarm optimization to autonomously generate adaptive feedback measurement policies for interferometric phase estimation problems. The setup they study corresponds to the estimation of an unknown phase difference φ between the two arms of a Mach-Zehnder interferometer. The authors find that the particle swarm optimization policies achieve an optimal scaling of precision for single-shot interferometric phase estimation.
In Ref. [178], Greplova, Andersen, and Mølmer adapt image recognition algorithms based on neural networks to estimate rates of coherent and incoherent processes in simulated quantum systems from discretized time measurement records. Their neural-network approach translates quantum parameter estimation into a regression problem which, conveniently, does not require the characterization of quantum or classical noise. The neural network used for the parameter estimation comprises a one-dimensional convolutional layer endowed with several filters, followed by a pooling layer and a densely connected layer. The output layer provides the likely candidate values of the parameters characterizing the input signal. A measurement-related study is discussed in Ref. [179], where it is shown that reinforcement learning algorithms can be used to identify optimal quantum controls for precise quantum parameter estimation. These studies showcase the power of reinforcement learning as an alternative to conventional optimal control methods. Machine learning methods have also been used to measure the logarithmic negativity from very few measurements [180].

Quantum state reconstruction
Figure 3. Boltzmann generators. 2. An invertible deep neural network (red and blue blocks) is trained to transform the distribution P_z(Z) to a distribution P_x(X) that approximates the desired Boltzmann distribution of a complex system, e^{−βU(X)}. 3. To compute quantities of practical interest, the samples are reweighted to the Boltzmann distribution [148]. Two configurations of the complex system, e.g., a protein, are depicted in the figure.

The tasks of reconstructing quantum states and processes, known as quantum state and process tomography, respectively, are the gold standards for verification and benchmarking of quantum devices [181]. While powerful, exact quantum

state tomography (QST) has high computational complexity: the number of measurements required for an accurate reconstruction, the time to analyze such measurements, and the memory required to store the resulting state all scale exponentially with the size of the system. This makes traditional QST infeasible for anything except small systems. Machine learning techniques can be used to alleviate the scaling of QST at the cost of assuming that the state under scrutiny possesses a structure amenable to a description using machine learning architectures [121,122,182]. Torlai et al. [182] demonstrated that neural networks can be used to perform QST of entangled states with more than a hundred qubits. This work demonstrated that machine learning enables the reconstruction of traditionally challenging quantities, such as the entanglement entropy, from experimentally accessible projective measurements. Ref. [121] extended the approach to small mixed states, and Ref. [122] provided a scalable way to reconstruct mixed states and introduced a built-in approximate certificate of the reconstruction which makes no assumptions about the purity of the state under scrutiny. The strategy of Ref. [122] can handle complex systems, including prototypical states in quantum information as well as ground states of local spin Hamiltonians. The problem of analysis speed was addressed in Ref. [183], where a machine-learning-based algorithm for QST with adaptive measurements was shown to provide orders of magnitude faster processing while retaining a high reconstruction accuracy. Ref. [184] constructed a neural-network-based QST framework from a set of coincidence measurements. The authors consider the situation where a number of the projective measurements are not performed, which corresponds to the task of reconstructing a density matrix from informationally incomplete projective data.
The authors find a dramatic improvement in the average reconstruction fidelity even when only a small fraction of the total measurements are performed, which suggests that the power and generalization ability of neural networks aid the reconstruction in the absence of an informationally complete measurement.
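The exponential cost of exact QST can already be seen in the textbook linear-inversion formula: one qubit needs three Pauli expectations, but n qubits need 4^n of them. A minimal sketch of the single-qubit case:

```python
import numpy as np

# Pauli matrices
I2 = np.eye(2)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]])
Z = np.array([[1, 0], [0, -1]], dtype=complex)

def reconstruct(rx, ry, rz):
    """Linear-inversion tomography of one qubit from Pauli expectations.

    rho = (I + rx X + ry Y + rz Z) / 2.  For n qubits the analogous formula
    requires 4^n Pauli-string expectations -- the exponential cost that
    neural-network QST tries to sidestep by assuming structure in the state.
    """
    return 0.5 * (I2 + rx * X + ry * Y + rz * Z)

# Example: expectations of the |+> state are (rx, ry, rz) = (1, 0, 0).
rho = reconstruct(1.0, 0.0, 0.0)
plus = np.array([1, 1]) / np.sqrt(2)
print(np.isclose(plus.conj() @ rho @ plus, 1.0))   # -> True
```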
Most of the studies discussed so far have demonstrated the utility and scalability of machine learning approaches to quantum state reconstruction by providing proof-of-principle demonstrations based on numerically generated data. Here, we highlight Ref. [185], which demonstrated quantum many-body state reconstruction from experimental data generated by a programmable quantum simulator based upon Rydberg atoms. The experiment, which uses 8 and 9 atoms and has access to only a single measurement basis, applies a novel regularization technique to mitigate the effects of measurement errors in the training data. By exploiting structural information about the state produced by the experiment, the quantum state reconstructions enable the inference of one- and two-body observables not directly accessible to experimentalists, as well as quantities such as the Rényi mutual information. A schematic depiction of the experiment and the algorithmic architecture of the analysis is shown in Fig. 4.

Quantum error correction
Quantum error correction (QEC) promises to help protect quantum information from errors due to decoherence and quantum noise in quantum computation [186]. QEC is an essential ingredient for the future of fault-tolerant quantum information processing. Machine learning techniques can help develop fast and flexible decoding algorithms for a wide variety of quantum error correcting codes [187][188][189][190][191][192][193]. Torlai and Melko [187] first devised an algorithm for error correction in topological codes where the decoder is constructed from an RBM, as summarized below.

Figure 4. Setup of Ref. [185]. Individual 87Rb atoms (depicted as red circles) are trapped in an array of optical tweezers (depicted as up/down triangles behind the red atoms) and coupled to a Rydberg state with Rabi frequency Ω. Fluorescence imaging provides noisy measurements in the σ^z basis. The RBM (blue, hidden variables h; green, visible spins σ) represents the reconstructed quantum state via a set of parameters λ. The binary data τ accessible to the experimentalist are included as a noise layer (yellow neurons). Training on these data, the RBM learns a representation of the experimental quantum state, which can be used to evaluate observables Ô and Rényi entropies.

Ref. [187] studies the two-dimensional toric code [43] and considers the simple phase-flip channel described by a Pauli operator where σ^z is applied to each qubit with probability p_err. This operation produces error chains e, whose boundary is called a syndrome S(e) and can be accessed experimentally without destroying the quantum state. Error correction consists of applying an operator whose chain r generates the same syndrome. The recovery operation succeeds if the logical information in the code is unchanged by the operation. Datasets of error chains and their syndromes D = {e, S} are produced via numerical simulation. The dataset is used to train a model to approximate the underlying probability distribution p_data(e, S) with a neural network.
Once the model is trained, i.e., p_model(e, S) ≈ p_data(e, S), the model takes an input syndrome S_0 and samples the distribution p_model(e|S_0) until it generates an error chain e_0 compatible with S_0. The resulting error chain e_0 is selected for the recovery operation. Numerical results show that the neural decoder has a logical failure probability that is close to that of the minimum-weight perfect matching procedure [194]. Building on these ideas, Ref. [188] provides a summary of efforts to apply machine learning techniques to error correction, introduces several decoding algorithms based on deep neural decoders, and applies them to several fault-tolerant error correction protocols.
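A toy sketch of this decoding logic on a 1D ring of qubits, with the RBM of Ref. [187] replaced by brute-force enumeration over error chains (feasible only at this tiny size; the neural decoder samples p(e|S) instead of enumerating it):

```python
import itertools
import numpy as np

n, p_err = 5, 0.1          # 5 qubits on a ring, illustrative error rate

def syndrome(e):
    """Boundary of an error chain: syndrome bit i fires when e_i != e_{i+1}."""
    return tuple(e[i] ^ e[(i + 1) % n] for i in range(n))

def decode(s0):
    """Toy stand-in for the neural decoder: return the most probable error
    chain among all chains compatible with syndrome s0 (brute force)."""
    best, best_p = None, -1.0
    for e in itertools.product([0, 1], repeat=n):
        if syndrome(e) == s0:
            p = np.prod([p_err if b else 1 - p_err for b in e])
            if p > best_p:
                best, best_p = e, p
    return best

# A single flip on qubit 2 fires syndrome bits 1 and 2; the decoder recovers it.
e_true = (0, 0, 1, 0, 0)
print(decode(syndrome(e_true)) == e_true)   # -> True
```

On the ring the syndrome fixes the error chain up to a global flip, so the decoder's job reduces to picking the likelier of two candidates; on the toric code the degeneracy is exponentially large, which is what makes learned decoders attractive.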
Reinforcement learning has also been applied to the quantum error correction problem. Ref. [189] shows how an "agent" discovers quantum error correction strategies that protect a collection of qubits against noise. This neural-network-based reinforcement learning approach constitutes a fully autonomous, human-guidance-free approach to the discovery of quantum error correction. Ref. [191] implements a quantum error correction algorithm for bit-flip errors on the toric code using reinforcement learning and finds that the algorithm again comes close to the minimum-weight perfect matching algorithm for code distances up to d = 7. Ref. [193] presents a reinforcement learning framework for optimizing and adapting quantum error correction codes; the algorithm learns to design good error correction codes that make use of a small number of qubits. Ref. [190] introduces a reinforcement learning framework for obtaining classes of decoding algorithms applicable to the fault-tolerant quantum computation setting, which the authors exemplify by utilizing deep Q-learning [28] to obtain surface code decoders for a variety of noise models.
Finally, we mention that Ref. [192] trained neural belief-propagation decoders for quantum low-density parity-check codes. The authors report significant improvements: results on the toric code, the quantum bicycle code, and the quantum hypergraph product code all show orders-of-magnitude enhancements in decoding accuracy.
In summary, the results in this subsection indicate that machine learning algorithms combined with domain knowledge of quantum error correction represent a viable route to finding decoding schemes that perform on par with hand-crafted algorithms. These strategies open up the possibility of developing future machine learning decoders for more general error models and error correcting codes.

Quantum Control
Quantum control, the precise manipulation of physical systems whose behaviour is prescribed by the laws of quantum mechanics, has been a significant goal in quantum physics, chemistry, and engineering since the establishment of quantum mechanics [195]. Since optimal control theory lays the foundations of reinforcement learning algorithms to a large extent [27], it is natural to expect that modern reinforcement learning technology can provide a platform for the accurate control of quantum systems. Ref. [196] implements a set of reinforcement learning algorithms and shows that their performance on the task of finding short driving protocols from an initial quantum many-body state to a target state is comparable to that of optimal control methods. The reinforcement learning methods developed in Ref. [196] use a single scalar reward, namely the fidelity of the state produced by simulations of the physical system with respect to the target state. In a similar setting, Ref. [197] provides convincing numerical evidence that such quantum control problems exhibit a universal spin-glass transition in the space of protocols as a function of the protocol duration. The authors suggest that the critical point exhibits a proliferation of protocols with nearly optimal fidelity, though the protocol with the truly optimal fidelity is exponentially hard to locate.
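As a minimal, exactly solvable stand-in for this protocol-search problem, the sketch below exhaustively scores bang-bang protocols for a single qubit driven by an on/off x-field, using the final-state fidelity as the single scalar reward; reinforcement learning becomes necessary precisely when the protocol space is too large to enumerate. All parameters are illustrative, not taken from the references:

```python
import itertools
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)

def evolve(psi, h, dt):
    """Exact one-step evolution under H = -h*X (a 2x2 rotation about x)."""
    theta = h * dt
    U = np.cos(theta) * np.eye(2) + 1j * np.sin(theta) * X
    return U @ psi

psi0 = np.array([1, 0], dtype=complex)      # start in |0>
target = np.array([0, 1], dtype=complex)    # target |1>
dt = np.pi / 8

# Score every bang-bang protocol (field on/off over 6 time steps) by the
# final-state fidelity -- the same scalar reward an RL agent would receive.
best_F, best_protocol = 0.0, None
for protocol in itertools.product([0, 1], repeat=6):
    psi = psi0
    for h in protocol:
        psi = evolve(psi, h, dt)
    F = abs(np.vdot(target, psi)) ** 2
    if F > best_F:
        best_F, best_protocol = F, protocol

print(sum(best_protocol), round(best_F, 3))   # 4 on-pulses suffice -> 4 1.0
```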
In a similar vein, Ref. [198] leverages the power of reinforcement learning to develop a framework that optimizes the speed and fidelity of quantum computation against leakage and stochastic control errors for a broad family of two-qubit unitary gates. The authors' framework showcases an impressive improvement of two orders of magnitude in average gate error with respect to a baseline stochastic gradient descent approach, and up to a one-order-of-magnitude improvement in gate time over optimal gate synthesis counterparts. Similarly, Ref. [199] constructs a deep Q-learning framework to find the optimal time dependence of controllable parameters to implement a local Hadamard gate and a two-qubit CNOT gate. Importantly, Ref. [200] benchmarks reinforcement learning for quantum control against traditional control methods for the problem of preparing a desired quantum state. More specifically, the authors compare the efficacy of three reinforcement learning algorithms, namely tabular Q-learning, deep Q-learning, and policy gradient, with two traditional control methods: stochastic gradient descent and the Krotov algorithm. The authors find that the deep Q-learning and policy gradient algorithms outperform the other techniques when the problem is discretized, namely, when the controls of the problem are discrete. These comparisons shed light on the suitability of reinforcement learning for quantum control problems. Finally, Ref. [201] introduced a deep learning framework based on recurrent neural networks for mitigating noise, and for characterizing and controlling the dynamics of an open quantum system based on measurements.

Quantum circuits and gates
Variational quantum algorithms such as the variational quantum eigensolver (VQE) [202] or the quantum approximate optimization algorithm (QAOA) [203] aim to simulate low-energy properties of quantum many-body systems or to find approximate solutions to combinatorial optimization problems. These families of algorithms represent one of the most promising avenues for observing computational advantages on near-term quantum computers. They employ quantum states produced by low-depth quantum circuits endowed with parameters that are variationally optimized to minimize a cost function, given by the expectation value of an operator over the produced quantum state. While promising, these algorithms still face significant challenges, and in this context machine learning techniques have been applied to several variational quantum state preparation problems. Ref. [204] applies automatic differentiation to a differentiable photonic quantum computer simulator [205] to find circuits of photonic quantum computers that perform a desired transformation between input and output states. Whereas in the case of a single input state the method discovers circuits for preparing a desired quantum state, in the case of several input and output states the method obtains circuits that reproduce the action of a target unitary transformation. Specific examples include learning short-depth circuits to synthesize single photons, cubic phase gates [206], random unitaries, Gottesman-Kitaev-Preskill states [207], NOON states [208], and other states and gates.
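The variational loop common to these algorithms can be sketched for a toy one-qubit problem: a parametrized state |ψ(θ)⟩ = Ry(θ)|0⟩ is optimized to minimize the energy ⟨ψ(θ)|H|ψ(θ)⟩, here by gradient descent with the parameter-shift rule. The Hamiltonian, ansatz, and learning rate are illustrative assumptions, and a real implementation would estimate the expectation values on a quantum device rather than classically:

```python
import numpy as np

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)
H = sz + 0.8 * sx   # toy one-qubit Hamiltonian (an illustrative assumption)

def ansatz_state(theta):
    """|psi(theta)> = Ry(theta)|0> = (cos(theta/2), sin(theta/2))."""
    return np.array([np.cos(theta / 2), np.sin(theta / 2)], dtype=complex)

def energy(theta):
    """Variational cost function <psi(theta)|H|psi(theta)>."""
    psi = ansatz_state(theta)
    return float(np.real(psi.conj() @ H @ psi))

# gradient descent with the parameter-shift rule:
# dE/dtheta = [E(theta + pi/2) - E(theta - pi/2)] / 2  (exact for Ry rotations)
theta = 0.1
for _ in range(200):
    grad = 0.5 * (energy(theta + np.pi / 2) - energy(theta - np.pi / 2))
    theta -= 0.4 * grad

exact = np.linalg.eigvalsh(H)[0]   # exact ground-state energy for comparison
print(energy(theta), exact)
```

For this real symmetric Hamiltonian the single-parameter Ry ansatz is expressive enough to reach the exact ground state; the challenges discussed below arise when the circuits are deep and the parameter landscapes become hard to navigate.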
Ref. [209] considers the challenge of finding good parameter initialization heuristics that improve convergence to a local minimum of the parametrized circuit's cost function. The authors consider a meta-learning [210] approach in which a classical neural network assists the learning of the quantum circuit. The neural network rapidly finds an approximate global optimum of the parameters, which is used as an initialization point for other local search heuristics. This combination yields superior optima of the quantum circuit and accelerates the search by several orders of magnitude with respect to other commonly used search strategies. Yao, Bukov, and Lin tackle a similar problem and show that policy-gradient-based reinforcement learning algorithms are well suited for the optimization of the variational parameters of QAOA in a noise-robust fashion [211]. Their technique is expected to help mitigate the unknown sources of errors in modern quantum simulators. An analysis in the context of quantum state transfer problems demonstrates excellent performance beyond state-of-the-art optimization algorithms.
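A minimal sketch in the spirit of the policy-gradient approach, though not the authors' implementation, is REINFORCE over a discretized grid of depth-1 QAOA angles (γ, β) for MaxCut on a single edge. The angle grid, learning rates, baseline, and two-qubit simulator below are all illustrative assumptions:

```python
import numpy as np

def expected_cut(gamma, beta):
    """Depth-1 QAOA <C> for MaxCut on one edge, simulated on 2 qubits."""
    psi = np.full(4, 0.5, dtype=complex)       # |++> in basis |00>,|01>,|10>,|11>
    zz = np.array([1, -1, -1, 1])              # Z0Z1 eigenvalues
    psi *= np.exp(-1j * gamma * (1 - zz) / 2)  # phase separator exp(-i gamma C)
    rx = np.array([[np.cos(beta), -1j * np.sin(beta)],
                   [-1j * np.sin(beta), np.cos(beta)]])
    psi = np.kron(rx, rx) @ psi                # mixer exp(-i beta X) per qubit
    return float(np.real(psi.conj() @ ((1 - zz) / 2 * psi)))

# discrete action space: a grid of (gamma, beta) pairs
gammas = np.linspace(0, np.pi, 9)
betas = np.linspace(0, np.pi / 2, 9)
acts = [(g, b) for g in gammas for b in betas]
rewards = np.array([expected_cut(g, b) for g, b in acts])  # deterministic here

# REINFORCE over a softmax policy with a running baseline
rng = np.random.default_rng(1)
logits = np.zeros(len(acts))
baseline, lr = 0.0, 0.5
for _ in range(3000):
    p = np.exp(logits - logits.max())
    p /= p.sum()
    a = rng.choice(len(acts), p=p)
    R = rewards[a]
    grad = -p
    grad[a] += 1.0                             # grad of log pi(a)
    logits += lr * (R - baseline) * grad
    baseline += 0.05 * (R - baseline)

best = int(np.argmax(logits))
print(acts[best], rewards[best])
```

The grid contains the analytically optimal pair (γ, β) = (π/2, π/8), where the expected cut value equals 1 for this single-edge instance, so the policy has a well-defined target to concentrate on.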
All in all, the studies described in this section suggest that machine learning technology can be successfully repurposed for controlling brittle quantum devices and quantum computers and can help with the design and improvement of variational algorithms in the era of approximate, near-term quantum computing.

Quantum physics inspired machine learning
The connection between machine learning and physics has a long history beyond the recent adoption of machine learning as a tool to study physical systems. A prominent example of this connection is Hopfield's neural network model of associative memory [212]. Hopfield's model consists of a single layer of fully connected recurrent neurons. The Hopfield network, which is commonly used for autoassociation and optimization tasks, sparked research on the application of spin glass theory to the understanding of neural networks [18].
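Hopfield's associative memory admits a compact sketch: binary patterns are stored via a Hebbian outer-product rule, and a corrupted input is recalled by iterating the sign of the weighted inputs. The synchronous update and the 8-neuron example patterns below are illustrative simplifications of the original asynchronous model:

```python
import numpy as np

def train_hopfield(patterns):
    """Hebbian learning: W is a sum of pattern outer products, zero diagonal."""
    n = patterns.shape[1]
    W = patterns.T @ patterns / n
    np.fill_diagonal(W, 0.0)
    return W

def recall(W, x, steps=10):
    """Synchronous-update recall dynamics (ties broken toward +1)."""
    for _ in range(steps):
        x = np.sign(W @ x)
        x[x == 0] = 1
    return x

# two orthogonal +/-1 patterns on 8 neurons (illustrative)
patterns = np.array([
    [1, 1, 1, 1, -1, -1, -1, -1],
    [1, -1, 1, -1, 1, -1, 1, -1],
]).astype(float)
W = train_hopfield(patterns)

noisy = patterns[0].copy()
noisy[0] *= -1                    # corrupt one bit
print(recall(W, noisy))           # converges back to the stored pattern
```

The stored patterns are fixed points of the dynamics, and the energy function minimized by these updates is precisely the spin-glass-like Hamiltonian that connects the model to statistical mechanics.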
The relation between physics and machine learning has experienced a recent revival, with quantum systems inspiring new breeds of classical and quantum machine learning methods [19,213], though early suggestions of quantum neural networks abound [214][215][216][217]. Here, we focus on classical machine learning methods inspired by quantum physics. In particular, we consider tensor networks, which originated in condensed matter theory and have served as a theoretical tool to simulate and understand the role of entanglement in many-body physics [218]. Stoudenmire and Schwab used the matrix product state representation [219?], also known as the tensor train decomposition in machine learning [220], for supervised learning, specifically for multi-class classification [213]. The strategy is based on non-linear kernel learning, where input vectors x are mapped into a higher-dimensional space via a non-linear function φ(x). This is followed by a decision function f(x) = W φ(x), which classifies the vector x. Here W is a high-dimensional weight vector, which is, in turn, approximated and optimized using tensor network methods; more precisely, the authors express W as a matrix product state. The map φ(x) consists of a tensor product of the same local feature map applied to every component x_j of the vector x, i.e., φ(x)_{s_1 s_2 ... s_N} = φ(x_1)_{s_1} ⊗ φ(x_2)_{s_2} ⊗ ... ⊗ φ(x_N)_{s_N}, where N is the dimensionality of the input vector x. Each component x_j is thus mapped to a d-dimensional vector, where d is known as the local dimension, and the full map can be understood as a vector living in a d^N-dimensional space. The training of the model is based on a sweeping algorithm that sequentially optimizes a few of the local tensors of the matrix product state while leaving the rest fixed, in steps resembling the expectation-maximization algorithm [1].
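The construction above can be sketched as follows, using the local feature map φ(x_j) = (cos(πx_j/2), sin(πx_j/2)) of Ref. [213] and evaluating the decision function by contracting the matrix product state core by core, so that the d^N-dimensional feature space is never formed explicitly. The cores here are random and untrained, and the placement of the label index on the first core, along with all dimensions, is an illustrative choice:

```python
import numpy as np

d, N, chi, L = 2, 6, 4, 3   # local dim, input size, bond dim, number of labels

def local_map(x):
    """Local feature map of Ref. [213] for x in [0, 1]; note ||phi(x)|| = 1."""
    return np.array([np.cos(np.pi * x / 2), np.sin(np.pi * x / 2)])

# random (untrained) MPS cores for the weight tensor W; the label index of
# the decision function f_l(x) is placed on the first core (one common choice)
rng = np.random.default_rng(0)
cores = [rng.normal(size=(L, 1, d, chi))] + \
        [rng.normal(size=(chi, d, chi)) for _ in range(N - 2)] + \
        [rng.normal(size=(chi, d, 1))]

def decision(x):
    """f_l(x) = W_l . phi(x), contracted one core at a time."""
    phis = [local_map(xi) for xi in x]
    v = np.einsum('labr,b->lar', cores[0], phis[0])[:, 0, :]   # shape (L, chi)
    for A, phi in zip(cores[1:], phis[1:]):
        v = np.einsum('lc,cbr,b->lr', v, A, phi)
    return v[:, 0]   # one score per label; argmax gives the predicted class

x = rng.random(N)
print(decision(x))
```

Training would update the cores by the sweeping algorithm described above, optimizing one or two neighbouring tensors at a time against a classification loss while keeping the rest fixed.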
Number-state preserving tensor networks, including matrix product states, tree tensor networks [221], as well as the multi-scale entanglement renormalization ansatz, have also been used successfully for classification [222].
Tensor networks have been successfully applied to several learning tasks, including dimensionality reduction [223], unsupervised learning and generative modelling using matrix product states [224][225][226], representation learning with multi-scale tensor networks [227], sequence-to-sequence learning using matrix product operators [228], language modelling [229,230], and Bayesian inference [231]. Ref. [232] provides a rigorous analysis of the expressive power of various tensor-network architectures for probabilistic modelling, including non-negative matrix product states and Born machines [233]. Ref. [234] introduced a quantum-inspired generative model suitable for raw audio signals. The architecture, which is equivalent to a continuous matrix product state [235], is built from a stochastic Schrödinger equation describing a continuous-time measurement on a quantum system. By construction, the model is autoregressive, which enables exact sampling of the distribution represented by the model. We conclude by mentioning that a tensor-network library for physics and machine learning has recently been developed [236].

Conclusions and outlook
Modern machine learning techniques have started to spread through the landscape of quantum matter and strongly correlated systems research. While cross-fertilization between physics and machine learning predates the recent resurgence of applications to physical systems, the body of recent work reviewed here showcases the opportunities that machine learning techniques, ideas, and research culture can spark in the field of quantum many-body physics. A plausible goal for the near term is the development of models combining quantum many-body physics with machine learning that deliver the accuracy required for the prediction of novel phenomena at the speed of modern machine learning. Applied to the different research areas discussed in this review, these developments have the potential to help us find good approximations to open problems in quantum many-body systems, including frustrated magnetism and fermionic matter in models like the Hubbard and t − J models [237]. They may also significantly ameliorate the critical slowing down problem in classical and quantum Monte Carlo simulations [148]. Furthermore, these ideas may enable the simulation of out-of-equilibrium dynamics of many-body systems beyond what is currently possible, and may ultimately help us delineate the boundary between quantum systems that can be simulated classically and those which would ultimately require quantum computing and quantum simulation strategies.
It is also natural to anticipate further contributions from the physical sciences back to machine learning, as evidenced by the growing number of research studies related to the development and characterization of new physics-inspired machine learning models, their expressive power, and their training strategies. These include classical methods such as tensor networks, but also quantum machine learning algorithms such as the quantum Boltzmann machine [238], the quantum Helmholtz machine [239], and the Born machine [233]. Likewise, tools from statistical mechanics have brought new conceptual advances to the field of deep learning, where questions about the expressiveness of deep neural networks, their information propagation capabilities, and their generalization properties have been studied [18,240,241]. Tangentially, work at the intersection between physics and machine learning may aid interpretability more broadly, since this has been a central topic in the physics context [42,[242][243][244].
Now is a privileged time for strongly correlated quantum systems research due to the enormous opportunities arising from artificial intelligence and quantum computing, two of today's most promising computing paradigms. Artificial intelligence is bustling with invigorating opportunities, research ideas, and research practices with great potential to impact computational and experimental physics research. We are only seeing the beginning of the adoption of machine learning in the study of strongly correlated systems and quantum matter. We anticipate that researchers in condensed matter, quantum information, atomic, molecular, and optical physics, and related areas of science will realize this potential and produce exciting results through a sustained research effort at the intersection between machine learning and the broad area of quantum physics.