Quantum Entanglement of Macroscopic Mechanical Objects

Title: Direct observation of deterministic macroscopic entanglement

Authors: Shlomi Kotler, Gabriel A. Peterson, Ezad Shojaee, Florent Lecocq, Katarina Cicak, Alex Kwiatkowski, Shawn Geller, Scott Glancy, Emanuel Knill, Raymond W. Simmonds, José Aumentado, John D. Teufel

Institution: National Institute of Standards and Technology (NIST)

Manuscript: Published in Science, open access on arXiv

Quantum entanglement is one of the most bizarre and powerful phenomena in quantum mechanics. Over the years, physicists have created and observed entanglement in a wide range of systems, from the spin states of atoms to the polarization of photons. Most experiments to date, however, have studied entanglement in the smallest of microscopic systems, the regime where quantum mechanics is most easily observed. It is much more difficult to observe entanglement in macroscopic objects, where environmental disturbances readily destroy quantum behavior. A recent paper from researchers at NIST reports the observation of exactly such entanglement, between the positions and momenta of two physically separate mechanical oscillators. Entanglement of mechanical oscillators isn’t exactly new: position entanglement was first observed in the vibrational states of two atomic ions back in 2009. But this experiment explores an entirely different regime, where the vibrations are not those of single atoms, but the collective motion of billions of atoms in a macroscopic object.

SEM image of the two aluminum drums, and the complete LC circuit.

The study analyzes the mechanical oscillations of two drum-like membranes. The drums are patterned out of aluminum on a sapphire chip, are roughly 20 microns in length, and weigh roughly 70 picograms. While the drums are tiny to us (each drum is smaller than the width of a human hair), 70 picograms of aluminum corresponds to roughly a trillion atoms, large enough to be considered ‘macroscopic’ for a quantum system. The membranes are designed to oscillate at 11 MHz and 16 MHz, respectively (the frequencies are deliberately different, so that each membrane can be identified). There is a metal base below each drumhead, so that the drumhead and the metal base act like a parallel-plate capacitor. When the drum vibrates, the distance between the plates changes, thereby changing the capacitance. By wiring the drums to a large spiral inductor, we form an LC circuit, which oscillates at a resonant angular frequency given by \omega_c = 1/\sqrt{LC} . The LC circuit in this work is designed to oscillate at 6 GHz. As a drum vibrates, its changing capacitance shifts the resonant frequency of the LC circuit. By probing the circuit frequency, we gain information about the motion of the drum. The device is placed inside a dilution refrigerator, which cools it to temperatures below 10 mK. At this temperature, aluminum is a superconductor and both the circuit and the drums have very few energy loss mechanisms. Once energy enters one of these resonators, it can remain for milliseconds. This gives the resonators narrow resonances in frequency space, making them well-suited to behave quantum mechanically.

Quantum Electromechanics: The Basics

We can measure the quantum properties of this electromechanical system by noting that both the microwave circuit and the mechanical drums are harmonic oscillators, which we can treat quantum mechanically with creation and annihilation operators: \hat{a} for the LC circuit, and \hat{b}_1 and \hat{b}_2 for the two drums. Then the position operator of drum i is given by

\hat{x}_i = x_{0, i}(\hat{b}^{\dagger}_i + \hat{b}_i) ,

and momentum by

\hat{p}_i = ip_{0, i}(\hat{b}^{\dagger}_i - \hat{b}_i) .

Quantum mechanically, the energies of these two oscillators are quantized. The average energy of the circuit is given by \hbar\omega_c (n_c + 1/2) , where n_c is the average number of microwave-frequency photons inside the circuit. The drum energies are given by \hbar\omega_m (n_{m, i} + 1/2) , where n_{m, i} is the average number of phonons in drum i . Basic statistical mechanics tells us that the circuit and drums are naturally in a thermal state, with average photon/phonon numbers given by the Bose-Einstein occupation factor:

n(\omega) = \frac{1}{e^{\hbar\omega/kT} - 1}

At 10mK, the 6GHz circuit is naturally in the ground state, with n_c \approx 0 photons. The lower-frequency drums are more occupied with n_m \approx 20 phonons in each drum. With careful engineering, the authors can control and measure the two-drum system with single-phonon level precision.
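As a rough numerical check of the occupation numbers quoted above, the Bose-Einstein factor can be evaluated directly. This is only a sketch: it assumes the nominal values of 10 mK, 6 GHz, 11 MHz and 16 MHz from the text, and the function name is ours.

```python
import math

HBAR = 1.054571817e-34  # reduced Planck constant, J*s
K_B = 1.380649e-23      # Boltzmann constant, J/K

def thermal_occupation(freq_hz, temp_k):
    """Bose-Einstein occupation n = 1 / (exp(hbar*omega / kT) - 1)."""
    omega = 2 * math.pi * freq_hz
    return 1.0 / math.expm1(HBAR * omega / (K_B * temp_k))

# Nominal values taken from the text; the real device temperature may differ.
n_circuit = thermal_occupation(6e9, 10e-3)
n_drum_1 = thermal_occupation(11e6, 10e-3)
n_drum_2 = thermal_occupation(16e6, 10e-3)

print(f"circuit: {n_circuit:.2e}, drum 1: {n_drum_1:.1f}, drum 2: {n_drum_2:.1f}")
```

At exactly 10 mK this gives a negligible photon number for the circuit and phonon numbers of order ten to twenty for the drums, the same scale as the estimate in the text.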

Schematic drawing of the three peaks in frequency space: center frequency, red sideband at f_c - f_m, and blue sideband at f_c + f_m.

Let’s take a closer look at the circuit frequency measurement. As the vibrations of the drums modulate the LC circuit frequency, this shows up in frequency space as sidebands: two peaks which are separated from the circuit frequency f_c by exactly the mechanical frequency f_m of the oscillator (see image above). We call the peak at (f_c - f_m) the red sideband, and the peak at (f_c + f_m) the blue sideband. By sending a sequence of microwave pulses at these sideband frequencies, the authors are able to initialize, entangle, and read out the motional states of the two drums.

To see how this works, let’s focus on a single drumhead \hat{b} coupled to an LC circuit \hat{a} . If a red sideband pulse is applied, the interaction Hamiltonian is given by

\hbar g(\hat{a}^{\dagger}\hat{b} + \hat{a}\hat{b}^{\dagger}) .

(The derivation is straightforward but too long for this article.)

This acts like a phonon-photon swap operation, where a phonon in the drum is converted into a photon in the LC circuit at rate g , and vice versa. For example, when applied to the state |1_m, 0_c \rangle (1 phonon, 0 photons) for a time t = \pi/(2g) , the resulting evolution gives |0_m, 1_c\rangle (up to a phase). If a blue sideband pulse is applied instead, the interaction is very different:

\hbar g(\hat{a}^{\dagger}\hat{b}^{\dagger} + \hat{a}\hat{b})

(This derivation is likewise straightforward but too long for this article.)

This interaction serves to generate an entangled photon-phonon pair. For example, when applied to the state |0_m, 0_c \rangle , the resulting state takes the form (no normalization for simplicity) |0_m, 0_c\rangle + \sqrt{p} |1_m, 1_c \rangle + \mathcal{O}(p) , where p is the probability of generating an entangled pair.
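Both sideband interactions described above can be checked numerically on a small truncated Fock space. The sketch below is illustrative only: it sets \hbar = 1, uses an arbitrary coupling rate, and truncates each oscillator to a few levels (all of these are our assumptions, not parameters from the paper).

```python
import numpy as np

DIM = 8        # Fock-space truncation per mode (assumption)
G_RATE = 1.0   # coupling rate g in arbitrary units, with hbar = 1 (assumption)

def annihilation(dim):
    """Annihilation operator on a Fock space truncated to `dim` levels."""
    return np.diag(np.sqrt(np.arange(1, dim)), k=1)

# Ordering of the tensor product: drum (phonons) first, circuit (photons) second.
b = np.kron(annihilation(DIM), np.eye(DIM))
a = np.kron(np.eye(DIM), annihilation(DIM))

# The matrices are real, so the transpose is the Hermitian conjugate.
H_RED = G_RATE * (a.T @ b + a @ b.T)     # swap interaction
H_BLUE = G_RATE * (a.T @ b.T + a @ b)    # pair-creation interaction

def fock(n_m, n_c):
    """Basis state |n_m phonons, n_c photons>."""
    state = np.zeros(DIM * DIM, dtype=complex)
    state[n_m * DIM + n_c] = 1.0
    return state

def evolve(hamiltonian, state, t):
    """Apply exp(-i H t) to `state` by diagonalizing the Hermitian H."""
    energies, modes = np.linalg.eigh(hamiltonian)
    return modes @ (np.exp(-1j * energies * t) * (modes.conj().T @ state))

# Red sideband: one phonon becomes one photon after t = pi / (2 g).
swapped = evolve(H_RED, fock(1, 0), np.pi / (2 * G_RATE))
p_swap = abs(np.vdot(fock(0, 1), swapped)) ** 2

# Blue sideband: a short pulse adds a small |1_m, 1_c> component to the vacuum.
gt = 0.1
pair = evolve(H_BLUE, fock(0, 0), gt / G_RATE)
p_pair = abs(np.vdot(fock(1, 1), pair)) ** 2 / abs(np.vdot(fock(0, 0), pair)) ** 2

print(p_swap, p_pair, np.tanh(gt) ** 2)
```

In this model the relative weight of the pair component is \tanh^2(gt) \approx (gt)^2 for short pulses, which plays the role of the pair probability p in the expression above.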

Experimental Sequence

The experimental sequence in this work has three steps: state preparation, in which the drums are actively cooled to their motional ground state; entanglement, in which the motional states of the drums are entangled; and readout, in which the position and momentum fluctuations of the drums are measured. This sequence is repeated a large number of times, and the study looks at the correlations between x_1 , x_2 , p_1 , and p_2 .

State Preparation

Recall that at 10mK, the ~10MHz drums have an average of n_m \approx 20 phonons of vibrational energy. The drums should ideally be in their motional ground state (n_m = 0) to maximize the fidelity of the entanglement protocol. A red sideband pulse can be used to cool the drums to their quantum ground state. Due to the swap interaction described above, a phonon of energy in the drum is converted into a photon of energy in the LC circuit. If the decay rate of the circuit is fast enough (which it is in this experiment), the converted photon is emitted out of the circuit before it can be swapped back into the drum. If the pulse is applied for a long enough time, phonons are continually removed from the drum until there are nearly 0. This ground-state cooling technique was first demonstrated in macroscopic objects 10 years ago, using microwave radiation and even optical radiation, and has worked remarkably well since.


Entanglement

To perform entanglement, the authors implement two pulses in parallel: a blue sideband pulse on drum 1, and a red sideband pulse on drum 2. The blue sideband pulse entangles a phonon in drum 1 and a photon in the LC circuit, then the red sideband converts the photon into a phonon in drum 2. The net effect is to generate a phonon in each of drum 1 and drum 2 which are entangled.


Readout

A blue sideband pulse can be used to measure the position and momentum of the drums (a red sideband pulse can be used for this too, but this work uses a blue sideband scheme). By sending a blue sideband pulse and looking at the reflected signal, the position and momentum of the oscillator can be indirectly probed.

It can be shown that the position and momentum of the drums are imprinted in the two quadratures of the reflected signal. For those unfamiliar, the quadratures of an oscillating signal s(t) refer to the cosine and sine components of the signal:

s(t) = I(t) \cos(\omega t) + Q(t)\sin(\omega t)

I(t) represents one quadrature, Q(t) represents the other. In a blue sideband measurement, I(t) \propto \hat{b}^\dagger + \hat{b} is proportional to position fluctuations and Q(t) \propto i(\hat{b}^\dagger - \hat{b}) is proportional to momentum fluctuations, matching the definitions of \hat{x}_i and \hat{p}_i above. The authors send in a blue sideband pulse and look at the reflected I and Q signals to extract the position and momentum of each drum. These I and Q measurements can be done relatively easily using standard microwave electronics.
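To make the idea of quadratures concrete, here is a minimal sketch of how I and Q can be recovered from a sampled signal by multiplying with the cosine and sine references and averaging. The carrier frequency, sampling grid and amplitudes are made-up illustration values, and real microwave readout chains involve mixers, filters and amplifiers that are not modeled here.

```python
import numpy as np

def demodulate(signal, t, omega):
    """Estimate constant I and Q in s(t) = I cos(wt) + Q sin(wt).

    Assumes the samples cover an integer number of carrier periods.
    """
    i_est = 2 * np.mean(signal * np.cos(omega * t))
    q_est = 2 * np.mean(signal * np.sin(omega * t))
    return i_est, q_est

# Hypothetical test signal: 1 MHz carrier, 50 periods, 64 samples per period.
carrier_hz = 1.0e6
omega = 2 * np.pi * carrier_hz
n_periods, samples_per_period = 50, 64
t = np.arange(n_periods * samples_per_period) / (samples_per_period * carrier_hz)
signal = 0.3 * np.cos(omega * t) - 0.7 * np.sin(omega * t)

i_est, q_est = demodulate(signal, t, omega)
print(i_est, q_est)
```

Averaging over whole periods removes the cross term between cosine and sine, so each estimate picks out exactly one quadrature.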

Pulse sequence for entangling drums 1 and 2. Red indicates a red sideband pulse at frequency f_c - f_m, whereas blue indicates a blue sideband pulse at frequency f_c + f_m.

The full pulse sequence is shown above: it implements ground-state preparation, entanglement, and readout of the two-drum mechanical state. The authors perform this pulse sequence a large number of times, record the values of {x_1, x_2, p_1, p_2} , and plot the results. To show how the positions and momenta of the drums are correlated, the authors plot each data point in phase space, where the (x, y) axes represent different pairs of {x_1, x_2, p_1, p_2} . They do this for two cases, without and with the entangling pulse, and examine the differences between them.


Position/momentum data for the ground state with no entangling pulse applied.

As expected, the positions and momenta of the two drums show no significant correlations in the data with no entangling pulse. The circular shape of the data in phase space indicates that the fluctuations are randomly distributed and uncorrelated. From the magnitude of the fluctuations, the authors can also extract the average occupations of the drums, n_{m, 1} = 0.79 and n_{m,2} = 0.6 phonons respectively, which indicates that the ground-state cooling is pretty successful.

Position/momentum data after an entangling pulse is applied.

The entangling pulse data tells a different story. The positions x_1  and x_2  are clearly correlated, while momenta p_1  and p_2  are clearly anti-correlated. This is a remarkable result as the two drums are physically separated and yet are moving in a coordinated way.

While the position/momentum data is impressive, these correlations could still be classical in nature. To verify that the correlated motion is a result of entanglement, the authors use the covariance matrix C_{ij}  , with elements defined by

C_{ij} = \langle \Delta s_i \Delta s_j \rangle = \langle(s_i - \langle s_i \rangle)(s_j - \langle s_j\rangle)\rangle

where s_i  can represent x_1  , p_1  , x_2  or p_2  . For example, C_{x_1, x_2} = \langle \Delta x_1 \Delta x_2 \rangle   . If two variables, say x_1  and x_2  are not correlated with one another, then C_{x_1, x_2} = 0  . If they are correlated, then C_{x_1, x_2}  will have some nonzero value.
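The covariance matrix defined above can be estimated from repeated shots in a few lines. The data below are synthetic, generated only to illustrate the bookkeeping (correlated positions, anti-correlated momenta); they are not the measured values from the paper.

```python
import numpy as np

rng = np.random.default_rng(seed=0)
n_shots = 100_000

# Synthetic shots (assumption): shared fluctuations plus independent noise.
common_x = rng.normal(size=n_shots)
common_p = rng.normal(size=n_shots)
x1 = common_x + 0.5 * rng.normal(size=n_shots)
x2 = common_x + 0.5 * rng.normal(size=n_shots)
p1 = common_p + 0.5 * rng.normal(size=n_shots)
p2 = -common_p + 0.5 * rng.normal(size=n_shots)

# Rows are the variables in the order (x1, p1, x2, p2); columns are shots.
samples = np.stack([x1, p1, x2, p2])

def covariance_matrix(samples):
    """C_ij = <(s_i - <s_i>)(s_j - <s_j>)>, averaged over shots."""
    deviations = samples - samples.mean(axis=1, keepdims=True)
    return deviations @ deviations.T / samples.shape[1]

C = covariance_matrix(samples)
print(np.round(C, 2))
```

Here the (x1, x2) entry comes out positive, the (p1, p2) entry negative, and the entries mixing position with momentum are close to zero.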

According to the Simon-Duan criterion for entanglement, if the smallest symplectic eigenvalue \nu of the partially transposed covariance matrix satisfies \nu < 1/2 , then the two-drum mechanical state is entangled. Covariance matrices for the two cases are shown below:

Covariance matrix for position/momentum data of the ground state and entangled state.

In the case with no entangling pulse, the position/momentum measurements for drums 1 and 2 are not correlated. The off-diagonal elements are therefore nearly zero, and the covariance matrix is very nearly diagonal. After applying the entangling pulse, the covariance matrix looks quite different: the correlations between x_1 and x_2 and between p_1 and p_2 create off-diagonal elements. The authors find that as the entangling pulse time is increased, the value of \nu decreases below 1/2 , verifying quantum entanglement for long enough entangling pulses. At the longest entangling time measured, \nu is an order of magnitude below the entanglement threshold.
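For readers who want to see how a number like \nu is obtained from a covariance matrix, here is a minimal sketch. It assumes the variable ordering (x_1, p_1, x_2, p_2), a vacuum variance of 1/2, and the standard construction in which partial transposition flips the sign of p_2 and \nu is the smallest symplectic eigenvalue of the result. The test state is an ideal two-mode squeezed state with a hypothetical squeezing parameter r, not the measured data.

```python
import numpy as np

# Symplectic form for two modes in the ordering (x1, p1, x2, p2).
OMEGA = np.kron(np.eye(2), np.array([[0.0, 1.0], [-1.0, 0.0]]))
# Partial transposition of mode 2 flips the sign of p2.
PARTIAL_TRANSPOSE = np.diag([1.0, 1.0, 1.0, -1.0])

def smallest_symplectic_eigenvalue(cov):
    """Smallest symplectic eigenvalue of the partially transposed covariance."""
    cov_pt = PARTIAL_TRANSPOSE @ cov @ PARTIAL_TRANSPOSE
    return np.min(np.abs(np.linalg.eigvals(1j * OMEGA @ cov_pt)))

def two_mode_squeezed_cov(r):
    """Covariance of an ideal two-mode squeezed vacuum (vacuum variance 1/2)."""
    c, s = np.cosh(2 * r), np.sinh(2 * r)
    z = np.diag([1.0, -1.0])
    return 0.5 * np.block([[c * np.eye(2), s * z], [s * z, c * np.eye(2)]])

nu_vacuum = smallest_symplectic_eigenvalue(two_mode_squeezed_cov(0.0))
nu_squeezed = smallest_symplectic_eigenvalue(two_mode_squeezed_cov(0.5))
print(nu_vacuum, nu_squeezed)
```

For this idealized state the result is \nu = e^{-2r}/2 , so any nonzero squeezing pushes \nu below the 1/2 threshold, while two uncorrelated ground states sit exactly at it.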

Pesky Pesky Noise

What makes observing quantum properties in macroscopic objects so difficult in the first place is the presence of environmental noise which corrupts the state of a macroscopic object. Ideally, one would like the measurements {x_1, x_2, p_1, p_2}   to reflect only position/momentum fluctuations, without any additional unwanted fluctuations. In practice, however, the I and Q measurements also contain vacuum noise, so that the position/momentum measurements take the form

x_i = \sqrt{\eta_i}X_i + \sqrt{1 - \eta_i}\xi_{x,i} ,

p_i = \sqrt{\eta_i}P_i + \sqrt{1 - \eta_i}\xi_{p,i} ,

where X_i , P_i are the true values of position/momentum, \xi_{x,i} and \xi_{p,i} are the independent vacuum-noise contributions to the two quadratures of each I/Q measurement (basically just random variables with variance 1/2), and \eta_i is the measurement efficiency. If the value of \eta_i is small, then the measurements of {x_1, x_2, p_1, p_2} are dominated by noise, and true entanglement becomes hard to verify. The measured value of \nu is related to the true value by

\nu_{\mathrm{meas}} = \eta \nu + (1 - \eta)\cdot 1/2

where \eta = \sqrt{\eta_1\eta_2}  is the geometric mean of the efficiencies. The smaller the value of \eta  , the closer \nu_{\mathrm{meas}}  is to 1/2  and the harder it is to verify the \nu<1/2  threshold. The authors show the calculated value of \nu   as a function of entangling pulse time:

Measured values of \nu (left) and extracted true values of \nu (right) vs. entangling pulse duration. \nu < 0.5 indicates the threshold for quantum entanglement.

The authors find that even without calibrating out the noise in their measurements, they obtain values of \nu_{\mathrm{meas}}  that are >40% below the entanglement threshold for the longest pulse time in this work. This is a remarkable result: the authors are able to observe macroscopic entanglement directly from the measured data, even in the presence of noise!
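The relation between measured and true \nu given above is easy to invert. The sketch below uses hypothetical efficiencies and a hypothetical true value purely for illustration; the function names are ours.

```python
import math

THRESHOLD = 0.5  # entanglement threshold on nu (vacuum variance 1/2)

def effective_efficiency(eta_1, eta_2):
    """Geometric mean of the two measurement efficiencies."""
    return math.sqrt(eta_1 * eta_2)

def measured_nu(true_nu, eta):
    """nu_meas = eta * nu + (1 - eta) * 1/2."""
    return eta * true_nu + (1.0 - eta) * THRESHOLD

def inferred_nu(nu_meas, eta):
    """Invert the relation above to recover the true nu."""
    return (nu_meas - (1.0 - eta) * THRESHOLD) / eta

# Hypothetical numbers, not values from the paper.
eta = effective_efficiency(0.4, 0.5)
nu_meas = measured_nu(0.1, eta)
nu_back = inferred_nu(nu_meas, eta)
print(eta, nu_meas, nu_back)
```

Because the relation is a weighted average of \nu and 1/2, finite efficiency pulls the measured value toward the threshold but never across it: a measured value below 1/2 implies a true value below 1/2.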

To summarize, this work demonstrates the ground-state cooling, entanglement, and measurement of the quantum motional states of two mechanical oscillators. The authors observe quantum behavior of the collective motion of billions of atoms, further confirming that even large objects can be described with a quantum-mechanical wavefunction. The results of this work pave the way toward answering many open questions: how large can a system get and still behave quantum-mechanically? Will gravity destroy quantum states at some intermediate size? Can we use entanglement in large objects as a resource for quantum computing? This work is an exciting step on the long road towards answering these questions.

Quantum Communication with itinerant surface acoustic wave phonons

Authors: E. Dumur, K.J. Satzinger, G.A. Peairs, M-H. Chou, A. Bienfait, H.-S. Chang, C.R. Conner, J. Grebel, R.G. Povey, Y.P. Zhong, A.N. Cleland

First Author’s Primary Affiliation: Pritzker School of Molecular Engineering, University of Chicago, Chicago, IL 60637, USA

Manuscript: Published in npj Quantum Information


Superconducting qubits are among the state-of-the-art architectures in the development of quantum processors. To build a functioning quantum computer, it is essential to be able to transfer quantum states among multiple qubits while maintaining the “quantum” properties of these states. Typically, one would couple two or more superconducting qubits via a transmission line, where the signal travels at the speed of light. Because superconducting qubits operate in the GHz frequency range, the wavelength of light at these frequencies is large relative to the size of the qubit, which occupies approximately (1mm)^{2}: a signal with frequency 5 GHz has a wavelength of \lambda = 6 \textrm{cm}. This means that the structures which couple our qubits together must be (of order) this size and are much larger than the qubits themselves! For a simple case, like coupling two qubits together, this does not present any challenges[2], but as superconducting processors become larger in quantum volume (and therefore spatial size), it becomes more and more important to think critically about how we can create a smaller spatial structure with which to couple two or more qubits.

Surface acoustic wave (SAW) devices utilize the “slow” speed of surface sound waves in crystals (typically about 4000 m/s) in order to create high frequency resonant structures with a small spatial footprint. For example, in order to create a structure with a resonant frequency of 4 GHz, one would need a wavelength of \lambda = (4000 \textrm{m/s})/(4\textrm{GHz}) = 1 \mu \textrm{m}, which is approximately 5 orders of magnitude smaller than the wavelength of a signal which travels at the speed of light! SAW devices are created by fabricating metal strips called interdigitated transducers (IDT for short) on a piezoelectric substrate. In a piezoelectric material, the electric fields in the material induce mechanical strain and vice versa so that an AC voltage applied across the metal strips launches a strain wave propagating across the substrate at the same frequency (see Fig. 1 for a schematic). Here, the wavelength of the surface wave is defined by the periodicity of the metal finger structure, so we are able to create high frequency resonators using standard nano-fabrication techniques.
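The size advantage described above is simple arithmetic, sketched here with the round numbers used in the text (a sound speed of 4000 m/s and the vacuum speed of light; real substrates and transmission lines will differ somewhat).

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # m/s, vacuum value
SAW_SPEED = 4000.0              # m/s, typical value quoted in the text

def wavelength(speed_m_per_s, freq_hz):
    """Wavelength = propagation speed / frequency."""
    return speed_m_per_s / freq_hz

em_wavelength_5ghz = wavelength(SPEED_OF_LIGHT, 5e9)
saw_wavelength_4ghz = wavelength(SAW_SPEED, 4e9)
# At a fixed frequency the wavelength ratio is just the ratio of speeds.
orders_of_magnitude = math.log10(SPEED_OF_LIGHT / SAW_SPEED)

print(em_wavelength_5ghz, saw_wavelength_4ghz, orders_of_magnitude)
```

The electromagnetic wavelength comes out at about 6 cm, the acoustic one at 1 micron, and the ratio of speeds is a little under five orders of magnitude.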

Figure 1
Schematic of an IDT structure (red) which is driven by an AC voltage and launches surface acoustic waves (green).

In addition to using IDT structures to launch SAWs, we can also add periodic metallized structures on either side of the IDT launcher which act to reflect phonons emitted from the IDT (called mirrors). See Fig. 2 (adapted from [3]) for a schematic which details both the IDT as well as the mirror structures.

Figure 2 Schematic of a SAW resonator which contains both the IDT which launches SAWs (center), and the mirrors on either side of the IDT which form an acoustic cavity.

Together, the IDT and mirror structure create an acoustic cavity for phonons, where the spatial size is much smaller than a cavity for microwave photons at the same frequency!

GHz-frequency SAW resonators have been coupled to superconducting qubits before, sometimes in a “flip-chip” configuration[4]. This allows the experimentalist to fabricate a standard superconducting qubit on one substrate (typically silicon or sapphire) and the SAW resonator on a separate piezoelectric substrate (LiNbO_3 is very common for these types of experiments). The chip containing the SAW resonator is then fastened on top of the substrate where the qubit is fabricated. An experimental setup like this also allows one to tune the coupling between the qubit and the SAW via on-chip inductors, which lets us study each system independently of the other. By coupling the qubit to a SAW device, we can transfer excitations from the qubit to the SAW (and vice versa). For example, one can often write the interaction between the SAW and the qubit as:

\hat{H}_{int} = \hbar g\left(\hat{\sigma}_{+} \hat{m} + \hat{\sigma}_{-} \hat{m}^{\dagger} \right)

Here, \hat{\sigma}_{\pm} are the raising and lowering operators for excitations in the qubit, and \hat{m} and \hat{m}^{\dagger} are bosonic operators for the phonon mode in the SAW. If we prepare the qubit in the excited state and have no phonons in the SAW resonator, the qubit excited-state probability evolves as \cos^2(gt) , so after a time \pi/(2g) the excitation will be transferred to the SAW! As an equation:

|e,0\rangle \rightarrow |g,1\rangle

Here the quantum state is written as a product of both the qubit state and the state of the SAW, where |e(g)\rangle is the excited (ground) state of the qubit and the number in the state vector denotes the number of phonons excited in the SAW device.

Experimental Details and Preliminary Results

In this set of experiments, the primary goal is to use two SAW resonators to mediate quantum state transfer between two spatially separated qubits over a phonon-based communication channel. The previously mentioned flip-chip configuration is used. The two qubits are fabricated on the sapphire substrate. Each qubit contains a SQUID loop, which means that the resonant frequency of the qubit is tunable via an external magnetic flux threading the loop. Additional control lines are placed near each qubit: the lines which manipulate the individual qubit states are known as XY lines, while the lines which provide local magnetic flux control to each qubit are known as Z lines. On the “top” LiNbO_3 chip, two IDT devices with the same resonant frequency (near 4GHz) are fabricated. These two IDTs are separated by 2mm, which means it takes a phonon approximately 500ns to travel from one IDT to the other. An acoustic mirror structure is added on one side of each IDT so that phonons are preferentially launched in one direction at certain frequencies (this specific design is called a unidirectional transducer, or UDT for short). The mirror produces constructive interference of the launched phonons at some frequencies, which we will call the UDT regime. At all other frequencies phonons do not constructively interfere, and we will call this the IDT regime. Two tunable couplers, one per qubit, are added so that the interaction strength between each qubit and its SAW resonator can be independently tuned. See Fig. 3 for a full schematic of the composite device.

Figure 3
(a) Schematic of the composite device and a description of each piece. (b) Effective circuit diagram which describes the circuitry necessary to manipulate the qubit states as well as couple the qubits to the acoustic resonators. For each qubit Q_i, the control lines Q_{xyi} control the state of that qubit and the control line labeled Q_{zi} controls the local magnetic field which tunes the resonant frequency of that qubit. The coupler labeled G_i controls the coupling to the acoustic channel on a separate chip, and the control line G_{zi} allows for the control of the coupling strength via an external magnetic flux.
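The phonon transit time quoted for this device follows from the transducer separation and the sound speed. A quick sketch, assuming the round value of 4000 m/s used earlier in this article for the surface wave speed:

```python
SAW_SPEED = 4000.0         # m/s, round value assumed from the text
IDT_SEPARATION = 2.0e-3    # m, separation between the two transducers

transit_time = IDT_SEPARATION / SAW_SPEED   # one-way trip, in seconds
round_trip_time = 2 * transit_time          # out to the far transducer and back

print(f"one way: {transit_time * 1e9:.0f} ns, "
      f"round trip: {round_trip_time * 1e6:.1f} us")
```

A one-way trip takes about 500 ns, so a phonon that is reflected at the far end should return to its starting point after roughly 1 microsecond.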

The first experiment that can be done with this device is the independent characterization of a single qubit, for example qubit Q1, when it is weakly coupled to the phononic quantum channel. This characterization allows the authors to verify that the qubits have long enough coherence to take full advantage of the communication channel. This means that we need the qubit to maintain its state much longer than the phonon travel time of 500ns, otherwise we won’t be able to measure any effects due to phonons traversing the communication channel! In order to measure how long the qubit can maintain its state, a T_1 measurement is performed, where the qubit is put into its excited state via a microwave pulse, and then the probability of the qubit remaining in its excited state as a function of time is measured. The result is shown in Fig. 4.

Figure 4
T_1 data for qubit Q1 across a broad range of qubit frequencies. Interestingly, we see that when the qubit is near-resonant with the SAW device, its lifetime drops dramatically!

At first glance, two striking features of this measurement are apparent. First, over the frequency range of approximately 3.85GHz to 4.15GHz, the qubit does not remain in its excited state for very long. This is because over this frequency range the SAW resonator has a high conductance, and therefore the qubit excitation is quickly converted into a phonon. Second, and perhaps most interestingly, in the range where the qubit excitation is lost to a phonon, the qubit excited-state probability actually increases again after roughly 1 \mu s. This is because the emitted phonon travels to the other end of the phonon channel, is reflected back to the original SAW, and is converted back into a qubit excitation! A similar, yet weaker, feature is also noticeable near 2 \mu s. The fact that we can see these features indicates that the qubit coherence is long enough to use the full potential of the phonon communication channel in this device!

After sufficiently long coherence is verified, the authors attempt a quantum state transfer between the qubits. The experimental protocol is as follows: prepare qubit Q1 in its excited state, then turn on the coupling between qubit Q1 and a SAW resonator. This allows a phonon to be launched across the phonon channel. Approximately 500ns later, the authors turn on the coupling between the other SAW resonator and qubit Q2. This allows the transiting phonon to be converted into an excitation in qubit Q2. The results are shown in Fig. 5.

Figure 5
Quantum state transfer in the UDT regime (left) and the IDT regime (right). We see that the state transfer is possible in both regimes, but much more effective in the UDT regime. The pulse sequence in the right panel demonstrates the measurement protocol: a pulse applied to qubit 1 via the control line Q_{xy1} prepares qubit 1 into its excited state. Then the coupling between the acoustic channel and qubit 1 is turned on (represented by K_1). After a set amount of time, the coupler between the acoustic channel and qubit 2 is turned on (represented by K_2), and the state of qubit 2 is measured.

Here we can see that when the SAW is operated in the UDT regime, the probability of Q2 being excited via a phonon is near 68%, while in the IDT regime it is much lower (only about 15%). This is an indication that operating in the UDT regime allows for much more efficient state transfer from one qubit to another mediated by phonons!


Now that we know we can transfer a quantum state from one qubit to the other using phonons as an intermediate step, a logical next step is to attempt to create a nontrivial multi-qubit state, specifically a Bell state! To do this experiment, the authors make use of the tunable couplers mentioned previously. If we load an excitation into a qubit and turn on the coupling between the qubit and SAW resonator for a specific amount of time, the qubit excited-state probability will decay to approximately 50% (see Fig. 6, approximately 175ns). At this time, there is a 50% chance the qubit has lost its excitation to the emission of a phonon in the communication channel, and we will call this launching “half” a phonon. Of course, we can write the process quantum mechanically:

|e,0,g\rangle \rightarrow \frac{1}{\sqrt{2}}\left(|e,0,g \rangle+ |g,1,g\rangle\right)

Here we have labeled the quantum states as the following |Q1,\gamma,Q2\rangle, where the first index denotes the state of qubit 1, \gamma represents the number of phonons in the acoustic channel, and the final index labels the state of qubit 2. Upon the arrival of the phonon on the other side of the channel, the authors turn coupler 2 on and “catch” the traveling phonon so that the total process is:

|e,0,g\rangle \rightarrow \frac{1}{\sqrt{2}}\left(|e,0,g \rangle+ |g,1,g\rangle\right) \rightarrow \frac{1}{\sqrt{2}}\left(|e,g\rangle + e^{i\phi}|g,e\rangle\right)\otimes|0\rangle

Here, we have introduced a relative phase difference \phi, as well as factored out the index which denotes the phonon number. Because we can factor out the phonon number here, we can write the two qubit wavefunction after this process as |\psi\rangle = \frac{1}{\sqrt{2}}\left(|e,g\rangle + e^{i\phi}|g,e\rangle\right), which we recognize to be a Bell state, which is entangled! Results from this experimental protocol are shown in Fig. 6a. Additionally, a reconstruction of the two qubit density matrix allows the authors to verify that the state they have prepared is actually a Bell state! See Fig. 6b for a comparison with theory.

Figure 6
(a) Experimental results for the generation of the Bell state. We see that we have approximately 50% chance of measuring each qubit in its excited state. (b) A reconstruction of the two qubit density matrix. Here the red boxes represent the expectation for a perfect Bell state, and the dashed boxes are simulation results which take into account all of the losses in the system.
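To get a feel for what the density matrix of the target state looks like, the sketch below builds it for the ideal Bell state written above and compares it with a crude lossy version. The loss model (mixing in the two-qubit ground state with a hypothetical probability) is our own toy assumption and is not the simulation used by the authors.

```python
import numpy as np

G = np.array([1.0, 0.0], dtype=complex)  # qubit ground state |g>
E = np.array([0.0, 1.0], dtype=complex)  # qubit excited state |e>

def bell_state(phi):
    """(|e,g> + exp(i*phi) |g,e>) / sqrt(2), ordered as qubit 1 then qubit 2."""
    return (np.kron(E, G) + np.exp(1j * phi) * np.kron(G, E)) / np.sqrt(2)

def density_matrix(psi):
    """Pure-state density matrix |psi><psi|."""
    return np.outer(psi, psi.conj())

def fidelity(rho, psi):
    """Overlap <psi| rho |psi> of a state rho with a pure target state."""
    return float(np.real(np.vdot(psi, rho @ psi)))

target = bell_state(0.0)
rho_ideal = density_matrix(target)

# Toy loss model (assumption): with probability p_loss both qubits end in |g,g>.
p_loss = 0.2
rho_lossy = (1 - p_loss) * rho_ideal + p_loss * density_matrix(np.kron(G, G))

print(np.round(rho_ideal.real, 2))
print(fidelity(rho_ideal, target), fidelity(rho_lossy, target))
```

In the basis (|g,g>, |g,e>, |e,g>, |e,e>) the ideal matrix has populations of 1/2 on |g,e> and |e,g> and coherences of magnitude 1/2 between them; losses shrink these entries and move weight onto |g,g>.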

Phonon-Qubit Dispersive Interaction

The final set of experiments performed with this remarkable device uses phonons as a probe of the state of one of the qubits. For example, the phase acquired by a phonon will be different if it interacts with a qubit in its excited state rather than its ground state. To test this, the authors again launch half a phonon using qubit Q1. While this phonon is traveling, the resonant frequency of qubit Q1 is changed so that the state of Q1 acquires an adjustable phase. When the phonon reaches qubit Q2, the coupler is turned on for a fixed amount of time (200 ns), and the phonon and qubit are allowed to interact. The phonon then reflects back to qubit Q1 and the coupler is turned back on so that the excitation is transferred back to Q1. If the phase of the qubit and the phase of the phonon interfere constructively, the qubit will return to its excited state. However, if they interfere destructively, the qubit will emit its remaining energy and relax to its ground state. Therefore, a measurement of the excited-state probability of Q1 will tell us about the phase interference between the phonon and Q1! As we sweep the relative phase of Q1, we should expect to see oscillations in the excited-state probability of Q1, where the peaks are constructive interference conditions and the valleys are destructive interference conditions. The relevant pulse sequences are shown in the right panel of Fig. 7.

Figure 7
A measurement of qubit Q1’s excited state probability as a function of its induced phase. There is a discrete phase change (salmon) when the qubit Q2 is prepared in its excited state prior to the measurement.

The experimental process can then be repeated, with the only difference being we have first excited qubit Q2 into its excited state, which means that the phonon should pick up an additional phase shift! This is read out as a discrete phase shift in the left panel of Fig. 7 (the salmon dots are shifted in phase relative to the blue dots by \Delta\phi = 0.40\pi). Here, we say that Q1 probes the state of Q2 via phonons.

Finally, the authors swap the roles of the two qubits and perform one final measurement. They prepare qubit Q2 in a superposition of its ground and excited states, with some variable phase \theta. As an equation: |\psi\rangle = \frac{1}{\sqrt{2}}\left(|g\rangle + e^{i\theta}|e\rangle\right). Experimentally, the phase \theta is set by the phase of a microwave pulse. Once the state is prepared, they wait a fixed amount of time and apply another pulse which rotates the state by \pi/2 radians about the x-axis of the Bloch sphere and measure the state of qubit Q2. As we sweep the phase of the first pulse, we should expect an oscillation in the excited state probability of qubit Q2. As a contrast, they repeat the measurement where the only change is they have first excited qubit Q1 and turned on the relevant couplers. If a phonon is released via qubit Q1, this will again manifest as a phase change relative to the first measurement. The relevant pulse sequence and results are shown in Fig. 8.

Figure 8
A measurement of qubit Q2’s excited state probability as a function of its induced phase via a microwave drive. There is a discrete phase change (salmon) when the qubit Q1 is prepared in its excited state prior to the measurement.

Again, there is a discrete phase shift in the excited state probability of qubit Q2, this time of \Delta \theta = 0.95\pi. This means that they can use the phonon channel to perform phase sensitive measurements of “arbitrary” quantum systems (where of course here that system is another qubit)!
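The fringe and its shift can be illustrated with a two-amplitude calculation. This is a minimal sketch, not the authors' analysis: it assumes an ideal qubit, the rotation convention R_x(\pi/2) = \frac{1}{\sqrt{2}}[[1, -i], [-i, 1]], and that the phonon-induced shift simply adds to the prepared phase \theta. The function names are hypothetical.

```python
import cmath
import math

def ramsey_excited_probability(theta, extra_phase=0.0):
    """P(e) after preparing (|g> + e^{i(theta + extra_phase)}|e>)/sqrt(2)
    and applying an ideal pi/2 rotation about the x-axis."""
    s = 1 / math.sqrt(2)
    # Amplitudes (g, e) before the final pulse.
    g = s
    e = s * cmath.exp(1j * (theta + extra_phase))
    # Excited-state amplitude after R_x(pi/2) = (1/sqrt(2)) [[1, -i], [-i, 1]].
    e_out = s * (-1j * g + e)
    return abs(e_out) ** 2

def fringe(extra_phase, points=8):
    """Sample the fringe at evenly spaced values of theta in [0, 2*pi)."""
    return [ramsey_excited_probability(2 * math.pi * k / points, extra_phase)
            for k in range(points)]

reference = fringe(0.0)              # no additional phase
shifted = fringe(0.95 * math.pi)     # additional phase equal to the quoted shift
```

In this convention the fringe is (1 - \sin(\theta + \Delta\theta))/2, so a shift close to \pi nearly inverts the reference curve.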


In conclusion, this remarkable set of experiments shows that a phonon-based communication channel can be used not only to transfer a quantum state from one qubit to another, but also to perform more complex operations, such as preparing a two-qubit Bell state! Finally, we can harness the power of traveling phonons to probe the characteristics of other quantum systems and learn about them by measuring a separate qubit!


[1] E. Dumur et al., npj Quantum Information 7, 1734 (2021)

[2] J. Majer et al., Nature 449, 443–447 (2007)

[3] T. Aref et al., Quantum acoustics with surface acoustic waves, in Superconducting Devices in Quantum Optics, edited by R. H. Hadfield and G. Johansson (Springer International Publishing, Cham, 2016) pp. 217–244.

[4] K. J. Satzinger et al., Nature 563, 7733 (2018)

Many thanks to Piero Chiappina for his helpful comments, edits, and suggestions!

Quantum Information and the Second Law

Title: Irreversible work and Maxwell demon in terms of quantum thermodynamic force

Authors: B. Ahmadi, S. Salimi, A. S. Khorashad

Institutions: Department of Physics, University of Kurdistan, P.O.Box 66177-15175, Sanandaj, Iran and International Centre for Theory of Quantum Technologies, University of Gdansk, Wita Stwosza 63, 80-308, Gdansk, Poland

Manuscript: Published in Nature Scientific Reports (open access)

Decoherence is the phenomenon that successfully explains the so-called quantum-classical transition: quantum coherence, which allows systems to maintain uniquely quantum superposition between states, is lost to an external environment. Once coherence is lost, the system effectively acts as though it is classical. This effect explains the rarity of macroscopic quantum phenomena, and it can be understood as a quantum information flow process: quantum information which is originally localised in the system of interest, describing the superpositions initially present between the eigenstates of the system, is dissipated via the system’s interaction with a large reservoir.

However, some classes of ‘system + reservoir’ dynamics are characterised by a backflow of information into the system: information flows out from the system into the reservoir, and after a finite time, some amount of this information returns. This is often described as memory, because the information describes past states of the system. Dynamics of this kind are called non-Markovian, and can often be seen when the reservoir has a small size, or is structured in some way. The presence or absence of information in a system is closely linked to the system’s entropy. In fact, entropy can be thought of as the amount of information about a system which is inaccessible. So decoherence – the irreversible loss of information to a reservoir – is associated with a large entropy increase, whereas non-Markovian information backflow is associated with entropy decrease.

The Second Law of thermodynamics states that entropy production is always positive for irreversible processes, and zero for reversible processes. It does not allow a system’s entropy to globally decrease – although small local decreases may be observed in thermal fluctuations, they provide a negligible contribution to the whole system. In quantum information theory terms, we’d say that the Second Law describes the tendency of information to diffuse out of systems. So it is not completely clear how non-Markovian dynamics, in which information becomes more localised, can be consistent with the Second Law.

Thermodynamics as we usually understand it cannot be straightforwardly applied to quantum systems. It relies on the ability to average over thermal fluctuations, which are negligible with respect to the large systems considered. In quantum systems, no such guarantee can be made: both thermal fluctuations and quantum fluctuations can play a very significant role in the dynamics of the whole system, and systems can be made of very few component parts. Therefore, in order to understand how local entropy reductions can be consistent with the Second Law, we need to rethink our understanding of thermodynamics to explicitly include quantum information. In a paper published in early 2021, Ahmadi et al [1] explicitly incorporate quantum information into an expression of the Second Law, and give information flow and backflow a thermodynamic interpretation.

Thermodynamic Efficiency

The connection between thermodynamics and information theory can be encapsulated by the Maxwell’s demon thought experiment. In the thought experiment, there is a box filled with gas at temperature T, and a partition in the centre of the box. A small demon stands by a massless door in the partition, and strategically opens the door so that all the particles pass into one half of the box, and the other half of the box is a vacuum. Maxwell intended his demon to challenge the interpretation of the Second Law: without doing any work, the demon has lowered the entropy of the box. However, it can be understood in a different way: the demon is only able to change the system’s entropy via possession of information about the particles – i.e. their position and momentum. Therefore, information can be used to perform more work than expected by the Second Law. This understanding led to the famous slogan “information is physical” [2] – meaning it has to be accounted for in thermodynamics.

The authors address the question of incorporating information into the Second Law by considering the work that can be done by a variety of thermodynamic systems. The fundamental relation of thermodynamics can be written as

\textrm{d} F = \textrm{d} U - T \textrm{d} S = \textrm{d} W - T \textrm{d} S,

where F is the Helmholtz energy, describing the maximum extractable work, U the internal energy of the system, W the actual extracted work, and S the entropy production during the process in which work is extracted. We can see from this relation that when entropy increases, \textrm{d} S > 0, the extracted work is smaller than the maximum extractable work. But when entropy decreases, \textrm{d} S < 0 – i.e. because of a quantum non-Markovian process – the extracted work is higher than the maximum.

A generalised heat engine operating between hot (T_h) and cold (T_c) reservoirs.

Consider a classical engine which uses two reservoirs, at temperatures T_h and T_c, as in the above image. It absorbs an amount of heat \Delta Q_h from the hot reservoir, performs an amount of work \Delta W, and rejects an amount of heat \Delta Q_c to the cold reservoir. During the most efficient possible process – the Carnot cycle, which is reversible and generates no entropy – the efficiency of the engine is

\eta_C = \frac{\Delta W}{\Delta Q_h} = 1 - \frac{T_c}{T_h}.

Other, less efficient, processes cannot reach the Carnot efficiency, due to irreversibility and entropy increase. If the entropy production during the cycle is \Delta S = \Delta S_1 + \Delta S_2, the efficiency is

\eta = \eta_C - \frac{T_c \Delta S}{\Delta Q_h},

with \eta_C = 1 - \frac{T_c}{T_h} the Carnot efficiency. The goal now is to determine a similar expression for a corresponding quantum engine, which may well have \Delta S < 0.
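The classical expression follows from energy and entropy bookkeeping over one cycle. A short derivation, assuming the working substance returns to its initial state so that all of the entropy production \Delta S ends up in the two reservoirs:

```latex
\begin{align}
\Delta W &= \Delta Q_h - \Delta Q_c, \\
\Delta S &= \frac{\Delta Q_c}{T_c} - \frac{\Delta Q_h}{T_h}
  \;\Rightarrow\; \Delta Q_c = \frac{T_c}{T_h}\,\Delta Q_h + T_c\,\Delta S, \\
\eta &= \frac{\Delta W}{\Delta Q_h} = 1 - \frac{\Delta Q_c}{\Delta Q_h}
  = \left(1 - \frac{T_c}{T_h}\right) - \frac{T_c\,\Delta S}{\Delta Q_h}
  = \eta_C - \frac{T_c\,\Delta S}{\Delta Q_h}.
\end{align}
```

As an illustration with made-up numbers, T_h = 400 K, T_c = 300 K, \Delta Q_h = 1000 J and \Delta S = 0.5 J/K give \eta_C = 0.25 and \eta = 0.25 - 0.15 = 0.10.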

Reversible and Irreversible Work

In order to analyse the work done by a quantum thermodynamic system, the authors partition the work into two parts: the reversible work \Delta W_{\rm rev} and the irreversible work \Delta W_{\rm irr}. The reversible work is

\Delta W_{\rm rev} = k T \Delta I + \Delta F_{\infty},

where F_{\infty} is the Helmholtz energy of the equilibrium state – essentially, the work that can be extracted from the system after it has reached thermal equilibrium with its environment – and I(t) = S(\rho(t) || \rho_{\infty}) is the relative von Neumann entropy between the state of the system at time t and the equilibrium state \rho_{\infty}. Relative entropy is a measure of how much information is shared between two quantum states – how easy it is to distinguish the two states, so I(t) tells us how far away \rho(t) is from the equilibrium state. In general, the reversible work should be negative because it is being done by, not on, the system.
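To make the relative entropy concrete, here is a minimal sketch for the special case where both states are diagonal in the energy eigenbasis, so that S(\rho || \sigma) reduces to a sum over populations. That restriction, the two-level energies and the function names are assumptions made for illustration; a general state with coherences would need a full matrix logarithm.

```python
import math

def gibbs_populations(energies, kT):
    """Populations of the thermal (equilibrium) state at temperature kT."""
    weights = [math.exp(-energy / kT) for energy in energies]
    partition = sum(weights)
    return [w / partition for w in weights]

def relative_entropy_diagonal(p, q):
    """S(rho || sigma) = sum_i p_i (ln p_i - ln q_i) for two states that are
    diagonal in the same basis, with populations p and q."""
    return sum(pi * (math.log(pi) - math.log(qi))
               for pi, qi in zip(p, q) if pi > 0.0)

# Two-level system with energies 0 and 1 (arbitrary units), bath at kT = 1.
rho_inf = gibbs_populations([0.0, 1.0], kT=1.0)
I_ground = relative_entropy_diagonal([1.0, 0.0], rho_inf)  # pure ground state
I_equilibrium = relative_entropy_diagonal(rho_inf, rho_inf)
```

The equilibrium state has I = 0, while any other state has I > 0, which is the sense in which I(t) measures distance from equilibrium.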

The irreversible work is \Delta W_{\rm irr} = k T \Delta S, positive when \Delta S > 0, which means it reduces the magnitude of the reversible work.

These definitions can be understood as follows. The reversible work \Delta W_{\rm rev} is the maximum amount of internal energy which would be “spent” by the system if the system was undergoing a reversible process – e.g. a Carnot cycle. This is directly dependent on the information content of the system, via the relative entropy. The irreversible work is the amount of internal energy which cannot be spent, due to loss of information from the system. Therefore, we can refer to \Delta W_{\rm irr} as encoded information, because it is inaccessible.

Quantum Decoding

Let’s think about an example quantum system. Because it interacts with an environment, it must be described by a density matrix \rho(t) rather than a wavefunction \psi(t). The density matrix is a much more general description of a quantum state, and describes systems which can lose information. When we are using the density matrix, expectation values of operators are found by taking the trace – for example, the energy expectation value is \langle E \rangle = {\rm Tr} (\hat{H}\rho).
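As a small illustration of the trace rule, the sketch below evaluates {\rm Tr}(\hat{H}\rho) for two qubit states with the same populations, one with coherence and one without. The Hamiltonian (energies 0 and 1 in arbitrary units) and the helper names are assumptions chosen for the example.

```python
def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def expectation(operator, rho):
    """Expectation value Tr(operator * rho), returned as a real number."""
    return complex(trace(matmul(operator, rho))).real

# Basis ordering (|g>, |e>); energies 0 and 1 in arbitrary units.
H = [[0.0, 0.0], [0.0, 1.0]]
rho_superposition = [[0.5, 0.5], [0.5, 0.5]]   # (|g> + |e>)/sqrt(2), fully coherent
rho_mixed = [[0.5, 0.0], [0.0, 0.5]]           # same populations, no coherence

energy_superposition = expectation(H, rho_superposition)
energy_mixed = expectation(H, rho_mixed)
purity_superposition = expectation(rho_superposition, rho_superposition)
purity_mixed = expectation(rho_mixed, rho_mixed)
```

Both states have the same mean energy, but the purity {\rm Tr}(\rho^2) separates them, which shows the kind of information that only the density matrix carries.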
Our example system has initial state \rho_0 at time t=0, and it interacts with a bath at temperature T, according to Hamiltonian H. After the system has evolved over a time t , the state of the system is \rho_t . The irreversible work after this time is

\Delta W_{\rm irr} = k T [S(\rho_0 || \rho_{\infty}) - S(\rho_t || \rho_{\infty})].

The first term is the information shared by the initial state and the equilibrium state, which represents an information minimum. The second term is the information shared between the current state and the equilibrium state – so the quantity \Delta W_{\rm irr} quantifies the information which has been lost between time 0 and t, relative to the total amount of information the system can contain. In other words, it is the entropy which has been gained over the evolution from \rho_0 to \rho_t. It is worth noting that the information lost during an open quantum system evolution is usually information about the coherence – i.e. which superpositions the initial state contained.
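The expression for \Delta W_{\rm irr} can be evaluated for a toy relaxation process. The sketch below is not the authors' model: it assumes a two-level system whose populations (no coherences) decay exponentially towards the thermal state, which is a Markovian evolution, and the initial populations, decay rate and units are arbitrary choices.

```python
import math

def relative_entropy(p, q):
    """S(rho || sigma) for states diagonal in the same basis."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)

def irreversible_work(p0, pt, p_inf, kT):
    """Delta W_irr = kT [S(rho_0 || rho_inf) - S(rho_t || rho_inf)]."""
    return kT * (relative_entropy(p0, p_inf) - relative_entropy(pt, p_inf))

kT = 1.0
partition = 1.0 + math.exp(-1.0 / kT)
p_inf = [1.0 / partition, math.exp(-1.0 / kT) / partition]  # thermal state
p0 = [0.1, 0.9]                                             # initial populations

def populations(t, rate=1.0):
    """Toy Markovian relaxation: exponential approach to the thermal state."""
    decay = math.exp(-rate * t)
    return [p_eq + (p - p_eq) * decay for p, p_eq in zip(p0, p_inf)]

work = [irreversible_work(p0, populations(t), p_inf, kT)
        for t in (0.0, 0.5, 1.0, 5.0)]
```

In this Markovian toy model the irreversible work starts at zero and only grows with time; a decrease at some later time would be the signature of information backflow.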

We want to consider a possible cycle that this quantum system can undergo. Let’s construct a cycle between two reservoirs at temperature T_h and T_c. There are four steps:

  1. The system begins in state \rho_0, and interacts with the hot reservoir over time t_1, until it is in state \rho_1. The interaction is described by the Hamiltonian H_0. The change in the system’s energy expectation value over the interaction is equivalent to the amount of heat it absorbs: \Delta Q_h = {\rm Tr}[H_0 \rho_0] - {\rm Tr}[H_0 \rho_1] = {\rm Tr}[H_0(\rho_0 - \rho_1)]. The entropy of the system changes by \Delta S_h, which has a contribution from the heat absorption, \frac{\Delta Q_h}{k T_h}, and a contribution from the evolution of the state of the system: S(\rho_0||\rho_{\infty}) - S(\rho_1 || \rho_{\infty}) - \int_0^{t_1} {\rm Tr}\left (\rho(t) d_t \ln \rho(t) \right) \textrm{d} t.
  2. The system decouples from the hot reservoir. While staying in the same state \rho_1, the interaction Hamiltonian is slowly (adiabatically) changed from H_0 to H_1. No entropy is produced.
  3. Much like in Step 1, the system interacts with the cold reservoir according to the Hamiltonian H_1 and evolves from state \rho_1 to state \rho_0. The heat rejected to the cold reservoir during this step is \Delta Q_c = {\rm Tr}[H_1(\rho_0 - \rho_1)]. The entropy of the system changes by \Delta S_c, which – just like in Step 1 – has a contribution from the heat rejection, and a contribution from the evolution of the state of the system.
  4. The system decouples from the cold reservoir and, while the system remains in state \rho_0, the interaction Hamiltonian is adiabatically changed from H_1 to H_0. No entropy is produced.

The work done during this cycle is \Delta W = \Delta W_{\rm rev} + \Delta W_{\rm irr} . The irreversible work is

\Delta W_{\rm irr} = k T_h \Delta S_h + k T_c \Delta S_c,

which contains a contribution from Step 1, the hot reservoir interaction, and a contribution from Step 3, the cold reservoir interaction.
Therefore, the efficiency is

\eta_Q = \eta_C - \frac{k T_h \Delta S_h + k T_c \Delta S_c}{\Delta Q_h}

When the system is non-Markovian, the evolution of the state of the system can cause a negative entropy contribution. For a sufficiently non-Markovian system, k T_h \Delta S_h + k T_c \Delta S_c < 0, and then \eta_Q > \eta_C. Therefore, non-Markovian dynamics can be used to construct engines which are more efficient than a Carnot engine.
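The sign dependence of \eta_Q can be checked numerically. The sketch below evaluates the efficiency formula above; the temperatures, heat input and entropy changes are made-up values in natural units (k = 1), and the function name is hypothetical.

```python
def quantum_efficiency(t_hot, t_cold, heat_in, ds_hot, ds_cold, k=1.0):
    """eta_Q = eta_C - (k T_h dS_h + k T_c dS_c) / dQ_h."""
    eta_carnot = 1.0 - t_cold / t_hot
    irreversible_work = k * t_hot * ds_hot + k * t_cold * ds_cold
    return eta_carnot - irreversible_work / heat_in

eta_carnot = 1.0 - 1.0 / 2.0   # T_h = 2, T_c = 1

# Positive entropy production (information lost): efficiency below Carnot.
eta_markov = quantum_efficiency(2.0, 1.0, 10.0, 0.5, 0.5)

# Negative entropy contribution (information backflow): efficiency above Carnot.
eta_backflow = quantum_efficiency(2.0, 1.0, 10.0, -0.25, -0.25)
```

With these numbers the first case gives 0.35 and the second 0.575, on either side of the Carnot value of 0.5.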

This can never happen in classical equilibrium thermodynamics – unless there is an external feedback mechanism, i.e. a Maxwell demon. Maxwell’s demon can be thought of as an information decoder – it takes information which was inaccessible to the system, and makes it accessible. This information decoding describes a negative entropy production, and therefore an efficiency greater than the Carnot efficiency. However, this cannot be achieved without a demon in classical thermodynamics, whereas in quantum thermodynamics non-Markovianity can play the role of the demon.

The Second Law of Thermodynamics

The Second Law can now be extended to quantum systems:

In a quantum thermodynamic process, information can be encoded and also decoded for the system to do work, and this encoded (decoded) work equals temperature T times entropy production of the system, i.e.

\Delta W_{\rm irr} = k T \Delta S.

This explicitly incorporates information into the Second Law. For classical macroscopic thermodynamics, it reduces to just the encoded part. This definition emphasises the connection between thermodynamics and information, instead of focusing on defining an arrow of time dependent on positive entropy production, and ensures that there are no violations in the presence of quantum non-Markovianity or Maxwell demons.


The authors of this paper aimed to explicitly incorporate information into a more general definition of the Second Law of thermodynamics. They divided the work done by a thermodynamic system into two contributions: the reversible work, which is the maximum available work in the absence of any information flow, and the irreversible work, which quantifies the amount of work that is gained or lost due to information flow into or out of the system. This partitioning allowed the authors to derive the generic efficiency of an engine, which in the quantum non-Markovian case can be higher than the Carnot efficiency. This was given a thermodynamic interpretation: negative entropy production corresponds to information being decoded, so that it becomes accessible to the system, and more work is performed than expected by the usual formulation of the Second Law. Based on this analysis, the authors have introduced a novel formulation of the Second Law which incorporates information and is not violated by quantum non-Markovian systems.


[1] B. Ahmadi, S. Salimi, A. Khorashad, Scientific Reports 2021, 11, 1–9.

[2] R. Landauer et al., Physics Today 1991, 44, 23–29.

Sapphire Lally works on modelling non-Markovian effects in open quantum systems.

Thanks go to Akash Dixit for his many helpful edits and suggestions.

Catching and counting photons

Title: Number-Resolved Photocounter for Propagating Microwave Mode

Authors: Rémy Dassonneville, Réouven Assouly, Théau Peronnin, Pierre Rouchon, Benjamin Huard

Institution: Univ Lyon, ENS de Lyon, Univ Claude Bernard, CNRS, Laboratoire de Physique, F-69342 Lyon, France

Manuscript: Published in Physical Review Applied [1], Open Access on arXiv


Quantum technologies, based on superconducting circuits and microwave photons, are rapidly developing. At the heart of these devices are superconducting qubits, highly customizable and capable of strong interactions with microwave photons [2]. This basic building block enables everything from multi-qubit quantum processors to state of the art sensors. One capability missing from the toolkit of superconducting qubits is the ability to detect propagating photons, which can be used to send quantum information over long distances. Developing a method to resolve the number of photons contained in a wavepacket traveling down a transmission line could unlock the ability to construct quantum networks, entangle remote qubits, and build quantum sensors.

The strong interactions between qubits and photons make qubits an ideal device for photon detection. In a stationary mode, the photons can be held for a long time, allowing a qubit to distinguish between 0, 1, 2, … photons [3]. However, a propagating wavepacket travels so quickly that the qubit only has enough time to determine if there are an even or odd number of photons. In this work, the authors devise a scheme to catch an arbitrary signal propagating down a microwave transmission line and efficiently count the number of photons in the wavepacket [1]. I first describe the device and its components used to make this possible. Next, I go through the catch protocol, useful for holding a wavepacket for long enough to be measured. I break down the photon counting measurement that resolves N photons in only \log_2{N} measurements. Finally, I discuss the potential applications of the device and protocol developed in this work.


The device consists of a series of carefully designed microwave components as shown in Figure 1. A propagating wavepacket first encounters a buffer cavity, a superconducting LC circuit that can hold photons with frequency matched to the resonance of the cavity \omega = 1/\sqrt{LC}. The buffer cavity is strongly coupled to the transmission line in order to capture the traveling wavepacket and temporarily hold it, making it a stationary mode. Resolving the number of photons in the signal requires enough time to make the measurements. However, the strong coupling between the transmission line and buffer cavity, used to capture the wavepacket, results in the signal quickly leaking out of the stationary mode and back into the propagating line. To ensure there is enough time to measure the photon number, the state is swapped into a long lived memory cavity that can store the signal while it is being read out. This is done using a Josephson ring modulator (JRM), which swaps the states of the buffer and memory cavities. Once the state has been transferred into the memory, a qubit and readout system can be used to determine the number of photons present. After measuring the state, the JRM swaps the state back into the buffer where it is quickly emitted into the transmission line. This resets the system and allows for the next operation to proceed.

Catching Photons

A Josephson ring modulator (JRM) is a device used to swap the states of two cavity modes. In this work, it is used to transfer the wavepacket captured in the leaky buffer cavity into the long lived memory cavity. The JRM is pumped at the frequency corresponding to the difference between the buffer (resonant frequency \omega_b/2\pi = 10.220 GHz) and memory (\omega_m/2\pi = 3.745 GHz) cavity frequencies. This provides the energy required to transfer a state from the buffer to the memory cavity. The pump enables a beam splitter interaction between the buffer and memory, described by the Hamiltonian shown in Equation 1.

\mathcal{H}_{bs} = g p(t) b m^{\dagger} + h.c. (Equation 1)

g is the strength of the pump, p(t) is the pulse shape of the pump, b is the annihilation operator of the buffer mode, and m^{\dagger} is the creation operator for the memory mode. When the pump is on, the state present in the buffer mode is swapped into the memory at a rate g. For example, if we start with 1 photon in the buffer cavity and 0 photons in the memory, then for a constant pump (p(t) = 1) the buffer will contain 0 photons and the memory will have 1 photon after a time t=\pi/(2g). The authors use this interaction to move the wavepacket from the buffer cavity to the memory for measurement. This process can also be used in reverse; a photon contained in the memory cavity can be transferred into the buffer by applying the same pump. The authors use this interaction to reset the device by emptying out the memory cavity. The pump is turned on to swap the memory state into the buffer, where the wavepacket quickly escapes into the transmission line.
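The swap dynamics can be sketched by evolving a single photon under Equation 1 in the two-state subspace spanned by "photon in buffer" and "photon in memory". This is a minimal sketch assuming \hbar = 1, a real pulse shape p(t), no cavity losses, and a made-up value of g. Under Equation 1 exactly as written, the transfer is complete when g \int p(t) \textrm{d}t = \pi/2; a definition of g that differs by a factor of 2 rescales this time accordingly.

```python
import math

def transfer(g, pulse, duration, steps=2000):
    """Evolve one photon, initially in the buffer, under
    H = g p(t) (b m^dagger + b^dagger m). Returns (buffer, memory) populations."""
    dt = duration / steps
    c_buffer, c_memory = 1.0 + 0j, 0.0 + 0j
    for k in range(steps):
        # Rotation angle for this time slice, with p(t) sampled at the midpoint.
        angle = g * pulse((k + 0.5) * dt) * dt
        c, s = math.cos(angle), math.sin(angle)
        c_buffer, c_memory = (c * c_buffer - 1j * s * c_memory,
                              -1j * s * c_buffer + c * c_memory)
    return abs(c_buffer) ** 2, abs(c_memory) ** 2

g = 2 * math.pi * 1.0e6  # hypothetical pump strength in rad/s

def constant_pump(t):
    return 1.0

buffer_half, memory_half = transfer(g, constant_pump, math.pi / (2 * g))
buffer_full, memory_full = transfer(g, constant_pump, math.pi / g)
```

Running the same pulse for twice as long brings the photon back to the buffer, which is why the same pump can be used both to load the memory and to empty it.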

Counting Photons

Once the wavepacket is successfully transferred into the long lived memory cavity, the qubit and readout are used to count the number of photons present. The authors make a series of measurements to resolve the photon number of the wavepacket. Each measurement involves allowing the qubit and memory cavity states to interact for a carefully chosen amount of time. At the end of each measurement the qubit state (ground, g or excited, e) is read out and recorded. The authors devise a protocol that requires making the minimal number of measurements: resolving up to N photons with only \log_2{N} qubit measurements. The measurements are designed such that the series of recorded qubit states represent a binary decomposition of the photon number. The binary decomposition is a way to represent any integer as a sum of powers of 2. An integer is represented as a series of bits (which take the value 0 or 1), where the k^{\mathrm{th}} bit determines if 2^k is included in that sum. For example, 13 = 1(2^0) + 0(2^1) + 1(2^2) + 1(2^3), so for 13, bit 0 = 1, bit 1 = 0, bit 2 = 1, and bit 3 = 1. Here, the qubit state after each measurement (either g or e) represents the value of the bit being measured (which can be either 0 or 1). Each photon number is then identified as a unique series of g’s and e’s (or 0’s and 1’s). In this work, the authors measure wavepackets that contain up to 3 photons, using two measurements to distinguish between 0, 1, 2, and 3 photons as shown in Table 1.

Bit 0 | Bit 1 | Number
0 | 0 | 0 = 0(2^0) + 0(2^1)
1 | 0 | 1 = 1(2^0) + 0(2^1)
0 | 1 | 2 = 0(2^0) + 1(2^1)
1 | 1 | 3 = 1(2^0) + 1(2^1)
Table 1: The series of qubit states resulting from the measurements represent a unique way to identify the photon number in the memory cavity. In this work, the authors are able to resolve up to 3 photons with a series of two measurements.
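The binary decomposition itself is easy to reproduce in code. A minimal sketch, with bit 0 as the least significant bit to match the table; the function names are hypothetical:

```python
def binary_decomposition(n, num_bits):
    """Bits of n, least significant first: element k says whether 2^k is in the sum."""
    return [(n >> k) & 1 for k in range(num_bits)]

def recompose(bits):
    """Inverse operation: rebuild the integer from its bits."""
    return sum(bit << k for k, bit in enumerate(bits))

bits_of_13 = binary_decomposition(13, 4)   # [1, 0, 1, 1]
table_1 = {n: binary_decomposition(n, 2) for n in range(4)}
```

Here `table_1` maps each photon number from 0 to 3 to its (bit 0, bit 1) pair, in the same order as the rows above.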

Binary decomposition measurement protocol

The measurement harnesses the interaction between the qubit and memory cavity as described by the Hamiltonian in Equation 2. The memory creation and annihilation operators are represented as m^{\dagger} and m. The qubit ground and excited states are \left|g\right\rangle and \left|e\right\rangle.

\mathcal{H}_{qm} = -\chi m^{\dagger}m \left|e\right\rangle \left\langle e \right| (Equation 2)

Without the interaction between the qubit and memory cavity, the Hamiltonian of just the qubit would look like \mathcal{H}_{q} = \omega_q \left|e\right\rangle \left\langle e \right|, where \omega_q is the transition frequency of the qubit. When we add in the interaction the full Hamiltonian can be expressed as \mathcal{H}_{\mathrm{total}} = \mathcal{H}_{q} + \mathcal{H}_{qm} = (\omega_q - \chi m^{\dagger}m )\left|e\right\rangle \left\langle e \right|. The combination m^{\dagger}m is the operator version of the number of photons, n, in the memory cavity. By comparing the total Hamiltonian with the one of just the qubit, we see that the effect of the interaction is to modify the transition frequency of the qubit (represented by everything before the \left|e\right\rangle \left\langle e \right|). So now, the qubit transition frequency is dependent on the number of photons in the memory cavity (n = m^{\dagger}m). For every additional photon in the memory cavity, the qubit transition frequency shifts by \chi.

Accessing the memory cavity photon number requires multiple similar, but subtly different measurements, all relying on the interaction that causes a qubit frequency shift proportional to the number of memory photons. In each measurement, the authors entangle the cavity state with that of the qubit. This is done by placing the qubit in a superposition state \frac{1}{\sqrt{2}} (\left|g\right\rangle + \left|e\right\rangle) using a \pi/2 rotation about the x-axis and allowing it to interact with the memory cavity state, according to Equation 2, for a carefully chosen time. During this interaction time, \tau, the qubit state acquires a phase that is proportional to the number of photons, n, in the cavity at a rate of \chi. The total phase acquired is \phi = n \chi \tau. The qubit is then projected back onto the z-axis using a second \pi/2 rotation.

In the first measurement, the goal is to distinguish between even and odd photon numbers, 0/2 or 1/3, the 0th bit of information. The interaction time is chosen to be \tau_0 = \frac{2 \pi}{2 \chi} so that when the photon number is even the phase acquired is an even multiple of \pi and when the photon number is odd the phase acquired is an odd multiple of \pi. The second \pi/2 rotation is performed around the -x-axis, rotated by \pi relative to the original axis. If there are an even number of photons in the memory (0 or 2), the second \pi/2 rotation just undoes the first one since the qubit superposition phase is \phi = l\pi with l =0, 2. Since the qubit state remains \frac{1}{\sqrt{2}} (\left|g\right\rangle + e^{il\pi} \left|e\right\rangle) = \frac{1}{\sqrt{2}} (\left|g\right\rangle + \left|e\right\rangle) after the interaction, the qubit ends up back in the ground state. If there are an odd number of photons in the memory (1 or 3), the qubit phase is \phi = m\pi, m=1, 3. The qubit state is \frac{1}{\sqrt{2}} (\left|g\right\rangle + e^{im\pi} \left|e\right\rangle) = \frac{1}{\sqrt{2}} (\left|g\right\rangle - \left|e\right\rangle). The second \pi/2 rotation acts in concert with the first, combining to form a \pi pulse, which takes the qubit to its excited state.

In the second measurement, the interaction time is halved to be \tau_1 = \frac{1}{2} \tau_0 = \frac{2 \pi}{4 \chi} to distinguish between 0 and 2 photons (or 1 and 3 photons), the 1st bit of information. The axis of the second \pi/2 rotation is conditioned upon the result of the first measurement. If the first measurement results in the qubit remaining in the ground state (photon number is even), the second pulse is performed around the -x-axis, rotated by \pi relative to the original axis. If there are 0 photons in the memory, the qubit returns to the ground state and if there are 2 photons, the qubit is excited. If the first measurement results in the qubit being excited (photon number is odd), the second pulse is performed around the -y-axis, rotated by 3\pi/2 relative to the original axis. If there is 1 memory photon, the qubit ends in the ground state and if there are 3 photons, the qubit is excited.

The series of qubit states depending on the memory photon number is shown in Table 2. This protocol realizes the binary decomposition shown in Table 1 where the qubit ground state (g) serves as a 0 and the excited state (e) as 1.

1st Result | 2nd Result | Photon Number
g (0) | g (0) | 0
e (1) | g (0) | 1
g (0) | e (1) | 2
e (1) | e (1) | 3
Table 2: The binary decomposition protocol results in a series of qubit state readouts. This set of measurement results corresponds to the decomposition of the wavepacket photon number into its constituent bits.
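The two-measurement sequence can be checked with a small state-vector sketch. It assumes an ideal qubit and a memory cavity in a definite photon-number state, and it models the second \pi/2 pulse abstractly as a unitary that maps (|g\rangle + e^{i\alpha}|e\rangle)/\sqrt{2} to |g\rangle and (|g\rangle - e^{i\alpha}|e\rangle)/\sqrt{2} to |e\rangle, with \alpha = 0 standing in for the -x-axis pulse and \alpha = \pi/2 for the -y-axis pulse. The function names and this parametrisation are illustrative assumptions, not the authors' pulse definitions.

```python
import cmath
import math

SQRT_HALF = 1 / math.sqrt(2)

def excited_probability(n_photons, chi_tau, alpha):
    """Probability of reading |e> after one interaction-plus-pulse step."""
    phi = n_photons * chi_tau                  # acquired phase, phi = n * chi * tau
    g_amp = SQRT_HALF
    e_amp = SQRT_HALF * cmath.exp(1j * phi)
    # Idealised second pulse with feedforward phase alpha (see lead-in).
    e_out = SQRT_HALF * (g_amp - cmath.exp(-1j * alpha) * e_amp)
    return abs(e_out) ** 2

def two_measurement_readout(n_photons):
    """Return (first result, second result) with 0 = g and 1 = e."""
    first = round(excited_probability(n_photons, math.pi, 0.0))          # chi * tau_0 = pi
    alpha = math.pi / 2 if first else 0.0                                # condition on first result
    second = round(excited_probability(n_photons, math.pi / 2, alpha))   # chi * tau_1 = pi / 2
    return first, second

table_2 = {n: two_measurement_readout(n) for n in range(4)}
```

For photon numbers 0 to 3 the dictionary reproduces the rows of Table 2, with the first result giving bit 0 and the second giving bit 1.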

Counting more photons

In order to measure wavepackets with even larger photon number, the series of measurements can be extended to extract more bits of information. For each bit, a measurement similar to the ones described above would be performed, where the axis of rotation for the second \pi/2 pulse depends on the results of the previous measurements.

Large integer values can be represented by only a few bits, for example integers from 0 to 1023 can be represented by only \log_2(1024) = 10 bits. Since it takes only one measurement per bit, large numbers of photons up to N can be resolved in only about \log_2 N measurements. This provides a way to efficiently measure lots of information encoded in quantum states with many photons.
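One way the extension could work is sketched below. This is an extrapolation from the two-measurement case rather than a procedure taken from the paper: it assumes the k-th interaction time is \tau_k = \pi/(2^k \chi), that the feedforward phase is \alpha_k = \pi \cdot (\text{value of the bits already measured})/2^k, and that each ideal readout has excited-state probability \sin^2((\phi - \alpha_k)/2).

```python
import math

def measure_bit(n_photons, k, lower_bits_value):
    """Ideal outcome (0 = g, 1 = e) of the k-th measurement."""
    phi = n_photons * math.pi / 2 ** k            # phase n * chi * tau_k
    alpha = lower_bits_value * math.pi / 2 ** k   # correction from earlier results
    p_excited = math.sin((phi - alpha) / 2) ** 2
    return round(p_excited)

def count_photons(n_photons, num_measurements):
    """Rebuild the photon number bit by bit, least significant bit first."""
    value = 0
    for k in range(num_measurements):
        bit = measure_bit(n_photons, k, value)
        value += bit << k
    return value

resolved_13 = count_photons(13, 4)
all_resolved = all(count_photons(n, 10) == n for n in range(1024))
```

Under these assumptions ten measurements identify every photon number from 0 to 1023. A photon number beyond the range covered by the chosen number of measurements is only determined modulo 2 to the power of that number.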

Future outlook

The authors devise a protocol to catch an arbitrary wavepacket by capturing it in a buffer cavity and swapping it into a long lived memory cavity. Once the wavepacket is in the memory cavity, a qubit is used to count the number of photons contained using the minimal number of measurements. In this work, the authors are able to distinguish between 0, 1, 2, and 3 photons in a wavepacket.

This device can serve as a central component of a quantum network. Quantum information can be encoded into a wavepacket with different superpositions of photon numbers. The information can be transported along a transmission line to a secondary location, where the device presented in this work can capture and read out the information stored by assessing the photon number of the wavepacket. The photon detection and counting component can also be used to entangle remote qubits. An emitter qubit can be coupled to a transmission line such that the qubit state is encoded as a wavepacket with a superposition of different photon numbers. This wavepacket can be transported along a transmission line to a remote receiver qubit. By counting the wavepacket photon number, the state of the receiver qubit can be conditioned on the number of photons in the wavepacket, and by extension the state of the original emitter qubit. Finally, by combining the two techniques described in this work, the device can be used as a quantum sensor, with potential applications in dark matter searches and gravitational wave detection. An arbitrary microwave signal can be caught efficiently and distinguished from backgrounds by measuring the number of photons in the wavepacket.


[1] Dassonneville, R., Assouly, R., Peronnin, T., Rouchon, P. & Huard, B. Number-resolved photocounter for propagating microwave mode. Physical Review Applied 14 044022 (2020).

[2] Wallraff, A. et al. Strong coupling of a single photon to a superconducting qubit using circuit quantum electrodynamics. Nature 431, 162–167 (2004).

[3] Schuster, D. I. et al. Resolving photon number states in a superconducting circuit. Nature 445, 515–518 (2007).

Akash Dixit builds superconducting qubits and couples them to 3D cavities to develop novel quantum architectures and search for dark matter.

Thanks to Sapphire Lally for thoughtful and insightful edits.

A Review and Discussion of Variational Quantum Anomaly Detection

One of the first motivations for building quantum computers was the potential to use them for quantum simulations: the controlled simulation of complex quantum mechanical systems. An important part of understanding a complex system of this sort is to know its phase diagram. The recent emergence of QML (Quantum Machine Learning) [1] involves – in one of its facets – the application of machine learning techniques to the problem of quantum control. In this review, we summarize and discuss a recent publication that introduced a new QML algorithm called VQAD (Variational Quantum Anomaly Detection) [2] to extract a quantum system’s phase diagram without any prior knowledge about the quantum device.


Since Feynman’s seminal lecture on quantum computing in 1981, scientists have had the goal of simulating quantum mechanical systems using quantum computer hardware [3].

“Nature isn’t classical dammit, and if you want to make a simulation of nature, you’d better make it quantum mechanical.”


A body of work exists around the problem of quantum control [4], and many control systems for various types of quantum computers have been designed and demonstrated to date. Quantum control refers broadly to the application of control theory to the management of a quantum system, which can make use of optical, electrical, mechanical and other types of control mechanisms. Using machine learning for quantum control has become a common and interesting approach. In a somewhat unique fashion, the technique we are reviewing introduces a new method for using a quantum machine learning algorithm (VQAD) to aid in the control of a quantum system on the same host device.

The technique uses a quantum circuit as a neural network. This circuit is the Variational Quantum Anomaly Detector. The quantum data that serves as the input to this circuit are the ground states of a quantum system that result from a quantum simulation.

Unlike your typical computer with a central CPU, a neural network is a computational system that uses a large number of very simple computational units – the neurons. A neural network connects these neurons together in layers so that one layer’s neurons’ single-bit outputs are the next layer’s inputs.

The VQAD circuit’s parameters are trained using a classical feedback loop, which allows it to learn the characteristics of the quantum simulation’s results. The original VQAD paper shows how a VQAD circuit can be used to map out the phase diagram of a particular quantum system. However, the implication is that the technique may be capable of detecting anomalous simulation results in general, provided that the anomaly syndrome can be learned.

The Proposal

The proposal made by the authors of the original VQAD paper is summarized in this section [2].

The proposal splits a quantum circuit into two parts: the quantum simulation that prepares an initial state and the anomaly detection circuit that calculates the anomaly syndrome of the state. The anomaly syndrome is a calculation that verifies the quantum simulation’s expected outputs.

A very general circuit is a quantum auto-encoder. A quantum auto-encoder encodes information that is originally found in a larger number of qubits into a smaller number of qubits. A subset of the original qubits is used for the final encoding, and the others are “discarded”, i.e. decoupled from the rest.

The proposal repurposes the auto-encoder circuit. It is used to check how well the initial state can be encoded into the encoding qubits. An auto-encoder generally consists of several parts. These include the encoder that is responsible for reducing the dimensionality of the data and thereby compressing it with minimal loss. The bottleneck is the layer that contains the compressed low-dimensional representation of the data. Auto-encoders often have a decoder step as well, which reverses the compression at the receiving end of a communication. However, VQAD is not concerned with decoding the compressed information, only in repurposing the compression algorithm for the purpose of qubit decoupling. So, the only other component of the quantum auto-encoder used in VQAD is the method for determining the loss.

To determine this, the approach uses the Hamming distance dH, which is essentially a bit-wise comparison between two bit-strings. Measuring the discarded qubits into bit-strings N times and calculating the Hamming distance gives us an accurate idea of how much the discarded qubits’ measurement outcomes are correlated to the rest of the system. The idea is that the discarded qubits should not be correlated with the others, and the Hamming distance should always be zero regardless of what has been encoded.

The approach defines a cost C that summarizes this succinctly. When the Hamming distance is what we want, then the cost is minimized.

If we measure n_d discarded qubits in the computational basis, the cost can be restated in terms of the expectation values of Z operators local to each qubit j.
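The link between the two forms of the cost can be illustrated with a minimal sketch. It assumes that the reference bit-string is all zeros and that the cost is the unnormalized mean Hamming distance; both are assumptions made for illustration, and the normalization used in the original paper may differ. All function names are hypothetical.

```python
from typing import List


def hamming_distance(a: str, b: str) -> int:
    """Count the positions at which two equal-length bit-strings differ."""
    if len(a) != len(b):
        raise ValueError("bit-strings must have the same length")
    return sum(x != y for x, y in zip(a, b))


def cost_from_shots(shots: List[str]) -> float:
    """Mean Hamming distance to the all-zero string (assumed reference)."""
    target = "0" * len(shots[0])
    return sum(hamming_distance(s, target) for s in shots) / len(shots)


def z_expectations(shots: List[str]) -> List[float]:
    """Estimate <Z_j> per discarded qubit: outcome 0 -> +1, outcome 1 -> -1."""
    n_shots = len(shots)
    return [
        sum(1 if s[j] == "0" else -1 for s in shots) / n_shots
        for j in range(len(shots[0]))
    ]


def cost_from_z(z_values: List[float]) -> float:
    """The same cost written with local Z expectation values."""
    return sum((1 - z) / 2 for z in z_values)


# Toy sample: N = 4 shots on n_d = 2 discarded qubits.
shots = ["00", "00", "01", "10"]
print(cost_from_shots(shots))
print(cost_from_z(z_expectations(shots)))
```

Both expressions give 0.5 for this toy sample, because (1 - <Z_j>)/2 is simply the probability of measuring qubit j in the state 1.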

Since the purpose of the anomaly syndrome circuit is to check how coupled the discarded qubits are to the others, the circuit consists of layers with parameterized single-qubit y-rotations Ry(θ) followed by controlled-Z gates between the discarded qubits and the encoding qubits. In each layer, one discarded qubit is entangled with one encoding qubit.
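A single layer of such a circuit can be sketched as a small state-vector simulation. This is a toy with one discarded qubit and one encoding qubit, not the authors' five-qubit circuit; the qubit ordering (discarded qubit first) and the function names are assumptions made for illustration.

```python
import numpy as np


def ry(theta: float) -> np.ndarray:
    """Single-qubit rotation about the y axis by angle theta."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]])


CZ = np.diag([1.0, 1.0, 1.0, -1.0])  # controlled-Z on two qubits
Z = np.diag([1.0, -1.0])
I2 = np.eye(2)


def layer(state: np.ndarray, theta_d: float, theta_e: float) -> np.ndarray:
    """Apply Ry to the discarded and encoding qubits, then a CZ between them."""
    rotations = np.kron(ry(theta_d), ry(theta_e))
    return CZ @ (rotations @ state)


def z_discarded(state: np.ndarray) -> float:
    """Expectation value of Z on the discarded (first) qubit."""
    return float(np.real(np.vdot(state, np.kron(Z, I2) @ state)))


zero_state = np.array([1.0, 0.0, 0.0, 0.0])  # |00>

for theta_d in (0.0, np.pi / 2, np.pi):
    out = layer(zero_state, theta_d, 0.0)
    print(theta_d, z_discarded(out))
```

In this toy, a rotation angle of 0 leaves the discarded qubit in |0⟩ with ⟨Z⟩ = 1, while an angle of π flips it to |1⟩ with ⟨Z⟩ = -1, which a cost built from ⟨Z⟩ would penalize.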

The circuit’s single-qubit rotations’ parameters are determined by an unsupervised learning mechanism. First, the circuit is repeatedly evaluated on a set of initial states that are considered free of anomalies. The parameters are tuned so that these states encode perfectly (C = 0).

A so-called unsupervised learning algorithm assumes no a-priori knowledge about the inputs to the algorithm’s training step. This means that the individual rows of training data are not specifically labelled for the unsupervised algorithm to learn to accept or reject. Rather, the unsupervised algorithm attempts to learn from the data what is typical and what is noise. The VQAD algorithm specifically learns the shape of the training data that minimizes the cost function C.

After the training step is complete, the circuit’s parameters are frozen. Then, the circuit will differentiate between anomalous states and non-anomalous states. If the cost is zero then the state is indeed a ground state. Otherwise, the circuit has detected an anomaly.

Since a phase transition is a change in the ground state of a quantum system, recognizing all of the ground states that comprise the system’s phase diagram is equivalent to learning the phase diagram. The phase diagram gives us a useful picture of the quantum system, so it is useful to be able to recognize it.

The Results

The authors of the original VQAD paper [2] evaluated the proposed method using two approaches. First, they took a look at the performance of the method with a theoretical data set. Second, they performed an experimental demonstration of the method on the IBMQ Jakarta quantum computer and analyzed the results.

Ideal Data

The behaviour of the VQAD circuit with ideal quantum data was studied using the dimerized extended Bose-Hubbard model (DEBHM), a Bose-Hubbard model with dimerized hoppings, which can be mapped to a spin-½ system [5].

This model describes a lattice of length L with at most one boson on each site i. Here n_i denotes the number of bosons at site i, and J−δJ (J+δJ) are the tunneling amplitudes (coupling strengths) between odd and even nearest-neighbour sites, respectively. b†, b are the bosonic raising and lowering operators.

Three distinct phases and their diagrams are known for DEBHM models with various on-site and nearest-neighbour repulsions. Density Matrix Renormalization Group (DMRG) simulations were performed to generate a subset of ideal states from these phases [6], for use as training data for the VQAD algorithm. Then, the trained VQAD model was used to identify anomalies in more DMRG-generated data from the same phases.

The authors found that a generated ground state was unique enough to the problem that it could be used on its own to train the VQAD circuit, which was then able to infer all three phases.

Experimental Data

The authors chose to use a Variational Quantum Eigensolver (VQE) for their experiment using the IBMQ Jakarta computer.

The VQE was used to perform the preparation of ground states in the Transverse Longitudinal Field Ising (TLFI) model.

Here g_x, g_z are the transverse and longitudinal fields, respectively. Z_i, X_i are the Pauli-z and Pauli-x operators at the sites i, and J is the coupling strength between neighbouring sites.

A gate-model VQAD anomaly syndrome circuit was created. The VQE algorithm was executed in the same run as the VQAD algorithm on the Jakarta computer.

A few additional optimizations were introduced in this experiment to help combat noise in the device and the resultant errors. First, the authors performed measurement error mitigation [7]. They also initialized the untrained rotation gate parameters with trained parameters from runs involving states nearby in the phase diagram.

With these added optimizations, they were able to train the circuit to sort its inputs. A significant cost difference was observed between even the most difficult states to differentiate. For example, the states |10101⟩ and |01010⟩ have similar energies, which can create a problematic local minimum for optimizers. However, the VQAD circuit was able to differentiate between them successfully.


This is a unique and interesting use of quantum auto-encoders in an application that will potentially have utility in quantum software and quantum algorithm design – two advanced areas in quantum control. Arguably, the software which the authors made available on GitHub is already a useful contribution to the field.

The original VQAD paper showed that the approach is a viable way to encode the phase diagrams of quantum simulations of many-body systems, and that the approach’s main bottleneck is environmental noise that affects our existing physical real-world quantum computing devices.

The use of an auto-encoder to learn the shape of noisy data is not entirely new, but with VQAD it has been transplanted from the field of communication to quantum machine learning [2].

To assess the usefulness of an auto-encoder we are interested in several quantities. These include the lossiness, error rates, code rates and bounds such as the Gilbert-Varshamov bound. While the authors do provide evidence that VQAD is capable of learning phase diagrams, none of these theoretical quantities are considered in the original VQAD paper.

An auto-encoder traditionally employed in communication or error correction schemes would have the elements depicted below.

Typically, we would identify an error syndrome that would take into account the encoding action m → Gm, channel action c → c′ and decoding action c′ → Hc′. Here m is an input message vector, G is the generator matrix of the code used for the encoding (meaning it takes a message vector to an encoded message vector called a codeword), c is the encoded cryptogram corresponding to m, c′ is the transmitted cryptogram, and H is the decoder matrix.

However, VQAD repurposes the encoding portion of the typical scheme. The auto-encoder in VQAD is a quantum auto-encoder and the bottleneck is created by the decoupling and measurement of the discarded qubits. Therefore, we may disregard the channel and decoder actions.

With a message (input) space \{0, 1\}^k , the minimum distance (i.e. the space between encodable codewords) is d(G). d(G) is the Hamming weight minimized over all nonzero linear combinations of the columns of G.

A generator matrix G will generate a code with (k-n)-bit codewords from k-bit messages. If such a code is linear (i.e. it can be written in matrix form m → Gm), then its rate is simply given by Rk(G) = k / (k-n).

The Gilbert-Varshamov bound gives us the limit of the code rate as the input size grows.

Here h is the binary entropy function.
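For reference, a standard asymptotic statement of the bound for binary codes is sketched below. The symbols R (the rate of the code) and \delta (the relative minimum distance, i.e. the minimum distance divided by the codeword length) are notation assumed here for illustration, and the form intended in this discussion may be normalized differently.

```latex
R \;\geq\; 1 - h(\delta),
\qquad
h(\delta) = -\delta \log_2 \delta - (1 - \delta) \log_2 (1 - \delta),
\qquad
0 < \delta < \tfrac{1}{2}.
```

In words, for any relative distance below one half there exist binary codes whose rate is at least one minus the binary entropy of that distance.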

In the case of VQAD, the generator matrix is the anomaly syndrome circuit. G is a parameterized matrix whose values are only fixed after the unsupervised learning step. Furthermore, it is unlike linear error-correcting codes (like the repetition code, etc.) in that it utilizes entanglement.

The gates involved are the rotation operators Ry(θ) and the entangling operator CZ.

Within this framework, the question VQAD uses to determine whether an input is anomalous asks whether an input message m of k bits can be encoded completely into a cryptogram Gm = c of k-n bits. This could be restated in terms of a bound not unlike the Gilbert-Varshamov bound.

In the original paper’s examples, and in their provided software, k = 5 qubits and n = 2 discarded qubits. These quantities are fixed for the purposes of their experiments.

However, if the VQAD approach is inherently useful then it should be expected to scale beyond these 5 qubits. In general, the layers of G become G(k, n)L=layer. Here I’ve moved the bottleneck (measured) qubits to the lowest order indexes for convenience.

G concludes with the bottleneck B.

Therefore, we might define a meaningful limit for VQAD using the following approach.

Instead of using the linear code rate Rk to identify when the rate is less than the binary entropy, we can use the cost function C to identify when the mapping from input to output is lossy. This serves our purpose since the cost function the authors defined is minimized when k input bits are mapped cleanly into k-n output bits. Since we are not discussing a linear map, it is of no concern whether the inputs are mapped to sufficiently well spaced measurement outcomes – only that they are mapped to measurement outcomes whose cost is low.

We can also make use of Hk, the Shannon entropy of the input space. We do want to account for each element of the input space in the map.

We want to see that the weight |M| of the input space M is optimized to within an error ε. The mathematical expression of our proposed limit is the following.

Unpacking this equation into meaningful steps would allow us to optimize the training of VQAD further by augmenting the training feedback loop:

  1. Perform training circuit execution t, ending with measurement of the k-n encoding qubits.
  2. Observe the cost of the circuit at this time, and set ε = C. The circuit currently supports ε-encoding. Note that this epsilon will not exactly equate to the usual definition at the onset of training. However, as the model parameters attenuate through this training loop, we would expect to see a convergence with the usual epsilon.
  3. Evaluate the weight of the input space seen thus far. How does it compare to the Shannon entropy? During any training step, the input space is being cumulatively constructed and contains t bitstrings. If during any training loop t, VQAD sees a message (a ground state input corresponding to a measurement outcome) that causes the cumulative “seen” subspace M^(t) to cease to respect this limit, then the hyperparameters k, n, L and local parameters require further attenuation. M is itself a subspace of the space of all possible many-body configurations of the simulation qubits – M is composed of the anomaly-free configurations.
  4. Perform parameter estimation not only to minimize the cost but also to observe the limit.
  5. If the limit is respected after parameter estimation, increase k, n, L.

This could provide a general approach for tuning not only the parameters of the VQAD circuit but also the size (in terms of inputs and outputs) and the number of layers to the problem.

There are interesting unexplored connections between this work and quantum error correction [8]. There is also a potentially interesting and unexplored connection to entanglement distillation, which is a fundamental component of several quantum-cryptographic communication schemes [9].

It is also open to explore whether the VQAD algorithm is useful strictly as a quantum algorithm, or whether there may be a classical or quantum-inspired version of the approach that would benefit from the mathematics of the encoding methodology. One might compare the performance of the simulated VQAD algorithm to other existing vector encoders in order to assess this.

Finally, it would be interesting to look at the algorithm as an encoder more rigorously, from a theoretical standpoint. We might consider its capabilities in the context of different message spaces and codes. How would the minimum distance of a code be limited by this architecture, and by the noise levels in existing quantum computers? What code rates and bounds could be derived in theory and achieved in practice?


[1]  J. Biamonte, P. Wittek, N. Pancotti, P. Rebentrost, N. Wiebe, and S. Lloyd, “Quantum machine learning,” Nature, vol. 549, no. 7671, pp. 195–202, Sep. 2017. [Online]. Available: https://doi.org/10.1038/nature23474

[2]  K. Kottmann, F. Metz, J. Fraxanet, and N. Baldelli, “Variational Quantum Anomaly Detection: Unsupervised mapping of phase diagrams on a physical quantum computer,” arXiv e-prints, arXiv:2106.07912, Jun. 2021.

[3]  A. Trabesinger, “Quantum simulation,” Nature Physics, vol. 8, no. 4, pp. 263–263, Apr. 2012. [Online]. Available: https://www.nature.com/articles/nphys2258

[4]  R. Wu, J. Zhang, C. Li, G. Long, and T. Tarn, “Control problems in quantum systems,” Chinese Science Bulletin, vol. 57, no. 18, pp. 2194–2199, Jun. 2012.

[5]  K. Sugimoto, S. Ejima, F. Lange, and H. Fehske, “Quantum phase transitions in the dimerized extended Bose-Hubbard model,” Physical Review A, vol. 99, no. 1, p. 012122, Jan. 2019.

[6]  S. R. White, “Density matrix formulation for quantum renormalization groups,” Physical Review Letters, vol. 69, no. 19, pp. 2863–2866, Nov. 1992.

[7]  S. Bravyi, S. Sheldon, A. Kandala, D. C. Mckay, and J. M. Gambetta, “Mitigating measurement errors in multiqubit experiments,” Physical Review A, vol. 103, no. 4, p. 042605, Apr. 2021, arXiv:2006.14044.

[8]  C. H. Bennett, D. P. DiVincenzo, J. A. Smolin, and W. K. Wootters, “Mixed-state entanglement and quantum error correction,” Physical Review A, vol. 54, no. 5, pp. 3824–3851, Nov. 1996, publisher: American Physical Society (APS). [Online]. Available: http://dx.doi.org/10.1103/PhysRevA.54.3824

[9]  R. Renner, “Security of Quantum Key Distribution,” arXiv:quant-ph/0512258, Jan. 2006. [Online]. Available: http://arxiv.org/abs/quant-ph/0512258

Speeding up Control-Z gates on a fluxonium quantum computer

Title: Fast logic with slow qubits: microwave-activated controlled-Z gate on low-frequency fluxoniums

Authors: Quentin Ficheux, Long B. Nguyen, Aaron Somoroff, Haonan Xiong, Konstantin N. Nesterov, Maxim G. Vavilov, and Vladimir E. Manucharyan

First Authors’ Institution: Department of Physics, Joint Quantum Institute, and Center for Nanophysics and Advanced Materials, University of Maryland

Status: Preprint: https://arxiv.org/abs/2011.02634

We exist in the era of noisy intermediate-scale quantum (NISQ) processors [1], currently available in the form of two 53-qubit processors made by IBM and Google. These are very promising for simulating many-body quantum physics, as was recently demonstrated when Google’s Sycamore processor claimed a “quantum advantage” [2]. NISQ processors, however, are still limited in processing power due to their small size and the presence of noise in quantum gates.

The commonality in current NISQ processors is their qubit implementation: the transmon. The transmon is composed of a Josephson junction shunted by a capacitor (Fig 1a), effectively a weakly-anharmonic electromagnetic oscillator (Fig 1b). First demonstrated in 2007 [3], it has been widely adopted in many-qubit processors as it is a very simple design to implement. The transmon’s weak anharmonicity, however, is its limiting factor for current performance and further scaling.

Figure 1 (a) Transmon circuit schematic and (b) potential structure with overlaid energy levels [4], as compared to (c) Fluxonium circuit schematic and (d) potential structure with overlaid energy levels [4]. (e) Frequency vs. flux bias for a two-fluxonium system [this work], where the minimum in 00 – 10 transition corresponds to the “sweet spot” in flux bias.

There are many promising alternatives to the transmon, fluxonium being a favorite because of its incredibly high anharmonicity and long coherence times (with observed T_{2} up to 500 \mu s [5, 6]). Fluxonium has similar elements to the transmon but is additionally shunted with a large inductor (Fig 1c), which gives it a highly anharmonic spectrum (Fig 1d) and makes it insensitive to offset charges [4]. Further, you can tune its resonant frequency to the so-called “sweet spot” in flux bias where it is first-order insensitive to flux noise (Fig 1e). Such a coherent and noise insensitive qubit would be ideal for scaling up quantum processors, right? So, why doesn’t a fluxonium quantum computer exist yet? It turns out, this noise insensitivity is exactly the problem. Let me explain!

Let us first consider the simplest form of circuit-circuit coupling: mutual capacitance (Fig 2a). With two fluxonium circuits, the coupling term is proportional to n_{a}n_{b}, where n_a and n_b are the amount of charge across the Josephson junctions in each circuit. The capacitive coupling produces little effect on the computational states |00\rangle, |01\rangle, |10\rangle, |11\rangle , since transition matrix elements of n_a vanish with the transition frequency. The |10\rangle and |20\rangle states will also remain unaffected due to parity selection rules. The states that will be affected are the higher energy non-computational states |12\rangle and |21\rangle which have higher transition frequencies, meaning the n_a transition matrix elements should be more dominant, causing a significant level repulsion, \Delta (see Fig. 2b). This level repulsion becomes key in connecting the two fluxonium subspaces, inducing an on-demand qubit-qubit interaction.

Figure 2 (a) Image of two fluxoniums coupled on-chip by a mutual capacitance; see also a zoomed in image of the large inductive element composed of >100 Josephson junctions [this work]. (b) The lowest energy levels for this two-fluxonium system. Pink arrows describe transitions at frequencies f_{10-20} and f_{11-21}, which are detuned by \Delta. This detuning is due to repulsion between |12\rangle and |21\rangle caused by the n_{a}n_{b} coupling.

In a previous paper [7], the authors describe how one can use these coupled subspaces to perform a microwave activated control-Z (CZ) gate between two fluxoniums by applying a 2\pi-pulse between the |11\rangle and |21\rangle states.

When one applies a CZ gate, if qubit 2 is in the ground (0) state, this transition on qubit 1 will not occur. However, if qubit 2 is in the excited (1) state, this transition on qubit 1 will occur! You can now read out the state of qubit 2 and infer the state of qubit 1! Read more about quantum logic gates here.

The nearby transition |10\rangle to |20\rangle will stay unexcited as long as the gate pulse is much longer than 1/\Delta. If the gate pulse is applied over a short duration, one will have unwanted leakage to the nearby transition. Although the prospect of a CZ gate between fluxoniums is attractive, for the device parameters used in this work, \Delta = 22 MHz and a high-fidelity CZ gate would require ~450 ns. For perspective, a transmon-transmon CZ gate is on the order of tens of ns. We can now see that insensitivity to offset charges makes fluxonium relatively insensitive to capacitive coupling, i.e. the repulsion \Delta is very small. This causes gate times between two fluxoniums to be very long, making them less attractive for large scale processors.

Is it possible to speed up this gate without significantly decreasing gate fidelity via leakage to the nearby |20\rangle state? This question leads us to the most impressive result of this work: exact leakage cancellation by synchronized Rabi oscillations!

If we apply a constant drive tone f_{d} at the transition frequency f_{11-21}, we will observe Rabi oscillations between the |11\rangle and |21\rangle states (the state vector traces a circle along the Bloch sphere from the bottom of the sphere, to the top, and to the bottom again). If our drive is slightly detuned from the transition frequency, f_{d} = f_{11-21} - \delta , the circle traced by the Bloch vector shifts such that it doesn’t make it all the way to the excited state (a review of Rabi oscillations can be found here). Since the detuning between the f_{11-21} and f_{10-20} transitions, \Delta, is very small, the authors were able to choose a drive frequency f_{d} which is near both transition frequencies (Fig. 3a). Since we are driving near both of these transitions, we see Rabi oscillations in both of these two level systems (Fig. 3b).

Fig. 3 (a) signal vs. drive frequency. Note a strong signal response for transition frequencies f_{11-21} and f_{10-20}, which are separated by \Delta. The Rabi drive tone f_{d} is detuned from f_{11-21} by \delta such that full Rabi oscillations for both transitions are completed in the same amount of time. (b) Cones traced by each state vector on the Bloch sphere due to the applied Rabi drive tone, f_{d}. Note the projection of these paths trace circles in opposite directions. (c) Optimal gate time of ~61 ns was chosen to minimize infidelity, leakage error, and phase error.

What is really interesting is that the detuning \delta from f_{11-21} can be chosen such that the circles traced by both Rabi oscillations are completed in the same amount of time, t, which is exactly equal to 1/\Delta. Now, we are able to perform a 2\pi rotation on both states simultaneously, but how does this help us perform a selective CZ gate on just |11\rangle to |21\rangle ? The key is to visualize how these two Rabi oscillations are evolving the state vector on the Bloch sphere.

As observed from the center of the Bloch sphere, one will notice that the two oscillating state vectors travel in different directions and define two distinct cones inside the sphere (see Fig. 3b again, noting the projection of the paths). These cones define the solid angles \Theta_{10} = 2 \pi (1-(\Delta-\delta)/ \Omega ) and \Theta_{11} = 2\pi (1+\delta/ \Omega ) , corresponding to a “geometric phase” accumulation \Phi_{ij}=\Theta_{ij}/2 for each system (you can read more about geometric phase here, but essentially it arises from the fact that the state vector traces a closed loop!).

Since these two transitions are now distinguishable by their geometric phase, the authors can then apply a unitary operation U = diag(1, 1, 1, e^{i\Delta\Phi}) to the states. This is effectively the same as assigning a phase difference \Delta\Phi between two trajectories to realize a control-Phase operation (again, you can review quantum logic gates here)! Therefore, a CZ gate is obtained when \Delta\Phi = -(\Theta_{11}-\Theta_{10})/2 = -\pi\Delta/\Omega = \pm\pi !
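The phase bookkeeping above can be checked numerically with a short sketch. It evaluates the two solid angles quoted in the text and the resulting phase difference, using \Delta = 22 MHz and the choice \Omega = \Delta, which is the condition implied by setting -\pi\Delta/\Omega to -\pi. The function names are hypothetical, and \Omega is treated as being expressed in the same frequency units as \Delta and \delta.

```python
import math


def solid_angles(Delta: float, delta: float, Omega: float):
    """Solid angles of the two cones, using the expressions quoted in the text."""
    theta_10 = 2 * math.pi * (1 - (Delta - delta) / Omega)
    theta_11 = 2 * math.pi * (1 + delta / Omega)
    return theta_10, theta_11


def phase_difference(Delta: float, delta: float, Omega: float) -> float:
    """Relative geometric phase: -(Theta_11 - Theta_10) / 2."""
    theta_10, theta_11 = solid_angles(Delta, delta, Omega)
    return -(theta_11 - theta_10) / 2


Delta = 22e6   # Hz, level repulsion quoted for this device
Omega = Delta  # assumed choice that makes the phase difference equal to -pi

# The result does not depend on the detuning delta of the drive.
for delta in (0.0, 5e6, 11e6):
    print(delta, phase_difference(Delta, delta, Omega) / math.pi)

# Duration of one synchronized cycle, t = 1 / Delta, in nanoseconds.
print(1e9 / Delta)
```

The detuning \delta cancels in the difference, so the phase depends only on the ratio \Delta/\Omega. With \Delta = 22 MHz one synchronized cycle lasts about 45 ns.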

Basically, the f_{11-21} and f_{10-20} transitions are very close in frequency space, but when their Rabi oscillations are synchronized in time, they become distinguishable by their geometric phase accumulation. A CZ gate becomes possible in a time as short as 1/\Delta if a control-Phase operation is also applied! In this work, the theory is verified both by simulation of the complete system Hamiltonian and by experiment! The authors state that this procedure can readily be extended to any other phase accumulation, an exciting result that can be further studied in other systems.

Now that this leakage cancellation has been performed, the authors determine the shortest gate time possible with sufficient fidelity by using Optimized Randomized Benchmarking over a variety of pulse parameters. Given their specific device parameters for their two fluxoniums (see their paper for these details), this results in an optimal gate time of ~ 61 ns. This is a huge improvement over the ~ 450 ns required without any leakage cancellation! Further, the ratio of coherence time : gate time (347 \mu s : 61 ns) is unmatched across quantum computing platforms (to the best of the authors’ knowledge).
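As a quick sanity check on the figures quoted above, the arithmetic can be spelled out in a few lines. The variable names are illustrative; the numbers are the ones given in the text.

```python
coherence_time_s = 347e-6         # coherence time quoted in the text
gate_time_s = 61e-9               # optimized gate time
unsynchronized_gate_s = 450e-9    # gate time without leakage cancellation

# Number of gates that fit within one coherence time.
coherence_to_gate_ratio = coherence_time_s / gate_time_s

# Speedup from synchronizing the two Rabi oscillations.
speedup = unsynchronized_gate_s / gate_time_s

print(round(coherence_to_gate_ratio))
print(round(speedup, 1))
```

Roughly 5,700 gate durations fit inside one coherence time, and the synchronized gate is about 7.4 times faster than the ~450 ns gate.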

Final Remarks
Typically, a two-fluxonium system would have very long gate times. While it can have very high coherence, slow gates make it less ideal for large scale quantum processors. However, one can engineer the system using synchronized Rabi oscillations and a control-Phase operation to significantly shorten CZ gate times. By doing this, the authors demonstrated the best ratio of coherence time to gate time that we know of to date! Even though these fluxoniums are about fifty times slower than transmons (i.e. their transition frequencies are about fifty times lower), the two-qubit gate is faster than microwave-activated gates on transmons, with gate error comparable to the lowest reported. Further work can be done by testing this procedure on other phase accumulation processes and in other two-qubit systems.

The only remaining factor preventing the development of large scale fluxonium processors is fabrication. A simple meandering inductor is physically limited by its maximum impedance. Instead, one can either use an array of hundreds of Josephson junctions (as in this paper, Fig. 2a) or a NbTiN nanowire [8] to create this large inductive element. Current limitations in fabrication make fluxonium much more difficult to create than a transmon; however, we can expect fabrication techniques and equipment to improve in the future, making fluxonium a more viable option for scaling up NISQ processors!

[1] Preskill, John, “Quantum Computing in the NISQ era and beyond”, https://arxiv.org/abs/1801.00862

[2] Arute, F., Arya, K., Babbush, R. et al., “Quantum supremacy using a programmable superconducting processor”, https://www.nature.com/articles/s41586-019-1666-5

[3] Koch, J., Yu, T. M. et al, “Charge insensitive qubit design derived from the Cooper pair box”, https://arxiv.org/abs/cond-mat/0703002

[4] Masluk, Nicholas Adam, “Reducing the loss of the fluxonium artificial atom”, https://qulab.eng.yale.edu/documents/theses/Masluk,%20Nicholas%20A.%20-%20Reducing%20the%20losses%20of%20the%20fluxonium%20artificial%20atom%20(Yale,%202012).pdf

[5] Nguyen, L. B. et al., “The high-coherence fluxonium qubit”, https://arxiv.org/abs/1810.11006

[6] Zhang, H. et al., “Universal fast flux control of a coherent, low-frequency qubit”, https://arxiv.org/abs/2002.10653

[7] Nesterov, K. N., Pechenezhskiy, I. V., Wang, C., Manucharyan, V. E., and Vavilov, M. G., “Microwave-activated controlled-Z gate for fixed-frequency fluxonium qubits”, https://arxiv.org/abs/1802.03095

[8] Hazard, T. M. et al., “Nanowire superinductance fluxonium qubit”, https://arxiv.org/abs/1805.00938

Just How Much Better is Quantum Machine Learning than its Classical Counterpart?

Title: Information-theoretic bounds on quantum advantage in machine learning

Authors: Hsin-Yuan Huang, Richard Kueng, John Preskill

First Author’s Institution: California Institute of Technology

Status: Pre-print on arXiv

In the past few years, machine learning has proven to be an extremely useful computational tool in a variety of disciplines. Examples include natural language processing, image recognition, beating players at Go, quantum error correction, and sifting through massive quantities of data at CERN. As is often the case, we may choose to ponder whether replacing classical algorithms by their quantum counterparts will yield computational advantages.

The authors consider the complexity of training classical machine learning (CML) and quantum machine learning (QML) algorithms on the class of problems that map a classical input (bit string) to a real number output through any physical process (including quantum processes). The goal of the ML algorithms is to learn functions of the following form: f(x) = \text{tr}(O \mathcal{E}(\left|x\right>\left<x\right|)) . Unpacking the notation, we see that given some input bit string x\in \{0,1\}^n that designates an initial quantum pure state, we get an output by processing the input through a quantum channel \mathcal{E} and then determining the expectation value of an operator O with respect to the output of the channel (i.e. quantum evolution of the initial state). The ML algorithms will attempt to predict the expectation value of O after training on many input bit strings and channel uses. You may wonder whether a single bit string is a sufficiently complicated input, but recall that all of the movies you watch, the articles on Wikipedia you read, and the music you play on your computer are represented by bit strings. Given a sufficiently long input bit string, we can feed into our algorithm arbitrarily long and precise classical data. What is very interesting about this class of problems is that we can describe quantum experiments with many different input parameters in this language, meaning the ML algorithms are being trained to predict the outcome of (potentially very complicated) quantum physical experiments. This could include predicting the outcome of quantum chemistry experiments, determining the ground state energy of a system, or predicting a variety of other observables from a myriad of physical processes.
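To make the notation concrete, here is a minimal numerical sketch of f(x) for a single qubit. The channel is a toy bit-flip channel written in Kraus form and the observable is Pauli-Z; both are assumptions chosen for illustration rather than examples taken from the paper, and the function names are hypothetical.

```python
import numpy as np


def basis_state_density(x: str) -> np.ndarray:
    """Density matrix |x><x| for a bit-string x."""
    dim = 2 ** len(x)
    rho = np.zeros((dim, dim), dtype=complex)
    index = int(x, 2)
    rho[index, index] = 1.0
    return rho


def apply_channel(rho: np.ndarray, kraus_ops) -> np.ndarray:
    """Apply a quantum channel given by its Kraus operators."""
    return sum(K @ rho @ K.conj().T for K in kraus_ops)


def f(x: str, observable: np.ndarray, kraus_ops) -> float:
    """f(x) = tr(O E(|x><x|))."""
    rho_out = apply_channel(basis_state_density(x), kraus_ops)
    return float(np.real(np.trace(observable @ rho_out)))


# Toy channel: flip the bit with probability p (an illustrative assumption).
p = 0.1
identity = np.eye(2, dtype=complex)
pauli_x = np.array([[0, 1], [1, 0]], dtype=complex)
pauli_z = np.diag([1.0, -1.0]).astype(complex)
bit_flip = [np.sqrt(1 - p) * identity, np.sqrt(p) * pauli_x]

print(f("0", pauli_z, bit_flip))
print(f("1", pauli_z, bit_flip))
```

For this toy channel f("0") is 1 - 2p and f("1") is -(1 - 2p). A learning algorithm would have to infer such values from repeated uses of the channel without being told p.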

As is often the case, there isn’t a single clear answer to the question posed, but the authors discuss two scenarios of interest: one in which QML models offer no significant advantage over CML models, and another in which QML models provide an exponential speedup. According to the paper, if one considers minimizing the average prediction error (over a distribution of all the inputs), then QML does not provide a significant advantage over classical machine learning. However, if one is interested in minimizing the worst-case prediction error (on the single input with the worst error), then quantum machine learning requires exponentially fewer runs of the quantum channel.

What are the ML Models of Interest?

The goal of supervised machine learning is to use a large quantity of input data and a corresponding set of outputs to train a machine to generate accurate predictions from new data. In this paper, the objective is to see how many times it is necessary to run a physical process \mathcal{E} such that the machine can accurately predict the results of the experiments to some tolerance. The scaling of the number of times the quantum process \mathcal{E} is used in the training phase of the algorithm defines the complexity of the algorithm. Each of the ML algorithms generates a predictive function h(x) , where the prediction error on each input is given by \left|h(x_i)-f(x_i)\right|^2 . If given a probability distribution \mathcal{D}(x) over all possible inputs, we would like to figure out the number of times that \mathcal{E} must be run to either produce an average-case prediction error of \sum_{x\in\{0,1\}^n} \mathcal{D}(x)\left|h(x)-\text{tr}(O\mathcal{E}(\left|x\right>\left<x\right|))\right|^2 \leq \epsilon or a worst-case prediction error given by \text{max}_{x_i\in\{0,1\}^n}\left|h(x_i)-f(x_i)\right|^2\leq \epsilon .
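To make the two error measures concrete, here is a minimal Python sketch that evaluates both for a toy problem. The target `f`, the predictor `h`, and the uniform distribution are hypothetical stand-ins chosen for illustration; they are not taken from the paper.

```python
from itertools import product

def all_bitstrings(n):
    """Return every n-bit input x as a tuple of 0/1 values."""
    return list(product((0, 1), repeat=n))

def average_case_error(h, f, dist, n):
    """Sum over x of D(x) * |h(x) - f(x)|^2."""
    return sum(dist(x) * abs(h(x) - f(x)) ** 2 for x in all_bitstrings(n))

def worst_case_error(h, f, n):
    """Max over x of |h(x) - f(x)|^2."""
    return max(abs(h(x) - f(x)) ** 2 for x in all_bitstrings(n))

N_BITS = 4

def f(x):
    """Hypothetical target function (stands in for tr(O E(|x><x|)))."""
    return sum(x) / len(x)

def h(x):
    """Hypothetical predictor: exact everywhere except the all-ones input."""
    return f(x) - 1.0 if all(x) else f(x)

def uniform(x):
    """Uniform distribution D(x) over all N_BITS-bit strings."""
    return 1.0 / 2 ** N_BITS

print(average_case_error(h, f, uniform, N_BITS))  # 0.0625
print(worst_case_error(h, f, N_BITS))             # 1.0
```

A predictor that is wrong on a single input out of 16 has a small average-case error but the largest possible worst-case error in this toy, which is why the two criteria can lead to very different training costs.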

The quantum advantage arises only when one considers the worst-case prediction error instead of the average prediction error.

It is very reasonable to think quantum machine learning would be better than classical machine learning when trying to predict the outputs of quantum experiments, but the main result throws that assumption into question. We begin by explaining how QML and CML are defined.

Figure 1: Illustration of classical and quantum machine learning models (Figure 1 from article)

Classical ML models are composed of a training phase that consists of taking a randomized set of inputs x_i and performing quantum experiments defined by a physical map (a CPTP map) from a Hilbert space of n qubits to one of m qubits that yields an output quantum state for each input. However, CML needs to be trained on classical output, so a POVM measurement (which is a generalization of a projective measurement) is performed that yields an output o_i \in \mathbb{R} . Any classical machine learning algorithm (neural networks, kernel methods, etc.) can then be applied to generate the predictive function h(x) that approximates f(x) (the true function) from the set of pairs of inputs and classical outputs \{(x_i,o_i)\}_{i=1}^{N_c} . Here N_c denotes the number of times the channel \mathcal{E} must be applied for CML, since it is only possible to obtain one input/output pair per channel use. The authors also present a further restricted classical ML algorithm, which is identical to the general case except that, instead of doing arbitrary measurements to generate the outputs, only the target observable O is directly measured.
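The restricted classical pipeline can be sketched in a few lines of Python. This is only a toy, assuming that each channel use returns a single-shot outcome of +1 or -1 whose mean is f(x), and that the "learning algorithm" is a simple lookup table of sample means. The names `run_experiment`, `train_restricted_cml`, and `toy_f` are hypothetical, and this is not the algorithm analyzed in the paper.

```python
import random
from collections import defaultdict

def run_experiment(x, f, rng):
    """One channel use: a single-shot outcome o in {+1, -1} whose mean is f(x)."""
    p_plus = (1.0 + f(x)) / 2.0
    return 1.0 if rng.random() < p_plus else -1.0

def train_restricted_cml(f, inputs, n_c, rng):
    """Collect n_c pairs (x_i, o_i) and return a lookup-table predictor h(x)."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for _ in range(n_c):
        x = rng.choice(inputs)                  # x_i drawn from a uniform D(x)
        totals[x] += run_experiment(x, f, rng)  # o_i from measuring O once
        counts[x] += 1

    def h(x):
        # Sample mean of the outcomes seen for x; 0.0 if x was never sampled.
        return totals[x] / counts[x] if counts.get(x) else 0.0

    return h

def toy_f(x):
    """Hypothetical target in [-1, 1]; stands in for tr(O E(|x><x|))."""
    return 1.0 - 2.0 * sum(x) / len(x)

INPUTS = [(0, 0), (0, 1), (1, 0), (1, 1)]
h = train_restricted_cml(toy_f, INPUTS, n_c=20000, rng=random.Random(0))
avg_error = sum((h(x) - toy_f(x)) ** 2 for x in INPUTS) / len(INPUTS)
print(f"average-case error after 20000 channel uses: {avg_error:.2e}")
```

Each loop iteration consumes exactly one channel use, so `n_c` plays the role of N_c in the text.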

A quantum ML model has the same goal as the classical models, but has the added advantage that the algorithm can be trained directly on quantum data. The general flow is that given an initial state \rho_0 on many qubits, the quantum channel \mathcal{E}\otimes \mathbb{I} is applied N_Q times with quantum processing maps \mathcal{C}_i inserted in between \mathcal{E} applications. The quantum data processing maps take the place of the classical machine learning algorithms. The resultant state that is stored in quantum memory is the learned prediction model. To predict new results, the final state after all of the channel applications and processing (\rho_{Q} ) is measured differently based on the desired input bit strings \tilde{x}_i . The outcomes of these measurements are the outputs of the predictive function h_Q(\tilde{x}_i) that approximate the desired function f(\tilde{x}) .

The Meat of the Paper

Average-case Prediction Error Result

The main results of the paper are presented in the form of theorems. The first theorem shows that if there exists a QML algorithm with an average prediction error bounded by \epsilon that requires N_Q uses of the channel \mathcal{E} , then there exists a restricted CML algorithm with order \mathcal{O}(\epsilon) average prediction error that requires \mathcal{O}\left(\frac{m N_Q}{\epsilon}\right) channel uses. The training complexity of the classical algorithm is directly proportional to that of the quantum model up to a factor of \frac{m}{\epsilon} where m is the number of qubits at the output of the map \mathcal{E} .

Sketch of First Theorem Proof

Consider an entire family of maps \mathcal{F} that contains the map \mathcal{E} of interest. Cover the space of the family of maps with a packing net by choosing the largest subset of maps \mathcal{S}=\{\mathcal{E}_a \}_{a=1}^{|\mathcal{S}|} \subset \mathcal{F} such that the functions generated by different maps f_a(x) = \text{tr}\left(O \mathcal{E}_a (\left|x\right>\left<x\right|)\right) are sufficiently distinguishable over the distribution of inputs \mathcal{D}(x) . (In the paper they take the average-case error between any two different elements of \mathcal{S} to be at least 4\epsilon .)

The proof proceeds as a communication protocol that starts by having Alice choose a random element of the packing net (\mathcal{E}_a \in \mathcal{S} ). Alice then prepares the final quantum state \rho_{\mathcal{E}_a} by using the chosen channel N_Q times and interleaving those uses with the quantum data processing maps \mathcal{C}_i . This results in the state the QML algorithm is supposed to generate to make predictions through measurements. Alice then sends \rho_{\mathcal{E}_a} to Bob and hopes he can determine what element of the packing net she initially chose by using the predictions of the QML model. Bob can then generate the function h_{Q, a}(x) that approximates f_{a}(x) with low average-case prediction error by construction. Moreover, because the different elements of the packing net are guaranteed to be far enough apart, Bob can with high probability determine which element Alice chose. Assuming Bob could perfectly decode which element Alice sent, Alice could transmit \log|\mathcal{S}| bits to Bob. With this scheme we expect the mutual information (a measure of the amount of information that one variable can tell you about another) between Alice’s selection and Bob’s decoding to be on the order of \Omega(\log|\mathcal{S}|) bits. The authors now make use of Holevo’s theorem, which provides an upper bound on the amount of information Bob is able to obtain about Alice’s selection by doing measurements on the state \rho_{\mathcal{E}_a} . This bound, the Holevo quantity, is therefore at least as large as the mutual information given above; it is also possible to relate the Holevo quantity directly to the number of channel uses N_Q required to prepare the state Bob receives. Through an induction argument in the Appendix, the authors show the Holevo information is upper bounded by \mathcal{O} (m N_Q) . From this it follows that the number of channel uses for the quantum ML algorithm is lower-bounded by N_Q = \Omega\left( \frac{\log |\mathcal{S}|}{m}\right) .

All that’s left is to relate the number of channel uses required for the QML model to the number needed for the classical machine learning algorithm. A restricted CML is constructed such that random elements x_i are sampled from \mathcal{D}(x) and the desired observable is measured after preparing \mathcal{E}(\left|x_i\right>\left<x_i\right|) each time. As mentioned earlier, N_c experiments are performed. The pairs of data \{(x_i,o_i)\}_{i=1}^{N_c} are used to perform a least-squares fit to the different functions f_a(x) of the packing net, which are sufficiently distinguishable as defined earlier. Therefore, given a sufficiently large number of data points, it is possible to determine which element of the packing net was used as the physical map from the larger family of maps. The authors show that if the number of points obtained N_c is on the order of \mathcal{O}\left(\frac{\log|\mathcal{S}|}{\epsilon}\right) , it is possible to find a function that achieves an average-case prediction error on the order of \epsilon . Relating N_c and N_Q through \log|\mathcal{S}| directly yields that N_c = \mathcal{O}\left(\frac{m N_Q}{\epsilon}\right) . Hence, if there is a QML model that can approximately learn f(x) , then there is also a restricted CML model that can approximately learn f(x) with a similar number of channel uses. This means that there is no quantum advantage in terms of the training complexity when considering minimum average-case prediction error, as any QML model could be replaced by a restricted CML model that achieves comparable results.
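For readers who want the whole argument on one line, the chain of bounds from the proof sketch can be collected as follows. The symbols I(a;\hat{a}) for the mutual information between Alice's choice a and Bob's guess \hat{a}, and \chi for the Holevo quantity, are notation introduced here for convenience and may differ from the paper's.

```latex
\begin{align*}
I(a;\hat{a}) &= \Omega\!\left(\log|\mathcal{S}|\right)
  && \text{(Bob decodes Alice's choice with high probability)} \\
I(a;\hat{a}) &\le \chi
  && \text{(Holevo's theorem)} \\
\chi &= \mathcal{O}(m N_Q)
  && \text{(induction over the channel uses)} \\
\Rightarrow\quad \log|\mathcal{S}| &= \mathcal{O}(m N_Q) \\
N_c &= \mathcal{O}\!\left(\frac{\log|\mathcal{S}|}{\epsilon}\right)
     = \mathcal{O}\!\left(\frac{m N_Q}{\epsilon}\right)
  && \text{(least-squares fit over the packing net)}
\end{align*}
```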

Worst-case Prediction Error Result

The second main result states that if we instead consider worst-case prediction error, then an exponential separation appears between the number of channel uses necessary in the QML and CML cases. This is shown in the paper through the example of trying to predict the expectation values of different Pauli operators for an unknown n-qubit state. As a refresher, the Pauli operators (I, X, Y, \text{and } Z ) are 2×2 unitary matrices that form a basis for Hermitian operators, which correspond to observable operators in quantum mechanics. Since any Hermitian operator can be decomposed in terms of sums of Paulis, it is natural to wonder about the different expectation values of each Pauli operator given an unknown state. Given a 2n-bit input string we can specify an n-qubit Pauli operator P_x and the channel \mathcal{E} , which both generates the unknown state of interest and maps P_x to a fixed observable O . Therefore, the function we are getting the ML model to learn is f(x) = \text{tr}\left(O \mathcal{E}_{\rho} (\left|x\right>\left<x\right|)\right) = \text{tr}\left(P_x \rho\right) . The authors show that by cleverly breaking the QML model into two stages, the first of which estimates the magnitude of the expectation value (|\text{tr}(P_x \rho)| ) and the second of which estimates its sign, only N_Q = \mathcal{O}\left(\frac{\log(M)}{\epsilon^4}\right) channel uses are necessary to predict expectation values of any M Pauli observables. Therefore it is possible to predict all 4^n Pauli expectation values up to a constant error with only a linear scaling in the number of qubits (\mathcal{O}(n) ). On the other hand, the paper shows the lower bound for estimating all the expectation values of the Pauli observables using classical machine learning is 2^{\Omega(n)} . Therefore, there is an exponential separation between QML and CML models in the number of channel uses necessary for predicting all Pauli observables in the worst-case prediction error scenario. The authors numerically demonstrate the difference in complexity between the different types of algorithms and show the separation clearly exists for a class of mixed states, but vanishes for a class of product states.
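As an illustration of how a 2n-bit string can label a Pauli observable, and of why \log(M) with M = 4^n grows only linearly in n, here is a small Python sketch. The bit-pair encoding in `PAULI_FROM_BITS` is an assumption made for this example (the paper may use a different convention), the product-state helper is used only because it keeps the arithmetic simple, and the hidden constant in the scaling is set to 1.

```python
import math

# Hypothetical encoding: each pair of bits selects one single-qubit Pauli.
PAULI_FROM_BITS = {(0, 0): "I", (0, 1): "X", (1, 0): "Y", (1, 1): "Z"}

def pauli_string(bits):
    """Map a 2n-bit input x to an n-qubit Pauli label such as 'XZI'."""
    if len(bits) % 2:
        raise ValueError("expected an even number of bits")
    pairs = (tuple(bits[i:i + 2]) for i in range(0, len(bits), 2))
    return "".join(PAULI_FROM_BITS[pair] for pair in pairs)

def expectation_product_state(label, bloch_vectors):
    """tr(P_x rho) for a product state given by one Bloch vector per qubit."""
    value = 1.0
    for pauli, (rx, ry, rz) in zip(label, bloch_vectors):
        value *= {"I": 1.0, "X": rx, "Y": ry, "Z": rz}[pauli]
    return value

def quantum_scaling(n, eps):
    """log(M) / eps^4 with M = 4^n; the unknown constant factor is set to 1."""
    return n * math.log(4) / eps ** 4

label = pauli_string((0, 1, 1, 1, 0, 0))
state = [(1.0, 0.0, 0.0), (0.0, 0.0, 1.0), (0.0, 0.0, 0.0)]
print(label, expectation_product_state(label, state))
print(quantum_scaling(20, 0.5) / quantum_scaling(10, 0.5))  # doubling n doubles the cost
```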


Classical machine learning is much more powerful than may be naively assumed. For average-case prediction error, CML models are capable of achieving a comparable training complexity to QML models. This means we may not need to wait for QML to predict the outcomes of physical experiments. Performing quantum experiments and using the output measurements to train CML models is a process that can be implemented in the near term. In the case of approximately learning functions, it appears that using a fully quantum model does not provide a significant advantage. However, the authors did demonstrate an interesting class of problems in which QML models are capable of exponentially outperforming even the best possible classical algorithm. The question is whether there are interesting tasks that require having a minimum worst-case prediction error, where learning the function over a majority of inputs does not suffice. I leave it to the reader as an exercise to search out other interesting learning tasks where quantum advantage is achieved and where the classical world of computing does not suffice.

Ariel Shlosberg is a PhD student in Physics at the University of Colorado/JILA. Ariel’s research focuses on quantum error correction and finding quantum communication bounds and protocols.

Landau-Zener interference: a “beam splitter” for controlling composite qubits

Title: Universal non-adiabatic control of small-gap superconducting qubits

Authors: Daniel L Campbell, Yun-Pil Shim, Bharath Kannan, Roni Winik, David K. Kim, Alexander Melville, Bethany M. Niedzielski, Jonilyn L. Yoder, Charles Tahan, Simon Gustavsson, and William D. Oliver

Status: Published 14 December 2020 on Phys. Rev. X 10, 041051

Are two qubits better than one? In this QuByte, we will be looking at a new variation of superconducting qubits, proposed by the EQuS lab at MIT. The new qubit, referred to as a superconducting composite qubit (CQB), is made up of two coupled transmon qubits. The authors of the paper answer the question in the affirmative: the composite qubit is more resilient to environmental noise, permitting a longer qubit lifetime than a single transmon qubit. Moreover, the fast and high-fidelity gate operations in the composite qubit, which use Landau-Zener interference, require fewer microwave resources than standard on-resonance Rabi drive techniques. In this QuByte, we review the mechanism of Landau-Zener (LZ) interference, and elucidate its role in qubit state initialization and gate implementation.

The composite qubit itself

Optical image of two composite qubits, \phi_i (i=1,2,3,4) is the reduced magnetic flux that determines the frequency of each transmon, controlled in real time by the arbitrary waveform generators (AWGs); \Omega_r and \Omega_{QB} are the microwave control fields for qubit state readout and initialization, respectively.

The composite qubit is formed by two transmons capacitively coupled together. When using a single transmon as a qubit, its energy eigenstates are used to encode information so that |0\rangle=|g_i\rangle and |1\rangle=|e_i\rangle where i denotes the transmon index. A single transmon qubit has frequency \omega_i, that is, the energy difference between its ground state |g_i\rangle and the excited state |e_i\rangle. This frequency \omega_i is controlled by the magnetic flux \Phi_i threading the SQUID loop of the transmon circuit. In the figure below, the dashed lines show the transmon frequency as a function of the reduced magnetic flux \phi_i=\Phi_i/\Phi_0 where \Phi_0 is the magnetic flux quantum. When the two transmons are flux-tuned to \phi_1=-\phi_A^* and \phi_2=\phi_A^*, they should have the same frequency \omega_1=\omega_2=\omega_A^*. But instead of their energy levels crossing, an energy gap called an “avoided crossing” appears for the coupled transmons (solid lines in the plot below). The magnitude of the avoided crossing \Delta is 65 MHz in this device, and it depends on the coupling strength between the two transmons. (For an intuitive and pictorial explanation of avoided crossings, I recommend this wonderful article with a classical example of coupled mechanical oscillators.)

Qubit energy spectrum as a function of the flux biases; dashed lines: frequencies of bare transmon states (diabatic states); solid lines: frequencies of the coupled two-transmon states (adiabatic states). The avoided crossing of size \Delta=65 MHz appears at \varphi_{2}=2 \varphi_{\mathrm{A}}^{*}+\varphi_{1} when two transmons are biased at the same frequency \omega_{1}=\omega_{2} \equiv \omega_{\mathrm{A}}^{*}.

At the avoided crossing, the eigenstates of the system are equal superpositions of the bare transmon states, which we use to define the computational basis of a qubit: |1\rangle=(1/\sqrt{2})(|g_1,e_2\rangle + |e_1,g_2\rangle) and |0\rangle=(1/\sqrt{2})(|g_1,e_2\rangle - |e_1,g_2\rangle). By this definition, the composite qubit has the frequency of the gap \Delta=65 MHz.

The composite qubit device emulates the following qubit Hamiltonian: H(t)/\hbar=-\frac{\Delta}{2} \sigma_z + \frac{\epsilon(t)}{2} \sigma_x, with states |0\rangle and |1\rangle associated with the eigenstates of \sigma_z, and the bare transmon states |g_1,e_2\rangle and |e_1,g_2\rangle associated with the eigenstates of \sigma_x. The parameter \epsilon(t) in the second term is determined by the frequency difference between the two bare transmons \epsilon(t)=\omega_1(t)-\omega_2(t), and is controlled by varying the flux biases on the individual transmons simultaneously as a function of time. As we expect, at the composite qubit operating point \epsilon(t)=0, the qubit exhibits an energy splitting of \Delta.
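A quick numerical check of this Hamiltonian: its two eigenvalues are \pm\frac{1}{2}\sqrt{\Delta^2+\epsilon^2}, so the splitting is \sqrt{\Delta^2+\epsilon^2} and reduces to \Delta at \epsilon=0. The sketch below works in MHz and uses the 65 MHz gap quoted above; the function names and the sample detunings are illustrative.

```python
import math

DELTA_MHZ = 65.0  # avoided-crossing size quoted in the text

def energy_levels(epsilon_mhz, delta_mhz=DELTA_MHZ):
    """Eigenvalues of H/hbar = -(delta/2) sigma_z + (epsilon/2) sigma_x."""
    half_gap = 0.5 * math.hypot(delta_mhz, epsilon_mhz)
    return -half_gap, half_gap

def energy_gap(epsilon_mhz, delta_mhz=DELTA_MHZ):
    """Splitting between the two levels, sqrt(delta^2 + epsilon^2)."""
    lower, upper = energy_levels(epsilon_mhz, delta_mhz)
    return upper - lower

for eps in (0.0, 72.0, 500.0):  # sample detunings in MHz (illustrative)
    print(f"epsilon = {eps:6.1f} MHz -> gap = {energy_gap(eps):7.2f} MHz")
```

Far from the operating point the gap approaches |\epsilon|, which is the frequency difference of the bare transmons, as expected.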

Landau-Zener interference

To understand how gates are implemented in this new qubit, we must first understand Landau-Zener interference. For a system described by the aforementioned Hamiltonian H(t), let’s denote its instantaneous eigenstates as \psi_-(t) and \psi_+(t), corresponding to the lower and the higher energy eigenstates, respectively. The adiabatic theorem tells us that if the qubit state is initialized in one of the eigenstates, say \psi_-(t), and if the time-dependent term in the Hamiltonian \epsilon(t) changes infinitely slowly, then the qubit always remains in that eigenstate \psi_-(t) throughout the evolution. However, if \epsilon(t) is varied such that the system traverses the avoided level crossing region in a finite time, a transition between the two energy levels can occur and the final state becomes a linear combination of the two instantaneous eigenstates. This transition between two energy levels that takes place while traversing the avoided crossing is called the Landau-Zener transition. It acts as a coherent beam splitter for a qubit state. The transition probability P_\mathrm{LZ} from state \psi_-(t) to \psi_+(t) is given by P_\mathrm{LZ}=|\alpha|^2=\exp\Big(-2\pi\frac{\Delta^2}{\hbar v}\Big) and depends on the size of the avoided crossing \Delta as well as the “velocity” of traversing the avoided crossing region v\equiv \dot{\epsilon}(t). (For a pedagogical derivation of this formula, see Vutha.) Moreover, if \epsilon(t) is varied periodically such that the system traverses the avoided crossing multiple times, a sequence of LZ transitions can be induced. The phases accumulated between successive LZ transitions can interfere constructively or destructively in a controlled manner, which can be used to create a general superposition state.

As we will see in the following, adiabatic evolution (i.e., varying \epsilon(t) slowly) is used for state initialization, while the non-adiabatic LZ transitions induced by quickly varying \epsilon(t) at the avoided crossing are used to implement gates that modify quantum states.

Qubit state initialization

To use the composite qubit to encode information, one needs to be able to initialize it into the computational states |0\rangle or |1\rangle. The state initialization protocol makes use of adiabatic evolution: we vary the system (\epsilon(t) in this case) very slowly such that if the system starts in an eigenstate of the initial Hamiltonian, it ends in the corresponding eigenstate of the final Hamiltonian. Let’s go through it step by step, and it will become clear exactly how slow the change in \epsilon(t) needs to be.

Energy levels of transmons 1 and 2 in the presence of a coherent microwave field with frequency \omega_{QB} and amplitude \Omega_{QB}.
  1. Start with both transmons in their ground states |g_1,g_2\rangle (corresponding to the filled-in black circle in the figure above). This is achieved by waiting long enough for the transmons to reach thermal equilibrium: the transmon temperature (around 40 mK at the bottom of the dilution fridge) is much smaller than their energy gaps (around 7 GHz for both transmons), so that k_{B} T \ll \hbar \omega_{i} and the transmon thermal state is approximately its ground state. Then turn on a static coherent microwave drive field of frequency \omega _{QB}; this drive hybridizes the states \left|g_{1}, g_{2}\right\rangle and \left|g_{1}, e_{2}\right\rangle as shown in the diagram above, and causes a splitting with magnitude 2\hbar\Omega_{QB} in the qubit energy spectrum. This splitting is known as the Autler-Townes splitting, and describes how the energy spectrum of an atom is modified when an oscillating electric field is close to resonance with the atomic transition frequency.
  2. Sweep the frequency \omega_2 of transmon 2 by tuning its flux bias \phi_2. Let \epsilon_{QB} be the detuning between transmon 2 frequency and the drive frequency \epsilon_{QB}=\omega_2-\omega_{QB}, and sweep \omega_2 through the drive frequency slowly such that the transmon state adiabatically evolves to \left|g_{1}, e_{2}\right\rangle (to the filled in purple circle in the figure above). The final occupation probability of state \left|g_{1}, e_{2}\right\rangle (i.e., the probability of remaining in the upper energy level during the sweep) is given by one minus the Landau-Zener transition probability P_{g_1,e_2}=1-e^{-2\pi\Omega_{QB}^2/\dot{\epsilon}_{QB}}. Thus we see that high-fidelity state initialization requires that the change in transmon 2 frequency be slow enough such that \dot{\epsilon}_{Q B}(t) \ll 2 \pi \Omega_{Q B}^{2}.
  3. Turn off the drive, then adiabatically tune the transmon fluxes to the composite qubit operating point \phi_1=-\phi_A^*, \phi_2=\phi_A^* so that the qubit state adiabatically evolves to |1\rangle \equiv (1 / \sqrt{2})\left(\left|g_{1}, e_{2}\right\rangle+\left|e_{1}, g_{2}\right\rangle\right) .

Similarly, one can prepare the logical state |0\rangle by adiabatically evolving the state: \left|g_{1}, g_{2}\right\rangle \rightarrow \left|e_{1}, g_{2}\right\rangle \rightarrow (1 / \sqrt{2})\left(\left|g_{1}, e_{2}\right\rangle-\left|e_{1}, g_{2}\right\rangle\right) \equiv|0\rangle. It takes about 250 ns to complete the above state initialization steps; this is mostly limited by how slowly the system needs to change to meet the adiabatic condition \dot{\epsilon}_{Q B}(t) \ll 2 \pi \Omega_{Q B}^{2}.
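The adiabatic condition in step 2 can be made quantitative by evaluating P_{g_1,e_2}=1-e^{-2\pi\Omega_{QB}^2/\dot{\epsilon}_{QB}} for a few sweep rates. In the sketch below the drive strength is a hypothetical value, not one reported in the paper, and both \Omega_{QB} and \dot{\epsilon}_{QB} are assumed to be expressed in consistent angular-frequency units.

```python
import math

def landau_zener_probability(omega_qb, sweep_rate):
    """exp(-2*pi*omega_qb**2 / sweep_rate): chance of the non-adiabatic jump."""
    return math.exp(-2.0 * math.pi * omega_qb ** 2 / sweep_rate)

def initialization_probability(omega_qb, sweep_rate):
    """P_{g1,e2} = 1 - P_LZ: chance of following the state adiabatically."""
    return 1.0 - landau_zener_probability(omega_qb, sweep_rate)

OMEGA_QB = 2.0 * math.pi * 10e6               # rad/s, hypothetical drive strength
ADIABATIC_SCALE = 2.0 * math.pi * OMEGA_QB ** 2  # the 2*pi*Omega_QB^2 in the text

for ratio in (1.0, 0.3, 0.1):                 # sweep rate as a fraction of the scale
    p = initialization_probability(OMEGA_QB, ratio * ADIABATIC_SCALE)
    print(f"sweep rate = {ratio:.1f} x 2*pi*Omega^2 -> P(g1,e2) = {p:.5f}")
```

Sweeping at the adiabatic scale itself gives only about 63% transfer, while a sweep ten times slower leaves an error of roughly e^{-10}, which illustrates why the initialization time is set by this condition.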

Single-qubit gates

In the following, we will talk about some elementary gates that are used to manipulate the state of a single composite qubit, namely, the X, Y and Z gates. The X and Y gates change the probabilities of measuring the qubit in the state |0\rangle or |1\rangle, whereas the Z gate modifies the relative phase between states |0\rangle and |1\rangle. Their operations can be visualized in the Bloch sphere representation as rotations of the qubit state around the x-, y- and z-axes, respectively. In composite qubits, the single-qubit gates are implemented through the transmon flux controls \phi_i. The physical set-up involves using arbitrary waveform generators (AWGs) to generate electrical current waveforms v(t) in the transmon circuits. These in turn produce the time-dependent magnetic fluxes that control the transmon frequency difference \epsilon(t) as desired.

The flux control pulse \epsilon(t) to implement Z gates, t_{\Delta} \equiv 2 \pi / \Delta.

A Z-gate implements a rotation around the z-axis in the Bloch sphere representation of qubit states. This rotation is described by a unitary operator, parameterized by a rotation angle \theta: Z(\theta)=e^{-i\theta\sigma_z/2}. The Z-gate of the composite qubit is realized by simply idling the qubit (that is, biasing the qubit with a constant magnetic flux) at the avoided crossing \epsilon(t)=0, as shown in the schematic above, and letting the qubit evolve under the Hamiltonian H_z/\hbar=-\frac{\Delta}{2}\sigma_z for some gate time t_d. Governed by the Schrödinger equation, the evolution from the initial state \psi(t=0) to the final state \psi(t=t_d) is given by a unitary operator U(t_d,0), i.e., \psi(t=t_d)=U(t_d,0)\psi(t=0), where U(t_d,0)=\exp[-iH_z t_d/\hbar]. This has exactly the form of the unitary we need to realize a Z-gate; the gate time t_d controls the rotation angle, \theta = \Delta t_d (up to the sign convention for the direction of rotation).
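Since the rotation angle is just \Delta t_d, gate times follow from simple arithmetic. The sketch below assumes that the quoted 65 MHz gap is \Delta/2\pi (an ordinary frequency rather than an angular one); under that assumption a full 2\pi rotation takes t_\Delta = 2\pi/\Delta \approx 15.4 ns and a Z(\pi/2) gate takes a quarter of that.

```python
import math

GAP_HZ = 65e6  # assumes the quoted 65 MHz is Delta / (2*pi)

def z_gate_time(theta, gap_hz=GAP_HZ):
    """Idle time t_d in seconds such that Delta * t_d = theta."""
    return theta / (2.0 * math.pi * gap_hz)

t_delta = z_gate_time(2.0 * math.pi)    # full precession period, 2*pi/Delta
t_quarter = z_gate_time(math.pi / 2.0)  # Z(pi/2), a quarter period
print(f"t_Delta = {t_delta * 1e9:.2f} ns, Z(pi/2) takes {t_quarter * 1e9:.2f} ns")
```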

\mathbf{X} and \mathbf{Y} gates

In transmon qubits, X and Y gates are conventionally implemented by subjecting a qubit to a continuous microwave driving field on resonance with the qubit. If the qubit has non-degenerate energy levels, then under this drive the probability amplitudes of the qubit's ground and excited states oscillate; this oscillation is referred to as the Rabi oscillation. For a transmon qubit with a frequency of a couple of gigahertz, the gate time of this type is on the order of tens of nanoseconds. How fast and how accurately these gates can be implemented is restricted by the validity of the rotating wave approximation. This approximation allows one to neglect the fast-oscillating terms when describing Rabi oscillations, and it only holds for qubits with large transition frequencies. For a composite qubit with a small gap on the order of tens of MHz between its computational states, the Rabi drive will be complicated to deal with, both mathematically and practically, due to the breakdown of this approximation. The alternative solution here, as you may have already guessed, is to use LZ transitions in small-gap qubits to manipulate qubit states.

Take the single-qubit gate X(\frac{\pi}{2})=e^{-i\pi\sigma_x/4} as an example; in the Bloch sphere representation, this gate rotates the qubit state about the x-axis by an angle \theta=\pi/2. The schematic below shows the control \epsilon(t) for its implementation. This protocol is designed to induce non-adiabatic LZ transitions at the avoided crossings to manipulate the qubit’s state. Specifically, the control pulse contains one period of a sinusoidal pulse \epsilon(t)=\epsilon_p\sin(\omega_p t) with idling operations (i.e., constant flux biases at the avoided crossing, equivalent to Z-gates) padded before and afterwards.

The flux control pulse \epsilon(t) to implement X gates

During the implementation of the gate, the flux controls on the two coupled transmons are varied simultaneously with respect to the avoided crossing. Let \delta f_A denote how much the flux controls \phi_i are biased away from the avoided crossing bias point \phi_A^* (i.e., \phi_{1}=-\phi_{\mathrm{A}}^{*}+\delta f_{\mathrm{A}} and \phi_{2}=\phi_{\mathrm{A}}^{*}+\delta f_{\mathrm{A}}), let \psi_\pm(t) be the instantaneous eigenstates of the time-dependent Hamiltonian H(t) so that H(t) \psi_{\pm}(t)=E_{\pm}(t) \psi_{\pm}(t), and let \Omega(t)=\sqrt{\Delta^{2}+\epsilon(t)^{2}} be the energy gap between the two states. The plot below shows how the qubit energy gap \Omega(t) changes with respect to the flux detuning \delta f_A, which is varied in time as shown in the lower panel of the plot.

Upper: the measured CQB excited state spectroscopy to form a two-level system; lower: non-adiabatic control implemented by a non resonant sinusoidal excursion about the avoided crossing.

The dashed lines in the lower panel marked 1, 2 and 3 correspond to the start, the middle and the end of the sinusoidal flux tuning. Here is a quick look at what happens to the qubit state during this sinusoidal excursion about the avoided crossing \delta f_A=0:

  • At 1 (\delta f_A=0) an LZ transition takes place as the qubit state leaves the avoided crossing.
  • During 1\rightarrow 2 (\delta f_A<0) phase accumulates between the two qubit eigenstates.
  • At 2 (\delta f_A=0) another LZ transition takes place when the qubit traverses the avoided crossing.
  • During 2\rightarrow 3 (\delta f_A>0) phase accumulates between the qubit eigenstates (the same as what happened during 1\rightarrow 2).
  • At 3 (\delta f_A=0) another LZ transition takes place when the qubit returns to the avoided crossing.

To better illustrate this process, let’s redraw the previous plot of the qubit energy levels during the sinusoid pulse as a function of time:

A sketch of the time-dependent qubit energy levels during the sinusoidal flux pulse.

The non-adiabatic LZ transitions induced at the time points 1, 2 and 3 are described by a unitary operator U_{LZ}=\left(\begin{array}{cc}\cos (\theta / 2) \exp \left(i \tilde{\phi}_{S}\right) & i \sin (\theta / 2) \\i \sin (\theta / 2) & \cos (\theta / 2) \exp \left(-i \tilde{\phi}_{S}\right)\end{array}\right). Thus, the occupation amplitudes in the upper and lower eigenstates interfere, resulting in state transitions with probability P_\mathrm{LZ}=\cos^2\frac{\theta}{2}, and a relative phase \tilde{\phi}_S is introduced between \psi_+(t) and \psi_-(t).

During the adiabatic evolution from time 1 to 2 and from time 2 to 3, the upper energy state \psi_+(t) acquires a phase relative to the lower energy level \psi_-(t); no LZ transition is encountered due to adiabaticity. For the evolution 1\rightarrow 2, the phase accumulated is \phi_1=\int_{t_i}^{t_2}\Omega(t)dt. Geometrically this can be interpreted as the area under the curve \Omega(t) (the shaded yellow area above). Similarly, for the evolution 2\rightarrow 3 the accumulated phase is \phi_2=\int_{t_2}^{t_f}\Omega(t)dt (the shaded blue area above).
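The two shaded areas can be estimated numerically by integrating \Omega(t)=\sqrt{\Delta^2+\epsilon(t)^2} over each half of the sinusoidal pulse \epsilon(t)=\epsilon_p\sin(\omega_p t). The sketch below uses a composite Simpson rule; the gap and pulse frequency are the values quoted in this article (65 MHz and 125 MHz, both assumed to be ordinary frequencies), while the pulse amplitude `EPS_P` is a hypothetical value chosen only for illustration.

```python
import math

DELTA = 2 * math.pi * 65e6     # rad/s, assumes the quoted 65 MHz is Delta / (2*pi)
OMEGA_P = 2 * math.pi * 125e6  # rad/s, sinusoidal pulse frequency quoted in the article
EPS_P = 2 * math.pi * 300e6    # rad/s, hypothetical pulse amplitude
PERIOD = 2 * math.pi / OMEGA_P # 8 ns

def gap(t, eps_p=EPS_P):
    """Omega(t) = sqrt(Delta^2 + epsilon(t)^2), epsilon(t) = eps_p * sin(OMEGA_P * t)."""
    return math.hypot(DELTA, eps_p * math.sin(OMEGA_P * t))

def accumulated_phase(t_start, t_end, eps_p=EPS_P, steps=2000):
    """Composite Simpson estimate of the integral of Omega(t) dt (steps must be even)."""
    if steps % 2:
        steps += 1
    dt = (t_end - t_start) / steps
    total = gap(t_start, eps_p) + gap(t_end, eps_p)
    for k in range(1, steps):
        weight = 4.0 if k % 2 else 2.0
        total += weight * gap(t_start + k * dt, eps_p)
    return total * dt / 3.0

phase_12 = accumulated_phase(0.0, PERIOD / 2)     # evolution 1 -> 2
phase_23 = accumulated_phase(PERIOD / 2, PERIOD)  # evolution 2 -> 3
print(f"phase 1->2: {phase_12:.4f} rad, phase 2->3: {phase_23:.4f} rad")
```

Because \Omega(t) depends only on \epsilon(t)^2, the two half-periods contribute equal phases for a symmetric sinusoid, consistent with the two shaded areas being described as the same.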

The process described above is, in effect, an interferometer made out of beam splitters placed at the avoided crossings; the fast-changing fluxes near \delta f_A=0 in the sinusoid induce transitions between the upper and lower energy branches, just as a beam splitter splits light into different optical paths. Similarly, the slowly changing fluxes away from \epsilon(t)=0 contribute to the phase evolution and set the stage for constructive and destructive interference between successive LZ transitions, just as sources of light create an interference pattern depending on their relative phase. The differences between the two scenarios are as follows: the optical interference is between photons, while here the interference is between quantum states of a superconducting qubit; the optical interference pattern is determined by the optical path length, while here the interference happens in phase space and is determined by the time-dependent qubit energy splitting; and lastly the qubit LZ interferometer is more fragile than an optical interferometer, since photon states are very robust to decoherence.

As explained above, the evolution under the sinusoidal flux pulse can be seen as a combination of X-rotations (state transition) and Z-rotations (phase evolution). Padded Z-gates are introduced to cancel out the excess Z-rotation and so implement a pure X gate with a desired rotation angle. The start of the X(\frac{\pi}{2}) pulse, once chosen, establishes the x-axis of the Bloch sphere. The Y(\frac{\pi}{2}) gate, a rotation around the y-axis, is then implemented by advancing the onset of the X(\frac{\pi}{2}) pulse by the time t_{xy}, which corresponds to the gate time of a Z(\frac{\pi}{2}) gate, i.e., t_{xy}=t_\Delta/4, as shown below.
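The reason a timing shift turns an X rotation into a Y rotation is the identity Y(\theta)=Z(\frac{\pi}{2})\,X(\theta)\,Z(-\frac{\pi}{2}): sandwiching the X pulse between quarter-period Z rotations rotates its axis from x to y. The sketch below checks this identity numerically with plain 2×2 matrices, using the convention Z(\theta)=e^{-i\theta\sigma_z/2} from the Z-gate discussion. Whether the pulse must be advanced or delayed in the experiment depends on sign conventions that this sketch does not try to capture.

```python
import cmath
import math

def matmul(a, b):
    """Product of two 2x2 complex matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def z_rot(theta):
    """Z(theta) = exp(-i theta sigma_z / 2)."""
    return [[cmath.exp(-0.5j * theta), 0.0], [0.0, cmath.exp(0.5j * theta)]]

def x_rot(theta):
    """X(theta) = exp(-i theta sigma_x / 2)."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[c, -1j * s], [-1j * s, c]]

def y_rot(theta):
    """Y(theta) = exp(-i theta sigma_y / 2)."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[c, -s], [s, c]]

def max_difference(a, b):
    """Largest elementwise distance between two 2x2 matrices."""
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))

theta = math.pi / 2
shifted_x = matmul(z_rot(math.pi / 2), matmul(x_rot(theta), z_rot(-math.pi / 2)))
print(max_difference(shifted_x, y_rot(theta)))  # close to zero
```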

The flux control pulse \epsilon(t) to implement Y gates

The frequency of the sinusoidal pulse realized by the flux controls is chosen to be \omega_p/2\pi= 125 MHz (corresponding to an 8 ns sine pulse). This frequency, which determines the gate time, is chosen such that the passage through the avoided crossing is fast enough to induce LZ transitions, but not so fast that successive LZ transitions overlap with each other. The flux control can be easily realized using an arbitrary waveform generator. Compared with Rabi-type gates, where one needs to modulate a microwave signal at several gigahertz, which requires expensive microwave generators and IQ mixers, the gates here using LZ interference are simpler and cheaper to implement in hardware.

Let’s overview how quantum computation would proceed using a composite qubit. It begins by initializing the qubit in the computational state |0\rangle or |1\rangle using the adiabatic evolution we described previously via a static microwave field. The single-qubit X and Y gates are implemented using LZ interference, and the Z gate is implemented by the idling operation. To complete a universal gate set for quantum computation, the two-qubit controlled-Z (CZ) gate is also demonstrated in the paper by turning on an effective \sigma_z\otimes \sigma_z interaction between two composite qubits. This interaction is realized by adiabatically tuning the frequency of the second composite qubit to reach an avoided crossing that involves the second excited state of the bare transmons; this operation is similar to how a CZ gate is implemented between two standard transmon qubits. The composite qubit state is read out by uniquely mapping the computational states |0\rangle and |1\rangle to the bare transmon states \left|e_{1}, g_{2}\right\rangle and \left|g_{1}, e_{2}\right\rangle by the adiabatic evolution, and then reading out through the readout resonators as in standard transmon qubits.

Noise immunity

Finally, let me discuss why it is better to use two transmons instead of one as a qubit. Recall we choose the eigenstates of the coupled transmons at the avoided crossing as the qubit states |0\rangle and |1\rangle. Remarkably, this choice of computational basis allows immunity to certain noise processes. For example, the processes of thermal excitation and energy relaxation (the former causes the transition |g_i\rangle\rightarrow |e_i\rangle and the latter |e_i\rangle\rightarrow |g_i\rangle) are sources of errors in a single transmon qubit. However, to cause a transition between the qubit computational states |0\rangle and |1\rangle, it takes a correlated excitation and relaxation event to flip the states of both transmons. Flipping the states of both transmons is less likely to happen, as it requires a correlated two-photon interaction with the environment; this makes the composite qubit states insensitive to uncorrelated state transitions in single transmons. As a figure of merit for qubit lifetime, the T_1 time measures the timescale on which transitions between the qubit states occur. The measured T_1 of a composite qubit is longer than 2 ms, which is 2 orders of magnitude longer than that of a single transmon. The relaxation process to the state |g_1,g_2\rangle, however, takes the qubit state out of the computational subspace; this happens on the timescale of tens of microseconds (comparable to the T_1 time of the bare transmons). The good news is that this leakage error can be detected by continuously monitoring the readout resonators when biased at the avoided crossing, since the leakage to state |g_1,g_2\rangle will result in a shift in the resonant frequency of the readout resonator; while errors of this sort can occur, they are easily detectable.

Furthermore, the composite qubit is robust to the frequency fluctuations in individual transmons; for example, in a single transmon qubit the frequency fluctuations due to environmental flux noise and the photon shot noise from the readout resonator cause the qubit to decohere. However, when biased at the avoided crossing, the qubit frequency of the composite qubit \Delta is determined by the fixed coupling strength between two transmons, and thus is insensitive to bare transmon frequency fluctuations. Another figure of merit for the qubit lifetime is the qubit decoherence time T_2, which measures the time scale on which the qubit goes from a maximal superposition state to a classical probability mixture. The T_2 time of the composite qubit, measured in a Hahn echo decay experiment, is reported to be greater than 23 \mu s, which is much larger than the T_2 of 3 \mu s for a single transmon qubit.
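Both protection arguments can be checked with a few matrix elements. The sketch below assumes the computational states at the avoided crossing are the symmetric and antisymmetric combinations of |e_1, g_2\rangle and |g_1, e_2\rangle; that explicit form, and the variable names, are assumptions made for illustration rather than details quoted from the paper.

```python
import numpy as np

# Single-transmon basis: index 0 = |g>, index 1 = |e>. Two-transmon ordering: |q1 q2>.
g = np.array([1, 0], dtype=complex)
e = np.array([0, 1], dtype=complex)
I2 = np.eye(2, dtype=complex)

lower = np.outer(g, e)                  # |g><e|: relaxation of a single transmon
sz = np.diag([-1, 1]).astype(complex)   # frequency shift of a single transmon

eg = np.kron(e, g)                      # |e1, g2>
ge = np.kron(g, e)                      # |g1, e2>
gg = np.kron(g, g)                      # |g1, g2>, outside the computational subspace

# Assumed form of the computational states at the avoided crossing.
zero = (eg + ge) / np.sqrt(2)
one = (eg - ge) / np.sqrt(2)

relax_1 = np.kron(lower, I2)            # relaxation acting on transmon 1 only
dephase_1 = np.kron(sz, I2)             # frequency noise acting on transmon 1 only

flip = one.conj() @ relax_1 @ zero      # amplitude for |0> -> |1>
leak = gg.conj() @ relax_1 @ zero       # amplitude for |0> -> |g1, g2>
shift = (one.conj() @ dephase_1 @ one - zero.conj() @ dephase_1 @ zero).real

print("bit-flip amplitude:", abs(flip))
print("leakage amplitude:", abs(leak))
print("first-order frequency shift of the qubit:", shift)
```

In this toy model a single relaxation event never connects the two computational states; it only leaks to |g_1, g_2\rangle, and a frequency shift of one transmon leaves the qubit splitting unchanged to first order.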

I hope now you are convinced that it is worthwhile to use two transmons as one qubit to gain protection against certain environmental noise. With the protected computational states come new challenges in performing state initialization and state manipulation for qubits in the low-frequency regime. The authors of the paper demonstrate that it is feasible to use adiabatic evolution to initialize states in a qubit with frequency below the environmental temperature, and to use Landau-Zener interference to perform fast qubit gates.

The solutions demonstrated here can be extended to other types of small-gap superconducting qubits with near-degenerate eigenstates. For instance, gate operations using Landau-Zener interference have been implemented in the early pioneering works on superconducting charge qubits and more recently in fluxoniums, and may prove useful in some new qubit designs, for example the 0-\pi qubit and the very small logical qubit (VSLQ) design. The results of today’s paper highlight an inexpensive flux control protocol that performs universal control in small-gap superconducting qubits, paving the way to a more scalable hardware architecture as researchers push for larger qubit numbers.

Haimeng Zhang is a PhD student in Electrical Engineering at the University of Southern California. Haimeng’s research focuses on non-Markovian dynamics and quantum error suppression protocols in superconducting qubits.

The first trapped-ion quantum computer proposal

Title: Quantum Computations with Cold Trapped Ions

Authors: Ignacio Cirac, Peter Zoller

Status: Published 1994 in Physical Review Letters

In 1994, theorists Ignacio Cirac and Peter Zoller published a paper that marked the birth of a new field in experimental physics: trapped-ion quantum computing.

The idea that we could use quantum systems to solve some problems more efficiently than classical computers had been around for a while already, but Cirac and Zoller proposed a key component to the physical realization of an actual quantum computer on a trapped-ion system: the two-qubit gate.

Trapped ions were a natural choice for quantum computers because the technology for controlling these systems at the quantum level was already advanced. Laser cooling, a staple technique in atomic physics, was first demonstrated on a cloud of ions, and quantum jumps were first observed in single trapped-ion systems.

So, when buzz about universal quantum computers began, the ion trappers tuned in. They thought they had (or could develop) all of the tools necessary to build the first quantum computer.

There are a few requirements for making a quantum computer, but two of the most fundamental are:

  1. Good qubits with long coherence times relative to the calculation time. This means that:
    • If the qubit is in state |1\rangle it will remain so without decaying to state |0\rangle and vice versa. (In the field of quantum computing, the time it takes for this decay to happen is known as “T1 coherence time” or “energy coherence time.”)
    • If the system is in a superposition state |\psi\rangle = a|1\rangle + b|0\rangle then the phase relationship between the two terms will remain well defined, i.e. there is no “dephasing” noise. The time for an equal superposition state, |+\rangle, to completely dephase to an orthogonal state, |-\rangle, is known as “T2 coherence time” or “dephasing time.”
  2. A way to implement multi-qubit gates. These are the basic building blocks of any computational algorithm. In classical computing this would be like an AND or an OR gate. The quantum versions of these gates are a little more complex, however, since the outcome of these gates is often an entangled state among the qubits involved. But you need just one two-qubit gate combined with single-qubit rotations to build a universal quantum computer.

The first point is easy. We just have to define two states in the ion to be the qubit states |1\rangle and |0\rangle. As long as the upper state is long-lived and the qubit is sufficiently isolated from the environment, trapped-ion qubits can have extremely long coherence times (the record is over 10 minutes! [1]).

But point two wasn’t quite so obvious when people first started considering a trapped-ion quantum computer. You can’t directly couple the electronic levels of two different ions to share their quantum information, so they needed an indirect way to mediate coupling between two qubits. This ended up being the shared motion of the ions in a trap.

Let me explain. An ion confined in a harmonic trap will have its motional energy quantized into harmonic oscillator levels n \hbar \omega. If there are N ions in this trap, then, just like coupled harmonic oscillators, the system is defined by the 3N normal modes of motion shared among the ions in the trap. Because the ions are electrically charged, their Coulomb repulsion means the motion of one ion affects the motion of another. We can use this interaction to couple qubits together, with the shared motion acting as an information bus for multi-qubit gates.
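As a small illustration of shared normal modes, the sketch below computes the axial modes of two ions in a harmonic trap. It works in dimensionless units (positions scaled so that the potential is V = \sum_i u_i^2/2 + \sum_{i<j} 1/|u_i - u_j|), covers only the N motions along the trap axis rather than all 3N modes, and uses a hypothetical helper name axial_hessian.

```python
import numpy as np

def axial_hessian(u):
    """Hessian of V = sum_i u_i^2 / 2 + sum_{i<j} 1 / |u_i - u_j| at positions u."""
    n = len(u)
    hess = np.eye(n)                      # trap curvature on the diagonal
    for i in range(n):
        for j in range(n):
            if i != j:
                k = 2.0 / abs(u[i] - u[j]) ** 3   # Coulomb "spring constant"
                hess[i, i] += k
                hess[i, j] -= k
    return hess

# Equilibrium of two ions: trap force balances Coulomb repulsion, separation d with d^3 = 2.
d = 2.0 ** (1.0 / 3.0)
u_eq = np.array([-d / 2, d / 2])

eigvals, modes = np.linalg.eigh(axial_hessian(u_eq))
freqs = np.sqrt(eigvals)                  # mode frequencies in units of the trap frequency

print("mode frequencies / trap frequency:", freqs)
print("mode vectors (columns):")
print(modes)
```

The two modes are the center-of-mass mode, where both ions move together at the trap frequency, and the stretch mode, where they move in opposite directions at \sqrt{3} times the trap frequency. Either one is a motion shared by both ions.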

But this only works if we have a way to couple the qubit to the motion. In 1994, when this paper was written, this coupling had already been demonstrated. Through laser cooling, physicists showed that light could be used to control the motion of an atom [2]. And through an extension of the general laser cooling concept, physicists showed that they could use light to couple the electronic degree of freedom to a single, particular harmonic oscillator energy level, provided the transition linewidth is narrow enough that these harmonic levels can be resolved. This is known as a resolved sideband interaction [3].

If an ion is in the ground qubit state and the nth motional energy level, \psi = |0\rangle |n\rangle , then we can drive this sideband transition by applying a laser whose frequency is \omega_l = \omega_0 \pm \omega_m, where \omega_0 is the qubit frequency splitting and \omega_m is one of the shared normal mode frequencies of motion. Depending on whether we choose a positive or negative detuning, this will cause a blue sideband transition up to \psi  = |1\rangle |n+1\rangle or a red sideband transition to \psi  =| 1\rangle |n-1\rangle , respectively. In this way we can add and subtract single phonons from the trapped ion system, which can allow us to cool the system to the ground state of motion and also move information from the electronic state of one ion to the electronic state of another by transferring it through their shared motional mode.
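The sideband couplings described above can be written as small matrices. This sketch keeps only the part of each coupling that excites the qubit (the Hermitian conjugate drives the reverse transition), truncates the motion to four levels, and drops overall prefactors such as the Rabi frequency; the names red, blue and state are illustrative.

```python
import numpy as np

N_MAX = 4  # number of motional levels kept (n = 0..3), an arbitrary truncation

# Annihilation operator on the truncated mode: a|n> = sqrt(n)|n-1>.
a = np.diag(np.sqrt(np.arange(1, N_MAX)), k=1)
raise_q = np.array([[0.0, 0.0], [1.0, 0.0]])   # |1><0| on the qubit

red = np.kron(raise_q, a)       # |0>|n> -> sqrt(n)   |1>|n-1>
blue = np.kron(raise_q, a.T)    # |0>|n> -> sqrt(n+1) |1>|n+1>

def state(q, n):
    """Basis vector |q>|n> in the ordering used by np.kron(qubit, motion)."""
    vec = np.zeros(2 * N_MAX)
    vec[q * N_MAX + n] = 1.0
    return vec

print(red @ state(0, 2))    # weight only on |1>|1>
print(blue @ state(0, 2))   # weight only on |1>|3>
print(red @ state(0, 0))    # all zeros: no motional level below n = 0
```

The \sqrt{n} and \sqrt{n+1} factors are the usual harmonic oscillator matrix elements; they set how fast each transition is driven but not which states it connects.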

One important thing to note: if we start in |0\rangle |n=0\rangle , then applying a red sideband will do nothing, since there is no motional energy level lower than n=0, which is necessary to satisfy energy conservation in this case. The same reasoning can be applied for the case where we try to apply a blue sideband pulse on a starting state |1\rangle |n=0\rangle —there is no motional state below n=0 , so the blue sideband does nothing to this state. See the figure below for a pictorial representation:

So how do you make a two-qubit gate out of this interaction? Starting with ions with all modes cooled to the ground state of motion, and three relevant internal energy levels, |g\rangle , |e_0\rangle , and | e_1\rangle (where |g\rangle and |e_0\rangle are the qubit levels and | e_1\rangle is an auxiliary level) Cirac and Zoller proposed the following three steps:

  1. Red sideband \pi-pulse between |g\rangle _1 and |e_0\rangle _1 on ion 1. This will move the population in state |e_0\rangle_1 to state |g\rangle _1 and add a quantum of shared motion to the system. It will do nothing to state |g\rangle _1. (The subscript outside of the ket denotes which ion.)
  2. Red sideband 2\pi-pulse between |g\rangle _2 and |e_1\rangle _2 on ion 2. If a quantum of motion was added in step 1, then this will cause a transition between |g\rangle _2 |n=1\rangle and |e_1\rangle _2|n=0\rangle. Since it is a 2\pi-pulse, the population won’t change, but it will acquire a \pi phase shift.
  3. Red sideband \pi-pulse between |g\rangle _1 and |e_0\rangle _1 on ion 1. This transfers anything in |g\rangle _1 |n=1 \rangle back to |e_0\rangle _1 |n=0\rangle, leaving the system back in the ground state of motion.
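The three steps can be simulated directly on a small Hilbert space. The sketch below assumes each pulse is the unitary exp[-i (k\pi/2)(|e\rangle\langle g| a + h.c.)] with zero laser phase, gives both ions three internal levels, and truncates the shared mode at n = 2. The function and variable names are illustrative, and the phase convention is an assumption.

```python
import numpy as np

N_PH = 3             # motional levels kept (n = 0, 1, 2)
DIM = 3              # internal levels per ion
G, E0, E1 = 0, 1, 2  # |g>, |e_0>, |e_1>

# Annihilation operator on the truncated shared mode: a|n> = sqrt(n)|n-1>.
a = np.diag(np.sqrt(np.arange(1, N_PH)), k=1).astype(complex)

def transition(upper):
    """Operator |upper><g| on a single ion."""
    m = np.zeros((DIM, DIM), dtype=complex)
    m[upper, G] = 1
    return m

def red_sideband(ion, upper, k):
    """k*pi red-sideband pulse on `ion` (0 or 1) between |g> and |upper> (assumed zero phase)."""
    factors = [np.eye(DIM, dtype=complex), np.eye(DIM, dtype=complex)]
    factors[ion] = transition(upper)
    coupling = np.kron(np.kron(factors[0], factors[1]), a)   # |upper><g| a
    h = coupling + coupling.conj().T
    w, v = np.linalg.eigh(h)
    return (v * np.exp(-1j * k * np.pi / 2 * w)) @ v.conj().T

def basis(ion1, ion2, n=0):
    """Basis vector |ion1>|ion2>|n> in the np.kron ordering used above."""
    vec = np.zeros(DIM * DIM * N_PH, dtype=complex)
    vec[(ion1 * DIM + ion2) * N_PH + n] = 1
    return vec

pulse_1 = red_sideband(0, E0, 1)   # step 1: pi pulse on ion 1
pulse_2 = red_sideband(1, E1, 2)   # step 2: 2*pi pulse on ion 2 via the auxiliary level
pulse_3 = red_sideband(0, E0, 1)   # step 3: pi pulse on ion 1
gate = pulse_3 @ pulse_2 @ pulse_1

labels = {G: "g", E0: "e0"}
phases = {}
for q1 in (G, E0):
    for q2 in (G, E0):
        amp = np.vdot(basis(q1, q2), gate @ basis(q1, q2))
        phases[(q1, q2)] = amp
        print(f"|{labels[q1]}>_1 |{labels[q2]}>_2 |n=0> returns with amplitude {amp:.3f}")
```

Under these assumptions every two-qubit basis state returns to itself with the motion back in n = 0, and only |e_0\rangle_1 |e_0\rangle_2 picks up a minus sign. In other words, the sequence acts as a controlled phase flip.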

Now, let’s look at a truth table of the results of these pulses on two qubits. From the original paper we get:

If we combine this gate with single qubit rotations (and reverting back to standard qubit state labels |0\rangle and |1\rangle), then the truth table can be simplified to:

|0\rangle_1 |0\rangle_2 \rightarrow |0\rangle_1 |0\rangle_2
|0\rangle_1 |1\rangle_2 \rightarrow |0\rangle_1 |1\rangle_2
|1\rangle_1 |0\rangle_2 \rightarrow |1\rangle_1 |1\rangle_2
|1\rangle_1 |1\rangle_2 \rightarrow |1\rangle_1 |0\rangle_2

This is the controlled-NOT (CNOT) gate. The first ion acts as the “control” qubit. If it is in state |1\rangle, then a NOT gate is performed on the “target” qubit, or ion 2, which flips the state of the qubit. If the control qubit is |0\rangle, then nothing happens.
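One standard way to see how single-qubit rotations turn a phase gate into a CNOT is to sandwich the target qubit between Hadamard gates. The sketch below assumes the three-pulse sequence acts as a controlled phase flip (a sign change on |1\rangle_1 |1\rangle_2 only) and uses Hadamards for simplicity; the original proposal uses its own choice of single-qubit pulses, so this is an equivalent illustration rather than the paper's exact recipe.

```python
import numpy as np

I2 = np.eye(2)
H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)   # Hadamard on the target qubit
CZ = np.diag([1, 1, 1, -1])                    # phase flip on |1>_1 |1>_2 only

# Basis ordering: |00>, |01>, |10>, |11> with ion 1 (control) written first.
cnot = np.kron(I2, H) @ CZ @ np.kron(I2, H)

names = ["|0>_1 |0>_2", "|0>_1 |1>_2", "|1>_1 |0>_2", "|1>_1 |1>_2"]
for col, name in enumerate(names):
    row = int(np.argmax(np.abs(cnot[:, col])))
    print(f"{name} -> {names[row]}")
```

The printed mapping matches the truth table above: the target flips only when the control is |1\rangle.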

The fact that this proposal enabled quantum computing on trapped ions with such a simple series of pulses created a ton of excitement among ion trappers. However, it had one fatal flaw: if the ions’ motion heats up during the gate, then it will fail. Keeping ions in the ground motional state for long periods of time unfortunately was an unrealistic expectation, since their motion is extremely sensitive to electric field noise. So, while this is a very important paper from a historical perspective, the Cirac-Zoller gate is not used in any modern trapped-ion quantum computers. In fact, it was never experimentally realized with the originally proposed setup, since a few years after this proposal, Klaus Mølmer and Anders Sørensen came up with their scheme for a two-qubit gate that was more robust to ion heating [4]. The Mølmer-Sørensen gate is still commonly used today.

[1] Wang, Y., Um, M., Zhang, J. et al. Single-qubit quantum memory exceeding ten-minute coherence time. Nature Photon 11, 646–650 (2017). https://doi.org/10.1038/s41566-017-0007-1

[2] D. J. Wineland, R. E. Drullinger, and F. L. Walls. Radiation-Pressure Cooling of Bound Resonant Absorbers. Phys. Rev. Lett. 40, 1639 (1978).

[3] Diedrich, F., Bergquist, J., et al. Laser cooling to the zero-point energy of motion. Phys. Rev. Lett. 62, 403–406 (1989).

[4] Sørensen, A., Mølmer, K. Quantum Computation with Ions in Thermal Motion. Phys. Rev. Lett. 82 (9), 1971–1974 (1999).

Will Quantum Computers without Error Correction be Useful for Optimization?

Title: Limitations of optimization algorithms on noisy quantum devices

Authors: Daniel Stilck Franca, Raul Garcia-Patron

First Author’s Institution: University of Copenhagen

Status: Pre-print on arXiv

The Big Idea

The authors develop a theoretical technique to identify situations where a noisy quantum computer without error correction loses to current classical optimization methods. The authors use their technique to provide estimates for many popular quantum algorithms running on near-term devices, including variational quantum eigensolvers, quantum approximate optimization (both gate-based), and quantum annealing (think D-Wave). The authors found that for quantum computers without error correction “substantial quantum advantages are unlikely for optimization unless current noise rates are decreased by orders of magnitude or the topology of the problem matches that of the device. This is the case even if the number of qubits increases.”

The authors’ conclusion allows other researchers to identify and focus on the slim regime of experiments where quantum advantage without error correction is still possible, and to shift more time into the development of error-corrected quantum computers. Even seemingly “negative” results advance the field in meaningful ways.

Currently in Quantum Computing

Current quantum computers are noisy (errors frequently occur) and of intermediate size (~50-100 qubits): large enough to compete with the best classical computers on (almost) useless problems, but not large enough to be fault-tolerant. A fault-tolerant quantum computer has enough error correction so that errors mid-calculation do not affect the final result. Fault tolerance requires more and better qubits than are available today. The community is fairly confident fault-tolerant quantum computers will outperform classical computers on many useful problems, but it is unclear if noisy, intermediate scale quantum (NISQ) computers can do the same.

Optimization problems are a very useful, profitable, and ubiquitous class of problem where the goal is to minimize or maximize something: cost, energy, path length, etc. Optimization problems occur everywhere, from financial portfolios to self-driving cars, and are often NP-hard, a class of problems widely believed to be extremely difficult for classical computers to solve. Comparing computers based on ability to solve optimization problems has two benefits. First, it is easy to see which computer’s solution is better. Second, if quantum computers have an advantage, there are immediate applications.

How to Tell if a NISQ Computer is a Poor Optimizer

Quantum state diagram. In red is the region of classical optimization superiority. Figure 1 from Limitations of optimization algorithms on noisy quantum devices

Get familiar with the state diagram: I’ll be using it to explain the entire technique!

The optimization task is to minimize a “cost function,” which takes an input and assigns a “cost” (the function’s output). Here, the input is the quantum state of the device \rho, and the “cost” is the energy of the state. The function is characterized by a linear operator H (usually a Hamiltonian). Every H has a family of thermal equilibrium states (inputs) that do not change with time. The device has a unique thermal equilibrium state (Gibbs state for short) labelled by \sigma_\beta for every temperature. \beta represents the inverse of temperature, so \beta = 0 is “burning hot,” though for such a tiny device “hot” just means randomly scrambled (i.e. noisy). Likewise, \beta = \infty is absolute zero temperature, meaning everything is perfectly in order and functioning as intended (i.e. noiseless).
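To get a feel for how the Gibbs state interpolates between “scrambled” and “solved”, here is a minimal sketch that builds \sigma_\beta = e^{-\beta H}/\mathcal{Z} for a toy two-qubit cost operator and evaluates its energy. The operator H = Z_1 Z_2 + 0.5 Z_1 is a hypothetical example chosen for illustration, not a problem from the paper.

```python
import numpy as np

def gibbs_state(h, beta):
    """Gibbs state exp(-beta * h) / Z of a Hermitian matrix h."""
    w, v = np.linalg.eigh(h)
    weights = np.exp(-beta * (w - w.min()))   # shift the spectrum for numerical stability
    weights /= weights.sum()
    return (v * weights) @ v.conj().T

def energy(h, rho):
    """Cost function value Tr(h rho)."""
    return np.trace(h @ rho).real

Z = np.diag([1.0, -1.0])
I2 = np.eye(2)
h = np.kron(Z, Z) + 0.5 * np.kron(Z, I2)      # hypothetical toy cost operator

betas = [0.0, 0.5, 1.0, 2.0, 5.0, 50.0]
energies = [energy(h, gibbs_state(h, b)) for b in betas]
for b, en in zip(betas, energies):
    print(f"beta = {b:5.1f}   energy = {en:+.4f}")
print("ground-state energy:", np.linalg.eigvalsh(h).min())
```

At \beta = 0 the state is maximally mixed and the energy is just the average of the spectrum; as \beta grows the energy falls monotonically toward the true minimum.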

Figure 1 from the paper shows device states (labelled black dots) at various points in a quantum computation. The noisy quantum computer with n qubits initialized at state \rho=|0\rangle^{\otimes n} attempts to follow the absolute zero temperature path (orange arrow) through the space of “noiseless” quantum states (black Bloch sphere) and arrive at the true answer \sigma_{\beta=\infty}. However, the real computation takes a noisy path (black arrow) that after enough time, leads to the steady state of the noise at \sigma_{\beta=0}.

The quantum device state at an intermediate point in the real computation \Phi(\rho) is too difficult to simulate classically (NP complexity), but at each intermediate state \Phi(\rho), there is a set of states (located on the blue line) with the same cost function value (i.e. energy) to within some error \epsilon. One of those equal-energy states is the Gibbs state at inverse temperature \beta, denoted in the diagram by e^{-\beta H} / \mathcal{Z}. In fact, we can “mirror” the real path of the quantum device (black arrow) as it progresses in time by matching all intermediate states with Gibbs states, producing a “mirror descent” (green line) from the correct answer to the steady state of the noise \sigma_{\beta=0}.

For any class of optimization problems, there is a critical threshold \beta_C below which we are guaranteed to efficiently classically compute the Gibbs state, providing a cost function estimate better than the current quantum state \Phi(\rho): this threshold is depicted in red and labelled \mathcal{C}. Once a quantum computation passes the threshold (i.e. has an equivalent Gibbs state with \beta<\beta_C), one has certified classical superiority. All that remains to use this technique is a method for quantifying how far along in \beta the associated quantum state \Phi(\rho) is.
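The matching step can be sketched as a one-dimensional search: given the energy of a noisy state, find the \beta whose Gibbs state has the same energy and compare it with the threshold. Everything numeric below (the toy spectrum, the noisy energy, and the value of beta_c) is a made-up placeholder for illustration, and bisection is just one simple way to do the search.

```python
import numpy as np

def gibbs_energy(spectrum, beta):
    """Energy of the Gibbs state at inverse temperature beta, from the eigenvalues of H."""
    w = np.asarray(spectrum, dtype=float)
    weights = np.exp(-beta * (w - w.min()))
    return float(np.sum(w * weights) / np.sum(weights))

def matching_beta(spectrum, target_energy, beta_max=50.0, tol=1e-10):
    """Bisection for the beta whose Gibbs energy equals target_energy.

    Relies on the Gibbs energy decreasing monotonically with beta, and on the
    target lying between the energies at beta = 0 and beta = beta_max.
    """
    lo, hi = 0.0, beta_max
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gibbs_energy(spectrum, mid) > target_energy:
            lo = mid      # still too hot: energy too high, move to larger beta
        else:
            hi = mid
    return 0.5 * (lo + hi)

spectrum = [1.5, -0.5, -1.5, 0.5]   # hypothetical eigenvalues of a toy cost operator
noisy_energy = -0.4                 # hypothetical energy reached by the noisy device
beta_c = 0.5                        # hypothetical classical threshold

beta = matching_beta(spectrum, noisy_energy)
print(f"matching beta = {beta:.4f}")
print("classically beatable:", beta < beta_c)
```

In this toy setting the noisy state sits at a smaller \beta than the placeholder threshold, so it would fall in the red region of the diagram.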

The authors quantify the distance between the quantum state \Phi(\rho) and the steady state of the noise \sigma_{\beta=0} by the relative entropy (for our purposes, relative entropy is simply a way to measure the distance between quantum states). The relative entropy only decreases throughout the computation (meaning \Phi(\rho) is always moving towards \sigma_{\beta=0}, never away). Sparing the relative entropy proof (see the paper), it is possible to identify an upper bound on the relative entropy (distance between \Phi(\rho) and \sigma_{\beta=0}) at any time during the computation. If the Gibbs state corresponding to the upper bound has \beta<\beta_C, the noisy quantum calculation now provides a certifiably worse answer than a classical optimizer would.
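The monotonic decrease is easy to see in the simplest possible case. The sketch below tracks the relative entropy between a single-qubit state and the maximally mixed state under repeated depolarizing noise. It is a toy model with a hypothetical noise rate p and no gates in between; unitary gates would leave this particular distance unchanged, so only the noise moves it.

```python
import numpy as np

def relative_entropy_to_mixed(rho):
    """D(rho || I/d) = log2(d) - S(rho), in bits."""
    d = rho.shape[0]
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]              # drop zeros: 0 * log(0) = 0
    entropy = -np.sum(evals * np.log2(evals))
    return float(np.log2(d) - entropy)

def depolarize(rho, p):
    """Depolarizing channel: keep rho with probability 1 - p, else replace by I/d."""
    d = rho.shape[0]
    return (1 - p) * rho + p * np.eye(d) / d

rho = np.diag([1.0, 0.0])   # qubit initialized in |0><0|
p = 0.1                     # hypothetical noise rate per step

distances = []
for _ in range(20):
    distances.append(relative_entropy_to_mixed(rho))
    rho = depolarize(rho, p)

for step, dist in enumerate(distances):
    print(f"step {step:2d}   D(rho || I/2) = {dist:.4f} bits")
```

The distance starts at one bit for the pure initial state and shrinks with every noisy step, which is the behavior the upper bound in the paper captures for realistic circuits.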

Any noiseless quantum optimization computation produces better answers the longer it is allowed to run, but in real processes, the longer the computation, the more the noise corrupts the answer. This technique gives a hard limit on how long a computation can be run for some quantum noise/device/algorithm combination before the noise makes the process worse than classical optimization.

Takeaways from the Authors

This technique requires one to choose the noise model, device parameters, optimization technique, and target problem before “ruling out” the quantum advantage of any such combination. This freedom is very powerful. One can use this technique to find regions of classical superiority for all near-term quantum optimization devices, as well as to quantify the reduction in noise level required for a chance at quantum advantage.

Let us return to fault-tolerance for a moment. If the reduction in noise required for a NISQ device/algorithm to reach quantum advantage is below the noise threshold where fault-tolerance becomes possible, it will likely be better to perform fault tolerant operation instead. Any NISQ device/algorithm revealed by this technique to have such a noise reduction requirement will likely be a bad optimizer for its entire existence, rendered obsolete by current classical superiority followed by the rise of fault-tolerant quantum computing.

The authors consider variational quantum eigensolvers, quantum annealing, and quantum approximate optimization with currently available superconducting qubit devices and simple noise models, finding a noise reduction requirement below the fault-tolerant threshold. However, accurate noise models are often far more complex and device-specific than the local depolarizing noise used by the authors for quantum circuits, so some hope of quantum advantage in optimization remains for existing devices. Further, the authors found current NISQ devices have a chance at quantum advantage in the arena of Hamiltonian simulation, a related and important problem (though perhaps not as profitable as optimization).

The most interesting applications of this technique may be yet to come, as more scientists begin applying the technique to a wider variety of quantum computations. At its best, this technique can guide experimental NISQ devices into parameter regions with a real chance at quantum advantage and reveal which emerging quantum devices have the best shot at quantum advantage.

Thanks for reading – I’ll answer questions in the comments, and don’t be afraid to look at the paper (open-access here) if you are curious about the details.

Matthew Kowalsky is a graduate student at the University of Southern California whose research focuses on quantum annealing, open quantum system modeling, and benchmarking dedicated optimization hardware.