In Search Of The Essence Of Quantum Machine Learning
How parameterized quantum circuits turn exponential Hilbert spaces into learning models
Classic machine learning already provides universal approximators and massive models. So why resort to quantum circuits? This is the search for the real challenge: designing problem-specific circuits that translate theoretical capacity into practical learning.

What could quantum machine learning offer that would justify the effort involved in establishing a new field? The bar is set high. Over the past decade, classical machine learning has achieved remarkable results. Large language models now capture far-reaching dependencies in texts with more than a billion parameters. Their results (texts, source code, images, even videos) are often indistinguishable from human work.
And that's only one part of it. Classic machine learning encompasses much more than large language models. Neural networks are universal function approximators: with sufficient capacity, they can represent any continuous function. Kernel methods map data into high-dimensional feature spaces where linear separation is possible. Optimization algorithms such as stochastic gradient descent and Adam scale reliably to millions or billions of parameters. Together, these advances form an arsenal of techniques that dominate tasks in image processing, language, and structured data alike.

Against this backdrop, we must critically examine the core approach of quantum machine learning, which focuses on a parameterized quantum circuit as a trainable model. The circuit is trained and its parameters are updated through a classical optimization loop. This design turns the quantum circuit itself into a function approximator that competes directly with neural networks, kernel methods, and other classical architectures.

For such an approach to be viable, a parameterized quantum circuit must offer some representation capacity or computational efficiency that goes beyond what these established methods already achieve. Without this, quantum models run the risk of becoming expensive imitations of methods that already work, and the rationale for quantum machine learning disappears.

So what does a quantum circuit offer that a classical algorithm does not?
The first real clue comes from comparing how classical and quantum models expand data. In a classical feature map, an input $x$ is assigned to a feature vector $\phi(x)$. In this space, the number of features grows polynomially (usually linearly) with the dimension of the input.

A quantum feature map, on the other hand, embeds an input $x$ into the state space of $n$ qubits. This state is a superposition of all $2^n$ computational basis states. Instead of a polynomially growing set of features, the embedding opens up access to an exponentially large basis. Each additional qubit doubles the number of accessible states, creating a representation landscape that no classical feature map can efficiently match.

This shift puts the problem in a new context. The promising potential of quantum circuits lies not in marginal improvements, but in the fact that they open up a Hilbert space whose size grows exponentially with the number of qubits. This exponential expansion is the structural reason for assuming that a quantum model could offer an advantage for machine learning.

However, access to a Hilbert space of dimension $2^n$ is meaningless if the model does not use it. A quantum circuit must do more than embed inputs in a huge space. It must use this space in such a way that it achieves a separation that no efficient classical construction can.
So even though we know that the capacity is exponential, naive encodings fall back on forms that classical models can approximate just as well. Any possible quantum advantage vanishes. The unresolved problem therefore lies in designing data encodings and algorithms that actually exploit the exponential structure. Until then, the exponentially growing Hilbert space remains a theoretical promise rather than a practical advantage.

Fortunately, quantum machine learning is an active area of research, and a number of publications show how to exploit access to an exponentially large Hilbert space. Yet, unfortunately, that research is not exactly beginner-friendly, because the articles sound something like this:
> Define the quantum feature map $\phi(x) = U(x)\,|0\rangle$ and the associated quantum kernel as the overlap $k(x, x') = |\langle 0 |\, U^\dagger(x)\, U(x')\, |0 \rangle|^2$. For any positive semidefinite kernel $k$, there exists a unique reproducing kernel Hilbert space $\mathcal{H}_k$ of functions with $f(x) = \langle f, k(\cdot, x) \rangle_{\mathcal{H}_k}$, characterized by the reproducing property. In the case of quantum kernels, $\mathcal{H}_k$ is the closure of finite linear combinations of the quantum features $k(\cdot, x)$. One can show that for certain feature maps constructed from commuting gates, approximating the kernel to additive error is classically hard on average, unless the polynomial hierarchy collapses. Thus, while kernel values can be efficiently estimated on a quantum computer, they are conjectured to be intractable to approximate classically. Consequently, the hypothesis space $\mathcal{H}_k$ is computationally accessible only with quantum resources, establishing a separation in computational expressivity between quantum and classical kernel methods.
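Stripped of the formalism, the quantum kernel above is just the squared overlap of two encoded states. Here is a minimal sketch, assuming Qiskit's quantum_info module; the two-qubit angle encoding is an arbitrary toy choice, not one of the classically hard constructions:

```python
import numpy as np
from qiskit import QuantumCircuit
from qiskit.quantum_info import Statevector

def encode(x):
    """Toy feature map U(x)|00>: angle encoding plus a little entanglement."""
    qc = QuantumCircuit(2)
    qc.ry(x[0], 0)
    qc.ry(x[1], 1)
    qc.cx(0, 1)
    return Statevector.from_instruction(qc)

def quantum_kernel(x1, x2):
    """Fidelity kernel k(x, x') = |<0| U(x)^dagger U(x') |0>|^2."""
    return np.abs(encode(x1).inner(encode(x2))) ** 2

print(quantum_kernel([0.3, 1.1], [0.5, 0.2]))
```

The classically hard constructions from the literature replace encode with circuits that are believed to be intractable to simulate; the kernel formula itself stays the same.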
The findings in the literature are rigorous, but not always practical. They prove that certain quantum kernels are difficult to approximate classically. Yet, they do not explain how to develop working models for real data sets. There are two main challenges here.

Challenge 1: Choosing useful encodings
It is easy to write down a feature map $\phi(x)$ that embeds data into an exponentially large state space. It is much harder to find an encoding that
- captures the structure of the actual learning problem and
- remains provably hard to approximate classically, while still being trainable on real quantum hardware.
Most scientific papers provide existence results and hardness proofs, but they stop short of giving recipes that align naturally with typical datasets.

Challenge 2: Making circuits trainable
In this setting, gradients vanish exponentially fast with the number of qubits, a problem known as barren plateaus. This makes training practically impossible without additional structure or careful initialization.
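The effect is easy to observe numerically. The following is a sketch under assumptions of my own: Qiskit's EfficientSU2 ansatz, a global $Z^{\otimes n}$ observable, and the parameter-shift rule; none of these choices come from the text. The estimated gradient variance shrinks rapidly as qubits are added:

```python
import numpy as np
from qiskit.circuit.library import EfficientSU2
from qiskit.quantum_info import SparsePauliOp, Statevector

def gradient_variance(n_qubits, samples=200):
    """Estimate Var[dE/dtheta_0] over random parameter initializations."""
    ansatz = EfficientSU2(n_qubits, reps=2)
    observable = SparsePauliOp("Z" * n_qubits)

    def energy(params):
        state = Statevector.from_instruction(ansatz.assign_parameters(params))
        return np.real(state.expectation_value(observable))

    grads = []
    for _ in range(samples):
        params = np.random.uniform(0, 2 * np.pi, ansatz.num_parameters)
        plus, minus = params.copy(), params.copy()
        plus[0] += np.pi / 2
        minus[0] -= np.pi / 2
        grads.append((energy(plus) - energy(minus)) / 2)  # parameter-shift rule
    return np.var(grads)

for n in range(2, 8):
    print(f"{n} qubits: Var[grad] ~ {gradient_variance(n):.5f}")
```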
Therefore, it is not enough to simply place a model in the vast Hilbert space. The architecture must be chosen in such a way that learning remains possible. But how does a circuit actually put this space to work?
The key mechanism for accessing the exponential Hilbert space is interference. When a quantum circuit is running, the system evolves into a superposition of many computational paths. Each path contributes a probability amplitude, and the probability of an outcome depends on the square of the sum of these amplitudes.

By tuning the circuit parameters, you change how the paths reinforce or cancel each other out. This selective interference allows the circuit to highlight certain patterns in the input and suppress others. The effect is a decision function that can be highly nonlinear and structured. In classical networks, capturing the same pattern would often require deep architectures or a large number of parameters.

In physics, the double-slit experiment is the canonical demonstration of interference. A minimal quantum circuit mirrors it:
- A Hadamard gate puts the qubit into a superposition of $|0\rangle$ and $|1\rangle$.
- A rotation around the $Z$-axis, $R_z(\varphi(x))$, introduces a relative phase that depends on the input data $x$.
- A second Hadamard gate brings the two paths back together so that their amplitudes interfere.
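To see where the probabilities below come from, trace the state through these three gates (standard single-qubit algebra, using the convention $R_z(\varphi) = \mathrm{diag}(e^{-i\varphi/2}, e^{i\varphi/2})$ and writing $\varphi$ for $\varphi(x)$):

$$|0\rangle \;\xrightarrow{H}\; \frac{|0\rangle + |1\rangle}{\sqrt{2}} \;\xrightarrow{R_z(\varphi)}\; \frac{e^{-i\varphi/2}|0\rangle + e^{i\varphi/2}|1\rangle}{\sqrt{2}} \;\xrightarrow{H}\; \cos\frac{\varphi}{2}\,|0\rangle \;-\; i\sin\frac{\varphi}{2}\,|1\rangle$$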
The resulting probability of measuring outcome $0$ is then

$$P(0) = \cos^2\left(\frac{\varphi(x)}{2}\right),$$

while outcome $1$ occurs with probability

$$P(1) = \sin^2\left(\frac{\varphi(x)}{2}\right).$$

As you can see, both probabilities depend on the sine and cosine of the rotation angle around the $Z$-axis that determines the relative phase.

This explanation is quite mathematical. It's high time to make the principle tangible and look at it in practical terms.
The following listing depicts the core of the circuit implementation in Qiskit (a minimal sketch using the standard QuantumCircuit API; the register and parameter names match the explanation below):
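```python
from qiskit import QuantumCircuit, QuantumRegister, ClassicalRegister
from qiskit.circuit import Parameter

def double_slit():
    """Sketch of the double-slit circuit described in the text."""
    qr = QuantumRegister(1, "qr")    # quantum register with a single qubit
    cr = ClassicalRegister(1, "cr")  # classical register for the measurement
    qc = QuantumCircuit(qr, cr)

    theta = Parameter("theta")       # external parameter, bound later

    qc.h(qr[0])                      # superposition of |0> and |1>
    qc.rz(theta, qr[0])              # input-dependent relative phase
    qc.h(qr[0])                      # let the two paths interfere
    qc.measure(qr, cr)               # read out the interference result

    return qc, theta
```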
- We define a quantum circuit with two registers:
  - qr defines a quantum register with a single qubit.
  - cr defines a classical register with one classical bit to store the measurement outcome.
- We add three gates:
  - The first Hadamard gate puts the qubit into a state in which the amplitudes of both basis states are equal. In other words, if we were to measure this state, we would measure 0 and 1 with the same probability.
  - The RZ gate applies a rotation around the Z-axis by the angle theta that we take as an external parameter. This means it is not yet fixed at a certain value but is specified when we use the quantum circuit. The rotation does not affect the measurement probability of either basis state, but it changes the relative phase of the superposition.
  - The second Hadamard gate moves the qubit away from a state in which both basis states have an amplitude of the same size. Depending on the phase applied, it moves either in the direction of 0 or in the direction of 1.
- The final instruction measures the qubit in the computational basis and puts the result into the classical register cr.
- The function returns the quantum circuit qc and the parameter theta.
If you look at the mathematical explanation of our circuit again, you will see that we do not feed the input $x$ directly into the $R_z$ operator via the placeholder $\theta$ ("theta"), but rather its image under a function $\varphi$ ("phi").

This function $\varphi$ serves as a feature map that translates the external variable into the actual gate angle used in the circuit. It implements a simple linear transformation:

$$\varphi(x) = w \cdot x + b$$

where $w$ controls how many oscillations of the interference pattern occur over the interval $[0, 2\pi]$, and $b$ introduces a phase shift along the $x$-axis.

This mapping ensures that the measurement probabilities $P(0)$ and $P(1)$ exhibit the characteristic sinusoidal patterns associated with the interference of individual qubits, with adjustable frequency and phase shift. The parameters $w$ and $b$ are optional and are therefore sometimes omitted for the sake of conciseness.

The corresponding function in Python is straightforward:
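```python
def phi(x, w=1.0, b=0.0):
    """Linear feature map: translate the input x into the gate angle w*x + b."""
    return w * x + b
```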
In the next step, we use our double_slit circuit. The complete script is sketched below, after the step-by-step explanation.
- We start with a single classical input value x.
- We call double_slit() and obtain the circuit instance and an instance of the parameter theta.
- We specify x and map it into a circuit parameter through the feature map phi. This step turns classical data into a phase angle. With w=3.0, we pack three oscillations into the range between 0 and 2π. Further, we specify the starting offset to be b=0.1.
- Next, we bind the computed angle to the symbolic placeholder theta. This makes the circuit concrete: when executed, it rotates exactly by this phase. Without this binding, the circuit is just a template.
In the next block, we run the quantum circuit, in our case on a local simulator.
- We specify a backend,
- execute the circuit a fixed number of times (shots),
- retrieve the result,
- and count the frequencies of measuring the qubit as either 0 or 1.
- Finally, we print the counts (as shown in the sketch below).
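Putting these steps together, here is a minimal end-to-end sketch. The AerSimulator backend, the example input, and the shot count are my own choices; any Qiskit-compatible backend works:

```python
import numpy as np
from qiskit import transpile
from qiskit_aer import AerSimulator  # assumes the qiskit-aer package is installed

x = 1.2                              # an arbitrary example input
angle = phi(x, w=3.0, b=0.1)         # feature map: classical data -> phase angle

qc, theta = double_slit()                      # circuit template and placeholder
bound = qc.assign_parameters({theta: angle})   # bind the concrete phase

backend = AerSimulator()             # local simulator backend
shots = 1000                         # number of repetitions (arbitrary choice)
job = backend.run(transpile(bound, backend), shots=shots)
counts = job.result().get_counts()
print(counts)

# Theoretical reference values for comparison.
print("P(0) =", np.cos(angle / 2) ** 2, " P(1) =", np.sin(angle / 2) ** 2)
```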
The script prints the measurement counts for both outcomes. As we can see, our empirical results are close to the calculated theoretical values.
Rather than trying all the different values for x by hand, we can sweep the whole input range: with for x in np.linspace(0, 2*np.pi, 50), we get 50 evenly spaced points between 0 and 2π.
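A sketch of such a sweep, reusing phi, double_slit, backend, and shots from the previous listing (matplotlib is assumed for plotting):

```python
import numpy as np
import matplotlib.pyplot as plt
from qiskit import transpile

xs = np.linspace(0, 2 * np.pi, 50)
empirical, theoretical = [], []
for x in xs:
    angle = phi(x, w=3.0, b=0.1)
    bound = qc.assign_parameters({theta: angle})
    counts = backend.run(transpile(bound, backend), shots=shots).result().get_counts()
    empirical.append(counts.get("0", 0) / shots)   # measured P(0)
    theoretical.append(np.cos(angle / 2) ** 2)     # exact P(0)

plt.scatter(xs, empirical, s=12, label="empirical P(0)")
plt.plot(xs, theoretical, label=r"$\cos^2(\varphi(x)/2)$")
plt.xlabel("x")
plt.ylabel("P(0)")
plt.legend()
plt.show()
```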
The following figure shows the plotted empirical results alongside the calculated theoretical values.

The output of the double-slit circuit is not a linear threshold but a smooth, periodic curve. The measurement probability oscillates (here three times) as a cosine squared of the input phase. That periodicity is the direct result of interference between the two computational paths.

To mimic this behavior in a classical model, you would need to generate the same sinusoidal dependence on the input. That is why, when comparing to a feedforward neural network, the relevant question becomes: how well can it approximate functions like $\cos^2(\varphi(x)/2)$? A ReLU network can only generate piecewise linear functions, so approximating a smooth periodic curve of frequency $w$ requires a number of hidden units that grows with both the frequency and the target accuracy. The quantum circuit does it with constant depth and a single parameterized gate.

The circuit in this example is minimal. But it impressively shows how interference patterns encode input-dependent phases. It is the most direct way to see how quantum circuits use superposition and interference to convert phase information into a nonlinear classification signal.

Quantum models do not strive for general universality. They target specific structures that are costly for classical models, such as periodic decision boundaries. Quantum circuits are therefore precise tools: valuable not everywhere, but precisely where classical methods would need oversized scope and depth.
The big challenge in quantum machine learning lies not in pure expressiveness, but in modeling. A parameterized quantum circuit has exponential capacity, but this capacity is useless if it is not shaped by the structure of the learning task. Just as with classical methods, the effectiveness of a quantum kernel or variational circuit depends on the problem domain. A useful model must exploit quantum correlations in a way that is tailored to the specific structure of the task.

Progress is achieved through architectures and training methods tailored to real-world problems, not by striving for ever larger circuits. The advantage of quantum models will only be realized if they are developed to solve specific learning tasks, not merely to demonstrate theoretical capacity.