  • Original article
  • Open access

Quantum cluster algorithm for data classification


We present a quantum algorithm for data classification based on the nearest-neighbor learning algorithm. The classification algorithm is divided into two steps. First, data in the same class are divided into smaller groups with sublabels, which help build boundaries between data with different labels. Second, we construct a quantum circuit for classification that contains multi-control gates. The algorithm is easy to implement and efficient in predicting the labels of test data. To illustrate the power and efficiency of this approach, we construct the phase diagram for the metal-insulator transition of VO2 using limited experimental training data. VO2 is a typical strongly correlated electron material, and its metal-insulator phase transition has drawn much attention in condensed matter physics. Moreover, we demonstrate our algorithm on the classification of randomly generated data and on the classification of entanglement for various Werner states, where the training sets cannot be divided by a single curve; instead, more than one curve is required to separate them perfectly. Our preliminary results show considerable potential for various classification problems, particularly for constructing different phases in materials.


Machine learning techniques have demonstrated remarkable success in numerous topics in science and engineering, including artificial intelligence (Mitchell et al. 1990; Duda et al. 1973), molecular dynamics (Botu and Ramprasad 2015), light harvesting systems (Häse et al. 2017), molecular electronic properties (Montavon et al. 2013), surface reaction networks (Ulissi et al. 2017), density functional models (Brockherde et al. 2017), phase classification, and quantum simulations (Wang 2016; Carrasquilla and Melko 2017; Broecker et al. 2017; Ch’Ng et al. 2017; Van Nieuwenburg et al. 2017; Arsenault et al. 2014; Kusne et al. 2014). In addition, modern machine learning techniques have been applied to the state space of complex condensed-matter systems for their ability to analyze exponentially large data sets (Carrasquilla and Melko 2017), to speed up searches for novel energy generation/storage materials (De Luna et al. 2017; Wei et al. 2016), and to classify entanglement (Gao et al. 2018). With the rapid development of quantum computers (Leibfried et al. 2003; Debnath et al. 2016; Karra et al. 2016; Arute et al. 2019; Zhong et al. 2020), recognizing patterns using quantum computers has become a new frontier. Considering recent advancements in both quantum computing and machine learning, the combination of the two techniques, quantum machine learning, is expected to be a promising application of quantum computers in the near future. Many quantum machine learning algorithms have been proposed in the past few years (Rebentrost et al. 2018; Cao et al. 2016; Biamonte et al. 2017; Roy et al. 2021; Dixit et al. 2021). Moreover, researchers have succeeded in applying quantum machine learning algorithms to various systems such as superconducting circuits (Havlíček et al. 2019) and photonic systems (Cai et al. 2015), which has led to enormous enthusiasm for applying quantum algorithms in various areas (Xia and Kais 2018; Hu et al. 2020; 2020; Li et al. 2021; Xia et al. 2017; Sajjan et al. 2021; Xia et al. 2021).

There is no doubt that we are now in the age of big data, and there is an urgent need to develop game-changing quantum algorithms that perform machine learning tasks on large-scale scientific datasets for various industrial and technological applications based on optimization. As a proof of concept, Du and coworkers have successfully distinguished the handwritten digits ‘6’ and ‘9’ with the quantum support vector machine (Li et al. 2015). However, it could be difficult to deal with more challenging problems, especially when the training data cannot be divided by a single curve and more than one curve, or even enclosed curves, might be required to separate them. Another remarkable development is the application of quantum machine learning to variational circuits (Schuld and Killoran 2019; Arrazola et al. 2019), which in theory can always classify data with a complex distribution. Yet these algorithms generally rely heavily on a gradient-based systematic optimization of parameters (Mitarai et al. 2018; Farhi and Neven 2018). On the other hand, the quantum nearest-neighbor algorithm (Wiebe et al. 2014) offers another option to classify data without the gradient-based optimization process. In brief, the core of the nearest-neighbor classification algorithm is to assign the training vectors to classes such that the vectors within each class are close to each other.

In this study, we propose a quantum classification algorithm with which we can build a quantum circuit that is able to classify artificially generated data, and all parameters in the circuit can be obtained without relying on a gradient-based optimization process. For this purpose, we introduce ‘sublabels’ to assist in classifying data with an intricate distribution, where a ‘sublabel’ (also called a ‘subclass’) is a minor label subordinate to the main one. There are two main tasks in our algorithm: how to find the appropriate sublabels, and how to build the quantum classification circuit with these sublabels. With numerical simulations we demonstrate the application to various classification problems, especially the construction of different phases of materials. First, in the “Algorithm design” section, we present the basic elements of the algorithm. Then, we apply the algorithm to several systems: classification of metallic and insulating phases in the phase diagram of VO2, classification of entanglement in Werner states, and classification of randomly generated data. Finally, we present a scaling analysis and discuss the generalization of the quantum classification algorithm to higher-dimensional spaces. In addition, we present in the supplementary materials all the details of the quantum classification algorithm with examples.

Algorithm design

Consider the training data set {xi,yi}, where xi is a vector in \(\mathbb {R}^{d}\), d is the dimension, and yi represents the label with possible values \(\{l_{1}, l_{2}, \dots l_{M}\}\). Our goal is to build a quantum classification circuit that can be used to predict the label for new vectors {xt}. The classification algorithm is divided into two steps. The first step is a learning process, in which one needs to find the “sublabels” for each class of the training data. Then, based on the information obtained in this learning process, we construct a quantum classification circuit that contains multi-control quantum gates.

In the learning process, we first apply Lloyd’s algorithm (MacKay and Mac Kay 2003) for unsupervised machine learning, which assigns training vectors to the same class as the closest mean vector. However, the results derived by Lloyd’s algorithm cannot be used directly, as there might be redundant sublabels or not enough sublabels to reconstruct the initial distribution. To address this issue, in addition to the clustering algorithm, we propose two adjusting algorithms: one to reduce excess sublabels and the other to make sure that there is no overlap between sublabels.

For each sublabel, we need to store the information and build a quantum circuit to estimate the inner product, as shown in Fig. 1. When the xi are vectors in two-dimensional space, each data point (or sublabel) can be represented by a single qubit. An arbitrary state of a single qubit can be written as

$$ |\psi\rangle = e^{i\alpha}\left[\cos(\theta/2)|0\rangle + \sin(\theta/2)e^{-i\phi/2}|1\rangle \right] $$
Fig. 1. Sketch of the quantum circuit for estimating the inner product. The circuit that estimates the inner product of two-dimensional vectors contains six rotational gates. Additionally, a memory is needed to store the information of each group, which can be represented by θm, ϕm, and an integer N giving the total number of data points in the group

where α is the global phase and the vector xi=(x1,i,x2,i) is mapped as

$$\begin{array}{*{20}l} \theta_{i} = \frac{2\pi(x_{1,i}-\min\{x_{1}\})}{\max\{x_{1}\}-\min\{x_{1}\}}\\ \phi_{i} = \frac{2\pi (x_{2,i}-\min\{x_{2}\})}{\max\{x_{2}\}-\min\{x_{2}\}} \end{array} $$

Here max{x1} and max{x2} represent the maximum values of all x1,i and x2,i respectively, while min{x1} and min{x2} represent the corresponding minimum values. Next we need a measure of the distance between two states, where the ‘distance’ might be the Euclidean distance between the two vectors (Wiebe et al. 2014) or the inner product of their two corresponding quantum states. Here, we choose to calculate the inner product, as calculating the Euclidean distance consumes more time and resources.
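The min-max mapping above can be sketched in a few lines of NumPy. This is an illustrative sketch only: the function name `to_angles`, its `span` parameter, and the sample points are assumptions made here and do not come from the article or its supplementary materials.

```python
import numpy as np

def to_angles(X, lo=None, hi=None, span=2 * np.pi):
    """Min-max scale each column of X (shape (n, 2)) to angles in [0, span].

    lo/hi default to the column-wise minimum and maximum of X; pass the
    training-set values explicitly when mapping test vectors.
    """
    X = np.asarray(X, dtype=float)
    lo = X.min(axis=0) if lo is None else np.asarray(lo, dtype=float)
    hi = X.max(axis=0) if hi is None else np.asarray(hi, dtype=float)
    return span * (X - lo) / (hi - lo)

# Hypothetical two-dimensional points, made up for illustration.
X = np.array([[0.0, 20.0], [10.0, 60.0], [20.0, 120.0]])
angles = to_angles(X)  # column 0 holds theta_i, column 1 holds phi_i
```

Passing `lo` and `hi` explicitly keeps test vectors on the same scale as the training set, which is one reasonable reading of "all x1,i and x2,i" in the text.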

An arbitrary state of a single qubit, as shown in Fig. 1, can be prepared by three rotational gates:

$$ |\psi(\theta_{1}, \phi_{1})\rangle = R_{z}(\phi_{1}/2)R_{y}(\theta_{1})R_{z}(-\phi_{1}/2)\ |0\rangle $$

Thus, the inner product between |ψ(θ1,ϕ1)〉 and |ψ(θ2,ϕ2)〉 is given by:

$$ \langle\psi(\theta_{1}, \phi_{1})|\psi(\theta_{2}, \phi_{2})\rangle =\langle 0|R_{z}(\phi_{1}/2)R_{y}(-\theta_{1})R_{z}(-\phi_{1}/2)R_{z}(\phi_{2}/2)R_{y}(\theta_{2})R_{z}(-\phi_{2}/2)|0\rangle. $$

The circuit that estimates the inner product contains six rotational gates, as shown in Fig. 1. After a measurement of the final state in the Z-basis, the probability of obtaining the state |0〉 is the squared magnitude of the inner product, |〈ψ(θ1,ϕ1)|ψ(θ2,ϕ2)〉|2, and therefore serves as an estimate of the overlap between the two states. For simplicity, in the following sections, we will write

$$R(\theta, \phi)=R_{z}(+\phi/2)R_{y}(\theta)R_{z}(-\phi/2) $$

Moreover, for every sublabel we need to store two floating-point numbers θm and ϕm that represent the centroid vector of the subgroup, and an integer N representing the total number of data points in the subgroup.
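As a numerical check of the six-gate expression, the sketch below multiplies the rotation matrices directly. It assumes the common convention R_z(a) = exp(-i a Z/2) and R_y(a) = exp(-i a Y/2); the article does not state its convention, and with the opposite sign for R_z only the sign of the relative phase changes, not the measured probability. All function names are chosen here for illustration.

```python
import numpy as np

def rz(a):
    # Assumed convention: R_z(a) = exp(-i a Z / 2)
    return np.array([[np.exp(-1j * a / 2), 0], [0, np.exp(1j * a / 2)]])

def ry(a):
    # Assumed convention: R_y(a) = exp(-i a Y / 2)
    c, s = np.cos(a / 2), np.sin(a / 2)
    return np.array([[c, -s], [s, c]], dtype=complex)

def R(theta, phi):
    # R(theta, phi) = R_z(phi/2) R_y(theta) R_z(-phi/2)
    return rz(phi / 2) @ ry(theta) @ rz(-phi / 2)

def overlap(theta1, phi1, theta2, phi2):
    """<psi(theta1, phi1)|psi(theta2, phi2)> written as the six-gate product."""
    zero = np.array([1, 0], dtype=complex)
    return (zero @ rz(phi1 / 2) @ ry(-theta1) @ rz(-phi1 / 2)
            @ rz(phi2 / 2) @ ry(theta2) @ rz(-phi2 / 2) @ zero)

def p_zero(theta1, phi1, theta2, phi2):
    """Probability of measuring |0> after the six gates."""
    return abs(overlap(theta1, phi1, theta2, phi2)) ** 2
```

On hardware the probability would be estimated from repeated measurements; here it is computed exactly from the amplitudes.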

In the learning process, three basic algorithms are applied to obtain “sublabels” from the given training data. Algorithm (S1) is designed for an initial clustering of the training data. In designing Algorithm (S1), we follow Lloyd’s algorithm (MacKay and Mac Kay 2003), in which each vector is assigned to the cluster with the closest mean and the centroids of the new clusters are then recalculated. Algorithm (S1) divides the training data with the same prior label into several subgroups. Algorithm (S2) reduces redundancy, and Algorithm (S3) makes sure that there is no overlap between any two remaining sublabels that belong to different prior labels. The goal is to keep only a minimal set of sublabels without losing important information about the training data. After applying Algorithm (S1) and repeating Algorithm (S2) and Algorithm (S3) a number of times, we obtain a minimal set of sublabels together with the centroid vector of each subgroup (details of these three algorithms are in the supplementary materials).
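The following is a rough, Lloyd-style sketch of an initial clustering step for vectors that share one prior label. It is not the authors' Algorithm (S1), whose details are in the supplementary materials. The threshold rule, the use of the squared overlap as the closeness measure, the plain averaging of angles to form centroids, and the names `fidelity` and `cluster` are all assumptions made here.

```python
import numpy as np

def fidelity(a, b):
    """|<psi(a)|psi(b)>|^2 for angle pairs a = (theta, phi), b = (theta, phi)."""
    (t1, p1), (t2, p2) = a, b
    amp = (np.cos(t1 / 2) * np.cos(t2 / 2)
           + np.exp(1j * (p2 - p1) / 2) * np.sin(t1 / 2) * np.sin(t2 / 2))
    return abs(amp) ** 2

def cluster(angles, D=0.99, sweeps=5):
    """Group angle pairs so that members stay close to their centroid.

    Returns (centroids, counts): one (theta_m, phi_m) row and one member
    count N per subgroup, i.e. the data stored for each sublabel.
    """
    angles = np.asarray(angles, dtype=float)
    centroids = [angles[0].copy()]
    for _ in range(sweeps):
        members = [[] for _ in centroids]
        for v in angles:
            f = [fidelity(c, v) for c in centroids]
            k = int(np.argmax(f))
            if f[k] < D:                 # too far from every centroid
                centroids.append(v.copy())
                members.append([])
                k = len(centroids) - 1   # open a new subgroup
            members[k].append(v)
        keep = [m for m in members if m]           # drop empty subgroups
        centroids = [np.mean(m, axis=0) for m in keep]
    counts = [len(m) for m in keep]
    return np.array(centroids), counts

# Hypothetical angle pairs forming two tight groups.
pts = np.array([[0.10, 0.10], [0.12, 0.10], [3.00, 3.00], [3.02, 3.00]])
cents, counts = cluster(pts, D=0.99)
```

A redundancy-reduction or overlap-removal pass in the spirit of Algorithms (S2) and (S3) would operate on the returned centroids and is not attempted here.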

Now, the next step is to build the quantum classification circuit based on the previously obtained information. Consider the following sublabel-control operations,

$$\begin{array}{@{}rcl@{}} \left.\begin{aligned} &\ \mathbf{IF} \ SUB LABEL = l_{j}\\ &\ \qquad\mathbf{DO} \quad ROTATION \ R\left(-\theta^{l_{j}}_{m}, -\phi^{l_{j}}_{m}\right)\\ &\ \mathbf{IF} \ SUB LABEL = l_{j+1}\\ &\ \qquad\mathbf{DO} \quad ROTATION \ R\left(-\theta^{l_{j+1}}_{m}, -\phi^{l_{j+1}}_{m}\right)\\ &\ \cdots\\ &\ \mathbf{FOR}\ SUB LABEL \ in\ PRIOR LABEL \ L_{i}\\ &\qquad\mathbf{DO} \quad OPERATION \ U(L_{i}) \end{aligned}\right. \end{array} $$

where the operations U(Li) are chosen with the aim of reaching the final state \(|\Psi ^{f}_{L_{i}}\rangle \). If one wants to bring all vectors belonging to prior label Li close to the final state \(|\Psi ^{f}_{L_{i}}\rangle \), then U(Li) can be chosen to satisfy

$$ |\Psi^{f}_{L_{i}}\rangle = U(L_{i})|0\rangle $$

To classify a test vector with an unknown label, we prefer to rely on a single quantum classification circuit instead of repeatedly comparing inner products with the training data. The basic and intuitive idea is to measure the inner product between the new vector and the centroid vectors of all subgroups. The classification circuit consists of two parts: the control qubits representing the sublabels, and the remaining qubits representing the given new vector. Figure 2 is a sketch showing the structure of the main circuit. First, one maps the test data xt into the prepared circuit as

$$ \mathbf{x}_{\mathbf{t}}\rightarrow|\Psi(\mathbf{x}_{\mathbf{t}})\rangle=\frac{1}{\sqrt{n}}\left[\sum_{i}^{n}|l_{i}\rangle\right]\bigotimes|\psi(\theta,\phi)\rangle $$
Fig. 2. Sketch of the main circuit. Qubits in the main circuit can be divided into two groups: the L-qubits, which represent the sublabels and play the role of control bits, and the V-qubits, which represent a given vector. Initially, the L-qubits are prepared in the state \((\frac {|0\rangle +|1\rangle }{\sqrt {2}})^{\otimes N}\), where N qubits represent the sublabels. The minimum is \(N=\lceil \log _{2} n\rceil \), where n is the total number of sublabels. In total, n controlled rotation gates are needed

where |li〉 are the orthogonal eigenstates for the control qubits, representing sub labels li, and there are n sublabels in total. In the mapping process, we need to apply Hadamard gates for L-qubits (representing the sublabels), and apply operator T(xt) on the V-qubits (representing the test data), where |ψ(θ,ϕ)〉=T(xt)|0〉. The quantum classification circuit can be described as:

$$ U = \sum_{i}^{n}\left[|l_{i}\rangle\langle l_{i}|\bigotimes U(L_{i}) R(-\theta^{l_{i}}_{m}, -\phi^{l_{i}}_{m})\right] + \sum_{j=n+1}^{2^{N}-1}\left[|j\rangle\langle j|\bigotimes I\right] $$

where N qubits are used to represent the sublabels and there are n sublabels in total. Note that for 0≤k<n,

$$ \begin{aligned} \langle l_{k}, \Psi^{f}_{L_{k}}| U|\Psi(\mathbf{x})\rangle & = \frac{1}{\sqrt{n}}\langle l_{k}, \Psi^{f}_{L_{k}}|\sum_{i}^{n} \left[|l_{i}\rangle\bigotimes U(L_{i}) R(-\theta^{l_{i}}_{m}, -\phi^{l_{i}}_{m})|\psi(\theta,\phi)\rangle\right]\\ &\quad+ \langle l_{k}, \Psi^{f}_{L_{k}}|\sum_{j=n+1}^{2^{N}-1}\left[|j\rangle\langle j|\bigotimes I\right]|\Psi(\mathbf{x})\rangle\\ & = \frac{1}{\sqrt{n}}\sum_{i}^{n}\left[\delta_{{ik}} \langle\Psi^{f}_{L_{i}}|U(L_{i}) R(-\theta^{l_{i}}_{m}, -\phi^{l_{i}}_{m})|\psi(\theta,\phi)\rangle\right]\\ & = \frac{1}{\sqrt{n}}\langle\psi(-\theta^{l_{k}}_{m}, -\phi^{l_{k}}_{m})|\psi(\theta,\phi)\rangle \end{aligned} $$

By applying a measurement with eigenstate \(|l_{k}\rangle |\Psi ^{f}_{L_{k}}\rangle \), the inner product \(\langle \psi (-\theta ^{l_{k}}_{m}, -\phi ^{l_{k}}_{m})|\psi (\theta,\phi)\rangle \) can be estimated. In the sketch of the main circuit in Fig. 2, T(xt) represents the given test data, l is the sublabel, and L is the prior label.
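A small state-vector emulation can make the structure of U concrete. The sketch below is written under several assumptions that are not taken from the article: a single data qubit, U(L) set to the identity so that every final state is |0〉, the controlled rotation taken to be the inverse of the centroid preparation (so that the amplitude on |0〉 equals the overlap with the centroid), Hadamard-prepared label qubits with normalization 1/√(2^N), and hypothetical names `prep` and `classify`.

```python
import numpy as np

def prep(theta, phi):
    """Unitary T with T|0> = cos(theta/2)|0> + e^{i phi/2} sin(theta/2)|1>."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    e = np.exp(1j * phi / 2)
    return np.array([[c, -s * np.conj(e)], [s * e, c]])

def classify(test_angles, centroids, prior_labels):
    """Return (predicted prior label, probability summed per prior label)."""
    n = len(centroids)
    N = max(1, (n - 1).bit_length())        # smallest N with 2**N >= n
    dim = 2 ** N
    # Block-diagonal U: block i undoes centroid i, unused label states get I.
    U = np.eye(2 * dim, dtype=complex)
    for i, (th, ph) in enumerate(centroids):
        U[2 * i:2 * i + 2, 2 * i:2 * i + 2] = prep(th, ph).conj().T
    label_reg = np.full(dim, 1 / np.sqrt(dim), dtype=complex)  # Hadamards on |0...0>
    data = prep(*test_angles)[:, 0]                            # T(x_t)|0>
    state = U @ np.kron(label_reg, data)
    probs = {}
    for i, L in enumerate(prior_labels):
        # state[2*i] is the amplitude of |l_i>|0>, i.e. overlap / sqrt(dim)
        probs[L] = probs.get(L, 0.0) + abs(state[2 * i]) ** 2
    return max(probs, key=probs.get), probs

# Hypothetical centroids (theta_m, phi_m) with their prior labels.
centroids = [(0.2, 0.3), (2.8, 1.0), (3.0, 2.0)]
labels = ["A", "B", "B"]
pred_a, probs_a = classify((0.25, 0.3), centroids, labels)
pred_b, probs_b = classify((2.9, 1.5), centroids, labels)
```

Summing the joint probabilities over the sublabels of each prior label mirrors the comparison of measurement probabilities described in the text; a real device would estimate these probabilities from repeated shots.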

When predicting the label for a new test vector, we assume that every sublabel is equally probable. Based on this assumption, we apply N Hadamard gates to the label qubits, preparing them in a uniform superposition state. In fact, one can always adjust the probabilities of the sublabels based on the training result and keep only the states representing ‘valid’ sublabels. Assume that we finally derive K sublabels and that N qubits are used as label qubits, where \(2^{N-1}<K\le 2^{N}\). For a training data set with two labels, \(\{\mathbf {x}_{i}, y_{i}\}\) with \(y_{i}\in \{0,1\}\), K0 sublabels are assigned label y=0 and the other K1 sublabels are assigned label y=1, with K0+K1=K. Then the Hadamard gates \(H^{\otimes N}\) can be replaced by a suitable operation that prepares the label qubits in the state

$$ |\phi_{L}\rangle= \sum_{n=0}^{K_{0}-1}\sqrt{p_{n}}|n\rangle + \sum_{m=0}^{K_{1}-1}\sqrt{p_{m+K_{0}}}|m+2^{N-1}\rangle $$

where pn≥0 and \(\sum _{n=0}^{K-1}p_{n}=1\). By construction, for sublabels assigned label y=0 the first label qubit is in state |0〉, and for the others the first qubit is in state |1〉. For a new test vector, one only needs to measure the first label qubit and the data qubits. By comparing \(P(q_{1}=|y\rangle, V=|\Psi ^{f}_{y}\rangle)\) for \(y\in \{0, 1\}\), the new test vector is assigned the label corresponding to the larger P, where \(P(q_{1}=|y\rangle, V=|\Psi ^{f}_{y}\rangle)\) represents the probability of finding the first label qubit q1 in state |y〉 and simultaneously finding the data qubits in state \(|\Psi ^{f}_{y}\rangle \). The pn are chosen to maximize

$$ \sum_{i}\left\{ P(q_{1}=|y_{i}\rangle, V=|\Psi^{f}_{y_{i}}\rangle) -\lambda P(q_{1}\neq|y_{i}\rangle, V=|\Psi^{f}_{y_{i}}\rangle) \right\} $$

where λ≥0 is the penalty.


Classification of metallic-insulating phases of vanadium dioxide

Strongly correlated electron materials and their phase transitions have attracted great interest both for device applications and in condensed matter physics (Dagotto 2005; Arko et al. 1997). Recently, the metal-insulator phase transition of vanadium dioxide (VO2), a prototype of strongly correlated electronic materials, has attracted experimental and theoretical attention for its distinct structures and electronic properties (Qazilbash et al. 2007; Jeong et al. 2013).

Here, we apply the developed quantum classification algorithm to distinguish the metallic state from the insulating state of VO2. The data used in this section are based on experimental results reported in Ref. (Chen et al. 2017). VO2 exhibits several distinct structures under different temperatures and pressures. As shown in Fig. 3a, the red dots represent the metallic state, the blue dots represent the insulating state, and the black solid line represents the phase transition line. Note that our training data were chosen far from the phase transition line in order to test the classification power of the designed quantum algorithm. Figure 3b shows the initial clustering results after applying Algorithm (S1) once. In Algorithm (S1) one should determine the parameter D ∈ (0,1], which is the minimum inner product between the centroid vector and any vector with the same sublabel. In other words, a vector can be assigned to a certain sublabel when the inner product between itself and the centroid vector of the sublabel is larger than D. Here we set D=0.99. Next, we repeat Algorithm (S2) and Algorithm (S3) several times to reduce the excess sublabels. After repeating Algorithm (S2) and Algorithm (S3) three times, the number of classes cannot be reduced further, as shown in Fig. 3c.

Fig. 3. Classification of metallic and insulating states of VO2. a Initial data used for classification. Red dots represent the metallic state, and blue dots represent the insulating state. The phase transition line is indicated by the black solid curve. b Subgroups formed after applying Algorithm (S1) once. In (b) and (c), blue or red spheres represent data with the same sublabel, where the center of a sphere represents the average vector and the radius represents the number of vectors belonging to this sublabel. c Results after repeating Algorithms (S2) and (S3) three times. d Prediction for new data. New vectors in the blue region are recognized as ‘insulating’, and new vectors in the yellow region are predicted as ‘metallic’. Blue and red dots are the initial data

In both Fig. 3b and c, blue and red spheres represent data with the same sublabel, where the center of a sphere represents the average vector and the radius represents the number of vectors belonging to that sublabel. In Fig. 3d, we show the prediction results for arbitrary given data. States in the yellow regions are predicted to be metallic, and those in the blue regions are predicted to be insulating. In the training process, we used 100 data vectors for the metallic states and 100 data vectors for the insulating states, out of a total of 1100 data vectors.

Simulation results show that our quantum algorithm can efficiently classify metallic and insulating states of VO2. In the end, we need 7 sublabels for the insulating states and 8 sublabels for the metallic states, which means that the classification circuit consists of 5 qubits (4 control qubits for the sublabels and 1 for the data).
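The qubit count follows from the number of sublabels: the label register needs the smallest N with 2^N at least equal to the number of sublabels, plus the data qubits. A minimal helper (the name `circuit_qubits` is chosen here for illustration) reproduces the counts quoted for the examples in this article:

```python
def circuit_qubits(n_sublabels, n_data_qubits=1):
    """Return (label qubits, total qubits) for a given number of sublabels."""
    # Smallest N such that 2**N >= n_sublabels, using exact integer arithmetic.
    n_label = max(1, (n_sublabels - 1).bit_length())
    return n_label, n_label + n_data_qubits

vo2 = circuit_qubits(7 + 8)            # 15 sublabels, one data qubit
random_data = circuit_qubits(22 + 32)  # 54 sublabels, one data qubit
werner_a = circuit_qubits(32, n_data_qubits=2)
werner_b = circuit_qubits(64, n_data_qubits=2)
```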

It is important to note that the prediction can be incorrect when the pressure (P) or temperature (T) is small. The error appears because few vectors in these areas were used in the training process. Moreover, when we convert a vector (P,T) into a quantum state, we map it to angles θ,ϕ ∈ [0,2π). Although T=0 °C and T=120 °C are very different classically, after conversion to angles θ=0 and θ=2π are nearly the same. One might notice that there is a slim yellow line around P=20 GPa in our prediction. It is reported in Chen et al. (2017) that there is a structural transition between the state M1 and M1. However, this is not a metal-insulator transition. We did not expect our classification algorithm to predict this transition, and the slim line shows up “coincidentally”. To get better prediction results, one option is to map both training data and test data into [0,π]×[0,π] instead of [0,2π]×[0,2π]. However, here we do not focus on the performance at low temperature, nor on the phase transition around P=20 GPa. As a result, mapping data into [0,2π]×[0,2π] is still an acceptable choice. More simulation results for the phase transition with different training data are offered in the supplementary materials.
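The wrap-around effect can be checked directly on the single-qubit state. The sketch below assumes the state form cos(θ/2)|0〉 + e^{iϕ/2} sin(θ/2)|1〉 (the sign of the relative phase is a convention and plays no role here because ϕ is set to zero); the names `state` and `fidelity` are hypothetical.

```python
import numpy as np

def state(theta, phi):
    # Assumed single-qubit encoding of the angle pair (theta, phi).
    return np.array([np.cos(theta / 2), np.exp(1j * phi / 2) * np.sin(theta / 2)])

def fidelity(a, b):
    return abs(np.vdot(state(*a), state(*b))) ** 2

# End points of the range [0, 2*pi]: the two states differ only by a sign.
full_range = fidelity((0.0, 0.0), (2 * np.pi, 0.0))
# End points of the range [0, pi]: the two states are orthogonal.
half_range = fidelity((0.0, 0.0), (np.pi, 0.0))
```

With the full range the two extremes of the original variable become indistinguishable, while with the half range they are as far apart as two qubit states can be, which is consistent with the suggestion to map into [0,π]×[0,π].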

Classification of randomly generated data

Here we show another classification example, in which the training data are generated randomly. Unlike the VO2 example, the two groups here cannot be divided by a single simple boundary. We generated 1100 red points and 1100 blue points at random, from which 100 red points and 100 blue points were picked randomly as training data; the others are used as test data. All test points are shown in Fig. 4a, and the training points in Fig. 4b. The two isolated groups of red points in Fig. 4a are surrounded by blue points that cover the whole area, as shown in Fig. 4b. This distribution makes it more challenging to classify the test data. After an appropriate learning process, 54 sublabels (22 for red and 32 for blue) are obtained from the training data. Finally, a classification circuit can be built with 7 qubits, 6 as control qubits and 1 for the given data. The predicted labels of the test data are shown in Fig. 4c, where light blue dots are test points predicted as ‘blue’, yellow dots are test points predicted as ‘red’, and red and blue crosses represent the training data. In total, 878 red test points and 836 blue test points are classified correctly, so the matching rates for red and blue points are estimated as 87.8% and 83.6% respectively (the training data are excluded when calculating the matching rate).

Fig. 4. Classification of randomly generated data. a Test data. In total, we generated 1100 blue points and 1100 red points with the same generating function. 100 red points and 100 blue points are chosen randomly as training data, and the remaining points are used as test data. b Training data, including 100 red points and 100 blue points. c Prediction for the labels of test data. Light blue points are test data predicted as ‘blue’, and yellow points are predicted as ‘red’. Red and blue crosses represent training data

Entanglement classification in Werner states

Further, our method can also be applied to entanglement classification. Consider the following scenario: Alice wants to send a message to Bob, in which she sends entangled photon pairs as digit 1 and pairs in an untangled state as digit 0. Initially, Alice sends some photon pairs in various states for training, informing Bob which pairs represent 1 and which represent 0. Later she uses photon-pair states for communication. The widely used CHSH inequality (Clauser et al. 1969) can detect entanglement, since its violation guarantees the existence of entanglement; however, no conclusion can be drawn if the inequality is not violated. To address this issue, consider Werner states in the density matrix form:

$$ \rho_{W}(p,\phi)=p|\Psi_{B}(\phi)\rangle\langle\Psi_{B}(\phi)|+\frac{1-p}{4}I $$

where I is the 4×4 identity matrix, the parameter 0<p<1, and |ΨB(ϕ)〉 is the Bell state given by:

$$ |\Psi_{B}(\phi)\rangle=\frac{1}{\sqrt{2}}\left(|\uparrow\downarrow\rangle+e^{i\phi}|\downarrow\uparrow\rangle\right) $$

Assume that Alice uses Werner states to transport information, while Bob carries out the Bell test experiment with the following measurements: \(Z, X; \frac {Z+X}{\sqrt {2}}\), and \( \frac {Z-X}{\sqrt {2}}\). From the measurement results, Bob calculates four correlation functions \(E(Z, \frac {Z+X}{\sqrt {2}}), E(X, \frac {Z+X}{\sqrt {2}}), E(Z, \frac {Z-X}{\sqrt {2}}), E(X, \frac {Z-X}{\sqrt {2}})\) where

$$ E(a,b)=\frac{N_{++}+N_{--}-N_{+-}-N_{-+}}{N_{++}+N_{--}+N_{+-}+N_{-+}} $$

Here N++ is the number of photon pairs whose measurement results are both +1 in the two channels, and the other counts are defined analogously. If Alice sets ϕ=0,π, Bob will observe a violation of the CHSH inequality for \(p>\frac {1}{\sqrt {2}}\). If Alice sets \(\phi =\pm \frac {\pi }{2}\), Bob can never observe a violation of the CHSH inequality. However, ρW(p,ϕ) is an entangled state whenever \(p>\frac {1}{3}\). Consequently, the CHSH inequality is not a good classifier for Bob. Instead, if Bob sets up a machine learning model based on a neural network, he will be able to ‘decode’ Alice’s information with a much higher match rate (Gao et al. 2018). Here, we show that our quantum classification algorithm can classify entanglement in such Werner states. We take as input the four-dimensional vector \(E(Z, \frac {Z+X}{\sqrt {2}}), E(X, \frac {Z+X}{\sqrt {2}}), E(Z, \frac {Z-X}{\sqrt {2}})\), and \(E(X, \frac {Z-X}{\sqrt {2}})\), calculated from the measurement results.
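The thresholds quoted above can be checked numerically. The sketch below assumes the trace-one Werner form p|ΨB〉〈ΨB| + (1−p)I/4, computes the four correlation functions as exact expectation values rather than from photon counts, evaluates two sign patterns of the CHSH combination for these settings, and uses the positive-partial-transpose test for two qubits to label a state as entangled. The function names are chosen here and are not from the article.

```python
import numpy as np

Z = np.array([[1, 0], [0, -1]], dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)

def werner(p, phi):
    # Assumed normalized form: p |Psi_B><Psi_B| + (1 - p)/4 * I
    up, down = np.array([1, 0], dtype=complex), np.array([0, 1], dtype=complex)
    bell = (np.kron(up, down) + np.exp(1j * phi) * np.kron(down, up)) / np.sqrt(2)
    return p * np.outer(bell, bell.conj()) + (1 - p) / 4 * np.eye(4)

def E(rho, a, b):
    """Exact correlation function Tr[rho (a x b)]."""
    return float(np.real(np.trace(rho @ np.kron(a, b))))

def features(p, phi):
    """The four correlation functions used as the input vector."""
    rho = werner(p, phi)
    bp, bm = (Z + X) / np.sqrt(2), (Z - X) / np.sqrt(2)
    return np.array([E(rho, Z, bp), E(rho, X, bp), E(rho, Z, bm), E(rho, X, bm)])

def chsh(p, phi):
    e1, e2, e3, e4 = features(p, phi)
    # Two sign patterns of the CHSH combination; report the larger magnitude.
    return max(abs(e1 + e3 + e2 - e4), abs(e1 + e3 - e2 + e4))

def is_entangled(p, phi):
    """PPT test: a negative eigenvalue of the partial transpose means entangled."""
    rho = werner(p, phi)
    pt = rho.reshape(2, 2, 2, 2).transpose(0, 3, 2, 1).reshape(4, 4)
    return bool(np.linalg.eigvalsh(pt).min() < -1e-12)
```

For example, `chsh(0.8, 0.0)` exceeds 2 while `chsh(1.0, np.pi / 2)` does not, even though `is_entangled(0.4, np.pi / 2)` is already true; this is the gap that the learned classifier is meant to close.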

By changing ϕ and p in Eq. (9), we can generate different entangled or untangled states. We prepared 200 entangled states and 200 untangled states as the test data. Moreover, we generated several different training sets, in each of which half of the states are entangled and the other half are untangled. After learning from each training set, we build a quantum classification circuit to identify entanglement in the test data; the simulation results are shown in Fig. 5. In (a), the training set contains only 32 points, and we keep them all as sublabels, so there are 7 qubits in the classification circuit (5 for sublabels and 2 for test data). 12 points are predicted with the wrong label. In (b), the training set contains 64 points and all are kept as sublabels, so there are 8 qubits in the classification circuit. 8 points are predicted with the wrong label. In (c), there are 128 points in the training set, from which we derived 64 sublabels. As in (b), we need 8 qubits to build the classification circuit, and only 6 points are predicted with the wrong label. However, in (c) we do not plot the sublabels, as our classification algorithm only yields the parameters E for every sublabel rather than r,ϕ.

Fig. 5. Entanglement classification for Werner states. In the plots, the radius represents the parameter p, and the angle represents ϕ. Every single dot represents a Werner state. Yellow dots represent test data predicted as ‘untangled’, and light blue dots represent test data predicted as ‘entangled’. Crosses represent training data: red crosses for untangled states and blue crosses for entangled states. In all three panels we used the same 400 test data, half of which are entangled and the other half untangled. Half of the training data are entangled states and the other half are untangled states. a 32 training data are used, and all are kept as sublabels. b 64 training data are used, and all are kept as sublabels. c 128 training data are used, and only 64 sublabels are derived from them

In these figures the points (r,ϕ) represent Werner states. Notice that r,ϕ are not used in the learning or classification process, as Bob does not know exact r,ϕ either. In the supplementary materials we provide details of the simulation.

In the above discussion we assume that in communication between Alice and Bob, all measurement results are discrete, and one can easily calculate the parameters E. However, in chemical reactions the measurement results are often continuous, and one will get some special distributions after measurement. Recently, Zare and coworkers (Perreault et al. 2018) reported the rotationally inelastic collisions of deuterium hydride (HD, prepared at certain quantum states) with H2 and D2 under extremely low temperature (mean collision energy around 1 K), and they found that the orientation of HD molecules will lead to different distribution of scattering angle (Perreault et al. 2018). If the scattering experiment is applied as measurement to detect entanglement, then it would be nearly impossible to derive information about entanglement directly from the intricate raw data. Under such situations, some special methods, as we discussed in Li et al. (2021), are required. With the assistance of auxiliary functions it will be possible to obtain the parameters E from raw measurement results, after which by the same procedure we can build a quantum circuit for classification.

Further discussions

So far, our classification has been restricted to two- and four-dimensional vectors. Here, we discuss how to use it to classify vectors in higher-dimensional spaces. Depending on the structure of the qubits, two different mapping methods can be used.

Mapping method I: An arbitrary quantum state of N qubits can be described as

$$ |\Psi\rangle= \sum_{i=0}^{2^{N}-1}{c_{i}|i\rangle} $$

where ci can be rewritten as a function of \(\boldsymbol {\Theta }=(\theta _{1},\cdots,\theta _{2^{N}-1},\phi _{1},\cdots,\phi _{2^{N}-1})\):

$$\begin{array}{*{20}l} &c_{0} = \cos\theta_{1}\\ &c_{1} = e^{i\phi_{1}}\sin{\theta_{1}}\cos{\theta_{2}}\\ &\cdots\\ &c_{2^{N}-2} = e^{i\phi_{2^{N}-2}}\prod_{j=1}^{2^{N}-2}\sin{\theta_{j}}\cos{\theta_{2^{N}-1}}\\ &c_{2^{N}-1} = e^{i\phi_{2^{N}-1}}\prod_{j=1}^{2^{N}-1}\sin{\theta_{j}} \end{array} $$

and 0≤θj<π, 0≤ϕj≤2π. Then there exists a mapping operator T(Θ) with |Ψ(Θ)〉=T(Θ)|0〉, by which one can map a vector in (2^N−1)-dimensional space into a quantum state of N qubits. For this mapping method, the main structure of the circuit is still like the one shown in Fig. 2. If all qubits in the main circuit are connected to each other and arbitrary quantum gates can be built between any qubits in the main circuit, then mapping method I can be applied directly. However, the connectivity of the machine sometimes does not meet this demand, in which case the second mapping method might be more suitable.

Mapping method II: An arbitrary untangled quantum state of N qubits can be described as:

$$ \bigotimes_{j=1}^{N}{T_{j}(\mathbf{x}_{\mathbf{t}})|0\rangle} = \bigotimes_{j=1}^{N}{\left[\cos{\theta_{j}|0\rangle}+e^{i\phi_{j}}\sin{\theta_{j}}|1\rangle\right]} $$

Then we can map the vector \(\boldsymbol {\Theta }=(\theta _{1},\cdots,\theta _{N},\phi _{1},\cdots,\phi _{N})\) into the untangled quantum state. Method II requires a circuit in which the qubits representing the sublabels are connected to all the qubits representing our vector, yet connections between the ‘data qubits’ are not required. A sketch of the main circuit using method II can be found in Fig. 6, where for simplicity we write \(U_{{ij}}=U_{j}(L_{i}) R_{j}(-\theta ^{l_{i}}_{m}, -\phi ^{l_{i}}_{m})\)
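Both mappings can be written down as amplitude vectors and checked for normalization. The sketch below follows the pattern of the coefficients given for method I (each coefficient carries the running product of sines, a cosine of the next angle, and a phase, with the last coefficient carrying no cosine) and the product form given for method II. The function names are hypothetical, and the code builds the state vectors only, not the circuits T(Θ) that would prepare them.

```python
import numpy as np

def method_one_state(thetas, phis):
    """Amplitudes c_0 .. c_m for m angles theta_j and m phases phi_j (m = 2**N - 1)."""
    thetas, phis = np.asarray(thetas, float), np.asarray(phis, float)
    m = len(thetas)
    c = np.empty(m + 1, dtype=complex)
    sin_prod = 1.0
    for k in range(m):
        phase = 1.0 if k == 0 else np.exp(1j * phis[k - 1])  # c_0 has no phase
        c[k] = phase * sin_prod * np.cos(thetas[k])
        sin_prod *= np.sin(thetas[k])
    c[m] = np.exp(1j * phis[m - 1]) * sin_prod               # last term: no cosine
    return c

def method_two_state(thetas, phis):
    """Product state of N qubits, one (theta_j, phi_j) pair per qubit."""
    psi = np.array([1.0 + 0j])
    for t, p in zip(thetas, phis):
        psi = np.kron(psi, np.array([np.cos(t), np.exp(1j * p) * np.sin(t)]))
    return psi

# Hypothetical parameters for N = 2 qubits.
psi_one = method_one_state([0.4, 1.1, 0.7], [0.3, 2.0, 5.1])  # 2**2 - 1 = 3 angle pairs
psi_two = method_two_state([0.4, 1.1], [0.3, 2.0])            # 2 angle pairs
```

Method I uses 2^N − 1 angle pairs for N qubits while method II uses only N, which reflects the trade-off between expressiveness and connectivity described in the text.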

Fig. 6. Sketch of the main circuit for Method II. Qubits in the main circuit can be divided into two groups. One group represents the sublabels and plays the role of control bits (the L part in the figure). The other represents the given vector (the V part in the figure). Furthermore, the qubits representing the vector are divided into a few groups (two groups in this figure), each of which is controlled by the sublabel qubits. All qubits need to be measured to obtain the result. Connections between the V qubits are not required in this circuit

For the complexity analysis, let us assume that M d-dimensional vectors are offered as training data, so that \(\log_{2}d\) qubits are required to represent the vectors. Measuring the inner product of two vectors requires \(\mathcal {O}\left \{\exp {(\log _{2}{d})}\right \}\) measurements, that is, a number exponential in the number of data qubits and hence of order d. In the learning process to obtain sublabels, as discussed in Sec. I, the inner-product calculation must be repeated \(\mathcal {O}\left (M^{2}\right)\) times. In total, the time complexity to obtain the sublabels is \(\mathcal {O}\left (M^{2}d\right)\). Now assume that we finally obtain L sublabels, so that \(\log_{2}L\) qubits are required to represent the sublabel. The quantum circuit that predicts the labels of test data then contains L multi-control gates. If we prepare the label qubits in a uniform superposition state, all qubits must be measured at the end. The label qubits require \(\mathcal {O}\left \{\exp {(\log _{2}{L})}\right \}\) measurements, while the data qubits require \(\mathcal {O}\left \{\exp {(\log _{2}{d})}\right \}\) measurements. As a result, the measurement must be repeated \(\mathcal {O}\left (Ld\right)\) times. However, if we prepare the label qubits in the state of Eq. (7), we only need to measure the first label qubit, and the required number of measurements is \(\mathcal {O}\left (d\right)\).
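The scaling stated above can be tabulated for concrete sizes with a small helper. This is only a bookkeeping sketch: `resource_estimate` and its dictionary keys are hypothetical names, qubit counts are rounded up to whole qubits, and the cost entries are the big-O expressions from the text with all constant factors set to one.

```python
def resource_estimate(num_train, dim, num_sublabels):
    """Tabulate the scaling expressions for M = num_train, d = dim, L = num_sublabels."""
    # (n - 1).bit_length() equals ceil(log2(n)) for any integer n >= 1
    data_qubits = (dim - 1).bit_length()
    label_qubits = (num_sublabels - 1).bit_length()
    return {
        "data_qubits": data_qubits,                   # ~ log2(d)
        "label_qubits": label_qubits,                 # ~ log2(L)
        "training_cost": num_train ** 2 * dim,        # O(M^2 d) to obtain sublabels
        "multi_control_gates": num_sublabels,         # L gates in the prediction circuit
        "shots_uniform_labels": num_sublabels * dim,  # O(L d), all qubits measured
        "shots_eq7_labels": dim,                      # O(d), first label qubit only
    }


# Illustrative sizes only, not taken from the paper.
estimate = resource_estimate(num_train=100, dim=8, num_sublabels=4)
for name, value in estimate.items():
    print(f"{name}: {value}")
```

With these illustrative sizes, preparing the label qubits in the state of Eq. (7) lowers the measurement count from L·d = 32 to d = 8, a reduction by the factor L.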


In summary, we developed a quantum classification algorithm in which the training data are first clustered and assigned sublabels, and the quantum circuit for classification is then built from these sublabels. We further applied this method to classify the metallic-insulating transition in VO2, to distinguish entanglement in Werner states, and to classify randomly generated data. The numerical simulation results show that our algorithm is capable of handling various classification problems, especially the study of phase transitions in materials.

Availability of data and materials

All data is available from the corresponding author on reasonable request.


The authors would like to thank Dr. Masoud Mohseni, Dr. Manas Sajjan and Dr. Zixuan Hu for the helpful suggestions and discussions.


We acknowledge the financial support in part by the National Science Foundation under award number 1955907 and funding by the U.S. Department of Energy (Office of Basic Energy Sciences) under Award

Author information

SK was in charge of the overall direction and planning. SK and JL designed the model and the computational framework. JL carried out the implementation and performed the calculations. All authors discussed the results and wrote the paper. Both authors read and approved the final manuscript.

Corresponding author

Correspondence to Sabre Kais.

Ethics declarations

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1

Supplementary Materials.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit

About this article

Cite this article

Li, J., Kais, S. Quantum cluster algorithm for data classification. Mater Theory 5, 6 (2021).
