A Study on Hybrid Quantum-Classical Convolutional Neural Networks and its Quantum Image Encoding Methods for Image Classification

Abstract {#abstract}

This dissertation explores the innovative intersection of quantum computing and classical convolutional neural networks through the development of a Hybrid Quantum-Classical Convolutional Neural Network (HQCCNN). By integrating quantum image encoding methods—specifically the Enhanced Novel Enhanced Quantum Representation (ENEQR) and Enhanced Flexible Representation of Quantum Images (EFRQI)—this research pushes the boundaries of traditional image classification techniques. The quantum encoding methods utilized demonstrate a unique ability to enhance image classification tasks by leveraging quantum mechanics’ complex and high-dimensional space. Experimental results reveal that HQCCNNs incorporating these quantum encoding strategies surpass conventional models in accuracy and efficiency, particularly when processing intricate image datasets. This study not only highlights the potential of quantum technologies to revolutionize fields reliant on image classification but also sets the stage for future advancements in quantum artificial intelligence, suggesting a pathway towards more sophisticated quantum-enhanced computational models.

GitHub Code: https://github.com/ybmirz/UG-Final-Year-Project

Keywords: Hybrid Quantum-Classical Algorithms, Convolutional neural network, Quanvolutional neural network, Quantum image processing, Quantum machine learning, Computer vision

Terminologies and Abbreviations {#terminologies-and-abbreviations}

Below are terminologies and abbreviations used throughout the dissertation. They are defined here for readability.

  1. QC — Quantum Computing
  2. NN — Neural Network
  3. CNN — Convolutional Neural Network
  4. HQCCNN — Hybrid Quantum-Classical Convolutional Neural Network
  5. NEQR — Novel Enhanced Quantum Representation
  6. FRQI — Flexible Representation of Quantum Images
  7. ENEQR — Enhanced Novel Enhanced Quantum Representation
  8. EFRQI — Enhanced Flexible Representation of Quantum Images
  9. RX, RY, RZ — Quantum Rotation Gates
  10. CNOT — Controlled-NOT Gate
  11. NISQ — Noisy Intermediate-Scale Quantum
  12. Qubits — Quantum Bits
  13. Quantum Gates — Operations applied towards a qubit, given a quantum circuit or algorithm
  14. Quantum Algorithms — Algorithms defined by utilising quantum operations in the form of quantum gates, representation, or measurement.
  15. Hybrid Quantum-Classical Algorithms — Algorithms defined with a mix of quantum and classical operations.

1   Introduction {#1  -introduction}

Quantum mechanics, a fundamental theory in physics, has been instrumental in shaping our understanding of the behaviour of matter and energy at the atomic and subatomic levels. Feynman [1] first proposed and popularised the idea of exploiting its principles of superposition, entanglement, and interference for computation. From this idea, quantum computing emerged as a technology with the potential to solve certain complex problems exponentially faster than present-day classical computation. Landmark results such as Shor's and Grover's algorithms, which demonstrate exponential and quadratic speed-ups respectively by exploiting quantum parallelism, have drawn increasing attention to the field, and with direct practical applications they have demonstrated the growing potential of quantum computing to solve real-world problems.

As attention on the field has grown, many subdomains have emerged under quantum computing, such as quantum algorithms, the physical realisation of quantum computers, and models of quantum computation. These domains all work towards the common challenge of advancing quantum technologies, with works such as [2] clearing the path. These advancements typically involve scaling up quantum systems, improving their stability and control through noise reduction and mitigation, and developing software and algorithms that can fully harness quantum computation. Designing and optimising quantum algorithms tends to follow two common directions: implementing an existing algorithm more efficiently on a quantum computer, or developing more practical quantum algorithms for real-world problems. Many recent studies pursue the latter, such as Variational Quantum Algorithms [3] and quantum algorithms for quantum chemistry and quantum materials science [4].

1.1   Quantum Information {#1.1  -quantum-information}

The world of quantum machine learning requires some background knowledge in basic quantum information processing and representation. Literature such as [5] notes that the extensive developments in quantum information and computation have grown from Feynman's initial proposal. In the context of this dissertation, knowledge of quantum mechanics at the physical level is not required; this subsection therefore provides the fundamentals of quantum computation needed for quantum machine learning.

Qubits. Classical computers represent information using a deterministic binary digit (bit) of 1 or 0. Quantum computers represent information using a qubit, or quantum bit. Unlike classical bits, which can only exist in the state 0 or 1, a qubit can exist in a superposition of both states simultaneously, represented by a quantum state vector [6]. The exact physical realisations of qubits vary, from optoelectronics to semiconductors and ions. This unique property allows qubits to carry multiple computational paths at once, giving quantum computers the potential to solve certain problems much faster than classical computers. Each qubit state can be represented as a linear combination of the basis states |0⟩ and |1⟩, described by a pair of complex amplitudes (α, β) satisfying the normalisation condition |α|² + |β|² = 1. The state can be visualised as a point on the Bloch sphere, a geometric representation of the qubit state space. These amplitudes determine the probability of measuring the qubit in the |0⟩ or |1⟩ state when observed.

![][image3]
Figure 1. The Bloch Sphere representing qubit states
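
As a small, self-contained illustration (not drawn from the dissertation's codebase), the state vector of a single qubit and the normalisation condition above can be checked directly in Python:

```python
# Illustrative only: a single-qubit state alpha|0> + beta|1> and the
# normalisation check |alpha|^2 + |beta|^2 = 1.
import numpy as np

alpha, beta = 1 / np.sqrt(2), 1j / np.sqrt(2)   # an equal superposition with a relative phase
state = np.array([alpha, beta])                 # amplitudes of |0> and |1>

norm = np.abs(alpha) ** 2 + np.abs(beta) ** 2
print(np.isclose(norm, 1.0))                    # True: a valid qubit state

# Measurement probabilities follow directly from the amplitudes.
p0, p1 = np.abs(state) ** 2
print(p0, p1)                                   # 0.5 0.5
```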

Quantum Gates. Quantum gates are the fundamental operations that manipulate and transform the state of qubits, analogous to classical logic gates. Each gate is represented as a unitary matrix acting on the quantum state vector. Gates are the building blocks of quantum operations; composed together, they form quantum algorithms. Common quantum gates include the Pauli-X gate (a bit-flip operation), the Hadamard gate (which creates superposition states), and controlled gates such as the CNOT (Controlled-NOT) gate, which introduces entanglement between qubits.
Quantum Circuits. Orchestrating multiple quantum gates as a sequence of operations defines a quantum algorithm. A quantum algorithm is typically represented as a quantum circuit: a sequence of gates applied to an initial quantum state. These circuits can be visualised as a series of operations acting on one or more qubits, with each gate performing a specific unitary transformation. By carefully designing circuits, researchers aim to develop algorithms that leverage quantum phenomena to achieve computational advantages over classical algorithms.

![][image4]
Figure 2. A typical Quantum Circuit Diagram
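
To make the circuit picture concrete, the following minimal sketch (using Pennylane, one of the libraries adopted later in this dissertation, and not the specific circuit of Figure 2) applies a Hadamard gate to create superposition and a CNOT gate to create entanglement, then measures both qubits:

```python
# A minimal two-qubit circuit: Hadamard then CNOT, producing a Bell state.
import pennylane as qml

dev = qml.device("default.qubit", wires=2)

@qml.qnode(dev)
def bell_circuit():
    qml.Hadamard(wires=0)           # superposition on qubit 0
    qml.CNOT(wires=[0, 1])          # entangle qubit 0 with qubit 1
    return qml.probs(wires=[0, 1])  # measurement probabilities over |00>...|11>

print(bell_circuit())               # approximately [0.5, 0, 0, 0.5]
```
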
More recently, with the growth of machine learning as an application area of computer science, the intersection of quantum computing and machine learning has given rise to a new field of research, dubbed quantum machine learning. Quantum machine learning aims to leverage quantum computing to enhance the performance of machine learning models; [44] describes it as the exploitation of quantum computation to improve machine learning solutions. This emerging field has the potential to revolutionise various applications, including image recognition, natural language processing and even recommender systems. Several promising quantum algorithms, such as [7], have shown theoretical speed-ups over their classical counterparts. However, current quantum hardware is not powerful enough to execute these theoretical algorithms. Its present capabilities are described as Noisy Intermediate-Scale Quantum, or NISQ for short [8]. This era of quantum hardware is limited by small numbers of qubits and high error rates due to noise, restricting the accuracy and depth of executable quantum instructions. This limitation has led researchers to focus on hybrid quantum-classical algorithms, where classical methods are combined with and enhanced by quantum computation.

1.2   Convolutional Neural Networks {#1.2  -convolutional-neural-networks}

In computer vision, image classification tasks are commonly tackled with convolutional neural networks. Convolutional Neural Networks, or CNNs for short, are a type of neural network consisting of multiple layers of convolutional and pooling operations followed by fully connected layers. CNNs have been shown to be highly effective in image classification, and with the advent of deep learning architectures they have delivered increasing performance across various benchmark datasets. Deep learning CNN models, however, are computationally expensive and require large amounts of data to train in order to achieve better performance. This is where quantum computation may provide an advantage: by exploiting quantum parallelism, in theory, the computational cost of training CNN models could be reduced. One approach is to implement a hybrid quantum-classical CNN (HQCCNN), using quantum computation to replace or modify specific components. For example, quantum computation can be used to perform the matrix multiplications involved in the convolutional layers, or to accelerate the computation of the activation functions. By offloading these computationally expensive tasks, the overall cost of training the CNN model can be reduced.

Convolutional and Pooling Layers. CNNs are composed of multiple convolutional and pooling layers, followed by fully connected layers. The convolutional layers are the core building blocks of CNNs, responsible for extracting and learning relevant features from the input image data. In a convolutional layer, a set of learnable filters (or kernels) are applied to the input image through a sliding window operation known as convolution. These filters are typically small (e.g., 2×2 or 4×4) and are convolved across the entire image, computing the dot product between the filter weights and the corresponding image patch at each spatial location. This operation produces a feature map that captures the presence and strength of the learned features at different spatial positions within the image.

Pooling layers are typically introduced after convolutional layers to reduce the spatial dimensions of the feature maps, effectively downsampling the representations while retaining the most salient features. Common pooling operations include max pooling, which selects the maximum value within a local neighbourhood, and average pooling, which computes the average value within a local neighbourhood.

![][image5]
Figure 3. Traditional CNN architecture
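
As a brief illustration (not taken from the dissertation's codebase), the convolution and pooling operations described above can be expressed directly in PyTorch; the sizes below assume a single-channel 14×14 input, matching the images used later in this dissertation:

```python
# Classical building blocks: a 2x2 convolution followed by 2x2 max pooling.
import torch
import torch.nn as nn

image = torch.rand(1, 1, 14, 14)          # (batch, channels, height, width)
conv = nn.Conv2d(in_channels=1, out_channels=4, kernel_size=2)
pool = nn.MaxPool2d(kernel_size=2)

features = pool(torch.relu(conv(image)))  # convolve -> ReLU -> downsample
print(features.shape)                     # torch.Size([1, 4, 6, 6])
```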

Neurons and Activation Functions. Similar to traditional neural networks, CNNs consist of interconnected neurons organised into layers. Each neuron in a convolutional or fully connected layer performs a weighted sum of its inputs and applies a non-linear activation function, such as the rectified linear unit (ReLU) or sigmoid function, to introduce non-linearity into the model.

Optimisers, Hyperparameters and Fine-tuning of the Model. CNNs are trained using backpropagation and gradient-based optimisation techniques, such as stochastic gradient descent (SGD) or adaptive optimisers like Adam [9]. During training, the model's parameters (filter weights and biases) are iteratively updated to minimise a loss function, which measures the discrepancy between the model's predictions and the ground truth labels. Hyperparameters, such as the learning rate, batch size, and regularisation techniques (e.g., dropout, weight decay), play a crucial role in the training process and model performance. Fine-tuning these hyperparameters through techniques like cross-validation or grid search can significantly impact the model's accuracy and generalisation capabilities. Deep learning CNN architectures, such as AlexNet [10], VGGNet [11], and ResNet [12], have achieved remarkable success on various benchmark datasets and real-world applications, pushing the boundaries of computer vision and image recognition. However, these deep CNN models are often computationally expensive and require large amounts of data and computational resources to train effectively.

1.3   Aims and Objectives {#1.3  -aims-and-objectives}

The primary aim of this dissertation is to investigate the development of a hybrid quantum-classical convolutional neural network that addresses the limitations of traditional image representations and encoding methods. As the literature and previous works are reviewed, this aim expands to enhancements and improvements of the HQCCNN. More specifically, this dissertation also examines the encoding methods of the hybrid model as an avenue for improvement, as well as improvements to the quantum circuit itself as a component of convolution.

The objectives of this dissertation include:

  • Implementing a baseline hybrid quantum-classical convolutional neural network, with details on its components and dataset used.
  • Performing evaluations on the baseline HQCCNN, in comparison with a baseline CNN model trained on the MNIST dataset.
  • Exploring the use of novel quantum image representation and encoding methods, including the Enhanced Flexible Representation of Quantum Images (EFRQI) and the Enhanced Novel Enhanced Quantum Representation (ENEQR), in a variational HQCCNN.
  • Analysing the effectiveness of the proposed encoding methods on the variational HQCCNN model in image classification tasks and identifying areas for future improvement.

1.4   Problems and Motivations {#1.4  -problems-and-motivations}

The main motivation for this dissertation is the pursuit of quantum supremacy. As part of the ongoing research into the capabilities of quantum computation over classical computing, a study of a hybrid algorithm such as this one contributes a benchmark of what quantum approaches could achieve. Specifically, this research contributes to the development of HQCCNNs by exploring novel image representation and encoding methods, on top of a variational scheme of quantum circuits as convolution. A variational scheme can also be described as a trainable scheme, meaning the quantum circuits themselves can be trained. The findings of this research have the potential to improve the performance and efficiency of HQCCNNs, paving the way for their application in more real-world computer vision tasks.

Despite the promising results of general HQCCNNs, their development is still in its infancy. One of the primary challenges in building HQCCNNs is the representation and encoding of images as quantum states. This dissertation focuses on providing a baseline implementation of a HQCCNN and introducing novel encoding methods for a variational HQCCNN.

1.5   Dissertation Overview {#1.5  -dissertation-overview}

This dissertation focuses on the realisability and study of hybrid quantum-classical convolutional neural networks. It further describes a crucial issue in the experimentation and implementation of these hybrid models: the encoding of images into quantum states for representation in the model. The dissertation is structured as follows: Section 2 provides background information and related literature to give context for the investigation in this dissertation. Section 3 provides the methodology for the research, implementation, and evaluation described here. Section 4 provides a detailed account of the implementation and experimental design, specifically the implementation of the models and encoding methods. Sections 5 and 6 elaborate on the experimental findings and provide an analysis of the outcomes. Lastly, Section 7 summarises the conclusions from all investigations in this dissertation and outlines future work that can extend it.

2   Background {#2  -background}

This section provides insights into the current state of HQCCNN research and related works. Its aim is to survey the literature landscape to provide context for the investigation and research in this dissertation. It also provides insights into associated quantum image representation and encoding techniques, specifically outlining the development and applications of quantum image encoding. In line with this dissertation's objective, input data encoding and representation are crucial for the implementation and effectiveness of machine learning tasks [13].

This review examines prominent efforts to date in implementing quantum modifications to CNNs. It centres on how researchers have integrated quantum techniques into classical data flows and architectures, as well as the reported outcomes of those hybrid models versus traditional CNNs. Notably, these hybrid models differ from a purely Quantum Convolutional Neural Network (QCNN), such as the model proposed by [14], a purely quantum machine learning model that borrows the concept of a CNN while using quantum computing techniques. Its architecture makes use of a multiscale entanglement renormalisation ansatz and quantum error correction. Though it can accurately recognise quantum states and is implementable on near-term quantum computers, it was not designed for higher-dimensional inputs such as image classification. This dissertation's focus rests on augmenting rather than supplanting classical CNNs, to better understand realistic near-term advantages.

One of the first hybrid models, proposed by [15], is the Quanvolutional Neural Network, which applies quantum computing techniques to a classical convolutional neural network. The paper introduces a transformation layer, dubbed the quantum convolution or quanvolutional layer, which operates on the input data by locally transforming it using random quantum circuits. The hybrid model was then evaluated on the MNIST dataset alongside a traditional CNN model and a CNN with additional non-linearities. The results indicate that the hybrid model achieves higher test set accuracy and faster training time compared to purely classical CNNs. However, this does not explicitly establish a definite quantum advantage, and further research would be needed to determine whether the observed improvements in accuracy and training speed are a direct result of the quantum transformations. The work does not provide a detailed analysis of its limitations and challenges, nor a comparison with other classical image classification techniques, and hence it is difficult to assess the true extent of any quantum advantage.

Another hybrid model, proposed by [16], is also dubbed the Quanvolutional Neural Network. Similar to the previous model, it applies quantum techniques to a CNN through a quanvolutional layer, and extends the approach by making this layer trainable via backpropagation using a variational quantum circuit. The paper also implements multiple state preparation algorithms for image encoding, allowing for larger quanvolutional filter states. The experimental results indicate that models with trainable circuits show lower error values on training and validation sets compared to models with fixed parameters in the quantum circuit.

This method of adding a quantum encoding and a quantum circuit as the convolutional layer appears to be common across the research. There are various other methods for implementing the quantum aspect of a quanvolutional neural network, such as a quantum graph CNN model, where [17] utilised quantum parametric circuits for graph-level classification tasks. Another published work worth noting is [18], which investigates the features proposed by [15], improves upon the training strategy, and evaluates different topologies, sizes, and depths of filters, going on to suggest an efficient configuration for the quanvolutional neural network.

2.2   Quantum Image Representation and Encoding {#2.2  -quantum-image-representation-and-encoding}

Quantum image representation and encoding are pivotal for efficiently utilising quantum computing resources in image processing tasks. They involve the transformation of classical image data into quantum states that can be manipulated using quantum operations. The primary focus of this section is to discuss the various encoding schemes, which differ in their approach to mapping classical image information onto quantum states. The following subsections describe common quantum image encoding methods found in literature relating to HQCCNNs, or that have been utilised in HQCCNN implementations.

2.2.1   Threshold Encoding. {#2.2.1  -threshold-encoding.}

Threshold encoding is one of the simplest forms of quantum image encoding. It reduces the colour depth of an image to a binary format, where each pixel is represented as either 0 or 1, depending on whether the pixel value is above or below a certain threshold. This method is noted for its simplicity and low computational overhead, but it suffers from significant information loss, which can impact the performance of image processing tasks. There is little literature describing this encoding technique specifically, since thresholding is a simple and common way to map information onto a different state space. This encoding was utilised in [H] in the first implementation of a HQCCNN for image classification.

2.2.2   Novel Enhanced Quantum Representation (NEQR) {#2.2.2  -novel-enhanced-quantum-representation-(neqr)}

The Novel Enhanced Quantum Representation (NEQR), initially proposed by [19], provides a more detailed approach by utilising a qubit string to represent each pixel's greyscale value directly, thus preserving the exact pixel values without compression. NEQR has been recognised for its precision and its ability to recover the original image exactly through measurement.

NEQR maps the grey values $P(Y,X)$ of a $2^n \times 2^n$ image $I$ with grey range $2^q$ as:

$$|I_1\rangle = \frac{1}{2^n}\sum_{Y=0}^{2^n-1}\sum_{X=0}^{2^n-1} |P(Y,X)\rangle \otimes |XY\rangle$$

Where |P(Y,X)⟩ is a q-qubit register storing the grey value of the pixel at position (X,Y) and |XY⟩ is a 2n-qubit register storing the binary encoding of the pixel coordinates (X,Y). To set the state of |P(Y,X)⟩, NEQR uses q CNOT gates with 2n controls each (taken from |XY⟩) to flip the q qubits according to the pixel value, which requires $q \cdot 2^{2n}$ such operations across the image. The NEQR quantum image encoding was utilised in a HQCCNN by [M], and it has been used in quantum-based image encryption schemes, combined with hyper-chaotic systems, to enhance security [20].
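
The following small sketch (illustrative only, not the dissertation's code) prepares an NEQR-style state for a 2×2 greyscale image in Qiskit, following the construction above: Hadamard gates create a superposition over pixel positions, and multi-controlled X gates write each pixel's 8-bit grey value into the intensity register. The register names and the 2×2 example image are assumptions made for the sketch.

```python
# NEQR state preparation sketch for a 2x2 greyscale image (n = 1, q = 8).
from qiskit import QuantumCircuit, QuantumRegister

image = [[0, 100], [200, 255]]     # grey values for positions (Y, X)
q = 8                              # bits per grey value
pos = QuantumRegister(2, "pos")    # |YX> position register
val = QuantumRegister(q, "val")    # |P(Y,X)> intensity register
qc = QuantumCircuit(val, pos)

qc.h(pos)                          # uniform superposition over all positions
for y in (0, 1):
    for x in (0, 1):
        # Select this position: flip the controls that should be |0>.
        if y == 0:
            qc.x(pos[1])
        if x == 0:
            qc.x(pos[0])
        # Write each set bit of the grey value with a 2-controlled X gate.
        for k in range(q):
            if (image[y][x] >> k) & 1:
                qc.mcx([pos[0], pos[1]], val[k])
        # Undo the selection flips.
        if y == 0:
            qc.x(pos[1])
        if x == 0:
            qc.x(pos[0])
```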

2.2.3   Flexible Representation of Quantum Images (FRQI) {#2.2.3  -flexible-representation-of-quantum-images-(frqi)}

Flexible Representation of Quantum Images (FRQI), initially proposed by [21], uses a single colour qubit, entangled with the position qubits, to store the greyscale value of each pixel as an angle on the Bloch sphere, which allows for a compact and efficient representation of images. FRQI stores the greyscale value of the pixel at position (Y,X) in the amplitudes of that qubit through an angle $\theta_{YX} \in [0, \pi/2]$:

$$|I_2\rangle = \frac{1}{2^n}\sum_{Y=0}^{2^n-1}\sum_{X=0}^{2^n-1}\left(\cos\theta_{YX}\,|0\rangle + \sin\theta_{YX}\,|1\rangle\right)\otimes|XY\rangle$$

Where the colour qubit $\cos\theta_{YX}|0\rangle + \sin\theta_{YX}|1\rangle$ stores the grey value of the pixel at position (X,Y) and |XY⟩ is a 2n-qubit register storing the binary encoding of the pixel coordinates (X,Y). The state is prepared by applying rotations on the colour qubit controlled on the position register, giving a preparation complexity of $O(2^{4n})$ (see the table in Section 2.2.5). FRQI has been used to compress images on quantum computers, as demonstrated in [22]. Similar to NEQR, it has also been used in quantum image encryption and decryption [23], along with multidimensional colour image processing.
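
For contrast with the NEQR sketch above, the following companion sketch (again illustrative, not the dissertation's code) prepares an FRQI-style state for a 2×2 image in Qiskit: the grey value is written as a rotation angle on a single colour qubit via multi-controlled RY rotations, rather than into a qubit register.

```python
# FRQI state preparation sketch for a 2x2 greyscale image.
import math
from qiskit import QuantumCircuit, QuantumRegister
from qiskit.circuit.library import RYGate

image = [[0, 100], [200, 255]]
pos = QuantumRegister(2, "pos")     # |YX> position register
col = QuantumRegister(1, "col")     # cos(theta)|0> + sin(theta)|1> colour qubit
qc = QuantumCircuit(col, pos)

qc.h(pos)                           # superposition over all positions
for y in (0, 1):
    for x in (0, 1):
        theta = (image[y][x] / 255) * (math.pi / 2)   # map grey value to [0, pi/2]
        if y == 0:
            qc.x(pos[1])
        if x == 0:
            qc.x(pos[0])
        # Doubly-controlled RY(2*theta) targets the colour qubit for this position.
        qc.append(RYGate(2 * theta).control(2), [pos[0], pos[1], col[0]])
        if y == 0:
            qc.x(pos[1])
        if x == 0:
            qc.x(pos[0])
```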

2.2.4   Other Quantum Image Representation Methods {#2.2.4  -other-quantum-image-representation-methods}

In addition to the primary quantum image encoding methods, several alternative strategies have been developed to address specific needs in quantum image processing. These methods aim to enhance the capabilities of quantum computing in handling complex image data by optimising the encoding and manipulation of pixel values. The Enhanced Novel Enhanced Quantum Representation (ENEQR) and the Enhanced Flexible Representation of Quantum Images (EFRQI), both initially proposed by [24], are extensions of NEQR and FRQI, respectively. ENEQR improves upon NEQR by incorporating an additional qubit and modified gate sequences to reduce the quantum cost and improve the encoding speed. EFRQI modifies FRQI by using a different amplitude and phase encoding scheme to better manage the quantum state superposition, aiming to increase the accuracy and stability of encoded images.

In the context of this dissertation, the HQCCNN will be trained on greyscale images. One encoding method designed for encoding greyscale images with minimal information loss is the Bitplane Representation of Quantum Images (BRQI). Initially proposed by [25], BRQI separates an image into several bit planes, encoding each plane separately. This method allows for a more nuanced manipulation of image details, potentially enhancing image analysis tasks such as edge detection, texture analysis or feature extraction. Another quantum encoding approach, designed to enhance the manipulation and retrieval capabilities of images, is SCMFRQI, another modified FRQI encoding. State Connection Management FRQI (SCMFRQI), proposed by [26], manages the connections between quantum states representing different parts of the image, facilitating better control over operations such as image segmentation and object recognition. Other encoding methods, with a focus on generalisation and security, include GNEQR [27] and QBIR [28]. Generalised NEQR extends NEQR to a broader range of quantum states, allowing for more complex representations, such as additional image features beyond pixel value, for example depth and transparency. The Quantum Block Image Representation (QBIR), on the other hand, is designed for secure quantum image storage and retrieval, utilising quantum encryption techniques; for more sensitive applications, such as medical imaging or security systems, this approach is crucial. All of these representations deal with greyscale images.

Hence, and lastly, for colour image processing, the Novel Quantum Representation for Colour Images (NCQI) [29] extends quantum image processing techniques to colour images by encoding RGB values into quantum states. This is done by encoding each channel into its own distinct quantum state, or as part of a composite state that encodes all three colour dimensions simultaneously, whereas greyscale images have only intensity values, simplifying the encoding process. These representations could all be applied to a HQCCNN depending on the learning task at hand. In the context of this dissertation, EFRQI and ENEQR will be implemented to provide a direct enhancement over previous implementations utilising FRQI and NEQR.

2.2.5   Comparison of Quantum Image Representation Methods. {#2.2.5  -comparison-of-quantum-image-representation-methods.}

Below is a table comparing the aforementioned quantum image representations and encodings for $2^n \times 2^n$ images.

| Encoding Method | Time Complexity | Input Colour Type | State Representation |
| --- | --- | --- | --- |
| FRQI | $O(2^{4n})$ | Greyscale | Amplitude |
| NEQR | $O(2qn \cdot 2^{2n})$ | Greyscale | Amplitude |
| EFRQI | $O(2n \cdot 2^{2n})$ | Greyscale | Amplitude |
| ENEQR | $O(2n \cdot 2^{2n})$ | Greyscale | Basis state |
| BRQI | $O(b \cdot n^2)$ | Greyscale | Basis state |
| SCMFRQI | Not described | Greyscale | Amplitude |
| QBIR | $O(2qn \cdot 2^{2n})$ | Greyscale | Basis state |
| GNEQR | $O(2qn \cdot 2^{2n})$ | Greyscale | Amplitude |
| NCQI | $O(6qn \cdot 2^{2n})$ | RGB values | Amplitude |

3   Methodology {#3  -methodology}

This chapter discusses the approach used to demonstrate, explore and experiment for this study. It outlines the methodology used to design, implement and evaluate the hybrid quantum-classical convolutional neural network (HQCCNN) for image classification. The methodology is divided into several sections, each of which addresses a specific aspect of the HQCCNN's development. As part of this dissertation's objective of improving and enhancing the baseline hybrid model, this section also describes the methodology used to design, explore, implement and evaluate enhancements to the baseline model. The section is divided into the various components that go into developing the variational HQCCNN approach and the quantum image encoding methods, such as the hybrid aspect of a HQCCNN, the quantum circuit designs, and the variational aspect of this implementation.

3.1   Research Question and Approach {#3.1  -research-question-and-approach}

The first research question established for this dissertation is: can quantum circuits be utilised in the convolutional component of a convolutional neural network? To answer this question, various literature was reviewed; multiple works such as [16] and [18] describe [15] as the first to propose the approach of a hybrid quantum-classical convolutional neural network. This dissertation implements this baseline model and runs general evaluation against a traditional CNN, using the techniques discussed later. Through this experimentation, the differences between a baseline CNN and a baseline HQCCNN are established through rigorous evaluation.

The second research question comes naturally: how, and by what means, can the HQCCNN be improved? To answer this question, various literature points to a variational approach to improving the hybrid model. Through the use of a variational, updating quantum circuit, the hybrid model would theoretically improve, as presented in [16]. This method of improvement will be evaluated in comparison with its baseline counterpart; furthermore, the image encoding used in these models is the threshold quantum image encoding. A question deriving from this is: would the use of different quantum image encoding techniques, instead of the threshold quantum image encoding, help improve the performance of the variational HQCCNN? This is the third and final research question.

The approach towards answering these research questions is divided into three major experimentations. The experimentations have an overarching focus of comparison and evaluation between three different hybrid model implementations: a baseline hybrid quantum-classical convolutional neural network with threshold quantum image encoding, a variational implementation of a HQCCNN with threshold quantum image encoding, and lastly, a variational HQCCNN with varying quantum image encodings. The remainder of this section describes the approach and methods used to design, implement and evaluate the baseline and variational HQCCNN for image classification.

3.2   Hybrid Quantum-Classical Design {#3.2  -hybrid-quantum-classical-design}

The idea of utilising quantum circuits to replace the convolution layers in a classical CNN was initially proposed by [15]. This quantum layer performs quantum convolution operations on the input data. The input images are first encoded into quantum states |ψ⟩ using a chosen encoding method, such as amplitude or angle encoding, with the first implementation utilising threshold encoding. This encoded quantum state serves as the input to the quantum circuit layer. The quantum circuit layer consists of a sequence of parametrised quantum gates acting on the encoded state |ψ⟩. This circuit is designed to perform a quantum operation analogous to a classical 2D convolution. Mathematically, a classical 2D convolution between an input image I and a filter kernel K can be expressed as:

$$(I * K)(m,n) = \sum_{i}\sum_{j} I(m-i,\, n-j)\, K(i,j)$$

This computes the dot product between the image patch centred at (m,n) and the filter kernel, producing a scalar output for that location. It can be further expressed as a matrix multiplication y = Wx, where x is the flattened input image, W is the weight matrix representing the convolutional filters, and y is the output feature map. In the quantum circuit layer, this 2D convolution is instead replaced by a quantum operator U acting on the encoded state |ψ⟩. The circuit is constructed from a sequence of universal quantum gates such as RX, RY, RZ, CNOT, etc.

The goal is to have a circuit U such that the output state |ψ′⟩ encodes features analogous to those obtained from a classical 2D convolution. One approach for this, by [15], is to utilise a random quantum circuit U consisting of single-qubit rotations and entangling two-qubit gates. This works because, mathematically, any circuit U can be represented as a unitary matrix transformation acting on the quantum state vector |ψ⟩. As such, it can be written:

$$|\psi'\rangle = U|\psi\rangle = W_q|\psi\rangle$$

Where $W_q$ is the unitary matrix representing the linear transformation performed by the quantum circuit U. After measuring the output state, we obtain a classical output vector y′ that can be fed into the subsequent layers of the neural network.

![][image6]
Figure 4. Quantum Circuit as a unitary matrix
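
The point that any circuit U acts as a unitary matrix $W_q$ on the state vector can be verified with a quick sketch (assumed circuit, not the dissertation's) using Pennylane's matrix transform:

```python
# Extract the unitary matrix W_q of a small circuit and check |psi'> = W_q |psi>.
import numpy as np
import pennylane as qml

def circuit():
    qml.Hadamard(wires=0)
    qml.CNOT(wires=[0, 1])

W_q = qml.matrix(circuit, wire_order=[0, 1])()     # 4x4 unitary of the circuit
psi = np.array([1, 0, 0, 0])                       # |00> input state
psi_out = W_q @ psi                                # |psi'> = W_q |psi>

print(np.allclose(W_q.conj().T @ W_q, np.eye(4)))  # unitarity check: True
print(psi_out)                                     # Bell state amplitudes
```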

In this way, the quantum circuit layer replaces the classical convolutional layer by performing linear transformations (quantum gates) on the encoded data. In the experiments, from the baseline and the variational approach, with or without varying encoding methods, this dissertation utilises this method of incorporating quantum computation into the CNN pipeline, thus creating a hybrid algorithm for image classification tasks.

![][image7]
Figure 5. Hybrid Quantum-Classical Design Framework

3.3   Baseline and Variational Quantum Circuit Design {#3.3  -baseline-and-variational-quantum-circuit-design}

The quantum circuit layer in the baseline implementation acted as a fixed, non-trainable component, while the rest of the neural network layers (pooling, fully connected, etc.) were trained using traditional backpropagation and gradient descent methods. In an attempt to further improve the model, a variational quantum circuit approach was explored, where the circuit becomes a trainable component of the model. More specifically, the parameters of the quantum circuits are treated as trainable variables. These parameters are optimised during the training process using a hybrid quantum-classical optimisation algorithm, with the goal of learning the most effective quantum circuit structure and parameters for the image classification.

This variational approach was initially proposed by [16]. In it, the quantum circuit itself becomes trainable by parametrising the quantum gates. It involves initialising these parameter values randomly, then iteratively updating them using a classical optimisation routine to minimise a cost function that measures the performance of the quantum circuit on the image classification task. This dissertation's implementation utilises gradient descent as the classical optimisation routine. This subsection further discusses the methodology of this optimisation routine.

The variational quantum circuit $U(\theta)$ is parameterised by a set of continuous parameters $\theta = \{\theta_1, \theta_2, \ldots, \theta_n\}$. These parameters represent the rotation angles or other configurable values of the various single-qubit and two-qubit gates that make up the quantum circuit. During the training process, the goal is to optimise these parameters to minimise a cost function $C(\theta)$. This cost function measures the error between the predicted output of the hybrid quantum-classical model and the true target labels for the image classification task.

The cost function utilised here is the cross-entropy loss function, and it is used to evaluate the general performance of all the hybrid model’s predictions against the true labels. For a batch of predictions and labels, the loss is computed as:

$$\text{Loss} = -\sum_{i=1}^{N}\sum_{c=1}^{M} y_{ic}\,\log(p_{ic})$$

Where N is the number of instances in the batch, M is the number of classes, $y_{ic}$ is a binary indicator (0 or 1) of whether class label c is the correct classification for observation i, and $p_{ic}$ is the predicted probability of observation i belonging to class c.
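
The following minimal check (illustrative only) compares the loss above against PyTorch's built-in criterion; note that the built-in criterion averages over the batch, whereas the formula above is written as a sum:

```python
# Manual cross-entropy versus torch.nn.functional.cross_entropy.
import torch
import torch.nn.functional as F

logits = torch.tensor([[2.0, 0.5, 0.1], [0.2, 1.5, 0.3]])  # N = 2 instances, M = 3 classes
labels = torch.tensor([0, 1])                               # true class per instance

probs = F.softmax(logits, dim=1)
manual = -torch.log(probs[torch.arange(2), labels]).mean()  # -(1/N) sum_i log p_{i, y_i}
builtin = F.cross_entropy(logits, labels)
print(torch.allclose(manual, builtin))                      # True
```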

To update the quantum circuit parameters $\theta$, this implementation employs a gradient-based optimisation technique, where the gradient of the cost function $C(\theta)$ with respect to each parameter $\theta_i$ is computed using a differentiation method such as the parameter-shift rule:

$$\frac{\partial C}{\partial \theta_i} = \frac{C\!\left(\theta + \tfrac{\pi}{2}\, e_i\right) - C\!\left(\theta - \tfrac{\pi}{2}\, e_i\right)}{2}$$

Here, $e_i$ is a unit vector with 1 at the i-th position and 0 elsewhere, so the shift is applied only to the parameter $\theta_i$. Once the gradients $\partial C / \partial \theta_i$ are computed for all parameters $\theta_i$, the parameters are updated using the gradient descent update rule:

$$\theta_i \leftarrow \theta_i - \eta\,\frac{\partial C}{\partial \theta_i}$$

Where $\eta$ is the learning rate, a hyperparameter that controls the step size of parameter updates. The gradients indicate how the parameters should be adjusted to minimise the loss. This process of computing gradients and updating $\theta$ is repeated iteratively throughout the training process. By optimising the quantum circuit parameters in this manner, the variational approach allows the quantum circuit to learn and adapt its operations to effectively extract and amplify relevant features from the encoded quantum image states, which theoretically improves the performance of the hybrid quantum-classical model. The experiments in this dissertation place a strong focus on baseline versus variational quantum circuits, to ensure a holistic conclusion can be drawn.
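
As a small sketch of this optimisation loop (with an assumed one-parameter circuit, not the dissertation's), the parameter-shift gradient and the gradient descent update above can be written as:

```python
# Parameter-shift gradient and plain gradient descent for a single RY parameter.
import numpy as np
import pennylane as qml

dev = qml.device("default.qubit", wires=1)

@qml.qnode(dev)
def cost(theta):
    qml.RY(theta, wires=0)
    return qml.expval(qml.PauliZ(0))   # C(theta) = cos(theta)

def parameter_shift_grad(theta, shift=np.pi / 2):
    # dC/dtheta = [C(theta + pi/2) - C(theta - pi/2)] / 2
    return (cost(theta + shift) - cost(theta - shift)) / 2

theta, eta = 0.3, 0.1
for _ in range(100):
    theta = theta - eta * parameter_shift_grad(theta)   # theta <- theta - eta * dC/dtheta
print(theta, cost(theta))   # moves towards the minimum of cos(theta)
```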

3.4   Variational HQCCNN {#3.4  -variational-hqccnn}

![][image8]
Figure 6. Hybrid Quantum-Classical Design with Variational Quantum Circuit

The learning process in this variational HQCCNN model builds atop the variational quantum circuit design described above. The model prepares the input as a quantum state using a quantum image encoding method, which is then processed and convolved by the quantum circuit. The input image, 14×14 pixels in this dissertation, is divided into 2×2 patches, or quanvolutional filters. Each 2×2 patch is encoded into a quantum state using an encoding circuit, for example one that maps the pixel values onto the amplitudes of the quantum state. Generally, the encoding circuit consists of a series of quantum gates that map the input values onto measurable quantum states. The encoded quantum state is then passed through the parameterised quantum convolutional (or quanvolutional) layer, built upon the variational or baseline quantum circuit, which applies a series of quantum operations to extract features from the 2×2 patch.

The quanvolutional layer is applied to each 2×2 patch in a sliding-window fashion across the entire input image, generating a feature map that captures the extracted features. After processing all patches, the resulting quantum state is measured, converting it to classical data that can be processed by the remaining classical components of the hybrid network. Measurement collapses the quantum state into specific basis states, with probabilities given by the state's amplitudes. Repeated measurements of the quantum state generate a distribution over the measured basis states. This forms a classical feature map, where each basis state represents a feature and its frequency represents the strength of that feature. This classical data can then be further processed by the subsequent pooling layer of the network.

For this dissertation’s experiments, the variational models have exactly one quanvolutional layer and one pooling layer before being connected by a fully connected layer. Some works such as [11] and [12] describe utilising non-linear activation functions, batch normalisations and other techniques commonly used in deep learning architectures. However, these techniques were not implemented in this dissertation’s experiments. Specific to the variational model, the gradients of the loss function are computed not only for the parameters of the classical components (pooling and fully connected layer) but also for the parameters of the quanvolutional layer. These gradients are used to update the parameters of both quantum and classical components simultaneously.

3.5   Quantum Image Encoding Methods {#3.5  -quantum-image-encoding-methods}

The first crucial step in implementing a HQCCNN is to encode the classical input images into quantum states. Various encoding schemes have been proposed in the literature to represent image data as quantum states. The general methodology involves mapping the pixel values of the image onto the amplitudes or phases of a quantum state vector |ψ⟩ comprised of multiple qubits. Mathematically, a quantum state |ψ⟩ of n qubits can be represented as a superposition of $2^n$ basis states:

$$|\psi\rangle = \sum_{x} \alpha_x |x\rangle$$

Where x ranges over all $2^n$ computational basis states, and the $\alpha_x$ are complex amplitudes satisfying the normalisation condition $\sum_x |\alpha_x|^2 = 1$. The goal of an encoding scheme is to define a mapping function f that takes an image I as input and produces the amplitude $\alpha_x = f(I, x)$ for each basis state |x⟩, such that the resulting quantum state $|\psi\rangle = \sum_x f(I,x)|x\rangle$ represents the image information.
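
As one concrete instance of such a mapping f (using Pennylane's built-in template as a stand-in; the dissertation's own encodings are described below), amplitude encoding loads normalised pixel values directly into the amplitudes $\alpha_x$:

```python
# Amplitude encoding of a 2x2 patch: four pixel values become the four
# amplitudes of a two-qubit state, normalised so that sum |alpha_x|^2 = 1.
import numpy as np
import pennylane as qml

dev = qml.device("default.qubit", wires=2)

@qml.qnode(dev)
def encode(pixels):
    qml.AmplitudeEmbedding(pixels, wires=[0, 1], normalize=True)
    return qml.state()

print(encode(np.array([0.1, 0.8, 0.3, 0.9])))   # the resulting state vector |psi>
```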

Different encoding schemes, such as amplitude encoding, angle encoding, FRQI, and NEQR, define this mapping function f differently based on considerations like linearity, normalisation, and computational complexity. In this dissertation, two novel encoding schemes, EFRQI and ENEQR, are primarily explored and utilised to improve the variational HQCCNN. EFRQI and ENEQR were proposed as quantum image encodings by [24]. These two were primarily chosen because they are direct enhancements, with minimal modifications, of FRQI and NEQR, allowing a simpler evaluation of the improvement in the models. In the baseline implementation, threshold encoding was used to encode the input images into quantum states. This encoding is simple and easy to implement: it uses one qubit per pixel, and the pixel value is determined to be either high (1) or low (0) based on a chosen threshold. The scheme uses classical pre-processing to convert greyscale values into binary, which are then encoded directly onto the qubits using the Pauli-X gate to flip the state from |0⟩ to |1⟩ where necessary. It is a simple and efficient scheme, though restrictive, since only binary information is stored. This can lead to significant loss of detail that might be critical for image processing, hence affecting the model's classification. The NEQR and FRQI encodings have previously been implemented and experimented upon by [16].

3.5.1   ENEQR Methodology {#3.5.1  -eneqr-methodology}

ENEQR builds upon the NEQR scheme with the objective of providing a more efficient and robust encoding. It extends NEQR with an auxiliary qubit to reduce the circuit complexity, such that:

$$|ENEQR\rangle = \frac{1}{2^n}\sum_{Y=0}^{2^n-1}\sum_{X=0}^{2^n-1} |P(Y,X)\rangle \otimes |0\rangle \otimes |XY\rangle$$

Where |P(Y,X)⟩ is a q-qubit register storing the grey value of the pixel at position (X,Y) and |XY⟩ is a 2n-qubit register storing the binary encoding of the pixel coordinates (X,Y). ENEQR uses only one 2n-CNOT gate to load the position onto the auxiliary qubit, followed by q plain CNOT gates to set the grey value of the pixel, instead of using q 2n-CNOT gates. The auxiliary qubit is then reset and reused for the next pixel, which overall improves the time complexity and the quantum cost of the circuit. This representation employs 2n + q + 1 qubits to store a greyscale image.
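
The sketch below (a hedged illustration of the idea described above, not the circuit of [24]) shows the auxiliary-qubit trick for a 2×2 image in Qiskit: a single multi-controlled X marks the current position on the auxiliary qubit, plain CNOTs controlled only on that qubit write the grey bits, and the auxiliary is then uncomputed here with a second multi-controlled gate, standing in for the reset described above.

```python
# ENEQR-style state preparation sketch for a 2x2 greyscale image (n = 1, q = 8).
from qiskit import QuantumCircuit, QuantumRegister

image = [[0, 100], [200, 255]]
q = 8
pos = QuantumRegister(2, "pos")    # |XY> position register
aux = QuantumRegister(1, "aux")    # auxiliary qubit
val = QuantumRegister(q, "val")    # |P(Y,X)> intensity register
qc = QuantumCircuit(val, aux, pos)

qc.h(pos)
for y in (0, 1):
    for x in (0, 1):
        if y == 0:
            qc.x(pos[1])
        if x == 0:
            qc.x(pos[0])
        qc.mcx([pos[0], pos[1]], aux[0])      # one 2n-CNOT marks this position
        for k in range(q):
            if (image[y][x] >> k) & 1:
                qc.cx(aux[0], val[k])         # q plain CNOTs set the grey bits
        qc.mcx([pos[0], pos[1]], aux[0])      # uncompute the auxiliary qubit
        if y == 0:
            qc.x(pos[1])
        if x == 0:
            qc.x(pos[0])
```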

![][image9]
Figure 7. A sample ENEQR Encoded Circuit of a greyscale image

3.5.2   EFRQI Methodology {#3.5.2  -efrqi-methodology}

EFRQI extends the FRQI scheme by introducing a scaling factor to improve representational capacity and mitigate information loss. The EFRQI encoding scheme uses the partial negation operator (a quantum gate) $R_X = \sqrt[255]{X}$ to set the qubit amplitudes more efficiently than FRQI. The X gate is a negation gate acting on one qubit, flipping the qubit state from |0⟩ to |1⟩ (and vice versa). The partial negation operator $R_X$ is the k-th root of the X gate [30]. In this greyscale implementation, the $R_X$ operator is applied according to the grey value of the pixel, with k set to 255 as the maximum 8-bit pixel value. Similar to FRQI, EFRQI uses a single qubit to store the grey value, entangled with a qubit sequence storing the corresponding position. This grey value is stored in the amplitudes of the state as follows:

$$|EFRQI\rangle = \frac{1}{2^n}\sum_{Y=0}^{2^n-1}\sum_{X=0}^{2^n-1}\left(a_{YX}|0\rangle + b_{YX}|1\rangle\right)\otimes|XY\rangle$$

Where $a_{YX}$ and $b_{YX}$ are the amplitudes of the colour qubit, set by applying the $R_X$ operator. This essentially requires only a single controlled $R_X$ operation per pixel instead of the controlled rotations used in FRQI. In terms of qubit usage, EFRQI is equal to FRQI: it employs 2n + 1 qubits to represent a $2^n \times 2^n$ greyscale image, with 1 qubit storing the greyscale value in amplitude form and 2n qubits storing the pixel position. However, EFRQI provides a computational speed-up over FRQI by using the partial negation operator to set the greyscale amplitude on the single qubit. This speed-up results in a time complexity for EFRQI of only $O(2n \cdot 2^{2n})$ compared to FRQI's $O(2^{4n})$ [24].

![][image10]
Figure 8. A sample EFRQI Encoded Circuit of a greyscale image

The experiments in this dissertation also examine how these encoding methods affect the performance of the variational HQCCNN when implemented. These are only two of the various quantum image encoding methods that could be implemented; others, such as State Connection Management FRQI (SCMFRQI) and the Bitplane Representation for Quantum Images (BRQI), could also be applied. However, attempts at implementing these encoding methods were limited by the current variational methodology. Both SCMFRQI and BRQI encode the entire image at once as a single quantum state, taking into account the pixel values of the whole image, whereas the current variational approach learns by encoding filtered patches of the image rather than the full image at once. BRQI was initially considered for implementation because works such as [25] describe it as an effective representation of greyscale images.

4   Experimental Design and Implementation {#4  -experimental-design-and-implementation}

This section discusses the technology and direct implementation of all experiments in this study, along with the dataset used. The experiments were designed to directly answer the research questions posed in the previous sections. The section is divided according to the three main model types and the experimentations based on them. The first model and experimentation focus on the comparison between a baseline HQCCNN and a classical CNN, answering the first research question. The remaining models and experiments focus on answering the second and third research questions by comparing variational with baseline HQCCNN models. Lastly, varying encoding methods are applied to both the baseline and variational models as part of the encoding-method experiments.

Each experiment consisted of 50 epochs. An epoch included 100 training steps followed by 50 validation steps. The models used a batch size of 2 samples per step. All models were trained using a basic Adam optimiser, a cross-entropy loss function, and a fixed learning rate of 0.01. The order of input data from the datasets, including validation and testing, is the same throughout all experiments. The decision to use 50 epochs with a relatively large dataset was made to better analyse how the different encoding methods affect the learning of the models. As described in the following sections, this resulted in a long duration of experimentation.

With 5 total encoding methods, each paired with both a variational quantum circuit and a baseline (fixed) quantum circuit, this results in a total of 10 experiments. An additional experiment compares the baseline HQCCNN against the classical CNN, bringing the total number of experiments to 11.

4.1   Dataset and Implementation Overview {#4.1  -dataset-and-implementation-overview}

This dissertation utilises the MNIST dataset, a widely used benchmark in the field of computer vision and machine learning [31], particularly for image classification tasks. Each image in the MNIST dataset is a 28×28 pixel greyscale image; however, this dissertation downsamples the images to 14×14 pixels to minimise the time taken to complete all experiments.
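
One possible way to obtain these downsampled images (assuming torchvision is used for loading; the dissertation's fork may organise this differently) is shown below:

```python
# Load MNIST and downsample 28x28 images to 14x14 to cut simulation time.
from torchvision import datasets, transforms

transform = transforms.Compose([
    transforms.Resize((14, 14)),
    transforms.ToTensor(),
])
mnist_train = datasets.MNIST(root="./data", train=True, download=True, transform=transform)

image, label = mnist_train[0]
print(image.shape, label)   # torch.Size([1, 14, 14]) 5
```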

![][image11]
Figure 9 (a). Sample Images from the MNIST dataset

The images in the dataset are relatively simple, making it a good starting point for exploring and testing new algorithms such as this hybrid model. More complex datasets could potentially obscure the fundamental workings and differences between the classical and quantum approaches. This simplicity also helps, in theory, to keep the encoding relatively consistent between different images when they are encoded into quantum states. Paired with long-epoch training, the use of a low-feature dataset like MNIST provides a better gauge of the exact effects (if any) the encoding methods have on the variational HQCCNN models.

The models are trained using a set of 10,000 images, a validation set of 200 images, and a test set of 1000 images. The training set was used to optimise the model parameters, the validation set was used for hyperparameter tuning and model selection, and the test set was used to evaluate the final performance of the models. This dissertation's implementation utilises a fork of [16], with heavy modifications to the classical optimisation methods and with additional encoding methods implemented. The code implementation of this dissertation is available through GitHub, and the upcoming sections describe the modifications made to the forked code. The original implementation structures the codebase as a class hierarchy built upon PyTorch's neural network module as the parent class.

The models are implemented using a combination of a classical machine learning framework, specifically PyTorch [32], and quantum machine learning libraries, namely Pennylane [33] and Qiskit [34]. The experimentation uses PyTorch to construct and train the classical components of the hybrid models, such as the pooling and fully connected layers, while Pennylane and Qiskit are used to define the quantum circuits for image encoding and the quanvolutional layer. Exact implementations for each model are described in the following subsections. Each model was programmed in Python, with the codebase focused on designing experiments. Training was performed on an NVIDIA RTX 3050 GPU with 16 GB of VRAM; the host computer has a 10-core Intel i5-13400 2.50 GHz central processing unit and 16 GB of random access memory.

The entire experimentation is containerised using the Docker engine [35] to ensure the completion and reproducibility of all experiments. Due to the current noisy era of quantum computation (NISQ), in which noise may affect the accuracy of the quantum representation, the models were trained under a simulation of quantum computing using Qiskit. Simulation as a workaround is commonplace among works that focus on quantum algorithm analysis, such as [37] and [38]. The relatively small size and simplicity of MNIST images help reduce the computational resources required for this simulation-based experimentation.

4.2   Baseline HQCCNN Design and Implementation {#4.2  -baseline-hqccnn-design-and-implementation}

The baseline HQCCNN implemented in this dissertation follows the approach proposed by [15]. In this model, the classical convolutional layers of a traditional CNN are simply replaced by the quanvolutional layer.

The first step in this implementation’s pipeline is to encode the input images into quantum states using the threshold encoding method. The threshold encoding method maps the pixel values into the basis state of a qubit, through the use of a pixel value threshold. The encoded quantum state serves as the input to the quanvolutional layer. For this baseline model specifically, the quantum circuit in the quanvolutional layer is constructed using a combination of single-qubit rotation gates (such as RX, RY, and RZ) and two-qubit entangling gates (such as CNOT). The circuit structure and gates’ parameters are initialised randomly, following the approach implemented by [15]. The threshold encoding method implementation follows the method described in the following subsections.

For the purposes of demonstration in this dissertation, the input images are pre-processed and pre-quanvolved using the quantum circuit. The implementation in Pennylane utilises the RandomLayers class within a QNode, or quantum circuit object; a QNode is an executable circuit object defined in the Pennylane library. The RandomLayers class provides layers of randomly chosen single-qubit rotations and two-qubit entangling gates acting on randomly chosen qubits. This randomness allows a wide exploration of the classification task with no predefined bias.
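
A minimal sketch of such a QNode is shown below (assumed details, not the dissertation's exact code): a 2×2 patch is threshold-encoded onto four qubits, a fixed RandomLayers block is applied, and the Pauli-Z expectation values are returned as one classical output per qubit, corresponding to the four output channels in Figure 9 (b).

```python
# Baseline (non-trainable) quanvolution of a single 2x2 patch in Pennylane.
import numpy as np
import pennylane as qml

n_qubits = 4                       # one qubit per pixel of a 2x2 patch
dev = qml.device("default.qubit", wires=n_qubits)

# Fixed random parameters: the circuit is NOT trainable in the baseline model.
rand_params = np.random.uniform(0, 2 * np.pi, size=(1, n_qubits))

@qml.qnode(dev)
def quanv_patch(patch, threshold=0.5):
    # Threshold encoding: flip |0> -> |1> for pixels above the threshold.
    for wire, pixel in enumerate(patch):
        if pixel > threshold:
            qml.PauliX(wires=wire)
    # Randomly chosen single-qubit rotations and entangling gates.
    qml.RandomLayers(rand_params, wires=range(n_qubits))
    # One classical output channel per qubit.
    return [qml.expval(qml.PauliZ(w)) for w in range(n_qubits)]

patch = [0.1, 0.8, 0.3, 0.9]       # a normalised 2x2 patch, flattened
print(quanv_patch(patch))
```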

![][image12]

Figure 9 (b). Pre-quanvolved images onto 4 output channels

After applying the quantum circuit to the encoded input state, the resulting quantum state is measured, collapsing it into a classical output vector (channel). Figure 9 (b) shows a sample quanvolution output based on the random circuit implementation of [15]. This classical output vector serves as the feature map, which is then passed through the subsequent fully connected layers. The above pre-quanvolution of the input images is simply a demonstration; in this dissertation's experimentation, the input images are quanvolved at every iteration, similar to how a classical CNN convolves at every iteration.

In the baseline implementation, the parameters of the quantum circuit in the quanvolutional layer are fixed and not trainable. During the training process, only the parameters of the classical components are optimised using traditional backpropagation and gradient descent methods, while the quantum circuit remains static at each iteration. This baseline model was implemented to answer the first research question, concerning the feasibility of a hybrid quantum-classical convolutional neural network, and it is used as the basis of comparison with a classical CNN. The remaining sections describe the variational quantum circuit implementation with varying encoding methods.

4.3   Variational HQCCNN Design and Implementation {#4.3  -variational-hqccnn-design-and-implementation}

While the baseline HQCCNN model demonstrates the feasibility of incorporating quantum circuits into convolutional neural networks, the variational approach focuses on the second research question: enhancing the model's performance by making the quantum circuit trainable. The variational approach builds upon the baseline hybrid model with modifications. Most notably, and a key aspect of this variational approach, the quantum circuit is made trainable by changing the parameters of the quantum gates within the quanvolutional layer. The output of the quanvolutional layer is a measurement of the parametrised quantum circuit as a tensor, utilised by the remaining classical components of the hybrid model.

4.3.1   A Trainable Quantum Circuit. {#4.3.1  -a-trainable-quantum-circuit.}

Each gate in the quantum circuit is associated with one or more continuous parameters, $\theta = \{\theta_1, \theta_2, \theta_3, \ldots, \theta_n\}$. These parameters are rotation angles for the typical single-qubit gates (i.e. RX, RY, RZ) or configurable values for two-qubit entangling gates (e.g. a CNOT with a variable control qubit). These parameters are similar to weights in a classical neural network, in that they are learned from data. During the forward pass of training, the quantum circuit processes the qubit states (encoding the input images) using the initial random settings of the parameters $\theta$.

4.3.2   Parameter Updates. {#4.3.2  -parameter-updates.}

Updating these parameters directly influences the transformation of the qubit states. To optimise them during training, the gradient descent method described in the methodology is employed; it updates the quantum circuit parameters based on the gradients of a cost function C(θ). This cost function measures the error between the predicted output of the hybrid model and the true target labels for the image classification task. This implementation utilises the cross-entropy loss function [40], a widely used metric for classification models: it measures the difference between two probability distributions, the predicted probabilities output by the model and the actual distribution represented by the true labels. Parameters are updated via the Adam optimiser, which adjusts each parameter based on its respective gradient. This optimiser is chosen for its adaptive learning-rate mechanism, which helps stabilise updates in the complex landscape of the quantum circuit parameter space. Below is pseudocode that outlines the integration of quantum circuit parameterisation, forward pass calculations, backpropagation, and parameter updates using the Adam optimiser.

![][image13]
Figure 10. Pseudocode of variational learning implementation

In implementation, the Adam optimiser under PyTorch’s neural network module allows additional parameters to be optimised, and hence the quantum circuit parameters are passed into the built-in Adam optimiser. When dealing with the potentially complex optimisation landscape of the quantum parameter search space, this optimiser’s update rules may achieve more stable and efficient convergence, since each update is adjusted using estimates of the first and second moments of the gradients.
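
The following is a hedged sketch of how the quantum circuit weights can be registered with PyTorch’s built-in Adam optimiser, here via PennyLane’s qml.qnn.TorchLayer. The AngleEmbedding and BasicEntanglerLayers templates, the layer sizes, and the ten-class linear head are stand-ins for the actual encoding and circuit used in the experiments.

```python
import torch
import pennylane as qml

n_qubits, n_layers = 4, 1
dev = qml.device("default.qubit", wires=n_qubits)

@qml.qnode(dev, interface="torch")
def quanv(inputs, weights):
    # Placeholder encoding and trainable entangling layers
    qml.AngleEmbedding(inputs, wires=range(n_qubits))
    qml.BasicEntanglerLayers(weights, wires=range(n_qubits))
    return [qml.expval(qml.PauliZ(i)) for i in range(n_qubits)]

# TorchLayer turns the circuit weights into torch.nn.Parameter objects
weight_shapes = {"weights": (n_layers, n_qubits)}
qlayer = qml.qnn.TorchLayer(quanv, weight_shapes)

model = torch.nn.Sequential(
    qlayer,                              # quanvolutional (quantum) component
    torch.nn.Linear(n_qubits, 10),       # classical fully connected head
)

loss_fn = torch.nn.CrossEntropyLoss()
# A single Adam instance now updates both classical and quantum parameters
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)
```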

By incorporating trainable weight parameters in addition to rotation angles, the variational HQCCNN gains more flexibility and expressive power in representing the quantum circuit operations. This can lead to better feature extraction and improved overall performance. The process of computing gradients and updating parameters is repeated at each training step, with 100 training steps per epoch, allowing the circuit to learn and adapt.

4.4   Encoding Methods on the Variational HQCCNN Implementation {#4.4  -encoding-methods-on-the-variational-hqccnn-implementation}

The variational approach aims to enhance the HQCCNN’s performance by making the quantum circuit trainable, and another avenue for potential improvement lies in the quantum image encoding methods used. The baseline implementation, as well as the initial variational approach, relied on the threshold encoding method to encode the input images into quantum states. However, this encoding method may not capture the full nuances of the image data, possibly limiting performance. To answer the third research question, this dissertation explores the implementation and integration of more advanced and expressive quantum image encoding methods into the variational HQCCNN framework. Specifically, two new encoding methods, Enhanced Flexible Representation of Quantum Images (EFRQI) and Enhanced Novel Enhanced Quantum Representation (ENEQR), are implemented and evaluated. The experimental design of this focus works towards proving the feasibility and ability of varying encoding schemes on a variational HQCCNN, including analysing the effects each encoding method has on the variational model’s performance metrics. This subsection describes how the various quantum image encoding methods are implemented, from an overview analysis of the three pre-implemented methods to the two novel methods used in the variational HQCCNN experiments.

4.4.1   Analysis of Threshold, NEQR and FRQI Encoding Implementation {#4.4.1  -analysis-of-threshold,-neqr-and-frqi-encoding-implementation}

In the baseline implementation, the threshold encoding was used to encode the input images into quantum states. This encoding is simple and easy to implement, using one qubit per pixel. The pixel value is determined to be either high (1) or low (0) based on a chosen threshold. This scheme uses classical pre-processing to convert greyscale values into binary, which are then encoded directly onto the qubits with the Pauli-X gate to flip the state from |0⟩ to |1⟩ where necessary. It is a simple and efficient scheme, though the encoding is restricted since only binary information is stored. This can lead to a significant loss of detail that might be critical for image processing, and hence affect the model’s classification performance.
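
In other words, writing I(x, y) for the greyscale value of a pixel and T for the chosen threshold, the mapping can be stated compactly (the comparison direction is taken as ≥ here for illustration):

$$
|p_{x,y}\rangle =
\begin{cases}
|1\rangle, & I(x,y) \geq T \\
|0\rangle, & I(x,y) < T
\end{cases}
$$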

The Novel Enhanced Quantum Representation (NEQR) and Flexible Representation of Quantum Images (FRQI) encoding methods were previously explored and implemented as alternatives to threshold encoding. These methods aim to provide a more expressive and accurate representation of the image data in the quantum state. The NEQR encoding represents the greyscale value of each pixel directly. This method involves initialising a quantum superposition of all pixel positions, then applying controlled operations to encode the exact greyscale values into the qubit states corresponding to each position. It is an encoding scheme capable of representing each pixel value with no loss of information, preserving the exact intensity, though it requires a significant number of qubits even for relatively small images, which can be impractical. The circuit depth and gate count of this method also grow with the image size.
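
For reference, the standard NEQR state for a $2^n \times 2^n$ image with $q$-bit greyscale values $f(y,x)$, as defined in [19], is

$$
|I\rangle = \frac{1}{2^{n}} \sum_{y=0}^{2^{n}-1} \sum_{x=0}^{2^{n}-1} |f(y,x)\rangle \otimes |y\rangle|x\rangle,
\qquad |f(y,x)\rangle = |C_{yx}^{q-1} \cdots C_{yx}^{1} C_{yx}^{0}\rangle,
$$

so $q$ qubits carry each pixel’s intensity and $2n$ qubits carry its position.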

FRQI encodes an image into a quantum state where the amplitude of each term in a superposition represents the normalised intensity of a pixel, and the basis states represent the pixel positions. A single colour qubit carries the intensity, encoded in its rotation angle, while the remaining qubits encode the pixel positions. It is an encoding scheme that requires fewer qubits than NEQR, as the intensity is encoded in the amplitude of one qubit rather than in separate qubits. Similar to NEQR, the direct relationship between the encoded angles and the pixel positions and intensities allows, in principle, for no loss of information and an easier understanding of the encoded image. The accuracy of intensity representation is limited, however, by the precision with which the rotation angles can be prepared and measured, and the method can be sensitive to quantum noise and decoherence. Since this implementation utilises quantum computing simulators, that noise sensitivity is not a major concern here.
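
The corresponding FRQI state, following [22], ties each pixel’s intensity to the rotation angle $\theta_i$ of a single colour qubit entangled with the $2n$ position qubits:

$$
|I\rangle = \frac{1}{2^{n}} \sum_{i=0}^{2^{2n}-1} \left( \cos\theta_i\,|0\rangle + \sin\theta_i\,|1\rangle \right) \otimes |i\rangle,
\qquad \theta_i \in \left[0, \tfrac{\pi}{2}\right].
$$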

While NEQR and FRQI provide the foundational frameworks for encoding images into quantum states for a variational HQCCNN, their limitations are pronounced and can be improved upon. NEQR, though offering high fidelity by representing greyscale values directly with qubits, suffers from a significant qubit requirement. Conversely, FRQI, which encodes pixel intensities into the rotation angle of a single qubit, is more resource-efficient but struggles with precision and scalability, potentially leading to information loss. To address these challenges, enhanced versions of these methods, ENEQR and EFRQI, can be utilised in a variational HQCCNN. These enhanced methods retain the core advantages of NEQR and FRQI while introducing optimisations that make them more suitable for variational hybrid models, with greater accuracy and fewer quantum resources.

4.4.2   Implementing EFRQI {#4.4.2  -implementing-efrqi}

The EFRQI encoding method is an extension of the FRQI scheme, introduced to improve the representational capacity and further mitigate information loss. EFRQI utilises the partial negation operator (RX gate) to set and store the gray value of pixels into the qubit amplitudes more efficiently compared to FRQI. This in turn reduces the time complexity associated with preparing the quantum states for image processing. The reduced time complexity comes from EFRQI simplifying the process of encoding pixel values, hence adjusting the quantum state in fewer steps than a full sequence of rotations.

![][image14]
Figure 11. Pseudocode of EFRQI Encoding Implementation

Based on the pseudocode described above, the EFRQI encoding initialises 2n + 1 qubits, where n is derived from the size of the image (assuming a square image of 2^n × 2^n pixels). Looping through all pixel coordinates x and y, the gray value is retrieved and used to compute a scaled rotation angle. The qubits are initialised into a state based on the pixel’s given amplitude using a QubitStateVector function, and the parameterised RX gate is then applied to the qubit at the corresponding position. This step effectively encodes the gray value into the amplitude of the qubits’ quantum state. Below is a sample encoding circuit given an input from the greyscale MNIST dataset. It should be noted that the resultant encoding circuit encodes only a 2×2 filter of the input image, since this dissertation’s experiments encode per convolution filter. This filter input is encoded into a circuit of 2(1) + 1 = 3 qubits.

![][image15]

![][image16]
Figure 12. Sample encoded EFRQI circuit given input image
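
The following is a minimal PennyLane sketch of this per-patch EFRQI-style encoding under stated assumptions: a 2×2 patch (n = 1, hence 2n + 1 = 3 qubits), greyscale values in [0, 255] scaled to rotation angles in [0, π/2], and position-controlled RX rotations standing in for the partial negation operator. The exact scaling and state preparation used in the experiments may differ.

```python
import numpy as np
import pennylane as qml

n = 1                     # patch size 2^n x 2^n -> a 2x2 convolution filter
pos_wires = 2 * n         # qubits for the x and y coordinates
colour_wire = pos_wires   # single colour qubit -> 2n + 1 qubits in total
dev = qml.device("default.qubit", wires=pos_wires + 1)

@qml.qnode(dev)
def efrqi_encode(patch):
    # Uniform superposition over all pixel positions
    for w in range(pos_wires):
        qml.Hadamard(wires=w)
    # Rotate the colour qubit by an angle scaled from each gray value,
    # controlled on the basis state of the position qubits
    for idx, gray in enumerate(np.ravel(patch)):
        theta = (gray / 255.0) * (np.pi / 2)        # scale intensity to [0, pi/2]
        ctrl_state = [int(b) for b in format(idx, f"0{pos_wires}b")]
        qml.ctrl(qml.RX, control=range(pos_wires), control_values=ctrl_state)(
            2 * theta, wires=colour_wire
        )
    return qml.state()

state = efrqi_encode([[0, 128], [64, 255]])          # one 2x2 greyscale patch
```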

This encoding’s biggest strength is its improved time complexity and lower quantum cost compared to previous encoding methods, specifically the FRQI encoding method. The method’s performance and advantages are heavily dependent on the precision of quantum operations, so if implemented on real quantum hardware it may prove unusable due to noise. In this dissertation’s experimentation, however, EFRQI remains useful because simulation is used rather than real-world hardware.

4.4.3   Implementing ENEQR {#4.4.3  -implementing-eneqr}

The ENEQR encoding method builds upon the NEQR encoding method by introducing an auxiliary qubit to reduce the number of CNOT gates required for encoding the pixel positions. This optimisation aims to improve the computational efficiency and scalability of the encoding process. This complexity reduction is implemented by orchestrating the encoding method in a more efficient manner, whilst ensuring that the basis states represent the corresponding pixel values correctly.

![][image17]
Figure 13. Pseudocode of ENEQR Encoding Implementation

Based on the above pseudocode, the ENEQR encoding method is first initialised with enough qubits to accommodate the binary encoding of pixel values, 2n + q + 1 qubits to be exact (for a 2^n × 2^n image and q = 8 for 8-bit greyscale pixel values). The quantum circuit initialisation consists of applying Hadamard gates to all the wires in the circuit. The method then iterates through each pixel, retrieves its exact gray intensity value, and converts it to an 8-bit binary string. Each bit of this string is inspected, and for each bit that is a ‘1’, a CNOT gate is applied with the positional qubit as the control wire. This effectively encodes the binary representation of the gray value into the basis states of the qubits. Below is a sample encoding circuit given an input from the greyscale MNIST dataset. It should be noted that the resultant encoding circuit encodes only a 2×2 filter of the input image (n = 1), since this dissertation’s experiments encode per convolution filter. This filter input is encoded into a circuit of 2(1) + 8 = 10 qubits.

![][image18]

![][image19]
Figure 14. Sample encoded ENEQR circuit given input image
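
Below is a hedged PennyLane sketch of the per-patch ENEQR-style encoding under stated assumptions: a 2×2 patch (n = 1), q = 8 value qubits, Hadamards on the position qubits only, and multi-controlled X gates in place of the auxiliary-qubit CNOT optimisation, which is omitted here for brevity.

```python
import pennylane as qml

n, q = 1, 8                               # 2^n x 2^n patch, 8-bit greyscale depth
pos_wires = list(range(2 * n))            # position qubits
val_wires = list(range(2 * n, 2 * n + q)) # value qubits (one per bit)
dev = qml.device("default.qubit", wires=2 * n + q)

@qml.qnode(dev)
def eneqr_encode(patch):
    # Superposition over pixel positions
    for w in pos_wires:
        qml.Hadamard(wires=w)
    # Write each pixel's 8-bit binary value onto the value qubits,
    # conditioned on the position register being in that pixel's basis state
    flat = [p for row in patch for p in row]
    for idx, gray in enumerate(flat):
        bits = format(int(gray), f"0{q}b")
        ctrl_state = [int(b) for b in format(idx, f"0{2 * n}b")]
        for k, bit in enumerate(bits):
            if bit == "1":
                qml.ctrl(qml.PauliX, control=pos_wires, control_values=ctrl_state)(
                    wires=val_wires[k]
                )
    return qml.state()

state = eneqr_encode([[0, 128], [64, 255]])
```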

This encoding’s biggest strength is streamlining the quantum circuit complexity with a more efficient orchestration of gates, altering the basis states corresponding to pixel values in a more direct and less resource-intensive manner. This efficiency allows for a reduction in quantum cost and accelerates image preparation. Reductions in gate usage translate directly into faster image encoding times, making ENEQR particularly advantageous for processing large datasets or for real-time image processing applications. These improvements make ENEQR a strong contender for experimental frameworks such as this dissertation’s variational hybrid model experiments, especially in simulated environments where quantum hardware limitations like noise and gate fidelity are not a factor. As noted previously, the method’s performance is contingent upon the accuracy of quantum operations, which means that direct implementation on current quantum hardware might prove challenging due to inherent noise. Simulations idealise quantum operations, thus bypassing the practical limitations posed by noise.

The structure of ENEQR, with its reduced gate complexity and efficient use of quantum states, aligns well with the requirements of a variational learning algorithm. These algorithms rely on the precise manipulation of quantum states, and the efficient nature of ENEQR helps maintain the coherence of quantum states throughout the learning process, thus enhancing the effectiveness of a variational approach.

4.5   Performance Metrics {#4.5  -performance-metrics}

In the experiments implemented in this dissertation, the performance metrics utilised include training accuracy, validation accuracy, training loss, and validation loss. These metrics are standard for evaluation in the field of neural networks, and provide critical insights into the effectiveness and efficiency of different models during training and validation phases. Training accuracy measures the percentage of correct predictions made by a model on the training dataset. Validation accuracy assesses how well a model generalises to new, unseen data, represented by a validation dataset. Training loss quantifies a model’s error on the training dataset. Validation loss measures a model’s error on the validation dataset.
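
As a small illustration of how these four metrics are typically obtained per epoch in PyTorch (the actual logging code in the experiments may differ), the helper below assumes a `model`, a `loss_fn` such as cross-entropy, and a data loader for either the training or the validation split:

```python
import torch

def evaluate(model, loader, loss_fn):
    """Return (accuracy, mean loss) of `model` over the batches in `loader`."""
    model.eval()
    correct, total, loss_sum = 0, 0, 0.0
    with torch.no_grad():
        for images, labels in loader:
            logits = model(images)
            loss_sum += loss_fn(logits, labels).item() * labels.size(0)
            correct += (logits.argmax(dim=1) == labels).sum().item()
            total += labels.size(0)
    return correct / total, loss_sum / total
```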

Generally, excessively high training accuracy relative to validation accuracy can signal overfitting, especially when the difference is significant. The validation accuracy is crucial for evaluating the practical applicability of the model, as it reflects performance on unseen data; [41] points out the critical issue of undetected overfitting that can occur when there are significant redundancies between training and validation data. A lower training loss for a given model indicates better performance and a more accurate model on the training data, and monitoring it helps in assessing how well the learning algorithm is minimising the error and optimising model parameters. Similar to validation accuracy, validation loss is a key metric for evaluating the model’s ability to generalise; specifically, lower validation loss points to better performance on unseen data [42].

In this dissertation’s experiments, the performance metrics are calculated and recorded in real-time during training and validation. This performance data is collected throughout all 10 HQCCNN models’ training and validations. Once all models are trained, the final results of experiments are compiled through the use of a Jupyter Notebook to ensure readability and conciseness. This comparative analysis helps in discerning the relative performances between varying encoding methods, variational vs baseline and even HQCCNN vs CNN.

5   Experimental Results {#5  -experimental-results}

This section provides a detailed analysis of the performance metrics derived from the experiments conducted as part of this dissertation. The experiments were designed to explore and compare the efficacy, effectiveness, and efficiency of the various HQCCNN models. The experiments are divided into three parts, as previously described, each addressing one of the three research questions. To ensure a comprehensive evaluation, data was recorded at every epoch during the training and validation phases of each model, given the extensive training durations involved. This approach allows trends to be compiled and analysed over time, helping to provide an overarching perspective on the learning dynamics and stability of these models under different configurations.

5.1   An Overview {#5.1  -an-overview}

Before describing the various experiments and HQCCNN training results, a comprehensive summary of performance metrics across the HQCCNN models is provided in the following tables. These values are aggregated over multiple training epochs and provide an overview of the results. The metrics include training accuracy, validation accuracy, training time per step (in hours) and test accuracy, each measured under baseline and variational HQCCNN configurations.

![][image20]
Figure 15. Training Time of HQCCNN models

As an overview from Figure 15, the variational circuit approach is generally slower in training, with all the variational models showing increased total training times and an average increase of 32.11% in training time over their baseline counterparts. This general increase reflects the additional computational overhead required by the trainable parameterised circuits. Variational ENEQR has the highest mean training time in this experimentation, indicating possibly more complex quantum operations compared to the others.

![][image21]
Figure 16. Testing Accuracy of HQCCNN models

Based on the overview of the models’ testing accuracy given in Figure 16, there is an overall higher test accuracy in the variational models. The variational models generally outperform their baseline counterparts in test accuracy, with the variational ENEQR model achieving the highest mean test accuracy of 0.895. This suggests that the adaptability and enhancements of ENEQR allowed for enhanced feature extraction capabilities in the variational circuits, contributing positively to the model’s performance on unseen data.

![][image22]
Figure 17. Overview of training and validation accuracy of HQCCNN models

Figure 17 shows the overarching training and validation accuracy, across all HQCCNN models. As a quick summary, the FRQI and NEQR models show improvement when applied in the variational approach over the baseline, but to a lesser extent than EFRQI and ENEQR. The EFRQI encoded models, both variational and baseline approaches, show robust performance with the highest mean training accuracy at 0.8668 and a validation accuracy peaking at 0.91. The ENEQR encoding method also performs well under variational settings, with a notable increase in both training and validation accuracies compared to the baseline.

5.2   Baseline HQCCNN vs Classical CNN {#5.2  -baseline-hqccnn-vs-classical-cnn}

This experiment aims to establish a performance baseline by comparing a traditional classical CNN with the baseline HQCCNN model. The configurations of both neural networks are kept as similar as possible, with the CNN using a learning rate of 0.01 and the Adam optimiser. Additionally, the CNN was trained on 10000 training 14×14 pixel MNIST images and 1000 validation images, the same amounts used for the HQCCNN models. This performance baseline is based upon a HQCCNN with threshold encoding and a fixed, non-trainable quantum circuit in the quanvolutional layer. The performance metrics help evaluate the fundamental viability and efficiency of integrating quantum computing principles into neural network architectures.
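
A sketch of the shared data configuration described above, using torchvision to resize MNIST to 14×14 pixels and to take 10000 training and 1000 validation images; the batch size, data directory, and exact subset indices are assumptions for illustration.

```python
import torch
from torchvision import datasets, transforms

transform = transforms.Compose([
    transforms.Resize((14, 14)),      # downsample MNIST to 14x14 pixels
    transforms.ToTensor(),
])

train_full = datasets.MNIST("data", train=True, download=True, transform=transform)
test_full = datasets.MNIST("data", train=False, download=True, transform=transform)

train_set = torch.utils.data.Subset(train_full, range(10_000))   # 10000 training images
val_set = torch.utils.data.Subset(test_full, range(1_000))       # 1000 validation images

train_loader = torch.utils.data.DataLoader(train_set, batch_size=32, shuffle=True)
val_loader = torch.utils.data.DataLoader(val_set, batch_size=32)

# Both the classical CNN and the HQCCNN models use Adam with a 0.01 learning rate
```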

![][image23]
Figure 18. Baseline HQCCNN vs Classical CNN Performance Metrics

The classical CNN exhibits a consistently high training accuracy, maintaining levels above 90% throughout the epochs. This performance underscores the robustness and effectiveness of classical CNNs in capturing features and learning from the training dataset, possibly aided by the large training set. Conversely, the baseline HQCCNN displays lower training accuracy, fluctuating around the 70% mark, which might indicate challenges in optimisation or model capacity when quantum components are involved.

In a similar fashion, the validation loss for the classical CNN remains low and stable, contrasting with the higher and more volatile loss observed in the baseline HQCCNN. This pattern reinforces the idea that classical models have better performance stability than their quantum counterparts in this experimental setup. This comparison differs from what the common literature describes, where a baseline HQCCNN is about as good at classification as a classical CNN for the MNIST dataset. These results instead indicate that a baseline HQCCNN barely reaches the performance of a classical CNN.

5.3   Variational HQCCNN vs Baseline HQCCNN {#5.3  -variational-hqccnn-vs-baseline-hqccnn}

This subsection proceeds to describe the differences between the baseline HQCCNN model and its variational counterpart. The variational approach involves parameterised quantum circuits that could potentially offer enhanced adaptability and learning efficiency. In this experiment, the comparison focuses on the performance between the baseline HQCCNN and its variational counterpart across multiple encoding methods, including EFRQI, ENEQR, FRQI, NEQR and the Threshold encoding method as a control. The data from the plots provides insight into the training and validation dynamics across 50 epochs.

![][image24]![][image25]
Figure 19. Variational HQCCNN vs Baseline HQCCNN Training Performance Metrics

Figure 19 shows the training performance metrics of the variational and baseline HQCCNNs. Comparing the variational models to their baseline counterparts, the former exhibit higher mean and peak training accuracy. Most notably, the EFRQI and ENEQR variational models consistently outperform their baseline versions, indicating that the variational approach enhances their ability to capture and learn from the training data effectively. Though with a lower margin of improvement than EFRQI and ENEQR, FRQI and NEQR also gain slightly from the variational approach. The variational models exhibit a smoother decline in training loss, particularly for EFRQI and ENEQR, resulting in a lower loss maintained throughout the epochs. This suggests that the variational circuits might be more effective in optimising the loss landscape. The baseline FRQI and NEQR models display higher variability and spikes in training loss, which might point to difficulties in model convergence or the presence of more rugged loss surfaces.

![][image26]

![][image27]
Figure 20. Variational HQCCNN vs Baseline HQCCNN Validation Performance Metrics

Figure 20 shows the validation performance of the HQCCNNs under the baseline and variational approaches. Similar to the training metrics, the variational models achieve a higher validation accuracy, with EFRQI and ENEQR showing the most significant improvements; the higher accuracy of the variational approach suggests that it generalises better. The validation accuracy curves for the baseline models are more erratic, with pronounced fluctuations across epochs specifically for NEQR, indicative of potential overfitting or instability in the models’ learning process. In terms of validation loss, the variational models show lower and more stable values, especially for ENEQR and EFRQI; this stabilisation further points to the enhanced learning effectiveness of variational circuits. The baseline models show higher loss peaks and more variability, aligning with the trends observed in the training metrics. These results indicate that variational quantum circuits contribute to more effective learning and generalisation capabilities across different encoding methods; the variational models generally smooth out the fluctuations seen in the baseline models and maintain a steadier progression across epochs.

5.4   Varying Encoding Methods on variational HQCCNN {#5.4  -varying-encoding-methods-on-variational-hqccnn}

This subsection assesses how various quantum image encoding techniques affect a variational HQCCNN’s performance. In contrast to the other experiments, the focus here is on comparing and contrasting the performance of the various encoding techniques. Given this experimental setting on the MNIST dataset, the goal is to identify the most effective encoding method for a variational HQCCNN. Over 50 epochs of training and validation, the performance indicators are compared below. This subsection is further divided based on the direct enhancements of the originally implemented encoding methods, i.e. EFRQI vs FRQI.

![][image28]

![][image29]
Figure 21. Varying Encoding Methods on Variational HQCCNN Performance Metrics

Figure 21 shows an overview of all the encoding methods applied on a variational HQCCNN model, training and validating over time. The EFRQI and ENEQR encodings stand out, showing superior performance in both training and validation phases compared to FRQI and NEQR. The threshold encoding serves as a control or baseline encoding method, showing that while it performs relatively well, it does not leverage the quantum capabilities as effectively as the more specialised quantum image encodings. A quick glance shows that NEQR has the least stabilised training and validation loss, which could indicate a loss of image representation throughout the model’s learning.

5.4.1   Threshold, FRQI and NEQR Encoding {#5.4.1  -threshold,-frqi-and-neqr-encoding}

![][image30]
Figure 22. Threshold, FRQI and NEQR on Variational HQCCNN Performance Metrics

This experiment focuses on analysing the effects of varying encoding methods when implemented in a variational HQCCNN. As described previously, the threshold encoding serves as a control encoding method. In a variational model, threshold encoding shows performance comparable to its baseline implementation in both training and validation metrics, indicating that the simple threshold encoding does not significantly benefit from the variational approach. The threshold encoding also shows a relatively stable training loss, indicating good efficiency but potentially limited capability in handling more complex patterns.

The FRQI encoding results show significant variability in both training and validation metrics. While it does occasionally reach higher accuracy peaks, its overall stability is lower than that of the threshold encoding, with higher peaks in loss, suggesting sensitivity to the training dynamics. That said, the FRQI encoding method comes out above the other two in terms of training and validation accuracy. The NEQR encoding exhibits slightly better consistency than FRQI in accuracy but generally underperforms in terms of peak validation accuracy, indicating NEQR’s weaker generalisability. The variational model with NEQR encoding shows marginal enhancements over the baseline but fails to significantly outperform the other encoding methods. The experimental results indicate that FRQI encodes image patches in a way that is suitable for variational quantum circuits to operate on and produce expressive feature maps, as shown in the high training and validation peaks the FRQI encoding achieved during experimentation.

5.4.2   EFRQI Encoding vs FRQI Encoding {#5.4.2  -efrqi-encoding-vs-frqi-encoding}

The EFRQI encoding method is expected to provide a more robust and adaptable encoding strategy compared to the standard FRQI. The enhancement in the encoding is anticipated to better capture the nuances in image data, making it potentially more suitable for a deep variational HQCCNN, because EFRQI can encode more information per qubit than FRQI. The expectation is that EFRQI will show improved accuracy and stability over FRQI due to its enhanced capability to manage noise and represent more complex patterns. This experiment focuses on how EFRQI and FRQI differ and how each affects its respective variational HQCCNN model’s performance.

![][image31]
Figure 23. EFRQI vs FRQI encoding on Variational HQCCNN Performance Metrics

Figure 23 shows the results of training a variational HQCCNN model using EFRQI and FRQI encoding. The EFRQI encoding demonstrates a smoother and more stable increase in training accuracy over the epochs compared to FRQI, suggesting that EFRQI captures more of the nuances and complexities of the training data in its representation. In the validation phase, EFRQI consistently maintains higher accuracy levels than FRQI. Based on the training and validation loss graphs, both EFRQI and FRQI have difficulty stabilising the variational HQCCNN’s learning process. The noticeably lower variance in validation loss for EFRQI, however, indicates slightly better generalisability and robustness against variations in new data, though possibly not a large enough improvement to be crucial for real-world applications. Nevertheless, the EFRQI encoding allowed the variational HQCCNN to continue learning at higher training and validation accuracy.

5.4.3   ENEQR vs NEQR Encoding {#5.4.3  -eneqr-vs-neqr-encoding}

The ENEQR encoding method implemented in a variational HQCCNN, aimed at providing an improved representation of quantum images by enhancing the original NEQR method’s efficiency and effectiveness, is expected to yield better performance metrics in both training and validation. ENEQR as a scheme intends to optimise the quantum state space usage, which could in turn make the encoding process more efficient. This could further lead to less information loss during encoding, quanvolution and pooling.

![][image32]
Figure 24. ENEQR vs NEQR encoding on Variational HQCCNN Performance Metrics

Figure 24, showing the performance metrics of a variational HQCCNN implemented with ENEQR and NEQR, reveals a few trends. Both NEQR and ENEQR show a similar trend in training accuracy, but ENEQR consistently performs better, particularly in the latter half of the epochs, suggesting that the enhancements in ENEQR help maintain a high accuracy as training progresses. Furthermore, ENEQR exhibits a lower training loss than NEQR, indicating a superior capability in encoding and learning from the training set. During the validation phase, ENEQR outperforms NEQR in validation accuracy throughout most epochs. The validation loss of ENEQR is not only lower but also very stable across epochs, which points to better generalisation capabilities. The reduced fluctuation in validation loss suggests that ENEQR is far less likely to be affected by overfitting compared to NEQR.

6   Discussion {#6  -discussion}

The primary aim of this dissertation is to investigate the development of HQCCNNs and explore avenues for enhancing their performance through novel quantum image encoding methods and a variational circuit approach. Specifically, the research focused on three main objectives: evaluating the feasibility of incorporating quantum circuits into a CNN; assessing a variational approach to improve the performance of HQCCNNs; and analysing the impact of different quantum image encoding methods on the variational HQCCNN’s performance. Through extensive research, implementation, experimentation and analysis, these questions provide valuable insights into the potential and challenges associated with integrating quantum computing principles into classical learning architectures. The key findings and their implications are discussed in this section.

6.1   Addressing the Research Question {#6.1  -addressing-the-research-question}

Can quantum circuits be utilised in CNNs? The experimental results comparing the baseline hybrid model with a classical CNN show the feasibility of incorporating a quantum circuit into a CNN. The baseline hybrid model achieved reasonable performance, with training accuracy fluctuating around 70% on the MNIST dataset. This indicates that quantum circuits can indeed be leveraged in CNNs, albeit with potential limitations in terms of model stability and generalisation. It should be noted that the high training accuracy of the classical CNN is likely due to the large number of training images used. The results align with the initial work by [15], demonstrating the viability of incorporating quantum computation into neural network architectures, with a slight caveat: various works describe the feasibility of a HQCCNN without paying heed to its ability. Through the experimentation done in this dissertation, a baseline HQCCNN with threshold encoding barely holds up against a simple classical CNN trained under the same conditions; [36] analyses this consequence further. Additionally, various works that train a hybrid model often compare it against a CNN trained under different experimental configurations, such as a different number of training and validation images.

This brings the next question: can a variational approach improve the performance of HQCCNNs? The experimental results comparing variational and baseline HQCCNN models across different encoding methods provide evidence that a variational approach can modestly improve a HQCCNN’s performance. Judging by the performance metrics, the variational approach allowed for smoother training loss convergence and reduced loss variability and sporadic spikes, suggesting that the adaptability of the variational approach contributes to more effective learning. Generally, there were reduced fluctuations in validation metrics, which also indicates enhanced generalisation capabilities. The ENEQR variational HQCCNN model, in particular, exhibited remarkable performance, achieving the highest mean test accuracy of 0.895 among all models.

The final question asks: how do different quantum image encoding methods affect a variational HQCCNN’s performance metrics? The experimental results indicate that the choice of image encoding method plays a crucial role in the performance of the implemented model. The simple threshold encoding, in comparison with the other implemented encoding methods, merely served as a baseline in the results due to its model’s performance stagnation; it did not leverage the quantum capabilities as effectively as the other encoding methods and was unable to fully represent the images. The FRQI encoding method showed high accuracy but was unstable and varied in its performance; even so, the experimental results showed FRQI to be the best among the initial three (threshold, NEQR and FRQI) encoding methods, a result that aligns with [16]’s work. As its more enhanced cousin, the EFRQI encoding method performed better, with smoother and faster accuracy gains. The ENEQR encoding method emerged as the top performer, outdoing its predecessor NEQR in all metrics: it showed high accuracy, low loss, and exceptional stability, suggesting it generalises well.

These results highlight the importance of choosing the right quantum image encoding method for a variational HQCCNN model. The enhanced methods, EFRQI and ENEQR, performed better than their original counterparts, FRQI and NEQR, in preserving image features and enabling effective learning.

6.2   Significance of ENEQR and EFRQI {#6.2  -significance-of-eneqr-and-efrqi}

The results clearly show that the ENEQR and EFRQI encoding methods outperformed the other encoding techniques across various aspects of the variational HQCCNN models, in some cases marginally and in others significantly. Under further analysis, several factors contributed to their performance and their ability to preserve essential image features; these are examined in more depth below.

The EFRQI encoding method introduced a scaling factor to improve representational capacity and mitigate information loss. By utilising the partial negation operator, EFRQI can set the qubit amplitudes more efficiently than FRQI, reducing the overall time complexity associated with preparing the quantum states for image processing. This is reflected in the training time results (shown in Figure 15): the variational EFRQI model completed its training in under 3 hours, while the variational FRQI model took 4 hours. This decrease in total training time is direct evidence of EFRQI’s reduced time complexity relative to FRQI. It should be noted, however, that EFRQI did not demonstrate a significant reduction in training time compared to the other encoding methods. Nevertheless, EFRQI proved to be a general improvement, over FRQI especially, as an encoding method for a variational HQCCNN.

The ENEQR encoding method introduced an auxiliary qubit to reduce the number of CNOT gates required for encoding the pixel positions. This optimisation enhances the computational efficiency and scalability of the encoding process, leading to generally reduced quantum cost and faster image encoding times; [39] similarly explores and utilises auxiliary quantum states in an auxiliary space, resulting in higher efficiency. By orchestrating gates in a more efficient manner, ENEQR can directly alter the basis states corresponding to pixel values with fewer operations. Moreover, ENEQR’s efficient representation of greyscale values using qubits minimises information loss during the encoding and subsequent quanvolution and pooling operations. In addition, the quanvolution in this implementation convolves over 2×2 ENEQR-encoded patches, so each patch is represented with a high preservation of image features. This preservation of essential image features contributes to the robust performance of the ENEQR-encoded variational HQCCNN in both training and generalisation. This per-patch encoding is also the reason the ENEQR-encoded variational HQCCNN has the highest mean training time among all models, even though its gate orchestration and qubit usage are more efficient than those of the other encoding methods.

Based on the experimental results of this dissertation, the ENEQR-encoded Hybrid Quantum-Classical Convolutional Neural Network, configured with a 2×2 quanvolutional kernel/filter, delivers the highest performance of any variational HQCCNN evaluated here. This model’s performance exceeds that of the other encoding methods in all metrics, from training accuracy to validation loss.

6.2.1 ENEQR-Encoded HQCCNN vs Classical CNN {#6.2.1-eneqr-encoded-hqccnn-vs-classical-cnn}

In light of the above discussion, the training and validation performance of the ENEQR-encoded model was compared with the classical CNN’s performance. This additional experiment is intended to provide insight into whether this dissertation’s best variational HQCCNN model can compete with a traditional CNN, contributing in a small way towards demonstrating quantum advantage in this context. This classical CNN model is the same model that was compared with the initial baseline HQCCNN.

![][image33]
Figure 25. Variational HQQCNN w/ ENEQR Encoding vs Classical CNN Performance Metrics

Figure 25 showcases the results of this additional experiment. The two models start with markedly different accuracies: the CNN learns the simple dataset within the first 5 epochs (aided by the large dataset), while the ENEQR-encoded variational HQCCNN learns gradually over more epochs. This suggests that the hybrid model learns effectively but the classical CNN is more efficient. At a glance, the results show that the classical CNN still reigns supreme when learning on a large dataset. The training loss for both models decreases over time as they learn, with the hybrid model starting at a significantly higher loss; this value, however, rapidly decreases and stabilises close to that of the classical CNN, another indicator of effective learning. For validation loss, both models show variability, which is typical given the unseen nature of validation data. The variational HQCCNN shows a much more stable validation loss trend, even surpassing the classical CNN (i.e. achieving a lower validation loss) in later epochs. The increase in validation loss for the CNN could be indicative of overfitting to the training data, or sensitivity to the validation set’s variability. Lastly, both models were tested, and based on the resulting test accuracy and test loss, the variational HQCCNN with ENEQR encoding reaches an accuracy comparable to the classical CNN, with minimal difference; the hybrid model, however, surpasses the CNN on test loss.

The HQCCNN’s stronger performance in test accuracy and test loss, supported by a stable validation loss trend, strongly supports the notion that the hybrid quantum-classical model is better at generalising to new, unseen data than the classical approach in this experimental setup. This, in essence, suggests a better capturing and processing of underlying patterns in the data that are not as effectively handled by the classical CNN.

6.3   Limitations of Dissertation {#6.3  -limitations-of-dissertation}

The main limitation of this dissertation stems from its experimental nature. This dissertation is intended to be a study, with no outlook on real-world replicability. This limitation shows through in every aspect, most notably the usage of the MNIST dataset. Whilst MNIST is a popular benchmark for demonstrating machine learning concepts, it is relatively simple and does not represent the complexity of real-world image data; [43] points out that the simplicity of MNIST may not represent the challenges found in real-time applications where speed and computational efficiency are crucial. The performance observed in this dissertation’s experimentation may therefore not generalise to more complex or noisy datasets, such as those found in medical imaging or autonomous driving. Relatedly, the experimentation in this dissertation can be described as rigid, requiring more experimental variables to ensure a more rounded analysis and conclusion.

Another limitation is the lack of a comparative analysis with other quantum models for the image classification task. This dissertation focuses on comparing HQCCNNs with their classical counterpart, and one encoding method with another, resulting in a focus on improving HQCCNNs. There is a lack of comparisons with other quantum or hybrid models that could have provided a more rounded perspective on where HQCCNNs stand in terms of efficiency, accuracy, and practicality for the image classification task.

A limitation found directly in the architecture of this hybrid model is the method of convolution. This dissertation’s implementation of HQCCNNs quanvolves the encoded input images at a 2×2 patch, meaning other image encoding methods that require full image information would not be compatible with this architecture. Having a HQCCNN architecture that encodes the image fully first before quanvolving allows for experimentation with a wider range of encoding methods, such as BRQI and SCMFRQI. BRQI specifically may prove to be useful to represent greyscale images with as minimal information loss as possible [25], but implementation of it on a HQCCNN requires a change in architecture.

7   Conclusion {#7  -conclusion}

This dissertation has explored the new domain of hybrid quantum-classical convolutional neural networks (HQCCNNs) with a focus on quantum image encoding methods, under the umbrella aim of bridging the gap between quantum computing capabilities and practical image classification tasks. Through a meticulous experimental design and investigation, several quantum image encoding strategies, including Enhanced Novel Enhanced Quantum Representation (ENEQR) and Enhanced Flexible Representation of Quantum Images (EFRQI), were assessed and compared to more traditional methods.

Key Findings. Based on this dissertation’s investigation, the experimental results demonstrated that the integration of quantum computation into convolutional neural networks provides a promising avenue for enhancing image classification accuracy. Most notably, the variational HQCCNN models employing the ENEQR and EFRQI encoding methods consistently outperform their counterparts employing other encoding methods, along with the baseline hybrid quantum model, in both training and validation phases. The ENEQR-encoded variational hybrid model comes close to an overfitted classical CNN in terms of accuracy whilst maintaining a stable validation loss, meaning high accuracy without the same concern about overfitting. These findings underscore the potential of quantum-implemented or quantum-enhanced neural networks to exploit state superposition, yielding substantial improvements in model performance.

Theoretical and Practical Implications. This dissertation also demonstrated the implementation of the ENEQR and EFRQI quantum image encodings within a hybrid quantum-classical algorithm, contributing towards quantum image processing by implementing additional image encodings onto a hybrid algorithm. The developed models promise substantial advances in areas requiring large-scale image processing and classification, such as medical imaging or real-time video analysis, where classical algorithms may falter in speed and efficiency.

7.1   Future Works {#7.1  -future-works}

As a starting point for future work, the limitation of a rigid experimental design could be addressed by running more iterations of the same experiments using different random seeds. These seeds are the basis of randomness in the hybrid models, most notably the initialised parameters and quantum circuit of the variational HQCCNN. This future work specifically tackles a limitation of this dissertation.

Future studies could look towards applying this dissertation’s winning ENEQR-encoded variational HQCCNN towards more complex and diverse datasets, such as CIFAR-10, ImageNet, or domain-specific images like medical or satellite imagery. This would test the model’s robustness and generalisability across different challenges and increase its practical value.

Future studies could also explore and compare quantum image encoding methods beyond those specified in this dissertation, including more advanced strategies that might offer better compression or error resilience. Such error resilience would then allow for an implementation of the HQCCNN on a NISQ-era quantum computer. On the topic of other encoding methods, future studies could tackle various transformational constraints that classical CNNs inherently have, such as rotational or translational equivariance. By the nature of encoding, it may be possible for a quantum image encoding to bypass such transformational constraints, since the image is represented in an entirely different space. Applying such a hypothetical quantum image encoding method may solidify a HQCCNN’s usability above traditional classical CNNs. As of now, various deep learning CNN models provide workarounds for these constraints by utilising more compute.

References {#references}

[1] R. P. Feynman, “Simulating physics with computers,” International Journal of Theoretical Physics, vol. 21, no. 6–7, pp. 467–488, Jun. 1982, doi: https://doi.org/10.1007/bf02650179.
[2] J. Wang, F. Sciarrino, A. Laing, and M. G. Thompson, “Integrated photonic quantum technologies,” Nature Photonics, vol. 14, no. 5, pp. 273–284, Oct. 2019, doi: https://doi.org/10.1038/s41566-019-0532-1.
[3] M. Cerezo et al., “Variational quantum algorithms,” Nature Reviews Physics, vol. 3, no. 9, pp. 625–644, Sep. 2021, doi: https://doi.org/10.1038/s42254-021-00348-9.
[4] B. Bauer, S. Bravyi, M. Motta, and G. K.-L. Chan, “Quantum Algorithms for Quantum Chemistry and Quantum Materials Science,” Chemical Reviews, vol. 120, no. 22, pp. 12685–12717, Oct. 2020, doi: https://doi.org/10.1021/acs.chemrev.9b00829.
[5] J. Yepez, “Quantum computation for physical modeling,” Computer Physics Communications, vol. 146, no. 3, pp. 277–279, Jul. 2002, doi: https://doi.org/10.1016/s0010-4655(02)00418-6.
[6] X. Zhang, H.-O. Li, G. Cao, M. Xiao, G.-C. Guo, and G.-P. Guo, “Semiconductor Quantum Computation,” National Science Review, vol. 6, no. 1, pp. 32–54, Jan. 2019, doi: https://doi.org/10.1093/nsr/nwy153.
[7] Y. Liu, S. Arunachalam, and K. Temme, “A rigorous and robust quantum speed-up in supervised machine learning,” Nature Physics, pp. 1–5, Jul. 2021, doi: https://doi.org/10.1038/s41567-021-01287-z.
[8] J. Preskill, “Quantum Computing in the NISQ era and beyond,” Quantum, vol. 2, no. 2, p. 79, Aug. 2018, doi: https://doi.org/10.22331/q-2018-08-06-79.
[9] M. H. Alrefaei and S. Andradóttir, “Discrete stochastic optimization using variants of the stochastic ruler method,” Naval Research Logistics (NRL), vol. 52, no. 4, pp. 344–360, Mar. 2005, doi: https://doi.org/10.1002/nav.20080.
[10] A. Krizhevsky, I. Sutskever, and G. E. Hinton, “ImageNet Classification with Deep Convolutional Neural Networks,” Communications of the ACM, vol. 60, no. 6, pp. 84–90, May 2012, doi: https://doi.org/10.1145/3065386.
[11] K. Simonyan and A. Zisserman, “Very Deep Convolutional Networks for Large-Scale Image Recognition,” Sep. 2014.
[12] K. He, X. Zhang, S. Ren, and J. Sun, “Deep Residual Learning for Image Recognition,” 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Jun. 2016, doi: https://doi.org/10.1109/cvpr.2016.90.
[13] R. LaRose and B. Coyle, “Robust data encodings for quantum classifiers,” Physical Review A, vol. 102, no. 3, Sep. 2020, doi: https://doi.org/10.1103/physreva.102.032420.
[14] I. Cong, S. Choi, and M. D. Lukin, “Quantum convolutional neural networks,” Nature Physics, vol. 15, no. 12, pp. 1273–1278, Aug. 2019, doi: https://doi.org/10.1038/s41567-019-0648-8.
[15] M. Henderson, S. Shakya, S. Pradhan, and T. Cook, “Quanvolutional neural networks: powering image recognition with quantum circuits,” Quantum Machine Intelligence, vol. 2, no. 1, pp. 1–9, Feb. 2020, doi: https://doi.org/10.1007/s42484-020-00012-y.
[16] D. Mattern, D. Martyniuk, H. Willems, F. Bergmann, and A. Paschke, “Variational Quanvolutional Neural Networks with enhanced image encoding,” 2021.
[17] J. Zheng, Q. Gao, and Y. Lu, “Quantum Graph Convolutional Neural Networks,” Jul. 2021, doi: https://doi.org/10.23919/ccc52363.2021.9550372.
[18] P. Atchade-Adelomou and G. Alonso-Linaje, “Quantum-enhanced filter: QFilter,” Soft Computing, vol. 26, no. 15, pp. 7167–7174, Jun. 2022, doi: https://doi.org/10.1007/s00500-022-07190-w.
[19] Y. Zhang, K. Lu, Y. Gao, and M. Wang, “NEQR: a novel enhanced quantum representation of digital images,” Quantum Information Processing, vol. 12, no. 8, pp. 2833–2860, Mar. 2013, doi: https://doi.org/10.1007/s11128-013-0567-z.
[20] Y. Luo, S. Tang, J. Liu, L. Cao, and S. Qiu, “Image encryption scheme by combining the hyper-chaotic system with quantum coding,” Optics and Lasers in Engineering, vol. 124, p. 105836, Jan. 2020, doi: https://doi.org/10.1016/j.optlaseng.2019.105836.
[21] P. Q. Le, F. Dong, Y. Arai, and K. Hirota, “Flexible Representation of Quantum Images and Its Computational Complexity Analysis,” vol. 25, p. 185, Jul. 2009, doi: https://doi.org/10.14864/fss.25.0.185.0.
[22] P. Q. Le, F. Dong, and K. Hirota, “A flexible representation of quantum images for polynomial preparation, image compression, and processing operations,” Quantum Information Processing, vol. 10, no. 1, pp. 63–84, Apr. 2010, doi: https://doi.org/10.1007/s11128-010-0177-y.
[23] A. Iliyasu, “Towards Realising Secure and Efficient Image and Video Processing Applications on Quantum Computers,” Entropy, vol. 15, no. 12, pp. 2874–2974, Jul. 2013, doi: https://doi.org/10.3390/e15082874.
[24] N. Nasr, A. Younes, and A. Elsayed, “Efficient representations of digital images on quantum computers,” Multimedia Tools and Applications, vol. 80, no. 25, pp. 34019–34034, Aug. 2021, doi: https://doi.org/10.1007/s11042-021-11355-4.
[25] H.-S. Li, X. Chen, H. Xia, Y. Liang, and Z. Zhou, “A Quantum Image Representation Based on Bitplanes,” IEEE Access, vol. 6, pp. 62396–62404, 2018, doi: https://doi.org/10.1109/access.2018.2871691.
[26] M. E. Haque, M. Paul, A. Ulhaq, and T. Debnath, “A Novel State Connection Strategy for Quantum Computing to Represent and Compress Digital Images,” arXiv (Cornell University), Jun. 2023, doi: https://doi.org/10.1109/icassp49357.2023.10094832.
[27] H. Li, P. Fan, H. Xia, H. Peng, and S. Song, “Quantum Implementation Circuits of Quantum Signal Representation and Type Conversion,” IEEE Transactions on Circuits and Systems I-regular Papers, vol. 66, no. 1, pp. 341–354, Jan. 2019, doi: https://doi.org/10.1109/tcsi.2018.2853655.
[28] X. Liu, D. Xiao, W. Huang, and C. Liu, “Quantum Block Image Encryption Based on Arnold Transform and Sine Chaotification Model,” IEEE Access, vol. 7, pp. 57188–57199, Jan. 2019, doi: https://doi.org/10.1109/access.2019.2914184.
[29] J. Sang, S. Wang, and Q. Li, “A novel quantum representation of color digital images,” Quantum Information Processing, vol. 16, no. 2, Dec. 2016, doi: https://doi.org/10.1007/s11128-016-1463-0.
[30] A. Younes, “Reading a single qubit system using weak measurement with variable strength,” Annals of Physics, vol. 380, pp. 93–105, May 2017, doi: https://doi.org/10.1016/j.aop.2017.03.008.
[31] L. Deng, “The MNIST Database of Handwritten Digit Images for Machine Learning Research [Best of the Web],” IEEE Signal Processing Magazine, vol. 29, no. 6, pp. 141–142, Nov. 2012, doi: https://doi.org/10.1109/msp.2012.2211477.
[32] “PyTorch,” www.pytorch.org. https://pytorch.org
[33] “PennyLane,” pennylane.ai. https://pennylane.ai
[34] “IBM Quantum,” www.ibm.com. https://www.ibm.com/quantum/qiskit
[35] “Docker Engine overview,” Docker Documentation, Apr. 09, 2020. https://docs.docker.com/engine/
[36] T. Cheng, R.-S. Zhao, S. Wang, R. Wang, and H.-Y. Ma, “Analysis of Learnability of a Novel Hybrid Quantum-Classical Convolutional Neural Network in Image Classification,” Chinese Physics B, Dec. 2023, doi: https://doi.org/10.1088/1674-1056/ad1926.
[37] Y. Wang, “Quantum Computation and Quantum Information,” Statistical Science, vol. 27, no. 3, pp. 373–394, Aug. 2012, doi: https://doi.org/10.1214/11-sts378.
[38] A. Zulehner and R. Wille, “Advanced Simulation of Quantum Computations,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 38, no. 5, pp. 848–859, May 2019, doi: https://doi.org/10.1109/tcad.2018.2834427.
[39] W. Liu, H. Wei, and L. Kwek, “Universal Quantum Multi-Qubit Entangling Gates with Auxiliary Spaces,” Advanced Quantum Technologies, vol. 5, no. 5, Mar. 2022, doi: https://doi.org/10.1002/qute.202100136.
[40] A. Mao, M. Mohri, and Y. Zhong, “Cross-Entropy Loss Functions: Theoretical Analysis and Applications,” arXiv.org, Apr. 14, 2023. https://arxiv.org/abs/2304.07288
[41] I. Wallach and A. Heifets, “Most Ligand-Based Benchmarks Measure Overfitting Rather than Accuracy,” Jun. 2017.
[42] A. P. Robinson and R. E. Froese, “Model validation using equivalence tests,” Ecological Modelling, vol. 176, no. 3–4, pp. 349–358, Sep. 2004, doi: https://doi.org/10.1016/j.ecolmodel.2004.01.013.
[43] A. Palvanov and Y. I. Cho, “Comparisons of Deep Learning Algorithms for MNIST in Real-Time Environment,” International Journal of Fuzzy Logic and Intelligent Systems, vol. 18, no. 2, pp. 126–134, Jun. 2018, doi: https://doi.org/10.5391/ijfis.2018.18.2.126.
[44] M. Schuld, I. Sinayskiy, and F. Petruccione, “An introduction to quantum machine learning,” Contemporary Physics, vol. 56, no. 2, pp. 172–185, Oct. 2014, doi: https://doi.org/10.1080/00107514.2014.964942.