This is a summary of my 2023 GSoC project with the ML4SCI-organization. In my project I designed and implemented a Quantum Generative Adversarial Network for the generation of HEP experiment data. The full code for all my work can be found on Github. In the following post I will outline my work and describe some parts of the implementation and the results
Event generation in HEP experiments
In high energy physics experiements like they are conducted at CERN, an integral part of the analysis process is the comparison of measurements with results expected based on predictions from our theory of nature, the Standard Model of particle physics. Generating these predictions is usually done by Monte Carlo simulations. Since these simulations are very demaning, there has been a vast amount of work on generative machine learning models e.g. ( Citation: Oliveira, Paganini & al., 2017 Oliveira, L., Paganini, M. & Nachman, B. (2017). Learning Particle Physics by Example: Location-Aware Generative Adversarial Networks for Physics Synthesis. Comput Softw Big Sci (2017) 1: 4. https://doi.org/10.1007/s41781-017-0004-6 ; Citation: Butter, Plehn & al., 2019 Butter, A., Plehn, T. & Winterhalder, R. (2019). How to GAN LHC Events. SciPost Phys. 7, 075 (2019). https://doi.org/10.21468/SciPostPhys.7.6.075 ; Citation: Hariri, Dyachkova & al., 2021 Hariri, A., Dyachkova, D. & Gleyzer, S. (2021). Graph Generative Models for Fast Detector Simulations in High Energy Physics. Retrieved from http://arxiv.org/abs/2104.01725 ) . The main incentive is to speed up the simulation process by training a generative model like a GAN, from which one can then cheaply sample from.
Classical GANs
Generative Adversarial Networks (GANs) are a class of unsupervised machine learning models proposed in ( Citation: Goodfellow, Pouget-Abadie & al., 2014 Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A. & Bengio, Y. (2014). Generative Adversarial Networks. Retrieved from http://arxiv.org/abs/1406.2661 ) . GANs aim to train a generator $G(z,\Theta_g)$ with a latent space $z$ and parameters $\Theta_g$ to replicate a reference probability distribution when sampling from the latent space $z$.
A GAN consists of two networks, the generator $G$ and a discriminator $D$. The networks are trained by playing a zero sum game, where the generator tries to generate samples which are as realistic as possible, while the discriminator tries to classify real data samples and tag samples generated by the generator as fake.
A more detailed description of classical GANs is given in this post.
A Quantum GAN for event generation
There have been several different proposals for Quantum GANs ( Citation: Lloyd & Weedbrook, 2018 Lloyd, S. & Weedbrook, C. (2018). Quantum generative adversarial learning. Phys. Rev. Lett. 121, 040502 (2018). https://doi.org/10.1103/PhysRevLett.121.040502 ; Citation: Dallaire-Demers & Killoran, 2018 Dallaire-Demers, P. & Killoran, N. (2018). Quantum generative adversarial networks. Phys. Rev. A 98, 012324 (2018). https://doi.org/10.1103/PhysRevA.98.012324 ; Citation: Niu, Zlokapa & al., 2021 Niu, M., Zlokapa, A., Broughton, M., Boixo, S., Mohseni, M., Smelyanskyi, V. & Neven, H. (2021). Entangling Quantum Generative Adversarial Networks. https://doi.org/10.48550/ARXIV.2105.00080 ) including proposals for applications in HEP simulation ( Citation: Rehm, Vallecorsa & al., 2023 Rehm, F., Vallecorsa, S., Grossi, M., Borras, K. & Krücker, D. (2023). A Full Quantum Generative Adversarial Network Model for High Energy Physics Simulations. Retrieved from http://arxiv.org/abs/2305.07284 ; Citation: Rehm, Vallecorsa & al., 2023 Rehm, F., Vallecorsa, S., Borras, K., Grossi, M., Krücker, D. & Varo, V. (2023). Precise Image Generation on Current Noisy Quantum Computing Devices. Retrieved from http://arxiv.org/abs/2307.05253 ) .
In this project I focused on a Hybrid quantum classical model based on ( Citation: Zoufal, Lucchi & al., 2019 Zoufal, C., Lucchi, A. & Woerner, S. (2019). Quantum Generative Adversarial Networks for Learning and Loading Random Distributions. npj Quantum Inf 5, 103 (2019). https://doi.org/10.1038/s41534-019-0223-2 ) . It consists of a quantum circuit acting as generator and a classical discriminator, which recives measured samples as input. Notably, this approach differs from many other approches in the literature, as it does not try to embed the continous data into quantum states through an embedding. We rather try to systematically discretize the data and develop a mapping to basis states of quantum system. The generator can then learn to prepare a quantum state whose distribution of basis states when measured resembles the discretized data.
As a toy example, I take a dataset of detector images originating from Quark initiated Jets ( Citation: Andrews, Alison & al., 2019 Andrews, M., Alison, J., An, S., Bryant, P., Burkle, B., Gleyzer, S., Narain, M., Paulini, M., Poczos, B. & Usai, E. (2019). End-to-End Jet Classification of Quarks and Gluons with the CMS Open Data. Nucl. Instrum. Methods Phys. Res. A 977, 164304 (2020). https://doi.org/10.1016/j.nima.2020.164304 ) and stepwise scale this down to the size of $3\times 3$ with $M=2$ possible values per pixel. Below I show exemplary, how the average of all data samples scales down from a full $125\times 125$ detector image.
As every possible state needs to be represented by a unique basis state, I represent every pixel with $\log_2M$ qubits, where $M$ is the number of discrete values a single pixel can take. The total qubits needed to represent an image of size $N\times N$ with $M$ discrete values per pixel is then $n = N^2\log_2M$.
For our toy example we therefore have $n=9$ pixels. Of course this is a very simplified example, however it can work as a simple benchmark to test our model.
Sampling from the generator we will obtain a measured basis stat which is a list of $0$s and $1$s representing the measured value for every qubit.
To map such a basis state $\ket{q_1q_2\dots}, q_i = 0,1$ to an image and vice versa, we proceed as following:
- Reshape the list $[q_1,q_2,\dots]$ to $(N,N,\log_2 M)$
- Convert the third dimension from an binary list to an integer
- Divide by $2^M - 1$ to normalize
These images can then be passed into a classical deep neural network of your choice, meaning it is possible to utilize convolutional or graph neural networks as discriminator, as has been done in jet phyics before. Since the generator is trained on adversarially against the discriminator, we expect the generator to be able to generalize, given that the discriminator generalizes well on the data.
The training of the hybrid QGAN proceeds as following, for every epoch:
- We draw $N$ samples $s_i$ of basis states
- We evaluate the discriminator $D$ at theses samples giving $D(s_i)$
- We get the probabilites for every sample beeing drawn by the generator $G$ as $p_i = G(s_i)$
- We calculate the generator Loss $\mathcal{L}_G=\sum_i p_i \log D(s_i)$
- We perform a step updating the generator parameters using $\mathcal{L}_G$
- We evaluate the discriminator on a batch of real data samples $D(x_i)$
- We calculate the discriminator loss as $\mathcal{L}_D=\frac{1}{N}\sum^N_i\log D(x_i) - \sum_i p_i \log D(s_i)$.
- We perform a step updating the discriminator parameters using $\mathcal{L}_D$.
In the discriminator loss $\mathcal{L}_D$ the first term corresponds to the discriminator learning the real samples and the second term to learning to spot the fake samples generated by the generator.
I implemented this learning procedure using pennylane and pytorch. For that, I use a qnode
with the pennylane pytorch interface
dev = qml.device("default.qubit.torch", wires=num_qubits)
@qml.qnode(dev, interface="torch", diff_method="backprop", cachesize=1000000)
def circuit(inputs, weights):
for wire in range(num_qubits): qml.Hadamard(wires=wire)
qml.StronglyEntanglingLayers(weights=weights, wires=list(range(num_qubits)))
return qml.probs()
In this example, I return as measurement the probabilites of the basis states to use for the generator loss function. Note that I have to increase the cachesize
for larger ciruits. As a unitarity in this example I use pennylane StronglyEntanglingLayer
, though a more hardware efficient ansatz would be advantageous for execution on real hardware.
To perform the hybrid training with pytorch, we have to convert the qnode
to a torch layer
weight_shapes = {"weights": (n_layers, num_qubits,3)}
qlayer = qml.qnn.TorchLayer(circuit, weight_shapes)
As discriminator we can build a feed forward neural network with the pixel number as input size.
class Discriminator(nn.Module):
def __init__(self, input_size):
super(Discriminator, self).__init__()
self.linear_input = nn.Linear(input_size, 50)
self.leaky_relu = nn.LeakyReLU(0.2)
self.linear1 = nn.Linear(50, 20)
self.linear2 = nn.Linear(20, 1)
self.sigmoid = nn.Sigmoid()
self.flatten = nn.Flatten()
def forward(self, inputs):
x = self.flatten(inputs)
x = self.linear_input(x)
x = self.leaky_relu(x)
x = self.linear1(x)
x = self.leaky_relu(x)
x = self.linear2(x)
x = self.sigmoid(x)
return x
generator = qlayer
discriminator = Discriminator(N**2)
We can now optimize the parameters of these two models using pytorchs standard optimizers and the loss functions described above.
Training results
The training results of the Qgan are shown below
We can see, that the KL divergence and the MSE between the average generated and data image decrease and converge in a controlled manner. Therefore, the learned generator can generate the data distribution fairly well. It is worthwile to note that especially, there is no mode collapse here, as we are, in contrast to many other QGAN proposals, sampling from a quantum state, therefore mode collapse would mean learning a unitarity, which creates exactly a basis state. This is very unlikely and especially would not lead to a minimum in the loss function.
Scaling and prospects
I think this is an interesting model for generative tasks on classical data. Of course we need more qubits than alternative models that embed continous data, however since the discretization scales with $\log_2 M$ in the long run this should not be the limiting factor. I tried applying this model to larger images, however on a classical simulator the requried qubits were too many to simulate reasonably.
The main thing I would like to understand now is what advantages Quantum assisted machine learning can bring over classical methods. My main motivation for this model was the complexity theory based argument that by measurement quantum computers can efficiently sample from distributions, which are hard to sample for classical algorithms. However, I would like to understand this point better, develop an insight to what kind of data distribtuions this applies and especially make sure that the classical discriminator is able to learn these distribtuions.
References
- Andrews, Alison, An, Bryant, Burkle, Gleyzer, Narain, Paulini, Poczos & Usai (2019)
- Andrews, M., Alison, J., An, S., Bryant, P., Burkle, B., Gleyzer, S., Narain, M., Paulini, M., Poczos, B. & Usai, E. (2019). End-to-End Jet Classification of Quarks and Gluons with the CMS Open Data. Nucl. Instrum. Methods Phys. Res. A 977, 164304 (2020). https://doi.org/10.1016/j.nima.2020.164304
- Butter, Plehn & Winterhalder (2019)
- Butter, A., Plehn, T. & Winterhalder, R. (2019). How to GAN LHC Events. SciPost Phys. 7, 075 (2019). https://doi.org/10.21468/SciPostPhys.7.6.075
- Dallaire-Demers & Killoran (2018)
- Dallaire-Demers, P. & Killoran, N. (2018). Quantum generative adversarial networks. Phys. Rev. A 98, 012324 (2018). https://doi.org/10.1103/PhysRevA.98.012324
- Goodfellow, Pouget-Abadie, Mirza, Xu, Warde-Farley, Ozair, Courville & Bengio (2014)
- Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A. & Bengio, Y. (2014). Generative Adversarial Networks. Retrieved from http://arxiv.org/abs/1406.2661
- Hariri, Dyachkova & Gleyzer (2021)
- Hariri, A., Dyachkova, D. & Gleyzer, S. (2021). Graph Generative Models for Fast Detector Simulations in High Energy Physics. Retrieved from http://arxiv.org/abs/2104.01725
- Lloyd & Weedbrook (2018)
- Lloyd, S. & Weedbrook, C. (2018). Quantum generative adversarial learning. Phys. Rev. Lett. 121, 040502 (2018). https://doi.org/10.1103/PhysRevLett.121.040502
- Niu, Zlokapa, Broughton, Boixo, Mohseni, Smelyanskyi & Neven (2021)
- Niu, M., Zlokapa, A., Broughton, M., Boixo, S., Mohseni, M., Smelyanskyi, V. & Neven, H. (2021). Entangling Quantum Generative Adversarial Networks. https://doi.org/10.48550/ARXIV.2105.00080
- Oliveira, Paganini & Nachman (2017)
- Oliveira, L., Paganini, M. & Nachman, B. (2017). Learning Particle Physics by Example: Location-Aware Generative Adversarial Networks for Physics Synthesis. Comput Softw Big Sci (2017) 1: 4. https://doi.org/10.1007/s41781-017-0004-6
- Rehm, Vallecorsa, Grossi, Borras & Krücker (2023)
- Rehm, F., Vallecorsa, S., Grossi, M., Borras, K. & Krücker, D. (2023). A Full Quantum Generative Adversarial Network Model for High Energy Physics Simulations. Retrieved from http://arxiv.org/abs/2305.07284
- Rehm, Vallecorsa, Borras, Grossi, Krücker & Varo (2023)
- Rehm, F., Vallecorsa, S., Borras, K., Grossi, M., Krücker, D. & Varo, V. (2023). Precise Image Generation on Current Noisy Quantum Computing Devices. Retrieved from http://arxiv.org/abs/2307.05253
- Zoufal, Lucchi & Woerner (2019)
- Zoufal, C., Lucchi, A. & Woerner, S. (2019). Quantum Generative Adversarial Networks for Learning and Loading Random Distributions. npj Quantum Inf 5, 103 (2019). https://doi.org/10.1038/s41534-019-0223-2