Implementation of a Hamming distance-like genomic quantum classifier using inner products on ibmqx2 and ibmq_16_melbourne.

Kunal Kathuria, Aakrosh Ratan, Michael McConnell, Stefan Bekiranov
Published in: Quantum Machine Intelligence (2020)
Motivated by the problem of classifying individuals with a disease versus controls using a functional genomic attribute as input, we present relatively efficient general-purpose inner product-based kernel classifiers to classify the test as a normal or disease sample. We encode each training sample as a string of 1s (presence) and 0s (absence) representing the attribute's existence across ordered physical blocks of the subdivided genome. Having binary-valued features allows for highly efficient data encoding in the computational basis for classifiers relying on binary operations. Given that a natural distance between binary strings is Hamming distance, which shares properties with bit-string inner products, our two classifiers apply different inner product measures for classification. The active inner product (AIP) is a direct dot product-based classifier whereas the symmetric inner product (SIP) classifies upon scoring correspondingly matching genomic attributes. SIP is a strongly Hamming distance-based classifier generally applicable to binary attribute-matching problems whereas AIP has general applications as a simple dot product-based classifier. The classifiers implement an inner product between N = 2^n dimensional test and train vectors using n Fredkin gates while the training sets are respectively entangled with the class-label qubit, without use of an ancilla. Moreover, each training class can be composed of an arbitrary number m of samples that can be classically summed into one input string to effectively execute all test-train inner products simultaneously. Thus, our circuits require the same number of qubits for any number of training samples and are O(log N) in gate complexity after the states are prepared. Our classifiers were implemented on ibmqx2 (IBM-Q-team 2019b) and ibmq_16_melbourne (IBM-Q-team 2019a). The latter allowed encoding of 64 training features across the genome.
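The two similarity measures and the classical summing of training samples can be illustrated with a purely classical sketch. This is not the quantum circuit from the paper, and it makes several assumptions: all function names and the toy 8-bit data are hypothetical, the SIP-style score is read here as the number of matching positions (N minus the Hamming distance), and ties are broken in favor of the disease class.

```python
# Classical sketch only: the names, data, and tie-breaking rule are
# illustrative assumptions, not taken from the paper.

def active_inner_product(test, train):
    """Dot product; for bit strings it counts positions where both are 1."""
    return sum(t * s for t, s in zip(test, train))

def hamming_distance(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))

def matching_score(test, train):
    """Assumed SIP-like score: matching positions = N - Hamming distance."""
    return len(test) - hamming_distance(test, train)

def sum_class(samples):
    """Sum the m training strings of one class position by position."""
    return [sum(column) for column in zip(*samples)]

def classify(test, disease, normal, score):
    """Pick the class with the larger total score (ties go to disease)."""
    disease_total = sum(score(test, s) for s in disease)
    normal_total = sum(score(test, s) for s in normal)
    return "disease" if disease_total >= normal_total else "normal"

# Toy data: 1 = attribute present in a genomic block, 0 = absent.
test = [1, 0, 1, 1, 0, 0, 1, 0]
disease = [[1, 0, 1, 1, 0, 0, 0, 0], [1, 1, 1, 0, 0, 0, 1, 0]]
normal = [[0, 1, 0, 0, 1, 1, 0, 1], [0, 0, 0, 1, 1, 0, 0, 1]]

# Because the dot product is linear, one inner product with the summed
# string equals the sum of the individual test-train inner products.
summed_disease = sum_class(disease)
aip_summed = active_inner_product(test, summed_disease)
aip_individual = sum(active_inner_product(test, s) for s in disease)

aip_label = classify(test, disease, normal, active_inner_product)
match_label = classify(test, disease, normal, matching_score)
```

The linearity check is the classical counterpart of the statement that a class of m samples can be summed into one input string: the size of the input does not grow with m.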