Integrating nanopore technology with machine-learning

Rapid diagnosis of infectious disease is essential at any time, and especially so during the current coronavirus pandemic. Traditional molecular methods such as the polymerase chain reaction (PCR) or immunosensing are frequently used for viral detection. These techniques can efficiently detect and assess viral infections, but they demand time, expertise, and several pre-treatment steps. Nanopore sensing has been studied as a novel method to detect viruses or other biological particles because of its sensitivity to single particles. During nanopore sensing, the sample is driven through the pore electrophoretically; each particle then causes a transient change in the ionic current, which is measured as a resistive pulse. These changes stem from the physical properties of the particles, such as their volume, surface charge, shape, and mass. However, the variability between samples and the validity of the evaluation remain a challenge, as observed by Arima et al., who in 2018 successfully identified influenza A(H1N1), A(H3N2), and B using nanopore sensing integrated with machine learning.

Virus detection using nanopore technology

Recently, the same research group developed label-free identification of respiratory tract viruses using the same nanopore sensing approach linked with machine-learning classification. A solid-state cylindrical nanopore was developed to detect different viruses, namely influenza A virus, influenza B virus, and respiratory syncytial virus (RSV). The coronavirus and adenoviruses, being responsible for the two big pandemics, were also included in the study. The nanopore was fabricated in a SiNx membrane. The 300 nm nanopore was covered with a 5 μm thick polyamide layer to reduce device capacitance and to increase the temporal resolution of the resistive pulse measurements. The nanopore chip was also sealed with two polydimethylsiloxane (PDMS) blocks to contain the sample solutions, and the prepared biosamples were filtered to remove any large contaminants that could clog the nanopore. The test liquid was injected into the anode side of the PDMS cell, and the other side was filled with phosphate-buffered saline (PBS), so that virions passed through the pore by electrophoresis.

Why was machine learning introduced in this approach?

The initial approach involved analysis based on two features of the resistive pulse signals, i.e., height Ip and width td, which reflect the volume and surface charge density of the particle, respectively. However, these parameters had a wide range and showed substantial overlap between the five viral species. Therefore, to obtain more specific and better discrimination, machine learning was introduced.
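To make the two features concrete, the following is a minimal sketch of how a pulse height Ip and width td could be read off a sampled current trace. The function name, the fixed-threshold detection rule, and the synthetic trace are assumptions made for illustration; they are not taken from the study.

```python
def extract_pulses(current, dt, baseline, threshold):
    """Return a (height, width) pair for each resistive pulse in a trace.

    current   : list of ionic current samples (arbitrary units)
    dt        : sampling interval in seconds
    baseline  : open-pore current level
    threshold : minimum drop below baseline that counts as a pulse
    """
    pulses = []
    start = None
    for i, value in enumerate(current):
        drop = baseline - value
        if drop >= threshold and start is None:
            start = i                      # pulse begins
        elif drop < threshold and start is not None:
            segment = current[start:i]     # samples inside the pulse
            height = baseline - min(segment)   # Ip: deepest current drop
            width = (i - start) * dt           # td: pulse duration
            pulses.append((height, width))
            start = None
    # A pulse that is still open when the trace ends is ignored.
    return pulses


# Synthetic trace (hypothetical values) containing two pulses.
trace = [10.0, 10.0, 9.0, 7.5, 7.0, 8.5, 10.0, 10.0, 9.2, 6.0, 9.8, 10.0]
pulses = extract_pulses(trace, dt=1e-5, baseline=10.0, threshold=0.5)
for height, width in pulses:
    print(f"Ip = {height:.2f}, td = {width:.1e} s")
```

With only these two numbers per pulse, pulses from different viruses can easily land in the same region of the (Ip, td) plane, which is the overlap problem described above.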

Machine learning features

In machine learning-based sensing, a supervised approach is used, whereby a classification model, called a classifier, ‘learns’ feature parameters from each signal and then attributes those parameters to a virus species as a class. Several methods may be adopted to create the classifier, including naïve Bayes, support vector machine (SVM), k-nearest neighbour, random forest, and rotation forest.
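As an illustration of the supervised idea, here is a minimal k-nearest neighbour sketch in plain Python. The feature values and the class names "virus_A" and "virus_B" are hypothetical, and the features are assumed to be already scaled to comparable ranges; the study itself used a different classifier.

```python
import math
from collections import Counter


def knn_predict(train_features, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training pulses."""
    distances = sorted(
        (math.dist(features, query), label)
        for features, label in zip(train_features, train_labels)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Hypothetical labelled pulses: (scaled height, scaled width) per pulse.
train_features = [
    (1.0, 0.20), (1.1, 0.25), (0.9, 0.22),   # class "virus_A"
    (2.0, 0.60), (2.1, 0.55), (1.9, 0.65),   # class "virus_B"
]
train_labels = ["virus_A"] * 3 + ["virus_B"] * 3

print(knn_predict(train_features, train_labels, (1.05, 0.21)))
print(knn_predict(train_features, train_labels, (2.05, 0.60)))
```

The labelled pulses play the role of the training signals, and each new pulse is assigned to the class of the training pulses it most resembles.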

In this study by Arima et al., machine learning was first used to estimate individual virus discrimination by analysing patterns of ionic current spikes. This was done using the Waikato Environment for Knowledge Analysis (WEKA) machine learning workbench with 67 Rotation Forest ensembles in conjunction with a distinct base classifier such as the naïve Bayes model.

Random forest is an ensemble learning algorithm that utilizes a plurality of models called decision trees. Each tree categorizes the target data in a stepwise manner using conditions learned from the training data, and the predictions of the individual trees are then combined to assign a class. Random forest usually adopts bootstrap sampling to generate the varied training subsets from which the individual trees are built.
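The bootstrap-and-vote idea can be sketched in a few lines of plain Python. This is a deliberately simplified stand-in, not the Rotation Forest configuration run in WEKA: each "tree" here is a single-split stump on one randomly chosen feature, whereas a full random forest grows deeper trees and draws a random feature subset at every split. All data values and labels are hypothetical.

```python
import random
from collections import Counter


def majority(labels):
    """Most frequent label in a non-empty list."""
    return Counter(labels).most_common(1)[0][0]


def bootstrap_sample(rows, rng):
    """Draw len(rows) rows with replacement."""
    return [rng.choice(rows) for _ in rows]


def train_stump(rows, feature_index):
    """Fit a one-split tree on one feature, minimising training errors."""
    values = sorted({features[feature_index] for features, _ in rows})
    thresholds = [(a + b) / 2 for a, b in zip(values, values[1:])] or values
    best = None
    for t in thresholds:
        left = [label for f, label in rows if f[feature_index] <= t]
        right = [label for f, label in rows if f[feature_index] > t]
        left_label = majority(left)
        right_label = majority(right) if right else left_label
        errors = (sum(label != left_label for label in left)
                  + sum(label != right_label for label in right))
        if best is None or errors < best[0]:
            best = (errors, t, left_label, right_label)
    _, t, left_label, right_label = best
    return (feature_index, t, left_label, right_label)


def train_forest(rows, n_trees, seed=0):
    """Train n_trees stumps, each on its own bootstrap sample."""
    rng = random.Random(seed)
    n_features = len(rows[0][0])
    forest = []
    for _ in range(n_trees):
        sample = bootstrap_sample(rows, rng)
        feature_index = rng.randrange(n_features)  # random feature per tree
        forest.append(train_stump(sample, feature_index))
    return forest


def predict(forest, features):
    """Majority vote over all stumps."""
    votes = Counter()
    for feature_index, t, left_label, right_label in forest:
        side = left_label if features[feature_index] <= t else right_label
        votes[side] += 1
    return votes.most_common(1)[0][0]


# Hypothetical pulses: ((scaled height, scaled width), class label).
rows = [
    ((1.0, 0.20), "A"), ((1.1, 0.25), "A"), ((0.9, 0.22), "A"), ((1.2, 0.18), "A"),
    ((2.0, 0.60), "B"), ((2.1, 0.55), "B"), ((1.9, 0.65), "B"), ((2.2, 0.58), "B"),
]
forest = train_forest(rows, n_trees=25, seed=1)
print(predict(forest, (1.05, 0.21)), predict(forest, (2.05, 0.60)))
```

Because every stump sees a different resampled version of the training pulses, individual stumps can disagree, and the vote across the ensemble smooths out those differences.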

As a second step, machine learning was employed to estimate viral copy number. The algorithm randomly selected several feature parameters and coupled them to 22 current and time vectors, creating 60 feature vectors for each resistive pulse. Twenty-three of the feature vectors were then used to train classifiers to assign the resistive pulses to a specific virus. This feature-based discrimination showed over 99% accuracy for detection of the five different viral species.


  1. This type of approach using computer algorithms can detect a specific virus within milliseconds.
  2. The method is based on physical rather than biological properties of the viruses and can therefore be used for identification of new strains without the need to develop antibodies/markers.
  3. The measurement protocol is quite simple. It only requires the introduction of the sample into the fluidic channels and the application of a voltage.
  4. Results from different nanopores can be calibrated by constructing a signal database for several nanopores with different open pore conductance.
  5. The method enables selection of the resistive pulses which have similar nanopore conductance to the ones being measured during the machine learning stage.
  6. This technology with machine learning can also be used for detection in human samples and has been used for viral detection in saliva in earlier publications.
  7. The resistive pulses obtained by the method are capable of distinguishing coronavirus from other viruses. Based on machine learning, novel mutated strains may also be identified in future through retraining of the classifiers.
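
Points 4 and 5 can be pictured as a filter over a signal database: before training, keep only the stored pulses that were recorded in pores whose open-pore conductance is close to that of the pore currently in use. The sketch below assumes a hypothetical record layout (the "conductance" and "label" keys), hypothetical conductance values, and a relative tolerance of 10%; none of these come from the study.

```python
def select_matching_pulses(database, measured_conductance, tolerance=0.1):
    """Keep pulses from pores whose open-pore conductance is within a
    relative `tolerance` of the conductance of the pore being measured."""
    limit = tolerance * measured_conductance
    return [
        entry for entry in database
        if abs(entry["conductance"] - measured_conductance) <= limit
    ]


# Hypothetical database entries; conductance in arbitrary units.
database = [
    {"conductance": 92.0, "label": "virus_A"},
    {"conductance": 105.0, "label": "virus_B"},
    {"conductance": 120.0, "label": "virus_A"},
    {"conductance": 80.0, "label": "virus_B"},
]
training_set = select_matching_pulses(database, measured_conductance=100.0)
print([entry["conductance"] for entry in training_set])
```

A classifier trained on the filtered set is then comparing like with like, which is one plausible way to reduce pore-to-pore variation.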

