Development of an HRM-based tool for the automated identification of nucleotide sequences in large datasets

Jean-Christophe Avarre 1,*, Matthieu Vignoles 2, Mathieu Laffont 2, Lise Grewis 2, Christelle Reynes 3,4
1 Institut des Sciences de l’Evolution de Montpellier, UMR IRD-CNRS-UM-EPHE, Montpellier, France
2 IDVet Genetics, Montpellier, France
3 Institute of Functional Genomics, UMR CNRS-INSERM-UM, Montpellier, France
4 Laboratory of Biostatistics, Informatics and Pharmaceutical Physics, UFR Pharmacy, University of Montpellier, France

Abstract
Nucleic acid characterization by High Resolution Melting (HRM) is a simple, flexible, low-cost and powerful technique for identifying sequence variations, making it attractive for a broad range of diagnostic and research applications including infectious diseases, oncology, epigenetics and even metabarcoding. Current procedures for analyzing HRM curves mostly rely on unsupervised methods, principally subtraction (difference) plots against a known control sample, and less commonly on supervised methods such as discriminant analyses. Although these procedures have proven useful for discriminating a small number of variants, they remain limiting for analyzing large HRM data sets and do not provide precise feedback to the user.
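To make the difference-plot approach mentioned above concrete, here is a minimal Python sketch. It is an illustration only, not the procedure of any particular instrument software: the sigmoid melting curves, the temperature range, the melting temperatures and the helper names (`normalize_curve`, `difference_curve`, `synthetic_melt`) are all assumptions made for the example.

```python
import numpy as np

def normalize_curve(fluorescence):
    """Rescale a raw melting curve to the 0-1 range (assumed normalization)."""
    f = np.asarray(fluorescence, dtype=float)
    lo, hi = f.min(), f.max()
    return (f - lo) / (hi - lo)

def difference_curve(sample, control):
    """Subtract the normalized control curve from the normalized sample curve."""
    return normalize_curve(sample) - normalize_curve(control)

# Synthetic data: temperatures from 75 to 95 degrees C in 0.1 degree steps.
temps = np.linspace(75.0, 95.0, 201)

def synthetic_melt(tm, width=0.8):
    """Hypothetical sigmoid melting curve centred on melting temperature tm."""
    return 1.0 / (1.0 + np.exp((temps - tm) / width))

control = synthetic_melt(85.0)   # reference amplicon
variant = synthetic_melt(85.6)   # variant melting slightly later

diff = difference_curve(variant, control)
# Temperature at which the variant deviates most from the control.
peak_temp = temps[np.argmax(np.abs(diff))]
```

A sample identical to the control yields a flat difference curve, whereas a variant produces a peak whose position and height serve as its signature. Each comparison is made against a single control, which is one reason this approach scales poorly to many variants.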
In this context, we have developed an innovative method that enables the simultaneous discrimination of a large number of variants from their HRM profiles. This method relies on the establishment of a melting profile library, computes new descriptors from the HRM curves for optimal discrimination and offers a fully automated analysis of melting profiles. As output, a given melting profile is either assigned to an existing group of the reference library, together with a confidence index, or rejected when the profile is unknown.
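The assign-or-reject logic described above can be sketched as follows. The abstract does not specify the descriptors, the classifier or the definition of the confidence index, so everything here is an assumption chosen for illustration: descriptors are plain numeric vectors, each library group is summarized by its mean vector, assignment uses Euclidean distance, and the hypothetical confidence index decreases linearly with distance up to a rejection threshold `max_distance`.

```python
import numpy as np

def build_library(profiles, labels):
    """Average the reference descriptor vectors of each group into a centroid."""
    profiles = np.asarray(profiles, dtype=float)
    labels = np.asarray(labels)
    return {str(g): profiles[labels == g].mean(axis=0) for g in np.unique(labels)}

def assign(profile, library, max_distance):
    """Return (group, confidence), or (None, 0.0) if no group is close enough.

    The confidence index is a hypothetical one: 1.0 at the centroid,
    falling linearly to 0.0 at the rejection threshold max_distance.
    """
    profile = np.asarray(profile, dtype=float)
    distances = {g: float(np.linalg.norm(profile - c)) for g, c in library.items()}
    best = min(distances, key=distances.get)
    if distances[best] > max_distance:
        return None, 0.0  # unknown profile: reject any assignment
    return best, 1.0 - distances[best] / max_distance

# Toy library with two groups described by two made-up descriptors each.
library = build_library(
    [[0.0, 0.0], [0.0, 2.0], [10.0, 10.0], [10.0, 12.0]],
    ["A", "A", "B", "B"],
)

known = assign([0.0, 1.5], library, max_distance=3.0)    # close to group A
unknown = assign([5.0, 6.0], library, max_distance=3.0)  # far from both groups
```

The rejection threshold is what allows a profile absent from the library to be flagged as unknown instead of being forced into the nearest group.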
This method was first validated on a set of 19 nontuberculous mycobacterial species. Each species was represented by 3–20 biological samples consisting of genomic DNA extracted either from animal tissues or from cultivated isolates. Each sample was amplified with a unique pair of primers targeting the 16S-ITS region and yielding amplicons with sizes ranging from ∼230 to ∼350 bp. Melting profiles of the corresponding amplicons were generated using 3–9 replicates per sample. Out of a total of 95 samples, 91 were assigned to the correct species. Automatic group rationalization led to the splitting of two species into two subgroups, suggesting that this method is able to integrate intraspecific sequence variations.
The method was then applied to develop a diagnostic tool targeting five different pathogens responsible for abortive diseases in cattle. Each pathogen was represented by 10 to 22 different biological samples consisting of genomic DNA extracted from different matrices (e.g. feces, swabs, whole blood). Each biological sample was amplified and melted in 4 replicates with a ready-to-use mastermix containing 5 sets of primers, specific to the targeted pathogens. The limit of detection of this multiplex test was equivalent to that of the current individual tests using hydrolysis probes, around 10 copies/PCR. Moreover, all samples containing at least 10 copies of pathogen(s) were correctly identified with high confidence.
These results underline the high potential of this novel HRM-based method for the simultaneous detection and identification of a large number of nucleotide sequences, in both simplex and multiplex formats.
http://dx.doi.org/10.1016/j.bdq.2017.02.081

