ISSN: 0974-276X
Journal of Proteomics & Bioinformatics


MLVA_Normalizer: Workflow for Normalization of MLVA Profiles and Data Exchange between Laboratories

Paul Bachelerie1, Arnaud Felten2, Marie-Léone Vignaud1, Benjamin Glasset1, Carole Feurer3, Renaud Lailler1 and Sabrina Cadel Six1*

1Université PARIS-EST, ANSES, Laboratory for Food Safety, Listeria, Salmonella, E. coli Unit, 14 rue Pierre et Marie Curie, 94701 Maisons Alfort, France

2Université PARIS-EST, ANSES, Laboratory for Food Safety, Modelling and Quantitative Risk Assessment Unit, 14 rue Pierre et Marie Curie, 94701 Maisons Alfort, France

3French Pig and Pork Institute, 7 Avenue du Général de Gaulle, 94700 Maisons-Alfort, France

*Corresponding Author:
Sabrina Cadel Six
Laboratory for Food Safety, Listeria
Salmonella, E. coli Unit, 14 rue Pierre et Marie Curie
94701 Maisons Alfort, France
Tel: +33 1 49 77 27 19
E-mail: [email protected]

Received Date: December 17, 2015 Accepted Date: February 03, 2016 Published Date: February 06, 2016

Citation: Bachelerie P, Felten A, Vignaud ML, Glasset B, Feurer C, et al. (2016) MLVA_Normalizer: Workflow for Normalization of MLVA Profiles and Data Exchange between Laboratories. J Proteomics Bioinform 9:025-027. doi:10.4172/jpb.1000385

Copyright: © 2016 Bachelerie P, et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.



Motivation: MLVA (Multiple Loci VNTR Analysis) is a typing method currently used to characterize several major pathogens such as Brucella, Mycobacterium tuberculosis and Salmonella. It relies on comparing the sizes of specific genomic loci composed of tandem repeat sequences. Unfortunately, the raw size estimate is instrument-dependent, so results cannot be compared without normalization between the laboratories involved in disease monitoring, surveillance and official controls.

Results: To overcome this problem, we developed a workflow tool, MLVA_normalizer, designed to normalize MLVA results. It can be applied to any bacterial genus and does not depend on the MLVA protocol used.

Availability: MLVA_normalizer is available under the GNU general public license (version 2, June 1991). Source code is available at:


Keywords: MLVA; Workflow; Python 2.7 language; Normalization; Laboratory data exchange


Multiple Loci VNTR Analysis (MLVA) is a method used for typing microorganisms such as pathogenic bacteria. The analysis takes advantage of the genomic polymorphism of tandemly repeated DNA sequences called VNTRs (variable-number tandem repeats) [1]. Use of this typing method has grown significantly over the past decade, in response to the numerous limitations of the standard typing methods and thanks to its ease and speed of application in the field. For Salmonella, for example, the standard typing method is pulsed-field gel electrophoresis (PFGE). This method requires specific technical expertise, is labor-intensive and provides insufficient discrimination within several serovars [2]. Because of its capacity to differentiate closely related strains, MLVA has become a major first-line typing tool for a number of pathogens such as Mycobacterium tuberculosis [3], Bacillus anthracis [4], Brucella [5], Staphylococcus aureus [6], Salmonella [7] and Shiga toxin-producing Escherichia coli [8].

In a typical MLVA assay, a number of VNTR loci are amplified by polymerase chain reaction (PCR) so that the size of each locus can be measured, usually by electrophoresis of the amplification products together with reference DNA fragments (known as DNA size markers). From this size, the number of tandem repeat units (TRs) at each locus can be deduced. The number of TRs for each locus is collected in a code that becomes the MLVA profile of the organism analyzed.
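The deduction of repeat numbers from fragment size can be illustrated with a minimal sketch. It assumes that a locus length is the sum of its flanking regions and a whole number of repeat units, which is consistent with the per-locus values listed in Figure 1 (flanking length and repeat unit length). The function names, the hyphen-separated profile format and the numbers in the usage note are hypothetical and are not taken from the MLVA_normalizer source code.

```python
def repeat_count(locus_length, flanking_length, repeat_unit_length):
    """Number of tandem repeat units implied by a fragment length (all in bp).

    Assumes: locus_length = flanking_length + n * repeat_unit_length.
    """
    return (locus_length - flanking_length) / float(repeat_unit_length)


def mlva_profile(counts):
    """Join per-locus repeat counts into one code (separator is an assumption)."""
    return "-".join(str(int(round(count))) for count in counts)
```

For instance, a 116 bp fragment at a hypothetical locus with 104 bp of flanking sequence and a 6 bp repeat unit gives `repeat_count(116, 104, 6)`, that is, 2 repeat units.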

Unfortunately, depending on the electrophoresis equipment used, errors can appear in the estimated size of VNTRs. The raw size estimate for the same DNA fragment is instrument-dependent and can result in differences in the attribution of allele numbers. Consequently, the MLVA profile obtained for the same strain can differ between laboratories. Capillary equipment usually comes with software that enables allele calling from size bins, but its configuration is often complicated and its ability to correct the raw data is limited. This is why commercial software exists and why, in 2011, ECDC published an Excel file for Salmonella enterica subsp. enterica serovar Typhimurium, a pathogen of major public health interest, to standardize data from laboratories participating in an interlaboratory comparison study of MLVA for this pathogen. In the context of major health problems or foodborne pathogens, precise and rapid characterization is fundamental for the implementation, strengthening and evaluation of health and sanitary policies. We built the MLVA_normalizer workflow to ensure accurate MLVA results and to aid data exchange between laboratories involved in disease monitoring and surveillance.


The MLVA_normalizer workflow was built using Python 2.7. The algorithm was inspired by an Excel file published in ECDC’s Laboratory Standard Operating Procedure for MLVA of Salmonella enterica serotype Typhimurium (2011). The essential prerequisite for running the MLVA_normalizer workflow, as in any assay and quality control, is a reference strain panel for which the lengths of the VNTR loci have been confirmed by sequencing. For several pathogens, such as Mycobacterium tuberculosis [3] and Salmonella serovars Typhimurium [9] and Enteritidis [10], this reference strain table already exists. It includes the names of the reference strains, the correct MLVA profile and the real length of each VNTR locus analyzed. Any laboratory wishing to use the MLVA_normalizer workflow must possess the appropriate panel of reference strains and analyze them together with the sample strains. The electrophoresis data (measured lengths) must be collected for both the reference strains and the sample strains and organized in two input files: i/ the reference.txt input file, containing the data for the reference strains (Figure 1), and ii/ the capillary_electrophoresis.txt input file, containing the data for the sample strains. The capillary_electrophoresis.txt input file must be compiled by the user as shown in Figure 1, and each VNTR name must be spelled exactly as in the reference.txt input file. Both input files must be in .txt format to run in the MLVA_normalizer workflow. The algorithm follows three steps (Figure 1):


Figure 1: Input and output data files and steps of the MLVA normalization workflow algorithm. a: Name of the VNTR locus; b: Length of the regions upstream and downstream of the VNTR locus; c: Length of the tandem repeat unit; d: Length measured by electrophoresis including instrument error; e: Real length confirmed by sequencing.

i) Creation of a correction matrix: This matrix is built from the data in the reference.txt input file. For each VNTR and each reference strain, the difference between the real length (confirmed by sequencing) and the measured length is calculated. A sliding average of these differences is then calculated to obtain the best correction fit for each VNTR, expressed as an equation [11]. The best correction fit is the average of the difference calculated for one repeat unit and that for the previous one (d(n-1) and d(n)).

ii) Creation of a normalization matrix: This matrix is built with the correction matrix obtained above and the capillary_electrophoresis.txt input file. The measured lengths, obtained for the sample strains, are corrected with the correction fit to obtain normalized results.

iii) Choosing the correct number of tandem repeats: This step is performed by comparing, for each VNTR, the normalized results obtained above with the corrected length recorded in the reference.txt input file. Four possibilities are considered: (a) If the normalized result gives the same length as the flanking size (Figure 1), no tandem repeats are present and the MLVA profile will be 0. (b) If the normalized result is equal to 0, no PCR product is present, so the MLVA profile will be -2, in accordance with a previously published convention [1]. (c) If the normalized result is within two base pairs (plus or minus) of the corrected length, the same number of tandem repeats as in the reference strain is assigned (see “strain_1 in Results.txt”, Figure 1: (110-2) < 111 < (110+2) => 1 TR, as in the reference strain). (d) If the condition in point (c) is not met, a warning (“Check”) is displayed, meaning that the value obtained must be verified. Three checking messages can appear; they are illustrated in the results_new_algo.txt file. For example, for the VNTR called STTR9, the checking message [-∞; 2.0] means that the measured fragment is smaller than the smallest size previously found, in this case 2 repeat units. In the same way, the message [9.0;+∞] means that the fragment is longer than the longest size previously observed, in this case 9 repeat units. Finally, the message [3.0; 4.0] indicates that the fragment size falls between two alleles, in this case alleles 3 and 4, and cannot be assigned to either. This can have many causes, and the analysis should be repeated.
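The three steps above can be sketched for a single VNTR as follows. This is an illustrative reading of the description, not the authors' code, and it rests on several assumptions that the text does not settle: reference rows are given as (repeats, real length, measured length) tuples; the first allele, which has no predecessor, uses its own difference as correction; a sample is corrected with the value of the reference allele whose measured length is nearest; the two-bp tolerance is inclusive and is also applied to the flanking-size comparison; and the infinity symbol in the checking messages is written as "inf". All function names and numbers are hypothetical.

```python
def build_correction(reference_rows):
    """Step i: sliding average of (real - measured) over consecutive alleles.

    reference_rows: iterable of (repeats, real_length, measured_length).
    Returns a list of (measured_length, correction) pairs.
    """
    rows = sorted(reference_rows)
    diffs = [real - measured for _, real, measured in rows]
    corrections = []
    for i, (_, _, measured) in enumerate(rows):
        # Assumption: the first allele has no predecessor, so it reuses d(n).
        previous = diffs[i - 1] if i > 0 else diffs[i]
        corrections.append((measured, (previous + diffs[i]) / 2.0))
    return corrections


def normalize(measured, corrections):
    """Step ii: correct one measured sample length (0 means no PCR product)."""
    if measured == 0:
        return 0.0
    # Assumption: use the correction of the nearest reference measurement.
    _, shift = min(corrections, key=lambda pair: abs(pair[0] - measured))
    return measured + shift


def call_allele(normalized, real_lengths, flanking, tolerance=2.0):
    """Step iii: assign a repeat number or return a "Check" message.

    real_lengths: dict mapping repeats -> sequence-confirmed length.
    """
    if normalized == 0:
        return -2  # no PCR product
    if abs(normalized - flanking) <= tolerance:
        return 0  # flanking regions only, no repeats
    alleles = sorted(real_lengths.items())
    for repeats, real in alleles:
        if abs(normalized - real) <= tolerance:
            return repeats
    if normalized < alleles[0][1]:
        return "Check [-inf; %.1f]" % alleles[0][0]
    if normalized > alleles[-1][1]:
        return "Check [%.1f; +inf]" % alleles[-1][0]
    lower = max(r for r, real in alleles if real < normalized)
    upper = min(r for r, real in alleles if real > normalized)
    return "Check [%.1f; %.1f]" % (lower, upper)
```

With hypothetical reference rows `[(1, 110, 108), (2, 116, 113), (3, 122, 118)]` and a flanking size of 104 bp, a sample measured at 109 bp is normalized to 111 bp and assigned 1 TR, which mirrors the (110-2) < 111 < (110+2) example of Figure 1.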

The results.txt output file will contain the normalized MLVA profiles for the strains analyzed. The Python scripts and the user and technical documentation can be found on GitHub (

To validate the workflow, we compared 215 MLVA profiles for Salmonella enterica serovar Typhimurium with results from the ECDC spreadsheet cited above. To date, our laboratory has analyzed more than 800 MLVA profiles for Salmonella Typhimurium, Enteritidis and Dublin with this workflow.


Use of this workflow depends on the availability of a panel of strains (known as reference strains) for which the size of the alleles at every locus is known. The importance of having this set of calibration strains was explained in detail by Larson in 2013 [12]. This author used analyses from 20 international laboratories to show that, without a set of reference strains, comparable results cannot be obtained between laboratories [12]. The workflow proposed in this study can name alleles in new isolates only if the size of these alleles is represented in the reference set. If the size of an allele lies outside the set of alleles of the reference strains, the allele is considered “new” and must be verified by sequencing. Another condition for using MLVA_normalizer relates to the distance between alleles: it must be greater than five base pairs (bp). This is because two bp (plus or minus) is the margin of error accepted in the calculation for naming alleles. Hence, the workflow fails if the typing scheme uses repeats shorter than five bp. This is the case for some VNTRs, such as SAL20 for Salmonella and O157-3, O157-25 or O157-17 for E. coli [8,13]. However, there is an international scientific consensus recommending that repeat units shorter than five bp not be included in a subtyping protocol, because of the limited sizing reproducibility of capillary electrophoresis platforms [1,8].
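The arithmetic behind the repeat-length condition can be checked with a short sketch. It assumes inclusive windows of plus or minus two bp around each allele, so that the windows of two adjacent alleles share at least one size whenever the repeat unit is four bp or shorter. The five-bp threshold from the text is passed as a parameter, and the locus names and repeat lengths in the usage note are hypothetical rather than published values.

```python
def tolerance_windows_overlap(repeat_unit_length, tolerance=2):
    """True if the +/- tolerance windows of adjacent alleles share a size.

    Adjacent alleles differ by one repeat unit, so their inclusive windows
    overlap when that distance is at most twice the tolerance.
    """
    return repeat_unit_length <= 2 * tolerance


def flag_short_repeats(scheme, minimum_unit=5):
    """Names of loci whose repeat unit is shorter than minimum_unit (bp).

    scheme: dict mapping locus name -> repeat unit length in bp.
    """
    return sorted(name for name, unit in scheme.items() if unit < minimum_unit)
```

For a hypothetical scheme `{"locus_A": 6, "locus_B": 3}`, `flag_short_repeats` returns only `locus_B`, the locus that would need to be excluded or sized by another method.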

MLVA_normalizer is an easy-to-use tool that processes reference values and electrophoresis data stored in .txt format in a manner better suited to automation (high throughput, large data sets) and more versatile (no manual filling of spreadsheets required) than existing tools such as the ECDC spreadsheet. The tool can evolve with the user: although we provide .txt files for the analysis of Salmonella, MLVA_normalizer can be used for any other organism, provided that a panel of reference strains is available (as discussed above) and that the repeat units analyzed are longer than five bp.

Finally, MLVA_normalizer is a free tool that allows the normalization of MLVA results regardless of the organism analyzed and the protocol used. In the Food Safety Laboratory of ANSES, we use the MLVA_normalizer workflow routinely to normalize the MLVA profiles of Salmonella strains such as S. Typhimurium, S. Enteritidis and S. Dublin for monitoring and surveillance, outbreak investigations and official controls.


Research reported in this publication was supported by funds from the Ministère de l’Agriculture, de l’Agroalimentaire et de la Forêt and the Association de Coordination Technique pour l’Industrie Agro-Alimentaire (ACTIA-UMT ARMADA).

Conflict of Interest

None declared.

