FastMLST: A Multi-core Tool for Multilocus Sequence Typing of Draft Genome Assemblies

dc.contributor.author: Guerrero-Araya, Enzo
dc.contributor.author: Muñoz, Marina
dc.contributor.author: Rodríguez, César
dc.contributor.author: Paredes-Sabja, Daniel
dc.date.accessioned: 2025-03-18T21:22:24Z
dc.date.available: 2025-03-18T21:22:24Z
dc.date.issued: 2021-11
dc.description: Indexing: Scopus.
dc.description.abstract: Multilocus Sequence Typing (MLST) is a precise microbial typing approach at the intra-species level for epidemiologic and evolutionary purposes. It operates by assigning a sequence type (ST) identifier to each specimen, based on a combination of alleles of multiple housekeeping genes included in a defined scheme. The use of MLST has multiplied due to the availability of large numbers of genomic sequences and epidemiologic data in public repositories. However, data processing speed has become problematic due to the massive size of modern datasets. Here, we present FastMLST, a tool that is designed to perform PubMLST searches using BLASTn and a divide-and-conquer approach that processes each genome assembly in parallel. The output offered by FastMLST includes a table with the ST, allelic profile, and clonal complex or clade (when available) detected for a query, as well as a multi-FASTA file or a series of FASTA files with the concatenated or single allele sequences detected, respectively. FastMLST was validated with 91 different species, with a wide range of guanine-cytosine content (%GC), genome sizes, and fragmentation levels, and a speed test was performed on 3 datasets with varying genome sizes. Compared with other tools such as mlst, CGE/MLST, MLSTar, and PubMLST, FastMLST takes advantage of multiple processors to simultaneously type up to 28 000 genomes in less than 10 minutes, reducing processing times by at least 3-fold with 100% concordance to PubMLST, if contaminated genomes are excluded from the analysis. The source code, installation instructions, and documentation of FastMLST are available at https://github.com/EnzoAndree/FastMLST. © The Author(s) 2021.
dc.description.uri: https://journals-sagepub-com.recursosbiblioteca.unab.cl/doi/10.1177/11779322211059238
dc.identifier.citation: Bioinformatics and Biology Insights, Volume 15, November 2021
dc.identifier.doi: 10.1177/11779322211059238
dc.identifier.issn: 1177-9322
dc.identifier.uri: https://repositorio.unab.cl/handle/ria/63814
dc.language.iso: en
dc.publisher: SAGE Publications Inc.
dc.rights.license: Attribution-NonCommercial 4.0 International (CC BY-NC 4.0)
dc.rights.uri: https://creativecommons.org/licenses/by-nc/4.0/
dc.subject: Divide-and-conquer approach
dc.subject: Genome analysis
dc.subject: Microbial typing
dc.subject: MLST
dc.subject: Parallel computing
dc.title: FastMLST: A Multi-core Tool for Multilocus Sequence Typing of Draft Genome Assemblies
dc.type: Article
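
The abstract describes two ideas that a small example may make concrete: an ST is looked up from the combination of alleles found at each locus of a scheme, and each genome assembly is typed independently so the work can run in parallel. The following is a minimal sketch under stated assumptions, not FastMLST's implementation: the three-locus scheme, the profile table, the genome names, and the functions `assign_st` and `type_genomes` are all hypothetical, the allele calls are given directly instead of being detected with BLASTn, and a thread pool stands in for whatever multi-core mechanism the tool actually uses.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical 3-locus scheme: allelic profile -> sequence type (ST).
# Real schemes are defined by PubMLST; these values are made up.
LOCI = ("adk", "atpA", "dxr")
PROFILE_TO_ST = {
    (1, 1, 1): 1,
    (1, 2, 1): 2,
    (3, 2, 5): 7,
}


def assign_st(allele_calls):
    """Return the ST for one genome.

    Returns None when a locus has no allele call or when the
    profile is not present in the scheme table.
    """
    profile = tuple(allele_calls.get(locus) for locus in LOCI)
    if None in profile:
        return None
    return PROFILE_TO_ST.get(profile)


def type_genomes(genomes, workers=4):
    """Type every genome independently, one task per genome."""
    names = list(genomes)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order, so names and STs stay aligned.
        sts = list(pool.map(assign_st, (genomes[name] for name in names)))
    return dict(zip(names, sts))


# Hypothetical allele calls, as if already extracted from each assembly.
GENOMES = {
    "genome_A": {"adk": 1, "atpA": 2, "dxr": 1},
    "genome_B": {"adk": 3, "atpA": 2, "dxr": 5},
    "genome_C": {"adk": 1, "atpA": 1},           # dxr not found
    "genome_D": {"adk": 9, "atpA": 9, "dxr": 9}, # profile not in scheme
}

RESULT = type_genomes(GENOMES)
```

Because no genome depends on the result of another, the per-genome tasks can be distributed across workers without coordination, which is the property the divide-and-conquer description relies on.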
Files
Original bundle
Name: Guerrero_Fastmlst-a-multi-core.pdf
Size: 378.26 KB
Format: Adobe Portable Document Format
Description: Full text in English