KMCP: accurate metagenomic profiling of both prokaryotic and viral populations by pseudo-mapping

Author full names:"Shen, Wei; Xiang, Hongyan; Huang, Tianquan; Tang, Hui; Peng, Mingli; Cai, Dachuan; Hu, Peng; Ren, Hong"

Author addresses:"[Shen, Wei; Xiang, Hongyan; Huang, Tianquan; Tang, Hui; Peng, Mingli; Cai, Dachuan; Hu, Peng; Ren, Hong] Chongqing Med Univ, Inst Viral Hepatitis, Affiliated Hosp 2, Dept Infect Dis, Minist Educ, Key Lab Mol Biol Infe, Chongqing 400010, Peoples R China"

Corresponding authors:"Shen, W; Hu, P; Ren, H (corresponding author), Chongqing Med Univ, Inst Viral Hepatitis, Affiliated Hosp 2, Dept Infect Dis, Minist Educ, Key Lab Mol Biol Infe, Chongqing 400010, Peoples R China."

Source:BIOINFORMATICS

ESI subject category:BIOLOGY & BIOCHEMISTRY

WOS accession number:WOS:001025519200030

JCR quartile:Q1

Impact factor:5.8

Year:2023

Volume:39

Issue:1

Start page: 

End page: 

Document type:Article

Keywords: 

Abstract:"Motivation: The growing number of microbial reference genomes enables the improvement of metagenomic profiling accuracy but also imposes greater requirements on the indexing efficiency, database size and runtime of taxonomic profilers. Additionally, most profilers focus mainly on bacterial, archaeal and fungal populations, while less attention is paid to viral communities. Results: We present KMCP (K-mer-based Metagenomic Classification and Profiling), a novel k-mer-based metagenomic profiling tool that utilizes genome coverage information by splitting the reference genomes into chunks and stores k-mers in a modified and optimized Compact Bit-Sliced Signature Index for fast alignment-free sequence searching. KMCP combines k-mer similarity and genome coverage information to reduce the false positive rate of k-mer-based taxonomic classification and profiling methods. Benchmarking results based on simulated and real data demonstrate that KMCP, despite a longer running time than some other methods, not only allows the accurate taxonomic profiling of prokaryotic and viral populations but also provides more confident pathogen detection in clinical samples of low depth. Availability and implementation: The software is open-source under the MIT license and available at https://github.com/shenwei356/kmcp. Contact: shenwei356@cqmu.edu.cn or hp_cq@163.com or hupengcq@hospital.cqmu.edu.cn or renhong0531@cqmu.edu.cn Supplementary information: Supplementary data are available at Bioinformatics online."
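
Illustrative note: the abstract describes combining k-mer similarity with genome coverage over reference chunks to suppress false positives. The Python sketch below is a toy illustration of that general idea only; it is not KMCP's implementation. It uses plain sets in place of the Compact Bit-Sliced Signature Index, and every name and threshold in it (`split_into_chunks`, `build_index`, `profile`, `min_query_cov`, `min_chunk_fraction`) is a hypothetical choice made for this example. Chunks are cut without overlap, so k-mers spanning chunk boundaries are ignored here.

```python
from typing import Dict, List, Set


def kmers(seq: str, k: int) -> Set[str]:
    """Return the set of distinct k-mers in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def split_into_chunks(genome: str, n_chunks: int) -> List[str]:
    """Split a genome into roughly n_chunks equal, non-overlapping pieces."""
    size = -(-len(genome) // n_chunks)  # ceiling division
    return [genome[i:i + size] for i in range(0, len(genome), size)]


def build_index(genomes: Dict[str, str], k: int,
                n_chunks: int) -> Dict[str, List[Set[str]]]:
    """Store one k-mer set per chunk per reference (assumption: plain sets)."""
    return {name: [kmers(chunk, k) for chunk in split_into_chunks(g, n_chunks)]
            for name, g in genomes.items()}


def profile(reads: List[str], index: Dict[str, List[Set[str]]], k: int,
            min_query_cov: float = 0.55,
            min_chunk_fraction: float = 0.8) -> List[str]:
    """Report references whose chunks are broadly covered by matching reads.

    A read matches a chunk when at least min_query_cov of the read's k-mers
    occur in that chunk (k-mer similarity). A reference is reported only when
    at least min_chunk_fraction of its chunks were matched (coverage).
    Both thresholds are illustrative values.
    """
    hits: Dict[str, Set[int]] = {name: set() for name in index}
    for read in reads:
        query = kmers(read, k)
        if not query:
            continue  # read shorter than k
        for name, chunks in index.items():
            for i, chunk_kmers in enumerate(chunks):
                if len(query & chunk_kmers) / len(query) >= min_query_cov:
                    hits[name].add(i)
    return [name for name, chunks in index.items()
            if len(hits[name]) / len(chunks) >= min_chunk_fraction]
```

In this sketch, a reference that receives reads in only one region (for example, a conserved or contaminating segment) is matched in a single chunk and falls below the chunk-fraction threshold, while a reference that is genuinely present tends to be matched across most of its chunks.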

Funding agencies:National Natural Science Foundation of China [32000474]; China Postdoctoral Science Foundation [2021M700640]; Chongqing Talents Project [cstc2021ycjh-bgzxm0150]

Funding text:This work was supported by the National Natural Science Foundation of China [32000474 to W.S.]; the China Postdoctoral Science Foundation [2021M700640 to W.S.]; and the Chongqing Talents Project [cstc2021ycjh-bgzxm0150 to P.H.].