MetaRNN
MetaRNN and MetaRNN-indel
MetaRNN and MetaRNN-indel are pathogenicity prediction scores for human nonsynonymous SNVs (nsSNVs) and non-frameshift (NF) indels. They integrate information from 28 high-level annotation scores: 16 functional prediction scores (SIFT, Polyphen2_HDIV, Polyphen2_HVAR, MutationAssessor, PROVEAN, VEST4, M-CAP, REVEL, MutPred, MVP, PrimateAI, DEOGEN2, CADD, fathmm-XF, Eigen and GenoCanyon), 8 conservation scores (GERP, phyloP100way_vertebrate, phyloP30way_mammalian, phyloP17way_primate, phastCons100way_vertebrate, phastCons30way_mammalian, phastCons17way_primate and SiPhy), and 4 allele frequency sources (the 1000 Genomes Project, ExAC, gnomAD exome, and gnomAD genome). These annotations are combined into an ensemble prediction model built on a deep recurrent neural network (RNN). The final prediction is the likelihood that an nsSNV or NF indel is pathogenic.
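The 28 input annotations can be laid out as a fixed-order feature list. The sketch below is only an illustration of that grouping, not the authors' code: the names `FEATURES` and `feature_vector` are hypothetical, the allele frequency labels are descriptive placeholders rather than real column names, and representing a missing annotation as NaN is an assumption.

```python
import math

# 16 functional prediction scores, as listed in the description.
FUNCTIONAL = [
    "SIFT", "Polyphen2_HDIV", "Polyphen2_HVAR", "MutationAssessor",
    "PROVEAN", "VEST4", "M-CAP", "REVEL", "MutPred", "MVP",
    "PrimateAI", "DEOGEN2", "CADD", "fathmm-XF", "Eigen", "GenoCanyon",
]

# 8 conservation scores.
CONSERVATION = [
    "GERP", "phyloP100way_vertebrate", "phyloP30way_mammalian",
    "phyloP17way_primate", "phastCons100way_vertebrate",
    "phastCons30way_mammalian", "phastCons17way_primate", "SiPhy",
]

# 4 allele frequency sources (descriptive labels, not real column names).
ALLELE_FREQUENCY = ["1000 Genomes Project", "ExAC", "gnomAD exome", "gnomAD genome"]

FEATURES = FUNCTIONAL + CONSERVATION + ALLELE_FREQUENCY


def feature_vector(annotations):
    """Return the annotations in fixed FEATURES order; absent ones become NaN."""
    return [float(annotations.get(name, math.nan)) for name in FEATURES]
```

Keeping the order fixed means every variant yields a vector of the same length, whichever annotations happen to be available for it.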
We provide predictions for all potential nsSNVs (~86 million) in the dbNSFP database for rapid and user-friendly analysis. Stand-alone source code and an executable for the Linux environment are available on GitHub. The executable takes a standard VCF file as input and outputs pathogenicity scores for nsSNVs and NF indels in a transcript-specific manner (supported by ANNOVAR). The average prediction time for a single NF indel is ~0.2 seconds, which supports timely large-scale prediction.
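Because scores are reported per transcript, a single variant can carry several values. The following is a minimal sketch of how such a field might be handled, assuming the values arrive as one semicolon-separated string with "." marking a missing score (a convention modeled on dbNSFP-style files; check the actual output format before relying on it). The function names are hypothetical and not part of the MetaRNN distribution.

```python
def parse_transcript_scores(field, sep=";", missing="."):
    """Split a per-transcript score field into floats; missing entries become None."""
    scores = []
    for token in field.split(sep):
        token = token.strip()
        scores.append(None if token in ("", missing) else float(token))
    return scores


def max_score(field):
    """Return the highest score across transcripts, or None if all are missing."""
    values = [s for s in parse_transcript_scores(field) if s is not None]
    return max(values) if values else None
```

Taking the maximum across transcripts is one common way to summarize a variant; keeping the full per-transcript list is preferable when a specific transcript is of interest.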
Please cite:
Li C, Zhi D, Wang K, Liu X (2021) MetaRNN: Differentiating Rare Pathogenic and Rare Benign Missense SNVs and InDels Using Deep Learning. bioRxiv. https://doi.org/10.1101/2021.04.09.438706.
Office: 3720 Spectrum Boulevard, Suite 304, Tampa, FL 33612
Telephone: 813-974-9865