5 November 2021
Jeffrey M. Kidd
University of Michigan

This directory contains indel and SNP callsets for 1,987 samples processed by the Dog10K Consortium.

Variants are divided between SNPs and non-SNPs and separated between the autosomes (chr1-chr38 + chrX-PAR) and the non-PAR region of chrX.

SNPs and non-SNPs were separated using GATK SelectVariants (version 4.2.0.0).

Sample commands:

Select the SNPs:

gatk --java-options "-Xmx4g" SelectVariants \
    -V chrom/AuotsandXPAR.combined.vcf.gz \
    -O vqsr/AutoAndXPAR.SNPs.vcf.gz \
    -select-type SNP

Select the non-SNPs:

gatk --java-options "-Xmx4g" SelectVariants \
    -V chrom/AuotsandXPAR.combined.vcf.gz \
    -O vqsr/AutoAndXPAR.nonSNPs.vcf.gz \
    -xl-select-type SNP

For the non-SNPs, hard filters were applied to identify a higher-confidence subset of variants.

Sample command for filtering indels:

gatk --java-options "-Xmx4g" VariantFiltration \
    -V vqsr/AutoAndXPAR.nonSNPs.vcf.gz \
    -O vqsr/AutoAndXPAR.nonSNPs.filter.vcf.gz \
    --verbosity ERROR \
    -filter "QD < 2.0" --filter-name "QD2" \
    -filter "FS > 200.0" --filter-name "FS200" \
    -filter "ReadPosRankSum < -2.0" --filter-name "ReadPosRankSum-2" \
    -filter "SOR > 10.0" --filter-name "SOR-10"

A high-confidence subset of SNPs was selected with the VQSR procedure (GATK version 4.2.0.0), using sites on the CanineHD and Axiom_K9_HD SNP arrays as the training set.
See https://github.com/jmkidd/dogmap for more details.

VariantRecalibrator command for the Autosomes:

gatk --java-options "-Xmx59g" VariantRecalibrator \
    -R UU_Cfam_GSD_1.0_ROSY.fa \
    -V vqsr/AutoAndXPAR.SNPs.vcf.gz \
    -O vqsr/AutoAndXPAR.SNPs.recal \
    -resource:array,known=false,training=true,truth=true,prior=12.0 SRZ189891_722g.simp.header.CanineHDandAxiom_K9_HD.GSD_1.0.vcf.gz \
    --use-annotation QD --use-annotation MQ --use-annotation MQRankSum \
    --use-annotation ReadPosRankSum --use-annotation FS --use-annotation SOR \
    --use-annotation DP \
    --trust-all-polymorphic true \
    -mode SNP \
    --rscript-file vqsr/AutoAndXPAR.SNPs.plots.R \
    --tranches-file vqsr/AutoAndXPAR.SNPs.tranches \
    -tranche 100.0 \
    -tranche 99.9 \
    -tranche 99.0 \
    -tranche 98.0 \
    -tranche 97.0 \
    -tranche 96.0 \
    -tranche 95.0 \
    -tranche 94.0 \
    -tranche 93.0 \
    -tranche 92.0 \
    -tranche 91.0 \
    -tranche 90.0

VariantRecalibrator command for the chrX-nonPAR:

gatk VariantRecalibrator \
    -R /home/jmkidd/links/kidd-lab/genomes/UU_Cfam_GSD_1.0/ref-Y/UU_Cfam_GSD_1.0_ROSY.fa \
    -V vqsr/chrX.NONPAR.SNPs.vcf.gz \
    -O vqsr/chrX.NONPAR.SNPs.output.recal \
    -resource:array,known=false,training=true,truth=true,prior=12.0 SRZ189891_722g.simp.header.CanineHDandAxiom_K9_HD.GSD_1.0.vcf.gz \
    --use-annotation QD --use-annotation MQ --use-annotation MQRankSum \
    --use-annotation ReadPosRankSum --use-annotation FS --use-annotation SOR \
    --use-annotation DP \
    --trust-all-polymorphic true \
    -mode SNP \
    --max-gaussians 4 \
    --rscript-file X.output.plots.R \
    --tranches-file X.snp.output.tranches \
    -tranche 100.0 \
    -tranche 99.9 \
    -tranche 99.0 \
    -tranche 98.0 \
    -tranche 97.0 \
    -tranche 96.0 \
    -tranche 95.0 \
    -tranche 94.0 \
    -tranche 93.0 \
    -tranche 92.0 \
    -tranche 91.0 \
    -tranche 90.0

ApplyVQSR was then used to select the 99.0% tranche of variants.
This resulted in a total of 33,374,496* SNPs on the autosomes+X-PAR and 1,191,860 SNPs on the X-nonPAR that pass the criteria.

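The ApplyVQSR invocation itself is not recorded in this file. As a rough sketch only, a command of the following shape would apply a 99.0% truth-sensitivity cutoff using the recal and tranches files written by VariantRecalibrator. The wrapper function apply_vqsr_snps, the GATK variable, the -Xmx4g setting, and the output file name in the usage comment are illustrative assumptions, not the commands that were actually run.

```shell
# Hypothetical helper (not part of the original pipeline): apply a 99.0%
# truth-sensitivity cutoff to a SNP callset.
#   $1 = input VCF, $2 = recal file, $3 = tranches file, $4 = output VCF
# GATK can be set to override the gatk executable (assumed convenience).
apply_vqsr_snps() {
    ${GATK:-gatk} --java-options "-Xmx4g" ApplyVQSR \
        -R UU_Cfam_GSD_1.0_ROSY.fa \
        -V "$1" \
        --recal-file "$2" \
        --tranches-file "$3" \
        --truth-sensitivity-filter-level 99.0 \
        -mode SNP \
        -O "$4"
}

# Example call for the autosomes + chrX-PAR (output name is hypothetical):
# apply_vqsr_snps vqsr/AutoAndXPAR.SNPs.vcf.gz \
#     vqsr/AutoAndXPAR.SNPs.recal \
#     vqsr/AutoAndXPAR.SNPs.tranches \
#     vqsr/AutoAndXPAR.SNPs.vqsr99.vcf.gz
```

ApplyVQSR marks sites outside the requested tranche in the FILTER column rather than deleting them, so records passing the cutoff are those with FILTER set to PASS.
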
Update/Note June 2023:
Note that the final PASS set for the autosomes actually contains 33,374,690 SNPs, which is 194 more than indicated by the tranche tables. We believe this is due to sites with a VQSRTranche score of exactly 99.00 and ambiguities as to how they are counted. These sites are included in our PASS files, and the correct total number of SNPs is 33,374,690 + 1,191,860 = 34,566,550.

For reference, the VQSR tranches tables are shown below.

Autosomes + chrX-PAR

targetTruthSensitivity,numKnown,numNovel,knownTiTv,novelTiTv,minVQSLod,filterName,model,accessibleTruthSites,callsAtTruthSites,truthSensitivity
90.00,0,14812850,0.0000,2.3377,19.6028,VQSRTrancheSNP0.00to90.00,SNP,602478,542230,0.9000
91.00,0,16831393,0.0000,2.3213,16.6877,VQSRTrancheSNP90.00to91.00,SNP,602478,548254,0.9100
92.00,0,19110750,0.0000,2.3019,12.9640,VQSRTrancheSNP91.00to92.00,SNP,602478,554279,0.9200
93.00,0,21943738,0.0000,2.2787,7.1741,VQSRTrancheSNP92.00to93.00,SNP,602478,560304,0.9300
94.00,0,22164983,0.0000,2.2769,6.6668,VQSRTrancheSNP93.00to94.00,SNP,602478,566329,0.9400
95.00,0,22603234,0.0000,2.2732,5.8359,VQSRTrancheSNP94.00to95.00,SNP,602478,572354,0.9500
96.00,0,23721556,0.0000,2.2659,4.3070,VQSRTrancheSNP95.00to96.00,SNP,602478,578378,0.9600
97.00,0,24067456,0.0000,2.2632,3.6897,VQSRTrancheSNP96.00to97.00,SNP,602478,584403,0.9700
98.00,0,28352686,0.0000,2.2551,1.8485,VQSRTrancheSNP97.00to98.00,SNP,602478,590428,0.9800
99.00,0,33374496,0.0000,2.2397,0.4378,VQSRTrancheSNP98.00to99.00,SNP,602478,596453,0.9900
99.90,0,47482193,0.0000,2.1180,-8.1887,VQSRTrancheSNP99.00to99.90,SNP,602478,601875,0.9990
100.00,0,52963968,0.0000,2.0000,-35264.5960,VQSRTrancheSNP99.90to100.00,SNP,602478,602478,1.0000

chrX-nonPAR

targetTruthSensitivity,numKnown,numNovel,knownTiTv,novelTiTv,minVQSLod,filterName,model,accessibleTruthSites,callsAtTruthSites,truthSensitivity
90.00,0,946300,0.0000,1.7886,-4.2554,VQSRTrancheSNP0.00to90.00,SNP,10680,9612,0.9000
91.00,0,954982,0.0000,1.7890,-4.3673,VQSRTrancheSNP90.00to91.00,SNP,10680,9718,0.9099
92.00,0,966833,0.0000,1.7891,-4.5198,VQSRTrancheSNP91.00to92.00,SNP,10680,9825,0.9199
93.00,0,980000,0.0000,1.7884,-4.6939,VQSRTrancheSNP92.00to93.00,SNP,10680,9932,0.9300
94.00,0,994567,0.0000,1.7878,-4.8874,VQSRTrancheSNP93.00to94.00,SNP,10680,10039,0.9400
95.00,0,1011265,0.0000,1.7865,-5.1246,VQSRTrancheSNP94.00to95.00,SNP,10680,10146,0.9500
96.00,0,1038293,0.0000,1.7843,-5.5538,VQSRTrancheSNP95.00to96.00,SNP,10680,10252,0.9599
97.00,0,1074339,0.0000,1.7799,-6.2276,VQSRTrancheSNP96.00to97.00,SNP,10680,10359,0.9699
98.00,0,1124766,0.0000,1.7717,-7.3677,VQSRTrancheSNP97.00to98.00,SNP,10680,10466,0.9800
99.00,0,1191860,0.0000,1.7572,-9.3431,VQSRTrancheSNP98.00to99.00,SNP,10680,10573,0.9900
99.90,0,1624098,0.0000,1.6796,-70.9023,VQSRTrancheSNP99.00to99.90,SNP,10680,10669,0.9990
100.00,0,1667705,0.0000,1.6552,-39828.8691,VQSRTrancheSNP99.90to100.00,SNP,10680,10680,1.0000
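
The counts in the June 2023 note can be re-derived from the 99.00 rows of the two tranche tables. The short sketch below copies those two rows verbatim, takes the third field (numNovel) as the tranche SNP count, and compares it with the PASS count reported in the note. The variable names are illustrative.

```shell
# 99.00 rows copied from the two tranche tables in this file.
auto_row='99.00,0,33374496,0.0000,2.2397,0.4378,VQSRTrancheSNP98.00to99.00,SNP,602478,596453,0.9900'
x_row='99.00,0,1191860,0.0000,1.7572,-9.3431,VQSRTrancheSNP98.00to99.00,SNP,10680,10573,0.9900'

# Field 3 (numNovel) is the cumulative SNP count at that tranche.
auto_tranche=$(echo "$auto_row" | cut -d, -f3)
x_tranche=$(echo "$x_row" | cut -d, -f3)

# PASS count for the autosomes + chrX-PAR as reported in the June 2023 note.
auto_pass=33374690

extra=$((auto_pass - auto_tranche))   # sites beyond the tranche-table count
total=$((auto_pass + x_tranche))      # all PASS SNPs across both callsets

echo "extra=$extra total=$total"
# prints: extra=194 total=34566550
```
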