quicKmer-2 index for hs37d5
4 February 2021

The genome reference was downloaded from:
ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/technical/reference/phase2_reference_assembly_sequence/

The fasta headers were trimmed so that no text follows the record name.
For example:
    >1 dna:chromosome chromosome:GRCh37:1:1:249250621:1
was renamed to:
    >1

For the control regions, we excluded known CNVs (not MEIs or inversions)
listed in DGV:
http://dgv.tcag.ca/dgv/docs/GRCh37_hg19_variants_2020-02-25.txt
Only CNVs from the following studies were used:
    1000_Genomes_Consortium_Phase_1
    1000_Genomes_Consortium_Phase_3
    Conrad_et_al_2009
    McCarroll_et_al_2008
    Sudmant_et_al_2013

We also removed annotated segmental duplications based on the hg19 track
at UCSC, and further excluded all chromosomes other than 1-22.

The exclusion file was converted to an inclusion mask using bedtools.

The index was then made using:
    quicKmer2 search -k 30 -t 24 -s 3G -e 2 -d 100 -w 1000 -c ref/include-control.bed ref/hs37d5.fa

The resulting files were gzipped for posting on this web server.
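
The header trimming and the exclusion-to-inclusion conversion described above can be sketched roughly as follows. This is an illustration only, not the exact commands that were run: the function names and the two-column genome file are assumptions, and only the use of bedtools for the mask conversion is stated in this README. The second function assumes bedtools is installed.

```shell
#!/usr/bin/env bash
# Illustrative sketch; function names and input files are hypothetical.

# Keep only the record name on each fasta header line, so that
# ">1 dna:chromosome ..." becomes ">1". Sequence lines pass through unchanged.
strip_fasta_header_text() {
    awk '/^>/ { print $1; next } { print }'
}

# Convert an exclusion BED into an inclusion BED limited to chromosomes 1-22.
# $1: exclusion BED (tab-separated)
# $2: genome file with two tab-separated columns (name, length), for example
#     the first two columns of a samtools faidx .fai index (an assumption).
exclusion_to_inclusion() {
    local exclude_bed=$1 genome=$2
    local autosomes
    autosomes=$(mktemp)
    # Keep autosomes only; sort by name so the order matches the BED below.
    awk -F'\t' '$1 ~ /^([1-9]|1[0-9]|2[0-2])$/' "$genome" \
        | LC_ALL=C sort -k1,1 > "$autosomes"
    # Sort and merge the excluded intervals, then take their complement.
    awk -F'\t' '$1 ~ /^([1-9]|1[0-9]|2[0-2])$/' "$exclude_bed" \
        | LC_ALL=C sort -k1,1 -k2,2n \
        | bedtools merge -i stdin \
        | bedtools complement -i stdin -g "$autosomes"
    rm -f "$autosomes"
}
```

For example, `strip_fasta_header_text < original.fa > ref/hs37d5.fa` would write the trimmed reference, and `exclusion_to_inclusion exclude.bed genome.txt > ref/include-control.bed` would write the inclusion mask (`original.fa`, `exclude.bed` and `genome.txt` are placeholder names).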