...

NOTE #1: Everything written in red can be edited.

1. Alignment and variant calling

  • To run the alignment and variant calling, use the command line below.

...

module load bio/BCFtools/1.11-GCC-10.2.0
for i in *.vcf.gz; do bcftools index --tbi "$i"; done
bcftools merge -Oz /work_directory/*.vcf.gz -o Merged.vcf.gz
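
As an optional sanity check after the merge, the sample names in the merged file can be listed with `bcftools query -l`, which prints one sample name per line. The sketch below is only an illustration: the helper name `count_samples` is hypothetical, and it assumes the merged file is `Merged.vcf.gz` in the current directory.

```shell
# Hypothetical helper (not part of the original protocol):
# print how many samples a VCF file contains.
count_samples() {
    # `bcftools query -l` lists one sample name per line;
    # `tr` removes the padding some `wc` versions add.
    bcftools query -l "$1" | wc -l | tr -d ' '
}

# Usage: the result should match the number of per-sample VCFs merged.
# count_samples Merged.vcf.gz
```

If the count is lower than expected, one of the input files was probably not picked up by the `*.vcf.gz` pattern.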

2. Haplocheck

Contamination

wget https://github.com/genepi/haplocheck/releases/download/v1.3.3/haplocheck.zip
unzip haplocheck.zip
./haplocheck --out haplocheck_results Merged.vcf.gz
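
The report can also be inspected from the command line before opening it in a spreadsheet. The following is a minimal sketch, not part of the original protocol. It assumes that `haplocheck_results` is a tab-separated table with a header row, that the sample ID is in the first column, and that a column named `Contamination Status` holds `YES` for flagged samples. Check the header of your own file first. The helper name `list_contaminated` is hypothetical.

```shell
# Hypothetical helper: print the IDs of samples flagged as contaminated.
# Assumption: tab-separated report, sample ID in column 1, and a header
# column called "Contamination Status" with the value YES when flagged.
list_contaminated() {
    awk -F'\t' '
        { gsub(/"/, "") }                  # strip quotes around fields
        NR == 1 {                          # locate the status column by name
            for (i = 1; i <= NF; i++)
                if ($i == "Contamination Status") col = i
            next
        }
        col && $col == "YES" { print $1 }  # print the sample ID
    ' "$1"
}

# Usage:
# list_contaminated haplocheck_results
```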

3. Haplogrep

Haplogroup

curl -sL haplogrep.now.sh | bash
./haplogrep classify --in Merged.vcf.gz --format vcf --out haplogroups_workshopsamples.txt
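
For a quick overview of the classification, the number of samples per haplogroup can be tallied from the output file. This is a sketch under assumptions, not part of the original protocol. It assumes that `haplogroups_workshopsamples.txt` is tab-separated with a header row containing a column named `Haplogroup`. The helper name `count_haplogroups` is hypothetical.

```shell
# Hypothetical helper: count samples per haplogroup, most frequent first.
# Assumption: tab-separated file with a header column named "Haplogroup".
count_haplogroups() {
    awk -F'\t' '
        { gsub(/"/, "") }                  # strip quotes around fields
        NR == 1 {                          # locate the haplogroup column
            for (i = 1; i <= NF; i++)
                if ($i == "Haplogroup") col = i
            next
        }
        col { n[$col]++ }                  # tally each haplogroup
        END { for (h in n) print n[h] "\t" h }
    ' "$1" | sort -k1,1nr -k2,2
}

# Usage:
# count_haplogroups haplogroups_workshopsamples.txt
```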

...

  • Before going to the Quality Control steps, please download the all_samples.csv and filename.txt files to your computer.

4. Quality Control analysis 

Use Excel to open the files Merged.txt, haplocheck_results (contains the contamination status), and haplogroups_workshopsamples.txt (contains the haplogroup information for each sample).

...

  • Continue with the next steps using the copy of the txt file in Excel. See Figure 1 at the end of this section for an example of how the Excel file was organized into different sheets based on the QC steps.

5. Homoplasmic and common variant filtering using PLINK on the SCC cluster

Converting VCF to PLINK files

...

plink --bfile Workshop_samples_05-17-23_nocont_homo --maf 0.05 --make-bed --out Workshop_samples_05-17-23_nocont_homo_common
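
The `--maf 0.05` option removes variants whose minor allele frequency is below 0.05. To see how many variants the filter removed, the `.bim` files of the input and output filesets can be compared, since a `.bim` file has one line per variant. The helper name `count_variants` below is hypothetical and only illustrates the idea.

```shell
# Hypothetical helper: number of variants in a PLINK binary fileset.
# Takes the fileset prefix (the value given to --bfile or --out).
count_variants() {
    # One line per variant in the .bim file;
    # `tr` removes the padding some `wc` versions add.
    wc -l < "$1.bim" | tr -d ' '
}

# Usage: compare the counts before and after the MAF filter.
# count_variants Workshop_samples_05-17-23_nocont_homo
# count_variants Workshop_samples_05-17-23_nocont_homo_common
```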

6. Functional analysis


./mutserve annotate --input variantsfile.txt --annotation rCRS_annotation_2020-08-20.txt --output AnnotatedVariants.txt

...