...
Use Excel to open the files Merged.txt, haplocheck_results (contains the contamination status), and haplogroups_workshopsamples.txt (contains the haplogroup information for each of the samples).
- Using the haplocheck_results file, check which samples are contaminated (column B: Contamination Status). If any sample shows YES in the Contamination Status column, copy its Sample ID (column A: Sample) and paste it into a new Excel file. Name the file samples_to_remove and save it in txt format.
#NOTE: If YES is written in column B, the sample is considered contaminated based on differences between the Major and Minor haplogroups.
#NOTE: samples_to_remove.txt has two columns, both containing the Sample ID.
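If you prefer not to copy and paste by hand in Excel, the same list can be built with a short script. The following is only a minimal sketch, not part of the workshop pipeline. It assumes that haplocheck_results is a tab-delimited text file with one header row, that column A holds the Sample ID and column B the Contamination Status, and that the output should be tab-separated with no header. The function names are hypothetical, and the input file name may need an extension added to match your copy.

```python
import csv


def contaminated_samples(rows):
    """Return Sample IDs (column A) whose Contamination Status (column B) is YES."""
    ids = []
    for row in rows:
        if len(row) < 2:
            continue  # skip blank or malformed lines
        sample, status = row[0].strip(), row[1].strip()
        if status.upper() == "YES":
            ids.append(sample)
    return ids


def write_remove_list(ids, path):
    """Write one line per sample, with the Sample ID repeated in both columns."""
    with open(path, "w", newline="") as out:
        writer = csv.writer(out, delimiter="\t", lineterminator="\n")
        for sample in ids:
            writer.writerow([sample, sample])


def main(in_path="haplocheck_results", out_path="samples_to_remove.txt"):
    # Assumption: tab-delimited input with a single header row.
    with open(in_path, newline="") as handle:
        rows = list(csv.reader(handle, delimiter="\t"))
    ids = contaminated_samples(rows[1:])  # rows[0] is the header
    write_remove_list(ids, out_path)
    return ids


if __name__ == "__main__":
    removed = main()
    print(f"{len(removed)} contaminated sample(s) written")
```

Run the script from the folder that contains haplocheck_results, then open samples_to_remove.txt to confirm that it lists the same samples you would have selected by hand.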
Filtering homoplasmic and common variants using PLINK on the SCC cluster
...