These are the GWAS results for job run_all with 337,475 samples, run for the phenotype Current.
The association between Current and each variant with a minor allele count \(\ge\) 30 and an imputation quality metric \(r^2 \ge\) 0.8 was tested using the logistic regression model:
Current ~ marker + PC1 + PC2 + PC3 + PC4 + PC5 + PC6 + PC7 + PC8 + PC9 + PC10 + PC11 + PC12 + PC13 + PC14 + PC15 + PC16 + PC17 + PC18 + PC19 + PC20 + array + sex + age
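Written out, the model formula above corresponds to the following linear predictor on the logit scale. This is a sketch under two assumptions not stated in the report: that Current is coded 0/1, and that the marker term \(g\) is an additively coded genotype (allele count or dosage). The \(\beta\) symbols are introduced here for illustration only.

\[
\operatorname{logit} P(\text{Current} = 1) = \beta_0 + \beta_{\text{marker}}\, g + \sum_{k=1}^{20} \beta_k\, \text{PC}_k + \beta_{\text{array}}\, \text{array} + \beta_{\text{sex}}\, \text{sex} + \beta_{\text{age}}\, \text{age}
\]

The per-variant association test then asks whether \(\beta_{\text{marker}}\) differs from zero.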
The association tests were performed using plink2/2.00-alpha-2-20190429, including a total of 19,064,366 markers (after filtering). Markers with a Hardy-Weinberg equilibrium exact test p-value below 1.0e-6 were filtered out.
861 significant markers were found. The genomic inflation factor was \(\lambda =\) 1.1111664.
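For readers who want to check a genomic inflation factor from a list of association p-values, the following is a minimal sketch of the common median-based estimator. It may not be exactly how this pipeline computed \(\lambda\). The function name `genomic_inflation` is hypothetical, and the p-values are assumed to be two-sided and strictly between 0 and 1.

```python
from statistics import NormalDist, median


def genomic_inflation(p_values):
    """Median observed 1-df chi-square statistic divided by its expected median under the null."""
    std_normal = NormalDist()
    # A two-sided p-value maps to a 1-df chi-square statistic via chi2 = z^2,
    # where z = Phi^{-1}(p / 2). Using the lower tail keeps tiny p-values numerically stable.
    chi2_stats = [std_normal.inv_cdf(p / 2.0) ** 2 for p in p_values]
    # The median of a chi-square(1 df) distribution is Phi^{-1}(0.75)^2, roughly 0.4549.
    expected_median = std_normal.inv_cdf(0.75) ** 2
    return median(chi2_stats) / expected_median


if __name__ == "__main__":
    # Evenly spaced p-values mimic a null (uniform) distribution, so lambda should be close to 1.
    null_like = [(i + 0.5) / 1001 for i in range(1001)]
    print(genomic_inflation(null_like))
```

Values well above 1 indicate that the test statistics are larger than expected under the null, which can reflect population stratification, cryptic relatedness, or polygenicity.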
The plink2 program was run with the following parameters:
The table displays the number of samples/variants that have been removed due to different filters:
GCTA cojo was not conducted.
Plink clumping was not conducted.
A regression was conducted using the linear model:
Current ~ PC1 + PC2 + PC3 + PC4 + PC5 + PC6 + PC7 + PC8 + PC9 + PC10 + PC11 + PC12 + PC13 + PC14 + PC15 + PC16 + PC17 + PC18 + PC19 + PC20 + array + sex + age
(Note that no marker genotype has been included). The dataframe used as regression input is stored in run_all_Current_regression_frame.RData, while the residuals are in run_all_Current_regression_resid.RData.
glm()
Variance inflation factors (VIFs) are calculated to detect multicollinearity. The VIF of an independent variable is obtained by regressing it on all other independent variables and computing \(1 / (1 - R^2)\) from that regression. As a rule of thumb, no VIF should exceed 10; otherwise, highly correlated variables should be removed from the model.
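The VIF calculation can be illustrated with a short, dependency-free sketch. It is not the code used to produce this report; the function names (`variance_inflation_factors`, `r_squared`, `_solve`) and the toy covariate columns are hypothetical. Each column is regressed on all others by ordinary least squares with an intercept, and the VIF is \(1 / (1 - R^2)\).

```python
def _solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix [a | b]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def r_squared(y, predictors):
    """R^2 of an ordinary least-squares fit of y on the predictors plus an intercept."""
    rows = [[1.0] + [col[i] for col in predictors] for i in range(len(y))]
    k = len(rows[0])
    # Normal equations: (X^T X) beta = X^T y
    xtx = [[sum(r[a] * r[b] for r in rows) for b in range(k)] for a in range(k)]
    xty = [sum(r[a] * yi for r, yi in zip(rows, y)) for a in range(k)]
    beta = _solve(xtx, xty)
    fitted = [sum(bj * v for bj, v in zip(beta, r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def variance_inflation_factors(columns):
    """Return {name: VIF}, where VIF = 1 / (1 - R^2) from regressing each column on the others."""
    vifs = {}
    for name, y in columns.items():
        others = [v for other, v in columns.items() if other != name]
        # Perfectly collinear columns give R^2 = 1 and would raise ZeroDivisionError here.
        vifs[name] = 1.0 / (1.0 - r_squared(y, others))
    return vifs


if __name__ == "__main__":
    # Toy data: two nearly identical columns, so both VIFs are far above 10.
    toy = {
        "x1": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
        "x2": [1.1, 1.9, 3.2, 3.8, 5.1, 5.9],
    }
    print(variance_inflation_factors(toy))
```

In practice a statistics library would normally be used instead of hand-written least squares; the sketch only makes the definition concrete.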