ResultsSharingFormat
Results Sharing
The variables listed below should be included when sharing imputed results for meta-analysis. Large files can be shared among small groups via a secure file transfer site (as described in Results Sharing). Many working groups use ShareSpaces, a secure web-based file-sharing system implemented by the University of Washington's Catalyst computing group. The service has ample storage space for large files and limits access to a select group identified by UW NetIDs, ProtectNetwork IDs, or Google Account IDs.
ShareSpaces access is arranged via working groups. New members who expect to need access to these sites should either register online for a ProtectNetwork ID or create a Google Account ID.
File Formats
Results should be shared as plain text files, with the following variable names:
variable name | description |
SNPID | SNP ID as rs number |
chr | chromosome number. Use symbols X, XY, Y and mt for non-autosomal markers. |
position | physical position for the reference sequence (indicate build 35 or 36 in the README file) |
coded_all | coded allele, also called modeled allele (in example of A/G SNP in which AA=0, AG=1 and GG=2, the coded allele is G) |
noncoded_all | the other allele |
strand_genome | + or -, representing either the positive/forward strand or the negative/reverse strand of the human genome reference sequence; to clarify which strand the coded_all and noncoded_all are on |
beta | beta estimate from genotype-phenotype association, to at least 5 decimal places -- “NA” if not available |
SE | standard error of beta estimate, to at least 5 decimal places -- “NA” if not available |
pval | p-value of test statistic, here just as a double check -- “NA” if not available |
AF_coded_all | allele frequency for the coded allele -- “NA” if not available |
HWE_pval | exact test Hardy-Weinberg equilibrium p-value -- only directly typed SNPs, NA for imputed |
callrate | genotyping callrate after exclusions |
n_total | total sample with phenotype and genotype for SNP |
imputed | 1/0 coding; 1=imputed SNP, 0=directly typed SNP |
used_for_imp | 1/0 coding; 1=used for imputation, 0=not used for imputation |
oevar_imp | observed divided by expected variance for imputed allele dosage |
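The oevar_imp column can be sketched in a few lines. The version below is only one plausible reading of "observed divided by expected variance": it assumes dosages of the coded allele on a 0 to 2 scale, takes the expected variance to be the binomial value 2p(1-p) with p estimated from the mean dosage, and uses the sample variance (n-1 denominator) as the observed variance. The function name and these choices are assumptions, not something this page specifies; imputation software may report a slightly different estimator.

```python
def oevar_imp(dosages):
    """Ratio of observed to expected variance for imputed allele dosages.

    Assumptions (not stated on this page): dosages are on a 0..2 scale,
    the expected variance is 2p(1-p), and the observed variance uses
    the n-1 denominator.
    """
    n = len(dosages)
    if n < 2:
        return None  # variance is undefined for fewer than two samples
    mean = sum(dosages) / n
    p = mean / 2.0                      # estimated coded-allele frequency
    expected = 2.0 * p * (1.0 - p)      # binomial variance of the allele count
    if expected == 0.0:
        return None                     # monomorphic SNP: ratio is undefined
    observed = sum((d - mean) ** 2 for d in dosages) / (n - 1)
    return observed / expected
```

A return value of None would be written as “NA” in the shared file.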
Please note that a README should be uploaded with a very brief description of the data uploaded, the date, the NCBI human genome reference sequence used (e.g. NCBI 36.2) for strand reference, and the scale of the beta estimates. The README should also state the SNP HWE p-value, callrate and minor allele frequency filters that have been applied.
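As an illustration of the main-effect file layout, here is a minimal sketch that writes rows with the variable names from the table above. Several details are assumptions rather than requirements of this page: the tab delimiter (the page only asks for plain text), the function names, and the choice to print beta and SE with exactly 5 decimal places and other floating-point values with 6 significant digits. Missing values are passed as None and written as NA.

```python
import csv

# Column names in the order given in the table above.
COLUMNS = [
    "SNPID", "chr", "position", "coded_all", "noncoded_all", "strand_genome",
    "beta", "SE", "pval", "AF_coded_all", "HWE_pval", "callrate",
    "n_total", "imputed", "used_for_imp", "oevar_imp",
]

FIVE_DECIMALS = {"beta", "SE"}  # columns requested to at least 5 decimal places


def format_field(name, value):
    """Render one value as text; None becomes NA."""
    if value is None:
        return "NA"
    if name in FIVE_DECIMALS:
        return f"{float(value):.5f}"
    if isinstance(value, float):
        return f"{value:.6g}"  # assumed precision for p-values, frequencies
    return str(value)


def write_results(rows, handle):
    """Write a header line plus one tab-delimited line per SNP."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    writer.writerow(COLUMNS)
    for row in rows:
        missing = [name for name in COLUMNS if name not in row]
        if missing:
            raise ValueError(f"row is missing columns: {missing}")
        writer.writerow([format_field(name, row[name]) for name in COLUMNS])
```

Each row is a dictionary keyed by variable name, and the handle can be any open text file, for example `open("results.txt", "w", newline="")`.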
For gene-environment interaction analyses, the following variables should be included:
variable name | description |
SNPID | SNP ID as rs number |
chr | chromosome number. Use symbols X, XY, Y and mt for non-autosomal markers. |
position | physical position for the reference sequence (indicate build 35 or 36 in the README file) |
coded_all | coded allele, also called modeled allele (in example of A/G SNP in which AA=0, AG=1 and GG=2, the coded allele is G) |
noncoded_all | the other allele |
strand_genome | + or -, representing either the positive/forward strand or the negative/reverse strand of the human genome reference sequence; to clarify which strand the coded_all and noncoded_all are on |
beta | beta estimate from additive interaction term, to at least 5 decimal places -- “NA” if not available |
SE | standard error of beta estimate, to at least 5 decimal places -- “NA” if not available |
pval | p-value of interaction test statistic, here just as a double check -- “NA” if not available |
df.t | degrees of freedom estimate for t reference distribution for interaction term -- “NA” if not available |
pval.t | p-value of interaction test statistic, using t reference distribution, here just as a double check -- “NA” if not available |
beta.main | beta estimate from genotype-phenotype association, to at least 5 decimal places -- “NA” if not available |
SE.main | standard error of beta.main estimate, to at least 5 decimal places -- “NA” if not available |
pval.main | p-value of main test statistic, here just as a double check -- “NA” if not available |
covar.main.inter | covariance between beta and beta.main, to at least 5 decimal places -- “NA” if not available |
AF_coded_all | allele frequency for the coded allele -- “NA” if not available |
HWE_pval | exact test Hardy-Weinberg equilibrium p-value -- only directly typed SNPs, NA for imputed |
callrate | genotyping callrate after exclusions |
n_total | total sample with phenotype and genotype for SNP |
n_exposed | (DICHOTOMOUS EXPOSURE ONLY) number in sample exposed to environmental variable of interest [in longitudinal data, estimated number of independent observations that are exposed] |
imputed | 1/0 coding; 1=imputed SNP, 0=directly typed SNP |
used_for_imp | 1/0 coding; 1=used for imputation, 0=not used for imputation |
oevar_imp | observed divided by expected variance for imputed allele dosage |
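The pval columns are described as a double check, so a recipient might recompute them from the estimate and its standard error. The sketch below assumes a two-sided Wald test against a standard normal reference, which this page does not confirm; the function names and the 1% relative tolerance are likewise illustrative. It does not cover pval.t, which would need a t distribution with df.t degrees of freedom.

```python
import math


def wald_pval(beta, se):
    """Two-sided p-value for beta/se under a standard normal reference."""
    z = beta / se
    return math.erfc(abs(z) / math.sqrt(2.0))


def pval_consistent(beta, se, pval, rel_tol=0.01):
    """True if the reported p-value matches the recomputed one."""
    return math.isclose(wald_pval(beta, se), pval, rel_tol=rel_tol)
```

Rows that fail such a check are worth a second look, for example for a mismatch between the reported scale of beta and SE.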
Please note that a README should be uploaded with a very brief description of the data uploaded, the date, the NCBI human genome reference sequence used (e.g. NCBI 36.2) for strand reference, and the scale of the beta estimates. The README should also state the SNP HWE p-value, callrate and minor allele frequency filters that have been applied.
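One reason to share covar.main.inter is that it lets a recipient combine the main and interaction estimates. As a hedged illustration, assume a dichotomous exposure coded 0/1 and a model containing both a genotype term (beta.main) and a genotype-by-exposure term (beta); neither the coding nor the model is fixed by this page. Under those assumptions the genetic effect among the exposed is the sum of the two estimates, and its variance follows from Var(a + b) = Var(a) + Var(b) + 2 Cov(a, b). The function name below is hypothetical.

```python
import math


def effect_in_exposed(beta_main, se_main, beta_inter, se_inter, covar):
    """Estimate and standard error of beta.main + beta (interaction).

    Assumes a 0/1 exposure, so the SNP effect among the exposed is the
    sum of the main-effect and interaction coefficients.
    """
    estimate = beta_main + beta_inter
    variance = se_main ** 2 + se_inter ** 2 + 2.0 * covar
    if variance < 0.0:
        raise ValueError("inputs imply a negative variance; check covar sign/scale")
    return estimate, math.sqrt(variance)
```

For example, with beta.main = 0.10 (SE.main = 0.03), beta = 0.05 (SE = 0.04) and covar.main.inter = -0.0005, the variance is 0.0009 + 0.0016 - 0.0010 = 0.0015.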