ResultsSharingFormat: Difference between revisions

From Charge

Latest revision as of 22:41, 21 June 2016

Results Sharing

The following variables should be included when sharing imputed results for meta-analysis. Large files can be shared among small groups via a secure file transfer site (as described in Results Sharing). Many working groups use Google Drive, a secure web-based file-sharing system provided in partnership with the University of Washington's computing group. The service has ample storage space for large files and limits access to a select group identified by Google Account IDs.


ShareSpace access is arranged via working groups. New members who expect to need access to these sites should create a Google Account ID.

File Formats

Results should be shared as plain text files, with the following variable names:

variable name     description
-------------     -----------
SNPID             SNP ID as rs number
chr               chromosome number. Use symbols X, XY, Y and mt for non-autosomal markers.
position          physical position for the reference sequence (indicate build 35/36 in README file)
coded_all         coded allele, also called modeled allele (in example of A/G SNP in which AA=0, AG=1 and GG=2, the coded allele is G)
noncoded_all      the other allele
strand_genome     + or -, representing either the positive/forward strand or the negative/reverse strand of the human genome reference sequence; clarifies which strand coded_all and noncoded_all are on
beta              beta estimate from genotype-phenotype association, to at least 5 decimal places -- “NA” if not available
SE                standard error of beta estimate, to at least 5 decimal places -- “NA” if not available
pval              p-value of test statistic, here just as a double check -- “NA” if not available
AF_coded_all      allele frequency for the coded allele -- “NA” if not available
HWE_pval          exact test Hardy-Weinberg equilibrium p-value -- only directly typed SNPs, NA for imputed
callrate          genotyping callrate after exclusions
n_total           total sample with phenotype and genotype for SNP
imputed           1/0 coding; 1=imputed SNP, 0=directly typed
used_for_imp      1/0 coding; 1=used for imputation, 0=not used for imputation
oevar_imp         observed divided by expected variance for imputed allele dosage
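
Before uploading, a file can be checked against the variable list above. The following is a minimal sketch, not a consortium-endorsed tool: the function name check_results, the tab delimiter, the 1-22 autosome range, and the assumption that beta and SE are written in fixed-point rather than scientific notation are all choices made for this illustration.

```python
import csv
import io

# Column names taken from the variable list above, in the listed order.
REQUIRED_COLUMNS = [
    "SNPID", "chr", "position", "coded_all", "noncoded_all", "strand_genome",
    "beta", "SE", "pval", "AF_coded_all", "HWE_pval", "callrate", "n_total",
    "imputed", "used_for_imp", "oevar_imp",
]
NON_AUTOSOMAL = {"X", "XY", "Y", "mt"}


def check_results(text, delimiter="\t"):
    """Return a list of problems found in a results file passed as a string.

    Assumptions of this sketch: tab-delimited text, autosomes numbered 1-22,
    and beta/SE written in fixed-point notation (e.g. 0.01234).
    """
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    header = reader.fieldnames or []
    problems = [f"missing column: {c}" for c in REQUIRED_COLUMNS if c not in header]
    if problems:
        return problems  # row checks are meaningless without all columns

    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        chrom = row["chr"] or ""
        is_autosome = chrom.isdecimal() and 1 <= int(chrom) <= 22
        if chrom not in NON_AUTOSOMAL and not is_autosome:
            problems.append(f"line {line_no}: unexpected chr {chrom!r}")
        if row["strand_genome"] not in {"+", "-"}:
            problems.append(f"line {line_no}: strand_genome must be + or -")
        for name in ("beta", "SE"):
            value = row[name] or ""
            if value == "NA":
                continue  # NA is the documented missing-value marker
            decimals = len(value.partition(".")[2])
            if decimals < 5:
                problems.append(
                    f"line {line_no}: {name} has fewer than 5 decimal places"
                )
    return problems
```

An empty list means none of these particular checks failed; it does not guarantee the file is otherwise correct.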

Please note that a README should be uploaded with a very brief description of the data uploaded, the date, the NCBI human genome reference sequence used (e.g. NCBI 36.2) for strand reference, and the scale of the beta estimates; please also include in the README the SNP HWE p-value, callrate and minor allele frequency filters that have been applied.
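
The pval variable above is described as a double check. One way a recipient might use it is to recompute a p-value from beta and SE and compare it with the reported value. The sketch below assumes a two-sided Wald test with a standard normal reference distribution; the source does not state which test each study used, and the function names and the 5% relative tolerance are hypothetical.

```python
import math


def wald_pvalue(beta, se):
    """Two-sided p-value for beta/se against a standard normal reference."""
    if se <= 0:
        raise ValueError("standard error must be positive")
    z = abs(beta / se)
    # For Z ~ N(0, 1): P(|Z| > z) = erfc(z / sqrt(2)).
    return math.erfc(z / math.sqrt(2.0))


def pval_consistent(beta, se, reported_pval, rel_tol=0.05):
    """True if the reported p-value is within rel_tol of the recomputed one."""
    return math.isclose(wald_pvalue(beta, se), reported_pval, rel_tol=rel_tol)
```

A mismatch does not by itself indicate an error: a study may have used a different test or reference distribution, which is one reason the README should describe the analysis.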


For gene-environment interaction analyses, the following variables should be included:

variable name     description
-------------     -----------
SNPID             SNP ID as rs number
chr               chromosome number. Use symbols X, XY, Y and mt for non-autosomal markers.
position          physical position for the reference sequence (indicate build 35/36 in README file)
coded_all         coded allele, also called modeled allele (in example of A/G SNP in which AA=0, AG=1 and GG=2, the coded allele is G)
noncoded_all      the other allele
strand_genome     + or -, representing either the positive/forward strand or the negative/reverse strand of the human genome reference sequence; clarifies which strand coded_all and noncoded_all are on
beta              beta estimate from additive interaction term, to at least 5 decimal places -- “NA” if not available
SE                standard error of beta estimate, to at least 5 decimal places -- “NA” if not available
pval              p-value of interaction test statistic, here just as a double check -- “NA” if not available
df.t              degrees of freedom estimate for t reference distribution for interaction term -- “NA” if not available
pval.t            p-value of interaction test statistic, using t reference distribution, here just as a double check -- “NA” if not available
beta.main         beta estimate from genotype-phenotype association, to at least 5 decimal places -- “NA” if not available
SE.main           standard error of beta.main estimate, to at least 5 decimal places -- “NA” if not available
pval.main         p-value of main test statistic, here just as a double check -- “NA” if not available
covar.main.inter  covariance between beta and beta.main, to at least 5 decimal places -- “NA” if not available
AF_coded_all      allele frequency for the coded allele -- “NA” if not available
HWE_pval          exact test Hardy-Weinberg equilibrium p-value -- only directly typed SNPs, NA for imputed
callrate          genotyping callrate after exclusions
n_total           total sample with phenotype and genotype for SNP
n_exposed         (DICHOTOMOUS EXPOSURE ONLY) number in sample exposed to environmental variable of interest [in longitudinal data, estimated number of independent observations that are exposed]
imputed           1/0 coding; 1=imputed SNP, 0=directly typed
used_for_imp      1/0 coding; 1=used for imputation, 0=not used for imputation
oevar_imp         observed divided by expected variance for imputed allele dosage
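
The covar.main.inter variable is what lets a recipient combine the main and interaction estimates. As an illustration only, assume a model with a dichotomous exposure coded 0/1, in which the SNP effect among the exposed is beta.main + beta. Its variance then follows from the identity Var(a + b) = Var(a) + Var(b) + 2 Cov(a, b). The function name and the model coding are assumptions of this sketch, not something the variable list specifies.

```python
import math


def effect_in_exposed(beta_main, se_main, beta_inter, se_inter, covar_main_inter):
    """SNP effect and its standard error among the exposed (exposure coded 1).

    Assumes the exposed-group effect is beta.main + beta, so that
    Var = SE.main^2 + SE^2 + 2 * covar.main.inter.
    """
    estimate = beta_main + beta_inter
    variance = se_main ** 2 + se_inter ** 2 + 2.0 * covar_main_inter
    if variance < 0:
        # A negative value means the supplied inputs are inconsistent.
        raise ValueError("inputs imply a negative variance")
    return estimate, math.sqrt(variance)
```

Without the covariance term the standard error of the combined effect could not be recovered from the shared file, which is why it is requested to the same precision as the estimates.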

Please note that a README should be uploaded with a very brief description of the data uploaded, the date, the NCBI human genome reference sequence used (e.g. NCBI 36.2) for strand reference, and the scale of the beta estimates; please also include in the README the SNP HWE p-value, callrate and minor allele frequency filters that have been applied.
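
The oevar_imp variable in the lists above is defined as the observed variance of the imputed allele dosage divided by its expected variance. The sketch below shows one common reading of that definition, in which the expected variance is the binomial value 2p(1-p) at the estimated coded-allele frequency p. The function name, the use of the population (divide-by-n) variance, and returning None for monomorphic markers are assumptions; the estimator reported by a given imputation program may differ.

```python
def oevar_imp(dosages):
    """Observed / expected variance of allele dosages (each in the range 0-2)."""
    n = len(dosages)
    if n == 0:
        raise ValueError("no dosages supplied")
    mean = sum(dosages) / n
    p = mean / 2.0                    # estimated coded-allele frequency
    expected = 2.0 * p * (1.0 - p)    # dosage variance under Hardy-Weinberg proportions
    if expected == 0:
        return None                   # monomorphic: the ratio is undefined
    observed = sum((d - mean) ** 2 for d in dosages) / n
    return observed / expected
```

Under this reading, values near 1 suggest the dosages vary about as much as well-called genotypes would, and values near 0 suggest little information was recovered for the marker.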