Introduction to “weighted_FDR.R”
Software to Apply Weighted False Discovery Rate
for Multiple Testing
Version 1.2 (beta)
Version Information
Version 1.0 – Original version released
Version 1.1 – Update released
This version checks whether the positions of the p-values all match those of the priors. When they do, no interpolation is needed, which greatly reduces computation time for large datasets.
Version 1.2 – Update
A small change was made to the output. Previously, when prior.type="USER" was used, the output reported that a transformation and a weight had been applied. Neither applies to this prior.type, so this information has been removed from the output.
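The position check introduced in Version 1.1 can be sketched as follows. This is a minimal illustration, not code taken from weighted_FDR.R: the function name align_priors and its argument names are hypothetical, and linear interpolation via approx() is an assumption, since this document does not say which interpolation method the program uses.

```r
# Sketch only: align prior values with the positions of the p-values.
# Names are hypothetical and do not mirror weighted_FDR.R internals.
align_priors <- function(p_pos, prior_pos, prior_val) {
  # Fast path: identical position vectors, so the priors are used as they are
  if (length(p_pos) == length(prior_pos) && all(p_pos == prior_pos)) {
    return(prior_val)
  }
  # Otherwise interpolate linearly (an assumption); rule = 2 carries the
  # end values out to positions beyond the range covered by the priors
  approx(x = prior_pos, y = prior_val, xout = p_pos, rule = 2)$y
}

# Priors every 5 cM, p-values at arbitrary positions on the same map
align_priors(p_pos = c(0, 2.5, 5, 12),
             prior_pos = c(0, 5, 10),
             prior_val = c(1, 2, 3))
```

The sketch treats positions as a single continuous map. For a genome scan the same step would presumably be applied chromosome by chromosome.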
Overview
When testing a large number of hypotheses, as in an association genome scan, the power to detect modest effects can be low because of the penalty for multiple testing. This is especially true when traditional approaches such as the Bonferroni correction are employed. The false discovery rate (FDR) approach increases power in a multiple testing scenario, but it is still challenging to obtain significant results when very large numbers of tests are performed. To enhance power further, a weighted FDR (wFDR) approach can be employed that weights the hypotheses based on prior data, such as a linkage scan, knowledge about biological pathways, candidate genes and so on. The wFDR procedure up-weights likely candidates and down-weights others, while maintaining control of the overall rate of false discoveries. The weights must average to 1 and be chosen independently of the association test statistics.
Simulations reveal that if an informative linkage study was used for the weights, the weighted FDR approach improves power considerably. Remarkably, the loss in power is small even if the linkage study was uninformative.
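The core idea of p-value weighting can be illustrated in a few lines of R. The sketch below divides each p-value by its weight and then applies the Benjamini-Hochberg procedure through the base R function p.adjust(). It is an illustration under stated assumptions, not the weighted_FDR.R implementation: the function name weighted_bh is hypothetical, the weights are assumed to be strictly positive, and the example values are made up.

```r
# Sketch of weighted Benjamini-Hochberg (not the weighted_FDR.R source).
# p: vector of p-values; w: strictly positive prior weights (hypothetical input).
weighted_bh <- function(p, w, alpha = 0.05) {
  stopifnot(length(p) == length(w), all(w > 0))
  w <- w / mean(w)        # rescale so the weights average to 1
  q <- pmin(p / w, 1)     # weighted p-values, capped at 1
  # TRUE where the BH-adjusted weighted p-value is at or below alpha
  p.adjust(q, method = "BH") <= alpha
}

# Made-up example: the second hypothesis is down-weighted
p <- c(0.001, 0.02, 0.5, 0.8)
weighted_bh(p, w = rep(1, 4))              # equal weights: plain BH
weighted_bh(p, w = c(0.5, 0.5, 1.5, 1.5))  # unequal weights
```

With equal weights the function reduces to the unweighted FDR procedure, which makes it easy to see how a given set of weights changes the list of rejected hypotheses.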
Software Development
weighted_FDR.R was developed by
Lambertus “Bert” Klei, University of Pittsburgh Medical Center and
weighted_FDR.R is implemented in the R programming language, a free software environment for statistical computing that is similar to S-PLUS and runs on Windows and Linux (ver. 1.7.0). If you do not have a copy of R, you can download it from the R Project (http://www.r-project.org).
References
Please cite the following paper if you use wFDR in your analysis:
Roeder K, Bacanu S-A, Wasserman L, Devlin B (2006) Using linkage genome scans to improve power of association genome scans. Am J Hum Genet. February 2006.
See also:
Genovese CR, Roeder K, Wasserman L (2005) False discovery control with p-value weighting. Biometrika, submitted.
Downloads
Example files
This is an example of simulated data. The p-values to be tested were from a complete genome association scan in which there was a marker every 1cM (total number of markers: 3779). The prior values were assumed to be from a linkage scan of independent data. The marker density of the linkage scan was 5cM (total number of markers: 703).
Input files
File with p-values to be tested (wfdr_ex1_assc.txt)
File with prior information (wfdr_ex1_link.txt)
· To obtain unweighted FDR results, issue the following command:
wFDR(p.val.file="wfdr_ex1_assc.txt",
out.file="unw_fdr_results.txt",trans.method="UNW")
The output of this command is contained in:
Results file (unw_fdr_results.txt)
Graphics1) (unw_fdr_graphs.pdf)
Note: FDR identifies one more significant result than the standard Bonferroni correction.
· To obtain weighted FDR results, issue the following command:
wFDR(p.val.file="wfdr_ex1_assc.txt",
prior.file="wfdr_ex1_link.txt",
out.file="wght_fdr_results.txt",
prior.type="ZSCO")
The output of this command is contained in:
Results file (wght_fdr_results.txt)
Graphics1) (wght_fdr_graphs.pdf)
Note: Weighted FDR identifies three more significant results than unweighted FDR.
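Comparisons of this kind, in which the number of significant results is counted under different corrections, can be reproduced for any vector of p-values with base R. The sketch below is only an illustration: count_discoveries is a hypothetical helper, the p-values are made up rather than read from the example files, and it does not depend on the layout of the wFDR results files.

```r
# Sketch: count significant results under Bonferroni and unweighted FDR (BH).
# `p` stands in for a vector of p-values; the values below are made up.
count_discoveries <- function(p, alpha = 0.05) {
  c(bonferroni = sum(p.adjust(p, method = "bonferroni") <= alpha),
    bh         = sum(p.adjust(p, method = "BH") <= alpha))
}

count_discoveries(c(0.001, 0.02, 0.5, 0.8))
```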
1) Graphics files are not automatically saved by the program. The user will have to do this manually.
Windows: Maximize the graph window to full-screen size, right-click the graph and save it. After saving, close the graph window so that a new graph can be created.
Linux: Issue the command dev.off() and rename the graphics file Rplot.ps in your working directory.
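As a possible alternative to saving by hand, base R can write plots directly to a PDF file by opening a pdf() device before plotting and closing it with dev.off() afterwards. The sketch below shows the general pattern with a hypothetical helper, save_plot_pdf. It assumes that the plotting code draws on the currently active graphics device. Whether wFDR() does so is not stated in this document, so the pattern has not been confirmed to work with it.

```r
# Sketch: send base-graphics output straight to a PDF file.
# Assumes the plotting code draws on the currently active device.
save_plot_pdf <- function(file, plot_fun, width = 7, height = 7) {
  pdf(file, width = width, height = height)  # open a PDF device
  on.exit(dev.off())                         # close it again, even on error
  plot_fun()                                 # run the plotting code
  invisible(file)
}

# Usage with an ordinary plot; the file name is only an example
save_plot_pdf("example_graph.pdf", function() plot(1:10))
```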