Introduction to "Spectral-GEM": Software for Matching

This method is described in:

"Discovering Genetic Ancestry Using Spectral Graph Theory" (Lee, Luca, Klei, Devlin, and Roeder, Genetic Epidemiology, 2009).

OVERVIEW

Two versions of the program are available. Both use an R program to call a Fortran executable.

The first is the R package SpectralGEM, available from CRAN (http://cran.r-project.org/). The package's primary function is also called SpectralGEM; the package also provides functions that create data and parameter files in the required format.

The second is a zipped directory of files, including R scripts with the SpectralGEM code and analysis functions, as well as the Fortran executable. This version may be downloaded below.

Both versions let the user control the program from R, and both require the accompanying Fortran executable. The R package version downloads this executable while the program runs; the zip download already includes it.

The directions and example on this page correspond to the second version of the program (the zip download), but the sections entitled "Notes", "Hints", "Warnings" and "preprocessing" are also useful to those using the R package. Once the package is installed, users of the R package may type 'help(SpectralGEM)' in R for further documentation.
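For users of the R package, the installation and help steps can be sketched as follows. This is a minimal sketch, not part of the official directions: the helper name setup_spectralgem is hypothetical, and it assumes that CRAN still hosts the SpectralGEM package for your version of R and that the machine has network access.

```r
# Hypothetical helper (not part of the SpectralGEM package): install the
# package from CRAN if it is missing, load it, and open its help page.
setup_spectralgem <- function() {
  if (!requireNamespace("SpectralGEM", quietly = TRUE)) {
    # Assumption: the package is still available on CRAN for this R version.
    install.packages("SpectralGEM")
  }
  library(SpectralGEM)
  help("SpectralGEM")  # same documentation as typing help(SpectralGEM)
}
```

Calling setup_spectralgem() at the R prompt runs all three steps; nothing is installed until the function is called.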

R is a free, open-source implementation of the S language (the language also used by S-PLUS) and is available for Windows and Linux. The first step in running the program is therefore to download and install a copy of R. To download R, go to

http://www.r-project.org

and follow the links to download R for Windows or Linux. Once R is installed, type R at the command prompt (Linux) or select R from the Start Menu (Windows). To quit R, type q() at the R command prompt. To cancel a running R command, press Ctrl-C (Linux) or ESC (Windows).
 

 

The following changes have been implemented in version 2.1 of SpectralGEM (5/27/2009, Bert Klei):

1) The eigengap heuristic now depends on both the number of individuals and the number of SNPs in the analysis. In the previous version of SpectralGEM, the heuristic depended only on the number of individuals, which made the value inaccurate when the number of SNPs was small.

2) SpectralGEM now always uses a minimum of two dimensions of ancestry, even when the number of significant eigenvalues is less than two.

3) SpectralGEM will use no more than 50 dimensions of ancestry in its calculations, even when the number of significant eigenvalues exceeds 50.

4) A bug was fixed that sometimes caused the earlier version of SpectralGEM to fail.

Because of these changes, older versions of SpectralGEM should no longer be used.
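Changes 2 and 3 together bound the number of ancestry dimensions between 2 and 50. The rule can be sketched in R as below. This is an illustration only, not the code used inside the Fortran executable: the function name clamp_dimensions and its arguments are hypothetical, and the count of significant eigenvalues is taken as a given input rather than computed by the eigengap heuristic.

```r
# Hypothetical helper: bound the number of ancestry dimensions to [2, 50],
# mirroring changes 2 and 3 in version 2.1.
clamp_dimensions <- function(n_significant, min_dims = 2L, max_dims = 50L) {
  stopifnot(is.numeric(n_significant),
            length(n_significant) == 1,
            n_significant >= 0)
  # Raise to the minimum first, then cap at the maximum.
  as.integer(min(max(n_significant, min_dims), max_dims))
}

clamp_dimensions(0)    # 2  (fewer than two significant eigenvalues)
clamp_dimensions(7)    # 7  (already within bounds)
clamp_dimensions(120)  # 50 (capped at the maximum)
```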
 

 

Download Program
For Windows systems.
For Linux/Unix systems (uncompiled Fortran source code).

Directions: pdf

Example Files
Sample input file
Sample output files (produced by running the software on the three sample files above).