Software for genetic association analyses in case-parent triads,
case-control data (or combined case-parent control-parent triads), with
SNP haplotypes from candidate genes or GWAS data
Web page last updated: March 5, 2017
Most recent version: Haplin 6.2.0, uploaded to CRAN Feb 27, 2017
HAPLIN is free software written for the purpose of analyzing case-parent
triad (trio) data and/or case-control data. The models estimated by Haplin
are described in detail in Gjessing HK and Lie RT. Case-parent triads:
Estimating single- and double-dose effects of fetal and maternal disease
gene haplotypes. Annals of Human Genetics (2006) 70, pp. 382-396.
Some of the main features of Haplin are:
- Analyses of the case-parent triad design, the case-control design, and
"hybrid" designs using combinations of case-parent triads and
control-parent triads.
- Optimal use of designs with missing genotypic data, for instance when a
single SNP has not been typed for some individuals, or when the case
father has not been genotyped at all, or when the control parents are
- Estimation is based on haplotypes, for instance SNP haplotypes, even
though phase is not known from the genetic data.
- Estimation of relative risk (RR) associated with each haplotype, not
only significance testing.
- Optional estimation of effects of maternal haplotypes, particularly relevant in perinatal
- Estimation of RRs, haplotypes etc. also on the X chromosome, with
models including dose-response and X-inactivation.
- Estimation of parent-of-origin effects.
- Gene-environment interactions can be estimated for all genetic effects
- Support for GWAS data and parallel processing.
- Extensive facilities for power calculations.
The Annals of Human Genetics paper is available in a PDF version here,
and is also available from Blackwell.
What's new in this version of Haplin?
Some features high on the Wish List for
Haplin is written by Hakon K.
Gjessing. Hilde-Gunn Bruu contributed to early versions of the data
reading and preparation parts. Rolv Terje Lie has contributed with numerous
useful and insightful suggestions, and inspired the work from its beginning.
Nguyen Trung Truc programmed the nice external GUI for generating Haplin
syntax. Øivind Skare has done extensive testing and simulations with the
more recent versions of Haplin, and added a TDT test. Astanand (Anil)
Jugessur has provided very useful feedback from a user's perspective, and
authored a number of papers using Haplin. Miriam Gjerdevik has written
the functions lineByLine and convertPed to recode and modify very large text
files, cbindFiles and rbindFiles to merge very large text files, and
snpPower, snpSampleSize, and hapPower to compute power and sample size for
single SNP and haplotype analyses.
Please feel free to contact me at email@example.com,
with questions or bug reports.
Note: Although we have done our best
to avoid errors, the software is offered without
any warranties. We cannot take responsibility for any problems or
damages caused by using it.
Cite: If you use Haplin in your
publications, please refer to the Annals of Human Genetics paper above. In
addition, typing citation(package = "Haplin") in R will give you the most recent reference to the Haplin R-package.
Haplin is written for use with the statistical software R. However, it is
easy to install and requires no previous knowledge of R. R can be downloaded
free of charge from The R Project for
Statistical Computing. For Windows users, a shortcut to the R
installation file is found here.
Haplin is implemented as a standard R package, and should run without
problems on all reasonably new R versions, for Windows, Linux or Mac.
To install Haplin in R:
Start R and type install.packages("Haplin")
Haplin will then be fetched from the CRAN repository and
installed on your computer.
To use Haplin in an R session, use the R command library(Haplin).
Haplin is then loaded and ready for use.
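The installation and loading steps above can be sketched as a short R session. This is a minimal sketch, assuming an internet connection and a configured CRAN mirror; the check with requireNamespace is an addition for convenience and is not part of the original instructions.

```r
# Install Haplin from CRAN only if it is not already available
if (!requireNamespace("Haplin", quietly = TRUE)) {
  install.packages("Haplin")
}

# Load the package for the current R session
library(Haplin)

# Show which version was loaded, and how to cite the package
print(packageVersion("Haplin"))
print(citation(package = "Haplin"))
```

Running library(Haplin) is needed once in every new R session; install.packages is only needed once per R installation, or when updating.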
Haplin is run by the single command
(or whatever the path to the data file is). The data file (data.dat) can
have any name, but should be a text file in a specific format (see below).
This command reads data, performs the estimation and prints and plots the
result in one run.
By default, Haplin excludes triads with missing data. To include these
triads in the calculations, include the use.missing argument:
For more examples of how to run Haplin, see the haplin help file (in R, type
?haplin). For a quick overview of all available functions in Haplin, use
help(package = "Haplin").
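As an illustration of the two runs described above, here is a minimal sketch. It assumes the Haplin 6.x interface described on this page, where haplin takes the path to the data file as its first argument, and it uses the hypothetical file name data.dat in the working directory. Later Haplin releases may expect a different workflow, so check ?haplin for the installed version.

```r
library(Haplin)

# Hypothetical path; replace with the location of your own data file
data_file <- "data.dat"

if (file.exists(data_file)) {
  # Default run: triads with missing genotype data are excluded
  res_complete <- haplin(data_file)

  # Same analysis, but including triads with missing genotype data
  res_all <- haplin(data_file, use.missing = TRUE)
} else {
  message("No file found at ", data_file, "; nothing was run.")
}
```

Each call reads the data, runs the estimation, and prints and plots the result, so no further commands are needed to see the output.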
I have collected a few pieces of advice
that may be useful if you encounter problems.
The complete reference list of help files is here.
page has a nice presentation of
the package documentation. Make sure it is the most recent version.
There are three ways to handle input data with Haplin:
- The native Haplin data file format is a fairly simple ASCII
file, described here.
- If the data (on relatively few SNPs) are available in a standard
ped-format, it is possible to convert ped files directly to the Haplin
format. See here for details.
- With a larger number of SNPs in ped-format, such as a GWAS ped file
produced by plink, data can be read into Haplin via the GenABEL package
data format. A complete description of how to import and handle GWAS
data is found here.
To test that Haplin runs properly, you can download the trial data files HAPLIN.trialdata.txt and HAPLIN.trialdata2.txt,
and run Haplin with the commands
haplin("HAPLIN.trialdata.txt", use.missing = T, maternal = T)
haplin("HAPLIN.trialdata2.txt", use.missing = T, n.vars = 2, ccvar = 2,
design = "cc.triad", reference = "ref.cat", response = "mult")
The results should look something like this: HAPLIN.trialrun.txt,
In addition, a plot is produced, which should look something like
this: HAPLIN.trialrun.jpg, HAPLIN.trialrun2.jpg.
accessible Graphical User Interface for generating Haplin syntax is under
development, and a preliminary version is available at haplin.fhi.no,
thanks to Nguyen Trung Truc. The syntax generator helps set up Haplin
commands which can be copied and pasted into your own R window. It includes
some (but not all) of the features currently available in Haplin. NOTE:
Unfortunately, the GUI web page is temporarily unavailable.
Model and estimation
The models implemented in Haplin are extensions of the log-linear models
described and developed in the papers
Gjessing HK and Lie RT. Case-parent triads: Estimating single- and
double-dose effects of fetal and maternal disease gene haplotypes.
Annals of Human Genetics (2006) 70, pp. 382-396.
Wilcox AJ, Weinberg CR, Lie RT (1998). Distinguishing the effects of
maternal and offspring genes through studies of "case-parent triads".
American Journal of Epidemiology, 148(9):
Weinberg CR, Wilcox AJ, Lie RT (1998). A log-linear approach to
case-parent-triad data: assessing effects of disease genes that act directly
or through maternal effects and that may be subject to parental imprinting.
American Journal of Human Genetics, 62: 969-78
and follow-ups to these. The basic log-linear model for case-parent triad
data allows a user to compute relative risks associated with a variant
allele, together with corresponding confidence intervals and p-values. It
also allows a similar effect estimation for maternal alleles, i.e. to study
the effect of genes of the mother
that may influence the development of the fetus. Haplin extends these models
to situations with multiple densely spaced SNPs (or other markers), where
phase is unknown. Haplin then estimates the relative risks associated with haplotypes, not only single markers. In
addition, Haplin uses a parametrization that will detect (at least with
sufficient sample size) dominance- or recessive deviations from a
dose-response model. For some details about parametrization, choice of
reference category and interpretation of results, see parametrization.pdf.
The most recent Haplin version also includes the option to run on
case-control data, or to combine case-parent triads with control-parent
triads.
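As a rough sketch of what a parametrization with single- and double-dose effects can look like, the risk for a child carrying haplotypes $h_i$ and $h_j$ may be written relative to a baseline $B$. The notation here is an assumption made for illustration and is not taken from this page; the Annals of Human Genetics paper and parametrization.pdf give the authoritative form, including how the choice of reference category changes the interpretation.

```latex
P(D \mid h_i, h_j) = B \cdot RR_i \cdot RR_j \cdot RR^{*}_{ij},
\qquad
RR^{*}_{ij} =
\begin{cases}
  RR^{*}_{i} & \text{if } i = j, \\
  1          & \text{if } i \neq j.
\end{cases}
```

Under this reading, one copy of haplotype $h_i$ multiplies the risk by $RR_i$ and two copies by $RR_i^2 \, RR^{*}_i$. A value $RR^{*}_i = 1$ corresponds to a purely multiplicative dose-response, while $RR^{*}_i \neq 1$ expresses a dominance or recessive deviation from it.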
Power and sample size calculation
Extensive facilities for power and sample size calculations have been
added to the more recent versions of Haplin. In particular, the functions
snpPower, snpSampleSize, hapPower, and hapPowerAsymp
have been provided to do this for fetal effects, maternal effects,
parent-of-origin effects, GxE analyses etc. Please see Haplin_power
for further details.
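A power calculation for a single SNP might look like the sketch below. The numbers are hypothetical, and the argument names (cases, controls, RR, MAF, alpha) and the family design label "mfc" for full triads are given as recalled from the package help, so treat them as assumptions and confirm them with ?snpPower before relying on the result.

```r
library(Haplin)

# Hypothetical study: 120 full case triads and 240 full control triads
power_est <- snpPower(
  cases    = list(mfc = 120),  # number of case families, by family design
  controls = list(mfc = 240),  # number of control families, by family design
  RR       = 1.2,              # assumed relative risk
  MAF      = 0.1,              # assumed minor allele frequency
  alpha    = 0.05              # significance level
)

print(power_est)
```

snpSampleSize addresses the reverse question, the number of families needed for a target power, while hapPower and hapPowerAsymp cover haplotype analyses; their help files describe the arguments.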
Hakon K. Gjessing
Division of Epidemiology
Norwegian Institute of Public Health
P.O.Box 4404 Nydalen
N-0403 Oslo, NORWAY