Presentation

QTLMap is software dedicated to the detection of QTL from experimental designs in outbred populations. It is developed by the Animal Genetics Division at INRA (French National Institute for Agronomical Research). The statistical techniques used are linkage analysis (LA) and linkage disequilibrium linkage analysis (LDLA), both using interval mapping. Several versions of LA are offered, ranging from a quasi-maximum-likelihood approach to a fully linear (regression) model. The LDLA is a regression approach (Legarra and Fernando, 2009). The population may consist of half-sib families or of a mixture of full- and half-sib families. The computation of phase and transmission probabilities is optimized to be fast and as exact as possible. QTLMap can handle large numbers of markers (SNP) and traits (eQTL).

The aim of the QTLMap developers is to offer various genetic models depending on

  • the number of QTL alleles segregating (biallelic in crosses between monomorphic breeds, biallelic without hypothesis on the origin, multiallelic, haplotype identity),
  • the number of QTL segregating (one, two linked, several unlinked),
  • the number of traits under the QTL influence.

The trait determinism may vary depending on

  • the trait distribution (Gaussian trait, survival trait or threshold distribution),
  • the interactions between the QTL and fixed effects or other loci,
  • the residual variance structure (homo- or heteroskedasticity for half-sib families).

Because the conditions under which the test statistic asymptotically follows a chi2 distribution are not met, its significance is evaluated either through numerical approximations or through empirical calculations based on permutations or on simulations under the null hypothesis.
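
The permutation idea can be sketched in a few lines of Python. This is only an illustration, not QTLMap's implementation: the function names, the 0/1 genotype classes and the toy statistic `max_stat` (largest difference in trait means over the tested positions) are assumptions made for the example. Trait values are shuffled among animals, the scan statistic is recomputed for each shuffle, and the threshold is read off the resulting empirical distribution.

```python
import random


def max_stat(traits, classes_by_position):
    """Toy scan statistic (an assumption, not QTLMap's test): the largest
    absolute difference in trait means between genotype classes 0 and 1,
    taken over all tested positions."""
    best = 0.0
    for classes in classes_by_position:
        g0 = [t for t, c in zip(traits, classes) if c == 0]
        g1 = [t for t, c in zip(traits, classes) if c == 1]
        if g0 and g1:
            best = max(best, abs(sum(g0) / len(g0) - sum(g1) / len(g1)))
    return best


def permutation_threshold(traits, classes_by_position,
                          n_perm=1000, alpha=0.05, seed=0):
    """Empirical threshold: the (1 - alpha) quantile of the scan statistic
    when trait values are permuted among animals (null hypothesis)."""
    rng = random.Random(seed)
    shuffled = list(traits)
    null_stats = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # breaks any trait/genotype association
        null_stats.append(max_stat(shuffled, classes_by_position))
    null_stats.sort()
    index = min(n_perm - 1, int((1 - alpha) * n_perm))
    return null_stats[index]
```

An observed statistic larger than the returned value would be declared significant at level alpha for the whole scan, since each permutation keeps the maximum over all positions.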

Up to now, the following functionalities have been implemented:

  • QTL detection in half-sib families or mixture of full- and half-sib families
  • One or several linked QTL segregating in the population
  • Single trait or multiple trait
  • Nuisance parameters (e.g. sex, batch, weight...) and their interactions with QTL can be included in the analysis
  • Gaussian, discrete or survival (Cox model) data
  • Familial heterogeneity of variances (heteroscedasticity)
  • Can handle eQTL analyses
  • Computation of transmission and phase probabilities adapted to high throughput genotyping (SNP)
  • Empirical thresholds are estimated using simulations under the null hypothesis or permutations of trait values
  • Computation of power and accuracy of your design or any simulated design

The main source code of QTLMap is written in Fortran 95 and uses the OpenMP API (parallel programming).
The regression approaches for LA, LD and LDLA analyses are also implemented in CUDA to speed up the computation of empirical thresholds on large data sets.

Installation

Prerequisite

Optional Prerequisite

Download the QTLMap source code

Download the latest release here

Compilation

  • tar xvf qtlmap-X.X.X-Source.tar.gz
  • cd qtlmap-X.X.X-Source
  • mkdir build
  • cd build
  • build the Makefile with cmake :
    • Release (Multi-threading Environment)
       cmake -DCMAKE_BUILD_TYPE=Release -DCMAKE_Fortran_COMPILER=gfortran ../
    • Release (GPGPU-CUDA Environment)
       cmake -DCMAKE_BUILD_TYPE=Release -DCMAKE_Fortran_COMPILER=gfortran -DCUDA_IMPL=true -DTYPE_ARCH_CUDA="compute_20" ../
      • Single precision support: add
        -DCUDA_SP=true
  • make
  • make install

The binaries are in the qtlmap-X.X.X-Source/bin directory.

To specify the path to the LAPACK libraries, pass this option to cmake:

-DLAPACK_LIBRARY_PATH=/path/lapack

Documentation

Complete documentation is available in PDF format: qtlmap-doc-0.8

Build the Parameter files and the input files from examples

PorcQTL (Microsatellites example)

Description of the dataset

  • 5 half-sib families
  • 236 individuals with 5 phenotypes and genotypes
  • 10 microsatellites

download source:dataset-porcqtl.tar.gz

Pedigree
910014   880607    890755    1
910016   880607    890755    1
910020   880607    890755    1
...
...
944217   910001    910014    2
944220   910001    910014    2
952658   910001    910014    2
952659   910001    910014    2
952660   910001    910014    2
...
...

The file contains pedigree information for the last 2 generations of a 3-generation design,
i.e. parents and progeny. It must not contain the grandparental pedigree information.
Each line is made of an alphanumeric ID triplet (individual, sire, dam). A fourth field gives
the generation number: « 1 » for the parental generation, « 2 » for the progeny generation. An
animal missing one or both parent IDs must not be included in the file. The file
must be sorted by generation, sire ID and dam ID.
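
The rules above can be checked before running an analysis. The following Python sketch is not part of QTLMap; the function name `check_pedigree` is hypothetical, and comparing IDs as plain strings is an assumption, since the exact sort order QTLMap expects is not stated here.

```python
def check_pedigree(lines):
    """Return a list of problems found in pedigree lines
    (individual, sire, dam, generation)."""
    problems = []
    previous_key = None
    for number, line in enumerate(lines, start=1):
        fields = line.split()
        if not fields:
            continue  # ignore blank lines
        if len(fields) != 4:
            problems.append(
                f"line {number}: expected 4 fields, found {len(fields)}")
            continue
        individual, sire, dam, generation = fields
        if generation not in ("1", "2"):
            problems.append(f"line {number}: generation must be 1 or 2")
            continue
        # Assumption: IDs are compared as strings.
        key = (int(generation), sire, dam)
        if previous_key is not None and key < previous_key:
            problems.append(
                f"line {number}: not sorted by generation, sire ID and dam ID")
        previous_key = key
    return problems
```

An empty list means that no problem was detected by these checks. A line with a missing parent ID is reported as having too few fields.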

Marker map
S0383   7    0.01  0.01  0.01    1
S0025   7    0.05  0.05  0.05    1
SW1354  7    0.23  0.22  0.26    1
SW1369  7    0.52  0.54  0.50    1
SLA     7    0.62  0.62  0.63    1
S0102   7    0.74  0.72  0.82    1
SW352   7    1.01  0.91  1.19    1
SW63    7    1.19  1.01  1.46    1
S0101   7    1.48  1.32  1.73    1
SW764   7    1.70  1.50  2.01    1

This file gives the locations of the markers on the chromosome(s). Each line corresponds to a single
marker and gives, in this order:
  • marker name (alphanumeric);
  • name of the chromosome carrying the marker (alphanumeric);
  • marker position on the average map (in Morgan);
  • marker position on the male map (in Morgan);
  • marker position on the female map (in Morgan);
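
As a rough illustration of this layout, the Python sketch below reads the five fields listed above. The names `Marker`, `read_marker_map` and `map_length_cm` are hypothetical helpers, not QTLMap code. Any further column, such as the trailing 1 in the example, is ignored because its meaning is not described in this section.

```python
from typing import NamedTuple


class Marker(NamedTuple):
    name: str
    chromosome: str
    average_pos: float  # Morgan, average map
    male_pos: float     # Morgan, male map
    female_pos: float   # Morgan, female map


def read_marker_map(lines):
    """Parse marker map lines; columns beyond the fifth are ignored."""
    markers = []
    for line in lines:
        fields = line.split()
        if len(fields) < 5:
            continue  # skip blank or incomplete lines
        average_pos, male_pos, female_pos = (float(x) for x in fields[2:5])
        markers.append(
            Marker(fields[0], fields[1], average_pos, male_pos, female_pos))
    return markers


def map_length_cm(markers, chromosome):
    """Span of the average map on one chromosome, in centiMorgan."""
    positions = [m.average_pos for m in markers if m.chromosome == chromosome]
    return (max(positions) - min(positions)) * 100.0
```

Positions are kept in Morgan, as in the file. Only `map_length_cm` converts to centiMorgan.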

Genotypes
S0383 S0025 SW1354 SW1369 SLA S0102 SW352 SW63 S0101 SW764
890761  9  10   3   4   2   8   5   2   4  12   3   3   2   1  13  13   2   5   2   2
880607  6   6   2   2   3   4   2   2   1   2   2   6   2   1   7   6   2   4   4   6
890769 16  10   1   1   8   5   2   6   4  12   4   3   3   4  11  13   5   5   2   2
...

This file contains the animals' phenotypes at the markers. The first line gives the marker names; the
markers must belong to the marker map file. For each animal, a line gives its ID (as described in the
pedigree file) followed by its marker phenotypes, in the order given on the first line. Each
phenotype is made of 2 alleles, unordered. When an animal has no phenotype for a marker, both
alleles must be set to the missing value code defined in the analysis parameters.
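
A minimal Python sketch of a reader for this layout follows. It is not QTLMap code, and `read_genotypes` is a hypothetical name. The default missing code "0" is an assumption taken from the `opt_unknown_char=0` line of the example parameter file.

```python
def read_genotypes(lines, missing="0"):
    """Return {animal ID: {marker name: (allele, allele) or None}}."""
    rows = [line.split() for line in lines if line.strip()]
    marker_names = rows[0]  # first line: marker names
    genotypes = {}
    for fields in rows[1:]:
        animal, alleles = fields[0], fields[1:]
        if len(alleles) != 2 * len(marker_names):
            raise ValueError(
                f"{animal}: expected {2 * len(marker_names)} alleles, "
                f"found {len(alleles)}")
        per_marker = {}
        for i, name in enumerate(marker_names):
            a, b = alleles[2 * i], alleles[2 * i + 1]
            # Both alleles must carry the missing code, or neither.
            if (a == missing) != (b == missing):
                raise ValueError(
                    f"{animal}, {name}: only one allele is missing")
            per_marker[name] = None if a == missing else (a, b)
        genotypes[animal] = per_marker
    return genotypes
```

Alleles are kept as strings and in file order, since the two alleles of a phenotype are unordered.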

Traits
944217    7.1350  1  1     1.9320  1  1      .8730  1  1    35.8630  1  1    31.3220  1  1
944220    6.4060  1  1     2.0680  1  1      .8460  1  1    30.2670  1  1    26.6590  1  1
952658    6.5000  1  1     2.1440  1  1      .9060  1  1    30.6870  1  1    26.7960  1  1
...

This file gives the phenotypes of the traits to be analysed.
Only the progeny performances are considered in the analysis and must be given in the file.
For each animal, its ID (identical to the ID given in the pedigree file) is followed by information
about nuisance effects (fixed effect levels, covariate values) and then by three values for each
trait: the performance; a 0/1 variable (IP) which indicates whether (IP=1) or not (IP=0) the trait was
measured for this animal and must be included in the analysis; and a 0/1 variable (IC) which indicates
whether the performance was censored (IC=0) or not (IC=1). IC is needed for survival analysis (by
default IC=1).
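
The layout of a line (ID, nuisance values, then one triplet per trait) can be illustrated with the Python sketch below. It is not QTLMap code: `read_traits` and the dictionary keys are hypothetical, and the numbers of nuisance values and traits are passed in by the caller rather than read from the model file.

```python
def read_traits(lines, n_nuisance, n_traits):
    """Return {animal ID: (nuisance values, list of trait records)}."""
    expected = 1 + n_nuisance + 3 * n_traits
    records = {}
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # ignore blank lines
        if len(fields) != expected:
            raise ValueError(
                f"{fields[0]}: expected {expected} fields, "
                f"found {len(fields)}")
        animal = fields[0]
        nuisance = fields[1:1 + n_nuisance]
        rest = fields[1 + n_nuisance:]
        traits = []
        for i in range(n_traits):
            value, ip, ic = rest[3 * i:3 * i + 3]
            traits.append({
                "value": float(value),
                "measured": ip == "1",  # IP=1: include in the analysis
                "censored": ic == "0",  # IC=0: censored, IC=1: not censored
            })
        records[animal] = (nuisance, traits)
    return records
```

For the example file, which has no nuisance effect and 5 traits, the call would be `read_traits(lines, n_nuisance=0, n_traits=5)`.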

The parameter files

The main parameter file: p_analyse


#input files
in_map=carte             # the marker map
in_genealogy=genea       # pedigree
in_genotype=typage       # genotypes 
in_traits=perf           # phenotypes of F2

#another parameter file describing the statistical model of each trait
in_model=model           

#analysis options
opt_step=0.05                # analysis step : 5 cMorgan
opt_ndmin=20                 # number of progeny to consider full-sib families
opt_chromosome=7             # chromosome to analyze.
opt_unknown_char=0           # unknown genotype char
opt_minsirephaseproba = 0.80   # minimal paternal phase probability
opt_mindamphaseproba  = 0.10   # minimal maternal phase probability

#output/results
out_output=./OUTPUT/result       # main results: residual variances, estimates of QTL and haplotype effects under the tested hypothesis
out_summary=./OUTPUT/summary     # summary
out_phases=./OUTPUT/phases       # parental phases
out_lrtsires=./OUTPUT/lrts       # likelihood ratio test along the linkage group (main and by half-sib family)
out_lrtdams=./OUTPUT/lrtd        # likelihood ratio test along the linkage group (by full-sib family)
out_pded=./OUTPUT/pded           # grandparental segment transmission marginal probabilities
out_pdedjoin=./OUTPUT/pded_join  # grandparental segment transmission joint probabilities
out_pateff=./OUTPUT/pateff       # sire QTL effect estimates along the linkage group
out_mateff=./OUTPUT/mateff       # dam QTL effect estimates along the linkage group
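
The file above follows a simple key=value layout with # comments. The Python sketch below shows one way to read such a file for inspection. It is not QTLMap's own parser: `read_parameters` is a hypothetical name, and it assumes that # starts a comment anywhere on a line and that spaces around = are not significant (as in the `opt_minsirephaseproba = 0.80` line).

```python
def read_parameters(lines):
    """Parse key=value lines into a dict of strings, dropping # comments."""
    params = {}
    for line in lines:
        content = line.split("#", 1)[0].strip()  # remove trailing comment
        if not content:
            continue  # blank line or comment-only line
        key, sep, value = content.partition("=")
        if not sep:
            raise ValueError(f"not a key=value line: {line!r}")
        params[key.strip()] = value.strip()
    return params
```

Values are returned as strings. Converting `opt_step` to a number or resolving the output paths is left to the caller.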

The model file

5                   # number of traits
0 0                 # number of fixed effects, number of covariates
nofe nocov          # names of fixed effects and covariates
Bardiere r 0 0      # name of trait 1 : model : fixed, cov, QTL*fixed effect / r : real, i : integer, c : categorical
Imf      r 0 0      # ...
Panne    r 0 0
x2       r 0 0
x4       r 0 0

Analyse the dataset

  • Unitrait - Linkage Analysis without a model description (as in the example, no trait has a fixed effect or covariate)
    qtlmap --calcul=1
  • Multitraits - Linkage Analysis without a model description (as in the example, no trait has a fixed effect or covariate)
    qtlmap --calcul=5
  • Multitraits/Discriminant function Analysis - Linkage Analysis without a model description (as in the example, no trait has a fixed effect or covariate)
    qtlmap --calcul=6

QTLMAS 2011 Datasets (SNP example)

Description of the dataset

  • 20 half-sib families
  • 2000 progeny with phenotypes
  • 3200 progeny with genotypes
  • 5 chromosomes
  • 1998 SNPs per chromosome

download source:dataset-qtlmas-2011.tar.gz

Build the parameter files

Analyse the dataset

  • Unitrait - Linkage Analysis with SNP
    qtlmap --calcul=4 --snp
  • Unitrait - Linkage Disequilibrium with SNP
    qtlmap --calcul=26 --snp
  • Unitrait - Linkage Disequilibrium Linkage Analysis with SNP
    qtlmap --calcul=28 --snp

Support

Subscribe and post any message/question to the qtlmap-users list :