Dataset


» Download: Simulation Values (xls)


Description
The dataset consists of 2,025 individuals from two generations. All individuals have complete marker information. There are 453 SNP marker loci, randomly distributed over 5 chromosomes. Each chromosome is approximately 1 Morgan in length. The first 25 individuals are parents: 20 female and 5 male. The remaining 2,000 individuals are offspring in 100 full-sib (FS) families, one from each combination of a male and a female parent. Each FS family has 20 offspring.

Fifty FS families have been phenotyped; the other 50 FS families do not have phenotypes. Phenotypes were recorded at multiple time points. The phenotyped FS families were chosen such that each female parent has at least 40 phenotyped offspring, while each male parent has 100 phenotyped offspring.

The dataset is divided into four files.


-1- PHENOTYPE FILE: phenotypes.csv contains 6 columns corresponding to:

Individual_ID, Trait_value_time0, Trait_value_time132, Trait_value_time265, Trait_value_time397, Trait_value_time530.

The 5 trait values are measures of yield at 5 different times in the production period. These yield values can be taken to represent weight during the growth of an animal or biomass during the growth of a crop. (Additional information: the asymptotic values of individuals' yield range from 14 to 66.)
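A minimal sketch of reading the phenotype file into per-individual time series, using only the Python standard library. The measurement times come from the column names above. The presence of a header row, the function name `read_phenotypes` and the sample rows are assumptions made for illustration; the sample values are invented and are not taken from the real file.

```python
import csv
import io

# Measurement times taken from the column names of phenotypes.csv.
TIMES = [0, 132, 265, 397, 530]

def read_phenotypes(handle, has_header=True):
    """Return {individual_id: [(time, value), ...]} from a phenotypes.csv-like file."""
    reader = csv.reader(handle)
    if has_header:  # assumption: the first row holds the column names
        next(reader)
    records = {}
    for row in reader:
        if not row:
            continue  # skip blank lines
        ind_id = row[0].strip()
        values = [float(v) for v in row[1:6]]
        records[ind_id] = list(zip(TIMES, values))
    return records

# Invented rows for illustration only; not real data from the workshop files.
sample = io.StringIO(
    "Individual_ID,Trait_value_time0,Trait_value_time132,"
    "Trait_value_time265,Trait_value_time397,Trait_value_time530\n"
    "26,0.5,4.0,15.5,28.0,31.5\n"
    "27,0.4,3.0,12.0,22.5,26.0\n"
)
phenotypes = read_phenotypes(sample)
```

For the real file, `read_phenotypes(open("phenotypes.csv", newline=""))` would be the equivalent call, with `has_header=False` if the file turns out to have no header row.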

-2- PEDIGREE FILE: pedigree.csv contains 3 columns corresponding to:

Individual_ID, Parent1_ID, Parent2_ID

The first column identifies the individual. The second and third columns identify the female and male parents of each individual, respectively.
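Because each FS family is defined by one female and one male parent, the families can be recovered by grouping offspring on the (Parent1_ID, Parent2_ID) pair. The sketch below assumes a header row and assumes that parents of the 25 founders are coded as `0` or left empty; neither is stated in the description. The function name and the sample IDs are hypothetical.

```python
import csv
import io
from collections import defaultdict

def full_sib_families(handle, has_header=True, missing=("0", "")):
    """Group offspring IDs by their (female parent, male parent) pair."""
    reader = csv.reader(handle)
    if has_header:  # assumption: the first row holds the column names
        next(reader)
    families = defaultdict(list)
    for row in reader:
        if not row:
            continue
        ind_id, dam, sire = (field.strip() for field in row[:3])
        if dam in missing or sire in missing:
            continue  # assumption: founders have missing parent codes
        families[(dam, sire)].append(ind_id)
    return dict(families)

# Invented pedigree rows for illustration only.
sample = io.StringIO(
    "Individual_ID,Parent1_ID,Parent2_ID\n"
    "1,0,0\n"
    "2,0,0\n"
    "21,0,0\n"
    "26,1,21\n"
    "27,1,21\n"
    "28,2,21\n"
)
families = full_sib_families(sample)
```

With the real pedigree, one would expect this to yield 100 families of 20 offspring each, which is a useful check on the parsing assumptions.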

-3- HAPLOTYPE FILE: haplotypes.csv contains 454 columns of phased haplotypes:

Individual_ID, M1, M2, … , M453 (M1 = marker 1, M2 = marker 2, etc.).

The file contains 2 haplotypes per individual on separate lines (line 1 = maternal; line 2 = paternal).
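The two-lines-per-individual layout can be read by pairing consecutive rows, as in the sketch below. It assumes a header row and assumes alleles are stored as short tokens such as `1` and `2`; the actual allele coding is not given in the description, so the `counted` argument is a placeholder. Function names and sample rows are hypothetical.

```python
import csv
import io

def read_haplotypes(handle, has_header=True):
    """Return {individual_id: (maternal_alleles, paternal_alleles)}."""
    reader = csv.reader(handle)
    if has_header:  # assumption: the first row holds the column names
        next(reader)
    rows = [row for row in reader if row]
    if len(rows) % 2:
        raise ValueError("expected two haplotype lines per individual")
    haplotypes = {}
    # Line 1 of each pair is maternal, line 2 is paternal.
    for maternal, paternal in zip(rows[0::2], rows[1::2]):
        ind_id = maternal[0].strip()
        if paternal[0].strip() != ind_id:
            raise ValueError(f"unpaired haplotype lines near individual {ind_id}")
        haplotypes[ind_id] = (
            [a.strip() for a in maternal[1:]],
            [a.strip() for a in paternal[1:]],
        )
    return haplotypes

def allele_counts(maternal, paternal, counted="1"):
    """Count copies (0, 1 or 2) of the `counted` allele at each marker."""
    return [(m == counted) + (p == counted) for m, p in zip(maternal, paternal)]

# Invented rows with three markers, for illustration only.
sample = io.StringIO(
    "Individual_ID,M1,M2,M3\n"
    "26,1,2,1\n"
    "26,2,2,1\n"
)
haplotypes = read_haplotypes(sample)
genotype = allele_counts(*haplotypes["26"])
```

Keeping the maternal and paternal haplotypes separate preserves the phase information for linkage analysis, while the allele counts are the usual input for association and genomic selection models.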

-4- MAP FILE: map.csv contains 3 columns corresponding to:

Marker_ID, Chromosome_ID, Position_M

The marker positions are given in Morgans.
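As a sketch of how the map file might be used, the code below groups markers by chromosome, orders them by position, and converts the gaps between adjacent markers to centimorgans (1 Morgan = 100 cM). A header row is assumed, and the function names and sample rows are hypothetical.

```python
import csv
import io
from collections import defaultdict

def markers_by_chromosome(handle, has_header=True):
    """Return {chromosome_id: [(position_in_morgans, marker_id), ...]}, sorted by position."""
    reader = csv.reader(handle)
    if has_header:  # assumption: the first row holds the column names
        next(reader)
    chromosomes = defaultdict(list)
    for row in reader:
        if not row:
            continue
        marker, chrom, pos = (field.strip() for field in row[:3])
        chromosomes[chrom].append((float(pos), marker))
    for markers in chromosomes.values():
        markers.sort()
    return dict(chromosomes)

def adjacent_spacings_cm(markers):
    """Distances between neighbouring markers, converted from Morgans to cM."""
    positions = [pos for pos, _ in markers]
    return [(b - a) * 100.0 for a, b in zip(positions, positions[1:])]

# Invented map rows for illustration only.
sample = io.StringIO(
    "Marker_ID,Chromosome_ID,Position_M\n"
    "M1,1,0.00\n"
    "M3,1,0.75\n"
    "M2,1,0.25\n"
    "M4,2,0.10\n"
)
genetic_map = markers_by_chromosome(sample)
spacings = adjacent_spacings_cm(genetic_map["1"])
```

With 453 markers over roughly 5 Morgans, the average spacing should come out near 1.1 cM, although individual gaps will vary because the markers are randomly distributed.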


Analysis
The phenotyped FS families can be used to detect QTL and/or to train a model for genomic selection. The remaining (non-phenotyped) FS families form the validation set, for which breeding values are predicted using genomic selection methods.
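The training/validation split described above could be derived from the data as sketched below: a family goes to the training set if all of its offspring have phenotypes, and to the validation set if none do. The inputs are assumed to be a mapping from parent pairs to offspring IDs and a set of phenotyped IDs; the function name and the example values are hypothetical.

```python
def split_families(families, phenotyped_ids):
    """Split FS families into (training, validation) by phenotype availability.

    families: {(female_parent, male_parent): [offspring_id, ...]}
    phenotyped_ids: set of individual IDs that have phenotype records
    """
    training, validation = {}, {}
    for parents, offspring in families.items():
        n_phenotyped = sum(1 for ind in offspring if ind in phenotyped_ids)
        if n_phenotyped == len(offspring):
            training[parents] = offspring
        elif n_phenotyped == 0:
            validation[parents] = offspring
        else:
            # The description implies whole families are either phenotyped or not.
            raise ValueError(f"family {parents} is only partly phenotyped")
    return training, validation

# Invented example: two small families, one phenotyped and one not.
families = {
    ("1", "21"): ["26", "27"],
    ("2", "21"): ["28", "29"],
}
phenotyped_ids = {"26", "27"}
training, validation = split_families(families, phenotyped_ids)
```

On the real data this should give 50 training and 50 validation families; the error branch guards against a mismatch between the pedigree and phenotype files.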


Reporting

QTL detection:
The dataset is simulated to allow the 50 phenotyped FS families to be used for QTL detection (by association, linkage or combinations thereof). For comparison of results we ask you to report the estimated positions and explained variances for QTL that affect the phenotypic trait in this population.

Breeding Value Prediction:
No phenotypes are given for 50 FS families, so that these can be used for prediction of breeding values from marker data. For comparison of results we ask you to report the predicted breeding values of all 1,000 non-phenotyped FS individuals at time 600.


Questions
If you have any questions concerning the dataset please contact: qtlmas2009@wur.nl


  