The BLUEPRINT Project is a high-impact FP7 project aiming to produce a blueprint of haemopoietic epigenomes. Our goal is to apply highly sophisticated functional genomics analysis to a clearly defined set of primarily human samples from healthy and diseased individuals, and to provide at least 100 reference epigenomes to the scientific community. This resource-generating activity will be complemented by research into blood-based diseases, including common leukaemias and autoimmune disease (Type 1 Diabetes), by discovery and validation of epigenetic markers for diagnostic use, and by epigenetic target identification. This may eventually lead to the development of novel and more individualised medical treatments.
The Blueprint consortium expects this data to be valuable to other researchers. In keeping with the Fort Lauderdale principles, data users may use the data for many studies, but are expected to allow the data producers to make the first presentations and to publish the first paper with global analyses of the data. Our full data reuse statement can be found on our website.
|Raw data||Data archives (EGA & ENA)|
|Processed data||FTP site|
|Genome browser||Genomatix browser|
|Genome browser||BLUEPRINT Track Hub on the UCSC browser|
|Genome browser||BLUEPRINT Track Hub on the Ensembl browser|
The majority of samples sequenced by Blueprint are consented for release via a managed access system. To facilitate this we have archived the data in the EGA. Users can apply to download data. The process for this can be found on our DAC applications page. Data for samples that do not require managed access have been archived with the ENA. In each case, links to the raw data can be found through the experiment grid.
|Experiment Type||Data Type||File Format||Example|
|RNA-Seq||Quantification||GTF||C0010K Monocyte transcript quantification|
|RNA-Seq||Alignment Signal||BigWig||C0010K Monocyte plus strand signal|
|ChIP-Seq||Peak Calls||BigBed||C0010K Monocyte H3K4me1 peak calls|
|ChIP-Seq||Alignment Signal||BigWig||C0010K Monocyte H3K4me1 signal|
|DNase1-Seq||Hotspots||BigBed||C0010K Monocyte DNase hotspots|
|DNase1-Seq||Alignment Signal||BigWig||C0010K Monocyte DNase signal|
|WGS Bisulphite Seq||Hypo-methylated Regions||BigBed||C0010K Monocyte hypo methylation calls|
|WGS Bisulphite Seq||Hyper-methylated Regions||BigBed||C0010K Monocyte hyper methylation calls|
|WGS Bisulphite Seq||Alignment Signal||BigWig||C0010K Monocyte methylation call signal|
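The table above can be captured as a small lookup, which may be handy when checking that a downloaded file has the format expected for its experiment and data type. This is only an illustrative sketch: the dictionary is transcribed from the table, while the names `PROCESSED_FORMATS`, `expected_format` and `data_types_with_format` are made up for this example and are not part of any Blueprint tool.

```python
# Expected file format for each (experiment type, data type) pair,
# transcribed from the processed-data table above.
PROCESSED_FORMATS = {
    ("RNA-Seq", "Quantification"): "GTF",
    ("RNA-Seq", "Alignment Signal"): "BigWig",
    ("ChIP-Seq", "Peak Calls"): "BigBed",
    ("ChIP-Seq", "Alignment Signal"): "BigWig",
    ("DNase1-Seq", "Hotspots"): "BigBed",
    ("DNase1-Seq", "Alignment Signal"): "BigWig",
    ("WGS Bisulphite Seq", "Hypo-methylated Regions"): "BigBed",
    ("WGS Bisulphite Seq", "Hyper-methylated Regions"): "BigBed",
    ("WGS Bisulphite Seq", "Alignment Signal"): "BigWig",
}


def expected_format(experiment_type, data_type):
    """Return the file format listed for this combination, or None if absent."""
    return PROCESSED_FORMATS.get((experiment_type, data_type))


def data_types_with_format(file_format):
    """List the (experiment type, data type) pairs stored in the given format."""
    return sorted(key for key, fmt in PROCESSED_FORMATS.items() if fmt == file_format)


print(expected_format("ChIP-Seq", "Peak Calls"))
print(data_types_with_format("GTF"))
```

A combination that does not appear in the table, such as RNA-Seq peak calls, yields `None` rather than a guess.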
Secondary analysis results are made available as part of the data release cycle. The methods and how to access the results are listed on the secondary analysis page.
The FTP site has three major sections, listed here and described in more detail below:
- data : This directory contains all the processed data files described in the above table
- release : This directory contains files specific to a particular release, such as meta data and indexes
- reference : This directory contains reference data sets used for our analysis, e.g. a GENCODE gene set or reference assembly
The data directory contains all processed data files as described in the above table. New files will be added each release. Subdirectories are organised by species, tissue type, donor, cell type and data type. The filename format follows this general form:
The freeze date in the filename should match the first freeze in which the file was produced.
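The subdirectory organisation described above (species, tissue type, donor, cell type, data type) can be sketched as a small path parser. This is a minimal sketch that assumes exactly those five directory levels followed by a file name; the class `DataLocation`, the function `parse_data_path` and every path component in the example are hypothetical placeholders, not real entries on the FTP site.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass(frozen=True)
class DataLocation:
    """One processed file, described by the directory levels it sits under."""
    species: str
    tissue_type: str
    donor: str
    cell_type: str
    data_type: str
    filename: str


def parse_data_path(path):
    """Split a path relative to the data directory into its levels.

    Assumes five subdirectory levels followed by a file name.
    """
    parts = PurePosixPath(path).parts
    if len(parts) != 6:
        raise ValueError(
            f"expected 5 directories and a file name, got {len(parts)} parts"
        )
    return DataLocation(*parts)


# Hypothetical example path; every component is a placeholder.
loc = parse_data_path(
    "species_x/tissue_y/donor_z/cell_type_a/data_type_b/example_file.bw"
)
print(loc.donor, loc.data_type)
```

Raising an error on an unexpected depth, rather than guessing, makes it obvious when a path does not follow the assumed layout.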
The most recent release can be found at current_release. Each release directory contains an index file (a list of all files for that specific release), a description of the analysis pipeline, a Track Hub directory and a readme file describing the current index file.
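Once an index file has been downloaded, selecting the files of interest might look like the sketch below. It assumes, without confirmation from this page, that the index is tab-separated with a header row; the column names `FILE` and `FORMAT` and the sample rows are invented for illustration, so consult the readme in the release directory for the actual layout.

```python
import csv
import io


def read_index(handle):
    """Read a tab-separated index with a header row into a list of dicts."""
    return list(csv.DictReader(handle, delimiter="\t"))


def select_rows(rows, column, value):
    """Keep only the rows whose given column equals the given value."""
    return [row for row in rows if row.get(column) == value]


# Made-up index content; the column names and file names are assumptions.
sample = (
    "FILE\tFORMAT\n"
    "placeholder_1.bw\tBigWig\n"
    "placeholder_2.bb\tBigBed\n"
)

rows = read_index(io.StringIO(sample))
bigwig_rows = select_rows(rows, "FORMAT", "BigWig")
print([row["FILE"] for row in bigwig_rows])
```

With a real index, `io.StringIO(sample)` would be replaced by an open file handle, and the column names adjusted to match the readme.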
A description for the data index and each analysis pipeline can be found here:
The reference directory contains the reference materials used for our analysis pipelines. The subdirectories are also dated to allow us to update our reference files. The analysis pipeline readmes should indicate which reference files were used for a particular analysis pipeline.
The research leading to these results has received funding from the European Union's Seventh Framework Programme (FP7/2007-2013) under grant agreement [no 282510 - BLUEPRINT]
Blueprint is part of the International Human Epigenome Consortium