  About ENCODE Data

The Encyclopedia of DNA Elements (ENCODE) Consortium is an international collaboration of research groups funded by the National Human Genome Research Institute (NHGRI). The goal of ENCODE is to build a comprehensive parts list of functional elements in the human genome, including elements that act at the protein and RNA levels, and regulatory elements that control the cells and circumstances in which a gene is active.

ENCODE data are now available for the entire human genome. All ENCODE data are free and available for immediate use.

To search for ENCODE data related to your area of interest and set up a browser view, use the UCSC Track Search tool (Advanced features). The Data Summary shows a comprehensive listing of ENCODE data that is released or in preparation. Early access to pre-release ENCODE data is provided at http://genome-preview.ucsc.edu. If you would like to receive notifications of ENCODE data releases and related news by email, subscribe to the encode-announce mailing list. For more information about how to access this data, see the free online OpenHelix ENCODE tutorial.

To complement the human ENCODE data, Mouse ENCODE experiments are currently underway. Early access to this data is available on the Mouse mm9/NCBI37 browser at the UCSC preview site. The Mouse ENCODE Data Summary lists experiments that are planned or in progress.

All ENCODE data is freely available for download and analysis. However, before publishing research that uses ENCODE data, please read the ENCODE Data Release Policy, which places some restrictions on publication use of data for nine months following data release. Read more about ENCODE data at UCSC.


15 Sept 2011 - Summer ENCODE human data releases: GENCODE Genes V7, HAIB TFBS, Stanf Nucleosome, HAIB Genotype, SUNY RIP-seq, SUNY Switchgear

Six tracks of human ENCODE data were released in late summer on the hg19 genome browser:

Gene Annotations from ENCODE/GENCODE Version 7: The GENCODE Version 7 Genes track shows high-quality manual annotations merged with evidence-based automated annotations across the entire human genome. This version of GENCODE provides an increase of 25% in manual curation of transcripts over the previous (V4) version at UCSC.

Transcription Factor Binding Sites by ChIP-seq from ENCODE/HudsonAlpha: This track displays 185 experiments identifying transcription factor binding sites in multiple cell lines by chromatin immunoprecipitation followed by high-throughput sequencing. A total of 56 transcription factor target antibodies and 20 cell types are represented.

Nucleosome Position by MNase-seq from ENCODE/Stanford/BYU: This track displays nucleosome position density maps from micrococcal nuclease digested chromatin in GM12878 and K562 cell lines. In the context of the ENCODE project, nucleosome positioning data are particularly valuable for analysis of the relationship between transcription factor binding, histone modifications, and gene activity.

Genotype (CNV and SNP) by Illumina 1MDuo and CBS from ENCODE/HudsonAlpha: This track displays copy number variation (CNV) as determined by the Illumina Human 1M-Duo Infinium HD BeadChip assay and circular binary segmentation (CBS). Allele frequency and single nucleotide polymorphism (SNP) data generated by the experiment are available for download.

RIP-seq from ENCODE/SUNY Albany: This track displays transcriptional fragments associated with RNA binding proteins in K562 and GM12878 cell lines, using Ribonomic profiling followed by high-throughput sequencing.

RNA Binding Protein Associated RNA by SwitchGear from ENCODE/SUNY Albany: This track displays 3' UTR regions associated with RNA binding proteins in the HT-1080 cell line as evidenced by reporter assays.

8 July 2011 - Mouse ENCODE data releases: DNaseI hypersensitivity (UW DNaseI HS) and histone modifications (LICR Histone)

Two tracks of ENCODE data were released on the mm9 genome browser, from the UCSD/Ludwig Institute for Cancer Research and the University of Washington Mouse ENCODE groups.

Histone Modifications by ChIP-seq from ENCODE/LICR: This track shows a comprehensive survey of cis-regulatory elements in the mouse genome, obtained by using ChIP-seq to identify transcription factor binding sites and chromatin modification profiles in many mouse tissues and primary cells, including bone marrow, cerebellum, cortex, heart, kidney, liver, lung, spleen, mouse embryonic fibroblast cells (MEFs) and embryonic stem (ES) cells.

DNaseI Hypersensitivity by Digital DNaseI from ENCODE/University of Washington: This track shows DNaseI sensitivity and DNaseI hypersensitive sites, measured genome-wide in mouse tissues and cell lines using the Digital DNaseI methodology.

1 July 2011 - ENCODE data releases: Broad ChromHMM, Open Chrom Synth, UChicago TFBS, Duke Affy Exon

Four tracks of ENCODE production data and analysis were released in June, from the Broad Institute (Kellis lab), OpenChromatin (Duke, UNC, UT-A) and University of Chicago (White Lab) ENCODE groups. This is the first data release from the University of Chicago ENCODE group, which joined the Consortium as part of the NIH ARRA stimulus grants.

Chromatin State Segmentation by HMM from ENCODE/Broad: This track, and the companion hg18 track, display chromatin state segmentation of the human genome into fifteen states grouped to predict functional elements.

DNaseI/FAIRE/ChIP Synthesis from ENCODE/OpenChrom(Duke/UNC/UTA): This track displays a synthesis of open chromatin regions and binding of selected regulatory factors, based on three complementary methodologies.

Transcription Factor Binding Sites by Epitope-Tag ChIP-seq from ENCODE/University of Chicago: This track maps human transcription factor binding sites genome-wide using transcription factors expressed as GFP-tagged fusion proteins after BAC recombineering.

Affymetrix Exon Array from ENCODE/Duke: This track displays human tissue microarray data using Affymetrix Human Exon 1.0 ST expression arrays, including annotations at the gene level.

24 June 2011 - ENCODE RNA-seq data standards now available

The ENCODE Consortium has finalized 'Standards, Guidelines and Best Practices for RNA-Seq V1.0', as part of the Consortium's continuing effort to generate data standards. The document is available at the ENCODE portal via the Data Standards link.

RNA-Seq is a directed experimental approach aimed at characterizing transcription in biological samples. This document presents a set of guidelines and standards focused on best practices for creating 'reference quality' transcriptome measurements.

1 June 2011 - ENCODE data releases in April and May

Five tracks of ENCODE production and analysis data were released in April and May on the GRCh37/hg19 human assembly from the Caltech, Broad Institute, HudsonAlpha Institute for Biotechnology, Duke University (Open Chromatin), SUNY Albany, University of Washington, Boston University, Stanford/Yale/Davis/Harvard and UCSC ENCODE groups:

Integrated Regulation from ENCODE: This collection of tracks displays integrated signal and clustering annotations from multiple cell lines, using ENCODE primary data from RNA-seq, ChIP-seq, and DNase-seq assays. This track is a companion to the hg18 ENCODE Regulation track.

Open Chromatin by DNaseI HS from ENCODE/OpenChrom(Duke University): This track displays DNaseI hypersensitivity evidence as part of the set of four Open Chromatin tracks.

RNA Binding Protein Associated RNA by RIP-chip GeneST from ENCODE/SUNY Albany: This track displays transcriptional fragments associated with RNA binding proteins in different cell lines using RIP-Chip (Ribonomic) profiling on Affymetrix GeneChip® Human Gene 1.0 ST Arrays.

DNaseI Hypersensitivity by Digital DNaseI from ENCODE/University of Washington: This track shows DNaseI sensitivity measured genome-wide in different cell lines using the Digital DNaseI methodology.

OrChID Predicted DNA Cleavage Sites from ENCODE/Boston Univ (Tullius Lab): This track displays predicted hydroxyl radical cleavage intensity on naked DNA for each nucleotide in the genome.

Histone Modifications by ChIP-seq from ENCODE/Stanford/Yale/Davis/Harvard: This track displays maps of histone modifications genome-wide using ChIP-seq in different cell lines.

22 April 2011 - ENCODE data releases: UTA TFBS, UW CTCF, UNC FAIRE, RIKEN CAGE Loc, UW Affy Exon, UW DNaseI DGF & Duke DNaseI HS

Seven tracks of ENCODE data on the GRCh37/hg19 human assembly were released in March and April from the University of Texas at Austin (Open Chromatin), University of Washington, University of North Carolina (Open Chromatin), RIKEN, and Duke University (Open Chromatin) ENCODE groups.

