The human genome, once an undecipherable code, is now more easily and completely understood than ever. After the completion of the first human genome sequence in 2003, researchers were left with the monumental task of figuring out which parts were functionally important. It became apparent that only about one percent of the human genome codes for proteins, the molecules that play important roles in cells throughout our bodies. The vast majority of our genome is non-coding, and these often-overlooked portions of DNA have since emerged as key players in the field of genetics.
Many of these non-coding regions act as intricate DNA switches, controlling when and where genes are turned on and off. This process, called gene regulation, is essential to all normal functions of our bodies, including cell differentiation, response to environmental changes, development, and growth. When gene regulation is disrupted, disease can result. By understanding gene regulation and how it changes in response to the environment, during differentiation, and during development, researchers are unlocking new frontiers in medicine, with the potential to develop targeted therapies for a wide range of conditions.
“ENCODE’s legacy lies not only in its groundbreaking discoveries but also in its commitment to open data, fostering the strong spirit of collaborative innovation we see today.” -Rick Myers
The ENCODE Project: Pioneering breakthroughs in gene regulation
After the first human genome was sequenced, a new era of genetic discovery began to unfold. A major contributor to this endeavor was a team led by Faculty Investigator Rick Myers, PhD, who played a significant leadership role in the Human Genome Project. One of the most notable contributions of Myers’ lab has been its pivotal role in the Encyclopedia of DNA Elements (ENCODE) Project. This ambitious international effort aimed to characterize all of the functional elements within the human genome, including both protein-coding genes and non-coding elements that regulate gene activity. It also sought to understand how these components interact in biological processes.
The ENCODE Consortium, funded by the National Human Genome Research Institute (NHGRI), spanned two decades and included more than 30 institutions and 500 scientists worldwide. It made monumental strides in advancing our knowledge of the human genome while maintaining the large-scale, open-access data release championed by the Human Genome Project. The wealth of publicly accessible data generated by the Consortium and deposited in the ENCODE data repository has empowered thousands of researchers worldwide to make discoveries in human diseases, from cardiovascular disease to bipolar disorder. ENCODE also produced numerous protocols for genome-wide functional analysis that have become standards in the field.
The project led to the identification of more than a million DNA switches. It also made significant headway in understanding the 1,600 or so DNA-binding proteins called transcription factors, clarifying how they turn genes on or off and how they set the levels of gene expression in different cell types and at different times during development.

Throughout the ENCODE Project, Myers and his lab, in collaboration with Dr. Eric Mendenhall at HudsonAlpha and Dr. Barbara Wold at Caltech, conducted the largest study to date of transcription factors expressed at normal levels. They generated hundreds of genome-wide datasets that measure transcription factor binding sites in the human genome, identify RNA transcripts in mouse and human cells, and identify DNA methylation sites throughout the human genome. By expanding the catalog of known transcription factors and their functions, Myers and his lab helped unlock new avenues for medical innovation, enabling researchers to navigate the complexities of human biology and identify potential therapeutic targets.
ENCODE at a Glance
The Myers Lab joined ENCODE when it launched in 2003, first participating in the pilot project aimed at annotating one percent of the human genome and then joining the full genome effort in 2007. The lab moved to HudsonAlpha in 2008.
The Consortium produced more than 15 terabytes of raw data.
Through ENCODE, dozens of protocols for assessing the function of genomic regions were developed or adapted, several of them by the Myers group, including ChIP-seq, RNA sequencing, CETCH-seq, and DNA methylation profiling.
The Consortium discovered more than a million DNA switches and hundreds of transcription factors that bind them.
Beyond ENCODE: Searching for brain-specific transcription factors
The ENCODE Project’s legacy will live on for decades as researchers utilize the vast amounts of data generated by the Consortium. For their part, Myers and his lab are just getting started. In late 2023, they published a comprehensive analysis of the binding of 680 human transcription factors and how they regulate gene expression patterns in HepG2 cells, a type of human liver cancer cell with extensive ENCODE data.
Transcription factors play a crucial role in gene regulation by binding to cis-regulatory elements (CREs), short stretches of DNA near the genes they regulate. ChIP-seq (chromatin immunoprecipitation followed by sequencing) is a powerful technique that maps where a given protein binds across the genome. By leveraging the vast amount of ENCODE ChIP-seq data, Myers and his team found that most known candidate CREs are bound by at least one of the 680 assayed DNA-associated proteins, suggesting that this research has identified a significant portion of the regulatory elements in the human genome.
The study not only confirmed the binding of many known transcription factors to their target genes but also uncovered novel transcription factors and regulatory relationships. These findings provide valuable insights into the complex network of gene regulation in human cells and may have significant implications for understanding human disease and developing targeted therapies. The work highlights the enduring impact of the ENCODE Project and sets the stage for future discoveries in the field of gene regulation.
While large-scale datasets like ENCODE have provided valuable insights into gene regulation, their reliance on cancer-derived cell lines often limits their ability to accurately capture the complexities of gene expression in diverse human tissues, particularly the brain. In early 2024, Myers and his lab made a significant contribution to the field of neuroscience by publishing BrainTF, a comprehensive resource that maps the binding sites of more than 100 transcription factors in human postmortem brain tissue. The study represents the largest dataset to date on transcription factor binding in human brain tissue, offering unprecedented insights into the intricate regulatory mechanisms governing gene expression in the brain.
This brain study identified many novel transcription factor binding sites not found in existing databases, highlighting the unique aspects of gene regulation in the brain. The data generated from this study are publicly available. By providing open access to this comprehensive resource from difficult-to-obtain brain tissues, the researchers aim to empower the scientific community to delve deeper into the intricate mechanisms of brain function and dysfunction, ultimately accelerating the development of novel therapeutic strategies for psychiatric illnesses. The Myers Lab is now using these data, along with additional experiments, to understand the regulation of genes that cause neurodegenerative diseases and to explore ways of possibly mitigating the effects of these mutant genes.
