Doubt on genomic code for nucleosome positioning?

I was reading "A genomic code for nucleosome positioning" (by Eran Segal et al.), and I have two doubts.

As far as I understand, figure (b) in this image from the paper plots the fraction (3-bp moving average) of AA/TA/TT dinucleotides at each position of the nucleosome DNA sequences they analysed statistically. What is a 3-bp moving average here? I also don't understand how they chose the 0th position (the so-called dyad). And what does it mean to have oscillations (correlation?) in this graph?

UPDATE: I am adding some supplementary information on how the dinucleotide fractions are found. I still don't understand why the fraction is computed this way.

The dyad is the centre of the DNA that is wrapped around the nucleosome core (it is basically the centre of symmetry of the nucleosome). It is common practice to set it at position 0, so that positions on the incoming half of the DNA are negative and positions on the outgoing half are positive.

By oscillations the authors mean that there is a periodic repeat of A/T dinucleotides. IMO it is actually not correct to call it oscillation, a term mostly used in a time-course, dynamical sense.

I guess this is what is meant by the 3-nt moving average:

You have conditional dinucleotide probabilities for each position (as shown in the figure). Now you calculate the A/T dinucleotide probability at each position, which is:

P[A/T] = P_AA + P_AT + P_TT + P_TA

Now you find the moving average over 3 steps:

MA(n) = (1/3) × (P[A/T](n) + P[A/T](n-1) + P[A/T](n-2)), where n is the nth position of the DNA
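
To make this concrete, here is a minimal Python sketch of the calculation as I understand it; it is not the authors' actual pipeline. It assumes you already have nucleosome sequences aligned so that the same index means the same distance from the dyad. The function names and the toy sequences are made up for illustration. The dinucleotide set defaults to the four terms in the formula above; pass {"AA", "TT", "TA"} instead to match the figure as described in the question.

```python
# Dinucleotides counted as "A/T" (the four terms of the formula above).
AT_DINUCS = frozenset({"AA", "AT", "TT", "TA"})


def dinucleotide_fraction(aligned_seqs, dinucs=AT_DINUCS):
    """For each position i, the fraction of sequences whose dinucleotide
    starting at i belongs to `dinucs`. All sequences must share one length."""
    length = len(aligned_seqs[0])
    fractions = []
    for i in range(length - 1):
        hits = sum(1 for seq in aligned_seqs if seq[i:i + 2] in dinucs)
        fractions.append(hits / len(aligned_seqs))
    return fractions


def moving_average(values, window=3):
    """Trailing moving average: entry for position n averages n, n-1, ..., n-window+1.
    The first window-1 positions have no full window and are skipped."""
    return [
        sum(values[n - window + 1:n + 1]) / window
        for n in range(window - 1, len(values))
    ]


# Toy alignment (hypothetical data, far too small to show any periodicity).
toy_seqs = ["AATTG", "GCTAA", "AAGCT"]
toy_fractions = dinucleotide_fraction(toy_seqs)
toy_smoothed = moving_average(toy_fractions)
```

The smoothing only damps position-to-position noise; it does not create the roughly 10-bp repeat, which would have to be present in the raw fractions already.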

Histone modifications are key components of chromatin packaging but whether they constitute a ‘code’ has been contested. We believe that the central issue is causality: are histone modifications responsible for differences between chromatin states, or are differences in modifications mostly consequences of dynamic processes, such as transcription and nucleosome remodeling? We find that inferences of causality are often based on correlation and that patterns of some key histone modifications are more easily explained as consequences of nucleosome disruption in the presence of histone modifying enzymes. We suggest that the 35-year-old DNA accessibility paradigm provides a mechanistically sound basis for understanding the role of nucleosomes in gene regulation and epigenetic inheritance. Based on this view, histone modifications and variants contribute to diversification of a chromatin landscape shaped by dynamic processes that are driven primarily by transcription and nucleosome remodeling.

Scientists discover a genetic code for organizing DNA within the nucleus

DNA - the long, thin molecule that carries our hereditary material - is compressed around protein scaffolding in the cell nucleus into tiny spheres called nucleosomes. The bead-like nucleosomes are strung along the entire chromosome, which is itself folded and packaged to fit into the nucleus. What determines how, when and where a nucleosome will be positioned along the DNA sequence? Dr. Eran Segal and research student Yair Field of the Computer Science and Applied Mathematics Department at the Weizmann Institute of Science have succeeded, together with colleagues from Northwestern University in Chicago, in cracking the genetic code that sets the rules for where on the DNA strand the nucleosomes will be situated. Their findings appeared today in Nature.

The precise location of the nucleosomes along the DNA is known to play an important role in the cell's day to day function, since access to DNA wrapped in a nucleosome is blocked for many proteins, including those responsible for some of life's most basic processes. Among these barred proteins are factors that initiate DNA replication, transcription (the transfer of genetic information from DNA to RNA) and DNA repair. Thus, the positioning of nucleosomes defines the segments in which these processes can and can't take place. These limitations are considerable: Most of the DNA is packaged into nucleosomes. A single nucleosome contains about 150 genetic bases (the "letters" that make up a genetic sequence), while the free area between neighboring nucleosomes is only about 20 bases long. It is in these nucleosome-free regions that processes such as transcription can be initiated.

For many years, scientists have been unable to agree whether the placement of nucleosomes in live cells is controlled by the genetic sequence itself. Segal and his colleagues managed to prove that the DNA sequence indeed encodes "zoning" information on where to place nucleosomes. They also characterized this code and then, using the DNA sequence alone, were able to accurately predict a large number of nucleosome positions in yeast cells.

Segal and his colleagues accomplished this by examining around 200 different nucleosome sites on the DNA and asking whether their sequences have something in common. Mathematical analysis revealed similarities between the nucleosome-bound sequences and eventually uncovered a specific "code word." This "code word" consists of a periodic signal that appears every 10 bases on the sequence. The regular repetition of this signal helps the DNA segment to bend sharply into the spherical shape required to form a nucleosome. To identify this nucleosome positioning code, the research team used probabilistic models to characterize the sequences bound by nucleosomes, and they then developed a computer algorithm to predict the encoded organization of nucleosomes along an entire chromosome.
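
One generic way to check a positional profile for a repeat of this kind is autocorrelation. The Python sketch below is illustrative only and is not the probabilistic model the research team used. The profile is synthetic and has a period of 10 by construction, and the function name is hypothetical.

```python
import math


def autocorrelation(values, lag):
    """Normalized autocorrelation of a 1-D profile at the given lag."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values)
    cov = sum((values[i] - mean) * (values[i + lag] - mean) for i in range(n - lag))
    return cov / var


# Synthetic profile: a signal that peaks every 10 positions (assumed data).
profile = [0.5 + 0.1 * math.cos(2 * math.pi * i / 10) for i in range(140)]

# The lag with the strongest self-similarity estimates the repeat length.
best_lag = max(range(2, 21), key=lambda lag: autocorrelation(profile, lag))
```

On real data the profile would be a per-position dinucleotide frequency taken across many aligned nucleosome sequences, and the peak would be much less sharp than in this idealized case.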

The team's findings provided insight into another mystery that has long been puzzling molecular biologists: How do cells direct transcription factors to their intended sites on the DNA, as opposed to the many similar but functionally irrelevant sites along the genomic sequence? The short binding sites themselves do not contain enough information for the transcription factors to discern between them. The scientists showed that basic information on the functional relevance of a binding site is at least partially encoded in the nucleosome positioning code: The intended sites are found in nucleosome-free segments, thereby allowing them to be accessed by the various transcription factors. In contrast, spurious binding sites with identical structures that could potentially sidetrack transcription factors are conveniently situated in segments that form nucleosomes, and are thus mostly inaccessible.

Since the proteins that form the core of the nucleosome are among the most evolutionarily conserved in nature, the scientists believe the genetic code they identified should also be conserved in many organisms, including humans. Several diseases, such as cancer, are typically accompanied or caused by mutations in the DNA and the way it organizes into chromosomes. Such mutational processes may be influenced by the relative accessibility of the DNA to various proteins and by the organization of the DNA in the cell nucleus. Therefore, the scientists believe that the nucleosome positioning code they discovered may aid scientists in the future in understanding the mechanisms underlying many diseases.

Dr. Eran Segal's research is supported by the Arie and Ida Crown Memorial Charitable Fund and the Estelle Funk Foundation.

The Weizmann Institute of Science in Rehovot, Israel, is one of the world's top-ranking multidisciplinary research institutions. Noted for its wide-ranging exploration of the natural and exact sciences, the Institute is home to 2,500 scientists, students, technicians and supporting staff. Institute research efforts include the search for new ways of fighting disease and hunger, examining leading questions in mathematics and computer science, probing the physics of matter and the universe, creating novel materials and developing new strategies for protecting the environment.

Thirst for Science

In 2006, Segal et al. published a paper in which they attempted to model and predict nucleosome positioning based purely on localized sequence signals. In the paper, they compared their model to a number of previously published datasets (mostly low-resolution or small datasets) in order to claim an accuracy of about 50%. In a later paper, Narlikar et al. (a group independent of the Segal work) showed that positioning information from the Segal model can be informative when looking at the regulatory network.

A couple of months ago, I saw Jonathan Widom give a talk in which he discussed the initial results of the follow-up work to Segal’s original model. Since the Segal paper, a number of groups have published larger-scale and higher-resolution mappings of nucleosome positioning, including the Lee et al. paper (2007). These have provided larger and better datasets, which have been used to improve the models of positioning. While I’ve not yet seen the newer model Widom spoke about published, it looked to contain promising improvements to the original.

An entirely different perspective on nucleosomes has to do with histone dynamics. I’ve written before about some elegant work on Histone Turnover, but recently Shivaswamy et al. took a different approach to the problem. They compared the positioning of nucleosomes in normal versus heat shock conditions and showed some interesting transcription-related changes. They also put forth a model in which some nucleosomes are well positioned, with neighboring nucleosomes simply packing in around these well-defined ones. I know the authors of this paper and they are interested in exploring nucleosome changes in a variety of conditions.

Two additional perspectives are on the horizon: the effects of individual genes on positioning and the pliability of positioning through evolution. Desiree Tillo gave an interesting talk at the recent CSHL Systems Biology meeting on their follow-up work to Lee et al. (2007). Namely, they are examining nucleosome positioning in a large number of mutant strains (conditional alleles of essentials and deletions of nonessentials) as well as when yeast are challenged with small molecule inhibitors of histone-modifying enzymes. I am aware of at least two groups who are working on the more evolution-based perspective, but only in the very early stages.

I am definitely NOT an expert in this field, as my current knowledge stems largely from these papers and a few talks. Currently this field is hot and progress is being made quickly, so I’m probably missing several key papers in the area (suggestions for reading are welcome). Considering the likely influence of nucleosomes on regulation, it is a topic I will continue to follow with some interest.

Nucleosome positioning in yeasts: methods, maps, and mechanisms

Eukaryotic nuclear DNA is packaged into nucleosomes. During the past decade, genome-wide nucleosome mapping across species revealed the high degree of order in nucleosome positioning. There is a conserved stereotypical nucleosome organization around transcription start sites (TSSs) with a nucleosome-depleted region (NDR) upstream of the TSS and a TSS-aligned regular array of evenly spaced nucleosomes downstream over the gene body. As nucleosomes largely impede access to DNA and thereby provide an important level of genome regulation, it is of general interest to understand the mechanisms generating nucleosome positioning and especially the stereotypical NDR-array pattern. We focus here on the most advanced models, unicellular yeasts, and review the progress in mapping nucleosomes and which nucleosome positioning mechanisms are discussed. There are four mechanistic aspects: How are NDRs generated? How are individual nucleosomes positioned, especially those flanking the NDRs? How are nucleosomes evenly spaced leading to regular arrays? How are regular arrays aligned at TSSs? The main candidates for nucleosome positioning determinants are intrinsic DNA binding preferences of the histone octamer, specific DNA binding factors, nucleosome remodeling enzymes, transcription, and statistical positioning. We summarize the state of the art in an integrative model where nucleosomes are positioned by a combination of all these candidate determinants. We highlight the predominance of active mechanisms involving nucleosome remodeling enzymes which may be recruited by DNA binding factors and the transcription machinery. While this mechanistic framework emerged clearly during recent years, the involved factors and their mechanisms are still poorly understood and require future efforts combining in vivo and in vitro approaches.

Nucleosomes are dynamic structures: they can be unfolded, relocated or partially unwrapped, with the help of chromatin remodeling enzymes or by thermal fluctuations alone (51, 61–64). In addition, nucleosome arrays fold into chromatin fibers, which in turn form intricate, dynamic higher-order structures (65). Thus the degree of accessibility of nucleosomal DNA to nucleases such as MNase can vary significantly depending on how mobile the nucleosome is and whether it resides within open or closed chromatin. Multiple factors may contribute to nucleosome mobility, including DNA sequence, chromatin remodeler action and interactions with non-histone DNA-binding proteins such as components of transcriptional machinery (28, 31, 47, 66–68). Such interactions may both prevent nucleosome formation by steric exclusion and facilitate it by establishing favorable contacts with the histone octamer and/or nucleosomal DNA.

Here we show that nucleosomes exhibit a wide range of sensitivities to MNase (43, 45, 46), and thus MNase concentration determines which nucleosomes will be preferentially isolated in nuclease digestion experiments. Indeed, at higher MNase concentrations typically used in nucleosome mapping experiments to reduce chromatin to mononucleosomes, only MNase-resistant nucleosomes will yield DNA of approximately mononucleosomal length, whereas DNA of MNase-sensitive nucleosomes will be overdigested and lost from the mononucleosomal band. In contrast, at lower MNase concentrations the mononucleosome-size DNA fragments will be contributed primarily by MNase-sensitive nucleosomes, whereas MNase-resistant nucleosomes will correspond to longer (di-, tri-nucleosome, etc.) DNA fragments. Consistent with this idea, we have found that digestions at high and low MNase concentrations yield distinct nucleosome subsets in both 0–12 h Drosophila embryos and S2 cells (Supplementary Figure S3A), with MNase-sensitive and MNase-resistant nucleosomes exhibiting unique sequence signatures (Figure 3, Supplementary Figure S5). Moreover, many of the mononucleosome-size fragments are shorter than the canonical nucleosomal DNA length of 147 bp, indicating that, similarly to baker's yeast (39), a significant fraction of fly nucleosomes is partially unwrapped and overdigested by MNase. We have used chromatin immunoprecipitation against H3 and H2B histones to confirm that these shorter DNA fragments, as well as all other fragments analyzed by MNase-HIGH-seq and MNase-LOW-seq, are associated with histone proteins.

Finally, we observe that the broad nucleosome-depleted regions upstream of TSS (69) are actually enriched in MNase-sensitive nucleosomes; these nucleosomes are not detected at higher MNase concentrations. In order to provide a more comprehensive map of nucleosome positioning, we have used both mono- and di-nucleosome fragments isolated at low MNase concentration in S2 cells. Mononucleosome fragments correspond to MNase-accessible nucleosomes that are easily released from chromatin, while di-nucleosome and longer fragments contain nucleosomes from regions that are more inaccessible to MNase. We note that DNA fragments corresponding to more than two nucleosomes (and thus related to inaccessible chromatin regions) were not sequenced (Supplementary Figure S4A). Nonetheless, our analysis goes beyond a standard approach of identifying mononucleosomes at high MNase concentration, providing a more comprehensive picture of Drosophila chromatin.

Remarkably, we find that nucleosomes that were MNase-sensitive when cells were grown and harvested at 27°C become MNase-resistant when the temperature is lowered to 18°C (Figure 2). Since this temperature change is too small to significantly affect the scale of thermal fluctuations, we hypothesize that the observed loss of DNA accessibility to MNase is due to changes in higher-order chromatin structure rather than thermally-activated dynamics of individual nucleosomes. This is consistent with our observation that regions of open and closed chromatin are enriched in MNase-sensitive and MNase-resistant nucleosomes, respectively (Figure 4). It is also possible that lowering the temperature modifies enzymatic activity of chromatin remodelers, impeding nucleosome translocation, unfolding and histone exchange and making nucleosomes less susceptible to nuclease digestion.

Sorting genes by expression levels reveals striking bimodality in nucleosome organization around TSS: in active genes we observe phased arrays of well-positioned nucleosomes, whereas in silent genes phasing with respect to TSS is lost (Figure 1, Supplementary Figure S6). This observation is consistent with the idea that nucleosome arrays in the vicinity of active genes are anchored through interactions with regulatory factors and components of transcriptional machinery (28). In contrast, nucleosomes in silent genes are positioned primarily through intrinsic sequence specificity of histone–DNA interactions. The range of sequence-specific nucleosome formation energies observed with genomic DNA is likely to be too small to provide strong phasing of nucleosome arrays, resulting in a nucleosome fluid rather than a crystal-like array of well-positioned nucleosomes (60). Interestingly, nucleosomes are also strongly phased in the vicinity of DHS (some of which are far from any coding regions) (Figure 5). This phasing likely results from interactions between nucleosomes and factors bound at the hypersensitive sites, which may create potential barriers and wells on the nucleosome free energy landscape through favorable contacts with nucleosomes and steric exclusion (24, 58).

These observations were used to construct a biophysical model of nucleosome distribution in genic regions, in which the +1 nucleosome immediately downstream of the TSS is positioned through interactions with chromatin remodelers mediated by components of transcriptional machinery in active genes (Figure 6). The other nucleosomes are then positioned by steric exclusion, with sequence-dependent effects and active remodeling providing additional refinements (24, 47, 48, 50, 70). In silent genes, sequence-dependent positioning and active remodeling are unable to phase nucleosomes in the absence of +1 nucleosome anchoring, which serves to nucleate the array. Overall, our model is successful in reproducing patterns of nucleosome occupancy in both active and silent genes.

Interestingly, the +1 nucleosome, which in our framework anchors the entire array, also exhibits the highest histone turnover rates (Supplementary Figure S7). Histone exchange rates and sensitivity to MNase at the +1 nucleosome correlate with RNA Pol II occupancy in the vicinity of TSS. Thus it appears that the presence of RNA polymerase both stabilizes the +1 nucleosome, precisely defining its position to a degree not achievable with DNA sequence alone, and at the same time increases its histone exchange rates. The latter requires an active remodeling mechanism since thermally-activated histone exchange should be impeded if the nucleosome is stabilized through interactions with external factors such as Pol II.


Structure of the core particle

Overview

Pioneering structural studies in the 1980s by Aaron Klug's group provided the first evidence that an octamer of histone proteins wraps DNA around itself in about 1.7 turns of a left-handed superhelix. [18] In 1997 the first near atomic resolution crystal structure of the nucleosome was solved by the Richmond group, showing the most important details of the particle. The human alpha satellite palindromic DNA critical to achieving the 1997 nucleosome crystal structure was developed by the Bunick group at Oak Ridge National Laboratory in Tennessee. [19] [20] [21] [22] [23] The structures of over 20 different nucleosome core particles have been solved to date, [24] including those containing histone variants and histones from different species. The structure of the nucleosome core particle is remarkably conserved, and even a change of over 100 residues between frog and yeast histones results in electron density maps with an overall root mean square deviation of only 1.6Å. [25]

The nucleosome core particle (NCP)

The nucleosome core particle (shown in the figure) consists of about 146 base pairs of DNA [10] wrapped in 1.67 left-handed superhelical turns around the histone octamer, which consists of 2 copies each of the core histones H2A, H2B, H3, and H4. Adjacent nucleosomes are joined by a stretch of free DNA termed linker DNA (which varies from 10 to 80 bp in length depending on species and tissue type [17]). The whole structure generates a cylinder of diameter 11 nm and a height of 5.5 nm.

Nucleosome core particles are observed when chromatin in interphase is treated to cause the chromatin to unfold partially. The resulting image, via an electron microscope, is "beads on a string". The string is the DNA, while each bead is a nucleosome core particle. The nucleosome core particle is composed of DNA and histone proteins. [29]

Partial DNAse digestion of chromatin reveals its nucleosome structure. Because the DNA portions of nucleosome core particles are less accessible to DNAse than the linking sections, DNA gets digested into fragments whose lengths are multiples of the distance between nucleosomes (180, 360, 540 base pairs, etc.). Hence a very characteristic ladder-like pattern is visible during gel electrophoresis of that DNA. [26] Such digestion can also occur under natural conditions during apoptosis ("cell suicide" or programmed cell death), in which the destruction of the cell's own DNA is a typical step.
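
The ladder arithmetic can be sketched in a few lines of Python. The 180 bp repeat is the figure quoted above and the 146 bp core length comes from the description of the core particle; the function and variable names are assumptions made for illustration.

```python
def ladder_bands(repeat_bp=180, n_bands=3):
    """Expected fragment lengths (bp) for mono-, di-, tri-nucleosomes and so on,
    assuming cuts fall only in linker DNA and the repeat length is uniform."""
    return [repeat_bp * k for k in range(1, n_bands + 1)]


CORE_BP = 146  # DNA in one core particle, as quoted earlier in this article

bands = ladder_bands()       # [180, 360, 540]
linker_bp = 180 - CORE_BP    # 34 bp of linker per repeat
```

A linker of 34 bp falls inside the 10 to 80 bp range given for linker DNA, so the quoted numbers are mutually consistent.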

Protein interactions within the nucleosome

The core histone proteins contain a characteristic structural motif termed the "histone fold", which consists of three alpha-helices (α1-3) separated by two loops (L1-2). In solution, the histones form H2A-H2B heterodimers and H3-H4 heterotetramers. Histones dimerise about their long α2 helices in an anti-parallel orientation, and, in the case of H3 and H4, two such dimers form a 4-helix bundle stabilised by extensive H3-H3' interaction. The H2A/H2B dimer binds onto the H3/H4 tetramer due to interactions between H4 and H2B, which include the formation of a hydrophobic cluster. [11] The histone octamer is formed by a central H3/H4 tetramer sandwiched between two H2A/H2B dimers. Due to the highly basic charge of all four core histones, the histone octamer is stable only in the presence of DNA or very high salt concentrations.

Histone - DNA interactions

The nucleosome contains over 120 direct protein-DNA interactions and several hundred water-mediated ones. [30] Direct protein-DNA interactions are not spread evenly about the octamer surface but rather located at discrete sites. These are due to the formation of two types of DNA binding sites within the octamer: the α1α1 site, which uses the α1 helix from two adjacent histones, and the L1L2 site formed by the L1 and L2 loops. Salt links and hydrogen bonding between both side-chain basic and hydroxyl groups and main-chain amides with the DNA backbone phosphates form the bulk of interactions with the DNA. This is important, given that the ubiquitous distribution of nucleosomes along genomes requires the octamer to be a non-sequence-specific DNA-binding factor. Although nucleosomes tend to prefer some DNA sequences over others, [31] they are capable of binding practically to any sequence, which is thought to be due to the flexibility in the formation of these water-mediated interactions. In addition, non-polar interactions are made between protein side-chains and the deoxyribose groups, and an arginine side-chain intercalates into the DNA minor groove at all 14 sites where it faces the octamer surface. The distribution and strength of DNA-binding sites about the octamer surface distorts the DNA within the nucleosome core. The DNA is non-uniformly bent and also contains twist defects. The twist of free B-form DNA in solution is 10.5 bp per turn. However, the overall twist of nucleosomal DNA is only 10.2 bp per turn, varying from a value of 9.4 to 10.9 bp per turn.
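
To see what this change in twist amounts to over a whole core particle, the short Python sketch below converts the quoted helical repeats into numbers of turns. It assumes a uniform twist along 146 bp, whereas the text notes that the real twist varies locally, so the result is only a rough estimate. The names are illustrative.

```python
CORE_BP = 146  # DNA length of the core particle, as quoted earlier


def helical_turns(length_bp, bp_per_turn):
    """Number of double-helix turns in a DNA stretch with a uniform helical repeat."""
    return length_bp / bp_per_turn


free_turns = helical_turns(CORE_BP, 10.5)         # about 13.9 turns
nucleosomal_turns = helical_turns(CORE_BP, 10.2)  # about 14.3 turns
extra_turns = nucleosomal_turns - free_turns      # about 0.4 of a turn
```

Both values are close to 14 turns, in keeping with the 14 sites where the minor groove faces the octamer surface.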

Histone tail domains

The histone tail extensions constitute up to 30% by mass of histones, but are not visible in the crystal structures of nucleosomes due to their high intrinsic flexibility, and have been thought to be largely unstructured. [32] The N-terminal tails of histones H3 and H2B pass through a channel formed by the minor grooves of the two DNA strands, protruding from the DNA every 20 bp. The N-terminal tail of histone H4, on the other hand, has a region of highly basic amino acids (16-25), which, in the crystal structure, forms an interaction with the highly acidic surface region of a H2A-H2B dimer of another nucleosome, being potentially relevant for the higher-order structure of nucleosomes. This interaction is thought to occur under physiological conditions also, and suggests that acetylation of the H4 tail distorts the higher-order structure of chromatin.

Higher order structure

The organization of the DNA that is achieved by the nucleosome cannot fully explain the packaging of DNA observed in the cell nucleus. Further compaction of chromatin into the cell nucleus is necessary, but it is not yet well understood. The current understanding [24] is that repeating nucleosomes with intervening "linker" DNA form a 10-nm-fiber, described as "beads on a string", and have a packing ratio of about five to ten. [17] A chain of nucleosomes can be arranged in a 30 nm fiber, a compacted structure with a packing ratio of ~50 [17] and whose formation is dependent on the presence of the H1 histone.

A crystal structure of a tetranucleosome has been presented and used to build up a proposed structure of the 30 nm fiber as a two-start helix. [33] There is still a certain amount of contention regarding this model, as it is incompatible with recent electron microscopy data. [34] Beyond this, the structure of chromatin is poorly understood, but it is classically suggested that the 30 nm fiber is arranged into loops along a central protein scaffold to form transcriptionally active euchromatin. Further compaction leads to transcriptionally inactive heterochromatin.

Although the nucleosome is a very stable protein-DNA complex, it is not static and has been shown to undergo a number of different structural re-arrangements including nucleosome sliding and DNA site exposure. Depending on the context, nucleosomes can inhibit or facilitate transcription factor binding. Nucleosome positions are controlled by three major contributions: First, the intrinsic binding affinity of the histone octamer depends on the DNA sequence. Second, the nucleosome can be displaced or recruited by the competitive or cooperative binding of other protein factors. Third, the nucleosome may be actively translocated by ATP-dependent remodeling complexes. [35]

Nucleosome sliding

Work performed in the Bradbury laboratory showed that nucleosomes reconstituted onto the 5S DNA positioning sequence were able to reposition themselves translationally onto adjacent sequences when incubated thermally. [36] Later work showed that this repositioning did not require disruption of the histone octamer but was consistent with nucleosomes being able to "slide" along the DNA in cis. In 2008, it was further revealed that CTCF binding sites act as nucleosome positioning anchors so that, when used to align various genomic signals, multiple flanking nucleosomes can be readily identified. [37] Although nucleosomes are intrinsically mobile, eukaryotes have evolved a large family of ATP-dependent chromatin remodelling enzymes to alter chromatin structure, many of which do so via nucleosome sliding. In 2012, Beena Pillai's laboratory demonstrated that nucleosome sliding is one of the possible mechanisms for large-scale tissue-specific expression of genes. The work shows that the transcription start sites of genes expressed in a particular tissue are nucleosome depleted, while the same set of genes in other tissues where they are not expressed are nucleosome bound. [13]

DNA site exposure

Work from the Widom laboratory has shown that nucleosomal DNA is in equilibrium between a wrapped and unwrapped state. Measurements of these rates using time-resolved FRET revealed that DNA within the nucleosome remains fully wrapped for only 250 ms before it is unwrapped for 10-50 ms and then rapidly rewrapped. [38] This implies that DNA does not need to be actively dissociated from the nucleosome but that there is a significant fraction of time during which it is fully accessible. Indeed, this can be extended to the observation that introducing a DNA-binding sequence within the nucleosome increases the accessibility of adjacent regions of DNA when bound. [39] This propensity for DNA within the nucleosome to "breathe" has important functional consequences for all DNA-binding proteins that operate in a chromatin environment. [38] In particular, the dynamic breathing of nucleosomes plays an important role in restricting the advancement of RNA polymerase II during transcription elongation. [40]
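
As a rough check on what "a significant fraction of time" means here, the quoted dwell times can be turned into an occupancy estimate. The Python sketch below assumes a simple two-state cycle in which the quoted figures are mean dwell times; the function name is illustrative.

```python
def unwrapped_fraction(wrapped_ms, unwrapped_ms):
    """Fraction of time spent unwrapped in a two-state wrap/unwrap cycle."""
    return unwrapped_ms / (wrapped_ms + unwrapped_ms)


low = unwrapped_fraction(250, 10)   # shortest quoted unwrapped dwell
high = unwrapped_fraction(250, 50)  # longest quoted unwrapped dwell
print(f"{low:.1%} to {high:.1%}")   # prints "3.8% to 16.7%"
```

Under these assumptions the DNA is unwrapped for somewhere between about 4% and 17% of the time.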

Nucleosome free region

Promoters of active genes have nucleosome free regions (NFR). This allows for promoter DNA accessibility to various proteins, such as transcription factors. The nucleosome free region typically spans 200 nucleotides in S. cerevisiae. [41] Well-positioned nucleosomes form the boundaries of the NFR. These nucleosomes are called the +1-nucleosome and −1-nucleosome and are located at canonical distances downstream and upstream, respectively, from the transcription start site. [42] The +1-nucleosome and several downstream nucleosomes also tend to incorporate the H2A.Z histone variant. [42]

Eukaryotic genomes are ubiquitously associated into chromatin; however, cells must spatially and temporally regulate specific loci independently of bulk chromatin. In order to achieve the high level of control required to co-ordinate nuclear processes such as DNA replication, repair, and transcription, cells have developed a variety of means to locally and specifically modulate chromatin structure and function. This can involve covalent modification of histones, the incorporation of histone variants, and non-covalent remodelling by ATP-dependent remodeling enzymes.

Histone post-translational modifications

Since they were discovered in the mid-1960s, histone modifications have been predicted to affect transcription. [43] The fact that most of the early post-translational modifications found were concentrated within the tail extensions that protrude from the nucleosome core led to two main theories regarding the mechanism of histone modification. The first of the theories suggested that they may affect electrostatic interactions between the histone tails and DNA to "loosen" chromatin structure. Later it was proposed that combinations of these modifications may create binding epitopes with which to recruit other proteins. [44] Recently, given that more modifications have been found in the structured regions of histones, it has been put forward that these modifications may affect histone-DNA [45] and histone-histone [46] interactions within the nucleosome core. Modifications (such as acetylation or phosphorylation) that lower the charge of the globular histone core are predicted to "loosen" core-DNA association; the strength of the effect depends on the location of the modification within the core. [47] Some modifications have been shown to be correlated with gene silencing; others seem to be correlated with gene activation. Common modifications include acetylation, methylation, or ubiquitination of lysine; methylation of arginine; and phosphorylation of serine. The information stored in this way is considered epigenetic, since it is not encoded in the DNA but is still inherited by daughter cells. The maintenance of a repressed or activated status of a gene is often necessary for cellular differentiation. [17]

Histone variants

Although histones are remarkably conserved throughout evolution, several variant forms have been identified. This diversification of histone function is restricted to H2A and H3, with H2B and H4 being mostly invariant. H2A can be replaced by H2A.Z (which leads to reduced nucleosome stability) or H2A.X (which is associated with DNA repair and T cell differentiation), whereas the inactive X chromosomes in mammals are enriched in macroH2A. H3 can be replaced by H3.3 (which correlates with active genes and regulatory elements), and in centromeres H3 is replaced by CENP-A. [17]

ATP-dependent nucleosome remodeling

A number of distinct reactions are associated with the term ATP-dependent chromatin remodeling. Remodeling enzymes have been shown to slide nucleosomes along DNA, [48] disrupt histone-DNA contacts to the extent of destabilizing the H2A/H2B dimer [49] [50] and to generate negative superhelical torsion in DNA and chromatin. [51] Recently, the Swr1 remodeling enzyme has been shown to introduce the variant histone H2A.Z into nucleosomes. [52] At present, it is not clear if all of these represent distinct reactions or merely alternative outcomes of a common mechanism. What is shared between all, and indeed the hallmark of ATP-dependent chromatin remodeling, is that they all result in altered DNA accessibility.

Studies looking at gene activation in vivo [53] and, more astonishingly, remodeling in vitro [54] have revealed that chromatin remodeling events and transcription-factor binding are cyclical and periodic in nature. While the consequences of this for the reaction mechanism of chromatin remodeling are not known, the dynamic nature of the system may allow it to respond faster to external stimuli. A recent study indicates that nucleosome positions change significantly during mouse embryonic stem cell development, and these changes are related to binding of developmental transcription factors. [55]

Dynamic nucleosome remodelling across the yeast genome

Studies in 2007 catalogued nucleosome positions in yeast and showed that nucleosomes are depleted in promoter regions and origins of replication. [56] [57] [58] About 80% of the yeast genome appears to be covered by nucleosomes [59] and the pattern of nucleosome positioning clearly relates to DNA regions that regulate transcription, regions that are transcribed and regions that initiate DNA replication. [60] A later study examined dynamic changes in nucleosome repositioning during a global transcriptional reprogramming event to elucidate the effects on nucleosome displacement during genome-wide transcriptional changes in yeast (Saccharomyces cerevisiae). [61] The results suggested that nucleosomes localized to promoter regions are displaced in response to stress (such as heat shock). In addition, the removal of nucleosomes usually corresponded to transcriptional activation and the replacement of nucleosomes usually corresponded to transcriptional repression, presumably because transcription factor binding sites became more or less accessible, respectively. In general, only one or two nucleosomes were repositioned at the promoter to effect these transcriptional changes. However, nucleosome repositioning was observed even in chromosomal regions that were not associated with transcriptional changes, suggesting that the covering and uncovering of transcriptional DNA does not necessarily produce a transcriptional event. After transcription, the rDNA region has to be protected from damage; it has been suggested that HMGB proteins play a major role in protecting the nucleosome-free region. [62] [63]

Nucleosomes can be assembled in vitro using either purified native or recombinant histones. [64] [65] One standard technique for loading the DNA around the histones is salt dialysis. A reaction containing histone octamers and a naked DNA template is incubated at a salt concentration of 2 M. As the salt concentration is steadily decreased, the DNA equilibrates to a position where it is wrapped around the histone octamers, forming nucleosomes. Under appropriate conditions, this reconstitution process allows the nucleosome positioning affinity of a given sequence to be mapped experimentally. [66]

Disulfide crosslinked nucleosome core particles

A recent advance in the production of nucleosome core particles with enhanced stability involves site-specific disulfide crosslinks. [67] Two different crosslinks can be introduced into the nucleosome core particle. The first crosslinks the two copies of H2A via an introduced cysteine (N38C), resulting in a histone octamer that is stable against H2A/H2B dimer loss during nucleosome reconstitution. The second can be introduced between the H3 N-terminal histone tail and the nucleosome DNA ends via an incorporated convertible nucleotide. [68] The DNA-histone octamer crosslink stabilizes the nucleosome core particle against DNA dissociation at very low particle concentrations and at elevated salt concentrations.

Nucleosomes are the basic packing unit of DNA built from histone proteins around which DNA is coiled. They serve as a scaffold for formation of higher order chromatin structure as well as for a layer of regulatory control of gene expression. Nucleosomes are quickly assembled onto newly synthesized DNA behind the replication fork.

H3 and H4

Histones H3 and H4 from disassembled old nucleosomes are kept in the vicinity and randomly distributed on the newly synthesized DNA. [69] They are assembled by the chromatin assembly factor-1 (CAF-1) complex, which consists of three subunits (p150, p60, and p48). [70] Newly synthesized H3 and H4 are assembled by the replication coupling assembly factor (RCAF). RCAF contains the subunit Asf1, which binds to newly synthesized H3 and H4 proteins. [71] The old H3 and H4 proteins retain their chemical modifications, which contributes to the passing down of the epigenetic signature. The newly synthesized H3 and H4 proteins are gradually acetylated at different lysine residues as part of the chromatin maturation process. [72] It is also thought that the old H3 and H4 proteins in the new nucleosomes recruit histone-modifying enzymes that mark the new histones, contributing to epigenetic memory.

H2A and H2B

In contrast to old H3 and H4, the old H2A and H2B histone proteins are released and degraded; therefore, newly assembled H2A and H2B proteins are incorporated into new nucleosomes. [73] H2A and H2B are assembled into dimers, which are then loaded onto nucleosomes by the nucleosome assembly protein-1 (NAP-1), which also assists with nucleosome sliding. [74] The nucleosomes are also spaced by ATP-dependent nucleosome-remodeling complexes containing enzymes such as Isw1, Ino80, and Chd1, and subsequently assembled into higher-order structure. [75] [76]

The crystal structure of the nucleosome core particle (PDB: 1EQZ [27] [28]) - different views showing details of histone folding and organization. Histones H2A, H2B, H3, H4 and DNA are coloured.

Supporting information

S1 Fig. Comparison of in vivo nucleosome occupancy in a few datasets.

A) Histograms of nucleosome occupancy: Lee et al. [6], Kaplan et al. [3], Mavrich et al. [29], and Oberbeckmann et al. [31]. B) Heatmap of the root-mean-square-deviation (RMSD) between different datasets. C) Composite plot of the average nucleosome occupancy near transcription start sites (TSSs, left) or transcription termination sites (TTSs, right).
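The RMSD comparison in panel B can be illustrated with a minimal sketch. It assumes each dataset has already been reduced to an occupancy profile sampled at the same genomic positions; the function names, dataset labels, and toy values below are hypothetical and are not taken from the paper.

```python
import math


def rmsd(a, b):
    """Root-mean-square deviation between two equal-length occupancy profiles."""
    if len(a) != len(b) or not a:
        raise ValueError("profiles must be non-empty and of equal length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


def pairwise_rmsd(profiles):
    """Return {(name_i, name_j): rmsd} for every ordered pair of named profiles."""
    names = list(profiles)
    return {(p, q): rmsd(profiles[p], profiles[q]) for p in names for q in names}


# Toy occupancy fractions, invented purely for illustration.
toy = {
    "dataset_A": [0.8, 0.6, 0.2, 0.7],
    "dataset_B": [0.8, 0.6, 0.2, 0.7],
    "dataset_C": [0.6, 0.4, 0.0, 0.5],
}
matrix = pairwise_rmsd(toy)
```

The resulting dictionary holds one value per cell of a heatmap such as the one described; identical profiles give an RMSD of zero, and a uniform offset between profiles gives an RMSD equal to that offset.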

S2 Fig. Locating and annotating NDR.

A) A snapshot of Lee et al. occupancy data [6] showing three examples of potential NDRs: (a), (b), and (c). The occupancy data is discretized using horizontal lines at 80%, 73.2%, 66.4%, etc. with a constant step size of 6.78%. B) Zoom-in views of the annotated NDRs, (a)–(c), in A. “dx” is the distance between a cut-point at 80% line and the lowest crossing points. “l” is the distance between cross points with the 73.2% line. The occupancy was modified so that the occupancies between the lowest cross points are set to 0. The size of NDR is indicated by the red lines (length crossed at the 66.4% line). C) A histogram of dx. For NDR annotation, we require dx to be less than 100bp. D) Modified occupancy in an example region with the sharp downward spikes to 0 occupancy representing NDRs. The original data is shown in blue. The different asterisk points represent locations of annotated NDRs from previous studies: Chereji et al. [44], Yadon et al. [59], and Jiang & Pugh [64]. E) A table showing the number of overlapping NDRs between our annotation and those from Chereji et al. and Yadon et al.
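The discretization step in panel A can be sketched roughly as follows. This is only a partial illustration under stated assumptions: it generates the horizontal levels (80% minus multiples of 6.78%) and finds contiguous stretches where the occupancy falls below a chosen level. It does not implement the dx criterion or the rest of the annotation scheme, and the helper names and toy occupancy values are hypothetical.

```python
def threshold_levels(top=80.0, step=6.78, count=3):
    """Levels used to discretize occupancy (in percent), from `top` downward."""
    return [round(top - i * step, 2) for i in range(count)]


def runs_below(occupancy, level):
    """Return (start, end) index pairs, end exclusive, where occupancy < level."""
    runs, start = [], None
    for i, value in enumerate(occupancy):
        if value < level and start is None:
            start = i  # a dip below the level begins here
        elif value >= level and start is not None:
            runs.append((start, i))  # the dip ended just before position i
            start = None
    if start is not None:
        runs.append((start, len(occupancy)))  # dip extends to the end of the data
    return runs


levels = threshold_levels()
# Toy occupancy profile in percent, one value per position (invented).
toy_occupancy = [90, 85, 70, 60, 50, 65, 75, 88]
dip_at_top = runs_below(toy_occupancy, levels[0])
dip_at_lowest = runs_below(toy_occupancy, levels[2])
```

In this toy profile the width of the run at the lowest of the three levels plays the role of the NDR size marked by the red lines in the figure, while the wider run at the 80% level corresponds to where the dip first begins.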

S3 Fig. TF cluster distribution, density, and composition.

A) TF cluster density in hit-, missed-, and non-NDRs with different numbers of TFs taken into consideration (maximum distance between TFs in a cluster, dx, is 12bp). B) Histograms for the number of TFs in a cluster (left) and the size of TF clusters (right) with dx = 147bp. C) TF cluster density in hit-, missed-, and non-NDRs with dx = 147bp. D) Occurrence frequency for the top 30 TF motifs in hit-, missed-, and non-NDRs.

S4 Fig.

Heatmap of the occupancy of the top 30 TFs and the five PolyA factors near TSSs (top), within gene bodies (center), and near TTSs (bottom). We have listed all 5542 genes with the exact TSS and TTS coordinates, and corresponding values can be found in S1, S2 and S3 Tables. The gene indices and annotations are adapted from ref. [44].

S5 Fig. Change in nucleosome occupancy at TSSs upon Rsc3 and/or PolyA/T deletion.

A) The format of the figure is the same as the main Fig 6 except that we either eliminated Rsc3 (left), or Rsc3 and PolyA/T (middle), or Rsc3 and remodeling effect (right) in the model. B) Pearson correlation coefficient, R, between experimental [49] and simulated nucleosome occupancy change. Note that the correlation is higher when only Rsc3 is deleted.

S6 Fig. Heatmap of the occupancy of the top 30 TFs and the five PolyA factors on the D. hansenii YAC introduced into S. cerevisiae.

The top panel shows the occupancy near the TSSs of the 154 genes in the YAC [50]; the lower panel shows the occupancy in fortuitous NDRs generated in the gene body.

S7 Fig. Comparison with other models using Oberbeckmann et al. dataset as the reference.

Model performance measured by ρN, RMSD, PNDR, and AUC are compared among different models: N2, Nupop, Dnabend, Segal, Ozonov, and our own model.

S8 Fig. NDR prediction for various annotated NDRs.

A) NDR prediction of our model in comparison to various annotated NDRs. The NDRs of Chereji [44] and Yadon [59] are the same as in S2E Fig, except that here, if the distance between the centers of consecutive NDRs is less than 125 bp, they are merged into a single NDR. Oberbeckmann’s [31] NDRs are annotated using the same scheme as Lee’s [6] NDRs (see Materials and Methods). The lists of NDRs and their coordinates for Lee and Oberbeckmann are given in S5 and S6 Tables, respectively. B) NDR prediction of our model for common NDRs in multiple datasets.
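The merging rule for consecutive NDRs can be sketched as below. This is one possible reading, not the authors' code: NDRs are assumed to be (start, end) coordinate pairs on a single chromosome, and after a merge the center of the merged interval is used for the next comparison, which the caption does not specify. The function name and example coordinates are hypothetical.

```python
def merge_close_ndrs(ndrs, max_center_gap=125):
    """Merge consecutive (start, end) NDRs whose centers are < max_center_gap bp apart."""
    merged = []
    for start, end in sorted(ndrs):
        if merged:
            prev_start, prev_end = merged[-1]
            prev_center = (prev_start + prev_end) / 2
            center = (start + end) / 2
            if center - prev_center < max_center_gap:
                # Assumption: extend the previous NDR to cover both intervals.
                merged[-1] = (prev_start, max(prev_end, end))
                continue
        merged.append((start, end))
    return merged


# Invented coordinates: the first two centers are 70 bp apart, the third is far away.
example = [(600, 700), (100, 200), (180, 260)]
result = merge_close_ndrs(example)
```

With these toy coordinates the first two NDRs collapse into one interval spanning 100 to 260, and the third stays separate.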


We thank Alain Arneodo for providing the sequence-based algorithm to compute histone binding energies and nucleosome occupancy (28) and Stephan Schiffels for kindly making his Wright–Fisher evolution model algorithm available. We also are grateful for stimulating discussions with Ville Mustonen. This work was supported by Deutsche Forschungsgemeinschaft Grant SFB 680, by German Federal Ministry of Education and Research Grant 0315893-Sybacol, and in part by the National Science Foundation (NSF) under Grant NSF PHY05-51164 during a visit to the Kavli Institute for Theoretical Physics (Santa Barbara, CA).
