Information

Hydrophobic proteins in the body?


I know that we can get hydrophobic amino acids, but are there any proteins in the body whose surface is hydrophobic? If so what is their typical function and where can they typically be found and if not why not?


Are there any proteins in the body whose surface is hydrophobic?

Sure. Although you are right in thinking that most proteins have hydrophilic surfaces, some are very hydrophobic. My favorite example is elastin, the main component of skin responsible for its elasticity. In fact, the hydrophobic nature of elastin is what confers its function. Briefly, the idea is that an insoluble/hydrophobic structure will tend to minimize its surface area in contact with water, so if you stretch such a substance (i.e. increase the surface area), it will pull back to the state of minimum surface area.

Reference: Li, Daggett. Molecular basis for the extensibility of elastin. Journal of Muscle Research & Cell Motility (2002).


Greasy tags for protein removal

Most proteins in the human body are difficult targets for small-molecule drugs. This problem may have been overcome with the discovery of molecules that induce protein degradation, suggesting fresh, modular approaches to drug discovery.

It was recently discovered 1,2 that proteins covalently 'tagged' with small, synthetic, hydrophobic molecules are degraded by the cell's quality-control machinery. Writing in Chemistry & Biology, Long et al. 3 now report that non-covalent binding of such molecules also marks proteins for degradation. This finding could open up a wide range of proteins as targets for drug-discovery programmes.

The dearth of newly approved drugs in the past decade reflects the challenges faced by the pharmaceutical industry. Although advances in genomics have identified many proteins that are implicated in disease, many of these proteins — especially those that are not enzymes — are not currently viable drug targets. In fact, it has been estimated that only about 15% of the human proteome is 'druggable' with small molecules 4 .

Many attractive drug targets have therefore been dubbed 'undruggable'. For instance, there are roughly 1,400 human transcription factors — proteins that regulate messenger RNA synthesis from DNA, but which lack enzymatic activity. These proteins remain largely undruggable, despite the fact that aberrant expression of some of them is known to cause cancer. One possible solution to this challenge has been the development of small interfering RNAs (siRNAs), which intervene in gene expression by binding to mRNA. However, delivering siRNAs to their targets in vivo has been a difficult hurdle to overcome, and so small molecules that can affect the function of undruggable proteins are needed.

Another emerging approach is to destroy, rather than inhibit, target proteins in cells. Normal protein turnover in cells is mainly mediated by the ubiquitin–proteasome system (UPS), which tags unwanted or misfolded proteins with chains of the ubiquitin protein. Once ubiquitinated, the marked proteins are recognized by the proteasome, a large, barrel-like molecular machine that cleaves proteins into small peptides. Efficient removal of unwanted proteins is key to cell survival, as evidenced by the development of proteasome inhibitors as effective antitumour agents 5 .

Several strategies have been reported that co-opt the UPS for targeted protein degradation. One of these uses 'proteolysis-targeting chimaeric molecules' to bring the protein of interest close to a ubiquitin ligase (an enzyme that mediates the ubiquitination of a target protein), thus bringing about protein ubiquitination and subsequent degradation 6 .

An alternative approach is to mimic a misfolded protein state using small molecules. Normally, the 'greasy' (hydrophobic) side chains of polypeptides are buried in the interior of a globular protein, with the hydrophilic amino-acid residues lying at the surface. Even a small increase in surface hydrophobicity can make a protein unstable. For instance, the deletion of a single amino acid from the CFTR protein is the main cause of cystic fibrosis. The deletion results in the exposure of hydrophobic patches on the surface of CFTR, leading to misfolding and subsequent degradation of the protein (Fig. 1).

a, Cellular chaperone proteins help other proteins that have become partially unfolded to refold into their correct tertiary structure. If refolding fails, the chaperones trigger degradation of the unfolded protein by the proteasome, a large protein complex. b, Synthetic hydrophobic groups attached to a protein's surface can mimic the partially unfolded state. Because chaperones are unable to refold these proteins, the tagged proteins are degraded by the proteasome. Long et al. 3 report that hydrophobic tags do not need to be covalently attached to a protein to induce degradation.

We have recently shown 1,2 that the covalent attachment of a synthetic hydrophobic group (such as adamantane, a bulky hydrocarbon) to the surface of proteins attracts chaperone proteins whose job it is to help refold misfolded proteins, or, if they cannot be refolded, to target them for degradation by the proteasome. But most drugs bind to proteins through non-covalent interactions, and it was unclear whether non-covalently bound molecules could also trigger this sequence of events.

Long et al. 3 have settled this concern. They investigated the biological effect of attaching a hydrophobic group (Boc3Arg, a modified arginine amino acid) to trimethoprim (TMP), a ligand molecule that binds non-covalently to the dihydrofolate reductase (DHFR) enzyme from the bacterium Escherichia coli. The authors observed that TMP–Boc3Arg induces 30–80% DHFR degradation in mammalian cells, depending on the rate of DHFR synthesis. This effect could be blocked either by TMP, which competes with TMP–Boc3Arg for binding to DHFR, or by inhibitors of proteasome activity.

The authors also demonstrated that the glutathione S-transferase (GST) enzyme is degraded when treated with a compound in which Boc3Arg is attached to ethacrynic acid (EA), a GST inhibitor that becomes covalently bound to the enzyme's active site. This demonstrates that the degradation effect of Boc3Arg occurs for at least two enzymes. Long et al. went on to make a fusion protein in which DHFR is attached to GST, and then treated cells producing the protein with either TMP–Boc3Arg or EA–Boc3Arg. They observed that DHFR–GST was degraded more efficiently by EA–Boc3Arg, which binds covalently to the protein, than by TMP–Boc3Arg, which binds non-covalently. This suggests that the covalent attachment of hydrophobic tags to enzymes is the more effective strategy for protein degradation.

As TMP is a high-affinity inhibitor of E. coli DHFR, further studies are needed to determine whether a small molecule that is both a protein inhibitor and a degradation signal is more effective in abrogating protein function than a simple inhibitor. As pointed out by the authors, the case of botulinum toxin illustrates the advantage of the degradation approach. The most potent form of this toxin, which causes muscle paralysis, has a half-life in the body of about 3 months. Although an inhibitor of the toxin would be able to suppress toxicity in the short term, elimination of the toxin is obviously a preferable therapeutic approach.

However, the Boc3Arg moiety is large (almost 500 daltons in mass), and large molecules often have poor pharmacokinetic properties that limit their use as drugs. So, appending it to an existing inhibitor could potentially worsen that inhibitor's pharmacokinetic properties. Curiously, even though TMP has high affinity for E. coli DHFR and is thought to have excellent cell permeability, Long et al. needed to use a high concentration of TMP–Boc3Arg to observe protein degradation. This suggests that TMP–Boc3Arg has difficulty permeating cells.

Other non-covalent ligand–protein systems need to be tested to establish the minimum ligand–protein affinity necessary to initiate protein degradation. Meanwhile, it is intriguing to speculate about how the modularity of this protein-degradation strategy might be used for drug discovery. One could envisage a streamlined process in which ligands for an undruggable protein are identified, appended with hydrophobic moieties (such as adamantane or Boc3Arg) and tested for their ability to degrade the target protein. Finding high-affinity ligands for undruggable proteins will certainly be a challenge, but methods are becoming available to facilitate this.

For instance, chemical libraries in which each compound is attached to a unique DNA 'barcode' can be tested for protein binding, and the chemical entities that have the highest binding affinities subsequently identified using the barcodes 7. This method would allow the rapid screening of up to 10^9 compounds, whereas the largest screens currently used assay only about 10^6 compounds 8. A combination of such high-throughput screening methods with the hydrophobic tagging approach could make today's undruggable proteins attractive biological targets in the search for compounds that ameliorate human disease.


Introduction

A detailed understanding of the molecular origins of the hydrophobic effect 1,2,3,4,5,6,7,8,9,10 in proteins and of its role as a driving force in protein folding and assembly 10,11,12,13,14 is still an open problem. This situation arises because protein sequences include a complex combination of hydrophobic (non-polar) and hydrophilic (polar) regions. The resulting patterns of non-polar regions have typical dimensions that correspond to a critical value for the hydrophobic effect 1,2,3,4,5,6,7,8 . This value, which is of about 1 nm, corresponds to the distinction between small and large non-polar solutes 1,2,3,4,5,6,7,8,9 . To understand the molecular origin of this characteristic length, one should start from the observation that water molecules tend to form networks of hydrogen bonds at the expense of their rotational entropy. Non-polar solutes smaller than 1 nm occupy a volume that is too small to perturb significantly the formation of such hydrogen bond networks. By contrast water molecules near the surfaces of non-polar solutes larger than 1 nm cannot form all the hydrogen bonds that they would do in bulk. These two situations correspond to a different scaling of the hydrophobic forces, with the volume or the surface, for small or large ideal hydrophobic solutes, respectively 1 .

Here our aim is to clarify first whether proteins, whose surfaces as mentioned above are characterized by the presence of complex polar and non-polar patterns, behave effectively like small or large non-polar solutes, and then to investigate the consequences of this fact on their folding behaviour 11,12,13 . Addressing this question requires an accurate characterization of the conformational space of a protein in water to study the number of hydrogen bonds formed by water molecules in proximity of its surface with respect to the bulk.

To attack this problem, we exploited the opportunities offered by the case of yeast frataxin (Fig. 1a). Frataxin, which is a protein involved in the assembly of iron-sulfur clusters, is also related to Friedreich’s ataxia, a fatal neurodegenerative condition 15 . We considered yeast frataxin because this protein represents one of the few examples for which both cold and hot denatured states have been observed at neutral pH and without addition of destabilizing agents (at 272 K and 323 K, respectively), and characterized by nuclear magnetic resonance (NMR) chemical shifts and circular dichroism (CD) 15,16,17,18 . As shown by the pioneering work of Privalov, the thermodynamic analysis of cold denaturation, compared to that of the thermal denaturation, offers unique advantages for understanding the molecular determinants of hydrophobicity 19,20 . Indeed, it is generally recognised that while thermal denaturation is the consequence of a temperature-induced increase in conformational fluctuations, cold denaturation is a consequence of an enthalpy gain of the solvent 19,20 . Not many atomistic details are known, however, about the structural consequences of this gain of entropy or enthalpy in the hot and cold denatured states, respectively. Thus, modelling the cold and hot denatured states of yeast frataxin in the absence of any additional agent should highlight the different roles played by the protein and solvent at different temperatures.

(a) Structure of the native state of frataxin. (b) Free energy surface of the cold denatured state (CDS) determined at 272 K as a function of the sketch-map 63 collective variables that describe the conformational features of the ensembles (see Methods). Nine microstates are shown representing the local and global minima. These microstates comprise >90% of the total population. (c) Free energy surface of the hot denatured state (HDS) determined at 323 K as a function of the same two collective variables. Sixteen microstates are shown representing the local and global minima. These microstates comprise >90% of the total population (see Methods). (d,e) Secondary structure populations of the CDS (d) and HDS (e). α-Helices are represented in blue, β-strands in red and polyproline II in green. Free energies are given in kJ/mol.

In the following we employ the chemical shifts measured in cold and hot denaturation conditions 15,16,17 together with replica-averaged metadynamics (RAM) simulations 21,22 (see Materials and Methods) to elucidate at atomic resolution the structure and the dynamics of the cold denatured state (CDS) and hot denatured state (HDS). In the RAM simulations the experimental information provided by NMR chemical shifts is incorporated in terms of structural restraints 22 and at the same time the sampling of the conformational space is enhanced by metadynamics 23 . This overall approach makes it possible to simultaneously change the force field used in the molecular dynamics simulations to improve their agreement with the experimental data in the spirit of the maximum entropy principle 24 and to significantly decrease the computational resources required to obtain convergence in the sampling. Then, by using Φ-values analysis 25,26,27 and Φ-values restrained molecular simulations 26 , we determined their cold transition state (CTS) and hot transition state (HTS) to describe the differences in the cold and hot denaturation processes corresponding to the differences between the cold and hot denatured states.


Steroid Hormones

Steroid hormones, being hydrophobic molecules, diffuse freely into all cells. However, their "target" cells contain cytoplasmic and/or nuclear proteins that serve as receptors of the hormone. The hormone binds to the receptor and the complex binds to hormone response elements, which are stretches of DNA within the promoters of genes responsive to the hormone. The hormone/receptor complex acts as a transcription factor turning target genes "on" (or "off").


A Tale of Two Types

You will learn about two types of membrane proteins: peripheral proteins and integral proteins. Peripheral proteins have weaker and temporary connections to the membrane. Some just sit on the surface, anchored with a few ionic bonds while others might have small sections that dip into the hydrophobic section of the bilayer. When you look at the entire membrane, there are more peripheral proteins when compared to the number of integral proteins.

As you can guess from the name, integral proteins are permanently connected to the cell membrane. They are hard workers and have large sections embedded in the hydrophobic (middle) layer of the membrane.

Transmembrane proteins are integral proteins that cross the membrane and can act as pathways for ions and molecules. Polytopic transmembrane proteins cross the membrane several times. Some are receptor proteins while others form channels. Ion movement that does not require work is called passive transport while active transport systems use work to move molecules. Active transport is regularly used when membrane proteins pump ions against the concentration gradient.


Conclusions

The present analysis suggests that a shared haloadaptation strategy of proteins in the presence of molar salt concentrations, but not in the presence of osmolytes, requires a weakening of hydrophobic interactions in general, and in particular of core and conserved hydrophobic contacts. This weakening counterbalances the strengthening of these interactions by salts in solution and may help the structure avoid aggregation and/or loss of function in hypersaline environments. Indeed, decreased hydrophobicity makes halophilic proteins unstable in low-salt solutions and may in part explain the requirement of halophilic proteins for high salt concentrations. To complete the picture, the destabilization of halophilic proteins at low salt concentration due to strong electrostatic repulsion should also be considered [18, 33]. The reduction of hydrophobic contacts must be even more critical in the early stages of folding, when intramolecular hydrophobic nuclei must form correctly to guide the polypeptide through the folding funnel to the native state.

Given the significant growth of biotechnological applications of halophiles, understanding the multifaceted etiology of halophilicity (including the electrostatic factors) can provide the theoretical basis for engineering proteins that are of great interest because they are stable at salt concentrations that cause the denaturation or aggregation of most macromolecules.


Results

A Framework to Describe the Interactions Stabilizing the Droplet State.

The premise of this work is that the droplet state is characterized by low-specificity interactions and liquid-like conformational entropy. Thus, we hypothesized that proteins that are conformationally heterogeneous in their native states and maintain this property upon binding would be particularly prone to form the droplet state. In estimating the degree of conformational heterogeneity in both the native and bound states, we observe that proteins span a continuum between structural order and disorder (23, 24), which we will express by the probabilities of pD (free state) and pDD (bound state). We also note that interactions with high conformational entropy are realized via many different binding configurations, which can be achieved by both ordered and disordered domains (25). By contrast, ordered binding modes with low conformational entropy are mediated by well-defined interfaces, as exemplified by rigid docking or templated folding (26).

Ordered and disordered binding modes exhibit characteristic sequence signatures. Motifs mediating ordered binding modes have a strong compositional bias as compared with their embedding protein regions. In contrast, motifs mediating disordered binding modes are more similar to their flanking regions, which can be realized via a variety of sequence patterns and contact types, as their specificity stems from their distinct character as compared with their flanking regions (23). We have previously demonstrated (27) that by identifying such interaction elements based on compositional bias, it is possible to estimate structural order or disorder under cellular conditions in excellent agreement with in vivo proteomic studies (28).

Properties of Proteins That Can Form the Droplet State.

Datasets of proteins representing the droplet state.

We have analyzed three public datasets of proteins reported to undergo liquid–liquid phase separation (Materials and Methods). The first is the PhaSepDB dataset (http://db.phasep.pro/) (29), which assembles data from three resources (Materials and Methods and Dataset S1): 1) proteins from the literature with in vivo and in vitro experimental data on liquid–liquid phase separation (REV, 351 proteins; Materials and Methods and Dataset S1), 2) proteins from UniProt associated with known organelles (UNI, 378 proteins; Materials and Methods and Dataset S1), and 3) proteins identified by high-throughput experiments of liquid–liquid phase separation (HTS, 2,572 proteins; Materials and Methods and Dataset S1). The second dataset is PhaSePro (https://phasepro.elte.hu) (30), which identifies protein regions associated with liquid–liquid phase separation (PSP, 121 proteins; Materials and Methods and Dataset S1). The third dataset is LLPSDB (http://bio-comp.org.cn/llpsdb) (31), which assembles proteins observed to undergo in vitro liquid–liquid phase separation with well-defined experimental conditions and phase diagrams (Materials and Methods and Dataset S1). LLPSDB distinguishes whether proteins can phase separate spontaneously as one component (droplet-driving proteins, LPS-D, 133 proteins; Materials and Methods and Dataset S1) or require a partner to undergo liquid–liquid phase separation (droplet-client proteins, LPS-C, 41 proteins; Materials and Methods and Dataset S1). In this dataset, 77 proteins exhibit both droplet-driving and droplet-client behaviors.

To create a dataset for liquid–liquid phase separation, we merged the proteins in the REV, PSP, and LPS-D datasets, which we consider as drivers of droplet formation (453 unique proteins, LLPS dataset; Materials and Methods and Dataset S1). We generated two negative control datasets, one with human proteins only and another with a mixture of organisms (Dataset S2). For the human negative set (hsnLLPS dataset, 18,108 proteins; Materials and Methods), we excluded from the Swiss-Prot human proteome all proteins that appeared in any of the liquid–liquid phase separation datasets (REV, UNI, HTS, PSP, LPS-D, LPS-C) (29–31) (Dataset S2). For the negative set corresponding to multiple organisms (nsLLPS; Materials and Methods), we derived the organism distribution from the LLPS dataset. To build a control dataset, we considered organisms populated more than 1% in the LLPS dataset and used their proteomes from UniProt (Caenorhabditis elegans, Chlamydomonas reinhardtii, Drosophila melanogaster, Homo sapiens, Mus musculus, Rattus norvegicus, Saccharomyces cerevisiae, Schizosaccharomyces pombe, Xenopus laevis; Materials and Methods) and removed all proteins present in the LLPS or HTS datasets. We then randomly chose sequences according to the frequencies of these organisms in the LLPS dataset with 10 times enrichment (nsLLPS; Materials and Methods and Dataset S2).
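The construction of the multi-organism negative set described above can be sketched in Python. This is only an illustration under stated assumptions: the function name, the dictionary layout, and the toy identifiers are hypothetical, and the exact sampling procedure used by the authors is given in their Materials and Methods rather than here. The sketch keeps organisms that make up more than 1% of the positive set, removes excluded proteins from each proteome, and draws 10 times as many sequences per organism as the positive set contains.

```python
import random
from collections import Counter


def build_negative_set(positive, proteomes, excluded,
                       enrichment=10, min_fraction=0.01, seed=0):
    """Sample a negative set that mirrors the organism distribution of `positive`.

    positive  : dict mapping protein id -> organism (the positive set)
    proteomes : dict mapping organism -> list of protein ids (full proteome)
    excluded  : set of protein ids that must not appear in the negative set
    All names and the data layout are illustrative assumptions.
    """
    rng = random.Random(seed)
    counts = Counter(positive.values())
    total = sum(counts.values())
    negative = []
    for organism, n in counts.items():
        # Keep only organisms populated more than `min_fraction` in the positive set.
        if n / total <= min_fraction:
            continue
        pool = [p for p in proteomes.get(organism, []) if p not in excluded]
        # Draw `enrichment` times as many sequences, capped by the pool size.
        k = min(len(pool), enrichment * n)
        negative.extend(rng.sample(pool, k))
    return negative


# Toy example: merging source sets is a plain set union.
rev, psp, lps_d = {"P1", "P2"}, {"P2", "P3"}, {"P3"}
llps_ids = rev | psp | lps_d  # unique drivers of droplet formation
```

Fixing the random seed makes the draw reproducible, which matters if the coefficients of a downstream model are to be checked for robustness over repeated random selections.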

Analysis of the amino acid compositions of droplet-driving and droplet-client proteins.

Droplet-driving proteins are enriched in disorder-promoting residues (P, G, S) and depleted in order-promoting (F, I, V, C, W) residues as compared with non–phase-separating proteins (Fig. 2A). N and Q, which are distinguished in prion-like domains (32), are more abundant in droplet-driving proteins than those not reported to undergo LLPS. However, droplet-driving proteins are not significantly enriched in residues that mediate π–π and cation–π interactions (Y, R), as compared with non–phase-separating (nsLLPS) proteins (Fig. 2A and Dataset S3). These results indicate that droplet formation does not depend on a specific contact type but can rather be realized in many ways by low-specificity interactions. The composition of droplet-driving proteins is in between that of globular proteins (33) and disordered proteins in the DisProt database (33), more abundant in order-promoting residues (W, C, Y, F, I, V) as compared with disordered proteins (SI Appendix, Fig. S1B), and enriched in disorder-promoting residues (P, D, E) as compared with globular proteins (34) (SI Appendix, Fig. S1A). Aromatic residues observed in disordered regions, for example in nucleoporins, often mediate low-affinity interactions (35). These compositional properties reflect the preference of droplet-driving proteins for the disordered state in the bound form, which is comparable to protein complexes with disordered binding modes (23).

Differential amino acid compositions of droplet-driver and droplet-client proteins. (A) Differences in amino acid compositions (ΔAA) of droplet-driver proteins in the LLPS dataset and of proteins not reported to phase separate (nsLLPS). (B) Differences in amino acid compositions of droplet-client proteins that require additional components for phase separation (LPS-C dataset) and of proteins that have not been reported to phase separate (nsLLPS). (C) Differences in amino acid compositions of droplet-client proteins (LPS-C) and droplet-driver proteins (LLPS). Amino acids grouped as hydrophobic (light green), aromatic (green), hydrophilic (turquoise), charged (steel blue), and disorder-promoting (dark blue) (34). The SEs and the significances of the differences by Kolmogorov–Smirnov test are shown in Dataset S3.

As compared with non–phase-separating proteins, droplet-client proteins are enriched in charged residues (D, K, E) and disorder-promoting prolines (Fig. 2B and Dataset S3). Droplet-client proteins exhibit characteristic differences from droplet-driving proteins, as they are enriched in charged residues (K, E) and hydrophobic motifs (L, V, I), while being depleted in amyloid-promoting (N, Q), phosphorylation-promoting (S), and disorder-promoting (G) residues (Fig. 2C and Dataset S3). The amino acid composition of droplet clients is thus more similar to structured than disordered proteins (SI Appendix, Fig. S1B).
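A composition difference of the kind reported in Fig. 2 can be sketched as follows. The source does not spell out how ΔAA is computed, so this sketch assumes that the fraction of each amino acid is computed per protein, averaged over each dataset, and then subtracted between datasets. The function names are hypothetical.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(sequence):
    """Fraction of each of the 20 standard amino acids in one sequence."""
    counts = Counter(sequence)
    n = sum(counts[a] for a in AMINO_ACIDS)
    if n == 0:
        raise ValueError("sequence contains no standard amino acids")
    return {a: counts[a] / n for a in AMINO_ACIDS}


def mean_composition(sequences):
    """Per-protein compositions averaged over a dataset."""
    comps = [composition(s) for s in sequences]
    return {a: sum(c[a] for c in comps) / len(comps) for a in AMINO_ACIDS}


def delta_composition(set_a, set_b):
    """Assumed definition of the difference: mean(set_a) - mean(set_b)."""
    mean_a, mean_b = mean_composition(set_a), mean_composition(set_b)
    return {a: mean_a[a] - mean_b[a] for a in AMINO_ACIDS}
```

A positive value for a residue then means enrichment in the first dataset relative to the second, and the differences over all 20 residues sum to zero.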

Analysis of the conformational entropy of droplet-driving and droplet-client proteins.

We observed that different protein datasets representing the droplet state have markedly different characteristics in their conformational entropy in the free state and its change upon binding. Drivers of droplet formation (LPS-D) have high levels of disorder in free (pD) and bound states (pDD), while droplet clients (LPS-C) are mostly ordered in both forms (Fig. 3 A and B). Proteins in the REV and PSP datasets exhibit disordered binding modes, which are comparable to droplet-driver proteins, so they likely phase separate spontaneously. Proteins associated with known membraneless organelles (UNI) or identified by high-throughput experiments (HTS) (29) have significantly lower conformational entropy in both free and bound states, and thus likely have components that form droplets via partner interactions. Comparison of spontaneously phase-separating and non–phase-separating proteins (Fig. 3 C and D) indicates that a high conformational entropy is a characteristic of the droplet state.

Conformational properties in different datasets of LLPS proteins in the free and bound states. PhaSepDB literature reviewed (light blue), PhaSepDB human organelle-associated proteins from UniProt (steel blue), PhaSepDB proteins identified by high-throughput experiments (dark blue), PhaSePro (orange), LLPSDB one-component proteins (droplet drivers; wheat), and two-component (droplet clients; gray) phase-separating proteins. (A) The probability for the disordered state (pD) in the free form was characterized by the fraction of disordered residues, as computed by the ESpritz NMR program (36). Residues are classified as disordered if they have an ID score ≥0.3089. The fraction of disordered residues was computed per protein as NID/NAA and these values were averaged for each dataset. (B) Probability for disordered binding (pDD) was computed by the FuzPred program (23). The median pDD value was determined for each protein and these values were averaged for each dataset. (C and D) Comparison of pD and pDD in droplet-driving (LLPS; light blue) and non–phase-separating proteins (nsLLPS; dark blue). Statistical significances were computed by the Mann–Whitney U test using the R program (**P < 10^-3, ***P < 10^-5, ****P < 10^-10).
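The dataset-level averages described in this caption can be sketched in a few lines of Python. The per-residue scores are assumed to come from external predictors (the caption names ESpritz NMR and FuzPred) and are simply passed in as lists of numbers. The function names are hypothetical, and only the 0.3089 cutoff is taken from the text.

```python
from statistics import median

ID_THRESHOLD = 0.3089  # disorder cutoff quoted in the caption


def disordered_fraction(id_scores, threshold=ID_THRESHOLD):
    """N_ID / N_AA for one protein: share of residues with score >= threshold."""
    if not id_scores:
        raise ValueError("empty score list")
    n_id = sum(1 for s in id_scores if s >= threshold)
    return n_id / len(id_scores)


def dataset_pd(proteins):
    """Average of the per-protein disordered fractions over a dataset."""
    fractions = [disordered_fraction(scores) for scores in proteins]
    return sum(fractions) / len(fractions)


def dataset_pdd(proteins):
    """Average over a dataset of the per-protein median pDD value."""
    medians = [median(scores) for scores in proteins]
    return sum(medians) / len(medians)
```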

Sequence-Based Prediction of Droplet Propensity Profiles of Proteins.

Based on the analysis reported above, in this section, we present a method of predicting the sequence-based profile of the propensity of proteins to form spontaneously the droplet state (pDP). To achieve this result, we define the probability of residue Ai to be involved in spontaneous phase separation by pDP (Ai) using a binary logistic model as

$$p_{DP}(A_i) = \frac{\exp\left(F_S(A_i)\right)}{1 + \exp\left(F_S(A_i)\right)}, \qquad [1]$$

where $F_S(A_i)$ is the scoring function for the residue

$$F_S(A_i) = \lambda_1\, p_D(A_i) + \lambda_2\, p_{DD}(A_i) + \gamma, \qquad [2]$$

where $p_D(A_i)$ is the probability for disorder in the free state and $p_{DD}(A_i)$ is the probability for disordered binding (23). $p_D(A_i)$ contains an estimate of the conformational entropy in the unbound form, while $p_{DD}(A_i)$ contains an estimate of the binding entropy. λ1 and λ2 are the linear coefficients of the predictor variables and γ is a scalar constant (intercept), which were determined using the binary logistic model (Materials and Methods and Dataset S4). pD was derived from the disorder score as computed using the ESpritz NMR algorithm (36), with the best performance on disordered protein complexes (23). The pDD values were predicted by the FuzPred method, which describes binding modes under cellular conditions (27). The pD and pDD values capture the balance between enthalpy and entropy that stabilizes the droplet state, which is associated with the nonspecific nature of a variety of side-chain interactions.
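A minimal Python sketch of Eqs. 1 and 2 is given below. The fitted coefficients are reported in Dataset S4 and are not reproduced in this text, so λ1, λ2, and γ are left as required arguments and any values passed in are placeholders. The function names are hypothetical.

```python
import math


def droplet_propensity(p_d, p_dd, lam1, lam2, gamma):
    """Residue-level pDP from Eqs. 1 and 2.

    p_d   : probability of disorder in the free state for the residue
    p_dd  : probability of disordered binding for the residue
    lam1, lam2, gamma : logistic-model coefficients (not given in the text)
    """
    f_s = lam1 * p_d + lam2 * p_dd + gamma   # Eq. 2: scoring function
    # Eq. 1: exp(F) / (1 + exp(F)), written in the equivalent form 1 / (1 + exp(-F)).
    return 1.0 / (1.0 + math.exp(-f_s))


def propensity_profile(p_d_values, p_dd_values, lam1, lam2, gamma):
    """pDP for every residue of a sequence, given per-residue pD and pDD."""
    return [droplet_propensity(pd, pdd, lam1, lam2, gamma)
            for pd, pdd in zip(p_d_values, p_dd_values)]
```

Because both inputs are probabilities in [0, 1], the score stays in a narrow range for moderate coefficients, so the simple exponential form is numerically safe here.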

To train our model, we used a dataset of droplet-promoting regions with evidence of mediating spontaneous phase separation (Materials and Methods and Dataset S1). As a negative set, we defined regions in non–phase-separating proteins with the same length distribution as in the positive set (Materials and Methods). The size of the negative set was 10 times that of the positive set and we applied stratified sampling in the training. We found that the linear coefficients were robust over many random selections of the positive and negative sets, as well as the training set size (Dataset S4). In the final parameterization, the linear coefficients of both disorder and binding modes were positive, reflecting the preference for a disordered bound state in the droplets. The threshold to mediate droplet formation was derived from the binary logistic model (pDP ≥ 0.60).

To estimate the performance of the method, we calculated an area under the curve (AUC) value of 87.0% on the training set and an AUC value of 85.9% on the test set (Materials and Methods and Dataset S4). We applied these coefficients to all droplet regions and obtained an AUC value of 84.4%. These results illustrate that the parameters are robust across droplet regions from different organisms. We also note that droplet-promoting propensity profiles of proteins that were observed to form droplets under cellular conditions and those that were detected only by in vitro experiments are not significantly different (SI Appendix, Fig. S3).
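For readers unfamiliar with the AUC metric quoted above, the following Python sketch shows one standard way of computing it. The text does not say which implementation the authors used, so this is an assumption: the area under the ROC curve is computed here through its pairwise-ranking definition, that is, the probability that a randomly chosen positive example scores higher than a randomly chosen negative one, with ties counted as one half.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of positives and negatives.

    scores : predicted values (for example pDP), higher meaning more positive
    labels : truthy for positive examples, falsy for negative examples
    """
    pos = [s for s, l in zip(scores, labels) if l]
    neg = [s for s, l in zip(scores, labels) if not l]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties contribute one half
    return wins / (len(pos) * len(neg))
```

The quadratic loop is fine for small illustrations; for large datasets a rank-based computation gives the same value more efficiently.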

We have thus developed the FuzDrop method to predict droplet-promoting propensities of residues from the primary sequence based on the conformational entropy of disordered binding modes in droplets.

Droplet-Promoting Propensity Profiles of TDP-43 and α-Synuclein.

We applied the FuzDrop method to predict the droplet-promoting propensities of two proteins reported to undergo liquid–liquid phase separation, TDP-43 (37) and α-synuclein (38, 39). Our results indicate that the low-complexity region of TDP-43 (residues 262 to 414) mediates spontaneous phase separation. We note that the α-helical segment (residues 320 to 331), which constitutes the amyloid core in TDP-43 fibrils (40) (Fig. 4A) and is predicted to undergo disorder-to-order transition upon binding, also has a high droplet-promoting propensity (Fig. 4A).

Droplet-promoting propensity profiles (pDP) of the TDP-43 low-complexity (LC) domain and of α-synuclein. (A) The TDP-43 LC domain has overall high droplet-promoting propensities. The depletion in the droplet profile corresponds to the α-helical segment (orange), which is involved in the amyloid core. The N- (lime) and C- (blue) flanking regions are disordered in the NMR structure of the G335D mutant (PDB ID code 2n4g). (B) The disordered C-terminal region of α-synuclein (blue) is predicted to drive droplet formation. The N-terminal region (lime), which folds into an α-helix, has intermediate pDP values. The ensemble is derived from the Protein Ensemble Database (PED9AAC). The pLLPS threshold is indicated by a bold gray line.

In the case of α-synuclein, the highly disordered C-terminal region (residues 98 to 140), which also remains disordered upon binding to lipid vesicles (41), is predicted to drive the formation of the droplet state (Fig. 4B). The central non-amyloid beta component (NAC) region has lower pDP propensity to spontaneously phase separate, but may be involved in droplets via hydrophobic protein interactions, which are absent from β-synuclein and γ-synuclein (38).

Sequence-Based Prediction of Droplet-Driving Proteins.

In this section, we present a method of ranking proteins according to their propensity to form the droplet state. In order to achieve this result, we estimate the probability of liquid–liquid phase separation (pLLPS) using a binary logistic model (Materials and Methods) with a scoring function (FLLPS) derived from residue droplet-promoting propensities and a term for hydrophobic interactions

$$F_{\mathrm{LLPS}} = \lambda_1 \cdot \mathrm{median}\langle p_{\mathrm{DP}}(A_i) \rangle + \lambda_2 \cdot n_{\mathrm{DPR}} + \lambda_3 \cdot H + \gamma, \tag{3}$$

where $\mathrm{median}\langle p_{\mathrm{DP}}(A_i) \rangle$ is the median of the residue droplet-promoting propensities, $n_{\mathrm{DPR}}$ is the number of long droplet-promoting regions (DPRs ≥25 consecutive residues with pDP ≥ 0.6), and $H$ is a hydrophobic term (≥6-residue hydrophobic motifs within disordered regions) (Materials and Methods). λ1, λ2, and λ3 are the linear coefficients of the predictor variables and γ is a scalar constant (intercept), which we determined on the LLPStrain and nsLLPStrain datasets (Materials and Methods and Dataset S5). We found that the linear coefficients were robust over many random selections of the positive and negative sets, as well as the training set size (Dataset S5). The threshold to mediate spontaneous liquid–liquid phase separation was derived from the binary logistic model (pLLPS ≥ 0.61). We propose that the pLLPS value expresses the droplet-driving potential under physiological conditions, as droplet-promoting propensities of proteins that form droplets under physiological conditions and those that were detected to phase separate only in vitro do not deviate significantly (SI Appendix, Fig. S2). We also note that using nonphysiological conditions, such as high concentrations of protein and crowding agents, can induce liquid–liquid phase separation at pLLPS values below the threshold, especially if droplet-promoting regions are present.
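As a rough illustration of how the scoring function of Eq. 3 could be evaluated, the sketch below counts long droplet-promoting regions in a residue profile and passes the linear score through a standard logistic function. This is not the FuzDrop implementation: the fitted coefficients are in Dataset S5 and are not reproduced here, so the λ and γ values are placeholders, the hydrophobic term is supplied by the caller rather than computed, and the use of the plain logistic link is an assumption.

```python
import math
from statistics import median


def count_dprs(pdp, min_len=25, cutoff=0.6):
    """Count runs of at least min_len consecutive residues with pDP >= cutoff."""
    count, run = 0, 0
    for value in pdp:
        if value >= cutoff:
            run += 1
        else:
            if run >= min_len:
                count += 1
            run = 0
    if run >= min_len:  # a qualifying run may end at the last residue
        count += 1
    return count


def p_llps(pdp, hydrophobic_term, lam1, lam2, lam3, gamma):
    """Logistic transform of the linear score in Eq. 3.

    hydrophobic_term stands in for H and is passed in directly.
    lam1, lam2, lam3 and gamma are placeholders, not the published values.
    """
    score = (lam1 * median(pdp)
             + lam2 * count_dprs(pdp)
             + lam3 * hydrophobic_term
             + gamma)
    return 1.0 / (1.0 + math.exp(-score))


# Synthetic profile: one 30-residue run above the cutoff, one 24-residue run
# that is too short to count as a long droplet-promoting region.
profile = [0.3] * 10 + [0.8] * 30 + [0.2] * 5 + [0.7] * 24
print(count_dprs(profile))
print(p_llps(profile, hydrophobic_term=0.0,
             lam1=1.0, lam2=1.0, lam3=0.0, gamma=-1.7))
```

A protein would then be classed as droplet-driving when the returned probability reaches the threshold quoted above (pLLPS ≥ 0.61), provided the real coefficients are used.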

To estimate the performance of the method, we calculated an AUC value of 88.3% on the training set (0.75 of the LLPS dataset) and an AUC value of 90.7% on the test set, using stratified sampling (Materials and Methods and Dataset S5). As an attempt to further improve performance, we incorporated a π–π term (19) into the scoring function of the logistic model (Materials and Methods). Adding this term slightly increased the performance of the model (AUC 92.2%; Dataset S5) with a moderate contribution to the scoring function. These results are in accord with the presence of π–π interactions in many droplet proteins, but also show that these interactions are not prerequisites for droplet formation.

The performance and robustness of the model (Eq. 3 and Dataset S5) demonstrate that the droplet state can be predicted from sequence based on the estimated conformational entropy of binding and a nonspecific enthalpy term. We also note that the model in Eq. 3 serves as a general framework for predicting droplet-driver proteins. Accumulating data collected using more systematic and uniform experimental approaches (8) will enable further refinement of the parameters in our model and prediction of the minimum concentration for phase separation, although this property can be expected to be highly dependent on the cellular conditions.

Region Specificity of the FuzDrop Method and Experimental Validation of the Predictions.

We note that estimates of the overall propensity of a protein to form the droplet state cannot be readily obtained by a simple average of the values of the profiles of Eq. 2. This overall propensity is also determined by specific regions, rather than only by the general properties of the entire sequence, including in particular droplet-promoting regions and short motifs within disordered regions, which are prone to establish hydrophobic interactions (Eq. 3). This point can be illustrated by distinct behaviors of α-synuclein and β-synuclein (38). The C-terminal region of both proteins possesses a droplet-promoting region, with a preference for disordered binding modes (Fig. 4 and SI Appendix, Fig. S3). In addition, the NAC region of α-synuclein contains eight hydrophobic residues, biased for disordered binding, which can exert a nonspecific driving force (resembling hydrophobic collapse) for droplet formation. Notably, however, β-synuclein and γ-synuclein, which lack these residues (SI Appendix, Fig. S3), were not observed to undergo liquid–liquid phase separation under physiological conditions (38).

The predicted pLLPS values by the FuzDrop method (0.62 for α-synuclein, 0.54 for β-synuclein, and 0.40 for γ-synuclein) suggest that β-synuclein and γ-synuclein have lower propensity to adopt the droplet state as compared with α-synuclein. Indeed, γ-synuclein did not phase separate under any of the experimental conditions tested (38). To validate the predictions close to the prediction threshold, we explored β-synuclein phase behavior in a set of in vitro experiments (Fig. 5). In line with previous observations (38, 39), we did not observe any droplets after incubating high concentrations of fluorescein 5-isothiocyanate (FITC)-labeled β-synuclein on a glass surface, whereas we did observe droplets for FITC-labeled α-synuclein (Fig. 5 A and B). As hydrophobic effects are important for α-synuclein droplet formation and considering that β-synuclein lacks the predominantly hydrophobic segment in the NAC region, we reasoned that raising the experimental temperature would increase the strength of residual hydrophobic interactions, allowing the protein to cross the phase barrier. Indeed, β-synuclein formed micrometer-sized droplets when the temperature was raised by 10 °C and at high concentrations (Fig. 5C). Droplets formed by FITC–β-synuclein were initially liquid-like, as evidenced by fluorescence recovery after photobleaching (FRAP), but showed rapid conversion to a gel-like state (Fig. 5C). The phase separation behavior of β-synuclein illustrates that protein phase separation is highly dependent on the experimental conditions, that proteins with a predicted pLLPS below the threshold (pLLPS < 0.61) require more extreme conditions to adopt the droplet state, and that the droplet state of these proteins is generally short-lived.

Region-specific phase behavior of α-synuclein and β-synuclein. (A) FITC-labeled β-synuclein (pLLPS 0.54), which lacks the characteristic NAC region found in α-synuclein, does not phase separate at high concentrations (200 μM) and under crowding conditions (10% [weight/volume] PEG), whereas FITC-labeled α-synuclein (pLLPS 0.62) readily forms droplets under the same conditions. (B) Increasing the experimental temperature by 10 °C does lead to rapid coalescence of β-synuclein into micrometer-sized droplets. (C) Rapid FRAP of a small area within a droplet 1 min after phase separation (Top), FRAP 3 min after phase separation (Bottom), and a nonlinear fit of fractional fluorescence recovery over time (Right). (Scale bars, 10 μm [A and B] and 5 μm [C].)

As an additional test of our predictions, we ranked a set of proteins associated with Alzheimer’s disease (42) based on their predicted FuzDrop scores (Dataset S6) and selected one of the top candidates, complexin-1, for experimental validation (Fig. 6). To assess whether complexin-1 can form droplets through liquid–liquid phase separation, we incubated Alexa 488-labeled complexin-1 on a glass surface under crowding conditions at physiological pH (Materials and Methods). After a brief lag phase (<1 min), complexin-1 formed micrometer-sized droplets in suspension (Fig. 6A). The droplets were characteristic of a liquid phase, as they showed distinct wetting behavior after prolonged incubation (>10 min) (Fig. 6A) and fused upon making contact (Fig. 6B). Furthermore, molecules within the droplets showed local rearrangement, as evidenced by rapid FRAP (Fig. 6C). We also predicted that the disordered N-terminal region of complexin-1 drives its liquid–liquid phase separation (SI Appendix, Fig. S4). This region cooperatively interacts with the SNARE complex and plasma membrane (43) to facilitate synaptic vesicle fusion (44). Phase separation may contribute to activation of complexin-1 by relieving its autoinhibition, which is a common mechanism in the droplet state (21).

Complexin-1 undergoes liquid–liquid phase separation. (A) Alexa 488-labeled complexin-1 (10 μM) coalesces into micrometer-sized droplets under crowding conditions (Left). Droplets exhibit a wetting phenotype when encountering a glass surface (Right). (B) Complexin-1 droplets readily fuse when in close proximity (<1 μm) and relax into a round structure after fusion, as noted by the arrows. (C) Rapid FRAP of a small area within a droplet (Top) and a nonlinear fit of fractional fluorescence recovery over time (Bottom). (Scale bars, 5 μm [A] and 1 μm [B and C].)

Droplet-Driving and Droplet-Client Proteins in the Human Proteome.

We applied the prediction method to estimate the proteins capable of undergoing spontaneous liquid–liquid phase separation (droplet-driving proteins) in the Swiss-Prot human proteome. We thus ranked the proteins in the human proteome according to their propensity to form the droplet state (Dataset S7), and estimated that about 40% of them are capable of spontaneous droplet formation.

This list contains only about 60% of the human proteins currently associated with membraneless organelles (UNI). This fraction is even lower for proteins identified by high-throughput experiments (HTS), including organelle purification (45, 46), affinity purification (47, 48), immunofluorescence image-based screen (49, 50), and proximity labeling (51, 52) (SI Appendix, Fig. S5). As the FuzDrop approach was developed for proteins that drive droplet formation, our results indicate that membraneless organelles also contain proteins that undergo phase separation by being driven by a partner (droplet-client proteins). We observed that droplet clients have a lower conformational disorder in both free and bound states (Fig. 3), suggesting the involvement of distinct, local motifs. Thus, the droplet-client mechanism can provide a route for structured proteins to be engaged in condensates via specific droplet-promoting regions.

To investigate the properties underlying the droplet-client mechanism, we analyzed the presence of long and short droplet-promoting regions in the droplet-driver (LLPS) and droplet-client (LPS-C) datasets (Table 1). We found that ∼90% of droplet-client proteins contain a short droplet-promoting region (≥10 residues), while only ∼60% have long ones (≥25 residues). The frequency of short and long droplet-promoting regions in proteins, identified by high-throughput experiments, is comparable to droplet-client proteins (Table 1), indicating that they follow a partner-induced client mechanism. In contrast, the frequency of droplet-promoting regions in proteins associated with human membraneless organelles is comparable to droplet drivers (Table 1). Considering their lower droplet-promoting propensities (Dataset S7), these results indicate that proteins in membraneless organelles likely follow both driver and client mechanisms.

Percentage in different datasets of proteins containing regions predicted to be droplet promoting

Overall, we thus estimate that over 80% of the proteins in the human proteome contain regions that can mediate droplet formation. Half of these proteins can condense spontaneously, while the other half can do so by interacting with other components (Table 1). We have also observed that the number of droplet-promoting regions is comparable in proteins observed to form droplets under physiological conditions or detected by in vitro experiments (SI Appendix, Fig. S2), corroborating the relevance of the predictions under cellular conditions. We then extended these results to other organisms (Dataset S8), leading to the suggestion that the droplet state is a proteome-wide phenomenon.



Proteins have complex and dynamic shapes. The function of a protein is determined by its structure; a change in the protein’s activity involves a change in some portion of the protein’s structure (shape). What, then, determines a protein’s structure?

Proteins are assembled as a linear chain of amino acids covalently linked by peptide bonds. As this chain is being assembled (each subsequent amino acid is added onto the free carboxy-terminus of the nascent polypeptide chain), the polypeptide chain begins to fold.

Biologists distinguish 4 levels of protein structure. Students should be able to identify the four levels of protein structure, and the molecular forces or interactions responsible for stabilizing each level of structure.

Four levels of protein structure

Primary – the linear sequence of amino acids, held together by covalent peptide bonds.

Secondary – alpha helices and beta sheets, stabilized by hydrogen bonds between peptide backbone amino groups and carboxyl groups of amino acids within the same polypeptide chain, but not immediately next to each other.

Tertiary – overall 3-D shape of the folded polypeptide chain, that can be described as the spatial relationships of the secondary structure elements linked by loops. Stabilized by various types of amino acid side chain interactions, including: hydrophobic and van der Waals interactions, hydrogen bonding, ionic bonds, covalent disulfide bonds between cysteine residues, and interactions with solvent water molecules.

Quaternary – assemblage of two or more folded polypeptides into a functional unit. Stabilized by interchain hydrophobic and van der Waals interactions, hydrogen bonding, ionic bonds, and covalent disulfide bonds between cysteine residues on different polypeptide chains.

1. The classic case exploring protein structure is hemoglobin. Functional hemoglobin is a tetramer, consisting of two alpha-globin and two beta-globin polypeptide chains. Hemoglobin also requires a cofactor, heme (also called a prosthetic group), containing an iron atom that binds oxygen.

a) What levels of protein structure does hemoglobin exhibit?

b) The most common sickle-cell disease mutation changes a glutamic acid (a negatively charged amino acid) in beta-globin to valine (a hydrophobic amino acid). Where would you most commonly expect to find a charged amino acid like glutamic acid, in the interior of the folded protein or on the surface?

c) Which of the following changes do you think might also cause sickle-cell disease?

i) the glutamic acid changes to an aspartic acid, a different negatively charged amino acid

ii) the glutamic acid changes to a lysine, a positively charged amino acid

iii) the glutamic acid changes to a tryptophan, a hydrophobic amino acid

iv) the glutamic acid changes to a serine, an uncharged, hydrophilic amino acid

d) Sickle cell hemoglobin mutations alter what levels of protein structure (when sickling of red blood cells is apparent)?

2. The most common mutation associated with cystic fibrosis causes a single amino acid, a phenylalanine, to be omitted from the protein called CFTR (cystic fibrosis transmembrane conductance regulator). The CFTR protein functions as a chloride channel in the membrane, formed as the single long CFTR polypeptide chain crosses the membrane back and forth several times. The absence of this phenylalanine, which has a large hydrophobic side chain, causes the protein to be misfolded. This misfolded protein is recognized by the cellular quality control system and sent to the cellular recycling center (the proteasome); only about 1 percent makes it to the proper destination, the plasma membrane. My case study is published as a blog post:

3. Microbes that live in extreme environments of temperature, salt and pH have proteins that are adapted for structural stability in these extreme environments.

Edits: 9/30/12 – replaced link to single video with 2 split videos of protein structure and sickle cell hemoglobin


Materials and Methods

Protein Engineering.

Glu-containing variants of the Δ+PHS variant of SNase were prepared with QuikChange site-directed mutagenesis on a pET24A+ vector as described previously (9, 16). Purification was performed as described previously (33).

Thermodynamic Stability Measurements.

Stability measurements were performed with guanidinium chloride titrations using an Aviv Automated Titration Fluorimeter 105, as described previously (34). Linkage analysis of pH dependence of stability to obtain pKa values was performed as described previously (9, 10, 12).
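The cited protocols are not reproduced in this section, so the following is only a sketch of the general idea behind linkage analysis, assuming a single titratable group whose pKa differs between the native and denatured states. The function name, the gas-constant units, and the example pKa values are illustrative assumptions, not values taken from the study.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol K)


def ddg_unfolding(pH, pka_native, pka_denatured, temp_k=298.15):
    """Change in unfolding free energy (kcal/mol) caused by one titratable group.

    The reference state is the fully protonated group. Each state contributes
    a proton-binding factor 1 + 10**(pH - pKa); the ratio of the denatured
    and native factors gives the pH dependence of stability.
    """
    q_native = 1.0 + 10.0 ** (pH - pka_native)
    q_denatured = 1.0 + 10.0 ** (pH - pka_denatured)
    return -R_KCAL * temp_k * math.log(q_denatured / q_native)


# Hypothetical buried Glu: pKa 7.5 in the native state, 4.5 when unfolded.
for pH in (3.0, 5.0, 7.0, 9.0):
    print(pH, round(ddg_unfolding(pH, 7.5, 4.5), 2))
```

Under these assumptions, fitting measured stabilities against pH with the native-state pKa as the free parameter is one way such a value could be extracted; at high pH the curve levels off at a loss of about 1.36 kcal/mol per unit of pKa shift at 25 °C.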

Optical Spectroscopy.

pH titrations monitored with CD at 222 nm or with Trp fluorescence were performed with an Aviv circular dichroism spectrometer model 215 and with an Aviv Automated Titration Fluorimeter model 105, respectively. The experiments were performed following protocols published previously (34).


Difference Between Hydrophilic and Hydrophobic

Solvents, mixtures, compounds, and particles are everyday components of a chemist’s work. Studying how molecules behave in a given state or environment can seem daunting to those with little background in chemistry and related sciences, but such studies are very helpful in developing new products in various industries.

Chemists, biologists, and other individuals pursuing a career in science start by obtaining the necessary training at universities and colleges. When they choose a career related to biochemistry, their education starts with lessons that give them a deeper understanding of molecular activities and behavior.

That being said, it is safe to assume that the basic courses offered during their first year of college include an evaluation of the hydrophobic and hydrophilic nature of molecules and other particles.

The prefix “hydro-” means “water.” Thus, studying hydrophobic and hydrophilic molecules concerns the solubility and other properties of particles as they interact with water. The suffix “-phobic,” originating from “phobia,” translates to “fearful of (water).” Hydrophobic molecules and particles, therefore, can be defined as those that do not mix with water – they repel it. On the other hand, hydrophilic molecules are those that interact well with H2O.

In other words, hydrophobic particles are distinguished by their repellency of water, and hydrophilic molecules by their attraction to it.

In a laboratory experiment, for example, one can observe that some substances dissolve in water and others do not. Crushed and powdered oil-based makeup, for example, may dissolve in a glass of cooking oil, but not in a glass of water. Salt, on the other hand, dissolves readily in water, but not in oil.

Crushed and powdered makeup, therefore, can be seen as hydrophobic particles. Meanwhile, students can conclude that salt is hydrophilic: it has a strong affinity for water, which dissolves it. The oil-based makeup, on the other hand, contains molecules that repel and do not mix with the molecules of water.

Aside from laboratory experiments, this molecular behavior in reference to the hydrophobic and hydrophilic nature is also observed when biologists look into the permeability of cell membranes. Note that several particles may enter and exit the cell through the membrane, which is made of lipid bilayers and proteins.

When the particles are hydrophobic, simple passive diffusion occurs, which means that the molecule needs no input of energy to enter or exit the cell. This is because the interior of the cell membrane is itself hydrophobic and is therefore compatible with these molecules.

On the other hand, hydrophilic particles may need protein carriers for facilitated diffusion. This is because polar and charged molecules are not compatible with the hydrophobic interior of the membrane.

To get a clearer understanding of this, picture a glass of water and a glass of cooking oil. When water is added to the oil, the two liquids separate instead of mixing. But when one puts water into water and oil into oil, they mix freely.

Organic chemistry provides an explanation for this phenomenon. Water molecules are polar; it therefore follows that polar substances and particles are attracted to H2O. Hydrophilic molecules are polar or ionic: they carry full or partial positive and negative charges, which attract water molecules. Conversely, hydrophobic particles are non-polar.
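For amino acids, the polar versus non-polar distinction can be put on a numerical scale. The sketch below uses the Kyte-Doolittle hydropathy scale, which the text above does not mention and is brought in here only as an illustration: positive values mark hydrophobic side chains and negative values mark hydrophilic ones. The function name `mean_hydropathy` and the example peptides are made up.

```python
# Kyte-Doolittle hydropathy values, one-letter amino acid codes.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydropathy(sequence):
    """Average hydropathy of a peptide; higher means more hydrophobic."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


# Made-up peptides: one rich in non-polar residues, one rich in charged ones.
for peptide in ("VVLLII", "DEKR"):
    label = "hydrophobic" if mean_hydropathy(peptide) > 0 else "hydrophilic"
    print(peptide, round(mean_hydropathy(peptide), 2), label)
```

A single average is a coarse measure, since it ignores where the residues sit in a folded protein, but it captures the same idea as the oil and water picture: charged and polar side chains score as water-loving, non-polar ones as water-avoiding.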

Summary:

1. Hydrophilic means water-loving; hydrophobic means resistant to water.
2. Hydrophilic molecules dissolve in water, while hydrophobic molecules dissolve only in oil-based substances.
3. Hydrophilic molecules require facilitated diffusion, while hydrophobic molecules can cross the cell membrane by simple passive diffusion.
4. Hydrophilic molecules are polar or ionic; hydrophobic molecules are non-polar.