Analysis: Limited agreement among lists of Cdc28p substrates

November 3, 2009

A collaboration between the Morgan lab at UCSF and the Gygi lab at Harvard has resulted in a paper by Holt et al. in Science, which reports the identification of several hundred substrates of the central cell-cycle kinase Cdc28p (also known as Cdk1) in the budding yeast Saccharomyces cerevisiae:

Global analysis of Cdk1 substrate phosphorylation sites provides insights into evolution.

To explore the mechanisms and evolution of cell-cycle control, we analyzed the position and conservation of large numbers of phosphorylation sites for the cyclin-dependent kinase Cdk1 in the budding yeast Saccharomyces cerevisiae. We combined specific chemical inhibition of Cdk1 with quantitative mass spectrometry to identify the positions of 547 phosphorylation sites on 308 Cdk1 substrates in vivo. Comparisons of these substrates with orthologs throughout the ascomycete lineage revealed that the position of most phosphorylation sites is not conserved in evolution; instead, clusters of sites shift position in rapidly evolving disordered regions. We propose that the regulation of protein function by phosphorylation often depends on simple nonspecific mechanisms that disrupt or enhance protein-protein interactions. The gain or loss of phosphorylation sites in rapidly evolving regions could facilitate the evolution of kinase-signaling circuits.

The paper presents several interesting analyses and observations. However, I found the comparison to the previous study of Cdc28p substrates by Ubersax et al. from the Morgan lab to be less detailed than I had hoped for:

Phosphorylation of Cdk1 consensus sites was observed on 67% (122 of 181) of proteins previously identified as Cdk1 substrates in vitro (4). Sixty-six percent (80 of 122) of these proteins contained sites at which phosphorylation decreased (log2 H/L < –1) after inhibition of Cdk1 (only 45 of 122 are expected if there is no correlation between the experiments in vitro and in vivo; χ2 test, P < 10^-10).

In other words, 44% (80 of 181) of Cdc28p substrates identified in the old study were confirmed by the new study, and only 26% (80 of 308) of the Cdc28p substrates identified in the new study are supported by the old study. There are many possible explanations for this discrepancy.
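Before turning to those explanations, the arithmetic behind the percentages can be checked with a few lines of Python. This is only a sketch: the counts are taken from the quote above, but the form of the chi-square test (a one-degree-of-freedom goodness-of-fit test against the expected 45 of 122) is my assumption, since the paper does not spell it out here.

```python
import math

confirmed = 80           # old-study substrates with Cdk1-dependent sites in the new study
old_total = 181          # substrates reported by Ubersax et al.
new_total = 308          # substrates reported by Holt et al.
observed_in_ms = 122     # old-study substrates with phosphorylated consensus sites
expected_by_chance = 45  # expectation quoted in the paper

frac_old_confirmed = confirmed / old_total
frac_new_supported = confirmed / new_total

# Assumed test: chi-square goodness of fit with one degree of freedom,
# comparing (confirmed, not confirmed) with the quoted chance expectation.
observed = [confirmed, observed_in_ms - confirmed]
expected = [expected_by_chance, observed_in_ms - expected_by_chance]
chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# For one degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
p_value = math.erfc(math.sqrt(chi2 / 2))

print(f"old study confirmed by new: {frac_old_confirmed:.0%}")
print(f"new study supported by old: {frac_new_supported:.0%}")
print(f"chi2 = {chi2:.1f}, P = {p_value:.1e}")
```

Under this assumed form of the test, the P-value falls below the threshold quoted in the paper.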

Depth of the mass spectrometry

It is notoriously difficult to identify peptides from low-abundance proteins in mass spectrometry. In the new mass spectrometry study, the authors were able to map 8710 precise phosphorylation sites on 1957 proteins. However, budding yeast is estimated to express on the order of 4500 distinct proteins during exponential growth (Gavin et al., 2006). Assuming that the majority of these proteins contain sites that are phosphorylated during at least part of the mitotic cell cycle, it is likely that a considerable number of low-abundance Cdc28p substrates identified in the old study have been missed in the new study.

Biases in phosphopeptide enrichment

When doing phosphoproteomics, it is necessary to first enrich for phosphopeptides to improve the coverage. To this end, Holt et al. used immobilized metal affinity chromatography (IMAC). In 2007, the Aebersold group at ETH published a paper showing that different purification methods lead to isolation of different, partially overlapping segments of the phosphoproteome. Specifically, they showed that IMAC enrichment biases the data towards isolation of multiply phosphorylated peptides. Given that only a single purification method was used, it is likely that some in vivo Cdc28p substrates were missed in the new study, in particular if their peptides contain only a single phosphorylation site.

In vitro vs. in vivo conditions

The old study by Ubersax et al. was performed on cell lysate, which is an in vitro strategy (although all other proteins expressed during the cell cycle are present). It is thus likely that some of the proteins that are phosphorylated by Cdc28p under these conditions are nonetheless not in vivo Cdc28p substrates.

Can we do better?

As always, it is easy to point out potential flaws in other people’s data sets; however, it is much more constructive to do something about the problems. The challenge is thus to construct a larger and more reliable set of Cdc28p substrates by combining the data from the two studies.

To check the feasibility of assigning confidence scores to different putative Cdc28p substrates, I tested if the fold change observed in the new study correlates with the chance that the substrate was also identified in the old study. To this end, I divided the 308 Cdc28p substrates from the new study into two groups, based on whether or not they were also identified in the old study, and constructed histograms of the fold changes for each group:

Phosphorylation ratios from Holt et al.

The fold changes are clearly skewed towards larger negative values for the Cdc28p substrates also identified by the old study relative to the proteins that were not previously identified as Cdc28p substrates. This difference is statistically significant at P < 1% according to the Kolmogorov-Smirnov test. This suggests that the observed fold changes in the new mass spectrometry study correlate with the likelihood that the proteins are true Cdc28p substrates.
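For readers who want to repeat this kind of comparison, here is a minimal sketch of a two-sample Kolmogorov-Smirnov test using only the Python standard library. The log2 ratios are made-up placeholders, not values from Holt et al., and the P-value uses the standard asymptotic approximation, which is only rough for small samples.

```python
import math
from bisect import bisect_right

def ks_two_sample(a, b):
    """Return (D, approximate P) for the two-sample Kolmogorov-Smirnov test."""
    a, b = sorted(a), sorted(b)
    n1, n2 = len(a), len(b)
    # D is the largest gap between the two empirical distribution functions,
    # which can only occur at one of the observed values.
    d = max(abs(bisect_right(a, v) / n1 - bisect_right(b, v) / n2) for v in a + b)
    n_eff = n1 * n2 / (n1 + n2)
    lam = (math.sqrt(n_eff) + 0.12 + 0.11 / math.sqrt(n_eff)) * d
    if lam < 0.2:
        return d, 1.0  # the series below converges too slowly here; P is ~1
    # Asymptotic Kolmogorov distribution (alternating series).
    p = 2 * sum((-1) ** (k - 1) * math.exp(-2 * k * k * lam * lam)
                for k in range(1, 101))
    return d, min(max(p, 0.0), 1.0)

# Hypothetical log2 ratios, for illustration only.
also_in_old_study = [-3.1, -2.8, -2.5, -2.2, -1.9, -1.6, -1.4]
only_in_new_study = [-1.8, -1.5, -1.4, -1.3, -1.2, -1.1, -1.05]

d_stat, p_approx = ks_two_sample(also_in_old_study, only_in_new_study)
print(f"D = {d_stat:.2f}, approximate P = {p_approx:.3f}")
```

With real data one would pass the two lists of log2 ratios for the 308 substrates, split by presence in the old study.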

The old study gave rise to so-called P-scores for the individual proteins (not to be confused with P-values). To test if these too can be used as quality scores, I constructed an equivalent histogram in which the Cdc28p substrates found in the old study were divided into two groups based on whether or not they were also found in the new study:

P-scores from Ubersax et al.

In this case, no obvious trend is seen and a Kolmogorov-Smirnov test indeed reveals no statistically significant difference between the two distributions. Surprisingly, the P-scores do thus not appear to be useful quality scores for the putative Cdc28p substrates.

Given the two sets of putative Cdc28p substrates, only one of which can be ranked by reliability, how can we create a better combined set? If one aims for high accuracy at the price of low coverage, one could obviously choose to trust only the substrates identified by both screens. However, given the caveats regarding depth of mass spectrometry and biases arising from the enrichment procedure, I would be hesitant to use this approach. Alternatively, one could aim for maximal coverage at the price of accuracy by trusting all substrates identified by either study. However, seeing the large fraction of novel substrates identified by Holt et al. with a log2-ratio only slightly below -1, I would personally tend to apply a more stringent threshold to the data from the new study by Holt et al., for example requiring a log2-ratio below -2, before merging the sets of substrates from the two studies.
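The three strategies discussed above amount to simple set operations. The sketch below illustrates them in Python; the protein names and ratios are placeholders that I made up, and the function names are purely illustrative.

```python
def intersection_set(old_study, new_study_log2):
    """High accuracy, low coverage: substrates found by both screens."""
    return set(old_study) & set(new_study_log2)

def union_set(old_study, new_study_log2):
    """Maximal coverage: substrates found by either screen."""
    return set(old_study) | set(new_study_log2)

def stringent_merge(old_study, new_study_log2, threshold=-2.0):
    """Old-study substrates plus new-study substrates below the threshold."""
    stringent_new = {p for p, ratio in new_study_log2.items() if ratio < threshold}
    return set(old_study) | stringent_new

# Hypothetical input: old-study substrates and new-study log2 ratios.
old = {"P1", "P2", "P3"}
new = {"P2": -1.3, "P3": -2.7, "P4": -1.1, "P5": -3.4}

print(sorted(intersection_set(old, new)))
print(sorted(union_set(old, new)))
print(sorted(stringent_merge(old, new)))
```

Note that a protein found in both studies is kept by the stringent merge even if its log2-ratio lies between -1 and -2, because the old study already supports it.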



Editorial: Social network plumbing

August 12, 2009

I guess it is no secret to anyone that Facebook has agreed to acquire FriendFeed. Several people seem puzzled why I left FriendFeed only 3 hours after learning this news. I can understand that this may look like a knee-jerk reaction, but there is logic behind the madness.

The truth is that my existing setup of Web 2.0 services was not working nearly as well as I would like. The sheer amount of content being shared on FriendFeed meant that it was easy to overlook a blog post from one of my favorite bloggers, for which reason I still subscribed to their blogs as RSS feeds. This caused me to waste time because the same posts appeared in two places, and I could not filter out the blogs on FriendFeed because most comments would be posted there and not on the blogs. Receiving everyone’s tweets on FriendFeed tended to create a background noise that would drown all other conversation; however, I could also not filter out the Twitter streams on FriendFeed and follow people directly on Twitter instead because many cross-post all their FriendFeed “likes” and/or comments to Twitter!

Given the new situation, it was clear to me that the time had come to fix my broken social network setup and redo the plumbing in such a way that FriendFeed would no longer be responsible for gathering most of the content. Looking at FriendFeed, I discovered that most of the content of interest originated from just three sources: RSS feeds of blogs, Google Reader shared items, and Twitter. By following people directly on Google Reader and Twitter, both of which I was already using on a daily basis, I was thus able to relegate FriendFeed to a much less important role. I still feed my content from other sources into FriendFeed and I occasionally check for comments on my posts; however, it is no longer where I read content posted by others. Coincidentally, the new role of FriendFeed is almost identical to the role that Facebook has played all along.

To make a long story short, I’m not leaving the friendly community at FriendFeed in anger. I still read the content produced and shared by the same people as before. I have just fixed the plumbing.


Analysis: Results from thermal stability shift and competition binding assays correlate well

July 31, 2009

Several large kinase inhibitor screens have been published in recent years. Two of the largest come from Stefan Knapp’s lab and Ambit, respectively. The former group used a temperature shift assay to measure the change in thermal stability of 60 human serine/threonine kinases that is caused by the binding of each of 156 kinase inhibitors (Fedorov et al., 2007). The latter group used a competition binding assay to measure the dissociation constants (Kd) for 38 kinase inhibitors and 290 distinct kinases (Karaman et al., 2008).

The two screens are not directly comparable because one measures temperature shifts whereas the other measures dissociation constants. To see if it is possible to convert temperature shift values to Kd values, I asked Damian Szklarczyk (who is a Ph.D. student in my group) to map all data from both screens onto a common set of chemical and protein identifiers, extract all inhibitor-kinase pairs that were measured in both assays, and make a scatter plot of -log(Kd) as a function of temperature shift. The result was a set of 704 pairs of temperature shift and Kd values. In the plot below, inhibitor-kinase pairs for which binding was not observed in the competition binding assay were defined to have a Kd of 10 microM, and negative values from the temperature shift assay were treated as zero temperature shift.

Correlation between temperature shift and -log(Kd)

The plot shows that the two assays are in very good agreement, which is surprising considering that the assays are fundamentally very different and were run using different expression constructs for several of the kinases. The linear Pearson correlation coefficient is 0.92 when excluding the one obvious outlier shown in red (BIRB796 vs. MAPK11; this appears to be a false negative in the competition binding assay).

The linear fit gives an intercept with the y-axis of 4.9223, which implies that a temperature shift of zero (i.e. no binding according to the temperature shift assay) does not translate precisely into a Kd of 10 microM (i.e. no binding according to the competition binding assay). We thus did a second linear regression in which we forced the intercept with the y-axis to 5 (red regression line in the plot). We thereby arrive at the calibration function -log(Kd) = 5+0.244*Ts, which allows us to convert temperature shifts to Kd values. We have thereby managed to put the measurements from the two kinase inhibitor screens onto a common basis that facilitates direct comparison and integration.
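The calibration step can be sketched in a few lines of Python. The slope and intercept are those given above; the least-squares formula for a line with a fixed intercept is standard, but the function names and example points are my own illustration, and I assume that the logarithm is base 10 with Kd in molar units (so that 10 microM corresponds to 5).

```python
def fit_slope_fixed_intercept(shifts, neg_log_kd, intercept=5.0):
    """Least-squares slope of y = intercept + slope * x with the intercept fixed."""
    numerator = sum(x * (y - intercept) for x, y in zip(shifts, neg_log_kd))
    denominator = sum(x * x for x in shifts)
    return numerator / denominator

def kd_from_shift(ts, slope=0.244, intercept=5.0):
    """Convert a temperature shift (degrees) to an estimated Kd in molar."""
    ts = max(ts, 0.0)  # negative shifts are treated as zero, as in the plot
    return 10 ** -(intercept + slope * ts)

# Hypothetical points lying exactly on the calibration line.
example_shifts = [1.0, 2.0, 4.0]
example_neg_log_kd = [5.244, 5.488, 5.976]
fitted_slope = fit_slope_fixed_intercept(example_shifts, example_neg_log_kd)

print(f"fitted slope: {fitted_slope:.3f}")
print(f"Kd at Ts = 10: {kd_from_shift(10.0):.2e} M")
```

A temperature shift of zero maps to exactly 10 microM by construction, which is the point of forcing the intercept.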

Full disclosure: I have an on-going collaboration with Stefan Knapp’s lab related to screening of kinase inhibitors.



Resource: Second Life Interactive Dendrogram Rezzer (SLIDR)

July 4, 2009

About half a year ago, I began experimenting with Second Life as a tool for virtual conferences (I should add that my experiences have since improved). However, I believe that imitating real life in a virtual world is not necessarily the best way to use the technology – it may be better to use virtual reality for doing the things that are difficult to do in the real world. A good example of this is Hiro’s Molecule Rezzer, which is one of the best known scientific tools in Second Life. It, and its much improved successor Orac, allows people to easily construct molecular models of small molecules in Second Life.

After speaking with several other researchers in Second Life, who, like me, are interested in evolution, I set out to build a similar tool for visualization of phylogenetic trees. The result is SLIDR (Second Life Interactive Dendrogram Rezzer), which constructs a dendrogram object based on a tree in Newick format. The first version of SLIDR can handle trees both with and without branch lengths; however, I have not yet implemented support for labels on internal nodes or for bootstrap values.
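To give an idea of what parsing such a tree involves, here is a minimal Newick parser written in Python. It is not the SLIDR code (which runs inside Second Life) but an illustrative sketch that handles trees with and without branch lengths; quoted labels, comments, and bootstrap values are not handled, and the dictionary layout is my own choice.

```python
def parse_newick(text):
    """Parse a Newick string into nested dicts with name, length, and children."""
    text = text.strip()
    if not text.endswith(";"):
        raise ValueError("Newick string must end with ';'")
    pos = 0

    def parse_node():
        nonlocal pos
        node = {"name": "", "length": None, "children": []}
        if text[pos] == "(":
            pos += 1  # skip the opening parenthesis
            while True:
                node["children"].append(parse_node())
                if text[pos] == ",":
                    pos += 1  # another sibling follows
                elif text[pos] == ")":
                    pos += 1  # end of this group of children
                    break
                else:
                    raise ValueError(f"unexpected character at position {pos}")
        start = pos
        while text[pos] not in ",():;":
            pos += 1
        node["name"] = text[start:pos].strip()
        if text[pos] == ":":  # optional branch length
            pos += 1
            start = pos
            while text[pos] not in ",();":
                pos += 1
            node["length"] = float(text[start:pos])
        return node

    root = parse_node()
    if text[pos] != ";":
        raise ValueError(f"unexpected character at position {pos}")
    return root

def leaf_names(node):
    """Return the leaf labels in left-to-right order."""
    if not node["children"]:
        return [node["name"]]
    names = []
    for child in node["children"]:
        names.extend(leaf_names(child))
    return names

tree = parse_newick("(A:0.1,(B:0.2,C:0.3):0.4);")
print(leaf_names(tree))
```

A dendrogram builder would then walk this structure, placing one object per node and using the branch lengths (or a constant when they are absent) to position them.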

The picture below shows an example of a dendrogram that was automatically generated by SLIDR based on a Newick tree:

SLIDR closeup

There is a bit more to SLIDR than this, though. After the dendrogram has been built, it can be loaded with a photo and/or a sound for each of the leaf nodes. When you click on a node, the corresponding sound will be played and the photo will be shown on the associated screen (the white box in front of which I stand):

SLIDR posing

I plan to work with collaborators in Second Life to construct dendrograms for evolution of bats (including their echolocation sounds and photos of the animals) and for the fully sequenced Drosophila genomes. Please do not hesitate to contact me if you would like to use SLIDR on another project. I intend to make SLIDR available as open source software once I have implemented support for the full Newick format.



Resource: STRING v8.1

June 25, 2009

After months of hard work from the entire STRING team – thanks everyone – I am pleased to be able to say that STRING v8.1 has now been put into production. Here is a screen shot of the start page:

STRING 8.1 start page

This is a minor release of STRING, which means that the imported databases of microarray expression data, protein interactions, genetic interactions, and pathways as well as text-mining evidence have all been updated. We have also fixed a bug that affected the minority of bacteria that have multiple chromosomes.

Another notable feature of STRING v8.1 is the new interactive network viewer that is implemented in Adobe Flash:

STRING 8.1 network viewer

For further details please see the post on the official STRING/STITCH blog.



Analysis: On the evolution of protein length and phosphorylation sites

June 25, 2009

It has been much too long since I last wrote a blog post. Part of the reason has been that I have been busy moving back to Denmark, starting up a research group, and co-founding a company. More on that in other blog posts. The main reason, however, has been a lack of papers that inspired me to do the simple follow-up analyses that I usually blog about.

This has thankfully changed now. Pedro Beltrao and coworkers recently published an interesting paper in PLoS Biology on the evolution of regulation through protein phosphorylation. The paper presents several interesting analyses and comparisons of phosphoproteomics data from three yeast species; the abstract summarizes the findings better than I can do:

Evolution of Phosphoregulation: Comparison of Phosphorylation Patterns across Yeast Species
The extent by which different cellular components generate phenotypic diversity is an ongoing debate in evolutionary biology that is yet to be addressed by quantitative comparative studies. We conducted an in vivo mass-spectrometry study of the phosphoproteomes of three yeast species (Saccharomyces cerevisiae, Candida albicans, and Schizosaccharomyces pombe) in order to quantify the evolutionary rate of change of phosphorylation. We estimate that kinase–substrate interactions change, at most, two orders of magnitude more slowly than transcription factor (TF)–promoter interactions. Our computational analysis linking kinases to putative substrates recapitulates known phosphoregulation events and provides putative evolutionary histories for the kinase regulation of protein complexes across 11 yeast species. To validate these trends, we used the E-MAP approach to analyze over 2,000 quantitative genetic interactions in S. cerevisiae and Sc. pombe, which demonstrated that protein kinases, and to a greater extent TFs, show lower than average conservation of genetic interactions. We propose therefore that protein kinases are an important source of phenotypic diversity.

Figure 1a in the paper shows the intriguing observation that, despite rapid evolution of individual phosphorylation sites, the relative number of phosphorylation sites within proteins from different functional classes (Gene Ontology categories) remains remarkably constant between species:

Beltrao et al., PLoS Biology, 2009, Figure 1a

However, it occurred to me that this could potentially be a consequence of longer proteins having more phosphorylation sites, and protein length being conserved through evolution. I thus counted the number of unique phosphorylation sites identified in each protein (thanks to Pedro Beltrao for providing the data) and correlated it with the length of the proteins. In the two plots below, I have pooled the proteins so that each dot corresponds to 100 proteins. The upper and lower panels show the results for S. cerevisiae and S. pombe, respectively:

Number of phosphorylation sites vs. protein length for S. cerevisiae

Number of phosphorylation sites vs. protein length for S. pombe

As should be evident from the plots, the average number of phosphorylation sites in a protein correlates strongly with its length, which is by no means surprising. It is unclear to me why the intercept with the y-axis appears to differ from zero in both plots; suggestions are welcome.
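The pooling used for these plots can be sketched as follows in Python. I assume here that the proteins are sorted by length before being grouped into bins of 100, which is the natural way to obtain one dot per 100 proteins; the toy data and function name are illustrative only.

```python
def pool_means(proteins, bin_size=100):
    """Group (length, n_sites) tuples by length and average within each bin.

    Assumption: proteins are sorted by length before binning, so each
    bin holds proteins of similar length.
    """
    ordered = sorted(proteins)
    points = []
    for start in range(0, len(ordered), bin_size):
        chunk = ordered[start:start + bin_size]
        mean_length = sum(length for length, _ in chunk) / len(chunk)
        mean_sites = sum(sites for _, sites in chunk) / len(chunk)
        points.append((mean_length, mean_sites))
    return points

# Toy data: (protein length in aa, number of unique phosphorylation sites).
toy_proteins = [(400, 6), (100, 1), (300, 2), (200, 3)]
toy_points = pool_means(toy_proteins, bin_size=2)
print(toy_points)
```

Each returned pair corresponds to one dot in a plot of average site count against average length; the last bin may contain fewer proteins than the others.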

The next question was whether the Gene Ontology terms that correspond to proteins with many phosphorylation sites are indeed assigned to proteins that are longer than average. I thus examined the terms “Cell budding”, “Morphogenesis”, and “Signal transduction”.

The average S. cerevisiae protein is 450 aa long. Proteins annotated with “Cell budding”, “Morphogenesis”, and “Signal transduction” are on average 1.6 (739 aa), 2.1 (945 aa), and 1.5 (679 aa) times longer, respectively. By comparison, the corresponding ratios observed for phosphorylation sites are approximately 2.3, 2.6, and 2.4. It would thus appear that differences in protein length between functional classes of proteins account for much, but not all, of the signal that was observed by Beltrao et al. when comparing the number of phosphorylation sites.
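The comparison can be made explicit by dividing each phosphorylation-site ratio by the corresponding length ratio. The short Python sketch below uses only the numbers quoted above; the "residual" it computes is my own simple way of expressing what is left after accounting for length, not a quantity from the paper.

```python
mean_length = 450  # average S. cerevisiae protein length in aa

# category: (average length in aa, approximate phosphorylation-site ratio)
categories = {
    "Cell budding": (739, 2.3),
    "Morphogenesis": (945, 2.6),
    "Signal transduction": (679, 2.4),
}

length_ratios = {}
residuals = {}
for name, (length, site_ratio) in categories.items():
    length_ratios[name] = length / mean_length
    # Site enrichment that remains after dividing out the length effect.
    residuals[name] = site_ratio / length_ratios[name]
    print(f"{name}: length ratio {length_ratios[name]:.1f}, "
          f"residual site ratio {residuals[name]:.2f}")
```

A residual of 1 would mean that protein length explains the whole signal; values above 1 indicate that something beyond length contributes.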

Edit: Make sure to read Pedro Beltrao’s follow-up blog post, which nicely confirms that whereas protein length does play a role, it is not the full story.



Update: The BuzzCloud for 2008

January 19, 2009

Yes, it is that time of the year again – we are now almost three weeks into 2009, most papers published in 2008 have hopefully made it into Medline, and it is time to reveal the words of 2008. In other words, I have updated the BuzzCloud resource and here is the result for 2008 (click on the image to go to the web resource):

BuzzCloud 2008

I am thrilled to see the outcome. Without any cheating or tweaking, several buzzwords related to proteomics make it on the list with “phosphoproteomics” and “quantitative phosphoproteomics” being the two most prominent of them. Nice for me to see considering that my new research group at the Novo Nordisk Foundation Center for Protein Research will focus heavily on improving and applying the NetworKIN and NetPhorest resources for analysis of phosphoproteomics data.


Editorial: Virtual conferences in Second Life

January 18, 2009

This blog has been very quiet for a long time. There are several reasons for this, most of which are positive: I have not had many boring or negative results to write blog posts about, I have been busy writing manuscripts about the positive results instead, and I have moved to Copenhagen where I am busy starting my own research group at the Novo Nordisk Foundation Center for Protein Research. There is also one more reason for the absence of blog posts from me: I have spent a lot of time experimenting with Second Life, and that is the topic of this blog post.

I first got interested in Second Life when I heard that Nature Publishing Group was setting up a virtual conference center called Elucian Islands. In the beginning I felt very alone on Elucian Islands. There was a good reason for that – I was alone most of the time. My view on Second Life was thus that it was pretty (see images below) but rather useless.

I obviously took a look at the SciFoo presentations (seen in the background of the image above) and the other scientific displays at Elucian Islands and elsewhere in Second Life. However, these mostly reinforced my negative view of Second Life being fairly useless, since almost everything I saw was already being served better by dedicated resources. For example, slide shows are much more conveniently viewed and shared in SlideShare than in Second Life, and 3D protein structures can be examined and analyzed better in programs such as PyMOL.

Over at FriendFeed, Jean-Claude Bradley fought a brave fight trying to convince me that Second Life is in fact useful for science. His key point was that Second Life is all about interacting with people, so I should try to go to some scientific events in Second Life. Sadly, there are still not many such events, and although they have changed my view on Second Life, they have also shown that there are many problems that remain to be solved.

The first virtual seminar I went to was “Cancer, Cell Cycle, and Check Points” organized by Digi S Lab. This was a perfect match since I work on cell-cycle regulation myself. The seminar consisted of two excellent presentations given by Letizia Cito from Sbarro Health Research Organization and Fayamdria Foley from the American Cancer Society.

Meeting on Cancer and Cell Cycle 1

Meeting on Cancer and Cell Cycle 2

Whereas the presentations were great, the seminar also illustrated several of the problems that need to be overcome before virtual conferences in Second Life are ready for prime time. When the first talk started, I could not see any of the slides. Restarting my Second Life client did not solve the problem, nor did a reboot of my computer. After I had given up on solving the problem, the entire region in which the seminar took place suddenly crashed, causing speakers and participants to all be logged out. When it came back online after some minutes and everyone had found their way back, I could suddenly see the slides. Even then, however, they took so long to appear on my screen that the presenter had typically explained half of what was on a slide by the time I could see the slide. I see this as a major problem that must be solved before Second Life conferences can work properly – it must be possible to change slides without a noticeable delay.

The second event I went to was the “ESRC Complexity Research Seminar in Second Life” that took place at Elucian Islands. This seminar was very different from the one described above in that it was not a purely virtual seminar; instead it was a video feed from a real-world seminar that was being transmitted into Second Life. Think of it as a virtual overflow room – the image below shows the people who had gathered shortly before the event started.

ESRC Complexity Research Seminar

Sadly, this event was marred by technical problems. The sound stream was of such poor quality that the Second Life participants could barely understand a word of what the speakers were saying, and the video stream was of too low quality to be able to read their slides. I do not want to dwell on this but just note that good quality microphones and cameras are a prerequisite for streaming events into Second Life.

The third event I went to was the “Virtual Conference on Climate Change and CO2 Storage”, which again took place at Elucian Islands. This was again a mixed event taking place both in the real world and in Second Life. The presentations were excellent and important lessons had been learned from the previous events. The microphones worked perfectly this time, and the video feed had been abandoned in favor of showing a copy of the actual slides in Second Life, which greatly improved the readability.

In addition to these events, Elucian Islands now also runs regular events such as the weekly Nature Podcast event where a fairly large group of people gather to listen to the latest podcast shortly after it has been released (image from Joanna Scott’s blog).

Nature Podcast at Elucian Islands

Regular events are crucial in SL because they bring people together in the same place at the same time. The need for people to be online at the same time is in my view one of the major drawbacks of Second Life compared to other tools that researchers can use for social networking. In my view Second Life should thus not be seen as competing with tools like FriendFeed or Twitter, which you can read when you feel like it, but rather as a virtual reality alternative to video conferences. I think that Nature Publishing Group is on the right track with this, and I hope that the few remaining technical hurdles will be overcome in the near future.

Full disclosure: I have been working with the staff from Nature Publishing Group trying to solve technical challenges on Elucian Islands.


Analysis: Four complementary yeast interactomes

October 4, 2008

The latest issue of Science features a paper by Yu et al. in which they report the results of a comprehensive yeast two-hybrid (Y2H) screen for interactions between budding yeast proteins. Just a few months earlier, Science published a paper by Tarassov et al. that describes a similar screen performed using a novel protein fragment complementation assay (PCA). Peer Bork and I wrote a Perspectives piece on these two papers, showing that the different assays for detecting protein interactions are complementary in the sense that they capture interactions for different subsets of the proteome. For example, PCA detects many interactions for membrane proteins whereas Y2H detects many interactions for nuclear proteins.

As part of writing the Perspectives piece, I performed numerous analyses that were not included in the final publication, because they were either too technical for a broad audience, not interesting enough to spend valuable space on, or would involve additional figures. Thankfully, my blog imposes no limitations on the number of words or figures (nor is it required that the content is interesting, although that is desirable).

The comparison included, in addition to the two interactomes introduced above, a third interactome that consists of all the high-confidence interactions identified by Gavin et al. and Krogan et al. using the tandem affinity purification (TAP) method. Also included in the comparison (but not in the Perspectives piece) was the literature-curated (LC) set of interactions published by Reguly et al. in 2006.

The Venn diagram below shows the overlap of the four interactomes in terms of proteins; that is, a protein is considered to belong to an interactome if the method in question suggested at least one interaction partner:

The numbers outside the ellipses specify the total number of proteins for which a given method identified interactions. Notably, the PCA, Y2H, and TAP interactomes cover only approximately one sixth, one third, and half of the yeast proteome, respectively, despite all three assays having been tested on all yeast ORFs. This suggests that only a fraction of proteins can be targeted with a given assay.

A second way to compare the four interactomes is to count their overlaps in terms of pairs of interacting proteins. To provide additional detail, I distinguished between interactions that are not found in a given interactome because one or both proteins are not covered by the interactome in question (dashed lines in the diagrams), and interactions that were not found despite both proteins being covered (full lines in the diagrams). The Venn diagrams below show all twelve pairwise comparisons of the four interactomes:

As expected, the largest overlap is observed when comparing the two largest interactomes (LC and TAP), whereas the smallest overlap is observed when comparing the smallest interactomes (PCA and Y2H). Even if taking into account the differences in terms of protein coverage, however, the overlaps between the interactomes leave a lot to be desired.
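The distinction between interactions missed because of protein coverage and interactions missed despite coverage can be expressed with set operations. The Python sketch below is illustrative only: the toy interactomes and the function name are made up, and each interaction is represented as an unordered pair.

```python
def compare_interactomes(a, b):
    """Classify the interactions of interactome a relative to interactome b.

    Both arguments are sets of frozenset pairs. Returns three sets:
    shared interactions, interactions absent from b because at least one
    protein is not covered by b, and interactions absent from b although
    both proteins are covered.
    """
    proteins_b = set().union(*b)  # proteins with at least one partner in b
    shared = a & b
    missing = a - b
    not_covered = {pair for pair in missing if not pair <= proteins_b}
    covered_but_absent = missing - not_covered
    return shared, not_covered, covered_but_absent

def pair(x, y):
    return frozenset((x, y))

# Toy interactomes for illustration.
toy_a = {pair("A", "B"), pair("A", "C"), pair("C", "D"), pair("B", "E")}
toy_b = {pair("A", "B"), pair("C", "B"), pair("A", "D")}

shared, not_covered, covered_but_absent = compare_interactomes(toy_a, toy_b)
print(len(shared), len(not_covered), len(covered_but_absent))
```

Because the classification depends on which interactome supplies the protein coverage, comparing a with b is not the same as comparing b with a, which is why four interactomes give twelve pairwise comparisons rather than six.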

There are several reasons for the poor overlap at the level of pairwise interactions. One is that false positive interactions are unlikely to be reproducible by a different assay. A second is that the assays measure fundamentally different types of interactions: PCA and Y2H measure direct binary interactions between proteins, whereas TAP measures co-complex interactions, that is whether two proteins are part of the same complex or not. This is illustrated in the figure below, which shows the binary and co-complex networks for three different scenarios:

The two types of assays have different strengths and weaknesses. Binary interaction assays can in principle distinguish between the two first complexes, which only differ in that the subunits B and C are in direct contact in the first complex but not in the second. However, binary assays are not able to distinguish between the second and the third scenario, that is whether A, B, and C form a single complex (ABC) or two complexes (AB and AC). Conversely, data from co-complex assays are able to answer the latter question but are unable to distinguish between the two first scenarios. The different assays thus complement each other, not only because they are able to interrogate different subsets of the proteome, but also because they provide us with complementary information about the composition and topology of protein complexes.
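The three scenarios are small enough to write down explicitly. The Python sketch below encodes each one as a list of complexes plus a set of direct contacts, following the description above, and derives the network that an idealized assay of each type would report; the data structure is my own illustration.

```python
from itertools import combinations

def pair(x, y):
    return frozenset((x, y))

def co_complex_network(complexes):
    """All pairs of proteins that share at least one complex."""
    return {pair(x, y) for members in complexes
            for x, y in combinations(sorted(members), 2)}

scenarios = {
    # Single complex ABC in which all subunits touch each other.
    "first": {"complexes": [{"A", "B", "C"}],
              "contacts": {pair("A", "B"), pair("A", "C"), pair("B", "C")}},
    # Single complex ABC in which B and C are not in direct contact.
    "second": {"complexes": [{"A", "B", "C"}],
               "contacts": {pair("A", "B"), pair("A", "C")}},
    # Two separate complexes, AB and AC.
    "third": {"complexes": [{"A", "B"}, {"A", "C"}],
              "contacts": {pair("A", "B"), pair("A", "C")}},
}

binary = {name: s["contacts"] for name, s in scenarios.items()}
co_complex = {name: co_complex_network(s["complexes"])
              for name, s in scenarios.items()}

for name in scenarios:
    print(name, len(binary[name]), "binary edges,",
          len(co_complex[name]), "co-complex edges")
```

Comparing the two dictionaries shows the complementarity directly: the binary networks separate the first scenario from the other two, whereas the co-complex networks separate the third scenario from the other two.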



Analysis: Cell-cycle-regulated proteins are more abundant in haploid relative to diploid cells

September 30, 2008

Two days ago, Matthias Mann’s group published a paper in Nature in which they compare the level of individual proteins in haploid relative to diploid budding yeast cells:

Comprehensive mass-spectrometry-based proteome quantification of haploid versus diploid yeast

Mass spectrometry is a powerful technology for the analysis of large numbers of endogenous proteins. However, the analytical challenges associated with comprehensive identification and relative quantification of cellular proteomes have so far appeared to be insurmountable. Here, using advances in computational proteomics, instrument performance and sample preparation strategies, we compare protein levels of essentially all endogenous proteins in haploid yeast cells to their diploid counterparts. Our analysis spans more than four orders of magnitude in protein abundance with no discrimination against membrane or low level regulatory proteins. Stable-isotope labelling by amino acids in cell culture (SILAC) quantification was very accurate across the proteome, as demonstrated by one-to-one ratios of most yeast proteins. Key members of the pheromone pathway were specific to haploid yeast but others were unaltered, suggesting an efficient control mechanism of the mating response. Several retrotransposon-associated proteins were specific to haploid yeast. Gene ontology analysis pinpointed a significant change for cell wall components in agreement with geometrical considerations: diploid cells have twice the volume but not twice the surface area of haploid cells. Transcriptome levels agreed poorly with proteome changes overall. However, after filtering out low confidence microarray measurements, messenger RNA changes and SILAC ratios correlated very well for pheromone pathway components. Systems-wide, precise quantification directly at the protein level opens up new perspectives in post-genomics and systems biology.

Although the paper focuses on the larger amount of cell-wall proteins and proteins involved in pheromone response in haploid cells, the supplementary tables reveal similar biases for many other functional classes, including nucleosomes and cyclin-dependent kinase inhibitors. As many of these proteins are regulated during the cell cycle, I suspected that cell-cycle-regulated proteins might be more abundant in haploid cells relative to diploid cells.

To test this hypothesis, I divided the proteins quantified by the Mann group into two classes: dynamic proteins, which are encoded by genes that are periodically expressed during the cell cycle, and static proteins, which are encoded by genes that are expressed at a constant level (de Lichtenberg et al., 2005). For each class, I plotted the log2-ratios of the protein levels in haploid and diploid cells:

The plot reveals a quite strong shift of dynamic proteins toward higher log-ratios; this difference is highly significant according to the Mann-Whitney U test (P < 10^-12). Proteins encoded by cell-cycle-regulated genes are thus in general more abundant in haploid budding yeast cells than in diploid cells.
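For completeness, here is a minimal Python sketch of the Mann-Whitney U test using only the standard library. The log2-ratios are made-up placeholders rather than values from the Mann group's data, and the P-value relies on the normal approximation without a correction for ties, so it is only a rough guide for small or heavily tied samples.

```python
import math

def mann_whitney_u(a, b):
    """Return (U for sample a, two-sided P from the normal approximation)."""
    combined = sorted([(value, 0) for value in a] + [(value, 1) for value in b])
    n = len(combined)
    rank_sum_a = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and combined[j][0] == combined[i][0]:
            j += 1
        # Tied values at positions i..j-1 share the mean of ranks i+1..j.
        average_rank = (i + 1 + j) / 2
        rank_sum_a += average_rank * sum(1 for k in range(i, j)
                                         if combined[k][1] == 0)
        i = j
    n1, n2 = len(a), len(b)
    u = rank_sum_a - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)  # no tie correction
    z = (u - mean_u) / sd_u
    return u, math.erfc(abs(z) / math.sqrt(2))

# Hypothetical log2(haploid/diploid) ratios, for illustration only.
dynamic = [0.4, 0.6, 0.3, 0.9, 0.5, 0.7, 0.2, 0.8]
static = [0.0, -0.1, 0.1, 0.05, -0.2, 0.15, -0.05, 0.25]

u_stat, p_approx = mann_whitney_u(dynamic, static)
print(f"U = {u_stat}, approximate P = {p_approx:.4f}")
```

With the real data one would pass the log2-ratios of the dynamic and static protein classes as the two samples.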

Full disclosure: I currently collaborate with Matthias Mann and members of his group, and we will soon be colleagues at the Novo Nordisk Foundation Center for Protein Research.
