Tag Archives: protein complexes

Announcement: EMBO practical course on protein interaction analysis in Budapest

Later this year, I will once again be one of the teachers on the long-running EMBO practical course “Computational analysis of protein-protein interactions: Sequences, networks and diseases”. The 2016 version of the course will be taking place on May 30 – June 4 in Budapest, Hungary, and the application deadline is February 1.

For more details see the course website or the poster below.



Announcement: EMBO practical course on protein interaction analysis in South Africa

I very much look forward to once again being part of the team of teachers behind the EMBO practical course “Computational analysis of protein-protein interactions: From sequences to networks”. For the first time, the course will take place on the African continent, more specifically in Cape Town, South Africa. It runs from September 23 to October 3, and the application deadline is July 23.

Please check the course website or the poster below for details.

Course poster

Announcement: PTMs in Cell Signaling conference

Two years ago, I was one of the organizers of the 2nd Copenhagen Bioscience Conference entitled PTMs in Cell Signaling. I think it is fair to describe it as a highly successful meeting, and it is my great pleasure to announce that we will be organizing a second meeting on the topic September 14-18, 2014.

CBC6 poster

My co-chairs Jeremy Austin Daniel, Michael Lund Nielsen, and Amilcar Flores Morales have managed to put together the following excellent lineup of invited speakers:

Alfonso Valencia, Chris Sander, David Komander, Gary Nolan, Genevieve Almouzni, Guillermo Montoya, Hanno Steen, Henrik Daub, John Blenis, John Diffley, John Tainer, Karolin Luger, Marcus Bantscheff, Margaret Goodell, Matthias Mann, Michael Yaffe, Natalie Ahn, Pedro Beltrao, Stephen Elledge, Tanya Paull, Tony Hunter, Yang Shi, Yehudit Bergman, and Yosef Shiloh.

All conference expenses are covered, which means that there will be no registration fee and no expenses for accommodation or food. You will have to cover your own travel expenses, though.

Participants will be selected based on abstract submission, which is open until June 9, 2014. For more information please see the conference website.

Analysis: Markov clustering and the case of the unsupported protein complexes

In 2006, Krogan and coworkers published a paper in Nature describing a global analysis of protein complexes in budding yeast. This resulted in a network of 7,123 protein-protein interactions involving 2,708 proteins, which was organized into 547 protein complexes using the Markov clustering algorithm.

Considering my previous two posts, it probably comes as a surprise to nobody that I wanted to check if the issue of unnatural clusters also affected this study. Albert Palleja, a postdoc in my group, thus extracted the 547 sub-networks corresponding to the protein complexes and applied single-linkage clustering to check if all clusters corresponded to connected sub-networks.
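A minimal sketch of such a connectivity check, assuming the network is given as a list of protein pairs and the clusters as a mapping from cluster name to members. The function names and the toy data are hypothetical and are not taken from the Krogan et al. dataset or from the analysis itself:

```python
from collections import defaultdict

def connected_components(nodes, edges):
    """Single-linkage clustering: group nodes joined by any chain of edges."""
    nodes = set(nodes)
    neighbors = defaultdict(set)
    for a, b in edges:
        # Keep only edges with both ends inside the cluster (induced sub-network).
        if a in nodes and b in nodes:
            neighbors[a].add(b)
            neighbors[b].add(a)
    components, seen = [], set()
    for start in sorted(nodes):
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbors[node] - component)
        seen |= component
        components.append(component)
    return components

def unsupported_clusters(clusters, edges):
    """Return the clusters whose induced sub-network falls into several pieces."""
    result = {}
    for name, members in clusters.items():
        parts = connected_components(members, edges)
        if len(parts) > 1:
            result[name] = parts
    return result

# Hypothetical toy data for illustration only.
edges = [("A", "B"), ("B", "C"), ("D", "E"), ("C", "F")]
clusters = {
    "c1": ["A", "B", "C"],  # connected
    "c2": ["D", "E", "G"],  # G has no interaction with D or E
    "c3": ["A", "F"],       # proposed dimer whose subunits do not interact
}
bad = unsupported_clusters(clusters, edges)
print(sorted(bad))
```

Note that A and F in the toy cluster "c3" are linked in the full network via B and C, but not within the cluster itself, which is why the check is run on the induced sub-network.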

It turned out that 9 of the 547 protein complexes do not correspond to connected sub-networks in the protein interaction network that formed the basis for the clustering. Two complexes each contain two additional subunits that have no interactions with any of the other subunits of the proposed complex, five complexes contain one additional subunit with no interactions to other subunits, and two complexes are proposed hetero-dimers made up of subunits that do not interact according to the interaction network. These complexes are visualized in the figure below with the erroneous subunits highlighted in red:

To check if these additional subunits are in any way supported by the experimental data presented in the paper, I downloaded the set of raw purifications from the Krogan Lab Interactome Database. For 4 of the 9 complexes, the additional subunits are weakly supported by at least one purification. It should be noted, however, that the authors themselves did not judge this evidence sufficiently reliable to include the interactions in the core network from which the complexes were derived.

To make a long story short, this analysis shows that 9 of the 547 protein complexes published by Krogan and coworkers contain one or more subunits that are not supported by the interaction network from which the complexes were derived. Of these, 5 complexes contain subunits that have no support in the underlying experimental data and are purely artifacts of using the MCL algorithm without enforcing that clusters must correspond to connected sub-networks.

Analysis: Four complementary yeast interactomes

The latest issue of Science features a paper by Yu et al. in which they report the results of a comprehensive yeast two-hybrid (Y2H) screen for interactions between budding yeast proteins. Just a few months earlier, Science published a paper by Tarassov et al. that describes a similar screen performed using a novel protein fragment complementation assay (PCA). Peer Bork and I wrote a Perspectives piece on these two papers, showing that the different assays for detecting protein interactions are complementary in the sense that they capture interactions for different subsets of the proteome. For example, PCA detects many interactions for membrane proteins whereas Y2H detects many interactions for nuclear proteins.

As part of writing the Perspectives piece, I performed numerous analyses that were not included in the final publication, because they were either too technical for a broad audience, not interesting enough to spend valuable space on, or would involve additional figures. Thankfully, my blog imposes no limitations on the number of words or figures (nor is it required that the content is interesting, although that is desirable).

The comparison included, in addition to the two interactomes introduced above, a third interactome that consists of all the high-confidence interactions identified by Gavin et al. and Krogan et al. using the tandem affinity purification (TAP) method. Also included in the comparison (but not in the Perspectives piece) was the literature-curated (LC) set of interactions published by Reguly et al. in 2006.

The Venn diagram below shows the overlap of the four interactomes in terms of proteins; that is, a protein is considered to belong to an interactome if the method in question suggested at least one interaction partner for it:

The numbers outside the ellipses specify the total number of proteins for which a given method identified interactions. Notably, the PCA, Y2H, and TAP interactomes cover only approximately one sixth, one third, and half of the yeast proteome, respectively, despite all three assays having been tested on all yeast ORFs. This suggests that only a fraction of proteins can be targeted with a given assay.
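The protein-level comparison can be sketched as follows, assuming each interactome is available as a list of protein pairs. The interactome contents, the protein names, and the proteome size below are made-up placeholders, not the real data:

```python
from itertools import combinations

def proteins_covered(interactions):
    """A protein belongs to an interactome if it has at least one partner."""
    return {protein for pair in interactions for protein in pair}

# Hypothetical toy interactomes for illustration only.
interactomes = {
    "PCA": [("P1", "P2"), ("P2", "P3")],
    "Y2H": [("P2", "P3"), ("P4", "P5")],
    "TAP": [("P1", "P2"), ("P3", "P4"), ("P5", "P6")],
}
proteome_size = 12  # assumed total number of proteins in the toy proteome

coverage = {name: proteins_covered(pairs) for name, pairs in interactomes.items()}

# Fraction of the proteome for which each method found any interaction.
fraction = {name: len(prots) / proteome_size for name, prots in coverage.items()}

# Proteins shared between each pair of interactomes (the Venn overlaps).
shared = {(a, b): coverage[a] & coverage[b] for a, b in combinations(coverage, 2)}

for name in coverage:
    print(name, len(coverage[name]), round(fraction[name], 2))
```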

A second way to compare the four interactomes is to count their overlaps in terms of pairs of interacting proteins. To provide additional detail, I distinguished between interactions that are not found in a given interactome because one or both proteins are not covered by the interactome in question (dashed lines in the diagrams), and interactions that were not found despite both proteins being covered (full lines in the diagrams). The Venn diagrams below show all twelve pairwise comparisons of the four interactomes:

As expected, the largest overlap is observed when comparing the two largest interactomes (LC and TAP), whereas the smallest overlap is observed when comparing the smallest interactomes (PCA and Y2H). Even when taking into account the differences in protein coverage, however, the overlaps between the interactomes leave a lot to be desired.
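The distinction between interactions that were missed despite both proteins being covered and interactions that could not be found for lack of coverage can be sketched as below. This is an illustration under the assumption that interactions are undirected protein pairs; the function name and the toy data are hypothetical. Because the classification is asymmetric (A versus B differs from B versus A), four interactomes give twelve ordered comparisons:

```python
from itertools import permutations

def classify(query, reference):
    """Split the interactions of `query` by their status in `reference`."""
    ref_pairs = {frozenset(pair) for pair in reference}
    ref_proteins = {protein for pair in reference for protein in pair}
    result = {"shared": set(), "missed": set(), "not_covered": set()}
    for pair in {frozenset(p) for p in query}:
        if pair in ref_pairs:
            result["shared"].add(pair)       # found by both methods
        elif pair <= ref_proteins:
            result["missed"].add(pair)       # both proteins covered, pair absent
        else:
            result["not_covered"].add(pair)  # at least one protein not covered
    return result

# Hypothetical toy data for illustration only.
query = [("A", "B"), ("B", "C"), ("A", "Z")]
reference = [("B", "A"), ("B", "D"), ("C", "E")]
status = classify(query, reference)

# Number of ordered comparisons among four interactomes.
n_comparisons = len(list(permutations(["PCA", "Y2H", "TAP", "LC"], 2)))
print({key: len(value) for key, value in status.items()}, n_comparisons)
```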

There are several reasons for the poor overlap at the level of pairwise interactions. One is that false positive interactions are unlikely to be reproducible by a different assay. A second is that the assays measure fundamentally different types of interactions: PCA and Y2H measure direct binary interactions between proteins, whereas TAP measures co-complex interactions, that is whether two proteins are part of the same complex or not. This is illustrated in the figure below, which shows the binary and co-complex networks for three different scenarios:

The two types of assays have different strengths and weaknesses. Binary interaction assays can in principle distinguish between the first two complexes, which only differ in that the subunits B and C are in direct contact in the first complex but not in the second. However, binary assays are not able to distinguish between the second and the third scenario, that is, whether A, B, and C form a single complex (ABC) or two complexes (AB and AC). Conversely, data from co-complex assays are able to answer the latter question but are unable to distinguish between the first two scenarios. The different assays thus complement each other, not only because they are able to interrogate different subsets of the proteome, but also because they provide us with complementary information about the composition and topology of protein complexes.
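The three scenarios can be written down as a small sketch. It assumes that A contacts both B and C directly in all three scenarios, which is consistent with the description above but is an assumption, since only the B-C contact is stated explicitly. Co-complex data are modelled by linking every pair of subunits within a complex:

```python
from itertools import combinations

def binary_network(contacts):
    """What a binary assay would ideally report: direct physical contacts."""
    return {frozenset(pair) for pair in contacts}

def co_complex_network(complexes):
    """What a co-complex assay would ideally report: all pairs sharing a complex."""
    return {frozenset(pair)
            for members in complexes
            for pair in combinations(sorted(members), 2)}

# Assumed encoding of the three scenarios (the contacts of A are an assumption).
scenarios = {
    1: {"complexes": [{"A", "B", "C"}], "contacts": [("A", "B"), ("A", "C"), ("B", "C")]},
    2: {"complexes": [{"A", "B", "C"}], "contacts": [("A", "B"), ("A", "C")]},
    3: {"complexes": [{"A", "B"}, {"A", "C"}], "contacts": [("A", "B"), ("A", "C")]},
}

binary = {k: binary_network(s["contacts"]) for k, s in scenarios.items()}
cocomplex = {k: co_complex_network(s["complexes"]) for k, s in scenarios.items()}

print("binary separates 1 and 2:", binary[1] != binary[2])
print("binary separates 2 and 3:", binary[2] != binary[3])
print("co-complex separates 1 and 2:", cocomplex[1] != cocomplex[2])
print("co-complex separates 2 and 3:", cocomplex[2] != cocomplex[3])
```

Only the two networks taken together tell all three scenarios apart.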


Commentary: On large protein complexes and the essentiality of hubs

In 2001, Jeong and coworkers published a paper in Nature in which they showed that the central proteins in interaction networks, that is, the proteins with the highest connectivity, are enriched for essential proteins. This publication has been highly influential, as evidenced by the numerous subsequent publications on the importance of “hub” proteins. Several hypotheses have been published that try to explain why hubs are essential, for example that certain protein interactions are essential and that a protein with many interactions is thus more likely to be involved in at least one essential interaction (He and Zhang, 2006).
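To make the observation concrete, here is a minimal sketch of how one might compare the fraction of essential proteins among hubs and non-hubs. The network, the set of essential proteins, and the degree cutoff are all hypothetical, and this is not the procedure used in any of the cited papers:

```python
from collections import Counter

def degrees(edges):
    """Connectivity of each protein: the number of interactions it takes part in."""
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    return degree

def essential_fraction(proteins, essential):
    """Fraction of the given proteins that are annotated as essential."""
    proteins = list(proteins)
    return sum(p in essential for p in proteins) / len(proteins)

# Hypothetical toy network and essentiality annotation for illustration only.
edges = [("H", "a"), ("H", "b"), ("H", "c"), ("H", "d"), ("a", "b"), ("c", "e")]
essential = {"H", "a"}
hub_cutoff = 3  # arbitrary threshold chosen for this example

degree = degrees(edges)
hubs = {p for p, k in degree.items() if k >= hub_cutoff}
others = set(degree) - hubs

print("hubs:", essential_fraction(hubs, essential))
print("others:", essential_fraction(others, essential))
```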

Yesterday, Zotenko and coworkers published a paper in PLoS Computational Biology in which they take a closer look at the cause of this phenomenon:

Why Do Hubs in the Yeast Protein Interaction Network Tend To Be Essential: Reexamining the Connection between the Network Topology and Essentiality.

The centrality-lethality rule, which notes that high-degree nodes in a protein interaction network tend to correspond to proteins that are essential, suggests that the topological prominence of a protein in a protein interaction network may be a good predictor of its biological importance. Even though the correlation between degree and essentiality was confirmed by many independent studies, the reason for this correlation remains illusive. Several hypotheses about putative connections between essentiality of hubs and the topology of protein-protein interaction networks have been proposed, but as we demonstrate, these explanations are not supported by the properties of protein interaction networks. To identify the main topological determinant of essentiality and to provide a biological explanation for the connection between the network topology and essentiality, we performed a rigorous analysis of six variants of the genomewide protein interaction network for Saccharomyces cerevisiae obtained using different techniques. We demonstrated that the majority of hubs are essential due to their involvement in Essential Complex Biological Modules, a group of densely connected proteins with shared biological function that are enriched in essential proteins. Moreover, we rejected two previously proposed explanations for the centrality-lethality rule, one relating the essentiality of hubs to their role in the overall network connectivity and another relying on the recently published essential protein interactions model.

What Zotenko et al. show is, in other words, that essential hubs tend to be highly connected with each other and hence form large “Essential Complex Biological Modules”. Table 7 in their paper lists the Gene Ontology terms associated with these modules; among the recurring themes are “rRNA metabolic process”, “mRNA metabolic process”, “RNA splicing”, “ribosome biogenesis and assembly”, and “proteolysis”. These Gene Ontology terms obviously correspond to well-known protein complexes, namely the RNA polymerases, the spliceosome, the ribosome, and the proteasome. The analysis of Zotenko et al. thus suggests that the much debated correlation between centrality and essentiality is simply a consequence of the fact that many of the large protein complexes in a eukaryotic cell are essential, which is hardly surprising considering that they have been conserved through more than two billion years of evolution (Brocks et al., 1999).

Edit: For more views on the results of Zotenko et al. see the discussion on FriendFeed.


Commentary: Does just-in-time assembly of protein complexes explain phenotypes?

At the beginning of this year, Ben Lehner’s lab published a beautiful study in BMC Systems Biology with the title “A simple principle concerning the robustness of protein complex activity to changes in gene expression”. The abstract reads:


The functions of a eukaryotic cell are largely performed by multi-subunit protein complexes that act as molecular machines or information processing modules in cellular networks. An important problem in systems biology is to understand how, in general, these molecular machines respond to perturbations.


In yeast, genes that inhibit growth when their expression is reduced are strongly enriched amongst the subunits of multi-subunit protein complexes. This applies to both the core and peripheral subunits of protein complexes, and the subunits of each complex normally have the same loss-of-function phenotypes. In contrast, genes that inhibit growth when their expression is increased are not enriched amongst the core or peripheral subunits of protein complexes, and the behaviour of one subunit of a complex is not predictive for the other subunits with respect to over-expression phenotypes.


We propose the principle that the overall activity of a protein complex is in general robust to an increase, but not to a decrease in the expression of its subunits. This means that whereas phenotypes resulting from a decrease in gene expression can be predicted because they cluster on networks of protein complexes, over-expression phenotypes cannot be predicted in this way. We discuss the implications of these findings for understanding how cells are regulated, how they evolve, and how genetic perturbations connect to disease in humans.

It struck me that these observations can all be explained by the just-in-time assembly model for temporal regulation of protein complex assembly, which I developed together with members of Søren Brunak’s group. For a long explanation and discussion of the model see our paper “Evolution of Cell Cycle Control: Same Molecular Machines, Different Regulation”. For the short version see the figure below, which shows how cell-cycle regulation of just a single subunit is sufficient to control when during the cell cycle a complex is active:

The just-in-time assembly hypothesis

What will happen if you knock down the expression of one subunit of a complex? The maximal number of complete complexes that can be assembled will be reduced, irrespective of whether the subunit is dynamic or static. Whether this results in a given phenotype depends on the function of the complex. However, the effect should in principle be the same for different subunits of the same complex, which is exactly what Lehner and coworkers observed.

What if you instead overexpress one subunit of a complex? For a static subunit it should not really matter; the maximal number of complete complexes that can be assembled is unchanged. On the other hand, overexpression of a dynamic subunit may cause the complex to become constitutively active, which could have disastrous consequences for the cell. Overexpression of dynamic and static subunits of the same complex should thus give rise to different phenotypic effects. This would explain the observation by Lehner and coworkers that subunits of the same complex often have different overexpression phenotypes.
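The argument in the two paragraphs above can be illustrated with a toy model. It assumes a complex made of one static and one dynamic subunit in 1:1 stoichiometry, so that the number of complete complexes in each cell-cycle phase is limited by the scarcer subunit, and it assumes that overexpression makes a subunit abundant in all phases. The function name, the copy numbers, and the four-phase profile are hypothetical:

```python
def complete_complexes(static_level, dynamic_profile):
    """Complete complexes per phase, limited by the scarcer of the two subunits."""
    return [min(static_level, dynamic_level) for dynamic_level in dynamic_profile]

# Hypothetical copy numbers across four cell-cycle phases (G1, S, G2, M).
static_wt = 100
dynamic_wt = [0, 100, 0, 0]  # the dynamic subunit is only expressed in S phase

wild_type = complete_complexes(static_wt, dynamic_wt)

# Knock-down of either subunit lowers the peak in the same way.
static_knockdown = complete_complexes(20, dynamic_wt)
dynamic_knockdown = complete_complexes(static_wt, [0, 20, 0, 0])

# Overexpression, assumed here to be constitutive across all phases.
static_overexpressed = complete_complexes(500, dynamic_wt)
dynamic_overexpressed = complete_complexes(static_wt, [500, 500, 500, 500])

print("wild type:            ", wild_type)
print("static knock-down:    ", static_knockdown)
print("dynamic knock-down:   ", dynamic_knockdown)
print("static overexpressed: ", static_overexpressed)
print("dynamic overexpressed:", dynamic_overexpressed)
```

In this toy model the two knock-downs are indistinguishable, overexpressing the static subunit changes nothing, and overexpressing the dynamic subunit makes the complex present in every phase.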

If this hypothesis is true, genes that lead to phenotypic effects when overexpressed should preferentially encode dynamic proteins, i.e. many of the genes should be periodically expressed. In fact, this correlation between overexpression phenotype and cell-cycle regulation was already described by the Hughes, Boone and Andrews labs who originally published the dataset on overexpression phenotypes (for details see their paper in Molecular Cell):

Genes expressed periodically during the cell cycle (de Lichtenberg et al., 2005) were more likely to show an overexpression phenotype (p = 0.017), and in particular, this tended to cause abnormal morphology [p < 10^-13] or cell cycle arrest [p < 10^-14] (Table S3). When the analysis is limited to genes known to function in the mitotic cell cycle, we still find that overexpression of periodically expressed genes is more likely to cause cell cycle arrest (p = 0.008) or abnormal morphology (p = 0.006) than constitutively expressed cell cycle genes (Table S3), indicating that unscheduled expression of genes that are usually expressed periodically often leads to toxicity.

The results of the two papers thus suggest that the just-in-time assembly hypothesis can explain the qualitative differences between knock-down and overexpression phenotypes.
