23 Feb 2023
by Richa

Summary- paper 21: Protein complexes in cells by AI-assisted structural proteomics

Francis J O’Reilly, Andrea Graziadei, Christian Forbrig, Rica Bremenkamp, Kristine Charles, Swantje Lenz, Christoph Elfmann, Lutz Fischer, Jörg Stülke, Juri Rappsilber

Molecular Systems Biology, 2023

Questions/gaps addressed:

  • Although AI tools such as AlphaFold-Multimer are invaluable, predicting the interaction interfaces for all possible pairs of proteins is computationally impractical, and many complexes involve far more than two subunits. How can AI tools be used to get more out of experimental data?

  • Existing validation tools for PPIs (e.g., two-hybrid, affinity purification mass spectrometry (AP-MS), and co-fractionation MS (CoFrac-MS)) provide little topological or structural information, and often leave open whether an interaction is direct or indirect.

Major hypotheses:

  • Combining complementary techniques provides richer and more accurate PPI information: in-cell crosslinking discovers high-confidence direct protein interactions without genetic modification, and the same data can be used to validate the structural models predicted for those interactions.

Key methods:

  • Generated a whole-cell interaction network in B. subtilis using crosslinking and co-fractionation mass spectrometry with the membrane-permeable crosslinker DSSO. Cells were treated with crosslinker and lysed; the proteins were fractionated and digested with trypsin, and the resulting peptides were separated by cation exchange and size exclusion chromatography prior to mass spectrometry.

  • Generated a comprehensive PPI candidate list (from crosslinking MS, CoFrac-MS, and SubtiWiki) for system-wide structure modeling with AlphaFold-Multimer (version 2.1). Known PPIs that lack structural information were added to the experimentally identified PPIs.

  • Models were assessed for overall predicted TM score (pTM) and interface predicted TM score (ipTM). pTM reports on the accuracy of prediction within each protein chain and ipTM on the accuracy of the complex. A TM-score of 0.5 is broadly indicative of a correct fold/domain prediction, while scores above 0.8 correspond to models with matching topology and backbone path. An ipTM > 0.85 has proven reliable in other analyses when compared to known interface TM score and the DockQ docking quality score.

  • Employed a noise model in which 300 B. subtilis proteins from the datasets were predicted as pairs with random E. coli proteins. The ipTM distribution of the resulting decoy PPIs was compared with 10 subsamples of our AlphaFold-Multimer predictions. No decoy PPIs reported an ipTM > 0.85.
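The ipTM filtering and decoy comparison described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' pipeline: the `PairPrediction` record, the function names, and all protein identifiers and scores are made up for the example. Only the 0.85 threshold comes from the summary.

```python
from dataclasses import dataclass

# Threshold quoted in the summary for a high-confidence interface.
IPTM_HIGH_CONFIDENCE = 0.85


@dataclass
class PairPrediction:
    """One predicted protein pair (hypothetical record layout)."""
    protein_a: str
    protein_b: str
    iptm: float


def high_confidence(predictions, threshold=IPTM_HIGH_CONFIDENCE):
    """Return the predictions whose ipTM is strictly above the threshold."""
    return [p for p in predictions if p.iptm > threshold]


def fraction_above(predictions, threshold=IPTM_HIGH_CONFIDENCE):
    """Fraction of predictions above the threshold.

    Applied to decoy pairs, this estimates how often unrelated proteins
    would pass the filter by chance.
    """
    if not predictions:
        raise ValueError("need at least one prediction")
    return len(high_confidence(predictions, threshold)) / len(predictions)


# Made-up scores, for illustration only.
candidates = [
    PairPrediction("protA", "protB", 0.91),
    PairPrediction("protC", "protD", 0.62),
    PairPrediction("protE", "protF", 0.87),
]
decoys = [
    PairPrediction("protA", "decoy1", 0.21),
    PairPrediction("protC", "decoy2", 0.34),
    PairPrediction("protE", "decoy3", 0.48),
]

kept = high_confidence(candidates)
decoy_rate = fraction_above(decoys)
print(f"{len(kept)} of {len(candidates)} candidates pass; decoy rate = {decoy_rate:.2f}")
```

A decoy rate of zero at the chosen threshold, as reported for the 300-protein noise model, is what justifies treating ipTM > 0.85 as a conservative cutoff.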

Major takeaways:

  • The paper describes PPI screening approaches that require no genetic modification. By fixing interactions in cells, they produce large numbers of novel protein–protein interactions along with their topologies. The experimental approaches yielded 44 high-quality PPI models (ipTM > 0.85).

  • Crosslinking MS provided the highest “hit rate” for structural modeling of novel PPIs: 12% of crosslinking MS PPIs led to models with ipTM > 0.85, compared with 4% of CoFrac-MS PPIs and 11% of the SubtiWiki dataset.

  • Found a strong correlation between ipTM and restraint satisfaction of heteromeric crosslinks, even though crosslinking information was not used in AlphaFold model prediction. Crosslink violation is especially low for models with ipTM > 0.85, indicating that high-confidence models agree with the residue–residue distances observed in situ.

  • A total of 80 of 153 high-quality dimers include proteins with transmembrane domains, and membrane proteins are present in half of the predicted trimer structures.
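The crosslink restraint check mentioned in the takeaways can be illustrated with a short Python sketch. It is a simplification under stated assumptions: the 30 Å Cα–Cα cutoff is a commonly used value for DSSO-type crosslinkers and is not taken from this summary, and the coordinate layout, function name, and residue positions are hypothetical.

```python
import math

# Assumed maximum C-alpha to C-alpha distance (in angstroms) for a satisfied
# crosslink. This value is an assumption for the example, not from the summary.
MAX_CA_DISTANCE = 30.0


def violation_fraction(ca_coords, crosslinks, cutoff=MAX_CA_DISTANCE):
    """Fraction of crosslinks whose C-alpha distance in the model exceeds the cutoff.

    ca_coords:  dict mapping (chain, residue_number) -> (x, y, z)
    crosslinks: list of ((chain, residue_number), (chain, residue_number)) pairs
    """
    if not crosslinks:
        raise ValueError("need at least one crosslink")
    violated = 0
    for site_a, site_b in crosslinks:
        distance = math.dist(ca_coords[site_a], ca_coords[site_b])
        if distance > cutoff:
            violated += 1
    return violated / len(crosslinks)


# Toy model with made-up coordinates: one link spans 10 A, the other 40 A.
toy_coords = {
    ("A", 10): (0.0, 0.0, 0.0),
    ("B", 25): (10.0, 0.0, 0.0),
    ("B", 40): (40.0, 0.0, 0.0),
}
toy_links = [
    (("A", 10), ("B", 25)),
    (("A", 10), ("B", 40)),
]

toy_violation = violation_fraction(toy_coords, toy_links)
print(f"violated fraction: {toy_violation:.2f}")
```

Because the crosslinks were not used as input to the prediction, a low violated fraction on heteromeric links acts as an independent check on the modeled interface.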