Complexity and the Language of Proteins

All of the animal life on Earth, including human beings, can be traced back to a unicellular ancestor somewhat similar to the modern-day protozoa. In one sense, the hundreds of millions of years of evolution that followed are the story of how organisms became more and more complex, growing from a single cell to trillions of highly specialized cells forming different organs and tissues in a single body. Yet while you could easily tell a protozoan from a human in a police lineup, cells from the two are made up of many of the same proteins, performing similar jobs. What changed to produce these profound differences in complexity?

One potential area where this complexity may have bloomed is tyrosine phosphorylation, a key cellular signal for pathways that control cell growth, proliferation, and structure. Enzymes called tyrosine kinases add a phosphate group to a wide range of cellular targets, which can act like a light switch, turning their function on or off. The phosphorylated proteins are recognized by another group of proteins with a special “sensor” called the SH2 domain. Because tyrosine kinases will promiscuously phosphorylate many targets in the cell, the very picky SH2 domain proteins are responsible for sorting out the noise.

“Tyrosine kinases tend to be not that selective,” said Piers Nash, assistant professor in the Ben May Department of Cancer Research at the University of Chicago who studies this system. “They’ll phosphorylate a lot of things, and that creates all of these docking sites for SH2-domain-containing proteins. It’s really up to the SH2 domains to interpret those signals and convert them into downstream signaling pathways.”

The more complex the cell, the more distinct types of SH2 domains are needed to perform this important sorting function. The unicellular cousins of animals can get by with just a single SH2 domain. But in humans, some 121 SH2 domains are known to exist, managing many different pathways in many different cells. In two recent papers, Nash’s laboratory studied how these SH2 domains manage their impressive selectivity and the evolutionary pathway that they took from simple protozoa to complex humans.

It’s essential that SH2 domains only bind to the right phosphorylated protein: repeatedly screwing up and activating the wrong pathway could lead to diabetes, cancer, or worse. But scientists have struggled to figure out how SH2 domains choose their appropriate target, with some even concluding that they aren’t so selective at all, and merely happen to be in the right part of the cell at the right time to bind only the correct protein. However, that wasn’t what a research team led by Bernard Liu from Nash’s laboratory found when they looked at how SH2 domains bind actual cell targets such as the insulin receptor.

“It turned out that the SH2 domains were exquisitely selective, much more selective than the general motifs for the SH2 domains that had previously been mapped,” Nash said. “So it was clear there was additional information encoded in the peptide that the SH2 domain makes use of.”

The researchers then deduced that the SH2 domains select their target through a kind of language, looking for the exact sequence of amino acids – or “word” – that marks the appropriate match. Because each amino acid (akin to the letters of the word) will either attract a particular SH2 domain or reject its peers, changing only one amino acid can completely change the meaning, like altering the word “light” to “fight.”

“For SH2 domains, that makes all the difference in the world. They can sense incredibly subtle differences,” Nash said. “It’s looking at the entire peptide and seeing both the permissive and the non-permissive residues, integrating that and making this collective decision about what to bind.”
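The idea of weighing permissive and non-permissive residues together can be pictured with a toy model. The Python sketch below is purely illustrative and is not the method used by Nash’s laboratory: the `ToyDomain` class, its residue preferences, and its threshold are all hypothetical. It assumes that each position after the phosphorylated tyrosine can carry residues a domain favors or residues it rejects outright, and that a single rejected residue vetoes binding no matter how good the rest of the “word” looks.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class ToyDomain:
    """Hypothetical SH2-like reader; all preferences here are made up."""
    # Position after the phosphotyrosine (+1, +2, ...) -> favored residues.
    permissive: Dict[int, Set[str]]
    # Position -> residues that block binding outright.
    non_permissive: Dict[int, Set[str]] = field(default_factory=dict)
    # Minimum number of favored residues needed to bind (an assumption).
    threshold: int = 2

    def score(self, residues: str) -> Optional[int]:
        """Count favored residues; return None if any residue is a veto."""
        total = 0
        for position, amino_acid in enumerate(residues, start=1):
            if amino_acid in self.non_permissive.get(position, set()):
                return None  # one non-permissive residue rejects the peptide
            if amino_acid in self.permissive.get(position, set()):
                total += 1
        return total

    def binds(self, residues: str) -> bool:
        """Integrate both kinds of evidence into a single yes/no decision."""
        result = self.score(residues)
        return result is not None and result >= self.threshold


# Illustrative preferences only, using one-letter amino acid codes.
domain = ToyDomain(
    permissive={1: {"E", "V"}, 2: {"N"}, 3: {"V", "I"}},
    non_permissive={3: {"P"}},
)

print(domain.binds("VNV"))  # every position is favored
print(domain.binds("VNP"))  # same peptide with one letter changed at +3
```

In this sketch the two peptides differ by a single letter, yet the first is accepted and the second is rejected, much like the difference between “light” and “fight.”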

These small changes are the very stuff of evolution, where a single genetic mutation can change the amino acid sequence of a protein. To examine how SH2 domains changed over evolutionary time, Nash’s team gathered genetic sequence data from 21 different organisms, from a microscopic choanoflagellate to sea anemones, mosquitoes, frogs, possums, and humans. In a paper published last week in Science Signaling, the researchers found the expected increase in the number of different SH2 domains as organisms grew more complex. But the timing of those changes was striking: big jumps in SH2 number corresponded to big jumps in complexity, such as at the junction between unicellular and multicellular organisms and the boundaries between insects, fish, and mammals.

But despite the proliferation of SH2 domains, Nash said you can see recurring patterns as nature repeatedly tweaks these cellular machines to come up with more and more variations.

“Essentially, evolution uses what it has,” Nash said. “It doesn’t come up with the optimal solution, it comes up with the best solution for what it’s stuck with.”

In the wake of this research, the laboratory continues to look at how SH2 domains work in the hope of someday understanding the language well enough to manipulate it. Knowing why one SH2 domain binds target A and activates pathway A while another does the same for target and pathway B could help scientists design more specific drugs for conditions such as diabetes and cancer. These more specific drugs could help correct a dysfunction without inadvertently activating or deactivating other systems and causing side effects.

“Armed with that knowledge, we can then go back in and make those changes, and actually change the selectivity from one to the other,” Nash said. “If we can find very specific pathways, we may be able to say well, there’s something we can do which would strengthen this interaction and correct your problem, but would not give you the massive growth and proliferation that would lead to cancer.”

To start cracking the SH2 domain code, Nash is turning to experts from another field: linguistics. After all, people who have studied and translated the grammar of lost languages might be able to do the same for the undiscovered language of cellular proteins.

“The idea that we can begin to understand the language of protein interaction is actually very neat,” Nash said. “It’s complex enough now that the existing algorithms for prediction don’t work. But the linguistic guys have the same problem: they can do the simple stuff very easily, but to get to the level of more complex recognition is actually quite difficult.”

About Rob Mitchum (525 Articles)
Rob Mitchum is communications manager at the Computation Institute, a joint initiative between The University of Chicago and Argonne National Laboratory.