June 16, 2023
Jianlin “Jack” Cheng — William and Nancy Thompson Distinguished Professor in the University of Missouri College of Engineering — recently received funding from the National Science Foundation to develop a tool that will predict how a protein functions based on its order of amino acids.
Cheng envisions open-source software that would let a user enter a sequence and then predict not only how that string of amino acids will fold into a structure but also the role the protein will carry out within a cell. Additionally, the system would pinpoint the specific site on the protein that carries out the function.
Because proteins are the building blocks of life, applications range from engineering drought-resistant crops to developing advanced drugs.
“This will allow researchers to understand what kind of molecular function the protein has,” Cheng said. “For instance, if a protein is promoting tumor growth in a cancer patient, scientists could design a drug to prohibit the site of that activity and slow or stop it from growing.”
Cheng is using a deep transformer model, a type of large language model with some similarity to the one that powers ChatGPT, the popular generative artificial intelligence (AI) program that produces text from user prompts. Just as words make up human language, protein sequences are the language of biological systems.
The team is developing three types of deep transformer models. A 1D sequence-based transformer considers the sequence of amino acids. A 2D graph transformer considers how proteins interact with one another and what those interactions accomplish. And a 3D-equivariant graph transformer takes into account the protein's structure and the different sites within the protein that carry out specific tasks.
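To make the "sequence as language" idea concrete, here is a minimal sketch of how a protein sequence might be turned into numeric tokens before a sequence-based transformer reads it. This is an illustration only and not the team's code: the function name `tokenize`, the padding and unknown-residue IDs, and the numbering scheme are all assumptions chosen for the example.

```python
# Hypothetical illustration; not taken from Cheng's software.
# One-letter codes for the 20 standard amino acids.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Assumed convention: 0 is padding, 1 marks an unrecognized residue.
PAD_ID, UNK_ID = 0, 1
TOKEN_IDS = {aa: i + 2 for i, aa in enumerate(AMINO_ACIDS)}


def tokenize(sequence: str, max_len: int) -> list[int]:
    """Map a protein sequence to a fixed-length list of integer token IDs."""
    # Truncate to max_len, then look up each residue (unknowns get UNK_ID).
    ids = [TOKEN_IDS.get(aa, UNK_ID) for aa in sequence.upper()[:max_len]]
    # Pad short sequences so every input has the same length.
    return ids + [PAD_ID] * (max_len - len(ids))


if __name__ == "__main__":
    print(tokenize("MKTAYX", max_len=8))
```

Each amino acid plays the part a word or word fragment plays in a text model: the transformer sees only the resulting list of numbers and learns patterns across them.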