An AI Program Has Cracked Science’s Problem With Protein

Artificial intelligence program AlphaFold has officially cracked one of biology’s biggest problems: working out what protein structures look like. The result is likely to impact almost every field of scientific research.

“This is a problem that we just didn’t think was going to be solved for a long time, and then just sort of overnight the game’s changed,” said Dr Kate Michie, a structural biologist at the University of New South Wales.

“I mean we’ve dreamed of it. There’s been lots of false declarations of success. This one really works,” added Professor Paul Curmi, a molecular biophysicist also at UNSW.

What The New Software Discovered About Protein Structures

The huge breakthrough that happened over the weekend was basically a big data dump of 200 million protein structures from over one million species. 

“The field of protein structures [has] been going for a long time and it’s sort of taken a really long time for each individual protein structure to be solved,” Dr Michie explained to Junkee. “It’s experimentally really difficult. It takes a lot of resources and costs thousands of dollars, if not hundreds of thousands of dollars.”

The program behind the in-demand data is AlphaFold, which was developed by DeepMind, a London-based AI tech company owned by Google. The program’s code was first released about a year ago for scientists to use to try and figure out what proteins look like.

Before AlphaFold, scientists relied on experimental methods like x-ray crystallography, which took weeks at best, and often months or even years, to figure out the structure of a single protein.

“There’s a second method called nuclear magnetic resonance spectroscopy. That’s a very good method but slower. As of 2013, there is yet another method called single-particle cryo-electron microscopy,” said Professor Curmi.

A database of around 190,000 structures existed from these methods. But ever since AlphaFold was released last year, that number has jumped to 1 million because scientists were finally able to predict the 3D structure of a protein in seconds by using AI.

The company has now released over 200 million protein structures predicted by the software. This covers almost every single known protein on the planet.

What Exactly Are Proteins?

Let’s back up — what is so important about proteins, and why is their structure so hard to pinpoint?

Proteins are known as the building blocks of life, in the sense that they underpin every biological process in every living thing. They’re chains built from up to 20 different kinds of amino acids, and each protein has a specific DNA code which acts like a list of instructions for a cell to build it. Examples of proteins include antibodies that fight off infections, or enzymes that help us digest food.

Previously, scientists have understood a protein’s DNA code and the corresponding chain of amino acids, but have found it a lot more difficult to figure out how that chain folds up to create a protein. Until AlphaFold.

“AlphaFold Two allows you to directly translate the DNA sequence into the three-dimensional structure of the constituent proteins. And that in itself is a game changer,” said Professor Curmi.

How AlphaFold Can Change The Future Of Research

Having the structure of almost every known protein in one database is huge. The data will change scientific research by opening up the kinds of experiments that can now be performed. Because AlphaFold has already predicted the structures, scientists can focus on what proteins do and how, instead of what they look like. This has potential benefits in all kinds of biological research, like developing sustainable materials or reducing pollution by understanding the reactions at a molecular level.

“Let’s take coronavirus,” Professor Curmi explained. “If you have the structure, you’ve got the blueprint [of] what the thing looks like, and you can analyse that and say ‘okay there’s a weak point here’. You can design a molecule that will fit in there exactly, and jam it up.”

Dr Michie also pointed out that researchers who may have been put off by the labour-intensive process of determining protein structures can now easily access AlphaFold’s data for free (and so can you!).

“People who previously thought that structural biology was too hard [can] now download a structure that we think is probably close to experimentally usable. So now we can start designing better experiments to really test that.

“It’s really leapfrogged science ahead in a really quick way. [It’s] much cheaper than trying to solve all of these structures independently. So it’s huge.”