Using an artificial intelligence language model, scientists at Meta, the parent company of Facebook and Instagram, have predicted the structures of more than 600 million proteins in two weeks. The proteins belong to viruses, bacteria and other microbes, and the predictions could be used in the development of new drugs.
Meta's program, called ESMFold, is built on a model that was originally designed to decode human languages. Scientists have now used it to make accurate predictions of the complex three-dimensional structures of proteins. The predictions, published as open source in the ESM Metagenomic Atlas, could be used to develop new drugs, detect unknown microbial functions, and trace important connections between distantly related species.
ESMFold is not the first AI model to predict protein structures. In 2022, Google's DeepMind announced that its own model, AlphaFold, had predicted the structures of roughly 200 million proteins. Meta claims, however, that its system is 60 times faster than DeepMind's. Meta's researchers have published their results as a preprint on bioRxiv.
Using Meta's AI language model for proteins
Proteins are the building blocks of all living things and are made of long chains of amino acids. Knowing the shape of a protein is the best way to understand its function, but a chain of amino acids can fold in a vast number of ways. The standard method for determining a protein's structure is X-ray crystallography, which involves observing how high-energy light is diffracted by the protein. This painstaking method can take months or years, however, and it does not work on all proteins.
To get around this, Meta's researchers developed an advanced computer model that tries to learn the "language" of protein sequences. The model was trained on the sequences of millions of natural proteins and can automatically fill in gaps in a sequence.
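The gap-filling idea can be illustrated with a very small sketch. This is not ESMFold or Meta's method: it is a toy frequency model, assumed here purely for illustration, that hides one amino acid in a sequence and guesses it from its two neighbours. The training sequences, the `_` mask symbol and the function names `train_context_counts` and `fill_gaps` are all made up for this example.

```python
from collections import Counter, defaultdict

MASK = "_"  # hypothetical placeholder for a hidden amino acid


def train_context_counts(sequences):
    """Count which residue appears between each (left, right) neighbour pair."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for i in range(1, len(seq) - 1):
            counts[(seq[i - 1], seq[i + 1])][seq[i]] += 1
    return counts


def fill_gaps(sequence, counts):
    """Replace each masked residue with the one seen most often in that context.

    Masks at either end of the sequence, or in a context never seen during
    training, are left unchanged.
    """
    residues = list(sequence)
    for i, residue in enumerate(residues):
        if residue != MASK or i == 0 or i == len(residues) - 1:
            continue
        seen = counts.get((residues[i - 1], residues[i + 1]))
        if seen:
            residues[i] = seen.most_common(1)[0][0]
    return "".join(residues)


# Made-up toy sequences, not real proteins.
training = ["MKTAYIAK", "MKTAYLAK", "GKTAYIAR"]
counts = train_context_counts(training)
print(fill_gaps("MK_AYIAK", counts))  # prints MKTAYIAK
```

A real protein language model learns from far more context than two neighbours and from millions of sequences, but the training signal is similar in spirit: hide part of a sequence and learn to predict what belongs there.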
To test the model, the researchers turned to a database of metagenomic DNA collected from environments such as soil, seawater and the human gut, and fed this information to ESMFold. Within two weeks, the model had predicted the structures of more than 617 million proteins. That is about 400 million more than AlphaFold has predicted.
The researchers believe that more than 200 million of ESMFold's predictions are of high quality. They now hope to use the system for further work on the role of proteins in health, disease and the environment.