Genomic language model


The new work was done by a small team at Stanford University. It relies on a feature that’s common in bacterial genomes: the clustering of genes with related functions. 

"Often, bacteria have all the genes needed for a given function —importing and digesting a sugar, synthesizing an amino acid, etc. —right next to each other in the genome. 

"In many cases, all the genes are transcribed into a single, large messenger RNA. This gives the bacteria a simple way to control the activity of entire biochemical pathways at once, boosting the efficiency of bacterial metabolisms.

"So, the researchers developed what they term a genomic language model they call 'Evo' using an enormous collection of bacterial genomes. 

"The training was similar to what you’d see in a large language model, where Evo was asked to output predictions of the next base in a sequence, and rewarded when it got it right. It’s also a generative model, in that it can take a prompt and output novel sequences with a degree of randomness, in the sense that the same prompt can produce a range of different outputs."

Comments

Popular posts from this blog

Hamza Chaudhry

Swarm 🦹‍♂️

Digital ID tracking system