Genomic language model
The new work was done by a small team at Stanford University. It relies on a feature that’s common in bacterial genomes: the clustering of genes with related functions.
"Often, bacteria have all the genes needed for a given function —importing and digesting a sugar, synthesizing an amino acid, etc. —right next to each other in the genome.
"In many cases, all the genes are transcribed into a single, large messenger RNA. This gives the bacteria a simple way to control the activity of entire biochemical pathways at once, boosting the efficiency of bacterial metabolisms.
"So, the researchers developed what they term a genomic language model they call 'Evo' using an enormous collection of bacterial genomes.
"The training was similar to what you’d see in a large language model, where Evo was asked to output predictions of the next base in a sequence, and rewarded when it got it right. It’s also a generative model, in that it can take a prompt and output novel sequences with a degree of randomness, in the sense that the same prompt can produce a range of different outputs."
Comments
Post a Comment
Empathy recommended