Diffusion explainer
"Diffusion models simulate this process in reverse, reconstructing legible forms from randomness.
"For a sense of how this works for images, picture a photo of an elephant. To train the model, you make a copy of the photo, adding a layer of random black-and-white static on top.
"Make a second copy and add a bit more, and so on hundreds of times until the last image is pure static, with no elephant in sight. For each image in between, a statistical model predicts how much of the image is noise and how much is really the elephant. It compares its guesses with the right answers and learns from its mistakes.
"Over millions of these examples, the model gets better at de-noising the images and connecting these patterns to descriptions like male Borneo elephant in an open field.
"Now that it’s been trained, generating a new image means reversing this process. If you give the model a prompt, like a happy orangutan in a mossy forest, it generates an image of random white noise and works backward, using its statistical model to remove bits of noise step by step.
"At first, rough shapes and colors appear. Details come after, and finally (if it works) an orangutan emerges, all without the model knowing what an orangutan is."