Data Extraction Attack

Extracting Training Data from ChatGPT 

"Our attack shows that, by querying the model, we can actually extract some of the exact data it was trained on. We estimate that it would be possible to extract ~a gigabyte of ChatGPT’s training dataset from the model by spending more money querying the model.

"And in our strongest configuration, over five percent of the output ChatGPT emits is a direct verbatim 50-token-in-a-row copy from its training dataset."
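
To make the "verbatim 50-token-in-a-row" figure concrete, here is a minimal sketch of how such a fraction could be measured. It is not the authors' method: it assumes tokens are simple whitespace-separated words and that the reference corpus is small enough to hold all of its n-grams in memory. The function names `ngrams` and `memorized_fraction` are hypothetical and chosen only for illustration.

```python
from typing import List, Set, Tuple


def ngrams(tokens: List[str], n: int) -> Set[Tuple[str, ...]]:
    """Return the set of all contiguous n-token windows in `tokens`."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def memorized_fraction(output_tokens: List[str],
                       corpus_ngrams: Set[Tuple[str, ...]],
                       n: int = 50) -> float:
    """Fraction of output tokens lying inside at least one n-token window
    that also appears verbatim in the reference corpus."""
    if not output_tokens:
        return 0.0
    covered = [False] * len(output_tokens)
    for i in range(len(output_tokens) - n + 1):
        if tuple(output_tokens[i:i + n]) in corpus_ngrams:
            # Mark every token in the matching window as copied.
            for j in range(i, i + n):
                covered[j] = True
    return sum(covered) / len(output_tokens)


if __name__ == "__main__":
    # Toy example with n=3 instead of 50 (assumption: whitespace tokens).
    corpus = "a b c d e".split()
    output = "x a b c y".split()
    print(memorized_fraction(output, ngrams(corpus, 3), n=3))
```

With a real model, the tokens would come from the model's own tokenizer and the reference corpus would be far too large for an in-memory set, so an indexed lookup structure would be needed instead.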
