Anthropic done jailbroke itself 😎


System prompts act much like operating instructions for large language models (LLMs), telling the models the general rules they should follow when interacting with users and the behaviors or personalities they should exhibit. They also tend to state the cut-off date of the information the LLM learned during training.
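As a rough illustration, here is a minimal sketch of how a system prompt is supplied when calling Claude through the Anthropic Python SDK; the model ID, prompt text, and token limit are just example values, not anything Anthropic has published.

```python
# Minimal sketch: passing a system prompt to Claude via the Anthropic Python SDK.
# Assumes `pip install anthropic` and an API key in the ANTHROPIC_API_KEY env var.
import anthropic

client = anthropic.Anthropic()

response = client.messages.create(
    model="claude-3-5-sonnet-20240620",  # example model ID
    max_tokens=256,
    # The system prompt: general rules and persona, kept separate from user turns.
    system="You are a concise assistant. Answer in plain language.",
    messages=[
        {"role": "user", "content": "What is a system prompt?"},
    ],
)
print(response.content[0].text)
```

Note that the system prompts Anthropic published apply to its consumer apps; developers using the API set their own, as above.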

Most LLMs have system prompts, but not every AI company publicly releases them. Uncovering the system prompts of models has even become a hobby of sorts for AI jailbreakers.

But now, Anthropic has beaten the jailbreakers at their own game, going ahead and revealing the operating instructions for its Claude 3.5 Sonnet, Claude 3 Haiku, and Claude 3 Opus models on its website, under the release notes section.

In addition, Anthropic’s Head of Developer Relations Alex Albert posted on X (formerly Twitter) a commitment to keeping the public updated on its system prompts, writing: “We’re going to log changes we make to the default system prompts on Claude dot ai and our mobile apps.”


