Nvidia GTC
"The Nvidia Groq 3 language processing unit (LPU) incorporates intellectual property Nvidia licensed from the start-up Groq last Christmas Eve for US $20 billion.
"Training and inference tasks have distinct computational requirements.
"While training can churn through huge amounts of data in parallel and can take weeks, inference must run on each user's query as it arrives. Unlike training, inference doesn't require running costly backpropagation.
"With inference, the most important thing is low latency: users expect the chatbot to answer quickly, and for thinking or reasoning models, inference runs many times before the user even sees an output."
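The point about reasoning models can be made concrete with a toy sketch: each "thought" is an extra forward pass through the model, so the latency a user experiences grows with the length of the reasoning chain. Everything here is hypothetical illustration (the function names `run_model` and `answer_with_reasoning` are invented, and `time.sleep` stands in for real per-call latency); it is not any vendor's actual API.

```python
import time

def run_model(prompt: str) -> str:
    # Stand-in for one forward pass of a language model. At inference
    # time we only compute outputs; no backpropagation is performed.
    time.sleep(0.001)  # simulate per-call latency
    return f"thought about: {prompt[-20:]}"

def answer_with_reasoning(query: str, thinking_steps: int = 4):
    """Toy reasoning loop: the model is invoked several times to build
    up intermediate 'thoughts' before the user sees any output, so the
    user-visible latency is roughly (thinking_steps + 1) forward passes."""
    context = query
    calls = 0
    for _ in range(thinking_steps):
        context += " | " + run_model(context)  # hidden reasoning step
        calls += 1
    final = run_model(context)  # the one pass whose output the user sees
    calls += 1
    return final, calls

answer, n_calls = answer_with_reasoning("Why is the sky blue?")
print(n_calls)  # 5 forward passes for a 4-step reasoning chain
```

A non-reasoning model would pay for only the single final pass, which is why serving reasoning models pushes so hard on per-pass latency.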