Sociolinguistics
"The epistemology of AI is different from the way language is learnt by humans, and the absence of any cultural understanding is key to this.
"Language in LLMs is about statistical co-occurrence, not about deep cultural meanings.
"As a result of this specific LLM ontology and epistemology, the data used are of crucial importance.
"LLMs are trained on large internet-based datasets, but large doesn't mean equal or diverse.
"On the contrary, it is known that those databases are heavily biased, and that is exactly why those models need alignment in the form of bias removal.
"Bender et al. argue that at 'each step, from initial participation in Internet fora, to continued presence there, to the collection and finally the filtering of training data, current practice privileges the hegemonic viewpoint.'
"Combine this with a rising trust in the authority and knowledgeability of those LLMs and we see where sociolinguistics can come in."