Hallucination is Inevitable: An Innate Limitation of Large Language Models (arxiv preprint)

OmnipotentEntity@beehaw.org · 8 months ago


jmp242@sopuli.xyz · 8 months ago

I think it’s very clear that this “stochastic parrot” idea is less and less accepted by researchers and philosophers, maybe only in the podcasts I listen to…

It’s not capable of knowledge in the sense that humans are. All it does is probabilistically predict which sequence of words might best respond to a prompt

I think we need to be careful about assuming we understand what human knowledge is, and about the connotations of the word “sense” there. If you mean GPT4 doesn’t have knowledge the way humans do, in the same way a car doesn’t have motion the way a human does, then I think we agree. But if you mean that GPT4 cannot reason, access, and present information - that’s just false on the face of just using the tool IMO.

It’s also untrue that it’s predicting words; it’s using tokens, which are more like concepts than words, so I’d argue it’s already closer to humans. To the extent it is just predicting stuff, it really calls into question the value of most of the school essays it writes so well now…

TheChurn@kbin.social · 8 months ago

A token is not a concept. A token is a word or word fragment that occurred often in free text and was assigned a number. Common words, prefixes, and suffixes are the vast majority of tokens, and the rest are uncommon pairs of letters.

The algorithm to generate tokens is essentially compression; there is no semantic meaning embedded in them.
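
To make the "essentially compression" point concrete, here is a toy sketch of byte-pair encoding (BPE), the family of algorithms that GPT-style tokenizers are built on. It is an illustration under assumptions, not any particular model's tokenizer: the function name `learn_bpe_merges` and the tiny word list are made up for this example, and real tokenizers work on bytes and far larger corpora. The only signal it uses is how often two adjacent symbols appear together; nothing about meaning enters the process.

```python
from collections import Counter


def learn_bpe_merges(words, num_merges):
    """Toy BPE: repeatedly fuse the most frequent adjacent symbol pair.

    Hypothetical helper for illustration only; not a real library API.
    """
    # Each word starts out as a tuple of single characters.
    vocab = Counter(tuple(w) for w in words)
    merges = []
    for _ in range(num_merges):
        # Count every adjacent pair, weighted by how often the word occurs.
        pair_counts = Counter()
        for symbols, freq in vocab.items():
            for pair in zip(symbols, symbols[1:]):
                pair_counts[pair] += freq
        if not pair_counts:
            break  # every word is already a single symbol
        best = max(pair_counts, key=pair_counts.get)
        merges.append(best)
        # Rewrite every word, replacing the chosen pair with one fused symbol.
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            merged, i = [], 0
            while i < len(symbols):
                if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == best:
                    merged.append(symbols[i] + symbols[i + 1])
                    i += 2
                else:
                    merged.append(symbols[i])
                    i += 1
            new_vocab[tuple(merged)] += freq
        vocab = new_vocab
    return merges, vocab


merges, vocab = learn_bpe_merges(["low", "low", "lower", "lowest"], 2)
print(merges)       # the learned merge rules, in order
print(dict(vocab))  # the words re-expressed with the merged symbols
```

Each merge rule becomes a new vocabulary entry with an integer ID, which is why frequent words and affixes end up as single tokens while rare strings stay split into fragments.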

kciwsnurb@aussie.zone · 8 months ago

only in the podcasts I listen to

Yes definitely. Many of my fellow NLP researchers would disagree with those researchers and philosophers (not sure why we should care about the latter’s opinions on LLMs).

it’s using tokens, which are more like concepts than words

You’re clearly not an expert so please stop spreading misinformation like this.