ChatGPT is not a database

In the process of their creation, language models are exposed to A LOT of data, and it is thanks to the content of the texts they see during training that they acquire knowledge (though their direct goal is to learn to generate the most likely next word given a sequence, not to acquire knowledge). This knowledge is not stored as such anywhere, nor are the texts seen during training. Once training is over, the language model is essentially sealed, and it has no way to “look up” an answer. The answer just comes from probabilities, and there is no lookup strategy in the model itself. Nowadays there are lookup strategies “outside” the model, like launching a browser and searching for the answer externally, BUT: the answer produced by the language model (as a result of next-word prediction) and the one it retrieves through an external browser do not necessarily match!
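
To make the difference with a database concrete, here is a toy sketch in Python. It is only an illustration under strong assumptions: a tiny word-pair counter stands in for a real neural language model, and the corpus and all function names (`train`, `next_word_probs`, `generate`) are made up for this example. After "training", the sketch keeps only next-word statistics and never consults the sentences again.

```python
import random
from collections import Counter, defaultdict

# Toy "training data" (invented for this example).
corpus = [
    "paris is the capital of france",
    "rome is the capital of italy",
    "madrid is the capital of spain",
]

def train(sentences):
    """Count which word follows which. Only these counts are kept."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.split()
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return counts

def next_word_probs(counts, prev):
    """Turn the counts for one context word into probabilities."""
    total = sum(counts[prev].values())
    return {word: c / total for word, c in counts[prev].items()}

def generate(counts, start, max_words=6, seed=0):
    """Produce text by repeatedly sampling a likely next word."""
    rng = random.Random(seed)
    words = [start]
    for _ in range(max_words - 1):
        probs = next_word_probs(counts, words[-1])
        if not probs:  # no known continuation for this word
            break
        choices, weights = zip(*probs.items())
        words.append(rng.choices(choices, weights=weights)[0])
    return " ".join(words)

model = train(corpus)

# After "of", the three countries are equally likely: the toy model
# has no record of which sentence each one originally came from.
probs_after_of = next_word_probs(model, "of")
print(probs_after_of)
print(generate(model, "paris"))
```

Because this toy model only looks at the previous word, a sentence starting with "paris" can just as well end with "italy" or "spain" as with "france". Real language models condition on far more context and are much more accurate, but the principle is the same: the output is sampled from probabilities, not retrieved from a stored record.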