
AI prediction error: Jane Austen and Star Trek as training material


Basically, this is how my simple mind pictures the way AI Large Language Models such as ChatGPT, Claude or Bing work:


When someone looks into our eyes and starts to say "I love...", our mind races ahead and, most likely, completes it with "...you". If the significant other then says "...your sister", in AI terms that is a prediction error. A catastrophic one, in this case.


AI models have been trained on lots of reading material (essentially the internet, or more carefully selected data depending on the purpose) to do the same "next word prediction" and use it on us. With as few prediction errors as possible, on Jane Austen or anything else.
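For the technically curious, here is a toy sketch in Python of what "next word prediction" and "prediction error" mean. It is a made-up word-counting model on an invented three-sentence "training text", not how ChatGPT or Claude actually work inside (those use huge neural networks trained on tokens), but the idea is the same: predict the next word, and measure how surprised you are when the real next word shows up.

```python
# A minimal sketch of "next word prediction" with a toy word-count model.
# This is NOT how real LLMs work internally; it only illustrates the idea.
import math
from collections import Counter, defaultdict

# Made-up, tiny "training material"
training_text = "i love you . i love you . i love your sister ."

# Count which word follows which (a crude stand-in for what a model learns)
counts = defaultdict(Counter)
words = training_text.split()
for current, following in zip(words, words[1:]):
    counts[current][following] += 1

def predict_next(word):
    """Return the most likely next word and the model's probability for it."""
    following = counts[word]
    best, n = following.most_common(1)[0]
    return best, n / sum(following.values())

def prediction_error(word, actual_next):
    """Surprise (negative log probability) when the actual next word appears."""
    following = counts[word]
    p = following[actual_next] / sum(following.values())
    return float("inf") if p == 0 else -math.log(p)

print(predict_next("love"))                 # ('you', 0.67): "I love..." -> "...you"
print(prediction_error("love", "you"))      # small error: the expected word
print(prediction_error("love", "your"))     # bigger error: the "...your sister" surprise
```

Training, roughly speaking, means nudging the model so that these errors on its reading material get smaller and smaller.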


A graphical illustration of how AI Large Language Models work

That said, in reality, scientists don't always know how this really happens.


When the models show abilities nobody explicitly programmed, scientists call them "emergent properties", which doesn't say much at all. It's like saying "consciousness is an emergent property of neural activity" - no one knows how that is supposed to work either. That's called the hard problem. So there's still a kind of hard problem in AI, perhaps a little easier. Maybe one day there will be an AI that is transparent to us.




There's a great animated article in The New York Times illustrating how this learning works in stages (training cycles) on various training material, namely literary authors. It's fun.



An image with various authors used as AI training material





