Autoregressive Model

Autoregressive models such as GPT-2 predict the next token (roughly, a word or a piece of a word) from the tokens that come before it, known as the context. Each predicted token is appended to the context and the process repeats, which is why these models are mainly used for generating text, such as continuing a story. Think of a reader working through a mystery novel: they start at the beginning, read one word at a time, and keep predicting what comes next using only what they have already read.
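To make the generate-one-token-then-repeat loop concrete, here is a minimal sketch in Python. It is not how GPT-2 works internally: it swaps in a toy bigram counter that looks only at the single previous word, whereas GPT-2 is a neural network that conditions on a much longer context. The function names train_bigram and generate, and the tiny corpus, are made up for illustration.

```python
from collections import Counter, defaultdict


def train_bigram(tokens):
    """Count, for each token, which tokens follow it in the training text."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts


def generate(counts, prompt, max_new_tokens):
    """Greedy autoregressive decoding: predict one token, append it, repeat."""
    out = list(prompt)
    for _ in range(max_new_tokens):
        # Toy assumption: the "context" is only the last token generated so far.
        followers = counts.get(out[-1])
        if not followers:
            break  # this token was never followed by anything in training
        next_token = followers.most_common(1)[0][0]
        out.append(next_token)  # the prediction becomes part of the context
    return out


# Hypothetical training text, chosen only to keep the example small.
corpus = "the detective found the clue and the detective found the culprit".split()
model = train_bigram(corpus)
print(" ".join(generate(model, ["the"], 4)))
# prints: the detective found the detective
```

The output loops back on itself because the toy model remembers only one word of context. The loop in generate is the part that carries over to real autoregressive models: every new token is chosen from what has been produced so far and then fed back in as input for the next step.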



