ChatGPT will be one year old this November. Artificial intelligence models represented by ChatGPT are iterating at an astonishing speed, and some technology companies even claim that ChatGPT may one day replace humans. So how does ChatGPT actually work? Could it really replace us? Today, using examples everyone is familiar with, I will try to give an intuitive picture of this seemingly mysterious technology.

The "nesting doll" method lets ChatGPT generate long texts

The GPT in ChatGPT is short for Generative Pre-trained Transformer; as the name suggests, it is a pre-trained model that generates content. What does "generating content" mean? The idiom-chain game is a kind of generation: it produces what comes next based on what came before. If I say "guard the stump and wait for the..." (守株待...), ChatGPT may continue with "rabbit" (兔); if I say "feint to the east and strike..." (声东击...), it may continue with "west" (西). This is word-chain-style generation.

But if the model only produces one word at a time, how can it generate a long text? This is where the "nesting doll" method comes in: each newly generated word is appended to the previous context to form a new context, which is then used to generate the next word, and the process repeats until a text of any length has been produced. For example, if we input "守株待" at the start, ChatGPT generates "兔", then takes "守株待兔" as the new context to generate the following word, and so on. In this way, an arbitrarily long text is built up word by word. (A minimal code sketch of this loop appears at the end of this part.)

The generated content is shaped by two factors and is not a random answer

So is the content generated by ChatGPT completely random? No, because it is shaped by two factors: the preceding context and ChatGPT's own language model. Put simply, different models generate different content from the same context, and the same model generates different content from different contexts, just as each of us has our own sense of which words should follow "守株待兔" (waiting by the stump for a rabbit).

Of course, we also want ChatGPT to generate the content we want. That requires feeding it large amounts of the kind of content we want, so that it adjusts its language model through learning. It is like being shown over and over that "守株待兔" should be followed by "猛虎" (fierce tiger): over time, the first thing that comes to mind after seeing "守株待兔" is "猛虎". ChatGPT works the same way; through learning, it gradually masters the language rules we want it to follow.

ChatGPT can produce new answers on its own through learning

Learning is not rote memorization but the ability to draw inferences from one example. Having been trained on how one idiom should be continued, the model can also work out how to continue a similar idiom it has never been explicitly taught. This is ChatGPT's generalization ability: applying the rules it has learned to questions it has never seen before, just as we humans can apply principles we have learned to answer new questions. ChatGPT answers questions in the same way: if you provide it with a large number of correct question-and-answer examples to learn from, it masters the method of answering that type of question and can then answer new questions it has never encountered. This is different from a search engine, which simply looks up ready-made answers in a database.
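To make the "nesting doll" loop and the two influencing factors concrete, here is a minimal, purely illustrative Python sketch. Nothing in it is how ChatGPT is actually implemented: the "model" is just a hand-written lookup table standing in for a real neural network, and the names (TOY_MODEL, next_word, generate) and idiom pairings are invented for this article.

```python
# A toy illustration of the "nesting doll" generation loop.
# The "language model" here is only a lookup table of hypothetical
# associations; a real model computes probabilities with a neural network.

TOY_MODEL = {
    "守株待": "兔",     # "guard the stump and wait for the..." -> "rabbit"
    "声东击": "西",     # "feint to the east and strike..."      -> "west"
    "守株待兔": "猛",   # an association learned from repeated training examples
    "守株待兔猛": "虎",
}

def next_word(context):
    """Return the next character suggested by the toy model, or None.

    We simply look for the longest suffix of the context that the table
    recognizes; a real model considers the whole context at once.
    """
    for length in range(len(context), 0, -1):
        suffix = context[-length:]
        if suffix in TOY_MODEL:
            return TOY_MODEL[suffix]
    return None  # the toy model has nothing more to say

def generate(context, max_new_words=10):
    """The 'nesting doll' loop: append each new word to the context, repeat."""
    for _ in range(max_new_words):
        word = next_word(context)
        if word is None:
            break
        context = context + word  # the new word becomes part of the new context
    return context

print(generate("守株待"))  # -> 守株待兔猛虎
```

Swapping in a different table (a different "model"), or starting from a different context, changes what comes out, which is exactly the two-factor point made above.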
ChatGPT understands how to answer questions and can generate new answers on its own. However, its answers are not necessarily correct: it may generate false content because it has learned the wrong rules, just as we may pick up wrong knowledge from bad examples and give answers that do not match the facts. So we should not blindly take its word for anything, but should evaluate its reliability by asking more questions and cross-checking.

Thanks to a huge model and a three-stage training method

Why can ChatGPT achieve such impressive language generation? This comes down to its huge model size and a three-stage training method. First, ChatGPT absorbs hundreds of millions of pieces of Internet text for unsupervised pre-training and acquires broad language knowledge; then, humans design language-interaction templates to shape how it behaves; finally, humans keep asking it questions and feeding back judgments on its answers, which improves its creativity. (A rough outline of these three stages appears at the end of this article.) Trained step by step in this way, ChatGPT can exceed our expectations and complete many complex language tasks.

As a man-made system, ChatGPT also has limitations. The content it generates cannot be trusted unconditionally and needs human supervision and evaluation. Like any technology, it can only serve as an auxiliary tool; it should not and cannot replace human creativity and consciousness, and it ultimately needs human guidance. Let us view this technological progress positively and rationally, put people first, and make it benefit society.

(The author, Feng Run, is a member of the Beijing Science and Technology Popular Science Lecture Team and a market and industry researcher at Beijing Experimental Animal Research Center Co., Ltd.)
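For readers who want a slightly more concrete picture of the three training stages, here is a rough, purely hypothetical outline in Python. Nothing in it is real training code: the function names (pretrain, supervised_finetune, feedback_training) and the toy data are invented for this article, and the function bodies only record which stage has been applied.

```python
# A conceptual outline of the three-stage training described above.
# Everything here is illustrative; no real training happens.

def pretrain(model, web_text):
    """Stage 1: unsupervised pre-training on huge amounts of Internet text,
    from which the model acquires broad language knowledge."""
    model["stages"].append(f"pre-trained on {len(web_text)} documents")
    return model

def supervised_finetune(model, human_written_dialogues):
    """Stage 2: training on human-designed interaction templates,
    which shape how the model behaves in a conversation."""
    model["stages"].append(f"fine-tuned on {len(human_written_dialogues)} example dialogues")
    return model

def feedback_training(model, ranked_answers):
    """Stage 3: people ask questions and feed back judgments on the answers,
    and those judgments are used to keep improving the model."""
    model["stages"].append(f"refined with {len(ranked_answers)} pieces of human feedback")
    return model

# The three stages are applied in order.
model = {"stages": []}
model = pretrain(model, web_text=["...a web page...", "...another web page..."])
model = supervised_finetune(model, human_written_dialogues=["Q: ... A: ..."])
model = feedback_training(model, ranked_answers=["answer A preferred over answer B"])

for step in model["stages"]:
    print(step)
```

The point is simply the order of the pipeline: broad pre-training first, then human-written examples, then human feedback.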