GPT-3 Explained in Under 2 Minutes

Category: IT Technology · Published: 5 years ago

So, you’ve seen some amazing GPT-3 demos on Twitter (machine-made Op-Eds, poems, articles, even working code). But what’s going on under the hood of this incredible model? Here’s a (brief!) look inside.

GPT-3 is a neural-network-powered language model. A language model is a model that predicts the likelihood of a sentence existing in the world. For example, a language model can label the sentence “I take my dog for a walk” as more likely to appear (e.g. on the Internet) than the sentence “I take my banana for a walk.” This is true for sentences as well as phrases and, more generally, any sequence of characters.
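To make the idea of "sentence likelihood" concrete, here is a minimal sketch using a toy bigram model with add-one smoothing. This is not how GPT-3 works internally (GPT-3 is a large neural network); the four-sentence corpus and the function names `train_bigram` and `log_prob` are made up for illustration.

```python
import math
from collections import Counter

# Tiny made-up corpus standing in for "text on the Internet" (illustrative only).
CORPUS = [
    "i take my dog for a walk",
    "i take my dog to the park",
    "she takes her dog for a walk",
    "i eat my banana for breakfast",
]

def train_bigram(sentences):
    """Count how often each word follows each other word."""
    unigrams, bigrams = Counter(), Counter()
    vocab = {"<s>", "</s>"}
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        vocab.update(tokens)
        unigrams.update(tokens[:-1])                  # counts of context words
        bigrams.update(zip(tokens[:-1], tokens[1:]))  # counts of word pairs
    return unigrams, bigrams, len(vocab)

def log_prob(sentence, unigrams, bigrams, vocab_size):
    """Log-probability of a sentence under the bigram model (add-one smoothing)."""
    tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
    total = 0.0
    for prev, word in zip(tokens[:-1], tokens[1:]):
        total += math.log((bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab_size))
    return total

MODEL = train_bigram(CORPUS)
dog_score = log_prob("I take my dog for a walk", *MODEL)
banana_score = log_prob("I take my banana for a walk", *MODEL)
print(dog_score > banana_score)
```

In this toy corpus, "dog" follows "my" and precedes "for" more often than "banana" does, so the dog sentence gets the higher score. A neural language model makes the same kind of comparison, just with far richer context.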

Like most language models, GPT-3 is trained on an unlabeled text dataset (in this case, largely Common Crawl). During training, the model sees a stretch of text and must predict the word that comes next, using only the preceding words as context. It’s a simple training task that results in a powerful and generalizable model.
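"Unlabeled" is worth a quick illustration: in the next-word-prediction setup used by GPT-style models, the text itself supplies the answers, so nobody has to annotate anything. The sketch below turns one sentence into (context, target) training pairs. The function name `next_word_examples`, the whitespace tokenization, and the `context_size` parameter are simplifying assumptions; real models use subword tokens and much longer contexts.

```python
def next_word_examples(text, context_size=3):
    """Turn raw text into (context, next word) training pairs."""
    tokens = text.split()  # naive whitespace tokenization, for illustration only
    examples = []
    for i in range(1, len(tokens)):
        context = tokens[max(0, i - context_size):i]  # the words seen so far
        target = tokens[i]                            # the word to predict
        examples.append((context, target))
    return examples

pairs = next_word_examples("I take my dog for a walk")
for context, target in pairs:
    print(" ".join(context), "->", target)
```

Every sentence in the dataset yields several training examples this way, which is part of why web-scale text is such a rich training signal.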

The GPT-3 model architecture itself is a transformer-based neural network. This architecture became popular about 2–3 years ago, and is the basis for the popular NLP model BERT. From an architecture perspective, GPT-3 is not actually very novel! So what makes it so special and magical?

IT’S REALLY BIG. I mean really big. With 175 billion parameters, it’s the largest language model created to date (GPT-2 had only 1.5 billion parameters!), and was trained on the largest dataset of any language model. This, it appears, is the main reason GPT-3 is so impressive.

And here’s the magical part. As a result, GPT-3 can do what no other model can do (well): perform *specific* tasks without any special tuning. You can ask GPT-3 to be a translator, a programmer, a poet, or a famous author, and it can do it with fewer than 10 training examples. Damn.

Most other models (like BERT) require an elaborate fine-tuning step, where you gather thousands of examples of (say) French-English sentence pairs to teach it how to do translation. With GPT-3, you don’t need to do that fine-tuning step. This is the heart of it. This is what gets people excited about GPT-3: custom language tasks without training data.
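In practice, "fewer than 10 training examples" usually means the examples are written straight into the text you hand the model, and the model continues the pattern. Here is a minimal sketch of assembling such a prompt for translation. The function name `build_few_shot_prompt` and the "English:/French:" layout are illustrative assumptions, not an official format, and the code deliberately stops short of calling any model or API.

```python
def build_few_shot_prompt(task, examples, query):
    """Assemble a plain-text prompt: task description, worked examples, new input."""
    lines = [task, ""]
    for source, target in examples:
        lines.append(f"English: {source}")
        lines.append(f"French: {target}")
        lines.append("")
    lines.append(f"English: {query}")
    lines.append("French:")  # left open for the model to complete
    return "\n".join(lines)

prompt = build_few_shot_prompt(
    "Translate English to French.",
    [("Good morning.", "Bonjour."), ("Thank you.", "Merci.")],
    "I take my dog for a walk.",
)
print(prompt)
```

The contrast with fine-tuning is that nothing about the model changes here: the two example pairs live only in the prompt, and swapping them out turns the same model to a different task.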

Today, GPT-3 is in private beta, but boy can I not wait to get my hands on it.

