GPT language generation models: GPT-3 sparks a new programming revolution
Substantial enthusiasm surrounds OpenAI’s GPT-3 language model, recently made accessible to beta users of the “OpenAI API”.
It seems like only last year that we were arguing about whether the slow-release rollout of the 1.5 billion parameter Generative Pretrained Transformer-2 (GPT-2) was reasonable. If the debate seems recent, that’s because it is (writing from 2020): the notorious GPT-2 model was announced by OpenAI in February 2019, but it wasn’t fully released until nearly 9 months later (although it was replicated before that). The release schedule was admittedly somewhat experimental, meant more to foster discussion of responsible open publishing than as a last-ditch effort to avert the AI apocalypse. That didn’t stop critics from questioning the publicity-boosting advantages of an ominous release cycle.
All that is a bit moot by now because not only has OpenAI trained a much larger language model in GPT-3, but you can sign up to access it through their new API. Comparing GPT-3 to GPT-2 is like comparing apples to, well, raisins, because the model is about that much larger. While GPT-2 weighed in at a measly 1.542 billion parameters (with smaller release versions of 117, 345, and 762 million), the full-sized GPT-3 has 175 billion parameters. GPT-3 was also matched with a larger dataset for pre-training: 570GB of text compared to 40GB for GPT-2.
Approximate size comparison of GPT-2, represented by a human skeleton, and GPT-3 approximated by the bones of a Tyrannosaurus rex. Illustration by William Matthew in the public domain, published in 1905. GPT-3 has more than 100x more parameters than GPT-2.
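A quick back-of-the-envelope check of those ratios, using only the figures quoted above (illustrative Python, nothing model-specific):

    # Rough scale comparison, using the figures quoted above.
    gpt2_params = 1.542e9    # full-sized GPT-2
    gpt3_params = 175e9      # full-sized GPT-3
    gpt2_data_gb = 40        # pre-training text for GPT-2
    gpt3_data_gb = 570       # pre-training text for GPT-3

    print(f"parameters: {gpt3_params / gpt2_params:.0f}x")           # ~113x
    print(f"pre-training data: {gpt3_data_gb / gpt2_data_gb:.1f}x")  # ~14x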
GPT-3 is the largest natural language processing (NLP) transformer released to date, eclipsing the previous record holder, Microsoft Research’s Turing-NLG at 17B parameters, by about 10 times. Unsurprisingly there has been plenty of excitement surrounding the model, and, given the plethora of GPT-3 demonstrations on Twitter and elsewhere, OpenAI has apparently been pretty accommodating in providing beta access to the new API. This has resulted in an explosion of demos: some good, some bad, all interesting. Some of these demos are now being touted as soon-to-be-released products, and in some cases may actually be useful. One thing’s for certain, NLP has come a long way from the days when naming guinea pigs or writing nonsensical sci-fi scripts were killer apps.
Unsurprisingly, several passable blog posts have been written with the help of GPT-3, as experimenters get access to the API and try things out. Almost certainly the most thorough and visible investigation into GPT-3 for creative writing comes from Gwern Branwen at gwern.net. Having followed NLP progress at OpenAI over the years, Gwern describes GPT-1 as “adorable,” GPT-2 as “impressive,” and GPT-3 as “scary” in their varied capabilities to mimic human language and style in text. Gwern has spent a substantial amount of time exploring the capabilities of GPT-3 and its predecessors, and the resulting musings on the current generation of GPT models and what might be holding them back are worth a read.
The OpenAI API does not currently facilitate a way of directly fine-tuning or training the GPT-3 model for specific tasks. Gwern argues, however, that the ability of GPT-3 to mimic writing styles and generate different types of output merely from a dialogue-like interaction with the experimenter amounts to a kind of emergent meta-learning. This wasn’t present in GPT-2, and Gwern posits the transformer attention mechanism as the means to facilitate this capability.
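To make that concrete, here is a minimal sketch of the prompt-only workflow, assuming the 2020-era openai Python client and beta API access; the few-shot examples and sampling parameters below are made up for illustration, and everything task-specific lives in the prompt rather than in any weight update:

    import openai

    openai.api_key = "YOUR_API_KEY"  # placeholder for a beta API key

    # A dialogue-like, few-shot prompt: the "training" happens entirely in-context.
    prompt = (
        "English: The weather is nice today.\n"
        "French: Il fait beau aujourd'hui.\n"
        "English: Where is the library?\n"
        "French:"
    )

    response = openai.Completion.create(
        engine="davinci",   # the largest engine exposed by the beta API
        prompt=prompt,
        max_tokens=32,
        temperature=0.3,
        stop=["\n"],
    )
    print(response["choices"][0]["text"].strip())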
“Certainly, the quality of GPT-3’s average prompted poem appears to exceed that of almost all teenage poets.”
Gwern Branwen
Whatever the mechanism, GPT-3 is so immense, and trained on such a large corpus of data, that it can use prompts alone to do things that GPT-2 might be able to do, albeit comparatively poorly, only with substantive fine-tuning of weights. For example, Gwern finds that the prompt “An essay by Gwern Branwen (gwern.net) on” produces an eerie imitation of the blog’s writing style (a sketch of this setup follows the list below). Gwern’s experimentation mostly revolves around various creative writing tasks, and includes:
an attempt at replicating the dialogue from the “Turing test” paper Computing Machinery and Intelligence (Turing 1950),
a critique of deep learning and AI in the combined style of Gary Marcus and Douglas Hofstadter,
and an attempt to write literary parodies, poetry, and overly complicated explanations, among many others.
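The style-imitation case mentioned above looks much the same in code; a minimal sketch, again assuming the 2020-era openai client, with illustrative sampling parameters and the model left to pick its own essay topic:

    import openai

    openai.api_key = "YOUR_API_KEY"  # placeholder

    # Style imitation via a prompt prefix alone; no fine-tuning involved.
    # The model continues the title (choosing a topic) and then the essay itself.
    prompt = "An essay by Gwern Branwen (gwern.net) on"

    response = openai.Completion.create(
        engine="davinci",
        prompt=prompt,
        max_tokens=256,     # illustrative length
        temperature=0.7,    # illustrative; higher values give more varied prose
    )
    essay_sample = prompt + response["choices"][0]["text"]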
One favorite experiment is meant to test some of GPT-2’s shortcomings in common sense described by Gary Marcus in a recent article on The Gradient:
If you drink hydrochloric acid by the bottle full you will probably
… die. You might also die if you drink vinegar by the bottle full. If you drink a bottle of water by the bottle full, you might end up burping, but you will most likely be okay.
…die, which makes perfect sense.
…die before you can finish remembering the Order of the Eastern Star, your social security number, and what you had for breakfast this morning.
*Prompt in bold, several completions in italics (from Gwern’s experiments).
Gwern’s work concludes that it doesn’t really matter whether GPT-3 is never wrong or always works as desired (it is often wrong in some way). Instead, all that matters is whether it is right sometimes and works often enough to be useful. This is reminiscent of Alex Irpan’s conclusions about the shortcomings of reinforcement learning (RL). Practically, it doesn’t matter to a stock trading firm whether an RL algorithm stably produces effective agent policies across 5 different random seeds. They’ll just pick the one that works and run with it. The same goes for generated text from GPT-3.
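In code, that “pick the one that works” workflow is just sampling several completions and keeping whichever one passes muster; a minimal sketch, again assuming the 2020-era openai client, with a hypothetical is_useful() check standing in for human judgment or a task-specific filter:

    import openai

    openai.api_key = "YOUR_API_KEY"  # placeholder

    def is_useful(text: str) -> bool:
        # Hypothetical placeholder: in practice a human reviewer or a
        # task-specific heuristic decides which completions are good enough.
        return len(text.strip()) > 0

    response = openai.Completion.create(
        engine="davinci",
        prompt="If you drink hydrochloric acid by the bottle full you will probably",
        max_tokens=60,
        temperature=0.8,
        n=5,                # sample five candidate completions in one call
    )

    candidates = [choice["text"] for choice in response["choices"]]
    keepers = [text for text in candidates if is_useful(text)]
    # It doesn't matter that some candidates are bad, only that at least one is usable.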
Many startups, researchers, and tinkerers already had ambitious projects that used GPT-2, and many of these have since switched to GPT-3 with a range of results. These upgrades include the transformer-based text adventure game generator, AI Dungeon, as well as chatbots and other ideas.
AI Dungeon is a text-based adventure game, originally built on GPT-2. It’s a lot of fun but, much like classic games in the genre, much of the appeal is in generating absurd situations (e.g. “eat mailbox”). That’s actually a good match between the desired user experience and the capabilities of GPT-2, which tends to write stories firmly entrenched in the realm of the absurd. With GPT-3, the interactive novel experience is substantially more established. The narrative is more fluid and coherent, but it does still sometimes switch the focus of the plot in weird ways and make many other subtle choices that might seem strange to a human reader. I think the difference between AI Dungeon with GPT-3 (aka the “Dragon” model on AI Dungeon) doing the heavy lifting versus using GPT-2 (the “Griffin” model) can best be summarized by this interaction with GPT-3 in a custom story setting. Personal prompts are in bold, GPT-3 generated text is italicized.
You are an artificial intelligence enthusiast working on an article highlighting the capabilities of a massive new language model called GPT-3, especially as compared to its smaller predecessor GPT-2. GPT-3 has increased the number of parameters more than 100-fold over GPT-2, from 1.5 billion to 175 billion parameters. As a result, the new model can generate text that reads eerily like a human. For example, prompting GPT-3 with the t