# Google's new trillion-parameter AI language model is almost 6 times bigger than GPT-3

A trio of researchers from the Google Brain team recently unveiled the next big thing in AI language models: a massive one trillion-parameter transformer system.

The next biggest model out there, as far as we're aware, is [OpenAI's GPT-3](https://thenextweb.com/neural/2020/07/23/openais-new-gpt-3-language-explained-in-under-3-minutes-syndication/), which uses a measly 175 billion parameters.

**Background:** Language models are capable of performing a variety of functions, but perhaps the most popular is the generation of novel text. For example, you can go [here](https://philosopherai.com/) and talk to a "philosopher AI" language model that'll attempt to answer any question you ask it (with numerous [notable exceptions](https://thenextweb.com/neural/2020/08/24/this-philosopher-ai-has-its-own-existential-questions-to-answer/)).

While these incredible AI models exist at the cutting edge of machine learning technology, it's important to remember that they're essentially just performing parlor tricks. These systems [don't understand language](https://www.technologyreview.com/2020/07/20/1005454/openai-machine-learning-language-generator-gpt-3-nlp/); they're just fine-tuned to make it look like they do.

That's where the number of parameters comes in: the more virtual knobs and dials you can twist and tune to achieve the desired outputs, the finer the control you have over what that output is.
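To make "parameters" concrete: each one is a single adjustable number in the network's weight matrices and bias vectors. Here's a minimal, purely illustrative Python sketch; the layer sizes are made up and don't correspond to GPT-3's or Google's actual architecture:

```python
# Each learned weight and bias is one "parameter". A single dense
# (fully connected) layer mapping d_in inputs to d_out outputs has
# d_in * d_out weights plus d_out biases.
d_in, d_out = 1024, 4096                          # made-up sizes for illustration
layer_params = d_in * d_out + d_out
print(f"one dense layer: {layer_params:,} parameters")  # 4,198,400

# Big models just stack enough such blocks that the total reaches
# billions or trillions of adjustable numbers.
print(f"ratio: {1e12 / 175e9:.1f}x")              # ~5.7x, the "almost 6 times" in the headline
```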
**What Google's done:** Put simply, the Brain team has figured out a way to keep the model itself as simple as possible while squeezing in as much raw compute power as it can, so that the increased parameter count remains feasible to train. In other words, Google has *a lot* of money, and that means it can afford to use as much hardware compute as the AI model can conceivably harness.

In the team's [own words](https://arxiv.org/pdf/2101.03961.pdf):

> Switch Transformers are scalable and effective natural language learners. We simplify Mixture of Experts to produce an architecture that is easy to understand, stable to train and vastly more sample efficient than equivalently-sized dense models. We find that these models excel across a diverse set of natural language tasks and in different training regimes, including pre-training, fine-tuning and multi-task training. These advances make it possible to train models with hundreds of billion to trillion parameters and which achieve substantial speedups relative to dense T5 baselines.
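The key trick, per the paper, is sparsity: a Mixture-of-Experts layer holds many feed-forward "experts" but routes each token to exactly one of them, so the parameter count grows without growing the compute spent per token. Below is a minimal sketch of that top-1 routing idea in PyTorch; the class name, layer sizes, and expert shapes are invented for illustration, and this is not Google's implementation (which, per the paper, is built on Mesh TensorFlow and shards experts across hardware):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SwitchFFN(nn.Module):
    """Top-1 ("switch") routing over a pool of feed-forward experts.

    Illustrative sketch only: names and sizes are invented, not taken
    from Google's code.
    """
    def __init__(self, d_model=512, d_ff=2048, num_experts=8):
        super().__init__()
        self.router = nn.Linear(d_model, num_experts)  # token -> expert logits
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, d_ff), nn.ReLU(),
                          nn.Linear(d_ff, d_model))
            for _ in range(num_experts)
        ])

    def forward(self, x):                      # x: (num_tokens, d_model)
        probs = F.softmax(self.router(x), dim=-1)
        gate, expert_idx = probs.max(dim=-1)   # pick ONE expert per token
        out = torch.zeros_like(x)
        for i, expert in enumerate(self.experts):
            sel = expert_idx == i              # tokens routed to expert i
            if sel.any():
                # only these tokens pay the compute cost of expert i
                out[sel] = gate[sel].unsqueeze(-1) * expert(x[sel])
        return out

tokens = torch.randn(16, 512)                  # 16 token embeddings
print(SwitchFFN()(tokens).shape)               # torch.Size([16, 512])
```

This is why parameter count and per-token compute decouple: adding experts multiplies the weights, but each token still passes through only one expert plus the small router.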
**Quick take:** It's unclear exactly what this means or what Google intends to do with the techniques described in the pre-print paper. There's more to this model than just one-upping OpenAI, but exactly how Google or its clients could use the new system is a bit muddy.

The big idea here is that enough brute force will lead to better compute-use techniques, which will in turn make it possible to do more with less compute. But the current reality is that these systems don't tend to justify their existence when compared to greener, more useful technologies. It's hard to pitch an AI system that can only be operated by trillion-dollar tech companies willing to ignore the massive carbon footprint a system this big creates.

**Context:** Google's pushed the limits of what AI can do for years, and this is no different. Taken by itself, the achievement appears to be the logical progression of what's been happening in the field. But the timing *is* a bit suspect.

> FYI [@mmitchell_ai](https://twitter.com/mmitchell_ai) and I found out there was a 40 person meeting in September about LLMs at Google where no one from our team was invited or knew about this meeting. So they only want ethical AI to be a rubber stamp after they decide what they want to do in their playground. https://t.co/tlT0tj1sTt
>
> — Timnit Gebru (@timnitGebru) [January 13, 2021](https://twitter.com/timnitGebru/status/1349389412791148546)

*H/t: [VentureBeat](https://venturebeat.com/2021/01/12/google-trained-a-trillion-parameter-ai-language-model/)*

Published January 13, 2021 — 17:08 UTC

[Source](https://thenextweb.com/neural/2021/01/13/googles-new-trillion-parameter-ai-language-model-is-almost-6-times-bigger-than-gpt-3/)