{"id":2935,"date":"2021-02-09T17:03:46","date_gmt":"2021-02-09T17:03:46","guid":{"rendered":"https:\/\/thenextweb.com\/?p=1338247"},"modified":"2021-02-09T17:03:46","modified_gmt":"2021-02-09T17:03:46","slug":"microsoft-says-its-developed-the-most-comprehensive-spelling-correction-system-ever-made","status":"publish","type":"post","link":"https:\/\/www.londonchiropracter.com\/?p=2935","title":{"rendered":"Microsoft says it\u2019s developed \u2018the most comprehensive spelling correction system ever made\u2019"},"content":{"rendered":"\n<p><span>Microsoft has unveiled an AI system called Speller100 that corrects spelling in over 100 languages used in search queries on Bing.<\/span><\/p>\n<p>\u201c<span>We believe Speller100 is the most comprehensive spelling correction system ever made in terms of language coverage and accuracy,\u201d the company said in <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/blog\/speller100-zero-shot-spelling-correction-at-scale-for-100-plus-languages\/\" target=\"_blank\" rel=\"nofollow noopener noreferrer\">a blog post<\/a>.<\/span><\/p>\n<figure class=\"post-image post-mediaBleed aligncenter\"><img decoding=\"async\" loading=\"lazy\" class=\"size-full wp-image-1338265 lazy\" src=\"https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2021\/02\/Screenshot-2021-02-09-at-15.18.17.png\" alt=\"Microsoft says Speller100 has improved quality in numerous low- and no-resource languages, such as Macedonian, Belarusian, and Pashto.\" width=\"1640\" height=\"844\" sizes=\"(max-width: 1640px) 100vw, 1640px\" data-lazy=\"true\" data-srcset=\"https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2021\/02\/Screenshot-2021-02-09-at-15.18.17.png 1640w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2021\/02\/Screenshot-2021-02-09-at-15.18.17-280x144.png 280w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2021\/02\/Screenshot-2021-02-09-at-15.18.17-525x270.png 525w, 
https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2021\/02\/Screenshot-2021-02-09-at-15.18.17-262x135.png 262w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2021\/02\/Screenshot-2021-02-09-at-15.18.17-796x410.png 796w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2021\/02\/Screenshot-2021-02-09-at-15.18.17-1592x819.png 1592w\"><figcaption>Credit: Microsoft<\/figcaption><figcaption>Microsoft says Speller100 has improved corrections in numerous low- and no-resource languages, such as Macedonian, Belarusian, and Pashto.<\/figcaption><\/figure>\n<p>Bing previously provided high-quality spelling corrections for around two dozen languages. 
However, it didn\u2019t have enough&nbsp;training data to work well on languages with little web presence and user feedback.<\/p>\n<p>Speller100 overcomes these limitations by looking for similarities in large language families.<\/p>\n<figure class=\"post-image post-mediaBleed aligncenter\"><img decoding=\"async\" loading=\"lazy\" class=\"size-full wp-image-1338267 lazy\" src=\"https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2021\/02\/Screenshot-2021-02-09-at-15.39.04.png\" alt=\"Germanic languages have many orthographic similarities.\" width=\"1318\" height=\"362\" sizes=\"(max-width: 1318px) 100vw, 1318px\" data-lazy=\"true\" data-srcset=\"https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2021\/02\/Screenshot-2021-02-09-at-15.39.04.png 1318w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2021\/02\/Screenshot-2021-02-09-at-15.39.04-280x77.png 280w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2021\/02\/Screenshot-2021-02-09-at-15.39.04-540x148.png 540w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2021\/02\/Screenshot-2021-02-09-at-15.39.04-270x74.png 270w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2021\/02\/Screenshot-2021-02-09-at-15.39.04-796x219.png 796w\"><figcaption>Credit: Microsoft<\/figcaption><figcaption>The system uses orthographic similarities in language families such as Germanic.<\/figcaption><\/figure>\n<p>It also applies zero-shot learning to correct errors without needing extra language-specific labeled training data.<\/p>\n<p>Microsoft said it built around a dozen language family-based models to maximize the zero-shot benefit:<\/p>\n<blockquote readability=\"13\">\n<p><span>Imagine someone had taught you how to spell in English and you automatically learned to also spell in German, Dutch, Afrikaans, Scots, and Luxembourgish.&nbsp;<\/span><em>That&nbsp;<\/em><span>is what zero-shot learning enables, and it is a key component in Speller100 that allows us to expand to languages with very little to no data.<\/span><\/p>\n<\/blockquote>\n<p>The system also reduces the need for human-labeled annotations by extracting text from web pages to generate common errors.<\/p>\n<figure class=\"post-image post-mediaBleed aligncenter\"><img decoding=\"async\" loading=\"lazy\" class=\"size-full wp-image-1338266 lazy\" src=\"https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2021\/02\/Screenshot-2021-02-09-at-15.34.18.png\" alt=\"Microsoft designed noise functions to generate common errors of rotation, insertion, deletion, and replacement.\" width=\"1242\" height=\"190\" sizes=\"(max-width: 1242px) 100vw, 1242px\" data-lazy=\"true\" data-srcset=\"https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2021\/02\/Screenshot-2021-02-09-at-15.34.18.png 1242w, 
https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2021\/02\/Screenshot-2021-02-09-at-15.34.18-280x43.png 280w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2021\/02\/Screenshot-2021-02-09-at-15.34.18-540x83.png 540w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2021\/02\/Screenshot-2021-02-09-at-15.34.18-270x41.png 270w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2021\/02\/Screenshot-2021-02-09-at-15.34.18-796x122.png 796w\"><figcaption>Credit: Microsoft<\/figcaption><figcaption>Speller100 uses noise functions to produce typical errors of rotation, insertion, deletion, and replacement.<span><\/span><\/figcaption><\/figure>\n<p class>\u201cThis text can easily be extracted through web crawling, and there is a sufficient amount of text for the training of hundreds of languages,\u201d Microsoft said.<\/p>\n<p class>In tests, Speller100 reduced the number of pages with no results by up to 30%.&nbsp;It also increased the click-through rate on spelling suggestions from single digits to 67%.<\/p>\n<p>Microsoft said shipping the system to Bing is just the first step. The company plans to&nbsp;add&nbsp;the tech to \u201cmany more\u201d of its products in the near future.<\/p>\n<p class=\"c-post-pubDate\"> Published February 9, 2021 \u2014 17:03 UTC <\/p>\n<p> <a href=\"https:\/\/thenextweb.com\/neural\/2021\/02\/09\/microsoft-says-its-developed-the-most-comprehensive-spelling-correction-system-ever-made\/\">Source<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Microsoft has unveiled an AI system called Speller100 that corrects spelling in over 100 languages used in search queries on Bing. 
\u201cWe believe Speller100 is the most comprehensive spelling correction system ever&#8230;<\/p>\n","protected":false},"author":1,"featured_media":2936,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":[],"categories":[1],"tags":[],"_links":{"self":[{"href":"https:\/\/www.londonchiropracter.com\/index.php?rest_route=\/wp\/v2\/posts\/2935"}],"collection":[{"href":"https:\/\/www.londonchiropracter.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.londonchiropracter.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.londonchiropracter.com\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.londonchiropracter.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=2935"}],"version-history":[{"count":0,"href":"https:\/\/www.londonchiropracter.com\/index.php?rest_route=\/wp\/v2\/posts\/2935\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.londonchiropracter.com\/index.php?rest_route=\/wp\/v2\/media\/2936"}],"wp:attachment":[{"href":"https:\/\/www.londonchiropracter.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=2935"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.londonchiropracter.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=2935"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.londonchiropracter.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=2935"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}