{"id":621778,"date":"2023-03-25T09:48:56","date_gmt":"2023-03-25T14:48:56","guid":{"rendered":"https:\/\/news.sellorbuyhomefast.com\/index.php\/2023\/03\/25\/databricks-debuts-chatgpt-like-dolly-a-clone-any-enterprise-can-own\/"},"modified":"2023-03-25T09:48:56","modified_gmt":"2023-03-25T14:48:56","slug":"databricks-debuts-chatgpt-like-dolly-a-clone-any-enterprise-can-own","status":"publish","type":"post","link":"https:\/\/newsycanuse.com\/index.php\/2023\/03\/25\/databricks-debuts-chatgpt-like-dolly-a-clone-any-enterprise-can-own\/","title":{"rendered":"Databricks debuts ChatGPT-like Dolly, a clone any enterprise can own"},"content":{"rendered":"<div>\n<section>\n<p><time title=\"2023-03-24T18:46:44+00:00\" datetime=\"2023-03-24T18:46:44+00:00\">March 24, 2023 11:46 AM<\/time>\n\t\t\t<\/p>\n<\/section>\n<div>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"750\" height=\"469\" src=\"https:\/\/venturebeat.com\/wp-content\/uploads\/2023\/03\/Untitled-design-29.png?fit=750%2C469&#038;strip=all\" alt=\"Databricks\"><\/p>\n<p><span>Image by Canva Pro<\/span><\/p>\n<\/p><\/div>\n<\/p><\/div>\n<div id=\"primary\" role=\"main\">\n<article id=\"post-2865143\">\n<div>\n<div id=\"boilerplate_2682874\">\n<p><em>Join top executives in San Francisco on July 11-12, to hear how leaders are integrating and optimizing AI investments for success<\/em>. <em><a href=\"https:\/\/avolio.swapcard.com\/Transform2023\/registrations\/Start?utm_source=vb&#038;utm_medium=boiler&#038;utm_content=landingpage&#038;utm_campaign=T23_BoilerPlates\">Learn More<\/a><\/em><\/p>\n<hr>\n<\/div>\n<p>Was data lakehouse platform <a href=\"https:\/\/www.databricks.com\/\">Databricks<\/a> becoming an OpenAI rival on anyone\u2019s 2023 bingo card? Well, hello, Dolly. 
<\/p>\n<p>Today, in an effort the company says is meant to build on its longtime mission to democratize AI for the enterprise, Databricks released the code for an open-source large language model (LLM) called Dolly \u2014 named after\u00a0<a href=\"https:\/\/dolly.roslin.ed.ac.uk\/facts\/the-life-of-dolly\/index.html\" target=\"_blank\" rel=\"noreferrer noopener\">Dolly the sheep<\/a>, the first cloned mammal \u2014 that it said companies can use to create instruction-following chatbots similar to <a href=\"https:\/\/venturebeat.com\/ai\/openai-turns-chatgpt-into-a-platform-overnight-with-addition-of-plugins\/\">ChatGPT<\/a>. <\/p>\n<p>The model can be trained, the company <a href=\"https:\/\/www.databricks.com\/blog\/2023\/03\/24\/hello-dolly-democratizing-magic-chatgpt-open-models.html\">explained in a blog post<\/a>, on very little data and in very little time. \u201cWith 30 bucks, one server and three hours, we\u2019re able to teach [Dolly] to start doing human-level interactivity,\u201d said Databricks CEO Ali Ghodsi. <\/p>\n<p>There are many reasons a company would prefer to build its own LLM rather than send data to a centralized LLM provider that serves a proprietary model behind an API, the blog post explained. Handing sensitive data over to a third party may not be an option, and organizations may have specific needs for model quality, cost and desired behavior. 
<\/p>\n<p><html><body><\/p>\n<p><\/body><\/p>\n<p>\u201cWe believe that most ML users are best served long term by directly owning their models,\u201d said the blog post. <\/p>\n<h2 id=\"h-databricks-found-chatgpt-like-qualities-don-t-require-latest-or-largest-llm\">Databricks found ChatGPT-like qualities don\u2019t require latest or largest LLM<\/h2>\n<p>In the announcement, Databricks said Dolly is meant to show that anyone \u201ccan take a dated off-the-shelf open source large language model and give it magical ChatGPT-like instruction.\u201d Surprisingly, it said, instruction-following does not seem to require the latest or largest models \u2014\u00a0Dolly has only 6 billion parameters, compared to 175 billion for GPT-3. <\/p>\n<p>\u201cWe\u2019ve been calling ourselves a data and AI company since 2013, and we have close to 1000 customers that have been using some kind of large language model on Databricks,\u201d said Ghodsi, who told VentureBeat he was \u201cblown away\u201d when ChatGPT was launched at the end of November 2022, but realized that only a few companies on the planet have the massive language models necessary for ChatGPT-level ability. <\/p>\n<p>\u201cMost people were thinking, do we have to all leverage these proprietary models that these very few companies have? And if so, do we have to give them our data?\u201d he said. 
<\/p>\n<p>The answer to both of those questions is no: In February, Meta released the weights for a set of high-quality (but not instruction-following) language models called\u00a0<a href=\"https:\/\/ai.facebook.com\/blog\/large-language-model-llama-meta-ai\/\" target=\"_blank\" rel=\"noreferrer noopener\">LLaMA<\/a>\u00a0to academic researchers, trained for over 80,000 GPU-hours each. Then, in March, Stanford built the\u00a0<a href=\"https:\/\/crfm.stanford.edu\/2023\/03\/13\/alpaca.html\" target=\"_blank\" rel=\"noreferrer noopener\">Alpaca<\/a>\u00a0model, which was based on LLaMA, but tuned on a small dataset of 50,000 human-like questions and answers that, surprisingly, made it exhibit ChatGPT-like interactivity.<\/p>\n<p>Inspired by those two options, Databricks was able to take an existing open source\u00a0<a href=\"https:\/\/huggingface.co\/EleutherAI\/gpt-j-6B\" target=\"_blank\" rel=\"noreferrer noopener\">6 billion parameter model<\/a>\u00a0from\u00a0<a href=\"https:\/\/www.eleuther.ai\/\" target=\"_blank\" rel=\"noreferrer noopener\">EleutherAI<\/a>\u00a0and slightly modify it to elicit instruction-following capabilities such as brainstorming and text generation not present in the original model, using data from Alpaca.<\/p>\n<p>Surprisingly, the modified model worked very well. According to the blog post, this suggests that \u201cmuch of the qualitative gains in state-of-the-art models like ChatGPT may owe to focused corpuses of instruction-following training data, rather than larger or better-tuned base models.\u201d <\/p>\n<h2 id=\"h-llm-models-will-not-be-the-hands-of-only-a-few-companies\">LLM models will not be in the hands of only a few companies<\/h2>\n<p>Ghodsi said that going forward there will be many more LLM models that will become cheaper and cheaper \u2014\u00a0and won\u2019t be in the hands of only a few companies. <\/p>\n<p>\u201cEvery organization on the planet will probably utilize these,\u201d he said. 
\u201cOur belief is that in every industry, the winning, leading companies will be data and AI companies that will be leveraging this kind of technology and will have these kinds of models.\u201d <\/p>\n<p>\t\t\t\t<\/html><\/div>\n<\/p><\/div>\n<p><a href=\"https:\/\/venturebeat.com\/ai\/databricks-debuts-chatgpt-like-dolly-a-clone-any-enterprise-can-own\/\" class=\"button purchase\" rel=\"nofollow noopener\" target=\"_blank\">Read More<\/a><br \/>\n Sharon Goldman<\/p>\n","protected":false},"excerpt":{"rendered":"<p>March 24, 2023 11:46 AM Image by Canva Pro Join top executives in San Francisco on July 11-12, to hear how leaders are integrating and optimizing AI investments for success. Learn More Was data lakehouse platform Databricks becoming an OpenAI rival on anyone\u2019s 2023 bingo card? Well, hello, Dolly. 
Today, in an effort the company<\/p>\n","protected":false},"author":1,"featured_media":621779,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[91505,3105,46],"tags":[],"class_list":{"0":"post-621778","1":"post","2":"type-post","3":"status-publish","4":"format-standard","5":"has-post-thumbnail","7":"category-databricks","8":"category-debuts","9":"category-technology"},"aioseo_notices":[],"_links":{"self":[{"href":"https:\/\/newsycanuse.com\/index.php\/wp-json\/wp\/v2\/posts\/621778","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/newsycanuse.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/newsycanuse.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/newsycanuse.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/newsycanuse.com\/index.php\/wp-json\/wp\/v2\/comments?post=621778"}],"version-history":[{"count":0,"href":"https:\/\/newsycanuse.com\/index.php\/wp-json\/wp\/v2\/posts\/621778\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/newsycanuse.com\/index.php\/wp-json\/wp\/v2\/media\/621779"}],"wp:attachment":[{"href":"https:\/\/newsycanuse.com\/index.php\/wp-json\/wp\/v2\/media?parent=621778"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/newsycanuse.com\/index.php\/wp-json\/wp\/v2\/categories?post=621778"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/newsycanuse.com\/index.php\/wp-json\/wp\/v2\/tags?post=621778"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}