{"id":619938,"date":"2023-03-20T09:50:15","date_gmt":"2023-03-20T14:50:15","guid":{"rendered":"https:\/\/news.sellorbuyhomefast.com\/index.php\/2023\/03\/20\/no-storage-no-cry-sinking-the-data-storage-barrier\/"},"modified":"2023-03-20T09:50:15","modified_gmt":"2023-03-20T14:50:15","slug":"no-storage-no-cry-sinking-the-data-storage-barrier","status":"publish","type":"post","link":"https:\/\/newsycanuse.com\/index.php\/2023\/03\/20\/no-storage-no-cry-sinking-the-data-storage-barrier\/","title":{"rendered":"No storage, no cry: Sinking the data storage barrier"},"content":{"rendered":"<p>In this age of information, <a href=\"https:\/\/www.nasdaq.com\/articles\/data-privacy%3A-a-business-imperative-for-boards-leaders-that-may-contribute-to-market\" target=\"_blank\" rel=\"noreferrer noopener\">big data<\/a> is increasingly viewed as the lifeblood of any organization. Yet, because data has become so big and varied, properly analyzing it remains a huge challenge for enterprises. 
<\/p>\n<p>As such, the business insights that this essential data should be able to yield instead become either too difficult, time-consuming or costly to produce.<\/p>\n<p>One key challenge is the interaction between storage and analytics solutions and whether they can handle these masses of data \u2014 or is there a way to skip the storage barrier altogether?<\/p>\n<p>The timeline for this explosion in big data can be broken into three distinct periods.<\/p>\n<div>\n<p>First there was simple text file (TXT) storage, followed by relational database management systems (RDBMS), allowing for easier monitoring and interaction with larger data sets.<\/p>\n<p>The third stage \u2014 modern open-source formats like Parquet and Iceberg, which more effectively collect compressed files \u2014 resulted from the fact that the capacity of these databases was outpaced by the data they were tasked to collect and analyze.<\/p>\n<p>Then came the stage where database companies would develop their own storage methods in the form of <a href=\"https:\/\/www.gartner.com\/en\/information-technology\/glossary\/data-warehouse\" target=\"_blank\" rel=\"noreferrer noopener\">data warehouses<\/a>. 
These custom-made, proprietary data storage formats offer better performance and allow data-reliant companies to store their data in ways they can query and handle most effectively.<\/p>\n<p>So, why is data analytics still lagging?<\/p>\n<h2 id=\"h-the-cost-of-data-warehouses\">The cost of data warehouses<\/h2>\n<p>Despite the customization they afford, data warehouse storage formats come with a slew of drawbacks.<\/p>\n<p>These warehouses\u2019 ingestion protocols require enterprise data to undergo pre-processing before entering the warehouse, so queries are delayed. There is also no single source of \u201ctruth,\u201d as the sync process between the originating storage location (where data, still in its raw format, is created) and the data warehouse is complex and can skew datasets.<\/p>\n<p>Vendor lock-in is another issue, as the queryable data in any storage format is often accessible to only one application, and thus not always compatible with the various tools required for data analytics. Lastly, any time a department wants to analyze its data, the data sources need to be duplicated, which can result in convoluted and sometimes impossible data sharing between different data warehouses.<\/p>\n<p>As these shortcomings become increasingly prominent and pose greater challenges for data-driven enterprises, the fourth chapter of the data storage saga is unfolding.<\/p>\n<p>Enter the \u201c<a href=\"https:\/\/venturebeat.com\/data-infrastructure\/what-is-a-data-lake-definition-benefits-architecture-and-best-practices\/\" target=\"_blank\" rel=\"noreferrer noopener\">data lake<\/a>.\u201d<\/p>\n<h2 id=\"h-diving-into-the-data-lake\">Diving into the data lake<\/h2>\n<p>Unlike a data warehouse (and the walled-in, finite nature that its name implies), a data lake is fluid, deep and wide open. 
For the first time, enterprises of any size can save relevant data, from images to videos to text, in a centralized, scalable, widely accessible storage location.<\/p>\n<p>Because these solutions, with their inlets and tributaries and the fluid nature of their storage formats, are designed not only for data storage but with data sharing and syncing in mind, data lakes aren\u2019t bogged down by vendor lock-in, data duplication challenges or single-source-of-truth complications.<\/p>\n<p>Combined with open-source formats such as Apache Parquet \u2014 which are effective enough to manage the analytic needs across various silos within an organization \u2014 these unique storage systems have empowered enterprises to successfully work within a <a href=\"https:\/\/venturebeat.com\/data-infrastructure\/starburst-accelerates-trino-to-warp-speed-to-accelerate-data-querying\/\" target=\"_blank\" rel=\"noreferrer noopener\">data lake<\/a> architecture and enjoy its performance advantages.<\/p>\n<h2 id=\"h-the-house-on-the-lake\">The house on the lake<\/h2>\n<p>Although data lakes are a promising storage and analytics solution, they are still relatively new. Accordingly, industry experts are still exploring the opportunities and pitfalls that cloud compute capabilities may bring to their storage solutions.<\/p>\n<p>One attempt to overcome the current disadvantages combines data lake capabilities with data warehouse organization and cloud computing \u2014 dubbed the \u201c<a href=\"https:\/\/venturebeat.com\/data-infrastructure\/bringing-order-to-data-lakehouses-onehouse-is-expanding-its-apache-hudi-technology-with-25m-raise\/\" target=\"_blank\" rel=\"noreferrer noopener\">data lakehouse<\/a>\u201d \u2014 essentially a data warehouse floating atop a data lake.<\/p>\n<p>Consider that a data lake is just a collection of files in folders: simple and easy to use, but unable to pull data effectively without a centralized database. 
Even once data warehouses had developed a way to read open-source file formats, the challenges of ingestion delays, vendor lock-in, and the lack of a single source of truth remained.<\/p>\n<p>Data lakehouses, on the other hand, allow enterprises to use a database-like processing engine and semantic layer to query all their data as is, with no excessive transformations and copies, while maintaining the advantages of both methods.<\/p>\n<p>The success of this combined approach to data storage and analytics is already encouraging. Ventana Research VP and research director Matt Aslett <a href=\"https:\/\/mattaslett.ventanaresearch.com\/databricks-lakehouse-platform-maximizes-analytical-value\" target=\"_blank\" rel=\"noreferrer noopener\">predicts<\/a> that by 2024, more than three-quarters of data lake adopters will be investing in data lakehouse technologies to improve the business value of their accumulated data.<\/p>\n<p>Enterprises can now enjoy the analytical advantages of SQL databases as well as the cheap, flexible storage capabilities of a cloud data lake, while still owning their own data and maintaining separate analytical environments for every domain.<\/p>\n<h2 id=\"h-how-deep-does-this-lake-go\">How deep does this lake go?<\/h2>\n<p>As data companies increasingly adopt cloud data lakehouses, more and more enterprises will be able to focus on one of the most critical assets of business today \u2014 complex analytics on big datasets. 
Instead of bringing their data into hosting engines, enterprises will actually be bringing high-level engines to whatever data they need analyzed.<\/p>\n<p>Thanks to the low entry barriers of cloud data lakehouses, where hardware allocation can be achieved in just a few clicks, organizations will have easily accessible data for every conceivable use case.<\/p>\n<p>Data lakehouse vendors will continue to be tested on their ability to deal with bigger datasets without auto-scaling their compute resources to infinity. But even as the technology progresses, the data lakehouse method will remain consistent in its ability to allow data independence and give users the advantages of both data warehouses and data lakes.<\/p>\n<p>The waters of the data lake may seem untested, but it is increasingly apparent that vendors and enterprises that don\u2019t take the plunge won\u2019t fulfill their data potential.<\/p>\n<p><em>Matan Libis is VP of product at <\/em><a href=\"https:\/\/sqream.com\/\" target=\"_blank\" rel=\"noreferrer noopener\"><em>SQream<\/em><\/a>. 
<\/p>\n<\/div>\n<p><a href=\"https:\/\/venturebeat.com\/data-infrastructure\/no-storage-no-cry-sinking-the-data-storage-barrier\/\" class=\"button purchase\" rel=\"nofollow noopener\" target=\"_blank\">Read More<\/a><br \/>\n Matan Libis, SQream<\/p>\n","protected":false},"excerpt":{"rendered":"<p>In this age of information, big data is increasingly viewed as the lifeblood of any organization. 
Yet, because data has become so big and varied, properly analyzing it remains a huge challenge<\/p>\n","protected":false},"author":1,"featured_media":619939,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[4635,23212,46],"tags":[],"class_list":{"0":"post-619938","1":"post","2":"type-post","3":"status-publish","4":"format-standard","5":"has-post-thumbnail","7":"category-sinking","8":"category-storage","9":"category-technology"},"aioseo_notices":[],"_links":{"self":[{"href":"https:\/\/newsycanuse.com\/index.php\/wp-json\/wp\/v2\/posts\/619938","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/newsycanuse.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/newsycanuse.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/newsycanuse.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/newsycanuse.com\/index.php\/wp-json\/wp\/v2\/comments?post=619938"}],"version-history":[{"count":0,"href":"https:\/\/newsycanuse.com\/index.php\/wp-json\/wp\/v2\/posts\/619938\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/newsycanuse.com\/index.php\/wp-json\/wp\/v2\/media\/619939"}],"wp:attachment":[{"href":"https:\/\/newsycanuse.com\/index.php\/wp-json\/wp\/v2\/media?parent=619938"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/newsycanuse.com\/index.php\/wp-json\/wp\/v2\/categories?post=619938"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/newsycanuse.com\/index.php\/wp-json\/wp\/v2\/tags?post=619938"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}