No storage, no cry: Sinking the data storage barrier

In this age of information, big data is increasingly viewed as the lifeblood of any organization. Yet, because data has become so big and varied, properly analyzing it remains a huge challenge for enterprises.

As a result, the business insights that this essential data should yield become too difficult, time-consuming or costly to produce.

One key challenge is the interaction between storage and analytics solutions, and whether they can handle these masses of data. Or is there a way to skip the storage barrier altogether?

The timeline for this explosion in big data can be broken into three distinct periods.

First there was simple text file (TXT) storage, followed by relational database management systems (RDBMS), which allowed for easier monitoring of, and interaction with, larger data sets.

The third stage, modern open-source formats like Parquet and Iceberg that organize data into compressed files more effectively, came about because the data these databases were tasked to collect and analyze had outpaced their capacity.

Then came the stage where database companies would develop their own storage methods in the form of data warehouses. These custom-made, proprietary data storage formats offer better performance and allow data-reliant companies to store their data in ways they can query and handle most effectively.

So, why is data analytics still lagging?

The cost of data warehouses

Despite the customization they afford, data warehouse storage formats come with a slew of drawbacks.

These warehouses’ ingestion protocols require enterprise data to undergo pre-processing before entering the warehouse, so queries are delayed. There is also no single source of “truth,” as the sync process between the originating storage location (where data, still in its raw format, is created) and the data warehouse is complex and can skew datasets.

Vendor lock-in is another issue: queryable data held in a warehouse's storage format is often open to only one application, and thus not always compatible with the various tools required for data analytics. Lastly, any time a department wants to analyze its data, the data sources need to be duplicated, which can make data sharing between different data warehouses convoluted and sometimes impossible.

As these shortcomings become increasingly prominent and pose greater challenges for data-driven enterprises, the fourth chapter of the data storage saga is unfolding. 

Enter the “data lake.” 

Diving into the data lake

Unlike a data warehouse (and the walled-in, finite nature that its name implies), a data lake is fluid, deep and wide open. For the first time, enterprises of any size can save relevant data from images to videos to text in a centralized, scalable, widely accessible storage location.

Because these solutions, with their inlets and tributaries and the fluid nature of their storage formats, are designed not only for data storage but with data sharing and syncing in mind, data lakes aren’t bogged down by vendor lock-in, data duplication challenges or single truth source complications.

Combined with open-source formats such as Apache Parquet files — which are effective enough to manage the analytic needs across various silos within an organization — these unique storage systems have empowered enterprises to successfully work within a data lake architecture and enjoy its performance advantages.

The house on the lake

Although data lakes are a promising storage and analytics solution, they are still relatively new. Accordingly, industry experts are still exploring the opportunities and pitfalls that such cloud compute capabilities may hold for their storage solutions.

One attempt to overcome the current disadvantages combines data lake capabilities with data warehouse organization and cloud computing. Dubbed the “data lakehouse,” it is essentially a data warehouse floating atop a data lake.

Consider that a data lake is just a collection of files in folders: simple and easy to use, but unable to pull data effectively without a centralized database. Even once data warehouses had developed a way to read open-source file formats, the challenges of ingestion delays, vendor lock-in and the lack of a single source of truth remained.

Data lakehouses, on the other hand, allow enterprises to use a database-like processing engine and semantic layer to query all their data as is, with no excessive transformations or copies, while maintaining the advantages of both methods.
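
To make the idea concrete, here is a minimal sketch in Python of a lake as plain files in folders, with a small catalog standing in for the semantic layer and a query function that reads the files where they sit. Everything in it is an assumption made for illustration: the folder layout, the `CATALOG` mapping and the helper names (`write_lake`, `scan`, `total_by`, `demo`) are hypothetical, and CSV stands in for a format like Parquet so that the example needs only the standard library. It is not how any particular lakehouse product works.

```python
import csv
import tempfile
from pathlib import Path


def write_lake(root: Path) -> None:
    """Create a tiny 'lake': raw CSV files in folders, no database involved."""
    files = {
        "sales/region=eu/part-0.csv": [("order_id", "amount"), ("1", "10.5"), ("2", "4.5")],
        "sales/region=us/part-0.csv": [("order_id", "amount"), ("3", "20.0")],
    }
    for rel_path, rows in files.items():
        path = root / rel_path
        path.parent.mkdir(parents=True, exist_ok=True)
        with path.open("w", newline="") as f:
            csv.writer(f).writerows(rows)


# Hypothetical semantic layer: maps a table name to the files that hold it.
CATALOG = {"sales": "sales/*/*.csv"}


def scan(root: Path, table: str):
    """Yield rows of `table` straight from the files, in place.

    The partition folder name (for example 'region=eu') is exposed as a column.
    """
    for path in sorted(root.glob(CATALOG[table])):
        key, _, value = path.parent.name.partition("=")
        with path.open(newline="") as f:
            for row in csv.DictReader(f):
                row[key] = value
                yield row


def total_by(root: Path, table: str, group_col: str, value_col: str) -> dict:
    """Rough equivalent of: SELECT group_col, SUM(value_col) ... GROUP BY group_col."""
    totals: dict = {}
    for row in scan(root, table):
        totals[row[group_col]] = totals.get(row[group_col], 0.0) + float(row[value_col])
    return totals


def demo() -> dict:
    """Build the lake in a temporary directory and run one aggregate over it."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        write_lake(root)
        return total_by(root, "sales", "region", "amount")


if __name__ == "__main__":
    print(demo())
```

The files are never loaded into a separate store; the query logic is brought to them, and any other tool that understands the same layout could read the same files.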

The success of this combined approach to data storage and analytics is already encouraging. Ventana Research VP and research director Matt Aslett predicts that by 2024, more than three-quarters of data lake adopters will be investing in data lakehouse technologies to improve the business value of their accumulated data.

Enterprises can now enjoy the analytical advantages of SQL databases as well as the cheap, flexible storage capabilities of a cloud data lake, while still owning their own data and maintaining separate analytical environments for every domain. 

How deep does this lake go?

As data companies increasingly adopt cloud data lakehouses, more and more enterprises will be able to focus on one of the most critical assets of business today: complex analytics on big datasets. Instead of bringing their data into hosting engines, enterprises will actually be bringing high-level engines to whatever data they need analyzed.

Thanks to the low entry barriers of cloud data lakehouses, where hardware allocation can be achieved in just a few clicks, organizations will have easily accessible data for every conceivable use case.

Data lakehouse vendors will continue to be tested on their ability to deal with bigger datasets without auto-scaling their compute resources to infinity. But even as the technology progresses, the data lakehouse method will remain consistent in its ability to allow data independence and give users the advantages of both data warehouses and data lakes.

The waters of the data lake may seem untested, but it is increasingly apparent that vendors and enterprises that don’t take the plunge won’t fulfill their data potential.

Matan Libis is VP of product at SQream.

DataDecisionMakers

Welcome to the VentureBeat community!

DataDecisionMakers is where experts, including the technical people doing data work, can share data-related insights and innovation.
