By: Piet Loubser
Through 2018, 90% of deployed data lakes will be rendered useless as they’re overwhelmed with information assets captured for uncertain use cases, according to Gartner.1 This is despite growth from pure-play Hadoop vendors like Hortonworks and Cloudera. Join our webcast to learn key steps to accelerate value, based on what we have learned from numerous customers that use Paxata’s Self-Service Data Prep solutions in Azure and other cloud environments.
Why is it so Difficult to Get Value Out of the Data Lake?
Without giving away the entire webcast, a number of issues contribute to the difficulty. I will highlight a few of them here, but join our webcast to learn more about how to overcome them:
- The premise of the data lake requires a new data “lifestyle.” Our traditional Enterprise Data Warehouse (EDW) was modeled, designed, and built with specific predefined questions in mind: we knew the question and then built a dataset to answer it. Easy enough. The data lake, on the other hand, is more about collecting data (any kind of data) and then seeing what you can answer with it.
- Traditional thinking and technologies will not help you much. I refer to it as a “lifestyle” because it requires different thinking, different tools, and in some cases, different people. In this new lifestyle, self-service and empowerment of data analysts, data scientists, and power users are a must. We cannot gate exploratory analytics behind costly, scarce IT resources.
- Designing for the pilot vs. designing for success. The open source world of Hadoop is a wonderful playground with all kinds of tools for different things. While it is a great place to start, these point tools can often be extremely complex and require technical skills that you do not have, or at least do not have enough of. They also often lack the enterprise characteristics you need for success in production: governance, security, and data lineage. Success is not generating the insight; success is when your insight informs every person’s behavior or improves every application.
Still Confused About the Business Value of the Data Lake?
Data is a strategic asset for business, but most of us treat it as a siloed insight that informs one person or one team. It has to touch everyone! The challenge with data is that 80% of the effort can be spent finding, shaping, and cleaning the data for analytics, data science, or maybe a new app.
Paxata Self-Service Data Prep for Microsoft Azure
Moving to the cloud is a given for most companies today, and Microsoft Azure provides a robust, enterprise-ready environment to run your apps and analytical workloads. Through our partnership with Microsoft, Paxata has brought our industry-leading Self-Service Data Preparation solution to the Azure Marketplace to help you get up and running quickly and scale elastically on demand when you need it.
Join Our Webcast
1 Gartner, Derive Value from Data Lakes Using Analytics Design Patterns, Svetlana Sicular, Joao Tapadinhas, Cindi Howson, 26 September 2017.