Data quality, a subset of data intelligence, is a topic of interest to many business executives, with 82% citing data quality as a barrier to their business. With so many data quality solutions taking different approaches currently on the market, how do you choose?
Alation CEO and co-founder Satyen Sangani said that the Alation Open Data Quality Initiative (ODQI) for the modern data stack, announced today, is designed to give customers freedom of choice and flexibility in selecting the data quality and data observability providers best suited to the needs of their modern, data-driven organizations.
The Alation Open Data Quality Framework (ODQF) opens the Alation Data Catalog to any data quality provider in the modern data stack and data management ecosystem. Initially, data quality and data observability providers such as Acceldata, Anomalo, Bigeye, Experian, FirstEigen, Lightup and Soda have joined, as well as industry partners including Capgemini and Fivetran.
Some of them are already Alation partners, while others are new and drawn to the idea of having a standard by which to come together. The company hopes ODQF will become the de facto standard.
From data catalog to data intelligence
Sangani, who has a background in economics and worked in financial analysis and product management at Oracle, co-founded Alation in 2012. However, the company remained in stealth until 2015, working with a small number of customers to define what the product was and what the company was really trying to achieve, and for whom.
Sangani’s experience also informs Alation’s approach. He said that selling large-scale packages to large companies to help them analyze their data showed him that those companies often did not really understand their data:
“Two years, hundreds of millions of dollars will be spent… and often most of that time is spent determining which systems have the right data, how the data is used, what the data means,” Sangani said. “There are often multiple copies of data and conflicting records. And the people who understand systems and data models are often outside the company.”
Sangani came to realize that data models, schemas and the like present more of a knowledge management problem than a technical one. He said he believes the problem combines aspects of human psychology with a didactic aspect: enabling and teaching people how to use reasoning and quantitative thinking.
Over time, Alation has been associated with a number of terms and categories, the most prominent of which are metadata management, data governance and data cataloging. Today, however, Sangani said all three are converging into a broader market space: what IDC originally identified as data intelligence.
For a few years after Alation launched in 2015, the company worked to establish the data catalog, a category that was new to many people, according to Sangani. Later, other players from metadata management and data governance also began to focus on building data catalogs.
In parallel, the period from 2012 to the present has also seen technological developments, such as the democratization of big data through the Hadoop ecosystem, as well as growing attention to regulations such as HIPAA and GDPR. All of that drove the need for a catalog focused on making data accessible to everyone, which Alation sees as a competitive differentiator.
Alation as a foundation for data quality
For Alation, the data catalog is the foundation for the broader data intelligence portfolio. Sangani said data intelligence has many components: master data management, privacy data management, reference data management, data transformation, data quality, data observability and more. Alation’s strategy is not to “own a box of these things,” as Sangani put it.
“The real issue in this space is not whether you have the ability to tag data. The biggest issue is engagement and adoption. Most people don’t use data properly. Most people don’t understand what data exists. Most people don’t engage with the data. Most of the data is not well documented,” Sangani said.
“The idea of a data catalog is really all about getting people into the data. But if that’s our strategy, focusing on engagement and adoption, that means there are some things that we’re not doing strategically,” he said. “What we don’t do is build a data quality solution. What we don’t do is build a data visualization solution or a master data management solution.”
Alation considered expanding its offering into the data quality market, but decided against it. It is a fast-moving, densely populated market, and the approaches to solutions can vary greatly. Sangani said that Alation has no major competitive differentiator in that market beyond the information in its data catalog. He added that sharing that information can make Alation a platform for data quality, and that is what the Open Data Quality Initiative aims to achieve.
However, standards live or die by customer adoption, Sangani said. This initiative is a continuation of Alation’s Open Connectors framework, which allows third parties to build metadata connectors for any data system.
Plumbing underpins value-added applications
Sangani says that Alation will continue to build integrations and open frameworks over time, because in the world of data management, there needs to be a consistent way to share metadata. In a way, Sangani added, what Alation has built up to now is plumbing, and ODQF is an example of more plumbing.
However, while the plumbing is essential, the company has begun moving up the stack to provide value-added features. Examples include using natural language processing (NLP) to perform named entity recognition for recommendations, or letting people write sentences in English and converting them to SQL so that they can query data interactively.
Sangani mentions technologies such as knowledge graphs, AI and machine learning as ingredients for building a smarter data intelligence layer.
“I’m probably more excited about what we can do in the next five years than about what we’ve done in the past five years, because it all lays the groundwork for some really exciting applications that we’ve been working on for a long time and that we will start to see in the near future,” he said.