New Research Finds Data Integration and Preparation a Priority for the Use of AI and ML in Analytics and Data Management
Paxata, the pioneer in self-service data preparation, announced that it was named an innovator in the Enterprise Management Associates (EMA) Top 3 report, “Innovation in the Use of Artificial Intelligence (AI) and Machine Learning (ML) for Data Integration and Preparation.” According to the findings, more than half of all participants (52 percent) said that using AI or ML to automate the data preparation or integration process is important to their organization. Because data integration and preparation play a prominent role in any analytics project, the report stated that AI-enablement should be a priority for analytics leaders at all levels, as it gives organizations the ability to overcome the constraints of legacy or less-automated data processing.
“The next major shift in the analytics, business intelligence, and data management markets is coming from the use of AI and ML across the entire information supply chain. Along with using machine learning to find the next-best offer, companies can now point algorithms at modern data platforms to find links between data sets, automate data preparation, or breaches in data governance,” said John Santaferraro, Research Director at EMA and lead author of the report. “Vendors like Paxata that excel in the use of AI and ML in their analytics, business intelligence, and data management platforms will create significant differentiation and barriers to entry that will change the face of all vertical industries.”
Asked which inherent capabilities are most important in a data integration and preparation tool, three out of four participants named automated data profiling as the top criterion, followed by data cleansing recommendations (60 percent). Participants considered data integration and preparation the most time-consuming activity in every analytics project, with data profiling and cleansing the most time-consuming aspects of data integration. At the low end, only 14 percent of participants were willing to surrender control of data preparation to automated tasks.
EMA built a scoring model based on the priorities that 155 randomly selected participants set for the use of AI and ML in data integration and preparation platforms, and selected Paxata for its comprehensive coverage of the different AI-enabled capabilities. This is particularly important because, according to the report, organizations that are first in their industries to implement these capabilities can expect an advantage over their competitors.
More specifically, Paxata was recognized for:
- Automated Data Profiling: Paxata Rapid Data Profiling provides a one-click profile button that scans an organization's entire dataset and generates a summary scorecard assessing its content and quality. Paxata is unique in its ability to apply its algorithms across the entire body of the data, while sample-based solutions inherently profile only subsets of the data and can therefore miss outliers and fail to identify accurate patterns.
- Data Cleansing Recommendation: Paxata uses algorithmic techniques to offer insight into data quality issues. Paxata is unique in its anomaly detection because it applies its algorithms across the entire body of the data (hundreds of millions of rows). Sample-based solutions inherently fail to surface some data quality issues unless the user works through many iterations to eventually process the full data and reach the same level of confidence in data quality.
- Data Structure Identification: Paxata’s Intelligent Ingest automatically detects source types, compression formats, and schemas, including inferring the content of files that have no extension. Intelligent Ingest then transforms each structure into a tabular format suited to point-and-click interactive profiling and preparation.
- Correlation or Relationship Recommendation: Paxata uses machine learning algorithms to detect joins and overlaps within different data sets. The algorithms work with the data content as well as metadata to provide a confidence score for the detected joins.
- Automated Data Preparation: Paxata’s Intelligent Automation auto-discovers dependent data preparation projects and data sets and creates multi-project data flows that can be operationalized from a single point. The automation can run on demand or can be scheduled to run continuously without triggers.
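
To make the idea of a confidence score for detected joins more concrete, the following is a minimal illustrative sketch in Python. It is not Paxata's algorithm, which the report does not describe in detail. It assumes a simple content-only heuristic (the share of distinct values in one column that also appear in another) and ignores metadata. The function names, the 0.5 threshold, and the sample tables are all hypothetical.

```python
def join_confidence(left_values, right_values):
    """Return the share of distinct left values that also appear on the right.

    Hypothetical heuristic for illustration only; None is treated as missing.
    """
    left = {v for v in left_values if v is not None}
    right = {v for v in right_values if v is not None}
    if not left:
        return 0.0
    return len(left & right) / len(left)


def suggest_joins(left_table, right_table, threshold=0.5):
    """Score every column pair and return candidates, best first.

    Tables are plain dicts mapping column name -> list of values.
    The threshold is an assumed cutoff, not a value from the report.
    """
    candidates = []
    for left_name, left_values in left_table.items():
        for right_name, right_values in right_table.items():
            score = join_confidence(left_values, right_values)
            if score >= threshold:
                candidates.append((left_name, right_name, score))
    return sorted(candidates, key=lambda c: c[2], reverse=True)


# Made-up sample data: two small tables that share a customer key.
orders = {"customer_id": [1, 2, 2, 3, 4], "amount": [10, 25, 5, 40, 15]}
customers = {"id": [1, 2, 3, 5], "region": ["east", "west", "east", "north"]}

suggestions = suggest_joins(orders, customers)
for left_name, right_name, score in suggestions:
    print(f"{left_name} <-> {right_name}: confidence {score:.2f}")
```

Because the score is computed over every value in both columns, it also reflects the point made above about full-data processing: computing the same overlap on a sample of rows could overstate or understate how well two columns actually match.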
“We are extremely excited to be recognized in this report for the innovation that is foundational to our platform. We see customers across the globe continue to demonstrate how AI-enabled data preparation and integration have become instrumental to achieving a competitive advantage, a sentiment that was shared with more than three out of five (66 percent) of respondents,” said Prakash Nanduri, Co-Founder and CEO at Paxata. “Dating back to 2012 when we created the industry’s first self-service data preparation solution, we have been committed to helping business consumers visually discover, profile, and clean data themselves in order to achieve significant business results. As the report mentions, we will continue to innovate by leveraging modern approaches such as AI and ML to enhance our product and generate even greater value.”