Back in September, our paper “From Data Quality for AI to AI for Data Quality: A Systematic Review of Tools for AI-Augmented Data Quality Management in Data Warehouses” was presented at the Business Informatics Research Conference (BIR) in my hometown, Riga.
The paper, authored by my former student Heidi Carolina Tamm (now DW Lead Designer at Swedbank Estonia) and me, explores a central question: can AI do more than just consume high-quality data? Can it actively create and maintain it?


From “Data Quality for AI” to “Data Quality for AI & AI for Data Quality Management”
High-quality data underpins analytics, AI performance, and regulatory compliance, yet data quality (DQ) management (DQM) remains complex, resource-intensive, and often manual. With global data volume projected to reach 175 zettabytes by 2025, poor DQ carries real business costs: up to 19% of companies report losing customers because of inaccurate or incomplete data.
Traditionally, DQ management focuses on ensuring that AI has reliable, clean data. We propose a complementary perspective: using AI itself to enhance and automate DQM. This is particularly relevant for enterprise data warehouses (DWs), which remain central despite the growth of decentralized and domain-driven architectures.
To this end, we systematically reviewed 151 DQ tools, evaluating their support for AI-augmented DQM, particularly the automation of rule detection and anomaly identification. Key findings:
- only 10 tools fully support AI-augmented DQM in data warehouses;
- most tools focus on data cleansing rather than leveraging AI for rule discovery or explainable quality improvements;
- metadata + rule-based + ML hybrid approaches show promise but remain underutilized;
- SQL and natural language rule definitions are rarely supported, yet essential for practical use;
- explainable AI and governance features are critical for trust;
- cloud scalability with GDPR compliance is a must for enterprise adoption.
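To make the hybrid idea from the findings above more concrete, here is a minimal sketch of what combining SQL-defined rules with a statistical anomaly check could look like. It is not taken from the paper or from any reviewed tool: the `payments` table, the rule names, and the helper functions are hypothetical, and a simple median-absolute-deviation test stands in for the ML component.

```python
import sqlite3
import statistics

# Hypothetical rule set: each rule is a SQL query returning the ids of violating rows.
SQL_RULES = {
    "amount_not_null": "SELECT id FROM payments WHERE amount IS NULL ORDER BY id",
    "currency_is_known": (
        "SELECT id FROM payments WHERE currency NOT IN ('EUR', 'USD') ORDER BY id"
    ),
}


def build_demo_db():
    """Create an in-memory table with a few deliberately bad rows (made-up data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE payments (id INTEGER PRIMARY KEY, amount REAL, currency TEXT)"
    )
    rows = [
        (1, 10.0, "EUR"),
        (2, 12.0, "EUR"),
        (3, None, "USD"),    # missing amount
        (4, 11.0, "XXX"),    # unknown currency
        (5, 9.5, "USD"),
        (6, 5000.0, "EUR"),  # suspiciously large amount
        (7, 10.5, "EUR"),
    ]
    conn.executemany("INSERT INTO payments VALUES (?, ?, ?)", rows)
    return conn


def run_sql_rules(conn, rules):
    """Rule-based part: return {rule_name: [violating ids]}."""
    return {name: [row[0] for row in conn.execute(sql)] for name, sql in rules.items()}


def mad_outliers(conn, threshold=3.5):
    """Statistical part: flag amounts with a large modified z-score (median/MAD)."""
    rows = conn.execute(
        "SELECT id, amount FROM payments WHERE amount IS NOT NULL ORDER BY id"
    ).fetchall()
    values = [amount for _, amount in rows]
    median = statistics.median(values)
    mad = statistics.median([abs(v - median) for v in values])
    if mad == 0:
        return []  # no spread in the data, nothing to flag
    return [i for i, amount in rows if 0.6745 * abs(amount - median) / mad > threshold]


if __name__ == "__main__":
    conn = build_demo_db()
    print(run_sql_rules(conn, SQL_RULES))
    print(mad_outliers(conn))
```

On this made-up data, the SQL rules flag row 3 (missing amount) and row 4 (unknown currency), while the statistical check flags row 6. Keeping the rules as plain SQL means they can be reviewed and versioned like any other warehouse code, and the anomaly check catches cases nobody wrote a rule for.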
For organizations, AI-driven DQM goes beyond cleaning and profiling, enabling detection, enforcement, and explanation of data quality issues. This leads to better compliance, efficiency, and trust in enterprise data ecosystems.
For researchers and tool developers, there’s a clear need for next-generation DQ tools that are explainable, support SQL/natural language rules, integrate reconciliation logic, and operate under real-world regulatory and architectural constraints. Early research shows that LLMs and other AI models have strong potential here, though they are not a silver bullet.
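As a rough illustration of rule discovery paired with explanations, the sketch below profiles a sample of column values and proposes candidate rules, each expressed as SQL together with a plain-language reason. Everything in it is an assumption made for illustration: `CandidateRule` and `propose_rules` are hypothetical names, and the simple profiling heuristics stand in for whatever ML or LLM component a real tool would use.

```python
from dataclasses import dataclass


@dataclass
class CandidateRule:
    sql: str          # query that would return violating rows
    explanation: str  # human-readable reason the rule was proposed


def propose_rules(table, column, sample):
    """Propose candidate DQ rules for one column from a sample of its values."""
    rules = []
    non_null = [v for v in sample if v is not None]

    # Completeness: if the sample has no NULLs, suggest a NOT NULL check.
    if sample and len(non_null) == len(sample):
        rules.append(CandidateRule(
            sql=f"SELECT * FROM {table} WHERE {column} IS NULL",
            explanation=f"{column} had no NULLs in {len(sample)} sampled rows",
        ))

    # Validity: for numeric columns, suggest the observed range as a bound.
    is_numeric = all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in non_null
    )
    if non_null and is_numeric:
        low, high = min(non_null), max(non_null)
        rules.append(CandidateRule(
            sql=f"SELECT * FROM {table} WHERE {column} < {low} OR {column} > {high}",
            explanation=f"sampled values of {column} ranged from {low} to {high}",
        ))
    return rules


if __name__ == "__main__":
    for rule in propose_rules("payments", "amount", [10.0, 12.5, 9.0]):
        print(rule.sql, "|", rule.explanation)
```

The proposed rules are only candidates: a data steward would still need to accept, adjust, or reject them, which is where the explanation attached to each rule earns its keep.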
Our study highlights that while AI has mainly been seen as a consumer of data quality, there is an untapped opportunity to use AI to actively improve DQ, shifting from “Data Quality for AI” to “Data Quality for AI and AI for Data Quality Management.” This research provides practical guidance for organizations and sets the stage for next-generation AI-augmented data governance and DQM solutions.
Read more in the article.


