Databricks, a leading data and artificial intelligence (AI) company, recently announced the release of data lineage for Unity Catalog, expanding the data governance capabilities of its lakehouse.
Databricks' lakehouse platform helps organizations unify data, analytics, and AI. The new feature will give customers a complete view of the data lifecycle: where their lakehouse data came from, who created it, when and how it was created and modified, and how it is being used.
Data lineage for Unity Catalog will let data teams understand how a data change affects each downstream consumer, so that relevant stakeholders can be notified as early as possible. The new offering will also help ensure high-quality data for end users: data stewards gain visibility into critical areas and can identify datasets that have become obsolete, so that unnecessary data can be removed and risk reduced.
Matei Zaharia, co-founder and chief technologist at Databricks, said, “Governance capabilities such as data lineage are critical as we work to build the industry’s most robust lakehouse platform. Without good data lineage, it is challenging to track the business and verification processes that data-driven organizations need to be successful.”
“Our goal is to ensure our customers can focus on insights and move toward proactive data management practices through a unified, transparent view of their entire data ecosystem,” added Zaharia.
In addition, data lineage helps organizations track data flows and meet compliance requirements under regulations such as the General Data Protection Regulation (GDPR), the California Consumer Privacy Act (CCPA), and the Health Insurance Portability and Accountability Act (HIPAA).
Data lineage for Unity Catalog is now available in preview on AWS and Microsoft Azure.