Data Governance in Operations Needed to Ensure Clean Data for AI Projects

Organizations that rely on AI and machine learning applications need a data governance plan that bridges operations and strategic vision.

Data governance in data-driven organizations is a set of practices and guidelines that define where responsibility for data quality lives. The guidelines support the operation’s business model, especially if AI and machine learning applications are at work. 

“Data governance should be a bridge that translates a strategic vision acknowledging the importance of data for the organization and codifying it into practices and guidelines that support operations, ensuring that products and services are delivered to customers,” stated author Gregory Vial, an assistant professor of IT at HEC Montréal.

To prevent data governance from being limited to a plan that nobody reads, “governing” data needs to be a verb and not a noun phrase as in “data governance.” Vial writes, “The difference is subtle but ties back to placing governance between strategy and operations — because these activities bridge and evolve in step with both.”

The framework Vial describes covers five areas:

  • Principles at the foundation of the framework that relate to the role of data as an asset for the organization;
  • Quality to define the requirements for data to be usable and the mechanisms in place to assess that those requirements are met;
  • Metadata to define the semantics crucial for interpreting and using data, for example those found in a data catalog that data scientists use to work with large data sets hosted on a data lake;
  • Accessibility to establish the requirements related to gaining access to data, including security requirements and risk mitigation procedures;
  • Life cycle to support the production, retention, and disposal of data on the basis of organizational and/or legal requirements.
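
As a rough illustration of the quality item in the list above, the Python sketch below pairs written requirements with a mechanism that assesses whether records meet them. It is not taken from Vial's article: the `QualityRule` type, the `assess` function, the two rules, and the sample records are all hypothetical, chosen only to show how such a check could look in an operational pipeline.

```python
from dataclasses import dataclass
from typing import Callable


# Hypothetical rule type: a readable requirement plus a check on one record.
@dataclass(frozen=True)
class QualityRule:
    name: str
    check: Callable[[dict], bool]


def assess(records: list[dict], rules: list[QualityRule]) -> dict[str, int]:
    """Count, for each rule, how many records fail it."""
    failures = {rule.name: 0 for rule in rules}
    for record in records:
        for rule in rules:
            if not rule.check(record):
                failures[rule.name] += 1
    return failures


# Assumed requirements, for illustration only.
RULES = [
    QualityRule("customer_id present", lambda r: bool(r.get("customer_id"))),
    QualityRule(
        "age in 0..120",
        lambda r: isinstance(r.get("age"), int) and 0 <= r["age"] <= 120,
    ),
]

# Made-up sample records.
SAMPLE = [
    {"customer_id": "c-1", "age": 34},
    {"customer_id": "", "age": 29},
    {"customer_id": "c-3", "age": 140},
]

report = assess(SAMPLE, RULES)
for rule_name, failed in report.items():
    print(f"{rule_name}: {failed} of {len(SAMPLE)} records failed")
```

Keeping the rules as named data rather than burying them in pipeline code means the people responsible for data quality can review and change the requirements without touching the assessment mechanism.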

“Governing data is not easy, but it is well worth the effort,” stated Vial. “Not only does it help an organization keep up with the changing legal and ethical landscape of data production and use; it also helps safeguard a precious strategic asset while supporting digital innovation.”