Trustworthy Knowledge Graphs

To reach its potential, artificial intelligence (AI) needs data and context. Without the right data, in the right quantities, machine learning (ML), the most prevalent form of AI today, cannot identify patterns or make predictions. Without a deeper understanding of context, AI applications cannot engage people in a meaningful way. Knowledge graphs (KGs), a term popularised by Google in 2012 to describe its general-purpose knowledge base, are critical to both: they reduce the need for large labelled ML datasets, facilitate transfer learning, and help generate explanations. KGs are used in many industrial AI applications, including digital twins, enterprise data management, supply chain management, procurement, and regulatory compliance.

As industrial AI applications produce and consume more data, engineering KGs has evolved into a complex, semi-automatic process that increasingly relies on opaque ML models and vast collections of heterogeneous sources to scale to graphs with millions of nodes and billions of edges. The KG lifecycle is not transparent, accountability is limited, and there are no records of, or indeed methods to determine, how fair a KG is in the downstream applications that use it. KGs are thus at odds with emerging AI regulation such as the EU AI Act, and with ongoing efforts in the wider field of data-centric AI, where the community has started to systematically audit high-stakes AI data assets to make sure they are relevant, representative, and balanced.

The vision of this Focus Group, led by Hans Fischer Senior Fellow Prof. Elena Simperl and her host Prof. Klaus Diepold (Data Processing, TUM), is of trustworthy KG engineering that genuinely enables human-centric industrial AI applications through compliance with emerging laws and guidance. Drawing on insights and methods from across AI, human-computer interaction, and the social sciences, the Focus Group will first define process blueprints for KG development, maintenance, and assessment that account for emerging human-in-the-loop practices in industrial contexts, where human, social, and machine capabilities are seamlessly mixed at unprecedented scale. Supporting these blueprints, the project will then design, implement, and evaluate socio-technical methods that: (1) increase the transparency and accountability of the KG lifecycle through conversational explanations; and (2) assess biases and wider socio-environmental implications of KGs as components in industrial AI applications.

We will apply this research in the context of a legal compliance demonstrator, designed with the help of industry partners. It will include a legal knowledge graph capturing knowledge about AI laws and regulation (e.g., the EU AI Act), as well as relevant standards, guidance, and case law. The tool will guide compliance professionals when auditing industrial AI applications; document the results in a reusable, structured way; and facilitate the discovery of best practices to streamline compliance efforts in organisations that deploy AI.
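To make the idea of a reusable, structured compliance record more concrete, the following is a minimal sketch of how a fragment of such a legal knowledge graph could be expressed as RDF triples in Python with rdflib. The namespace, the classes (Regulation, Obligation, AuditFinding), and the properties are hypothetical illustrations for this sketch, not the demonstrator's actual schema.

```python
# Minimal, illustrative sketch of a legal knowledge graph fragment.
# The lex: namespace and its classes/properties are hypothetical
# placeholders, not the Focus Group's actual data model.
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDF, RDFS

LEX = Namespace("http://example.org/legal-kg/")  # hypothetical namespace

g = Graph()
g.bind("lex", LEX)

# A regulation, one of its obligations, and an audit finding linked to it.
g.add((LEX.EUAIAct, RDF.type, LEX.Regulation))
g.add((LEX.EUAIAct, RDFS.label, Literal("EU AI Act")))

g.add((LEX.Article13, RDF.type, LEX.Obligation))
g.add((LEX.Article13, RDFS.label,
       Literal("Transparency and provision of information to deployers")))
g.add((LEX.Article13, LEX.partOf, LEX.EUAIAct))

g.add((LEX.Finding42, RDF.type, LEX.AuditFinding))
g.add((LEX.Finding42, LEX.assesses, LEX.Article13))
g.add((LEX.Finding42, LEX.outcome, Literal("compliant")))

# Query: which obligations of the EU AI Act have a recorded audit outcome?
query = """
SELECT ?obligation ?outcome WHERE {
    ?obligation lex:partOf lex:EUAIAct .
    ?finding lex:assesses ?obligation ;
             lex:outcome ?outcome .
}
"""
for row in g.query(query, initNs={"lex": LEX}):
    print(row.obligation, row.outcome)
```

Because audit results are recorded as graph data rather than free-text reports, the same structures can be queried and compared across audits, which is what makes findings reusable and allows recurring best practices to surface across organisations.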

Prof. Simperl holds a TUM-IAS Hans Fischer Senior Fellowship funded by Siemens AG.