Explainability as a Non-Functional Requirement in Agentic AI Systems
At AWS re:Invent, I attended a fantastic breakout session with Alessandro Cere and Matthew Monfort, where they discussed key capabilities, tools, and patterns to consider when addressing responsible AI concerns.
When the topic of agents came up, I was surprised to see that little attention was given to the explainability of agent outputs. Instead, the focus shifted to emerging standards and understanding risk profiles when introducing AI-driven products.
I asked the speakers whether explainability in the context of agentic AI should be the responsibility of the business or whether it should be included in the emerging common metrics for responsible AI.
To my surprise, they responded that they would not consider explainability a best practice. I found this stance difficult to understand. To me, a lack of explainability is a clear business risk. That risk is not necessarily tied to the task's importance (though explainability is critical in life-or-death applications), but it can hurt a business's reputation if agentic processes go wrong.
So I turn back to the great Martin Fowler's position on explainability of generative AI output. Specifically:
"With explicit logic we can, at least in principle, explain a decision by examining the source code and relevant data. Such explanations are beyond most current AI tools. For me this is a reasonable rationale to restrict their usage, at least until developments to improve the explainability of AI bear fruit. (Such restrictions would, of course, happily incentivize the development of more explainable AI.)"
AI observability tools are finally maturing, and in most cases we can start removing the stitched-together services that track cost, token usage, latency, and other general AI service concerns. But none of them can (as of yet) describe a system's characteristics accurately enough to explain a given output.
There's an inherent tension here, one I believe will always exist: AI systems are fundamentally non-deterministic, yet we still need to understand why a given input resulted in a possibly problematic agentic output. Additionally, considering the well-known biases in foundation models, decisions may be influenced by:
- a biased training set
- incorrect decision-making
- misinterpreted or misunderstood data (RAG context)
- missing or incorrect input from a dependent or underlying system
Metrics alone will not tell the whole story about why an output was generated.
This is why I am an advocate for treating explainability as a non-functional requirement (NFR) in agentic systems. Like other NFRs, such as security or performance, explainability is essential for building reliable AI solutions. By implementing effective logging techniques, maintaining a deep understanding of the data involved, and precisely defining the use case, businesses can mitigate the risks of harmful or biased outputs without guessing where harmful output was generated. While this approach may increase costs, it ultimately provides value by enhancing transparency and reducing potential reputational and legal risks. Viewing explainability as an NFR ensures that it’s prioritized as a core concern of responsible AI usage, rather than an afterthought.
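To make the logging point a little more concrete, here is a minimal sketch of what a per-run decision trace might look like. It is only an illustration under my own assumptions: the `DecisionTrace` and `TraceEvent` names, the event kinds, and the example payloads (including the model name and file name) are all hypothetical and do not come from any particular observability tool. The idea is simply that every input, retrieved context chunk, tool result, and model output is recorded in order, so a problematic output can be traced back to what fed it.

```python
import json
import time
import uuid
from dataclasses import asdict, dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class TraceEvent:
    """One recorded step in an agent run (hypothetical schema)."""
    run_id: str
    step: int
    kind: str  # e.g. "input", "retrieval", "tool_call", "model_output"
    payload: Dict[str, Any]
    timestamp: float = field(default_factory=time.time)


class DecisionTrace:
    """Collects the events behind one agent run so its output can be reviewed later."""

    def __init__(self, run_id: Optional[str] = None) -> None:
        self.run_id = run_id or uuid.uuid4().hex
        self.events: List[TraceEvent] = []

    def record(self, kind: str, **payload: Any) -> TraceEvent:
        # Steps are numbered in the order they are recorded, starting at 1.
        event = TraceEvent(self.run_id, len(self.events) + 1, kind, payload)
        self.events.append(event)
        return event

    def to_jsonl(self) -> str:
        # One JSON object per line, suitable for shipping to a log store.
        return "\n".join(json.dumps(asdict(e), sort_keys=True) for e in self.events)

    def explain(self) -> List[str]:
        # A human-readable, ordered summary of what led to the final output.
        return [
            f"{e.step}. {e.kind}: {json.dumps(e.payload, sort_keys=True)}"
            for e in self.events
        ]


# Example run with made-up values, covering the influences listed above:
# the user input, the RAG context, and a dependent system's response.
trace = DecisionTrace(run_id="demo-run")
trace.record("input", user_query="Can I get a refund?")
trace.record("retrieval", source="refund-policy.md", chunk_ids=[3, 4])
trace.record("tool_call", tool="lookup_order", result={"status": "delivered"})
trace.record("model_output", text="Refund approved.", model="hypothetical-model-v1")

for line in trace.explain():
    print(line)
```

A trace like this does not explain the model's internal reasoning, but it does narrow the question from "why did the agent do that?" to "which recorded input, context, or tool result was wrong or misread?", which is usually the first thing a business needs to answer.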