Action Research in Practice: Case Studies in Ethical AI

Action Research and Ethical AI: Investigating to Improve

Action Research in ethical AI is not theoretical. Here are three examples of how it has been applied:

Community-based audit of a benefits algorithm

A local government used an AI system to assess eligibility for welfare benefits. Advocacy organisations representing affected communities used Action Research to investigate the system’s impacts. Over multiple cycles of documenting individual experiences, analysing patterns, feeding findings back to communities, and refining the investigation, they built a detailed picture of how the algorithm was producing discriminatory outcomes for specific groups.

The research did not stop at documentation. Each cycle produced recommendations for the government. Some were adopted. Some were resisted. The researchers continued to observe and reflect, adjusting their approach. The final outcome was a significant revision to the algorithm and a new community oversight mechanism. This did not happen through a single report. It happened through sustained, cyclical, participatory inquiry.

Hospital clinician research into clinical AI

A hospital introduced an AI system to support clinical decision-making. Rather than simply evaluating the system technically, a team of clinicians used Action Research to investigate how it was actually being used, and what that revealed about its ethical dimensions.

They found that clinicians were developing workarounds when they disagreed with the AI’s recommendations, but were not documenting these disagreements. This invisible friction was creating accountability gaps. The research findings led to a redesign of the system’s interface and a new process for documenting human-AI disagreement.

Developer-community co-research on a hiring tool

A technology company developing an AI hiring tool used Action Research to involve both its development team and representatives of communities historically underrepresented in tech hiring. Over multiple cycles, they investigated how the tool performed across different demographic groups, what assumptions were embedded in its training data, and how candidates experienced the AI-mediated hiring process.
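One of the questions above, how the tool performed across different demographic groups, can be made concrete with a simple comparison of selection rates. The sketch below is illustrative only: the function names, group labels, and records are hypothetical and are not taken from the case study, and a rate comparison like this is just one quantitative input alongside candidates' own accounts of the process.

```python
from collections import defaultdict


def selection_rates(records):
    """Return the share of candidates the tool advanced, per group.

    `records` is an iterable of (group_label, was_advanced) pairs.
    """
    totals = defaultdict(int)
    advanced = defaultdict(int)
    for group, was_advanced in records:
        totals[group] += 1
        if was_advanced:
            advanced[group] += 1
    return {group: advanced[group] / totals[group] for group in totals}


def rate_ratio(rates):
    """Ratio of the lowest to the highest selection rate (1.0 means parity)."""
    highest = max(rates.values())
    if highest == 0:
        return 1.0  # nobody was advanced in any group, so rates are equal
    return min(rates.values()) / highest


# Hypothetical toy data, invented purely for illustration.
records = [
    ("group_a", True), ("group_a", True), ("group_a", False), ("group_a", True),
    ("group_b", True), ("group_b", False), ("group_b", False), ("group_b", False),
]

rates = selection_rates(records)
print(rates)
print(round(rate_ratio(rates), 2))
```

In an Action Research cycle, a figure like this would be a starting point for discussion with participants rather than a verdict: a low ratio prompts questions about the training data and the process, which the next cycle then investigates.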

The research produced both technical improvements and a deeper cultural change in how the development team thought about fairness: not as an abstract metric, but as the lived experience of specific people.

Reflection question: Which of these examples resonates most with challenges you see in your own context? What would you do differently, or what would you add?