GenAI for risk assessment

We partnered with QUT researchers to understand the risk assessment capability of AI

In collaboration with Queensland University of Technology, Flame Tree has assessed how Generative AI (GenAI) can be practically and responsibly applied to cybersecurity risk assessment. Our research cuts through the hype to uncover how GenAI performs in real-world scenarios, where accuracy, accountability and alignment with risk appetite are non-negotiable.

Measuring the limitations of GenAI for risk assessment

Our research measured the efficacy of GenAI for risk assessment activities, highlighting the need for responsible AI. By identifying limitations and designing targeted solutions, we’re turning technical findings into a practical roadmap for safer and smarter risk management.

GenAI needs structure to succeed

GenAI tools, such as those from OpenAI, remain prone to incorrect assumptions, outdated knowledge and bias.

Accuracy is limited without contextual intelligence

When tested on real-world data, including 180+ Common Vulnerabilities and Exposures (CVEs) and architectural diagrams, GenAI often misclassified risks or offered inconsistent results. Retrieval-Augmented Generation (RAG) significantly improved performance by supplying the model with accurate, trusted information at query time.
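To make the RAG idea concrete, here is a minimal sketch of the retrieval step, not the pipeline used in the research. It assumes a small in-memory knowledge base and uses simple word overlap as a stand-in for the embedding-based search a production system would likely use. The entries, IDs and function names are hypothetical placeholders, and no real CVE data or model call is included.

```python
import re

# Hypothetical trusted knowledge base. IDs and text are placeholders,
# not real CVE records or real policy statements.
KNOWLEDGE_BASE = [
    {"id": "EXAMPLE-0001",
     "text": "Remote code execution in the web server request parser; "
             "internet-facing hosts are exposed."},
    {"id": "EXAMPLE-0002",
     "text": "Local privilege escalation in the backup agent; "
             "requires an authenticated local user."},
    {"id": "EXAMPLE-0003",
     "text": "Organisational risk appetite: remote code execution on "
             "internet-facing hosts is rated critical."},
]


def tokenize(text):
    """Lower-case the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query, documents, top_k=2):
    """Return up to top_k documents sharing the most words with the query."""
    query_tokens = tokenize(query)
    scored = []
    for doc in documents:
        overlap = len(query_tokens & tokenize(doc["text"]))
        if overlap > 0:
            scored.append((overlap, doc))
    # Sort on the score only, highest first.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:top_k]]


def build_prompt(question, documents):
    """Place the retrieved context ahead of the question."""
    context = "\n".join(f"[{doc['id']}] {doc['text']}" for doc in documents)
    return (
        "Use only the context below to assess the risk.\n"
        f"Context:\n{context}\n"
        f"Question: {question}"
    )


question = "Is remote code execution on the internet-facing web server critical?"
retrieved = retrieve(question, KNOWLEDGE_BASE)
prompt = build_prompt(question, retrieved)
```

The resulting prompt would then be sent to whichever GenAI model is in use. The point of the sketch is that the model answers from vetted context rather than from its own, possibly outdated, training data.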

Bias and prompt engineering matter

Even slight prompt bias can skew GenAI risk assessments. This reinforces the need for careful system design, robust access controls, and ongoing evaluation, particularly when using AI in governance or compliance decision-making processes.
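One way to carry out this kind of ongoing evaluation is to send the same finding through a neutral prompt and a deliberately leading one, then flag any disagreement. The sketch below is illustrative only. The function names and prompt templates are hypothetical, and `stub_model` is a stand-in for a real GenAI call, written to be swayed by leading wording so the check has something to catch.

```python
def check_prompt_sensitivity(rate_risk, finding, prompt_templates):
    """Rate one finding under each prompt template and report agreement.

    rate_risk is any callable that takes a prompt string and returns a rating.
    Returns (ratings_by_template_name, all_ratings_agree).
    """
    ratings = {
        name: rate_risk(template.format(finding=finding))
        for name, template in prompt_templates.items()
    }
    consistent = len(set(ratings.values())) == 1
    return ratings, consistent


def stub_model(prompt):
    """Stand-in for a real model call (assumption, not real model behaviour).

    It returns "low" whenever the prompt suggests the issue is minor.
    """
    return "low" if "minor" in prompt.lower() else "high"


# Hypothetical prompt variants: one neutral, one with leading wording.
TEMPLATES = {
    "neutral": "Rate the risk of this finding as low, medium or high: {finding}",
    "leading": "This looks like a minor issue. "
               "Rate the risk as low, medium or high: {finding}",
}

finding = "Unpatched remote code execution flaw on a public web server"
ratings, consistent = check_prompt_sensitivity(stub_model, finding, TEMPLATES)
```

With the stub, the two templates produce different ratings, so `consistent` is `False`. In practice, a disagreement like this would be a cue to route the finding to human review.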

Human oversight is critical

GenAI is a tool for cybersecurity professionals, not a replacement for them. The most effective applications combine multi-agent collaboration, structured problem-solving and human review.

Building a roadmap for GenAI

We’re now advancing to the next phase: testing risk and control libraries and third-party risk management, and optimising prompts for the best results.

Ready to get prepared and be protected?

Reach out to talk to us about responsible AI.