Before You Scale, Check Your Foundations

Every conference I attend right now has a version of the same conversation. Someone on stage talking about how they deployed AI agents. The speed. The efficiency. The results. Then, somewhere in the Q&A, a quieter question from the floor: “We tried something similar. It did not go the way we expected.”

That second voice is the more honest one. It is more common than the stage version suggests, and it rarely gets a proper answer.

I spoke at AI Agents Sydney earlier this month. Thirty minutes. A room full of senior executives from banking, insurance, government, retail, healthcare. Smart people. Well-resourced organisations. Many of them already investing heavily in agentic AI. My question to the room was not how to build agents. It was why they fail. And what you need to have in place before you scale.

“The technology is rarely the problem. The humans deploying it almost always are.”

The numbers are not flattering

Gartner polled more than 3,400 webinar attendees earlier this year about their organisations' investment in agentic AI. Its prediction: more than 40 per cent of agentic AI projects will be cancelled before the end of 2027. Not because the agents did not work. Because the people deploying them made the wrong decisions at the wrong moments.

40%+ of agentic AI projects projected to be cancelled by end of 2027. Source: Gartner, 2025.

Separately, the Microsoft and LinkedIn Work Trend Index report found that 78 per cent of AI users are bringing their own AI tools to work, outside sanctioned systems. Nearly eight in ten. That is not an adoption success story. That is an organisation that has lost visibility over how work is actually getting done.

There is a name for the gap between what organisations deploy and what actually gets used. We call it the Integration Gap. The distance between AI access and AI adoption. It is never a technology problem. It is always a people one.

What it looks like in practice

Klarna made global headlines in early 2024 when they announced their AI assistant was handling two-thirds of all customer service chats, the equivalent of 700 full-time agents. Little more than a year later they had reversed course. Significant service failures. Brand damage. A rushed hiring program to rebuild the customer relationships the agents had eroded. The agents worked technically. The human conditions needed to support them did not exist.

Starbucks is a different kind of cautionary story. They deployed automation across operations in pursuit of efficiency. The result was a product and experience so standardised that customers stopped feeling anything when they walked in. CEO Brian Niccol later acknowledged it directly: they had taken the soul out of the brand. You cannot script genuine connection. You cannot automate trust. The technology outpaced the cultural conditions needed to make it work.

Neither story is unique. Versions of both play out inside Australian organisations every week, with less public scrutiny and the same cost.

Three conditions. Every time.

When we look at stalled agent deployments across Australian organisations, the same three conditions are missing. Every time. Together they make up what we call the ARC model: Authority, Readiness, and Culture.

The ARC Model · hum[ai]n

Authority: Clear, agreed standards for what agents are authorised to do, and who is accountable when they act.

Readiness: Real, applied skills to work alongside agents. Not access. Not awareness. Actual capability that shows up in daily work.

Culture: Psychological safety to question, challenge, and override an agent when something does not look right.

When A, R, and C are all in place, agents deliver. When any one is missing, the deployment stalls. The weakest dimension determines the outcome.
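
For readers who think in code, the weakest-link rule can be pictured as a minimal Python sketch. Everything specific here is an illustrative assumption rather than part of the ARC model itself: the 0 to 5 scoring scale, the threshold of 3, and the names ArcAssessment, weakest and ready_to_scale are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArcAssessment:
    """Hypothetical 0-5 self-assessment scores, one per ARC dimension."""
    authority: int
    readiness: int
    culture: int

    def weakest(self) -> tuple[str, int]:
        """Return the lowest-scoring dimension and its score."""
        scores = {
            "authority": self.authority,
            "readiness": self.readiness,
            "culture": self.culture,
        }
        name = min(scores, key=scores.get)
        return name, scores[name]

    def ready_to_scale(self, threshold: int = 3) -> bool:
        """Weakest-link rule: the lowest score alone decides the outcome.

        The threshold of 3 is an assumed cut-off for illustration only.
        """
        return self.weakest()[1] >= threshold


# Example: strong skills and culture cannot compensate for an Authority gap.
brokerage = ArcAssessment(authority=1, readiness=4, culture=4)
print(brokerage.weakest())         # the dimension holding the deployment back
print(brokerage.ready_to_scale())  # False under the assumed threshold
```

The design choice worth noticing is the use of a minimum rather than an average. An average would let two strong dimensions mask a weak one, which is exactly the pattern the rule above warns against.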

A specialist insurance brokerage came to us after deploying agents across their advisory workflow. The tools were live. Nobody was using them. Not because people objected, but because no one had ever agreed on what the agents were authorised to do. The question underneath the silence was simple: am I even allowed to use this? An Authority gap does not always look like chaos. Sometimes it looks like paralysis.

A large listed property group had the opposite problem. The tools were in use, but inconsistently and without real skill behind them. People were accepting or rejecting agent outputs wholesale, without the ability to interrogate, improve, or appropriately push back. We ran applied sessions using their actual sustainability reporting work rather than fabricated scenarios. Capability developed. And something else surfaced: siloed working patterns across the business were limiting what the agents could do. The technology exposed a structural problem the organisation had been quietly managing around for years.

A leading property management business came to us with shadow AI. People were using tools leadership could not see, solving immediate problems while creating invisible risk. The first phase of our work was simply mapping what tools were actually in use and for what purpose. The gap between official policy and daily practice was significant. No organisation can build effective oversight of agents it cannot see. Culture work has to come before anything else.

Scaling is not the risk. Scaling too early is.

The argument here is not that organisations should slow down. The ones that hesitate for the wrong reasons will find themselves at a real disadvantage. The market is moving. The tools are capable. The results, when conditions are right, are significant.

The argument is that scaling before foundations are in place does not save time. It creates the kind of failure that is expensive and slow to recover from. The organisations doing this well have done the human conditions work first. The ones who skipped it are managing the fallout.

“AI works when people do. That is not a limit on ambition. It is a description of how results actually happen.”

Check your foundations. Then scale.
