Thought leadership · June 2026

AI on the shop floor

Agentic AI in store operations: strategy, implementation and evidence from Kmart Australia.

Fabio Oliveira
Head of Innovation and Design, Kmart Australia

The starting point

Retail AI looks at customers. The frontline is the underserved frontier.

Almost every AI and voice investment in retail is customer-facing
Chatbots, search, personalisation. The people who run the store are largely ignored by the same investment wave.
Store work is hands-full work
Team members carry stock, push trolleys and serve customers. Every screen interaction interrupts the physical work that defines the job.
The friction compounds at fleet scale
A single stock check is six taps. Across a national device fleet, that is thousands of hours of repetitive navigation every week.
AI that works on the floor must be designed on the floor
The rest of this deck is what that looked like in practice, with the numbers it produced.

Context

Where this evidence comes from

300+

Kmart stores across Australia and New Zealand, part of the Wesfarmers group

~55,000

Store team members, the user base every frontline tool must serve

28

People across Innovation, Innovation Experience Design and Digital Stores

A small central team serving the whole organisation
Innovation runs as an internal agency: it takes a brief, a timeline and an expected outcome, and solves within those boundaries.
Shape Up, six-week cycles
Every project is a bet with a fixed appetite. Time is fixed, scope is variable, and each bet has explicit gates and circuit breakers.

Strategy

Five principles behind the approach

Centralise innovation, embed it in operations
Decentralised innovation does not survive large incumbents. A small, skilled, dedicated team working alongside operations does.
Separate explored from validated
What people say and what models predict are exploration. Only what is observed in real store conditions counts as validation.
Innovation designs and validates, operations executes
Once a solution is validated, it transitions to the operating business. The handover is planned from day one, not at the end.
Fixed time, variable scope
Six-week bets with defined gates. If confidence drops below the threshold mid-cycle, the bet is stopped, not stretched.
Build vertically, not horizontally
One thing working end to end in a real store beats five things half-built in a lab.

Portfolio

The store operations AI portfolio

Assistant

Team Member Assistant

Voice-first AI assistant answering product, policy and process questions on the devices team members already carry.

Agentic

Auto PDT

Kmart's first agentic implementation: an LLM agent that operates existing store applications by voice.

Agentic

Sidekick

Agentic face-to-face shopping assistant connecting a consumer LLM to product search and in-store navigation.

Decision support

Zone Manager's Best Friend

AI decision support for zone managers, researched and shaped with the managers themselves.

Process + AI

One Way Flow + CV

Replenishment flow redesign with computer vision layered on top, measured store by store.

The next sections go deep on the two frontline implementations: the assistant and the agent.

01 · Team Member Assistant

One assistant, on the device they already carry

It began with a question: what if a casual team member on their third shift could be as productive as someone with three years on the floor?

The Team Member Assistant (TMA) is a voice-first AI assistant for store team members, running on the Zebra handheld devices already used for daily store work.

It answers the questions that dominate a shift: is this in stock, where is it, what does it cost, what is the process for this, what is the policy on that.

The problem
Information lives in many systems. The team member, often with a customer waiting, has to stop and navigate to find it.
The bet
If an assistant answers in seconds, by voice, without interrupting the physical work, it gets used. If it gets used, it changes service speed.

How it was built

Exploratory design first
Rounds of exploratory design with team members, testing a wide spread of use cases to pin down the non-negotiable features and build a roadmap for what comes next.
Three solutions trialled in parallel, then converged
Rather than debating architectures, we built three candidate paths simultaneously inside one cycle and let evidence pick the winner.
Hybrid architecture
A custom React Native front end for the interaction layer and an enterprise LLM platform (Gemini) for retrieval and reasoning. Cloud-agnostic by design, so the provider can change without rebuilding the experience.
Live operational data
Product availability from the data warehouse, refreshed every two hours, with nearby-store suggestions when an item is unavailable.
Knowledge engineered for retrieval
Policy and process PDFs restructured into machine-readable formats. This alone cut response times for complex questions from minutes to seconds.
Deployed to the real fleet
Runs on store-issued Zebra PDTs with team headsets, inside the store's device management environment, not on demo hardware.
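
The availability behaviour described above (a two-hour refresh with nearby-store suggestions) can be sketched roughly as follows. This is a minimal illustration, not the production logic: the snapshot shape, the store identifiers and the function name are all hypothetical, and only the two-hour refresh cadence comes from the deck.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

REFRESH_INTERVAL = timedelta(hours=2)  # refresh cadence stated in the deck


@dataclass
class StockSnapshot:
    """Hypothetical shape of one warehouse extract row."""
    store_id: str
    sku: str
    on_hand: int
    as_of: datetime  # when the extract was taken


def answer_availability(sku, home_store, nearby_stores, snapshots, now):
    """Answer 'is this in stock?' and suggest nearby stores if it is not.

    snapshots maps (store_id, sku) -> StockSnapshot.
    """
    home = snapshots.get((home_store, sku))
    if home is None:
        # No data at all: say so rather than guess.
        return {"status": "unknown", "alternatives": []}

    # Flag answers older than one refresh interval so the UI can caveat them.
    stale = (now - home.as_of) > REFRESH_INTERVAL

    if home.on_hand > 0:
        return {"status": "in_stock", "on_hand": home.on_hand,
                "stale": stale, "alternatives": []}

    # Out of stock here: collect nearby stores that do have the item.
    alternatives = []
    for store_id in nearby_stores:
        snap = snapshots.get((store_id, sku))
        if snap is not None and snap.on_hand > 0:
            alternatives.append(store_id)
    return {"status": "out_of_stock", "on_hand": 0,
            "stale": stale, "alternatives": alternatives}
```

Carrying the snapshot age in the answer is one way to stay honest about data that is refreshed on a schedule rather than streamed.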

The architecture at a glance

Experience layer

Custom React Native front end: voice capture, push to talk, image carousels, headset and device integration. This is where adoption is won or lost.

Owned

⇅

Intelligence layer

Enterprise LLM platform (Gemini) for retrieval and reasoning. Cloud-agnostic by design: the provider can change without rebuilding the experience.

Swappable

⇅

Data layer

Product availability from the data warehouse, refreshed every two hours, plus policy and process content restructured for retrieval.

Live

The split is deliberate: the experience is the moat with users, while the model underneath is a commodity that keeps improving.
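
One way to keep the intelligence layer swappable is to have the experience layer depend on a narrow interface rather than on a specific vendor SDK. The sketch below illustrates that idea only. The class and method names are hypothetical, and `EchoProvider` is a stand-in: no real LLM platform API is called or implied.

```python
from typing import List, Protocol


class AnswerProvider(Protocol):
    """Hypothetical contract the experience layer depends on."""

    def answer(self, question: str, context: List[str]) -> str:
        ...


class EchoProvider:
    """Stand-in provider for illustration; a real one would call an LLM platform."""

    def __init__(self, name: str) -> None:
        self.name = name

    def answer(self, question: str, context: List[str]) -> str:
        return f"[{self.name}] {question} ({len(context)} context passages)"


class Assistant:
    """Experience-layer object: owns the interaction, borrows the model."""

    def __init__(self, provider: AnswerProvider) -> None:
        self._provider = provider

    def swap_provider(self, provider: AnswerProvider) -> None:
        # Changing vendor touches this one reference, not the front end.
        self._provider = provider

    def ask(self, question: str, context: List[str]) -> str:
        return self._provider.answer(question, context)


assistant = Assistant(EchoProvider("provider-a"))
first = assistant.ask("Where are the kettles?", ["planogram extract"])
assistant.swap_provider(EchoProvider("provider-b"))
second = assistant.ask("Where are the kettles?", ["planogram extract"])
```

The calling code is identical before and after the swap, which is the property the layer split is meant to protect.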

The usability principles that made it stick

Voice first
Operate without typing, without unlocking the device, and with the screen off. The job keeps both hands busy; the interface must respect that.
Push to talk, one device
Tap the headset for the assistant, hold for team radio. One trusted device, not an extra one to carry.
Voice in, text out
Every user asked by voice, but they preferred reading answers. The pattern surfaced only by watching real usage.
Show the product
Image carousels for product results. A photo resolves "is this the one?" faster than any description.
Right format for the job
Complexity scoring decides whether an answer is spoken or rendered as visual step-by-step guidance.
Co-created in store
Built on the shop floor with the team members who use it, week by week. Feedback loops are part of the product.
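
The deck does not say how complexity scoring works, so the following is only a toy heuristic showing the shape of such a decision. The scoring formula, the threshold and the function names are assumptions made for illustration, not the scoring used in the pilot.

```python
import re

SPOKEN_MAX_SCORE = 3  # hypothetical threshold, chosen only for this example


def complexity_score(answer: str) -> int:
    """Crude score: numbered steps weigh heavily, length weighs lightly."""
    # Count lines that start like "1. " or "2) ".
    steps = len(re.findall(r"^\s*\d+[.)]\s", answer, flags=re.MULTILINE))
    words = len(answer.split())
    return steps * 2 + words // 25


def choose_format(answer: str) -> str:
    """Speak short answers; render multi-step answers as visual guidance."""
    if complexity_score(answer) <= SPOKEN_MAX_SCORE:
        return "spoken"
    return "visual_steps"


short_answer = "Aisle 12, bay 3."
refund_steps = "1. Scan the item.\n2. Check the receipt.\n3. Process the refund."
```

With these inputs, `choose_format(short_answer)` selects the spoken path and `choose_format(refund_steps)` selects visual steps.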

The evidence

100+

Team members using the tool daily, including night-fill casuals

1,000+

Questions asked through the system, and climbing

100%

Of usage came through voice input, none typed

98%

Of questions were product-related: availability, price, location

245

Questions from 26 users in a single weekend after fleet-device deployment

4s

Answer time for common questions, down from 30 seconds at launch

Single-store pilot at Kmart Broadmeadows, deliberately run past the novelty window (3+ weeks) so adoption could be separated from curiosity. The assistant handled policy questions and filtered inappropriate queries without incident.

What the pilot taught us

Build for real demand, not imagined demand
The roadmap assumed policy and process queries would matter. Usage showed that 98% of questions were about product information. The product strategy followed the evidence.
Latency is a usability feature
Nobody waits two minutes on a shop floor. Engineering the knowledge base mattered more than choosing the model.
Measure adoption after the novelty wears off
Week-one numbers flatter every pilot. The trial was extended specifically to see what week four looked like.
Run a dual track on platform decisions
Out-of-the-box enterprise AI for speed, custom front end for fidelity. The custom UX features (push to talk, structured outputs, headset integration) proved to be the deciding factors.
Frame the pilot as the first and last trial
Not an experiment that might be repeated, but version one of a product heading to 55,000 people, with the operational handover designed in from the start.

02 · Auto PDT

The first agentic implementation

Auto PDT is a voice layer that operates the existing applications on Kmart's handheld terminals. The team member speaks; an LLM agent does the tapping.

The agent reads the live screen through Android's accessibility tree, decides the next action, performs it, re-reads the screen and repeats until the task is done. No changes to any existing app or backend system.

Why agentic rather than scripted
Hard-coded screen maps break every time an app updates. The agent adapts to whatever is on screen: if a button moves, it finds the new path.
Why it matters
It turns a six-tap stock check into a spoken sentence, on hardware the business already owns, without a single integration dependency.

Anatomy of the agent

01

Voice listener

Hardware button trigger, so no false activations on a noisy floor. On-device speech-to-text.

→

02

Command interpreter

An LLM resolves natural speech into a structured intent from a bounded action set.

→

Agentic loop

03

Screen reader

Reads live app state through the accessibility tree: fields, buttons, visible text.

⇄

04

Action engine

Read, decide, act, re-read, until the task is done or it stops gracefully.

→

05

Overlay UI

Minimal floating layer: listening state, confirmation, result. Collapses in seconds.

Typically 3-5 LLM calls and one to two seconds of navigation per task. The interpreter matches intent, not exact words, so a partially recognised command can still resolve. No changes to the apps being operated.
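
The read, decide, act, re-read cycle can be sketched as a small loop. Everything here is a toy: `FakeDevice` stands in for an app exposed through the accessibility tree, `decide` stands in for the LLM call, and the step cap of 8 is an arbitrary value for the example. The sketch covers stopping and reporting only, not restoring the app to its starting state.

```python
from dataclasses import dataclass, field

MAX_STEPS = 8  # hypothetical cap on autonomy for this example


@dataclass
class FakeDevice:
    """Toy stand-in for an app read through an accessibility tree."""
    screen: str = "home"
    transitions: dict = field(default_factory=lambda: {
        ("home", "tap:Stock"): "stock_search",
        ("stock_search", "type:sku"): "stock_result",
    })

    def read_screen(self) -> str:
        return self.screen

    def perform(self, action: str) -> None:
        self.screen = self.transitions[(self.screen, action)]


def decide(goal_screen: str, screen: str):
    """Stub for the LLM decision step: next action, 'done', or None."""
    if screen == goal_screen:
        return "done"
    plan = {"home": "tap:Stock", "stock_search": "type:sku"}
    return plan.get(screen)  # None means the next step cannot be determined


def run_task(device: FakeDevice, goal_screen: str, max_steps: int = MAX_STEPS) -> dict:
    for step in range(max_steps):
        screen = device.read_screen()          # read
        action = decide(goal_screen, screen)   # decide
        if action == "done":
            return {"ok": True, "steps": step, "screen": screen}
        if action is None:
            # Stop immediately instead of retrying blindly or guessing.
            return {"ok": False, "steps": step,
                    "reason": "no_next_step", "screen": screen}
        device.perform(action)                 # act; the next pass re-reads
    return {"ok": False, "steps": max_steps,
            "reason": "step_limit", "screen": device.read_screen()}
```

Because the loop re-reads the screen on every pass, the decision step always works from what is actually displayed rather than from a stored screen map.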

Trust is the gating factor, so it is designed in

Bounded action set
The agent can only attempt known intents. It cannot improvise actions outside the list.
Graceful failure, always
If the agent cannot determine the next step, it stops immediately, says so, and leaves the app exactly as it found it. No blind retries, no guessing, no partial actions.
Explicit confirmation on every write
Anything that changes data is read back in plain language and only executes when the team member says confirm.
Capped autonomy
A maximum step count prevents runaway loops. Read-only actions ship first; writes earn their way in.
Phased rollout tied to trust gates
Phase 1: read-only (stock, price, tasks). Phase 2: low-risk writes (task completion). Phase 3: transactional (markdowns, receiving).

A team member burned once by a wrong markdown will never use the system again. Graceful failure is the single most important behaviour.
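
Two of the trust rules above, the bounded action set and explicit confirmation on writes, can be sketched as a single gate in front of execution. The intent names, the read-back wording and the `confirm` and `execute` callbacks are hypothetical placeholders for this example.

```python
from enum import Enum


class Risk(Enum):
    READ = "read"
    WRITE = "write"


# Hypothetical bounded action set: anything not listed cannot be attempted.
ALLOWED_INTENTS = {
    "check_stock": Risk.READ,
    "check_price": Risk.READ,
    "complete_task": Risk.WRITE,
}


def handle_intent(intent, params, confirm, execute):
    """Gate an intent before it reaches the action engine.

    confirm(text) -> bool asks the team member; execute(intent, params)
    performs the action. Both are supplied by the caller.
    """
    risk = ALLOWED_INTENTS.get(intent)
    if risk is None:
        # Outside the bounded set: refuse rather than improvise.
        return {"status": "rejected", "reason": "unknown_intent"}

    if risk is Risk.WRITE:
        readback = (f"About to {intent.replace('_', ' ')} with {params}. "
                    "Say confirm to proceed.")
        if not confirm(readback):
            return {"status": "cancelled"}

    return {"status": "done", "result": execute(intent, params)}
```

Read-only intents pass straight through, while a write never reaches `execute` unless the confirmation callback returns true.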

The economics of agentic, and the discipline around it

$0

Hardware spend. Existing devices, existing apps, existing LLM APIs

<1¢

Per voice command, at 3-5 small LLM calls per action

<4s / 90%

Cycle gates: end-to-end latency under 4 seconds and 90% accuracy, measured over 50+ controlled runs
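
A gate like this can be checked mechanically at the end of a cycle. The sketch below makes two assumptions the deck does not state: latency is summarised as the median across runs, and 90% is treated as a minimum accuracy. The function and field names are hypothetical.

```python
from statistics import median

MAX_LATENCY_S = 4.0   # gate from the deck: under 4 seconds end to end
MIN_ACCURACY = 0.90   # gate from the deck, assumed here to be a minimum
MIN_RUNS = 50         # gate from the deck: 50+ controlled runs


def evaluate_gate(runs):
    """runs: list of (latency_seconds, succeeded) tuples from controlled runs."""
    if len(runs) < MIN_RUNS:
        return {"passed": False, "reason": "not_enough_runs"}

    # Assumption: the median is the latency summary; a percentile is also plausible.
    latency = median(lat for lat, _ in runs)
    accuracy = sum(1 for _, ok in runs if ok) / len(runs)

    passed = latency < MAX_LATENCY_S and accuracy >= MIN_ACCURACY
    return {"passed": passed, "median_latency_s": latency, "accuracy": accuracy}
```

Writing the thresholds down as constants before the build starts keeps the gate from drifting once results arrive.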

Circuit breakers defined before the build starts
If the accessibility tree is unreadable, or voice capture fails, or device management will not permit the service, the bet stops at week two. Worst case is cheap, fast learning.
The market signal
Startups are now selling voice control of consumer phone apps with added hardware. On an enterprise-managed Android fleet, the same capability is pure software.

03 · The wider portfolio

Beyond the floor: the same playbook elsewhere

Sidekick: agentic shopping in the aisle
A Model Context Protocol (MCP) server connecting a consumer LLM to product search, comparison scoring and in-store navigation. Most agentic commerce targets online; the physical store is open ground.
Zone Manager's Best Friend
AI decision support for the people who run zones, shaped through research with the managers themselves and partnered with the insights team to build on existing data foundations.
One Way Flow and computer vision
Replenishment redesigned as a physical process first: 41.4% improvement in hours per thousand cartons in the trial store before AI was layered on. Vision and automation then compound a process that already works.
Agent teams as innovation capacity
A full pre-MVP strategy package (research, financial modelling, supplier analysis, go-to-market) produced by a managed AI agent team inside a single six-week cycle.

04 · Strategy in practice

Remove waste, eliminate variability, then digitise

01

Remove waste

Some processes should be eliminated, not automated. The simplification work cut whole activities before anyone wrote a line of code.

→

02

Eliminate variability

Digitising a broken manual process does not fix it; it just makes the variability visible faster. One Way Flow standardised the physical work and delivered the 41.4% improvement on its own.

→

03

Digitise, then make it agentic

AI lands best on a process that is already simple and stable. The assistant, the agent and the vision layer all compound from that base.

There is a limit to how far digital can go in a manual process. Done in the wrong order, digital only amplifies the noise.

05 · Where this goes

From answering, to operating, to orchestrating

Proven in store

Assistant

Answers on demand. The Team Member Assistant compresses information retrieval to a spoken question.

In build, trust-gated

Agent

Operates the tools. Auto PDT performs the work inside existing systems instead of just answering questions.

The trajectory

Orchestration

Humans direct intent. A manager directs dozens of store-level agents; attention becomes the scarce resource and interfaces shift from operation to direction.

The frontline voice tools are the first rung. The design rules they proved (voice first, bounded autonomy, graceful failure, trust earned in phases) are the same rules the whole ladder will need.

Closing

What I would take into any retail operation

Start where the hands are full
The frontline is the underserved frontier, and the fastest place to prove value.
Evidence from real stores beats demos
Pilot in production conditions, past the novelty window, and let usage rewrite the roadmap.
Design trust before capability
Bounded actions, graceful failure and confirmation on writes are what make agents deployable.
Voice in, text out, co-created on the floor
Usability principles discovered with users are the adoption strategy.
Sequence it
Simplify the process, then digitise it, then make it agentic. In that order.

Fabio Oliveira · June 2026
fabio.so · Questions and discussion welcome