Agentic AI in store operations: strategy, implementation and evidence from Kmart Australia.
Retail AI investment goes to chatbots, search and personalisation. The people who run the store are largely ignored by that same investment wave.
Team members carry stock, push trolleys and serve customers. Every screen interaction interrupts the physical work that defines the job.
A single stock check is six taps. Across a national device fleet, that is thousands of hours of repetitive navigation every week.
The rest of this deck is what that looked like in practice, with the numbers it produced.
Innovation runs as an internal agency: it takes a brief, a timeline and an expected outcome, and solves within those boundaries.
Every project is a bet with a fixed appetite. Time is fixed, scope is variable, and each bet has explicit gates and circuit breakers.
Decentralised innovation does not survive inside large incumbents. A small, skilled, dedicated team working alongside operations does.
What people say and what models predict is exploration. Only what is observed in real store conditions counts as validation.
Once a solution is validated, it transitions to the operating business. The handover is planned from day one, not at the end.
Six-week bets with defined gates. If confidence drops below the threshold mid-cycle, the bet is stopped, not stretched.
One thing working end to end in a real store beats five things half-built in a lab.
Voice-first AI assistant answering product, policy and process questions on the devices team members already carry.
Kmart's first agentic implementation: an LLM agent that operates existing store applications by voice.
Agentic face-to-face shopping assistant connecting a consumer LLM to product search and in-store navigation.
AI decision support for zone managers, researched and shaped with the managers themselves.
Replenishment flow redesign with computer vision layered on top, measured store by store.
The next sections go deep on the two frontline implementations: the assistant and the agent.
The Team Member Assistant (TMA) is a voice-first AI assistant for store team members, running on the Zebra handheld devices already used for daily store work.
It answers the questions that dominate a shift: is this in stock, where is it, what does it cost, what is the process for this, what is the policy on that.
Information lives in many systems. The team member, often with a customer waiting, has to stop and navigate to find it.
If an assistant answers in seconds, by voice, without interrupting the physical work, it gets used. If it gets used, it changes service speed.
Rounds of exploratory design with team members, testing a wide spread of use cases to pin down the non-negotiable features and build a roadmap for what comes next.
Rather than debating architectures, we built three candidate paths simultaneously inside one cycle and let evidence pick the winner.
Custom React Native front end for the interaction layer and an enterprise LLM platform (Gemini) for retrieval and reasoning. Cloud-agnostic by design, so the provider can change without rebuilding the experience.
Real-time product availability from the data warehouse, refreshed every two hours, with nearby-store suggestions when an item is unavailable.
Policy and process PDFs restructured into machine-readable formats. This alone cut complex responses from minutes to seconds.
Runs on store-issued Zebra PDTs with team headsets, inside the store's device management environment, not on demo hardware.
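The availability behaviour described above (two-hour refresh, nearby-store suggestions) can be sketched roughly as follows. This is an illustration only, not the production logic: `StockRecord`, `answer_stock_query` and the store and SKU identifiers are hypothetical names, and the staleness note is an assumed design choice.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# From the deck: warehouse data is refreshed every two hours.
REFRESH_INTERVAL = timedelta(hours=2)

@dataclass
class StockRecord:  # hypothetical shape of one warehouse row
    store: str
    sku: str
    on_hand: int
    as_of: datetime  # time of the snapshot this row came from

def answer_stock_query(sku, home_store, nearby_stores, records, now):
    """Answer for the home store, falling back to nearby stores with stock."""
    by_store = {r.store: r for r in records if r.sku == sku}
    home = by_store.get(home_store)
    if home and home.on_hand > 0:
        # Assumption: flag answers older than one refresh interval.
        stale = (now - home.as_of) > REFRESH_INTERVAL
        note = " (data may be stale)" if stale else ""
        return f"{home.on_hand} on hand at {home_store}{note}"
    alternatives = [
        s for s in nearby_stores
        if s in by_store and by_store[s].on_hand > 0
    ]
    if alternatives:
        return f"Not available at {home_store}. Try: {', '.join(alternatives)}"
    return f"Not available at {home_store} or nearby stores"
```

Keeping the snapshot time on every record lets the assistant say how fresh an answer is instead of implying live stock.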
Custom React Native front end: voice capture, push to talk, image carousels, headset and device integration. This is where adoption is won or lost.
Enterprise LLM platform (Gemini) for retrieval and reasoning. Cloud-agnostic by design: the provider can change without rebuilding the experience.
Real-time availability from the data warehouse, refreshed every two hours, plus policy and process content restructured for retrieval.
The split is deliberate: the experience is the moat with users, while the model underneath is a commodity that keeps improving.
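One way to picture the split between experience and model is a thin interface that the front end depends on, with the provider plugged in behind it. The sketch below is an assumption about how such a seam could look, not the team's actual code: `AnswerProvider`, `StubProvider` and `Assistant` are hypothetical names, and the stub stands in for a real platform adapter.

```python
from typing import Protocol

class AnswerProvider(Protocol):
    """Anything that can turn a question into answer text."""
    def answer(self, question: str) -> str: ...

class StubProvider:
    """Stand-in for a real platform adapter (hypothetical)."""
    def __init__(self, name: str):
        self.name = name

    def answer(self, question: str) -> str:
        return f"[{self.name}] {question}"

class Assistant:
    """Experience layer: depends only on the AnswerProvider interface."""
    def __init__(self, provider: AnswerProvider):
        self._provider = provider

    def ask(self, question: str) -> dict:
        text = self._provider.answer(question)
        # The response shape the front end renders stays the same
        # whichever provider produced the text.
        return {"text": text, "display": "card"}
```

Swapping the provider then means writing a new adapter, while voice capture, rendering and headset integration stay untouched.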
Operate without typing, without unlocking the device, and with the screen off. The job keeps both hands busy; the interface must respect that.
Tap the headset for the assistant, hold for team radio. One trusted device, not another one.
Every user asked by voice but preferred to read the answers. The pattern surfaced only by watching real usage.
Image carousels for product results. A photo resolves "is this the one?" faster than any description.
Complexity scoring decides whether an answer is spoken or rendered as visual step-by-step guidance.
Built on the shop floor with the team members who use it, week by week. Feedback loops are part of the product.
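The complexity scoring principle above could look something like the sketch below. The deck does not say how the score is computed, so the formula, the threshold and the names `complexity_score` and `choose_modality` are all assumptions made for illustration.

```python
import re

# Hypothetical threshold: at or below this, the answer is short enough to speak.
SPOKEN_MAX_SCORE = 3

def complexity_score(answer: str) -> int:
    """Assumed scoring: numbered steps weigh heavily, length adds a little."""
    # Count lines that start like "1. " or "2) ".
    numbered_steps = len(re.findall(r"(?m)^\s*\d+[.)]\s", answer))
    word_count = len(answer.split())
    return 2 * numbered_steps + word_count // 15

def choose_modality(answer: str) -> str:
    """Return 'spoken' for simple answers, 'visual_steps' for complex ones."""
    if complexity_score(answer) <= SPOKEN_MAX_SCORE:
        return "spoken"
    return "visual_steps"
```

For example, a short location answer would be spoken, while a three-step process would be rendered as step-by-step guidance on screen.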
Single-store pilot at Kmart Broadmeadows, deliberately run past the novelty window (3+ weeks) so adoption could be separated from curiosity. The assistant handled policy questions and filtered inappropriate queries without incident.
The roadmap assumed policy and process queries would matter. Usage said 98% product information. The product strategy followed the evidence.
Nobody waits two minutes on a shop floor. Engineering the knowledge base mattered more than choosing the model.
Week one numbers flatter every pilot. The trial was extended specifically to see what week four looked like.
Out-of-the-box enterprise AI for speed, custom front end for fidelity. The custom UX features (push to talk, structured outputs, headset integration) proved to be the deciding factors for adoption.
Not an experiment that might be repeated, but version one of a product heading to 55,000 people, with the operational handover designed in from the start.
Auto PDT is a voice layer that operates the existing applications on Kmart's handheld terminals. The team member speaks; an LLM agent does the tapping.
The agent reads the live screen through Android's accessibility tree, decides the next action, performs it, re-reads the screen and repeats until the task is done. No changes to any existing app or backend system.
Hard-coded screen maps break every time an app updates. The agent adapts to whatever is on screen: if a button moves, it finds the new path.
It turns a six-tap stock check into a spoken sentence, on hardware the business already owns, without a single integration dependency.
Hardware button trigger, so no false activations on a noisy floor. On-device speech-to-text.
An LLM resolves natural speech into a structured intent from a bounded action set.
Reads live app state through the accessibility tree: fields, buttons, visible text.
Read, decide, act, re-read, until the task is done or it stops gracefully.
Minimal floating layer: listening state, confirmation, result. Collapses in seconds.
Typically three to five LLM calls and one to two seconds of navigation per task. The interpreter works on intent, not exact words, so a partially recognised utterance can still resolve to the right action. No changes to the apps being operated.
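The read, decide, act, re-read cycle can be sketched as a small loop. This is a simplified model under stated assumptions: `FakeDevice` stands in for the accessibility layer, `decide` stands in for the LLM call, and the intent fields and the step cap value are hypothetical.

```python
from dataclasses import dataclass

MAX_STEPS = 8  # hypothetical cap on actions per task

@dataclass
class FakeDevice:
    """Stand-in for the accessibility layer: visible text plus tappable buttons."""
    screens: dict  # screen name -> {"text": [...], "buttons": {label: next screen}}
    current: str = "home"

    def read_screen(self) -> dict:
        s = self.screens[self.current]
        return {"text": list(s["text"]), "buttons": list(s["buttons"])}

    def tap(self, label: str) -> None:
        self.current = self.screens[self.current]["buttons"][label]

def decide(intent: dict, screen: dict) -> tuple:
    """Stand-in for the LLM: returns ('done', value), ('tap', label) or ('stop', reason)."""
    for line in screen["text"]:
        if line.startswith(intent["answer_prefix"]):
            return ("done", line)
    for label in screen["buttons"]:
        if label in intent["useful_buttons"]:
            return ("tap", label)
    return ("stop", "no known next step")

def run_task(device: FakeDevice, intent: dict, max_steps: int = MAX_STEPS) -> dict:
    for _ in range(max_steps):
        screen = device.read_screen()          # read
        action, arg = decide(intent, screen)   # decide
        if action == "done":
            return {"status": "done", "result": arg}
        if action == "stop":
            return {"status": "stopped", "reason": arg}
        device.tap(arg)                        # act; the next pass re-reads
    return {"status": "stopped", "reason": "step limit reached"}
```

Because the loop re-reads the screen after every action, it follows the labels it finds rather than fixed coordinates or a stored screen map.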
The agent can only attempt known intents. It cannot improvise actions outside the list.
If the agent cannot determine the next step, it stops immediately, says so, and leaves the app exactly as it found it. No blind retries, no guessing, no partial actions.
Anything that changes data is read back in plain language and only executes when the team member says confirm.
A maximum step count prevents runaway loops. Read-only actions ship first; writes earn their way in.
Phase 1 read-only (stock, price, tasks), phase 2 low-risk writes (task completion), phase 3 transactional (markdowns, receiving).
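The bounded action list, confirmation on writes and phased rollout can be combined into a single gate in front of the agent. The sketch below is one possible shape, assuming hypothetical intent names derived from the examples in the phases above; `authorise` and its return strings are illustrative, not the shipped interface.

```python
from enum import IntEnum

class Phase(IntEnum):
    READ_ONLY = 1        # stock, price, tasks
    LOW_RISK_WRITE = 2   # task completion
    TRANSACTIONAL = 3    # markdowns, receiving

# Hypothetical intent names; the agent can only attempt what is listed here.
INTENTS = {
    "check_stock": Phase.READ_ONLY,
    "check_price": Phase.READ_ONLY,
    "list_tasks": Phase.READ_ONLY,
    "complete_task": Phase.LOW_RISK_WRITE,
    "apply_markdown": Phase.TRANSACTIONAL,
    "receive_stock": Phase.TRANSACTIONAL,
}

def authorise(intent: str, enabled_phase: Phase, heard_confirm: bool = False) -> str:
    """Decide what the agent may do with a resolved intent."""
    if intent not in INTENTS:
        return "refuse: unknown intent"          # no improvised actions
    required = INTENTS[intent]
    if required > enabled_phase:
        return "refuse: not enabled in this phase"
    if required > Phase.READ_ONLY and not heard_confirm:
        return "read back and wait for confirm"  # writes need a spoken confirm
    return "execute"
```

With this gate, moving from phase 1 to phase 2 is a configuration change to `enabled_phase`, and every write still passes through the read-back step.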
If the accessibility tree is unreadable, or voice capture fails, or device management will not permit the service, the bet stops at week two. Worst case is cheap, fast learning.
Startups are now selling voice control of consumer phone apps with added hardware. On an enterprise-managed Android fleet, the same capability is pure software.
An MCP-based server connecting a consumer LLM to product search, comparison scoring and in-store navigation. Most agentic commerce targets online; the physical store is open ground.
AI decision support for the people who run zones, shaped through research with the managers themselves and partnered with the insights team to build on existing data foundations.
Replenishment redesigned as a physical process first: 41.4% improvement in hours per thousand cartons in the trial store before AI was layered on. Vision and automation then compound a process that already works.
A full pre-MVP strategy package (research, financial modelling, supplier analysis, go-to-market) produced by a managed AI agent team inside a single six-week cycle.
Some processes should be eliminated, not automated. The simplification work cut whole activities before anyone wrote a line of code.
Digitising a broken manual process does not fix it; it just makes the variability visible faster. One Way Flow standardised the physical work and delivered the 41.4% improvement in hours per thousand cartons on its own.
AI lands best on a process that is already simple and stable. The assistant, the agent and the vision layer all compound from that base.
Answers on demand. The Team Member Assistant compresses information retrieval to a spoken question.
Operates the tools. Auto PDT performs the work inside existing systems instead of just answering questions.
Humans direct intent. When a manager directs dozens of store-level agents, attention becomes the scarce resource and interfaces shift from operation to direction.
The frontline is the underserved frontier, and the fastest place to prove value.
Pilot in production conditions, past the novelty window, and let usage rewrite the roadmap.
Bounded actions, graceful failure and confirmation on writes are what make agents deployable.
Usability principles discovered with users are the adoption strategy.
Simplify the process, then digitise it, then make it agentic. In that order.