Enterprise AI Assistant
Enterprise SaaS · Confidential, visuals abstracted · 2025
Illustration (abstracted): the assistant states its reading, "I read this as place of incorporation, not tax residency. Confirm before I apply.", for a change covering 12 entities (US → Cayman Islands). Above 10 entities, the change is listed before it runs.
Making bulk AI actions safe to approve.
Legal and financial teams manage diagrams with dozens of interdependent entities. Updating a jurisdiction, ownership state, or entity attribute across a structure could take hours of repetitive editing.
The assistant could reduce that work to a single instruction. My job was to make the result safe to rely on: show what the system understood, make the scope easy to review, and give users a clear way back if something was wrong.
The first prototype saved clicks, not review time.
Early versions could parse a request, find the right entities, and execute the change. They worked well in a demo. In testing, people tried a small edit, watched it succeed, then returned to the manual workflow.
The reason was straightforward: they still had to inspect every change one by one. During a walkthrough involving roughly 40 entities, verification became manual again. Once a change touched more than about ten entities, users stopped scanning the diagram reliably and started checking items individually. That became the threshold for a different interaction.
I defined the rules around execution.
Working with a PM, engineering lead, and design lead, I designed the end-to-end interaction model: how the assistant reads intent, handles ambiguity, scopes an action, previews the result, confirms the change, executes, and recovers.
The product stays fast for small edits, but slows down when the cost of a mistake rises.
Confirmation threshold set at 10 entities: the point where users stopped scanning the diagram and started checking entities one by one.
Three decisions shaped the shipped interaction.
Add friction only when the scope demands it
The initial proposal was one confirmation step for every action. I rejected that because it treated a two-entity edit and a forty-entity edit as the same risk.
For smaller changes, the assistant executes and shows the result for visual checking. At ten or more entities, it pauses first and lists every affected item. Users review the scope, then confirm or discard. The extra step appears only when the interface can no longer show the full change at a glance.
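The scope rule above can be written as a small decision function. This is a minimal sketch, not the shipped logic: `CONFIRMATION_THRESHOLD`, `ChangePlan`, and `plan_bulk_change` are hypothetical names, and the cutoff follows the "ten or more entities" wording of this section.

```python
from dataclasses import dataclass

# Assumption: cutoff taken from the "ten or more entities" rule in the text.
CONFIRMATION_THRESHOLD = 10


@dataclass(frozen=True)
class ChangePlan:
    """Hypothetical result of scoping a bulk change."""
    mode: str                   # "execute" or "preview"
    affected: tuple[str, ...]   # every entity the change would touch


def plan_bulk_change(affected_entities: list[str]) -> ChangePlan:
    """Decide whether a change runs immediately or pauses for review."""
    affected = tuple(affected_entities)
    if len(affected) >= CONFIRMATION_THRESHOLD:
        # Large scope: list every affected item and wait for confirm or discard.
        return ChangePlan(mode="preview", affected=affected)
    # Small scope: run now and let the user check the result on the diagram.
    return ChangePlan(mode="execute", affected=affected)


small = plan_bulk_change(["Entity A", "Entity B"])
large = plan_bulk_change([f"Entity {i}" for i in range(12)])
print(small.mode)                       # execute
print(large.mode, len(large.affected))  # preview 12
```

Keeping the threshold as a single named constant means the extra step is tied to scope alone, so a two-entity edit never pays the cost of a forty-entity one.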
State the interpretation when a request is ambiguous
“Move all US entities” can mean several things in a legal workflow: place of incorporation, tax residency, or asset location. The model should not silently choose one.
I defined ambiguity triggers and a one-question limit. When a request has more than one credible reading, the assistant states its interpretation in plain language and asks the user to confirm or correct it before anything is scoped.
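As a rough illustration of that rule, the sketch below takes the credible readings of a request and decides whether to scope the change or ask one confirming question. Everything named here (`Decision`, `resolve_reading`, `MAX_CLARIFYING_QUESTIONS`) is hypothetical, and two behaviours are assumptions: the readings arrive ordered with the most likely first, and the assistant stops without acting once the one-question limit is used up.

```python
from dataclasses import dataclass
from typing import Optional

MAX_CLARIFYING_QUESTIONS = 1  # the one-question limit described above


@dataclass(frozen=True)
class Decision:
    """Hypothetical outcome of interpreting a request."""
    action: str               # "scope", "clarify", or "stop"
    reading: Optional[str]    # interpretation being acted on or proposed
    message: str = ""         # plain-language text shown to the user


def resolve_reading(credible_readings: list[str], questions_asked: int = 0) -> Decision:
    """Pick the next step from the credible readings of a request."""
    if len(credible_readings) == 1:
        # Only one credible reading: nothing to ask, go on to scoping.
        return Decision("scope", credible_readings[0])
    if not credible_readings or questions_asked >= MAX_CLARIFYING_QUESTIONS:
        # Assumption: stop without acting instead of guessing or asking again.
        return Decision("stop", None, "I could not determine what to change.")
    # Several credible readings: state the most likely one and ask to confirm.
    proposed, *others = credible_readings
    message = (
        f"I read this as {proposed}, not {' or '.join(others)}. "
        "Confirm before I apply."
    )
    return Decision("clarify", proposed, message)


decision = resolve_reading(["place of incorporation", "tax residency"])
print(decision.action)   # clarify
print(decision.message)
```

The point of the structure is that nothing reaches scoping while more than one reading is still open.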
Keep the assistant bounded
There was pressure to make the assistant feel open-ended and conversational. That model works for general-purpose chat. It is a poor fit for a professional changing structured data used in regulated work.
I defined clear entry conditions for four behaviours: execute, clarify, surface, and reason. The assistant can help users move faster and spot issues, but it does not improvise beyond the workflow.
A partner at a US law firm cited the confirmation step as the reason they trusted the product.
The beta showed where confidence was won.
The assistant ran as a 63-day live beta with 11 enterprise client testers. Half of the observed accuracy gaps were caught at the preview step, before execution. Ninety percent of testers understood the capability immediately or after guided exploration, and 70% found it easy to use without careful prompt-writing.
The strongest signal was behavioural: users were more willing to automate larger edits once the affected entities were visible before commit.
I turned failure states into a shared error framework.
As the product grew, I found that error messages were being written case by case and sometimes exposed infrastructure language to users. I mapped the recurring states into four levels: informational, warning, recoverable, and blocking. Each level had a consistent presentation rule, tone, and recovery action.
The engineering lead adopted the framework product-wide and used it to review existing error codes. That work surfaced mismatched severities and gave the team a common standard for later AI features.
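One way to picture the framework is as a lookup from internal error codes to one of the four levels, with each level carrying its own presentation rule, tone, and recovery action. The sketch below is illustrative only: the level names come from this section, but the rule values, the `UPSTREAM_TIMEOUT` code, the message copy, and the choice to treat unknown codes as blocking are all hypothetical placeholders.

```python
from dataclasses import dataclass
from enum import Enum


class Severity(Enum):
    INFORMATIONAL = "informational"
    WARNING = "warning"
    RECOVERABLE = "recoverable"
    BLOCKING = "blocking"


@dataclass(frozen=True)
class LevelRule:
    presentation: str
    tone: str
    recovery_action: str


# Hypothetical placeholder values; the real rules are not given in this write-up.
LEVEL_RULES = {
    Severity.INFORMATIONAL: LevelRule("inline note", "neutral", "none needed"),
    Severity.WARNING: LevelRule("inline banner", "cautious", "review before continuing"),
    Severity.RECOVERABLE: LevelRule("dialog", "direct", "retry or undo"),
    Severity.BLOCKING: LevelRule("blocking dialog", "direct", "contact support"),
}

# Hypothetical catalog: internal code -> (level, user-facing copy).
ERROR_CATALOG = {
    "UPSTREAM_TIMEOUT": (
        Severity.RECOVERABLE,
        "The change did not finish. Nothing was modified.",
    ),
}


def present_error(code: str) -> dict[str, str]:
    """Translate an internal code into user-facing fields, hiding the code itself."""
    # Assumption: codes missing from the catalog are treated as blocking.
    severity, user_message = ERROR_CATALOG.get(
        code, (Severity.BLOCKING, "Something went wrong. No changes were applied.")
    )
    rule = LEVEL_RULES[severity]
    return {
        "severity": severity.value,
        "message": user_message,
        "presentation": rule.presentation,
        "tone": rule.tone,
        "recovery_action": rule.recovery_action,
    }


shown = present_error("UPSTREAM_TIMEOUT")
print(shown["severity"], "->", shown["recovery_action"])
```

Because every code has to name a level to enter the catalog, a table like this also gives reviewers one place to spot severities that do not match the actual impact.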
A reusable pattern needs a written rule.
The confirmation flow held up in use. The part I underinvested in initially was documentation. When adjacent teams began applying the pattern, the edge cases were still in my Figma file and in conversations. I now write the decision rule alongside the design so someone who was not in the room can apply it correctly.