What Happens to Customer Data When You Add AI to Your Business
If you're using Claude, ChatGPT, or any LLM API in customer-facing workflows, your data is going somewhere. Here's the plain-English version of where, what stays where, and what you should tell your customers.
Where the data actually goes
API request lifecycle
Your app sends
Minimum data needed
API processes
Not used for training
Response returns
Back to your app
Retention
30 days, then deleted
Every small business owner I talk to about implementation eventually asks the same question, usually nervously: *"Is my customer data safe when I send it through AI?"*
The honest answer is *mostly yes, but here's exactly what you need to know to be sure*. This post is that exact-what-you-need-to-know.
Not legal advice. Not a substitute for asking your attorney about your specific compliance situation (HIPAA, GLBA, state privacy laws). But the plain-English baseline that should be true for any small business adding AI to workflows that touch customer data.
The mental model
When you call an AI through an API (Claude, GPT-4, Gemini, anything), three things happen:
1. Your application sends a request, which contains whatever data you put in it... customer name, message, context.
2. The API provider's servers process it and send back a response.
3. The data either gets *deleted right away*, *held briefly for abuse monitoring*, or *retained longer* depending on the provider and your account settings.
That third bullet is the whole game. Different providers have different defaults, and they're not always obvious.
What the major providers actually do
Anthropic (Claude API). By default, API inputs and outputs are not used to train Claude. Anthropic retains data for up to 30 days for trust and safety monitoring (looking for abuse), then deletes it. Enterprise plans can configure zero retention.
OpenAI (GPT-4 API). Same general posture. API data is not used for training by default. Standard retention is 30 days for abuse monitoring. Zero retention is available on enterprise plans.
Google (Gemini API). API data is not used for training, but data is retained for up to 24 months for service improvement unless you opt out via specific plan configuration.
The pattern: API calls (developer-facing) are private by default. The free consumer chat products (ChatGPT.com, Claude.ai, Gemini.com) are a separate question with different rules. If you or your team are pasting customer data into the free web chat, that data may be used for training. Don't do that.
What "training data" actually means
Even when training is involved, the data isn't memorized verbatim and served back to other users. Large language models learn statistical patterns from huge corpora. They don't store individual conversations as lookups. The risk isn't *"another customer asks Claude about Jane Smith and gets her address."* That doesn't happen.
The risk is *reputational and contractual*. If you signed an NDA with a client saying their data won't be processed by third parties, and you sent that data to an LLM API, you may have violated that contract. The question is rarely *"did this leak"*... it's *"did you have the right to send it in the first place."*
What you should actually do
Three things, in order of importance.
Use API access, not free consumer chat, for anything customer-facing. This alone closes 90% of the data-handling questions. The API has clearer terms, no training on your inputs, and configurable retention.
Add a sentence to your privacy policy. Something like: *"We use AI tools to draft communications, summarize information, and route customer messages. The providers we use (currently Anthropic and OpenAI) do not retain or train on this data, and we don't share data with them beyond what's needed for the immediate task."* Tighten the wording with your attorney if you're in a regulated space.
Don't send what you don't need to send. If your AI is drafting a reply, it needs the customer's previous message. It doesn't need their SSN, their credit card, or their full medical history. Filter inputs to the minimum useful set. This is good engineering anyway... it makes the AI faster and cheaper.
The compliance edge cases
If your business handles regulated data, you have extra work.
HIPAA-covered entities (healthcare practices, dental offices, some allied services) need a Business Associate Agreement (BAA) with any AI provider touching PHI. Anthropic offers BAAs for enterprise customers. OpenAI does too. Don't skip this step.
Financial services under GLBA have their own rules about third-party data sharing. Talk to your compliance officer before routing customer financial details through any API.
California businesses under CCPA need to disclose AI processing in privacy notices. The sentence above gets you most of the way there.
The bottom line
For a typical small business in trades, real estate, or professional services, running customer messages through Claude or GPT-4 via API is a routine third-party data processing arrangement, not a special category of risk. The same way you'd handle data going to QuickBooks or Mailchimp. A contract, a retention policy, a privacy notice. AI doesn't get a special carve-out, but it also doesn't require a special new playbook.
If you want me to look at your specific setup before you roll anything out, that's part of what the audit covers. Email me with the workflow you're considering and I'll tell you what to check before you ship it.
The AI Operations Audit
Find your version of this in 5 days.
Same methodology. Your operation. ROI math on every opportunity. $1,500 founding rate while the first two spots last... free if we don't find at least three.
Ken Jackson
Founder of LvlUp Agency. 20+ years in product management and software engineering. VP of Engineering at Camp Gladiator, VP of Product at Volusion. Now building AI systems for trades and field service businesses in Austin, TX and beyond.
About Ken →