
Why most business chatbots are terrible


Most chatbots annoy customers instead of helping them. Here is what goes wrong and how to design one that actually resolves problems.


Quick answer

They try to do too much. Good chatbots focus on the top 5-10 most common questions, answer those well, and hand off to a human for everything else. The key is honest scope: admit what the bot can't do and escalate quickly.

We've all had the experience. You visit a website, a chat bubble appears, you type your question, and the bot responds with something completely irrelevant. You try rephrasing. Same useless answer. You ask for a human. The bot says "I'm sorry, I didn't understand that. Can you rephrase your question?"

Most business chatbots are terrible. Not because the technology is bad, but because they're designed wrong. They try to do too much, they don't know when to give up, and they frustrate customers who just wanted a quick answer.

Here's how to build one that actually helps.

Why most chatbots fail

They try to handle everything

The most common mistake is building a chatbot that attempts to answer any possible question. This leads to a system that's mediocre at everything and good at nothing. When the bot encounters something it doesn't know, it either makes up an answer or loops through the same "I didn't understand" response.

They don't escalate to humans

A chatbot that never hands off to a real person isn't a support tool. It's a wall between your customer and help. The moment a bot can't resolve an issue, the customer should be connected to a human with the full conversation context included.

They're trained on the wrong data

Many chatbots are built on FAQs that don't match what customers actually ask. The FAQ says "How do I reset my password?" but customers type "I can't get into my account" or "login broken." If the bot doesn't understand the various ways people phrase the same question, it fails on the most common requests.

They sound robotic

"I'd be happy to help you with that! Let me check on that for you. One moment please!" Nobody talks like that. Customers can tell it's a bot, and the fake enthusiasm makes it worse. Natural, concise responses build more trust than performative friendliness.

The design principles that work

Narrow scope, deep competence

The best chatbots handle a small number of tasks extremely well. For an e-commerce business, that might be:

  1. Order status ("Where is my order?")
  2. Returns ("How do I return this?")
  3. Product availability ("Is this in stock in size M?")
  4. Delivery info ("When will it arrive?")

Four tasks. Not forty. Each one has a clear data source (order API, inventory API, returns policy) and a clear resolution (show the status, start the return, check the stock). Anything outside these four gets routed to a human immediately.

Transparent handoff

When the bot can't help, hand off to a human immediately and include the full conversation. Nothing frustrates a customer more than explaining their problem twice. The handoff message should be honest: "I can't help with this one, but I'm connecting you to someone who can."
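One way to sketch that handoff is a payload that carries the whole transcript to the live-agent queue. Everything here (the field names, the `build_handoff` helper, the queue's expected format) is illustrative, not any particular product's API:

```python
import json

def build_handoff(session_id: str, transcript: list) -> str:
    """Package the whole conversation for the live-agent queue so the
    customer never has to explain their problem twice."""
    return json.dumps({
        "session_id": session_id,
        "reason": "out_of_scope",
        "transcript": transcript,  # every bot and customer message so far
        "message_to_customer": (
            "I can't help with this one, but I'm connecting you to "
            "someone who can."
        ),
    })

payload = build_handoff("sess-42", [
    {"from": "customer", "text": "Can you change the name on my invoice?"},
    {"from": "bot", "text": "I can't help with this one, but I'm "
                            "connecting you to someone who can."},
])
```

The point of the structure is the `transcript` field: the human agent opens the conversation already knowing what was asked.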

Template responses with dynamic data

Don't use an LLM to generate free-form responses to customers. Seriously. The risk of a hallucinated return policy, a wrong order date, or an inappropriate tone is not worth the marginal improvement over a well-written template.

Use templates with dynamic data inserted: "Your order #12345 shipped on March 14 via Royal Mail. Tracking number: ABC123." Reliable, accurate, and fast.

Tip

Save LLM intelligence for understanding the customer's question, not for generating the answer. Classify the intent with AI, then respond with a vetted template populated with real data.
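A minimal sketch of that split, using Python's `string.Template` for the vetted responses. The template wording, intent names, and field names are invented for illustration; the data would come from your order and returns APIs:

```python
from string import Template

# Vetted response templates keyed by intent; the bot never free-generates text.
TEMPLATES = {
    "order_status": Template(
        "Your order #$order_id shipped on $ship_date via $carrier. "
        "Tracking number: $tracking"
    ),
    "return_request": Template(
        "Returns for order #$order_id are open until $deadline. "
        "Start one here: $link"
    ),
}

def render(intent: str, data: dict) -> str:
    """Fill a vetted template with real data; substitute() raises if a
    field is missing, so a broken lookup fails loudly instead of lying."""
    return TEMPLATES[intent].substitute(data)

print(render("order_status", {
    "order_id": "12345",
    "ship_date": "March 14",
    "carrier": "Royal Mail",
    "tracking": "ABC123",
}))
# → Your order #12345 shipped on March 14 via Royal Mail. Tracking number: ABC123
```

Because the text is fixed at review time, there is nothing for the model to hallucinate: the only variable parts are values read from your own systems.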

Conversation memory within a session

The bot should remember what was discussed earlier in the conversation. If a customer asks about an order and then asks "what about the other one?", the bot needs to know what "the other one" refers to. This is basic UX but a surprising number of chatbots treat every message as a new conversation.
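A toy sketch of per-session memory. The reference-resolution rules here are deliberately crude stand-ins (a real bot would let the LLM resolve "the other one" against the session state), and the class and method names are made up:

```python
from dataclasses import dataclass, field

@dataclass
class Session:
    """Per-conversation state, so follow-ups can refer back to earlier turns."""
    order_ids: list = field(default_factory=list)

    def note_order(self, order_id: str) -> None:
        self.order_ids.append(order_id)

    def resolve_order_reference(self, message: str):
        text = message.lower()
        # "the other one" -> the previously mentioned order, if there is one
        if "other one" in text and len(self.order_ids) >= 2:
            return self.order_ids[-2]
        # bare follow-ups ("where is it?") default to the most recent order
        if self.order_ids:
            return self.order_ids[-1]
        return None

session = Session()
session.note_order("12345")   # customer: "Where is order 12345?"
session.note_order("67890")   # customer: "And order 67890?"
print(session.resolve_order_reference("what about the other one?"))  # → 12345
```

The mechanism matters more than the rules: each message is interpreted against the session's accumulated state, never in isolation.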

The technical approach

A well-designed chatbot has three layers (similar to the triage system architecture):

Intent detection: Classify what the customer is asking for. This can be a fine-tuned classifier, a rules-based keyword matcher, or an LLM call. The output is a category (order_status, return_request, etc.) and a confidence score.

Entity extraction: Pull out the relevant details. Order number, product name, email address. For simple cases, regex patterns work. For messier inputs, an LLM or NER model is more reliable.

Action and response: Look up the data (query the order API, check the returns database), apply business logic (is the return window still open?), and respond with a template.

If the intent confidence is below your threshold, or the intent doesn't match any supported category, go straight to human handoff. No "sorry I didn't understand" loop.
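The three layers above can be sketched end to end. The keyword matcher is a stand-in for a real classifier or LLM call, and the intent names, threshold value, and order-number regex are all assumptions:

```python
import re

SUPPORTED = {"order_status", "return_request"}
CONFIDENCE_THRESHOLD = 0.7

def detect_intent(message: str):
    """Layer 1: classify the request. A keyword matcher stands in for a
    fine-tuned classifier or LLM call; both return (category, confidence)."""
    msg = message.lower()
    if any(k in msg for k in ("where is my order", "track", "shipped")):
        return "order_status", 0.9
    if any(k in msg for k in ("return", "send back", "refund")):
        return "return_request", 0.85
    return "unknown", 0.0

def extract_order_id(message: str):
    """Layer 2: pull out entities. Regex is fine for tidy inputs like
    order numbers; messier fields want an LLM or NER model."""
    match = re.search(r"#?(\d{5,})", message)
    return match.group(1) if match else None

def handle(message: str) -> str:
    """Layer 3: act. Low confidence or unsupported intent goes straight
    to a human -- no 'sorry I didn't understand' loop."""
    intent, confidence = detect_intent(message)
    if intent not in SUPPORTED or confidence < CONFIDENCE_THRESHOLD:
        return "handoff"           # connect a human, full transcript attached
    order_id = extract_order_id(message)
    if order_id is None:
        return "ask_order_id"      # one clarifying question, never a retry loop
    return f"{intent}:{order_id}"  # look up real data, render the vetted template

print(handle("Where is my order #12345?"))  # → order_status:12345
print(handle("Do you sell gift cards?"))    # → handoff
```

Note where the uncertainty check sits: before any lookup or response, so an off-topic question exits to a human on the first turn.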

Measuring whether it's working

Track these metrics from day one:

  • Resolution rate: What percentage of conversations does the bot fully resolve without human intervention?
  • Handoff rate: How often does the bot escalate to a human? Too high means the bot isn't useful enough. Too low might mean it's not escalating when it should.
  • Customer satisfaction: A simple thumbs up/down after the conversation tells you whether the bot actually helped.
  • Time to resolution: For bot-resolved conversations, how long does it take? Should be under 60 seconds for simple queries.
  • False resolution rate: How often does the bot claim to have resolved an issue but the customer comes back with the same problem?

The false resolution rate is the most important and the most overlooked. A bot that confidently gives wrong answers is worse than no bot at all.
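As a sketch of how those numbers fall out of a conversation log, assuming each record notes who resolved it and whether the customer came back with the same issue (the schema and field names are illustrative):

```python
def support_metrics(conversations: list) -> dict:
    """Each record: resolved_by ('bot' or 'human') and reopened (the
    customer returned with the same problem). Schema is illustrative."""
    total = len(conversations)
    bot = [c for c in conversations if c["resolved_by"] == "bot"]
    reopened = sum(1 for c in bot if c["reopened"])
    return {
        "resolution_rate": len(bot) / total,
        "handoff_rate": (total - len(bot)) / total,
        "false_resolution_rate": reopened / len(bot) if bot else 0.0,
    }

log = [
    {"resolved_by": "bot", "reopened": False},
    {"resolved_by": "bot", "reopened": True},   # looked resolved, wasn't
    {"resolved_by": "human", "reopened": False},
    {"resolved_by": "bot", "reopened": False},
]
print(support_metrics(log))
```

The key detail is the denominator of the false resolution rate: it is measured against bot-resolved conversations only, so a bot that resolves little but resolves it honestly still scores well on that metric.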

Key Takeaways

  • Narrow scope beats broad coverage. Handle a handful of tasks perfectly instead of forty poorly.
  • Always offer human handoff. A bot that traps customers is worse than no bot at all.
  • Use AI for understanding questions, not for generating answers. Template responses with dynamic data are more reliable.
  • Track false resolution rate. Confidently wrong answers destroy customer trust.
  • The bot should be invisible when it works. Customers want answers, not a conversation with a robot.

Thinking about building a chatbot?

If your support team handles a high volume of repetitive questions, a well-scoped chatbot can save real time. The key is keeping the scope tight and the handoff smooth. If you want to talk through what that could look like for your business, get in touch.

