How AI Lead Scoring Actually Works (and Why Most Chatbots Don't Have One)

A 5-second AI pass on every fresh lead: heat score 1-5, one-line summary, suggested WhatsApp reply. Built into ObieChat. Almost nobody else does this. Here's what's actually happening under the hood and why it matters more than it sounds.

If you've used most chatbot platforms — Tidio, Chatbase, Drift, Intercom — and looked at your captured leads, you've seen the standard view: a list of name + phone + email + transcript. Useful, but no context. To find your hottest lead you have to open each one and read the full conversation. After your bot captures 30 leads in a week, that's 30 transcripts to read, and most owners just don't.

ObieChat does something different. After every fresh lead lands, a quick AI pass scores it 1-5 🔥, writes a one-line summary of what the visitor wants, and drafts a copy-paste WhatsApp reply you could send right now. The leads list becomes a triage dashboard instead of an inbox of transcripts.

This post explains how this works, why it's useful, and why most chatbot platforms don't have it.

What you actually see

After your ObieChat widget captures a lead from a visitor, the lead row updates within ~5 seconds with three new fields:

Heat score (1-5 🔥): how urgently you should call back
Summary (1 line, ≤140 chars): what they want, urgency, fit
Suggested reply (≤300 chars): warm, conversational WhatsApp message ready to send
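
As a rough illustration, the three fields could be modelled as a small record attached to each lead. The class, field, and method names below are assumptions made for this sketch, not ObieChat's actual schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LeadTriage:
    """Hypothetical shape of the three fields added to a lead row."""

    score: int            # heat, from 1 (cold) to 5 (urgent)
    summary: str          # one line, up to 140 characters
    suggested_reply: str  # WhatsApp-ready draft, up to 300 characters

    def heat_display(self) -> str:
        """Render the score as flames plus a numeric label."""
        return "🔥" * self.score + f" ({self.score}/5)"


row = LeadTriage(
    score=4,
    summary="Asking about monthly IT retainer for 3 sites.",
    suggested_reply="Hi Adrian, thanks for reaching out.",
)
print(row.heat_display())
```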

Here's an example from a real (anonymised) lead an IT consultancy received:

🔥🔥🔥🔥 (4/5)

Summary: Adrian PJ-based, asking about monthly IT retainer for 3 sites. Mentioned competitor at RM 800, ready to switch this week.

Suggested reply: "Hi Adrian, thanks for reaching out — yes our Business Care plan starts at RM 600/month and covers exactly the multi-site setup you described, with same-day response. Happy to do a quick 15-min call to walk you through it — got a slot today at 4pm?"

That's the lead-list view. You see 🔥🔥🔥🔥, you read 7 words of summary, you decide "call Adrian first." Two clicks open the suggested reply, you tweak the time, hit Copy, paste into WhatsApp.

Compare that to: name, phone, email, "Read transcript" link. With 20 leads in your week, the second version makes you read 20 transcripts. The first version makes you call the hottest 3.

How the scoring works (under the hood)

It's surprisingly simple. After ObieChat persists a fresh qualified lead, a background job runs a second AI call — separate from the conversation itself — using a lightweight, cost-efficient model and a constrained prompt that returns exactly the three fields.

The prompt is roughly: "You triage leads for a small business owner. Read the lead's transcript + contact info. Return JSON: score (1-5), summary (≤140 chars), suggestedReply (≤300 chars)."

The model gets:

The captured lead payload (name, phone, email, business, enquiry)
The full conversation transcript
The business context (business name, owner name, services)

It returns the three fields. ObieChat persists them on the lead row. Total latency: 3-8 seconds. Total cost: about RM 0.001 per lead.
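
A minimal sketch of the two ends of that second pass, assuming the model is asked for JSON with the keys named in the prompt above. The prompt text is the paraphrase quoted in this post; the function names and the shape of the input dictionaries are hypothetical, and the model call itself is left out because it depends on the provider.

```python
import json

SYSTEM_PROMPT = (
    "You triage leads for a small business owner. "
    "Read the lead's transcript + contact info. "
    "Return JSON: score (1-5), summary (≤140 chars), "
    "suggestedReply (≤300 chars)."
)


def build_user_message(lead: dict, transcript: str, business: dict) -> str:
    """Bundle the three inputs the model receives into one message."""
    return json.dumps(
        {"lead": lead, "transcript": transcript, "business": business},
        ensure_ascii=False,
    )


def parse_triage_response(raw: str) -> dict:
    """Validate the model's JSON before it is saved on the lead row."""
    data = json.loads(raw)
    score = data["score"]
    summary = data["summary"]
    reply = data["suggestedReply"]
    # bool is a subclass of int in Python, so reject it explicitly.
    if not isinstance(score, int) or isinstance(score, bool) or not 1 <= score <= 5:
        raise ValueError(f"score out of range: {score!r}")
    if not isinstance(summary, str) or len(summary) > 140:
        raise ValueError("summary missing or longer than 140 chars")
    if not isinstance(reply, str) or len(reply) > 300:
        raise ValueError("suggestedReply missing or longer than 300 chars")
    return {"score": score, "summary": summary, "suggestedReply": reply}
```

Validating before persisting means a malformed or over-long response can be retried or dropped instead of showing up in the leads list.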

How the heat score is decided

The system prompt gives explicit scoring guidance:

1 = spam-ish / weak intent. 2 = curious, might convert eventually. 3 = real lead, fits services, needs follow-up. 4 = hot — buying intent, asked about pricing or scheduling. 5 = URGENT hot — "ready to start," "need this today," asking for invoice/contract.

The model uses intent signals from the actual conversation: did they ask about pricing? mention urgency? compare to a competitor? give a specific name + phone? mention a deadline? Each signal nudges the score. Adrian above scored 4 because he asked about a specific plan, mentioned a competitor, and signalled this-week urgency. A spam-looking lead with no real questions would score 1.
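
One way to keep that guidance easy to tune is to store the rubric as data and render it into the system prompt. This is a sketch under that assumption; the wording of each level comes from the guidance above, while the dictionary and function names are hypothetical.

```python
SCORE_RUBRIC = {
    1: "spam-ish / weak intent",
    2: "curious, might convert eventually",
    3: "real lead, fits services, needs follow-up",
    4: "hot: buying intent, asked about pricing or scheduling",
    5: 'URGENT hot: "ready to start," "need this today," '
       "asking for invoice/contract",
}


def rubric_prompt(rubric: dict) -> str:
    """Render the rubric as one line per score, lowest score first."""
    lines = [f"{score} = {meaning}." for score, meaning in sorted(rubric.items())]
    return "Score each lead from 1 to 5:\n" + "\n".join(lines)


print(rubric_prompt(SCORE_RUBRIC))
```

Note that the rubric only guides the model. The score itself still comes from the model reading the conversation, not from a rule engine.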

How the suggested reply is drafted

The prompt instructs the model to:

Address by first name when known
Be warm but concise — plain conversational tone, not corporate
Refer specifically to what they asked about (not generic "thanks for reaching out")
End with a soft next step (a question, a slot offer, "call you back in 10 min")
No emoji unless the lead used emoji first
Use the owner's name (if set) so it feels like a real reply, not a template
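
Some of these instructions can also be checked mechanically after the model responds. The sketch below is an assumption about how such a check might look, not a description of what ObieChat runs; the function name is hypothetical and the emoji pattern is a rough approximation covering the common pictograph ranges.

```python
import re
from typing import List, Optional

# Approximate emoji detector: main pictograph blocks plus older symbols.
# This is a heuristic, not a complete Unicode emoji test.
EMOJI_RE = re.compile("[\U0001F300-\U0001FAFF\u2600-\u27BF]")


def reply_issues(
    reply: str, first_name: Optional[str], lead_used_emoji: bool
) -> List[str]:
    """Return the drafting rules a suggested reply appears to break."""
    issues = []
    if len(reply) > 300:
        issues.append("longer than 300 characters")
    if first_name and first_name.lower() not in reply.lower():
        issues.append("does not address the lead by first name")
    if not lead_used_emoji and EMOJI_RE.search(reply):
        issues.append("contains emoji although the lead used none")
    return issues
```

Tone and specificity are harder to test in code, so a check like this would only catch the mechanical failures.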

The output is meant to be a starting point you tweak in 5 seconds before sending, not a final message you blast verbatim. But for most leads, ~80% of the suggested reply is keepable as-is.

Why this matters more than it sounds

The biggest invisible failure mode in small-business lead capture isn't "the bot didn't capture leads." It's "the owner captured leads but didn't follow up fast enough."

Half the leads we see go cold within 24 hours of capture because the owner saw the email notification, didn't have time to read the full transcript, postponed it to "later," and "later" turned into next week. The lead either lost interest or contacted a competitor in the meantime.

AI lead scoring directly attacks this failure mode by:

Compressing decision time — you can triage 20 leads in 60 seconds by glancing at heat scores instead of opening 20 transcripts
Removing the writing barrier — the suggested reply means you go from "I should respond to Adrian" to "I just responded to Adrian" in 30 seconds instead of 5 minutes of staring at a blank text box
Surfacing patterns — once you have 50+ scored leads, you can filter by 🔥🔥🔥🔥+ to see hot leads only, vs scrolling through 50 transcripts mixing spam and hot leads

The effect compounds. Faster follow-up = higher conversion = more revenue. A 10-minute response time for hot leads converts roughly 3-4x better than a 24-hour response time, across every study we've seen.

Why most chatbot platforms don't have this

Honest answer: it's a niche feature that doesn't move the headline metric most platforms compete on. The big chatbot SaaS companies measure themselves by "number of conversations captured" because that's what their customers used to optimise for. Triage + follow-up speed are harder to measure across thousands of customers, so they don't show up in marketing.

There are three other concrete reasons:

1. It costs them per-lead AI tokens

The second AI pass costs us about RM 0.001 per scored lead. Across millions of leads on a large platform, that's real money — and it doesn't directly drive new signups (because feature pricing isn't itemised). Easier to skip it.

2. It exposes prompt-engineering complexity

Getting the scoring to feel right (not too generous, not too conservative, summaries that read like a human wrote them, suggested replies that don't sound robotic) takes iteration. Most chatbot platforms are built by people good at chat UIs, not at lead-quality nuance.

3. Their target customer is "enterprise sales teams"

Bigger platforms target sales teams who already use CRMs (HubSpot, Salesforce). Scoring is the CRM's job in that world. They assume you'll forward leads to Salesforce and Salesforce will score them — even though most small businesses don't have a CRM.

ObieChat is built for solo operators and small teams without CRMs. The scoring lives inside the chatbot platform because for our users, there's nothing else.

What scoring is NOT good for

Worth being honest about limitations. AI lead scoring is triage, not a replacement for judgement. Specifically:

The score is the AI's best guess, not ground truth. In our internal testing it's right ~80% of the time. A 5-rated lead might be a passing whim; a 2-rated lead might convert because of context the AI couldn't see.
The suggested reply needs your tone, your context. Send it as-is for 80% of cases; tweak it for the 20% where the visitor said something specific the model didn't quite parse.
It doesn't account for your specific industry's signal weights. A construction company might rate "need it next week" lower (everyone says that) than a legal consultancy where "next week" is a strong intent. Your domain knowledge adjusts the score in your head.
It doesn't follow up for you. If you ignore a 5-rated lead for three days, the AI doesn't escalate. (We're exploring something here for v2 — escalation reminders. Not shipped yet.)

The honest framing: the AI does the first triage so you can do the second triage faster. Both passes still matter.

How to use it well

Three habits that pair well with AI lead scoring:

1. Glance daily, act on 4-5s, batch the rest

Open Console → Leads once a day. Sort by heat (the default sort already surfaces the hottest first). Reply to leads rated 4-5 immediately while context is fresh. Batch 3-rated leads into an evening session. Don't sweat 1-2s: they're either spam-flagged or low-intent.
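
The same routine, written as a small sketch. It assumes each lead is a dictionary with a "score" key; the function and bucket names are hypothetical and only mirror the habit described above.

```python
def plan_day(leads: list) -> dict:
    """Split scored leads into reply-now, evening batch, and low priority."""
    # Hottest first; sorted() is stable, so ties keep their original order.
    ranked = sorted(leads, key=lambda lead: lead["score"], reverse=True)
    return {
        "reply_now": [lead for lead in ranked if lead["score"] >= 4],
        "evening_batch": [lead for lead in ranked if lead["score"] == 3],
        "low_priority": [lead for lead in ranked if lead["score"] <= 2],
    }


leads = [
    {"name": "A", "score": 3},
    {"name": "B", "score": 5},
    {"name": "C", "score": 1},
    {"name": "D", "score": 4},
]
plan = plan_day(leads)
print([lead["name"] for lead in plan["reply_now"]])
```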

2. Trust the suggested reply 80%, tweak 20%

Copy the suggested reply with the button, paste into WhatsApp, scan it for accuracy (does it match what your knowledge base says about pricing? is the next step the right one?), tweak one or two words, hit send. If you find yourself rewriting it entirely, the underlying knowledge base probably needs a fix, because the AI drafts from what it knows. Add the missing information to your knowledge base.

3. Use the digest to spot patterns

ObieChat's daily 8 AM digest email shows yesterday's lead breakdown by heat + top question themes. Over time, patterns emerge: "I'm getting lots of 3-rated leads asking about X; if I add a clearer answer to my knowledge base, those would convert as 4s." Adjust the knowledge base → next week's leads are scored higher because the bot answers them better → more conversions.
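
If you export your scores and want the same breakdown yourself, a few lines are enough. This is a sketch that assumes you have yesterday's scores as a plain list of integers; it is not how the digest is generated internally, and the function name is hypothetical.

```python
from collections import Counter


def heat_breakdown(scores: list) -> dict:
    """Count leads per heat score, listed from hottest (5) to coldest (1)."""
    counts = Counter(scores)
    return {score: counts.get(score, 0) for score in range(5, 0, -1)}


print(heat_breakdown([3, 3, 4, 5, 1, 3]))
```

A rising share of 3s on one topic is the signal to look for, since those are the leads a clearer answer could push to a 4.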

What "score 5" looks like in practice

A few real examples (anonymised, paraphrased) of leads that scored 5/5:

"I have 12 employees, our IT person quit last week, need help by Monday." — fits services, urgent, specific
"Just got hacked, need someone today to fix this and audit our systems." — emergency, clear scope, ready to pay
"Switching from [Competitor]. They quoted RM 1,800/mo, we want lower price + better response time." — switching intent + specific competitor data + clear price-sensitivity signal

What 5s usually share: a specific deadline ("by Monday"), a clear scope ("12 employees, IT person quit"), or a competitor-switching signal ("X quoted us, we want better"). These are leads where slow follow-up is the difference between winning and losing the business.

By contrast, a typical 1-2:

"hi" (no enquiry, no phone)
"hello can I get pricing" (no name, generic, no context)
"do you do AI?" (curious browsing, no scope)

Score 3 is the most common — "asking about X service for my business [Y], when can we discuss?" — and accounts for ~50% of qualified leads in our data.

What the scoring will look like in 6 months

We're tuning this actively. Specifically considering:

Per-tenant calibration: letting you mark a few past leads as "hotter than scored" / "cooler than scored" and using that to bias the score for future leads in your industry
Reply suggestions in the visitor's language: if the visitor wrote in Bahasa Malaysia, draft the reply in Bahasa Malaysia too (currently always English)
Escalation reminders: if a 4+ lead hasn't been responded to in X hours, push notification
CRM webhook integration: if you do use HubSpot/Pipedrive, ObieChat already supports outbound webhooks — but we could send the scored data, not just raw lead, so your CRM has the heat from the start
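
To make the escalation idea concrete, here is a purely hypothetical sketch of the rule as stated: a lead scored 4 or higher with no reply after some waiting period. The function name and parameters are assumptions, and the waiting period is left as a required argument because no value has been decided.

```python
from datetime import datetime, timedelta
from typing import Optional


def needs_escalation(
    score: int,
    captured_at: datetime,
    replied_at: Optional[datetime],
    now: datetime,
    max_wait: timedelta,
) -> bool:
    """True when a hot lead (4+) has gone unanswered longer than max_wait."""
    if score < 4 or replied_at is not None:
        return False
    return now - captured_at > max_wait


captured = datetime(2025, 1, 6, 9, 0)
print(needs_escalation(5, captured, None, datetime(2025, 1, 6, 13, 0), timedelta(hours=3)))
```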

These aren't shipped yet. The current scoring (score + summary + suggested reply, all in English) is what's live today.

Want to see it in action?

Set up ObieChat (free tier: 20 conversations/month, no card). Have a friend pretend to be a lead on your test site. Within a few seconds of the lead being captured (3-8 seconds is typical), your leads list will show a heat score + summary + suggested reply for that conversation.

Or if you'd rather have it set up + tuned for your business by us, ObieOnline offers an AI Chatbot setup service — one-off setup includes the knowledge base writing that makes the AI lead scoring genuinely good for your specific industry.

Start your free trial →

About the author: Obie has 17 years of experience across telco and software development and runs ObieOnline, the consultancy behind ObieChat. The AI lead-scoring feature was originally built for ObieOnline's own client work: most KL consultancies don't have CRMs, so the chatbot has to do the triage itself.