speed-to-lead · lead-conversion · marketing-attribution · analytics

Speed-to-Lead Isn't One Number: How to Measure Response Time by Source

Average response time hides the leads you're losing. The leads in your p95 and p99 are the ones costing you deals. Per-source measurement, percentile reporting, and SLA buckets that actually surface where you're slow.

Stellar Team

Why the average response time lies

The standard "speed-to-lead" stat is a single number: average response time. It's wrong.

Or rather, it's right but useless. Average response time across all your lead sources tells you nothing about which source is slow, why it's slow, or what you can do about it. It tells you what the typical lead experiences, which by definition isn't the lead that's costing you the deal.

The leads that cost you deals are the long-tail ones. The lead that came in at 7pm Friday and waited until Monday at 11am. The lead that triggered a Zapier flow that quietly broke for two hours. The lead from a campaign source you forgot to wire up. Those are the leads in your p95 and p99, not your average.

This post is about how to track speed-to-lead per-source and per-percentile, what good looks like in production, and what to do with the data once you have it.

Why average response time hides the leads you're losing

A lead-gen funnel has multiple sources. Organic web forms, paid Google ads, partner referrals, your trigger API for upstream tools, webhook integrations from Typeform and Jotform, Eventbrite registrations, your sales team's CSV uploads. Each source has different latency characteristics.

A web form goes through your form provider's webhook, then your CRM, then maybe a Zapier flow, before triggering the call. That's a chain with three failure points.

A trigger API call (your inbound webhook directly into the lead-trigger endpoint) is one hop. Latency is essentially the time it takes to dial.

A CSV upload is bulk-batched. The first contact in the file might be called within 30 seconds; the 500th might wait 20 minutes.

If you average all these together, you get a number that's neither the web form experience nor the API experience. You get something in between that doesn't match what any single lead actually experienced.

What good measurement actually looks like

Per-source latency tracking. Each lead carries the source it came from, persistently, on the row that represents the lead in your database. When you GROUP BY source, you can see the response time profile of each source independently.

Percentiles, not averages. p50 (median) tells you what half the leads experience. p95 tells you what the worst 5% experience. The gap between them tells you whether you have a long-tail problem.

SLA buckets. A response time histogram is hard to read; SLA buckets aren't. Define green (under 60s), amber (under 5 minutes), red (over 5 minutes), and report the percentage of leads in each bucket per source. If 95% of HubSpot-sourced leads are green and only 60% of Eventbrite-sourced leads are, you know exactly where to look.
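To make the buckets concrete, here's one way to compute them directly in SQL: a minimal sketch assuming the trigger_source, created_at, and started_at columns described later in this post (the table name and the 30-day window are illustrative).

```sql
-- Per-source SLA buckets: a sketch, not a production query. Assumes the
-- lead/call row shape described later in this post; leads that haven't been
-- responded to yet are excluded for simplicity.
SELECT
  trigger_source,
  COUNT(*) AS leads,
  ROUND(100.0 * COUNT(*) FILTER (
    WHERE started_at - created_at < INTERVAL '60 seconds') / COUNT(*), 1) AS green_pct,
  ROUND(100.0 * COUNT(*) FILTER (
    WHERE started_at - created_at >= INTERVAL '60 seconds'
      AND started_at - created_at < INTERVAL '5 minutes') / COUNT(*), 1) AS amber_pct,
  ROUND(100.0 * COUNT(*) FILTER (
    WHERE started_at - created_at >= INTERVAL '5 minutes') / COUNT(*), 1) AS red_pct
FROM call_jobs
WHERE started_at IS NOT NULL
  AND created_at > now() - INTERVAL '30 days'
GROUP BY trigger_source
ORDER BY red_pct DESC;
```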

For example, the dashboard for a typical home-services account in 2026 might look like:

| Source | Count | p50 | p95 | Green% | Amber% | Red% |
|---|---|---|---|---|---|---|
| HubSpot | 342 | 38s | 2m 14s | 87% | 11% | 2% |
| Typeform | 87 | 14s | 1m 02s | 98% | 2% | 0% |
| ServiceTitan | 156 | 52s | 4m 18s | 74% | 22% | 4% |
| Pipeline | 24 | 2m 04s | 12m 35s | 38% | 46% | 17% |

That table tells a story average response time can't.

Pipelines are slow (multi-step workflows have inherent delays). ServiceTitan-sourced leads have a long tail; the p95 is much worse than HubSpot's. Typeform leads are fast because the integration is nearly real-time.

You'd never see this looking at a single average across all four.

The 5-minute window: why it matters per source

The well-known speed-to-lead study from MIT and InsideSales (leads contacted within 5 minutes are 21x more likely to qualify than leads contacted at 30 minutes) is true on average. It's also true per-source, but the absolute numbers differ.

Web-form leads: the 5-minute window matters because the prospect just submitted a form. They're at their desk, on the page. Five minutes from now they're already in another tab.

Phone-call leads (Google LSA, missed-call return): the 5-minute window is even tighter. The prospect dialed your number expecting a human; they got voicemail. You have minutes, not hours, before they call your competitor instead.

CRM-sourced leads (HubSpot list sync, ServiceTitan customer follow-up): the window is more forgiving. These leads are often warm-but-not-urgent. Past customers, marketing-qualified contacts. A 30-minute response is fine; an hour is fine; a day might be fine.

Pipeline-sourced leads (multi-step workflows): SLA depends on the step. The first step has the same urgency as the original lead source. The second step (a follow-up call after a no-answer) is on a different clock. You intentionally waited.

Each source needs its own SLA, not a global one.
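If you want the buckets themselves to vary by source, one way to encode that is a small reference table the reporting query can join against. A sketch with hypothetical thresholds (the numbers are illustrative, not recommendations):

```sql
-- Hypothetical per-source SLA thresholds. Anything under green_threshold is
-- green, anything under amber_threshold is amber, everything slower is red.
CREATE TABLE source_sla (
  trigger_source  TEXT PRIMARY KEY,
  green_threshold INTERVAL NOT NULL,
  amber_threshold INTERVAL NOT NULL
);

INSERT INTO source_sla VALUES
  ('webhook',  INTERVAL '60 seconds', INTERVAL '5 minutes'),  -- web-form leads: tight window
  ('hubspot',  INTERVAL '30 minutes', INTERVAL '4 hours'),    -- warm CRM leads: more forgiving
  ('pipeline', INTERVAL '5 minutes',  INTERVAL '30 minutes'); -- multi-step workflows: their own clock
```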

What changes when you measure right

Three things tend to happen once a team starts tracking per-source response time.

First, the slowest source gets attention. Often it's a workflow nobody owns: a Zapier zap that's been quietly retrying, a Make scenario that times out on weekends, a CRM segment that's missing the trigger. Per-source visibility surfaces ownership questions.

Second, the best source gets credit. The web-form path you wired up six months ago is responding in 14 seconds. That's a marketing claim worth making. "We respond to every lead in under 60 seconds" is a sales pitch when it's true. It's a lie when your average is 45 seconds and your p95 is 12 minutes.

Third, new integrations get instrumented before they ship. When the team adds a new source, the question becomes "what's the expected p50 here?" rather than "did anyone wire this up?"

What to track in your database

For this measurement to be cheap to query at scale, the source needs to be a top-level column on the call (or call_jobs) row, not buried in a JSONB blob. Pulling the source out of JSONB on every row makes the aggregate slower and harder to index; a plain, indexed column keeps the GROUP BY cheap even across millions of rows.

A reasonable schema (sketched as DDL below the list):

trigger_source (TEXT, NOT NULL, indexed): one of a known enum like "api_lead", "hubspot", "gohighlevel", "typeform", "jotform", "google_sheets", "webhook", "eventbrite", "zoom_webinar", "pipeline", "manual_retry".

created_at (TIMESTAMPTZ): when the lead arrived in your system.

started_at (TIMESTAMPTZ): when the response (call or email) actually went out.

Composite index on (account_id, trigger_source, created_at DESC) for efficient per-tenant range queries.
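Put together, a minimal DDL sketch of that shape; the table name, identity column, and CHECK constraint are illustrative, not a prescription:

```sql
-- Minimal sketch of the lead/call row described above.
CREATE TABLE call_jobs (
  id             BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
  account_id     BIGINT NOT NULL,
  trigger_source TEXT   NOT NULL CHECK (trigger_source IN (
    'api_lead', 'hubspot', 'gohighlevel', 'typeform', 'jotform', 'google_sheets',
    'webhook', 'eventbrite', 'zoom_webinar', 'pipeline', 'manual_retry'
  )),
  created_at     TIMESTAMPTZ NOT NULL DEFAULT now(),  -- when the lead arrived
  started_at     TIMESTAMPTZ                          -- when the response went out
);

-- Keeps per-tenant, per-source range queries on an index scan.
CREATE INDEX call_jobs_source_created_idx
  ON call_jobs (account_id, trigger_source, created_at DESC);
```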

Then the query is trivial: group by trigger_source, count the rows, and take PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY EXTRACT(EPOCH FROM (started_at - created_at))); the full version is sketched below. That's a regular index scan on a properly-shaped table, sub-second on a 90-day window with 10k+ leads.
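Written out against the hypothetical call_jobs table above (the p95 column, the 90-day window, and the tenant id are illustrative additions):

```sql
-- p50 and p95 response time per source over a 90-day window, for one tenant.
SELECT
  trigger_source,
  COUNT(*) AS leads,
  PERCENTILE_CONT(0.5)  WITHIN GROUP (
    ORDER BY EXTRACT(EPOCH FROM (started_at - created_at))) AS p50_seconds,
  PERCENTILE_CONT(0.95) WITHIN GROUP (
    ORDER BY EXTRACT(EPOCH FROM (started_at - created_at))) AS p95_seconds
FROM call_jobs
WHERE account_id = 42                                 -- illustrative tenant id
  AND created_at > now() - INTERVAL '90 days'
  AND started_at IS NOT NULL
GROUP BY trigger_source
ORDER BY p95_seconds DESC;
```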

What good looks like in production

For a well-run home-services account in 2026, here are the targets we see hit consistently:

Overall p50: under 60 seconds.

Overall p95: under 5 minutes.

Green-bucket percentage: above 75%.

Worst-source p50: under 5 minutes (no source should be in chronic red).

These are achievable when:

1. Your AI voice agent fires the moment a lead arrives, not when a human gets to it.
2. The agent runs 24/7 (no batched morning queues for after-hours leads).
3. Sources with inherent latency (CSV bulk uploads, multi-step pipelines) get their own SLA buckets so they don't drag the overall numbers down.

If your numbers are worse than these, the question isn't "are we slow." It's "which source is slow, and why."

How to make it visible

Build the dashboard. Or use one. The point is to have a place where you can answer the question "are leads from source X being responded to fast enough?" without writing a query.

For accounts on Stellar's Pro plan, the Speed-to-Lead dashboard does exactly this. Per-source breakdown of p50/p95 plus green/amber/red buckets, sliding 30-day window, sortable by call count. We built it because the alternative (answering the question via custom SQL) meant the question never got asked.

Most of the value isn't the dashboard itself. It's that having the dashboard makes the question askable. Once your weekly review includes "which sources are red," the next conversation is "why," and the conversation after that is "what to do about it."

That's where the conversion lift comes from. Not from being fast, exactly. From knowing where you're slow.
