The business question
A marketing team with five channels and one budget eventually has to answer an uncomfortable question: which channel actually deserves the credit for a conversion? Get the answer wrong and you cut a channel that was quietly doing the real work, or keep funding one that was just along for the ride.
Why this matters
Attribution sounds like a measurement problem. It's actually a modeling choice dressed up as one — and the choice you make changes the recommendation you'd give a CMO, even when nothing about the underlying customer behavior has changed at all.
The approach
I used anonymized e-commerce session data to build out both a first-touch and a last-touch attribution model on the same conversion events, specifically so the comparison would be clean — same customers, same conversions, only the crediting rule changes.
SQL — channel performance, ranked
SELECT
    traffic_source,
    COUNT(DISTINCT session_id) AS sessions,
    COUNT(DISTINCT CASE WHEN converted = 1 THEN session_id END) AS conversions,
    ROUND(100.0 * COUNT(DISTINCT CASE WHEN converted = 1 THEN session_id END)
        / COUNT(DISTINCT session_id), 2) AS conversion_rate,
    RANK() OVER (
        ORDER BY 100.0 * COUNT(DISTINCT CASE WHEN converted = 1 THEN session_id END)
            / COUNT(DISTINCT session_id) DESC
    ) AS rank_by_conversion
FROM sessions
GROUP BY traffic_source;
Python — two models, one dataset
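import pandas as pd
# Assumes `sessions` is already loaded as a DataFrame with customer_id,
# session_start, traffic_source, and a 0/1 `converted` flag per session.
# Find each converting customer's first conversion time.
converted_at = (
    sessions.loc[sessions["converted"] == 1]
    .groupby("customer_id")["session_start"]
    .min()
    .rename("converted_at")
    .reset_index()
)
# The inner merge drops customers who never converted; the filter drops
# sessions after the conversion, so both models credit the same events.
journeys = sessions.merge(converted_at, on="customer_id")
journeys = journeys[journeys["session_start"] <= journeys["converted_at"]]
journeys = journeys.sort_values("session_start")
# first_touch credits the channel that started the journey;
# last_touch credits the channel of the converting session.
# drop_duplicates keeps whole rows; groupby().first() would skip nulls per column.
first_touch = (
    journeys.drop_duplicates("customer_id", keep="first")
    .set_index("customer_id")["traffic_source"]
)
last_touch = (
    journeys.drop_duplicates("customer_id", keep="last")
    .set_index("customer_id")["traffic_source"]
)
comparison = pd.DataFrame({
    "first_touch": first_touch.value_counts(normalize=True),
    "last_touch": last_touch.value_counts(normalize=True),
}).fillna(0)
comparison["swing"] = comparison["last_touch"] - comparison["first_touch"]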
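That swing column is the actual finding. A channel can look strong under one model and mediocre under the other, and the size of that swing tells you how much of the channel's apparent performance was really "introduces people to the brand" versus "closes the sale." Because swing is last-touch share minus first-touch share, a positive value marks a channel that mostly closes and a negative value marks one that mostly introduces.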
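To make the arithmetic behind the swing concrete, here is a minimal sketch on made-up data. The three journeys, the customer labels, and the helper `channel_share` are all hypothetical and exist only for illustration; it uses plain Python rather than pandas so it runs without the session dataset.

```python
from collections import Counter

# Hypothetical journeys: each converting customer's channels in time order.
journeys = {
    "cust_1": ["organic_social", "paid_search"],
    "cust_2": ["email", "paid_search"],
    "cust_3": ["organic_social"],
}

def channel_share(channels):
    """Return each channel's share of the credited conversions."""
    counts = Counter(channels)
    total = sum(counts.values())
    return {channel: n / total for channel, n in counts.items()}

# First-touch credits the first channel in each path, last-touch the final one.
first_share = channel_share(path[0] for path in journeys.values())
last_share = channel_share(path[-1] for path in journeys.values())

# swing > 0: the channel gains credit under last-touch (a "closer").
# swing < 0: the channel gains credit under first-touch (an "introducer").
swing = {
    channel: last_share.get(channel, 0.0) - first_share.get(channel, 0.0)
    for channel in sorted(set(first_share) | set(last_share))
}

for channel, value in swing.items():
    print(f"{channel:15s} {value:+.2f}")
```

In this toy case paid search never starts a journey but closes two of the three, so its swing is +0.67, while organic social and email each come out at -0.33. The swings always sum to zero because both models hand out the same total credit.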
Findings
Paid search looked strongest under last-touch — unsurprising, since people often search for a brand right before buying it, after discovering it somewhere else first. Organic social and content channels looked meaningfully better under first-touch, which matches their actual job: introducing people to something they hadn't considered yet.
Neither model is "correct." A team that only looks at last-touch will systematically defund the channels doing the hardest, least-credited work — getting someone's attention in the first place.
What I'd do next
A proper multi-touch or data-driven attribution model would split credit across the full path rather than picking a single winner. That's the honest next step. This project deliberately starts with the two simplest models because the gap between them is the most useful finding to establish before adding more modeling complexity on top.
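As a rough illustration of what splitting credit across the path could look like, the sketch below uses linear (equal-weight) attribution, the simplest multi-touch rule. It was not part of this project and is not a data-driven model. The function name `linear_attribution` and the example paths are hypothetical.

```python
from collections import defaultdict

def linear_attribution(journeys):
    """Split each conversion's single unit of credit equally across its path."""
    credit = defaultdict(float)
    for path in journeys:
        if not path:
            continue  # no recorded touches, so there is nothing to credit
        weight = 1.0 / len(path)
        for channel in path:
            credit[channel] += weight
    return dict(credit)

# Hypothetical converting paths, each listed in time order.
example_paths = [
    ["organic_social", "email", "paid_search"],
    ["organic_social", "paid_search"],
    ["paid_search"],
]

credit = linear_attribution(example_paths)
for channel, value in sorted(credit.items()):
    print(f"{channel:15s} {value:.2f}")
```

Total credit still equals the number of conversions (three here), but a channel that appears early in a path now keeps part of it instead of losing everything to whichever channel came last. A position-based or data-driven model would replace the equal weights with weights that depend on position or on observed conversion lift.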