Shipping a fix isn't the same as fixing the problem

Alex Barnett

CEO

Insight

A subscription app shipped a fix to cut down on refund tickets. The change was simple: let the support reps issue refunds on the spot, instead of routing every one to a separate ticketing tool. It's the kind of fix most teams ship, feel the queue get a little lighter, and check off as done.

This team did one more thing. They checked whether refund tickets, specifically, actually went down. The answer was yes, and also not nearly as much as they'd planned. That gap, and the reason for it, is the whole story.

The number most teams skip

When you ship a fix, the thing you can feel is the queue. It seems lighter, the team seems less swamped, and everyone moves on to the next thing. What you can feel is not the same as what changed.

The number that actually tells you whether the fix worked is narrow and specific: did the one complaint you were trying to remove go down, after the release, compared to before? Not total volume. Not "vibes." That exact complaint, counted on both sides of the release date. Almost nobody pulls that number, because pulling it means reading the tickets and classifying them the same way before and after, which is the work the team is too busy for.
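To make that number concrete, here is a minimal sketch of the before-and-after count in Python. It assumes each ticket has already been classified into one label by the same rule on both sides of the release. The field names (`created`, `label`), the dates, and the tickets themselves are made up for illustration; this is not how any particular tool stores or computes it.

```python
from datetime import date

def complaint_share(tickets, label, release, *, after):
    """Share of tickets carrying `label` on one side of the release date."""
    side = [t for t in tickets if (t["created"] >= release) == after]
    if not side:
        return None  # no tickets on that side, so there is no share to report
    return sum(t["label"] == label for t in side) / len(side)

# Hypothetical tickets, invented purely for illustration.
tickets = [
    {"created": date(2024, 3, 1), "label": "refund"},
    {"created": date(2024, 3, 5), "label": "billing"},
    {"created": date(2024, 3, 9), "label": "refund"},
    {"created": date(2024, 3, 12), "label": "login"},
    {"created": date(2024, 4, 2), "label": "refund"},
    {"created": date(2024, 4, 6), "label": "login"},
    {"created": date(2024, 4, 9), "label": "billing"},
    {"created": date(2024, 4, 15), "label": "cancel"},
]
release = date(2024, 4, 1)  # assumed release date

before = complaint_share(tickets, "refund", release, after=False)
after = complaint_share(tickets, "refund", release, after=True)
print(f"refund share before: {before:.0%}, after: {after:.0%}")
```

The sketch reports a share of tickets rather than a raw count, so a busy or quiet month on one side of the release doesn't distort the comparison.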

What the measurement showed

The Signal Engine read the tickets on both sides of the release and counted the refund complaints directly. Here's what came back:

So the fix worked. Refund tickets dropped by about a third. That's real, although it was a small enough drop that the team will want another month of data to be fully sure of the size.

But look at the third column. They had planned on refunds falling to 2 or 3%. They landed at 5%. The fix did about half of what they expected. The thing that shipped and the thing that worked were different sizes, and the only reason anyone knew that was the measurement. Without it, "we shipped the refund fix" and "we solved the refund problem" would have been the same sentence in the next review.

Why it only half worked

This is where the support data earned its keep. The fix depended on the support reps changing a habit, taking a new step on every relevant ticket instead of the old reflex. Fixes like that live or die on whether the new habit holds.

It didn't hold all the way. The biggest drops came in the first two weeks after the release, when the new flow was top of mind. Then, week by week, as the queue filled back up, the team reached for the faster, familiar path, and the complaint climbed most of the way back toward where it started. The fix shipped on day one. It stuck for about two weeks. Then it faded.

A single before-and-after snapshot would have missed this completely. Measure the month before against the month after and you get one averaged number that looks like a clean win. Watch it week by week and you see the win arrive, then erode. The erosion is the actionable part, because it tells you the problem isn't the fix, it's the adoption.
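The difference between the snapshot and the weekly view is easy to see in a small sketch. The weekly counts below are invented to mimic the shape described here (a sharp drop, then erosion) and are not the team's data. The function and field names are assumptions for illustration.

```python
from collections import defaultdict
from datetime import date, timedelta

def weekly_share(tickets, label, release):
    """Map week offset from release (0 = first week after) to the label's share."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for t in tickets:
        week = (t["created"] - release).days // 7  # negative offsets are pre-release
        totals[week] += 1
        hits[week] += t["label"] == label
    return {w: hits[w] / totals[w] for w in sorted(totals)}

release = date(2024, 4, 1)  # assumed release date

# Hypothetical (refund, other) ticket counts per week, invented for illustration.
counts = {-2: (4, 6), -1: (4, 6), 0: (1, 9), 1: (1, 9), 2: (3, 7), 3: (3, 7)}
tickets = []
for week, (refunds, others) in counts.items():
    day = release + timedelta(days=7 * week)
    tickets += [{"created": day, "label": "refund"}] * refunds
    tickets += [{"created": day, "label": "other"}] * others

series = weekly_share(tickets, "refund", release)

# The single averaged "after" number that a snapshot would report.
post = [t for t in tickets if t["created"] >= release]
after_avg = sum(t["label"] == "refund" for t in post) / len(post)

for week, share in series.items():
    print(f"week {week:+d}: {share:.0%}")
print(f"averaged after-release share: {after_avg:.0%}")
```

With these made-up counts, the averaged number looks like the complaint was cut in half, while the weekly series shows it back at three quarters of its starting level by the fourth week.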

It also told them what didn't change

A good measurement is as useful for what it rules out as what it confirms. Cancellations, in this case, didn't move at all. That's exactly right: the fix was about refunds, not cancellations, so there was no reason to expect it to touch them. But on a noisy dashboard, a coincidental dip in cancellations that same month could easily have been credited to the refund fix, and the team would have walked away believing the change did more than it did. Measuring the specific complaint keeps you from giving a fix credit for something it never touched.
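One way to run that check, sketched under the same assumptions as before (tickets already classified consistently, labels and counts made up for illustration): compute the change in share for every label, then confirm that the targeted complaint moved and the unrelated one did not.

```python
from collections import Counter

def share_change(before_labels, after_labels):
    """Change in each label's share of tickets, after minus before."""
    b, a = Counter(before_labels), Counter(after_labels)
    nb, na = len(before_labels), len(after_labels)
    return {lab: a[lab] / na - b[lab] / nb for lab in sorted(set(b) | set(a))}

# Hypothetical label lists, invented for illustration.
before = ["refund"] * 4 + ["cancel"] * 2 + ["other"] * 14
after = ["refund"] * 2 + ["cancel"] * 2 + ["other"] * 16

changes = share_change(before, after)
for label, delta in changes.items():
    print(f"{label}: {delta:+.0%}")
```

A label the fix never touched works as a rough control: if it moves as much as the target did, the change probably came from somewhere other than the fix.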

Shipping versus sticking

The lesson travels past refunds. Any time you ship a change to reduce a specific kind of customer pain, "we shipped it" and "it worked" are two different claims, and the gap between them is usually adoption. The way you tell them apart is to measure the exact complaint the change was meant to remove, before the release and for several weeks after, not the day after.

Most teams ship the fix and move on, because the measurement has always been too expensive to run. That's the part that's changed. The Signal Engine reads every conversation and counts the specific complaint over time, so "did the fix work" stops being a feeling and becomes a number you can watch.

See whether your last fix actually worked

Think about the last change your team shipped to cut a specific kind of ticket. Do you know whether that exact complaint went down, by how much, and whether it held for more than a couple of weeks? Most teams struggle with fuzzy or incomplete data, because pulling that number at scale is expensive and time-consuming.

That's what the Signal Engine does. Point it at your support data and it measures the complaint you targeted, before and after the release, week by week, so you can see whether the fix landed, by how much, and whether it stuck. [Book a 20-minute walkthrough] and we'll run it on a fix you've already shipped, then show you what actually happened after it went out.
