How do you make sense of 1,000+ open-ended survey responses?

The method is thematic analysis. That part's settled. The hard bit at 1,000-plus responses isn't the method; it's holding one consistent coding frame all the way through. And consistency is exactly what makes your themes defensible when someone asks "how do you know?"

How do you make sense of 1,000+ open-ended survey responses?

Group them into themes using thematic analysis, a six-phase method (Braun & Clarke, 2006) of coding each comment, then collating those codes into named, defined themes. The technique is well understood and hasn't really changed in twenty years. The genuine challenge at 1,000-plus responses is applying one consistent coding frame from the first comment to the last, because consistency is what makes the themes hold up.

Wordnerds turns what customers say into what organisations do, and most of the analysts we work with already know how to code feedback. They're not stuck on method. They're stuck on volume: 1,000 comments is a fortnight of careful coding, 10,000 is a quarter, and the survey runs again next month. The question isn't "which technique?" It's "how do I do the technique properly when there's this much of it?"

What actually breaks when the responses pile up?

Consistency breaks first, long before you run out of hours. Thematic analysis is only as trustworthy as the coding frame behind it, and a frame applied by a tiring human at comment 900 rarely matches the one they started with at comment 20. Researchers have a name for this: coder drift. It shows up in agreement statistics such as Cohen's kappa, the usual measure of inter-rater reliability, when you re-code a sample and compare. A falling kappa is an early warning that your definitions have quietly slipped.
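One way to watch for drift, sketched in Python under some assumptions: take a handful of comments from early in the dataset and a handful from late, have them coded a second time (by the same person or a colleague), and compare Cohen's kappa for the two batches. The function name and the sample codes below are illustrative, not data from a real survey, and the sketch assumes one code per comment.

```python
from collections import Counter

def cohens_kappa(codes_a, codes_b):
    """Cohen's kappa for two equal-length lists of single-label codes."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("need two non-empty lists of equal length")
    n = len(codes_a)
    # Observed agreement: share of comments given the same code in both passes.
    p_o = sum(a == b for a, b in zip(codes_a, codes_b)) / n
    # Chance agreement, from how often each pass used each code.
    freq_a, freq_b = Counter(codes_a), Counter(codes_b)
    p_e = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both passes used one identical code throughout
    return (p_o - p_e) / (1 - p_e)

# Hypothetical codes: first pass vs. a re-code of the same comments.
early_first = ["repairs", "repairs", "repairs", "comms", "comms", "comms", "safety", "safety"]
early_recode = ["repairs", "repairs", "repairs", "comms", "comms", "comms", "safety", "comms"]
late_first = ["repairs", "repairs", "repairs", "comms", "comms", "comms", "safety", "safety"]
late_recode = ["repairs", "comms", "comms", "comms", "comms", "repairs", "safety", "comms"]

kappa_early = cohens_kappa(early_first, early_recode)
kappa_late = cohens_kappa(late_first, late_recode)
print(f"early batch kappa: {kappa_early:.2f}, late batch kappa: {kappa_late:.2f}")
```

If the late-batch kappa is clearly lower than the early one, that is a prompt to revisit the definitions rather than a verdict on the coder. If you already use scikit-learn, its `cohen_kappa_score` in `sklearn.metrics` computes the same statistic.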

Those are the real stakes, and they're not about deadlines. A report you finished but can't defend is worse than one you didn't finish. When leadership asks "so what?" and then "how do you know?", the honest answer can't be "trust me, I read them all." Manual coding in a spreadsheet, the biggest competitor we have, feels rigorous. At volume it quietly stops being reproducible, which is the one property that matters when the findings are challenged.

Can't you just paste them into ChatGPT?

Not for anything you'll need to stand behind. General-purpose LLMs can code text, but the same prompt produces different results on different runs. Tai et al. (2024) found LLM output only stabilises after roughly 40 controlled repetitions of the same task. Paste 1,000 responses into ChatGPT once and you get a plausible-looking set of themes you can't reproduce, can't audit, and can't explain to a sceptical board.
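To make the reproducibility point concrete, here is a minimal Python sketch that takes the codes several runs assigned to the same comments and reports, per comment, the share of runs that agree with the most common code. The run outputs are made-up placeholders (no model is called), and this agreement measure is an illustrative choice, not the metric used by Tai et al.

```python
from collections import Counter

def per_comment_stability(runs):
    """For each comment, the share of runs that agree with its most common code.

    `runs` is a list of runs; each run is a list of codes, one per comment,
    in the same comment order.
    """
    if not runs or any(len(run) != len(runs[0]) for run in runs):
        raise ValueError("every run must code the same comments")
    stability = []
    # zip(*runs) regroups the data so each tuple holds one comment's codes.
    for codes_for_comment in zip(*runs):
        top_count = Counter(codes_for_comment).most_common(1)[0][1]
        stability.append(top_count / len(runs))
    return stability

# Hypothetical output of the same prompt run four times over three comments.
runs = [
    ["repairs", "comms", "safety"],
    ["repairs", "comms", "repairs"],
    ["repairs", "staff", "safety"],
    ["repairs", "comms", "comms"],
]
stability = per_comment_stability(runs)
# Comments whose code changed between runs are the ones you can't yet defend.
unstable = [i for i, share in enumerate(stability) if share < 1.0]
print(stability, unstable)
```

A single pass hides exactly this information: you only see which comments are unstable once you have more than one run to compare.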

Plausible isn't the same as defensible. The problem with a single uncontrolled pass isn't that AI is useless; it's that you've swapped a slow-but-inspectable method for a fast-but-unaccountable one. For a regulated report, or any finding someone might push back on, that's the wrong trade. The fix isn't "avoid AI"; it's to use AI inside a transparent, explainable frame you can show your working on: automated where it helps, traceable throughout.

Can you still do it in Excel or Google Sheets?

Yes, right up until something gives out, and that something is the coding frame, not the spreadsheet. For a few hundred responses coded by one person in one sitting, a spreadsheet is completely fine and we'd never tell you otherwise. It's honest, cheap and inspectable.

The trouble starts when volume outlasts one person's concentration, or when a second coder joins and quietly interprets a theme differently. That's a consistency problem, and no amount of clever spreadsheet formulae fixes it, because the thing degrading is the human application of the frame. The signal you've outgrown the sheet isn't the row count; it's the first time you can't confidently say every comment was judged by the same rules.
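A quick way to see where a second coder is reading a theme differently is to count which pairs of themes get swapped for each other. The Python sketch below assumes both coders have coded the same comments in the same order with one code each; the theme names and data are hypothetical.

```python
from collections import Counter

def disagreement_pairs(coder_a, coder_b):
    """Count how often each pair of themes was confused between two coders."""
    if len(coder_a) != len(coder_b):
        raise ValueError("both coders must code the same comments")
    pairs = Counter()
    for a, b in zip(coder_a, coder_b):
        if a != b:
            # Sort so ("comms", "repairs") and ("repairs", "comms") count together.
            pairs[tuple(sorted((a, b)))] += 1
    # Most frequently confused pairs first.
    return pairs.most_common()

# Hypothetical codes from two people working through the same seven comments.
coder_a = ["repairs", "repairs", "comms", "comms", "safety", "repairs", "safety"]
coder_b = ["repairs", "comms", "comms", "repairs", "safety", "comms", "repairs"]

confused = disagreement_pairs(coder_a, coder_b)
print(confused)
```

The pair at the top of the list points at the definitions whose boundary needs tightening; a formula can surface the disagreement, but fixing it is still a conversation about the frame.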

What does a workflow that survives the volume look like?

The workflow that scales is frame-first, not comment-first. Instead of reading and coding on the fly, where the frame forms as you go and drifts as you tire, you define the frame up front, then apply it consistently however many responses arrive. Four steps:

  1. Build the coding frame first. Draft your themes and their definitions before you code, in your own language, including the regulatory or sector vocabulary you already care about (Awaab's Law, complaint categories, journey stages). This is definition-led, and it's your quality-control layer.
  2. Pressure-test it on a sample. Code a couple of hundred responses against the frame and find where it's ambiguous. Tighten the definitions until two people would code the same comment the same way.
  3. Scale the frame, don't re-invent it. Apply the settled frame across the full dataset. This is where transparent AI earns its place: a structured pipeline (unstructured comments → structured codes → a semantic model) applies your definitions to 1,000 or 100,000 responses without the drift a human hits at comment 900.
  4. Keep the audit trail. Every theme traces back to the comments behind it and the definition that caught them. That's what turns "trust me" into "here, look."
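The four steps above can be sketched as a small data structure plus two functions. This is a toy Python illustration, not our pipeline: the theme names, definitions and cue phrases are hypothetical, and plain keyword matching stands in for whatever classifier actually applies the frame. The point it shows is structural: the frame exists before any comment is coded, every comment passes through the same rules, and every theme keeps a link to its definition and its comments.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Theme:
    name: str
    definition: str
    cues: tuple  # hypothetical trigger phrases standing in for a real classifier

@dataclass
class CodedComment:
    comment_id: int
    text: str
    themes: list = field(default_factory=list)  # names of matched themes

# Step 1: the frame is written down before any coding happens.
FRAME = (
    Theme("Repairs delay", "Tenant reports waiting too long for a repair.",
          ("waiting", "still not fixed", "weeks")),
    Theme("Damp and mould", "Tenant mentions damp, mould or condensation.",
          ("damp", "mould", "condensation")),
)

def apply_frame(comments, frame):
    """Step 3: apply every theme to every comment using the same rules."""
    coded = []
    for comment_id, text in enumerate(comments):
        lowered = text.lower()
        matched = [t.name for t in frame if any(cue in lowered for cue in t.cues)]
        coded.append(CodedComment(comment_id, text, matched))
    return coded

def audit_trail(coded, frame):
    """Step 4: map each theme to its definition and the comment ids behind it."""
    return {
        t.name: {
            "definition": t.definition,
            "comment_ids": [c.comment_id for c in coded if t.name in c.themes],
        }
        for t in frame
    }

comments = [
    "Been waiting six weeks for the boiler repair.",
    "There is mould in the bathroom and it is still not fixed.",
    "The caretaker is always friendly.",
]
coded = apply_frame(comments, FRAME)
trail = audit_trail(coded, FRAME)
for name, entry in trail.items():
    print(name, "|", entry["definition"], "|", entry["comment_ids"])
```

Step 2, the pressure test, is what you do between writing `FRAME` and running it at scale: code a sample, see which comments land in the wrong theme or in none, and tighten the definitions. Comments that match nothing (the third one here) are worth reviewing too, since they may signal a theme the frame is missing.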

How do UK housing and regulated teams keep it defensible?

They lead with the audit trail, because the regulator asks for it. Under the TSM perception measures and the Consumer Standards, a UK housing association can't just report a satisfaction score; it needs to show the tenant voice behind the number, traceably. A coding frame anyone can inspect is the difference between evidence and assertion.

This is the volume problem at its most acute, and it's where we spend most of our time. Our Wordnerds × Housemark Social Housing Benchmarking Report 2026 analysed 135,000-plus tenant comments across 18 housing associations against one consistent frame, the kind of scale where manual coding simply can't hold its definitions steady. We'd say transparent AI is the answer, wouldn't we? But the specific reason it works here is that the frame is co-designed with your analysts and stays visible, so the output is auditable by a board or a regulator, not a black box.

So what should you actually do first?

Write the frame before you code. Whatever your volume or tool, the single move that most improves a qualitative analysis is defining your themes and their boundaries up front, then applying them consistently, rather than letting the frame emerge, and drift, as you read.

Do that and the next time someone asks how you got to those themes, you open a coding frame anyone can inspect instead of defending a spreadsheet only you can read. Three responses or three thousand, the method is the same; what changes at volume is whether you can keep it consistent. That's the whole game.

Frequently asked questions

What's the best method for analysing open-ended survey responses?

Thematic analysis is the standard: code each response, then group the codes into named, defined themes (Braun & Clarke, 2006). Coding can be inductive (themes emerge from the data) or deductive (you start from a predefined frame). Most real analysis blends the two, and applying it consistently matters more than the method itself.

How many open-ended responses can you analyse by hand?

Realistically a few hundred, coded by one person in one focused sitting. Beyond that, consistency (not time) becomes the limit: fatigue and definition-drift mean later responses get judged by a subtly different frame from earlier ones, which makes the findings harder to defend.

Can ChatGPT analyse open-ended survey responses?

It can code text, but a single pass isn't reproducible: the same prompt gives different results on different runs, and output only stabilises across many controlled repetitions (Tai et al., 2024). For anything you'll need to defend, use AI inside a transparent, auditable frame rather than as a one-shot black box.

What is a coding frame?

A coding frame is the set of themes and their definitions you apply to every response: the rulebook that decides what counts as what. Building it before you code (definition-led), rather than letting it form as you go, is what keeps analysis consistent and lets a second person reach the same result.

How do you make qualitative analysis defensible to leadership or a regulator?

Keep an audit trail. Every theme should trace back to the comments behind it and the definition that captured them, so the finding can be inspected rather than taken on trust. For UK regulated sectors like housing, this traceability is what the TSM and Consumer Standards effectively require.

Do you need specialist software to analyse open-ended responses?

Not at low volume: a spreadsheet is fine for a few hundred responses coded by one person. You need a dedicated approach once volume outlasts one person's consistency or a second coder joins, because the challenge then is applying one frame at scale, transparently, which is what a Voice of Customer platform integrated into Power BI is built for.

Pete, founder of Wordnerds