The School of Real Marketing

The NPS Controversy — When a Metric Becomes a Religion


Module: F3 — Market Research & Data · Type: Measurement Critique Case · Cross-references: F3-04 (survey design and measurement), F3-05 (brand tracking and measurement), F3-09 (the limits of data)


The Situation

In December 2003, Frederick Reichheld published an article in the Harvard Business Review titled "The One Number You Need to Grow." The article introduced Net Promoter Score — NPS — a single metric derived from a single question: "How likely is it that you would recommend [company/product/service] to a friend or colleague?" Respondents answer on a scale of 0 to 10. Those who score 9-10 are classified as "Promoters." Those who score 0-6 are classified as "Detractors." Those who score 7-8 are classified as "Passives" and are excluded from the calculation. NPS is calculated by subtracting the percentage of Detractors from the percentage of Promoters.

The result is a single number, ranging from -100 to +100, that purports to measure a company's relationship with its customers and predict its growth trajectory.
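The calculation described above can be sketched in a few lines of Python (the function name and error handling are illustrative, not part of the Net Promoter System itself):

```python
def nps(scores):
    """Net Promoter Score from a list of 0-10 responses.

    Promoters score 9-10, Detractors 0-6; Passives (7-8) count toward
    the total number of respondents but toward neither group.
    """
    if not scores:
        raise ValueError("need at least one response")
    promoters = sum(1 for s in scores if s >= 9)
    detractors = sum(1 for s in scores if s <= 6)
    return 100 * (promoters - detractors) / len(scores)
```

For example, `nps([10, 9, 8, 7, 6, 0])` has two Promoters and two Detractors among six respondents, giving a score of 0.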

Within a decade, NPS became the most widely adopted customer metric in the world. By 2018, two-thirds of Fortune 1000 companies were using NPS in some form. It was adopted by Apple, Amazon, Airbnb, American Express, Southwest Airlines, Philips, GE, Intuit, and hundreds of other major corporations. It became a board-level metric — reported alongside revenue, profit, and market share. It was embedded in employee performance evaluations, executive compensation packages, and corporate strategy documents. Bain & Company, the consulting firm where Reichheld is a Fellow, built a substantial practice around NPS implementation.

NPS is, without question, the most successful marketing metric of the twenty-first century — measured by adoption, influence, and organisational penetration.

It is also, without question, the most fiercely debated. Academic researchers have challenged virtually every claim made for NPS — from its predictive validity to its methodological soundness to its superiority over alternative measures. The debate has produced a body of evidence that is, for a marketing metric, unusually detailed and unusually contentious.

This case examines NPS not as a right-or-wrong question but as a case study in measurement — in what happens when a metric's organisational appeal outstrips its methodological merit, and in the gap between what a number measures and what an organisation believes it measures.


The Data

The Original Reichheld Claim

Reichheld's 2003 HBR article, and his subsequent book The Ultimate Question (2006, revised 2011), made several specific claims about NPS:

Claim 1: NPS is the best predictor of growth. Reichheld argued that NPS correlated more strongly with revenue growth than any other customer metric — including customer satisfaction, repurchase intent, and other loyalty measures. He reported that in most industries he examined, NPS explained a significant proportion of the variation in growth rates between competitors.

Claim 2: The single question is sufficient. Reichheld argued that the "would you recommend?" question captured the essence of customer loyalty more effectively than multi-item satisfaction surveys. He advocated replacing lengthy customer satisfaction questionnaires with the single NPS question (plus an open-ended follow-up: "Why did you give that score?").

Claim 3: NPS is actionable. Reichheld argued that the simplicity of NPS made it actionable — it could be communicated throughout an organisation, tracked over time, and linked to specific operational improvements. Complex satisfaction surveys produced complex data that was difficult to interpret and even more difficult to act upon. NPS produced a single, clear number that everyone from the CEO to the front-line employee could understand.

Claim 4: NPS creates a "system" for improvement. In The Ultimate Question 2.0 (2011), Reichheld expanded NPS from a metric into a management system — the "Net Promoter System" — that included closed-loop feedback (contacting Detractors to understand their concerns), root-cause analysis, and employee engagement programmes linked to NPS outcomes.

The Academic Critique

The academic response to NPS has been substantial, rigorous, and largely unfavourable to Reichheld's claims.

Keiningham et al. (2007): "A Longitudinal Examination of Net Promoter and Firm Revenue Growth." This is the most-cited rebuttal of Reichheld's claims. Timothy Keiningham, Bruce Cooil, Tor Wallin Andreassen, and Lerzan Aksoy published a study in the Journal of Marketing that attempted to replicate Reichheld's finding that NPS was the best predictor of revenue growth.

Using data from the Norwegian Customer Satisfaction Barometer (NCSB) and the American Customer Satisfaction Index (ACSI) — both large-scale, longitudinal datasets — the authors found:

  • NPS did correlate with revenue growth, but not more strongly than other customer satisfaction measures.
  • The American Customer Satisfaction Index (ACSI) — a multi-item satisfaction measure — was a statistically superior predictor of revenue growth compared to NPS in the majority of industries examined.
  • Reichheld's original claim of NPS superiority could not be replicated using independently collected data.
  • The "recommend" question was not unique — other single-item measures (overall satisfaction, repurchase intent) performed comparably.

The authors' conclusion was direct: "Claims that NPS is the single most reliable indicator of a company's ability to grow are not supported by our research."

Sharp (2008, 2010): The How Brands Grow critique. Byron Sharp, Director of the Ehrenberg-Bass Institute for Marketing Science, has been a persistent and influential critic of NPS. Sharp's critique rests on several empirical findings from the Ehrenberg-Bass research programme:

  • Brand recommendation is heavily driven by market share, not brand quality. Larger brands get more recommendations simply because more people use them. A brand with 40% market share will naturally have more "Promoters" than a brand with 5% market share — not because it is better, but because it is bigger.
  • NPS scores are substantially correlated with market penetration. This means NPS may be measuring market share rather than customer loyalty — making it a lagging indicator dressed up as a leading indicator.
  • The "Passives" (scores of 7-8) are discarded in the NPS calculation, despite representing a substantial proportion of customers and exhibiting meaningful behavioural differences from both Promoters and Detractors.
  • The claim that recommendation drives growth confuses cause and effect. Brands grow, then they get recommended more — not the reverse.

The single-item measurement problem. Measurement theory — the body of psychometric research on how to construct reliable and valid measures — consistently shows that single-item measures are less reliable than multi-item scales. A single question is subject to greater random error, context effects, and acquiescence bias than a scale composed of multiple items measuring the same construct from different angles. The ACSI, for example, uses three questions to measure customer satisfaction — and the redundancy between items increases measurement reliability.

NPS's reliance on a single question is a deliberate trade-off: simplicity over precision. This trade-off is defensible in practice — simplicity drives adoption, and adoption drives organisational impact. But it is problematic if the metric is being used to make precise comparisons between brands, track small changes over time, or diagnose the causes of customer behaviour. The single item lacks diagnostic power — it tells you whether customers would recommend, but not why, and the open-ended follow-up question does not compensate for this limitation because open-ended responses are difficult to quantify and prone to post-hoc rationalisation.

The 11-point scale problem. NPS uses an 11-point scale (0-10) but collapses it into three categories: Promoters (9-10), Passives (7-8), and Detractors (0-6). This categorisation discards substantial information. A customer who scores 6 (barely a Detractor) is treated identically to a customer who scores 0 (an active enemy of the brand). A customer who scores 7 (a Passive) is excluded entirely from the calculation, despite being a loyal customer by most definitions. The mathematical consequence is that small shifts in scores around the category boundaries (e.g., from 6 to 7, or from 8 to 9) produce disproportionate changes in NPS, while large shifts within categories (e.g., from 0 to 5, or from 9 to 10) produce no change at all.
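The boundary effect can be demonstrated with three hypothetical ten-customer samples (the scores are invented for illustration):

```python
def nps(scores):
    """NPS: % Promoters (9-10) minus % Detractors (0-6)."""
    promoters = sum(1 for s in scores if s >= 9)
    detractors = sum(1 for s in scores if s <= 6)
    return 100 * (promoters - detractors) / len(scores)

# Ten customers all scoring 6: every one a Detractor.
print(nps([6] * 10))   # -100.0

# Each improves by a single point to 7: now Passives, so a one-point
# shift across the 6/7 boundary moves the score by 100 points.
print(nps([7] * 10))   # 0.0

# Each improves by five points, from 0 to 5: still Detractors, so a far
# larger improvement produces no change at all.
print(nps([5] * 10))   # -100.0
```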

The "would you recommend?" problem. The NPS question asks about recommendation likelihood. This assumes that recommendation is a meaningful behaviour in the category being measured. For some categories — restaurants, hotels, streaming services — recommendation is a common and natural behaviour. For others — utilities, insurance, banking, petrol stations — consumers rarely recommend and are rarely asked for recommendations. In these low-involvement, low-salience categories, the "would you recommend?" question is measuring a hypothetical behaviour that most consumers will never perform. The metric may be valid for Airbnb and questionable for Thames Water.

The Organisational Appeal

Despite the academic critique, NPS adoption has continued to grow. Understanding why requires examining the metric's organisational, rather than statistical, properties.

Simplicity. NPS is one number. It is easy to understand, easy to communicate, and easy to track. In organisations where marketing metrics are competing for C-suite attention against revenue, profit, and share price, a single number that purports to measure "customer loyalty" has an enormous communication advantage over a multi-item satisfaction index with confidence intervals.

Benchmarkability. Because NPS is standardised — the same question, the same scale, the same calculation — it enables cross-company benchmarking. Companies can compare their NPS to competitors, to industry averages, and to best-in-class performers. This benchmarkability is catnip for boards and executives who want to know "how are we doing compared to X?" The fact that the benchmarks may not be methodologically meaningful (due to differences in sampling, survey context, and category dynamics) does not diminish their organisational appeal.

KPI cascading. NPS can be cascaded through an organisation — from corporate NPS to divisional NPS to regional NPS to store-level NPS to individual employee NPS. This cascading creates a line of sight from the boardroom to the shop floor, making NPS a management tool as well as a measurement tool. The fact that store-level NPS may have unacceptably high sampling error (small samples produce volatile scores) does not diminish its appeal as a performance management mechanism.
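The sampling-error point can be illustrated with a quick simulation (the population mix and sample size here are assumptions chosen for illustration, not data from the case):

```python
import random

random.seed(1)

# Hypothetical customer population: 40% Promoters (score 9), 40%
# Passives (7), 20% Detractors (3) — a "true" NPS of +20.
population = [9] * 40 + [7] * 40 + [3] * 20

def sample_nps(n):
    """NPS computed from a random sample of n survey responses."""
    scores = random.choices(population, k=n)
    promoters = sum(1 for s in scores if s >= 9)
    detractors = sum(1 for s in scores if s <= 6)
    return 100 * (promoters - detractors) / n

# Twelve months of store-level scores, 30 responses per month.
monthly = [round(sample_nps(30)) for _ in range(12)]
spread = max(monthly) - min(monthly)
# With n = 30, month-to-month scores can swing by tens of points even
# though the underlying population never changes.
```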

Consultancy support. Bain & Company has built a substantial practice around NPS — implementation, benchmarking, training, and certification. The "Net Promoter System" is a trademarked methodology with a supporting ecosystem of consultants, conferences, and certifications. This institutional support reinforces adoption and creates lock-in: once an organisation has invested in NPS infrastructure, switching to an alternative metric involves significant costs.

Executive compensation linkage. In many organisations, NPS is linked to executive compensation — directly or indirectly. This creates a powerful incentive to maintain and defend the metric, regardless of its methodological limitations. An executive whose bonus depends on NPS improvement is unlikely to champion a research programme that might demonstrate NPS is a poor predictor of business outcomes.


The Analysis

The Gap Between Measurement and Meaning

The NPS controversy illustrates a pattern that recurs throughout market research: the gap between what a metric measures and what an organisation believes it measures.

What NPS measures. At a technical level, NPS measures self-reported likelihood of recommendation on an 11-point scale, categorised into three groups, with the net difference between the top group and the bottom group expressed as a score. This is what the number is. It is a precise measurement of a specific self-report.

What organisations believe NPS measures. Organisations use NPS as if it measures customer loyalty, brand health, future growth potential, service quality, customer experience, and competitive position. These are the meanings attributed to the number — meanings that go far beyond what the question actually asks. The single question "would you recommend?" is made to bear the weight of an entire customer relationship framework.

This gap — between measurement and attributed meaning — is not unique to NPS. It is a general problem in marketing metrics. Brand awareness measures whether consumers have heard of you, but organisations use it as if it measures brand health. Market share measures purchase behaviour, but organisations use it as if it measures competitive strength. Customer satisfaction measures a self-reported attitude, but organisations use it as if it measures loyalty. In every case, the metric measures something narrow, and the organisation interprets it as something broad.

The NPS case is distinctive because the gap is unusually large and the consequences are unusually significant. When an organisation bases strategic decisions on NPS — investing in Detractor recovery programmes, redesigning customer journeys to improve NPS, linking executive compensation to NPS targets — it is acting on the attributed meaning, not the measured meaning. And if the attributed meaning is wrong — if NPS does not actually predict growth, does not measure loyalty, does not diagnose the causes of customer behaviour — then the strategic actions built on NPS are built on sand.

What NPS Gets Right

Acknowledging the academic critique does not require concluding that NPS is worthless. The metric has genuine strengths that explain its adoption and, within limits, justify its use.

Organisational simplicity. NPS's greatest strength is not statistical but organisational. It creates a common language for customer experience across an entire company. When everyone — from the CEO to the call centre agent — understands "we want to create more Promoters and fewer Detractors," the organisation has a shared framework for thinking about customer experience. This shared framework has value even if the metric itself is imprecise.

Directional accuracy. While NPS may not predict growth better than other satisfaction measures, it does correlate with growth. Companies with very high NPS tend to perform well. Companies with very low NPS tend to perform poorly. The direction is right, even if the precision is overstated. As a rough health indicator — a thermometer rather than a diagnostic tool — NPS has utility.

Closed-loop feedback. The Net Promoter System's emphasis on contacting Detractors and understanding their concerns (the "closed loop") is a genuinely valuable practice, regardless of the metric that triggers it. The discipline of following up with unhappy customers, diagnosing root causes, and implementing improvements is good management. NPS provides a mechanism for triggering this discipline — and this may be its most valuable contribution, quite apart from the number itself.

What NPS Gets Wrong

The critique is equally substantive.

No diagnostic power. NPS tells you the score but not the cause. A declining NPS could mean product quality has deteriorated, prices have increased, competitors have improved, customer expectations have shifted, or the survey sample has changed. The number itself provides no diagnosis. The open-ended follow-up question ("Why did you give that score?") partially addresses this, but open-ended responses are difficult to quantify, prone to post-rationalisation, and often too vague to be actionable.

Gaming and manipulation. When NPS is linked to employee performance or executive compensation, it becomes susceptible to gaming. Employees ask customers to give high scores. Survey invitations are sent selectively to satisfied customers. Detractor responses are challenged or excluded. The metric becomes a target, and as Goodhart's Law observes: "When a measure becomes a target, it ceases to be a good measure."

Category blindness. NPS applies the same question, the same scale, and the same calculation to every category — from luxury hotels to electricity providers, from streaming services to industrial chemicals. The assumption that "would you recommend?" is equally meaningful across all categories is not supported by evidence. In categories where recommendation is rare or irrelevant, NPS measures a hypothetical behaviour that has little connection to actual customer value.

The Passives problem. Excluding scores of 7-8 from the calculation is methodologically questionable. These "Passive" customers represent, in many organisations, the largest single group. They are neither enthusiastic nor hostile — they are the satisfied middle. Discarding them loses information and can produce misleading results. A company that converts some Passives to Promoters may see no change in NPS if, simultaneously, an equal share of Passives slip into Detractors. The net effect may be zero, masking significant shifts in the customer base.
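A worked example of this masking effect, with invented numbers: suppose a brand starts with 20 Promoters, 60 Passives, and 20 Detractors per 100 customers.

```python
def nps(promoters, passives, detractors):
    """NPS from group counts; Passives dilute but never move the score."""
    total = promoters + passives + detractors
    return 100 * (promoters - detractors) / total

before = nps(promoters=20, passives=60, detractors=20)
# Ten Passives become Promoters, but ten others become Detractors.
after = nps(promoters=30, passives=40, detractors=30)

# 20 of 100 customers changed group — the base has polarised — yet the
# score is identical: both calculations return 0.0.
```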

The Synthesis

The NPS controversy is a case that demands the Both/And framework.

NPS is a useful health indicator AND a terrible strategic compass. As a quick, simple, organisationally accessible measure of customer sentiment, NPS has value. It is easy to understand. It is easy to communicate. It provides a rough directional signal. As a basis for strategic decisions — resource allocation, competitive positioning, growth forecasting — it is inadequate. The single-item measurement is too imprecise, the categorisation too coarse, the diagnostic power too limited, and the predictive validity too contested.

The solution is not to abandon NPS. The solution is to contextualise it. NPS should not be used alone. It should be used alongside — and interpreted through — other measures:

  • Customer satisfaction measures (multi-item scales like the ACSI) provide more reliable and diagnostically useful data on how customers feel about specific aspects of the product or service.
  • Brand tracking studies (continuous or periodic surveys measuring brand awareness, consideration, preference, and usage) provide a broader picture of brand health than a single loyalty metric.
  • Behavioural loyalty data (repeat purchase rates, share of wallet, customer lifetime value) measure what customers actually do, not what they say they would do.
  • Qualitative research (depth interviews, ethnographic observation) provides the diagnostic depth that NPS lacks — the why behind the what.

Triangulation, not monopoly. The principle from F3-09 applies: no single data source provides a complete picture. NPS is one signal among many. Used as part of a triangulated measurement framework — where multiple data sources are compared, contrasted, and integrated — it has value. Used as the single metric that governs strategy, resource allocation, and executive compensation, it is dangerously incomplete.

The fact that NPS has achieved near-monopoly status in many organisations is not evidence of its superiority. It is evidence of the organisational appeal of simplicity — and a warning about what happens when a metric's ease of use outweighs its methodological rigour. The metric became a religion because it was simple to preach, not because it revealed the truth.


The Questions

  1. F3-04 Application. Using the principles of survey design and measurement from F3-04, analyse the methodological properties of the NPS question. Evaluate the 11-point scale, the three-category classification (Promoters, Passives, Detractors), and the net score calculation. What information is gained and what information is lost through this design? How would you redesign the NPS measurement to address the most significant methodological critiques while retaining its organisational simplicity?

  2. F3-05 Application. NPS is used as a brand health metric in many organisations. Using the brand tracking and measurement frameworks from F3-05, assess whether NPS is an adequate measure of brand health. What dimensions of brand health does NPS capture? What dimensions does it miss? Design a brand health measurement framework that integrates NPS with other measures to provide a more complete picture.

  3. F3-09 Application. The NPS case illustrates what happens when a metric becomes a religion — when organisational adoption outstrips methodological validation. Using the frameworks from F3-09 on the limits of data, analyse the organisational dynamics that drove NPS adoption. Why did companies adopt NPS despite the academic critique? What does this case teach about the relationship between a metric's organisational properties (simplicity, benchmarkability, cascadability) and its scientific properties (reliability, validity, predictive power)?


Sources

Reichheld, F.F. (2003). "The One Number You Need to Grow." Harvard Business Review, December.

Reichheld, F.F. & Markey, R. (2011). The Ultimate Question 2.0: How Net Promoter Companies Thrive in a Customer-Driven World. Harvard Business Review Press.

Keiningham, T.L., Cooil, B., Andreassen, T.W. & Aksoy, L. (2007). "A Longitudinal Examination of Net Promoter and Firm Revenue Growth." Journal of Marketing, 71(3), 39-51.

Sharp, B. (2008). "Net Promoter Score Fails the Test." Marketing Research, 20(4), 28-30.

Sharp, B. (2010). How Brands Grow: What Marketers Don't Know. Oxford University Press.

Kristensen, K. & Eskildsen, J. (2014). "Is the NPS a Trustworthy Performance Measure?" The TQM Journal, 26(2), 202-214.

Grisaffe, D.B. (2007). "Questions About the Ultimate Question: Conceptual Considerations in Evaluating Reichheld's Net Promoter Score." Journal of Consumer Satisfaction, Dissatisfaction and Complaining Behavior, 20, 36-53.

Fornell, C., Johnson, M.D., Anderson, E.W., Cha, J. & Bryant, B.E. (1996). "The American Customer Satisfaction Index: Nature, Purpose, and Findings." Journal of Marketing, 60(4), 7-18.

East, R., Hammond, K. & Lomax, W. (2008). "Measuring the Impact of Positive and Negative Word of Mouth on Brand Purchase Probability." International Journal of Research in Marketing, 25(3), 215-224.

Goodhart, C.A.E. (1984). "Problems of Monetary Management: The U.K. Experience." In Monetary Theory and Practice. Macmillan.