

youngsports 2022. 10. 15. 14:17

Reading “Noise: A Flaw in Human Judgment” by Daniel Kahneman, Olivier Sibony, and Cass Sunstein

People like me who work in clinical research are no strangers to noise, because we deal with it almost daily. Still, I very much enjoyed reading “Noise: A Flaw in Human Judgment” [1] by Daniel Kahneman, Olivier Sibony, and Cass Sunstein [2]. The book re-examines an age-old concept from a socio-psychological viewpoint, and the result is a nice complement to the statistical treatment of the subject that I am familiar with.

 

Statistical treatment of noise

From a statistical point of view, any observed or measured value is composed of two components: a true value and an error. The true value is usually unknown. The error component here does not simply mean a mistake; it reflects any difference between people and within people, with the latter including measurement error. The error component can be divided into two sub-components: systematic error and random error.

Systematic error may include biases and confounding factors. For instance, the way we select people for a survey, the method we choose for data analysis, and even the selective publication of data can make our conclusions deviate from the truth. We sometimes simply call systematic error bias, and there are hundreds of types of bias that make our research irreproducible or even wrong.

 

Random error is a much more challenging issue. The dissection of this error has occupied some of the best statistical minds in the world. One of the most eminent is Ronald Aylmer Fisher, a genius who is regarded as the father of modern statistical science. In the 1920s, Fisher invented the analysis of variance, which partitions the variation of a measurement into two sources: between-groups and within-group variation. In Fisher’s thinking, the between-groups variation represents signal, and the within-group variation represents noise. Fisher further derived the probability distribution of the ratio of signal to noise (i.e., the F-test) to assess whether a difference in the measurement between groups is due to an underlying mechanism or to random error.
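To make the partition concrete, here is a minimal sketch of a one-way analysis of variance computed by hand in Python. The three groups and their values are made up for illustration, and the function name `one_way_anova` is my own. It is meant only to show how the between-groups and within-group pieces are formed, not to replace a statistics package.

```python
from statistics import mean

def one_way_anova(groups):
    """Return (ss_between, ss_within, f_statistic) for a list of groups."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    k = len(groups)        # number of groups
    n = len(all_values)    # total number of observations

    # Between-groups sum of squares: distance of each group mean from the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: distance of each value from its own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    ms_between = ss_between / (k - 1)   # "signal" mean square
    ms_within = ss_within / (n - k)     # "noise" mean square
    return ss_between, ss_within, ms_between / ms_within

# Hypothetical measurements in three groups (illustrative numbers only).
groups = [[4.0, 5.0, 6.0], [6.0, 7.0, 8.0], [9.0, 10.0, 11.0]]
ss_b, ss_w, f_stat = one_way_anova(groups)
print(ss_b, ss_w, f_stat)
```

With these toy numbers the between-groups sum of squares is about 38, the within-group sum of squares is 6, and the F ratio is about 19. Whether such a ratio is "large" is judged against the F distribution with (k - 1, n - k) degrees of freedom, which this sketch does not compute.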

 

Ronald A. Fisher (1890–1962), one of the world’s best statistical minds, dissected variation (i.e., noise) in the early 20th century, but he is not mentioned in Noise.

 

Fisher pointed out that if we take a random sample of people and measure their height, then the deviations of individual values from the sample average closely resemble a Normal distribution (which he called the ‘law of error’). In an influential essay, “The Causes Of Human Variability”, Fisher articulated that the mean squared deviation (also called the mean squared error, or MSE) can be used as a measure of error:

 

“[…] the amount of variability may be measured, as errors of observation are habitually measured, by the mean of the squares of the deviations of different individuals from the mean, this mean square deviation being strictly comparable to the mean square error used in astronomy and in all physical science.”

As can be seen, noise has been of interest to statistical science for a long time. However, the statistical treatment of noise is not easily understood by the public at large, because the mathematics underlying noise treatment is quite daunting to many people.

 

And, that is where I find the socio-psychological treatment of noise by Kahneman, Sibony, and Sunstein quite interesting.

Socio-psychological dissection of noise

To understand the authors’ perspective, let us consider the following stories. A few years ago, in Vietnam, a young man convicted of stealing a duck worth $10 was sentenced by a local court to 7 years in prison. A few months later, another young man in the same region was sentenced to 2 months in prison for stealing two ducks worth about $15. Such sentencing disparity is almost the norm in the Vietnamese judicial system, and it is commonly viewed as a chronic flaw that leaves people with little trust in the system. The example I present can also be viewed as noise in the system, but it has been neither well studied nor documented.

 

In this book, Kahneman, Sibony and Sunstein take a deep dive into the dissection of noise in the American judicial system, business, education, and medical diagnoses. Anything that involves a judgment is subject to noise. A court sentence is a judgment. Similarly, a medical diagnosis is also a judgment. The authors therefore argue that “wherever there is judgment, there is noise — and more of it than you think.”

The authors point out that ‘judgment’ here should be understood as “a form of measurement in which the instrument is a human mind.” Like a physical measurement, a judgment is the process of assigning a score to an object, but unlike a physical measurement, the score does not have to be a number. It is this process of assigning a score that gives rise to noise. In Noise, Kahneman, Sibony, and Sunstein distinguish between noise and bias, and they summarize the relationship in an equation, squared error equals squared bias plus squared noise:

Error^2 = Bias^2 + Noise^2

Obviously, the equation implies that bias and noise are two independent sources of error, or in mathematical language, they are orthogonal. The authors define bias as “systematic errors of judgment” (page 163) and that “bias is error we can often see and even explain” (page 243). However, they don’t delve into the treatment of bias, and instead focus on the dissection of noise.
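A small numerical check may help here. The sketch below assumes that "error squared" means the mean squared error of a set of judgments about one case, that bias is the average error, and that noise is the standard deviation of the judgments. The judgments and the true value are hypothetical, and the function name `decompose_error` is mine.

```python
from statistics import mean, pvariance

def decompose_error(judgments, true_value):
    """Split the mean squared error into squared bias and squared noise."""
    errors = [j - true_value for j in judgments]
    mse = mean(e ** 2 for e in errors)   # overall error (mean squared error)
    bias = mean(errors)                  # average error: the systematic part
    noise_sq = pvariance(judgments)      # spread of judgments around their own mean
    return mse, bias ** 2, noise_sq

# Hypothetical judgments of a quantity whose true value is 13.
judgments = [12.0, 14.0, 16.0, 18.0]
true_value = 13.0
mse, bias_sq, noise_sq = decompose_error(judgments, true_value)
print(mse, bias_sq, noise_sq)
```

In this toy case the mean squared error is 9, the squared bias is 4, and the squared noise is 5. The two pieces add up exactly because deviations from the mean sum to zero, so the cross term between bias and noise vanishes.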

 

The authors divide the noise component into two sub-components: level noise and pattern noise. They define level noise as “the variability of the average judgments made by different individuals” (page 365), whereas pattern noise is the variability in judgments within a single judge. This dissection is not new: as I mentioned earlier, Fisher came up with the same idea, which he referred to as between-groups and within-group variance, about 100 years ago. What puzzles me, however, is the statement that “the proper statistical term for pattern noise is judge x case interaction” (page 76). I think the authors have misunderstood the concept of interaction here.

The authors further divide pattern noise into two smaller, independent sources that they call stable pattern noise and occasional pattern noise. They explain that stable pattern noise typically arises from factors that are more permanent, such as an employer’s preference for candidates from certain universities or a doctor’s tendency to recommend a particular type of surgical treatment. Occasional (or transient) pattern noise, as the name implies, reflects momentary internal factors that affect one’s judgment, such as “a judge’s good mood at the relevant moment or some unfortunate recent occurrence that is currently on the judge’s mind” (page 203). Overall, they graphically present their dissection of noise as follows:

The dissection of noise by Kahneman, Sibony, and Sunstein. MSE = mean squared error. Source: illustration from the authors’ book.
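For readers who want to see what the quoted "judge x case interaction" reading would compute, here is a rough sketch. It takes the book's quoted statement at face value and treats pattern noise as what remains after removing each judge's average and each case's average. That is an assumption on my part, and it is not the within-judge reading discussed above. The sentencing table is invented, and the names `noise_components`, `system_sq`, `level_sq`, and `pattern_sq` are mine.

```python
from statistics import mean

def noise_components(table):
    """table[j][c] is judge j's judgment on case c.

    Returns squared (system, level, pattern) noise, using divide-by-N variances.
    """
    n_judges, n_cases = len(table), len(table[0])
    grand = mean(x for row in table for x in row)
    judge_means = [mean(row) for row in table]
    case_means = [mean(table[j][c] for j in range(n_judges)) for c in range(n_cases)]

    # Level noise: variability of the judges' average judgments.
    level_sq = mean((m - grand) ** 2 for m in judge_means)
    # Pattern noise (interaction reading): residual after removing judge and case averages.
    pattern_sq = mean(
        (table[j][c] - judge_means[j] - case_means[c] + grand) ** 2
        for j in range(n_judges) for c in range(n_cases)
    )
    # Overall disagreement: variance across judges within a case, averaged over cases.
    system_sq = mean(
        (table[j][c] - case_means[c]) ** 2
        for j in range(n_judges) for c in range(n_cases)
    )
    return system_sq, level_sq, pattern_sq

# Hypothetical sentences (in months) given by 3 judges to the same 3 cases.
sentences = [
    [12.0, 24.0, 36.0],
    [18.0, 24.0, 48.0],
    [6.0, 30.0, 30.0],
]
system_sq, level_sq, pattern_sq = noise_components(sentences)
print(system_sq, level_sq, pattern_sq)
```

Under these definitions the overall disagreement splits exactly into the level piece plus the residual piece, which is the same additive bookkeeping as in a two-way analysis of variance without replication.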

 

How to reduce noise?

The authors illustrate those components with a series of cases and examples drawn from the judicial system, business, and medical diagnosis. In each case, they articulate various causes of noise, and understandably, most of the causes are psychological and judgmental in nature. Because noise is largely a psychological and judgmental phenomenon, it can be modified.

How can we reduce noise? One simple way is to replace individual judgments with the average from multiple judges. This approach has been shown by Edward Vul and Harold Pashler to improve the accuracy of judgment (page 83). It is not new, as Francis Galton already demonstrated the power of the ‘wisdom of crowds’ more than 100 years ago. In 1906, while attending a local festival, Galton came up with a simple experiment: he collected the guesses of 787 people at the festival about the weight of an ox on display. Not surprisingly, individual guesses were wildly different, but he found that the average guess (1197 lbs) was almost identical to the true weight (1198 lbs). Galton reported the result under the title ‘Vox Populi’; the phenomenon is now popularly known as the ‘wisdom of crowds’. This simple example powerfully demonstrates that a collective judgment can be better than individual judgments.

However, in Noise, the authors argue that such an average judgment is only valid if individual judgments are independent, meaning that they are not swayed by social pressure. Most of us have seen that in meetings people often want to conform to the majority, and that can lead to disparities among groups. In other words, average judgment is better than individual judgments, but it can still have noise.
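The role of independence can be illustrated with a small simulation. The sketch below assumes each judgment is the truth plus a private error and, optionally, an error shared by everyone in the room (a crude stand-in for social influence). All numbers, the function name `sd_of_average`, and this shared-plus-private error model are my own assumptions, not something taken from the book.

```python
import random
from statistics import pstdev

def sd_of_average(n_judges, shared_sd, private_sd, trials=5000, seed=0):
    """Simulate the spread (SD) of the average of n_judges judgments.

    Each judgment's error = shared error (same for all judges) + private error.
    """
    rng = random.Random(seed)
    averages = []
    for _ in range(trials):
        shared = rng.gauss(0.0, shared_sd)   # common to every judge in this trial
        judgments = [shared + rng.gauss(0.0, private_sd) for _ in range(n_judges)]
        averages.append(sum(judgments) / n_judges)
    return pstdev(averages)

# 25 judges, each with a private error SD of 10.
independent = sd_of_average(n_judges=25, shared_sd=0.0, private_sd=10.0)
influenced = sd_of_average(n_judges=25, shared_sd=5.0, private_sd=10.0)
print(independent, influenced)
```

In theory the independent case should give a spread near 10 / sqrt(25) = 2, while the influenced case should stay near sqrt(5^2 + 2^2), or about 5.4. Averaging shrinks the private errors but leaves the shared error untouched, which is why a crowd that talks to itself remains noisy.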

Francis Galton (1822–1911), whose ox-weighing study is the classic illustration of the ‘wisdom of crowds’ (see: https://www.nature.com/articles/075450a0)

 

In many cases, statistical models and algorithms can also reduce noise. In the judicial system, sentencing algorithms driven by past data can help make criminal punishment decisions more transparent and more consistent. In psychology, back in 1954, the prominent psychologist Paul Meehl showed that statistical models could predict human behavior better than clinical psychologists do. In medicine, statistical models have long been shown to be superior to doctors in prognostic prediction and even clinical diagnosis, because those models can weigh multiple factors consistently in a way that humans cannot. For instance, the Watson system (an IBM model) can diagnose heart disease better than cardiologists do.

Given that statistical models have less noise, the question is then: should we replace human judgment with statistical models? Probably not yet. There is hesitancy about the use of statistical algorithms in medical diagnosis, as many patients think that their disease and health needs are so unique that they cannot be addressed by a mechanical algorithm. The authors also point out that in some areas, humans make better judgments than statistical algorithms. Another way to reduce noise is for people to follow a well-defined process for a particular problem. For instance, in health care, doctors have developed guidelines for the management of chronic diseases based on the best scientific evidence, and this is a good model for noise reduction in clinical medicine.

 

What can be learned from this book?

This is a good book, but I wish the authors had written it better. I came away with a mixed message: noise is prevalent in our life, but there is not much we can do about it. The treatment of noise and bias is interesting, but it is not quite novel. One of the authors claims that noise has not been dissected in the past, and that is why the authors think that they have “discovered a new continent.” Wow! That is an overstatement. As I pointed out earlier, some of the world’s best statistical minds worked on this subject a long time ago, notably Ronald Fisher, Fischer Black, and William Deming. It is unfortunate that the book does not mention the work of those authors.

Economist Fischer Black wrote about noise in 1986, but he is not mentioned in Noise.

 

It appears that the authors have not been well exposed to statistical science, and as a result, they make a number of questionable claims. For example, they equate pattern noise (e.g., between-judge variation) with the statistical term ‘interaction’, which is incorrect, at least from an analysis of variance point of view. The authors also claim that “while correlation does not imply causation, causation does imply correlation. Where there is a causal link, we should find a correlation” (page 153), but this is also incorrect. Causation and correlation are two different phenomena. A causal relationship can exist without an observable correlation, for example when bias masks it or when the relationship is non-monotonic.
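The non-monotonic case is easy to demonstrate. In the minimal sketch below, y is completely determined by x through y = x squared, yet the Pearson correlation between them is zero because the relationship is symmetric around x = 0. The data points are invented for illustration, and `pearson_r` is a small helper written here rather than a library call.

```python
from statistics import mean

def pearson_r(xs, ys):
    """Pearson correlation coefficient computed from its definition."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
ys = [x ** 2 for x in xs]   # y depends on x and on nothing else
r = pearson_r(xs, ys)
print(r)
```

The correlation coefficient only measures linear association, so a value of zero here says nothing about whether x drives y.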

I wonder whether coining new terms (e.g., level noise and pattern noise) is necessary when there are well-accepted ones (between-people variation and within-people variation). Indeed, even what the authors call ‘noise’ would more properly be referred to as variation.

 

In summary, Noise is a rediscovery of an age-old statistical concept, re-interpreted from a socio-psychological viewpoint. In social interaction, the variation between people is a cause for celebration, but when it comes to decision-making, that variation is a liability, especially in the form of level noise and biases. The exposition in this book raises public awareness of the importance of noise and bias and of how they affect our everyday life, and I consider that a meaningful contribution. The socio-psychological dissection of noise in this book also contributes to our understanding of human behavior, and that is a welcome addition to the resources for teaching epidemiology and research methodology.

____

[1] ‘Noise: A Flaw in Human Judgment’ by Daniel Kahneman, Olivier Sibony, and Cass Sunstein, published on 18/5/2021, 464 pages.

[2] Daniel Kahneman is the Eugene Higgins Professor of Psychology, Princeton University, Professor of Public Affairs, the Princeton School of Public and International Affairs, and the Nobel Laureate in Economic Sciences in 2002. He is a member of the American Academy of Arts and Sciences and the National Academy of Sciences, a fellow of the American Psychological Association, the American Psychological Society, the Society of Experimental Psychologists, and the Econometric Society. He is the author of Thinking, Fast and Slow, a best seller.

Olivier Sibony is Professor of Strategy and Business Policy at HEC Paris. He spent 25 years in the Paris and New York offices of McKinsey & Company, where he was a senior partner. He is the author of You’re About to Make a Terrible Mistake!

Cass R. Sunstein is the Robert Walmsley University Professor at Harvard University and the founder and director of the Program on Behavioral Economics and Public Policy. From 2009 to 2012, he was Administrator of the White House Office of Information and Regulatory Affairs. He is author of The World According to Star Wars and Nudge (with Richard H. Thaler), a bestseller.