Crowds are wise enough to know when other people will be wrong

The wisdom of the crowd is a simple approach that can be surprisingly effective at finding the right answer to certain problems. For example, if a large group of people is asked to estimate the number of jelly beans in a jar, the average of all the answers will be closer to the truth than most individual guesses. The approach only applies to certain types of questions, but there is evidence of real-world utility, such as improving medical diagnoses.
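At its core, the calculation is nothing more than averaging independent guesses. A minimal sketch, with invented numbers rather than data from any actual study:

```python
# Wisdom-of-crowds average: a minimal sketch with invented jelly-bean guesses.
estimates = [850, 1200, 975, 1430, 640, 1105, 990, 1320]

crowd_estimate = sum(estimates) / len(estimates)
print(f"Crowd estimate: {crowd_estimate:.0f} jelly beans")
```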

This process has some obvious limitations, but a team of researchers from MIT and Princeton published a paper in Nature this week suggesting a way to make it more reliable: look for an answer that’s more common than people predict it will be, because it’s probably right.

As part of their paper, Dražen Prelec and his colleagues used a survey about US state capitals. Each question was a simple true/false statement of the form “Philadelphia is the capital of Pennsylvania.” The listed city was always the most populous city in the state, which is not necessarily the capital. In the case of Pennsylvania, the capital is actually Harrisburg, but many people don’t know that.

The wisdom-of-crowds approach fails on this kind of question. The problem is that some questions rely on unusual or otherwise specialized knowledge that the majority doesn’t share. Since most people don’t have that knowledge, the crowd’s answer will be flat-out wrong.

Previous tweaks have attempted to correct this issue by taking confidence into account. People are asked how confident they are in their answers, and more weight is given to more confident ones. However, this only works if people are aware that they don’t know something – which, remarkably, is often not the case.
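A confidence-weighted vote can be sketched in a few lines (hypothetical responses; the surveys’ actual weighting scheme may differ):

```python
# Hypothetical confidence-weighted vote. Each answer is weighted by the
# respondent's self-reported confidence (0-1); this is a sketch, not the
# paper's exact scheme.
responses = [
    ("True", 0.9), ("True", 0.8), ("True", 0.85),  # confidently wrong
    ("False", 0.9),                                # confidently right, but rare
]

weights = {}
for answer, confidence in responses:
    weights[answer] = weights.get(answer, 0.0) + confidence

print(max(weights, key=weights.get))  # still "True": confidence doesn't help
```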

In the case of the Philadelphia question, people who incorrectly answered “True” were about as confident in their answers as people who correctly answered “False,” so confidence ratings did not improve the result. But when people were asked to predict how others would answer, a difference emerged between the two groups: people who answered “True” thought most people would agree with them, because they didn’t know they were wrong. In contrast, people who answered “False” knew they had unusual knowledge and correctly predicted that most people would get it wrong and answer “True.”

As a result, the group as a whole predicted that “True” would be the overwhelmingly popular answer. It was popular, but not to the extent the crowd had predicted: more people knew it was a trick question than the crowd expected. That discrepancy is what the adjusted approach exploits. The new version looks at how people predict the population will vote, finds the answer that people actually gave more often than those predictions suggest, and chooses that “surprisingly popular” answer as the correct one.

To return to our example, most people expect others to pick Philadelphia, while few expect others to pick Harrisburg. But because Harrisburg is the correct answer, it shows up among the actual answers much more often than the predictions suggest.
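In code, the “surprisingly popular” rule for a single true/false question reduces to comparing actual answer shares against predicted ones. The numbers below are invented for illustration; they aren’t the paper’s data:

```python
# "Surprisingly popular" rule for one true/false question.
# Invented shares for the Pennsylvania example, not the paper's data.
votes = {"True": 0.65, "False": 0.35}      # fraction giving each answer
predicted = {"True": 0.80, "False": 0.20}  # average predicted fractions

# Pick the answer that is more popular than the crowd predicted it would be.
surprise = {ans: votes[ans] - predicted[ans] for ans in votes}
print(max(surprise, key=surprise.get))  # "False" -- Harrisburg is the capital
```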

Prelec and his colleagues constructed a statistical argument that this procedure should improve results, then tested it on a number of real-world examples. In addition to the capitals survey, they used a general-knowledge survey, a questionnaire asking art professionals and laypeople to estimate the prices of certain works of art, and a survey asking dermatologists to judge whether skin lesions were malignant or benign.

Across the aggregated results of all these surveys, the “surprisingly popular” (SP) algorithm had 21.3 percent fewer errors than a standard “popular vote” approach. Confidence ratings were also collected for 290 of the 490 questions across the surveys. Here too, the SP algorithm performed better: it had 24.2 percent fewer errors than an approach that chose confidence-weighted answers.

It’s easy to misinterpret the “wisdom of crowds” approach as suggesting that any answer reached by a large group of people will be the right one. That is not the case; the effect can be quite easily undermined by social influences, such as people being told how others have answered. These shortcomings matter because the approach could be a very useful tool, as its potential use in medical settings suggests.

So improvements like these help sharpen the tool to the point where it could have robust real-world applications. “It would be hard to trust a method if it doesn’t work with ideal respondents for simple problems like [the capital of Pennsylvania],” the authors write. Fixing things so it gets simple questions like this right is a big step in the right direction.

Nature, 2017. DOI: 10.1038/nature21054
