Episode 293: Gary Smith
Stop Torturing Data
When scientists game the system to get publishable results, it undermines the legitimacy of science. Data can be interpreted in many different ways and sliced into an infinite number of shapes, but deliberately shaping your results to fit restrictions leads everyone down the wrong path. This is called torturing data, and it can look like cherry-picking participants or results for a study, or getting your results first and then reverse engineering your hypothesis after the fact.
Gary Smith is the Fletcher Jones Professor of Economics at Pomona College. He is also the author of several books on data and economics. His latest work, Distrust: Big Data, Data-Torturing, and the Assault on Science, explores why society distrusts science, both in general and in specific instances.
Greg and Gary discuss what nefarious things go on when scientists focus on keeping p-values low. They discuss the distinction between correlation and causation, which an AI might not be able to draw, and Diederik Stapel's work in that area. Gary explains data mining and HARKing (hypothesizing after the results are known). Gary and Greg also discuss the difference in importance and feasibility between backcasting and forecasting with markets, what makes ChatGPT work under the hood, and the real advantage that Warren Buffett has in investing.
*unSILOed Podcast is produced by University FM.*
Episode Quotes:
The future of education with large language models
50:22: We may be going to a world where my ChatGPT talks to your ChatGPT, but I hope not. And in most jobs, you have to communicate, you have to write reports that are persuasive, coherent, and factually correct. And sometimes you have to get up, speak and talk. And in some of my classes, a lot of the things I do are group projects where they work on things outside of class, then they come into class, stand up, and present the results, kind of like a real-world business situation. And the large language models are not going to take that over. And I think if education switches more to that model, teaching critical thinking, working on projects, communicating results, education's going to actually get better. It's not going to destroy education.
Underestimating our capacity as human beings
29:27: The problem today is not that computers are smarter than us. But we think they're smarter than us, and we trust them to make decisions they shouldn't be trusted to make.
Data mining is a vice
23:02: The problem is these computer algorithms they're good at finding patterns—statistical patterns—but they have no way of judging, assessing whether it makes any sense or not. They have no way of assessing whether that is likely to be a meaningful or meaningless thing. And too many people think that data mining is a virtue. And I continue to consider it a vice.
The danger of large language models
46:53: The real danger of large language models is not that they're going to take over the world but that we're going to trust them too much and start making decisions they shouldn't be making.
Show Links:
Recommended Resources:
Guest Profile:
His Work:
- Distrust: Big Data, Data-Torturing, and the Assault on Science 
- Standard Deviations: Flawed Assumptions, Tortured Data, and Other Ways to Lie with Statistics 
- What the Luck?: The Surprising Role of Chance in Our Everyday Lives 
- Money Machine: The Surprisingly Simple Power of Value Investing 
- Your Home Dividend: Why Buying A Home May Be the Best Investment You'll Ever Make 