CANR
WORK TITLE: AI Snake Oil
WORK NOTES:
PSEUDONYM(S):
BIRTHDATE:
WEBSITE: https://www.cs.princeton.edu/~arvindn/
CITY:
STATE:
COUNTRY:
NATIONALITY:
LAST VOLUME:
RESEARCHER NOTES:
PERSONAL
Male.
EDUCATION: Indian Institute of Technology Madras, B.S., 2004; University of Texas at Austin, Ph.D. (computer science), 2009.
ADDRESS
CAREER
Princeton University, assistant professor, 2012, associate professor, 2014, professor, 2022; Knight First Amendment Institute, visiting senior research scientist, 2022-23.
AWARDS: Presidential Early Career Award for Scientists and Engineers; Privacy Enhancing Technology Award, 2008.
WRITINGS
(With Joseph Bonneau, Edward W. Felten, Andrew Miller, and Steven Goldfeder) Bitcoin and Cryptocurrency Technologies: A Comprehensive Introduction, Princeton University Press, 2016.
(With Sayash Kapoor) AI Snake Oil: What Artificial Intelligence Can Do, What It Can't, and How to Tell the Difference, Princeton University Press, 2024.
SIDELIGHTS
[open new]
Computer scientist Narayanan researches information privacy, fairness in machine learning, cryptocurrencies, and tech policy. He has shown how supposedly anonymized data can be de-anonymized by linking it to publicly available online accounts, and his work has revealed how machine learning reflects cultural stereotypes. As a professor at Princeton University, he has published papers on social computing and privacy. Narayanan was named to TIME’s list of 100 Most Influential People in AI.
In 2024 Narayanan and Sayash Kapoor, a doctoral candidate in Princeton’s computer science program, published AI Snake Oil: What Artificial Intelligence Can Do, What It Can’t, and How to Tell the Difference, an examination of AI’s promise, its limitations, and the misinformation surrounding it. The authors acknowledge that generative AI applications such as ChatGPT can usefully collate and communicate massive amounts of information, but they warn of harmful AI claims and products, of real-world harms, and of the dangers of AI applications controlled by unaccountable big tech companies.
The book explains the difference between generative AI, which can provide, collate, and communicate useful information, and predictive AI, which applies generalized patterns to individual cases and can produce biased and discriminatory outcomes when consequential decisions, such as who gets a job, receives benefits, or even goes to jail, are left to opaque AI systems. Narayanan and Kapoor argue that efficiency gains from AI are often grossly overestimated, warn against overoptimistic claims by AI developers, and explain how organizations and academia are falling for AI snake oil.
Another issue the authors address is the fallacy that AI will go rogue and develop its own agency. Narayanan told Louise Matsakis in an interview at Semafor: “Those arguments are being offered without any evidence… Whatever risks there are from very powerful AI, they will be realized earlier from people directing AI to do bad things, rather than from AI going against its programming and developing agency on its own.” Although some of the authors’ points reiterate earlier critiques of AI, the book “may not break new ground, but it gets the job done,” according to a Publishers Weekly contributor. In Kirkus Reviews, a critic remarked: “Written in language that even nontechnical readers can understand, the text provides plenty of practical suggestions that can benefit creators and users alike.”
[close new]
BIOCRIT
PERIODICALS
Kirkus Reviews, July 1, 2024, review of AI Snake Oil: What Artificial Intelligence Can Do, What It Can’t, and How to Tell the Difference.
Publishers Weekly, July 8, 2024, review of AI Snake Oil, p. 167.
ONLINE
Semafor, https://www.semafor.com/ (September 15, 2023), Louise Matsakis, “The Princeton Researchers Calling out ‘AI Snake Oil.’”
Arvind Narayanan
Title/Position: Professor
Degree: Ph.D., University of Texas at Austin, 2009
Email: arvindn@cs.princeton.edu
Phone: (609) 258-9302
Office: 308 Sherrerd Hall
Homepage: https://www.cs.princeton.edu/~arvindn
Other Affiliations: SPIA, CITP
Research Interests: Information privacy, fairness in machine learning, cryptocurrencies, tech policy
Research Areas: Public Law & Policy; Security & Privacy
Short Bio
Arvind Narayanan is a professor of computer science at Princeton. He leads the Princeton Web Transparency and Accountability Project to uncover how companies collect and use our personal information. Narayanan co-created a Massive Open Online Course and textbook on Bitcoin and cryptocurrency technologies, which have been used in over 150 courses worldwide. His recent work has shown how machine learning reflects cultural stereotypes, and his doctoral research showed the fundamental limits of de-identification. Narayanan is a recipient of the Presidential Early Career Award for Scientists and Engineers (PECASE), twice a recipient of the Privacy Enhancing Technologies Award, and thrice a recipient of the Privacy Papers for Policy Makers Award.
Selected Publications
Dark Patterns at Scale: Findings from a Crawl of 11K Shopping Websites.
Arunesh Mathur, Gunes Acar, Michael Friedman, Elena Lucherini, Jonathan Mayer, Marshini Chetty, Arvind Narayanan.
ACM Conference on Computer-Supported Cooperative Work and Social Computing, 2019.
Privacy Papers for Policy Makers Award.
Semantics derived automatically from language corpora contain human-like biases.
Aylin Caliskan, Joanna J. Bryson, Arvind Narayanan.
Science, 2017.
Bitcoin and Cryptocurrency Technologies: A Comprehensive Introduction.
Arvind Narayanan, Joseph Bonneau, Edward W. Felten, Andrew Miller, Steven Goldfeder.
Princeton University Press, 2016.
Runner up for the 2017 PROSE Award in Computing and Information Sciences, Association of American Publishers.
Online tracking: A 1-million-site measurement and analysis.
Steven Englehardt, Arvind Narayanan.
ACM Conference on Computer and Communications Security, 2016.
Privacy Papers for Policy Makers Award.
On the instability of Bitcoin without the block reward.
Miles Carlsten, Harry Kalodner, S. Matthew Weinberg, Arvind Narayanan.
ACM Conference on Computer and Communications Security, 2016.
A Precautionary Approach to Big Data Privacy.
Arvind Narayanan, Joanna Huey, Edward Felten.
Data Protection on the Move, 2016.
Privacy Papers for Policy Makers Award.
Research Perspectives and Challenges for Bitcoin and Cryptocurrencies.
Joseph Bonneau, Andrew Miller, Jeremy Clark, Joshua Kroll, Edward W. Felten, Arvind Narayanan.
IEEE Security & Privacy, 2015.
Most cited computer security & privacy paper of 2015.
The Web never forgets: Persistent tracking mechanisms in the wild.
Gunes Acar, Christian Eubank, Steven Englehardt, Marc Juarez, Arvind Narayanan, Claudia Diaz.
ACM Conference on Computer and Communications Security (CCS), 2014.
Routes for breaching and protecting genetic privacy.
Yaniv Erlich, Arvind Narayanan.
Nature Reviews Genetics, 2014.
A Scanner Darkly: Protecting User Privacy from Perceptual Applications.
Suman Jana, Arvind Narayanan, Vitaly Shmatikov.
IEEE Security and Privacy, 2013.
2014 Privacy Enhancing Technologies Award.
Arvind Narayanan
Knight Institute Visiting Senior Research Scientist 2022-2023; Princeton University
Arvind Narayanan is a professor of computer science at Princeton University and the director of the Center for Information Technology Policy. He co-authored a textbook on fairness and machine learning and is currently co-authoring a book on AI snake oil. He led the Princeton Web Transparency and Accountability Project to uncover how companies collect and use our personal information. His work was among the first to show how machine learning reflects cultural stereotypes, and his doctoral research showed the fundamental limits of de-identification. Narayanan is a recipient of the Presidential Early Career Award for Scientists and Engineers (PECASE), twice a recipient of the Privacy Enhancing Technologies Award, and thrice a recipient of the Privacy Papers for Policy Makers Award.
Narayanan was the Knight First Amendment Institute’s 2022-2023 visiting senior research scientist. He carried out a research project on algorithmic amplification on social media and hosted a major conference on the topic in Spring 2023.
Arvind Narayanan
From Wikipedia, the free encyclopedia
Arvind Narayanan
Alma mater: Indian Institute of Technology Madras; University of Texas at Austin
Known for: De-anonymization
Awards: Privacy Enhancing Technology Award
Institutions: Stanford University; Princeton University
Thesis: Data Privacy: The Non-interactive Setting (2009)
Doctoral advisor: Vitaly Shmatikov
Website: www.cs.princeton.edu/~arvindn
Arvind Narayanan is a computer scientist and a professor at Princeton University.[1] Narayanan is recognized for his research in the de-anonymization of data.[2][3]
Biography
Narayanan received technical degrees from the Indian Institute of Technology Madras in 2004,[4] where his advisor was C. Pandu Rangan. He received his PhD in computer science from the University of Texas at Austin in 2009 under Vitaly Shmatikov, then worked briefly as a post-doctoral researcher at Stanford University with Dan Boneh. Narayanan joined Princeton University as an assistant professor in September 2012 and was promoted to associate professor in 2014[5] and to professor in 2022.
Career
In 2006 Netflix began the Netflix Prize competition for better recommendation algorithms and, to facilitate it, released "anonymized" viewership data. Narayanan and his advisor Vitaly Shmatikov showed that this data could be de-anonymized by linking it to publicly available IMDb user accounts.[6] The work brought wider recognition to de-anonymization techniques and to the need for more rigorous anonymization. Narayanan later de-anonymized social-network graphs[7] and blog writings.[8]
In mid-2010, Narayanan and Jonathan Mayer argued in favor of Do Not Track in HTTP headers.[9][10] They built prototypes of Do Not Track for clients and servers.[11] Working with Mozilla, they wrote the influential Internet Engineering Task Force Internet-Draft of Do Not Track.[12][13]
Narayanan has also written extensively about software culture. He has argued for more substantial teaching of ethics in computer science education[14] and for more usable cryptography.[15][16]
Awards
Presidential Early Career Award for Scientists and Engineers[17]
Privacy Enhancing Technology Award 2008[18]
AI Snake Oil: What Artificial Intelligence Can Do, What It Can't, and How to Tell the Difference
Arvind Narayanan and Sayash Kapoor. Princeton Univ., $24.95 (352p) ISBN 978-0-691-24913-1
Narayanan (coauthor of Bitcoin and Cryptocurrency Technologies), a computer science professor at Princeton University, and Kapoor, a PhD candidate in Princeton's computer science program, present a capable examination of AI's limitations. Because ChatGPT and other generative AI software imitate text patterns rather than memorize facts, it's impossible to prevent them from spouting inaccurate information, the authors contend. They suggest that this shortcoming undercuts any hoped-for efficiency gains and describe how news website CNET's deployment of the technology in 2022 backfired after errors were discovered in many of the pieces it wrote. Predictive AI programs are riddled with design flaws, the authors argue, recounting how software tasked with determining "the risk of releasing a defendant before trial" was trained on a national dataset and then used in Cook County, Ill., where it failed to adjust for the county's lower crime rate and recommended thousands of defendants be jailed when they actually posed no threat. Narayanan and Kapoor offer a solid overview of AI's defects, though the anecdotes about racial biases in facial recognition software and the abysmal working conditions of data annotators largely reiterate the same critiques found in other AI cris de coeur. This may not break new ground, but it gets the job done. (Sept.)
Copyright: COPYRIGHT 2024 PWxyz, LLC
http://www.publishersweekly.com/
Source Citation
"AI Snake Oil: What Artificial Intelligence Can Do, What It Can't, and How to Tell the Difference." Publishers Weekly, vol. 271, no. 26, 8 July 2024, p. 167. Gale General OneFile, link.gale.com/apps/doc/A801800256/ITOF?u=schlager&sid=bookmark-ITOF&xid=8278c82f. Accessed 23 Aug. 2024.
Narayanan, Arvind. AI SNAKE OIL. Princeton Univ. (Nonfiction). $24.95. ISBN: 9780691249131.
Two academics in the burgeoning field of AI survey the landscape and present an accessible state-of-the-union report.
Like it or not, AI is widespread. The present challenge involves strategies to use it properly, comprehend its limitations, and ask the right questions of the entrepreneurs promoting it as a cure for every social ill. The experienced authors bring a wealth of knowledge to their subject: Narayanan is a professor of computer science at Princeton and director of its Center for Information Technology Policy, and Kapoor is a doctoral candidate with hands-on experience of AI. They walk through the background of AI development and explain the difference between generative and predictive AI. They see great advantages in generative AI, which can provide, collate, and communicate massive amounts of information. Developers and regulators must take strict precautions in areas such as academic cheating, but overall, the advantages outweigh the problems. Predictive AI, however, is another matter. It seeks to apply generalized information to specific cases, and there are plenty of horror stories about people being denied benefits, having reputations ruined, or losing jobs due to the opaque decision of an AI system. The authors argue convincingly that when individuals are affected, there should always be human oversight, even if it means additional costs. In addition, the authors show how the claims of AI developers are often overoptimistic (to say the least), and it pays to look at their records as well as have a plan for regular review. Written in language that even nontechnical readers can understand, the text provides plenty of practical suggestions that can benefit creators and users alike. It's also worth noting that Narayanan and Kapoor write a regular newsletter to update their points.
Highly useful advice for those who work with or are affected by AI--i.e., nearly everyone.
Copyright: COPYRIGHT 2024 Kirkus Media LLC
http://www.kirkusreviews.com/
Source Citation
"Narayanan, Arvind: AI SNAKE OIL." Kirkus Reviews, 1 July 2024, p. NA. Gale General OneFile, link.gale.com/apps/doc/A799332917/ITOF?u=schlager&sid=bookmark-ITOF&xid=c5352251. Accessed 23 Aug. 2024.
The Princeton researchers calling out ‘AI snake oil’
Louise Matsakis
Updated Sep 15, 2023, 11:53am EDT
The Scene
In July, a new study about ChatGPT started going viral on social media, which seemed to validate growing suspicions about how the chatbot had gotten “dumber” over time. As often happens in these circumstances, Arvind Narayanan and Sayash Kapoor stepped in as the voices of reason.
The Princeton computer science professor and Ph.D. candidate, respectively, are the authors of the popular newsletter and soon-to-be-published book AI Snake Oil, which exists to “dispel hype, remove misconceptions, and clarify the limits of AI.”
The pair quickly put out a newsletter explaining how the paper’s findings had been grossly oversimplified. It was part of a series of similar articles Narayanan and Kapoor have published, filled with balanced criticism of AI and ideas for how to mitigate its harms. But don’t call them technophobes: One of the most charming things Narayanan has written about is how he built a ChatGPT voice interface for his toddler.
In the edited conversation below, we talked to Narayanan and Kapoor about transparency reporting, disinformation, and why they are confident AI doesn’t pose an existential risk to humanity.
The View From Arvind Narayanan and Sayash Kapoor
Q: How can consumers quickly evaluate whether a new AI company is selling snake oil or actually offering a reasonable application of this technology?
Narayanan: The key distinction that we make is between predictive AI and generative AI. In our view, most of the snake oil is concentrated in predictive AI. When we say snake oil, it’s an AI that doesn’t work at all — not just an AI that doesn’t live up to its hype; there’s certainly some of that going on in generative AI.
You have AI hiring tools, for instance, which screen people based on questions like, “Do you keep your desk clean?” or by analyzing their facial expressions and voice. There’s no basis to believe that kind of prediction has any statistical validity at all. There have been zero studies of these tools, because researchers don’t have access and companies are not publishing their data.
We very strongly suspect that there are entire sectors like this that are just selling snake oil. And it’s not just companies, there’s a lot of snake oil in academia as well. There was this paper that claims to predict whether a psychology study will replicate or not using machine learning. This paper really has basically all the pitfalls that we could think of, and I would very much call it snake oil. It’s claiming that you can predict the future using AI — that’s the thing that grinds our gears the most.
Q: One of the big worries about generative AI is that it will create a flood of disinformation. But you argue that some of the solutions being proposed, such as adding watermarks to AI-generated images, won’t work. Why?
Kapoor: First, when we look at disinformation itself, it takes the focus away from where solutions actually work — for instance, information integrity efforts on social media. If you recall the Pentagon hoax, the entire reason that was successful to some extent was because it was spread by a verified Twitter account. The photo was of a fake Pentagon bombing. It has clear visual artifacts of AI, the fences were blending into each other — it’s just a really shoddy job. If we focus on watermarking and the role of AI in spreading disinformation, I think we lose sight of this bigger picture.
The other part is AI genuinely does lead to these new types of harms, which are, in our view, much more impactful compared to disinformation. One example is non-consensual deepfakes. This is an area where you don’t need information to spread virally for it to cause harm. You can have a targeted campaign that attacks just one individual, and it will cause immense psychological, emotional, and financial damage. It’s a problem that we feel is relatively unaddressed compared to all of the attention that disinformation is getting.
Q: You argue that AI companies should start publishing regular transparency reports, the same way social media giants like YouTube and Meta do. Why is that a good idea?
Kapoor: I don’t think there’s going to be one set of transparency standards that apply to all language models and then we’re done. I think the process has to be iterative, it has to take into account people’s feedback, and it has to improve over time. With that said, I think one reason why social media is useful is because it can provide us with an initial set of things that companies should report. As we pointed out recently, the entire debate about the harms of AI is happening in a data vacuum. We don’t know how often people are using ChatGPT for medical advice, legal advice, or financial advice. We don’t know how often it outputs hate speech, how often it defames people, and so on.
The first step towards understanding that is publishing user reports. This might seem like it’s technically infeasible — how do you understand how more than, say, 200 million people are using your platform? But again, in social media, this has already been done. Facebook releases these quarterly reports, which outline how much hate speech there is on the platform, how many people reported comments, and how many of those comments were taken down. I think that can be a great model as a starting point for foundational model providers.
Q: People often learn about new AI research by reading pre-print studies on open access archive arXiv.org. But the studies haven’t been peer-reviewed, and sometimes low-quality ones go viral or good ones are misinterpreted. Should there be better controls on arXiv.org?
Kapoor: I do think there are genuine concerns about arXiv papers being misinterpreted, and just the overall pace of research and how it affects research communities. Many people have said that maybe we should discredit anything that hasn’t been peer-reviewed — I wholeheartedly disagree with that. We do need to improve our scientific processes, we need to improve peer review, but at the same time, it’s not necessarily true that everything that’s peer-reviewed also does not suffer from errors.
In our own research on studies that use machine learning, we found top papers published in top journals also tended to suffer from reproducibility issues. So while peer review is still important, I think an overreliance on it can also be harmful. ArXiv.org helps reduce gatekeeping in academia. Peer review tends to favor research that fits within established norms — arXiv can help level the playing field for people who are doing research that’s outside the box.
Q: Wealthy donors are pouring millions of dollars into organizations promoting the idea that artificial intelligence presents an existential risk to humanity. Is that true?
Narayanan: There are just so many fundamental flaws in the argument that x-risk [existential risk] is so serious that we need urgent action on it. We’re calling it a “tower of fallacies.” I think there’s just fallacies on every level. One is this idea that AGI is coming at us really fast, and a lot of that has been based on naive extrapolations of trends in the scaling up of these models. But if you look at the technical reality, scaling has already basically stopped yielding dividends. A lot of the arguments that this is imminent just don’t really make sense.
Another is that AI is going to go rogue, it’s going to have its own agency, it’s going to do all these things. Those arguments are being offered without any evidence by extrapolating based on [purely theoretical] examples. Whatever risks there are from very powerful AI, they will be realized earlier from people directing AI to do bad things, rather than from AI going against its programming and developing agency on its own.
So the basic question is, how are you defending against hacking or tricking these AI models? It’s horrifying to me that companies are ignoring those security vulnerabilities that exist today and instead smoking their pipes and speculating about a future rogue AI. That has been really depressing.
And the third really problematic thing about this is that all of the interventions that are being proposed will only increase every possible risk, including existential risks. The solution they propose is to concentrate power in the hands of a few AI companies.
Q: Is x-risk actually a big concern in the AI research community? Are you fielding questions about it from new students?
Narayanan: I think the median AI researcher is still interested in doing cool technical things and publishing stuff. I don’t think they are dramatically shifting their research because they’re worried about existential risk. A lot of researchers consider it intellectually interesting to work on alignment, but even among them, I don’t necessarily know that the majority think that x-risk is an imminent problem. So in that sense, what you’re seeing in the media exaggerates what’s actually going on in the AI research community.
Kapoor: I definitely agree that the median AI researcher is far from the position that x-risk is imminent. That said, I do think there are some selection effects. For instance, a lot of effective altruism organizations have made AI x-risk their top cause in the last few years. That means a lot of the people who are getting funding to do AI research are naturally inclined, but also have been specifically selected, for their interest in reducing AI x-risk.
I’m an international student here, and one of the main sources of fellowships is Open Philanthropy. Over the last five years or so, they have spent over $200 million on AI x-risk specifically. When that kind of shift happens, I think there’s also a distortion that happens. So even if we have a large number of people working on AI x-risk, it does not really mean that this interest arose organically. It has been very strategically funded by organizations that make x-risk a top area of focus.