Dangers of AI
I’ve been following Eliezer Yudkowsky since the HPMOR days and have largely enjoyed his insights (though I’ve never followed him especially closely; he’s simply one of many thinkers I check in on now and then). His essay on transhumanism profoundly shaped my worldview, helping me articulate ideas I’d only intuitively sensed but struggled to formalize. Lately, though, his warnings about AI doom leave me perplexed. I struggle to grasp why he believes AI would inherently seek to destroy us, or how he imagines halting technological progress altogether, a notion that strikes me as both naive and impractical even assuming the purest intentions. That said, I share his concern about AI’s potential dangers, though my fears center on a different issue: the uncritical anthropomorphization of AI and its unintended consequences.
Today’s most advanced LLMs are trained on vast swaths of human-generated text and optimized to predict the most plausible next token. This makes them adept at mimicking human conversation, but their outputs are fundamentally averages: statistical reflections of the biases, obsessions, and preoccupations in their training data. They have no lived experience, no original thought, and no capacity to weigh ethical truths. As someone who has built simple AI models, read the research, and run experiments on my own hardware, I understand their limitations. Yet I worry others do not. Increasingly, people treat these systems as genuine intelligences and imbue their outputs with unearned authority, a dangerous mistake that will only compound as AI becomes more convincing.
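To make the “statistical average” point concrete, here is a toy sketch of my own (an illustration, not how any production model actually works): count which word most often follows each word in a tiny made-up corpus, then always emit the most frequent continuation. Real LLMs run neural networks over subword tokens, but the training objective is the same in spirit.

```python
# Toy "next-token predictor": a bigram frequency table over a tiny corpus.
# Deliberately simplified; the corpus and vocabulary are invented for illustration.
from collections import Counter, defaultdict

corpus = (
    "new technology will increase inequality . "
    "new technology will disrupt labor markets . "
    "new technology will increase inequality ."
).split()

# Count which token follows which: this table is the model's entire "knowledge".
follows = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1

def next_token(prev: str) -> str:
    """Return the continuation seen most often after `prev` in the corpus."""
    return follows[prev].most_common(1)[0][0]

# Whatever the merits of the claim, the majority pattern wins:
print(next_token("will"))      # -> "increase" (2 of the 3 continuations)
print(next_token("increase"))  # -> "inequality"
```

Scale the corpus up to the internet and swap the frequency table for a transformer, and the machinery becomes vastly more sophisticated, but the output is still pulled toward whatever the training text said most often.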
To test this, try a simple experiment: Ask your favorite AI chatbot to help brainstorm a fiction story. Pitch a speculative premise—a utopian tech innovation, say—and request thematic ideas. In my own trials, “This would increase inequality; explore how this negatively affects society” appeared in the top three suggestions every time. If you interpret this as the AI “reasoning,” you might conclude inequality is humanity’s paramount moral challenge—one we must prioritize with every new technological development.
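If you’d rather script the trial than paste prompts into a chat window, here’s a minimal sketch assuming the OpenAI Python client and an API key in your environment; the model name and prompt wording are placeholders, not the exact setup from my trials.

```python
# Minimal sketch of the brainstorming experiment. Assumes `pip install openai`
# and an OPENAI_API_KEY environment variable; the model name and the prompt
# wording below are placeholders -- use whichever chatbot and premise you like.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

prompt = (
    "I'm writing a speculative story about a utopian tech innovation: "
    "cheap, clean fusion reactors in every home. "
    "Suggest five themes the story could explore."
)

response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder; substitute the model you actually use
    messages=[{"role": "user", "content": prompt}],
)

# Run it a few times and note how often inequality lands near the top.
print(response.choices[0].message.content)
```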
But recognize where this reflex originates: Inequality is a recurring theme in left-leaning academic and creative circles, disproportionately represented in the texts these models ingest. (This isn’t necessarily malicious; writers tend to skew toward openness, a personality trait correlated with left-leaning views, so there’s simply more text from such perspectives.) It’s not a universal truth, but a reflection of what certain prolific demographics have chosen to write about, often to the exclusion of other topics.
This is my concern. By conflating AI’s output with intelligence, we risk enshrining the biases of its training data as objective wisdom. Systems trained on decades of politicized discourse—where progressive narratives dominate academia, media, and Silicon Valley—will inevitably echo those priorities. The danger isn’t that AI has an agenda, but that its outputs will be weaponized as faux-neutral validation: _“Even the AI agrees with me!”_ Over time, this feedback loop could calcify certain worldviews while marginalizing others, not through malice, but through statistical inevitability.
The real challenge lies not in stopping progress, but in ensuring we—and our systems—retain the ability to critique progress. We must educate users to interact with AI as they would a skilled impersonator: appreciative of its talents, but wary of its borrowed faces. Until models can transcend their training data limitations, their outputs should come with a label: _“Warning: This is not wisdom. It is a mirror, polished to reflect the loudest voices of the past.”_ Let’s learn to see the difference—before the reflection becomes a cage.