Ethereum is rising, but Vitalik seems more concerned about the threat of super AI.

Written by: Vitalik Buterin

Translated by: Luffy, Foresight News

In April of this year, Daniel Kokotajlo, Scott Alexander, and others released the report AI 2027, depicting 'our best guess of what the impact of superhuman AI will look like over the next five years.' They predict that superhuman AI will arrive by 2027 and that the future of all human civilization will hinge on how AI development turns out: by 2030 we will either get a utopia (from the United States' perspective) or total annihilation (from all of humanity's perspective).

In the months that followed, a large volume of responses emerged, differing over how plausible this scenario is. Most of the critical responses focused on the timeline being too fast: will AI development really keep accelerating, or even intensify, as Kokotajlo and others claim? This debate has run in the AI field for years, and many doubt that superhuman AI will arrive so quickly. In recent years, the length of tasks AI can complete autonomously has roughly doubled every 7 months. If this trend continues, it will take until the mid-2030s for AI to autonomously complete tasks equivalent to an entire human career. That is still rapid progress, but it lands far later than 2027.
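To make the arithmetic behind this extrapolation concrete, here is a minimal sketch in Python. The starting task horizon and the length of a 'career' are assumptions chosen purely for illustration, not figures from the report or from the trend data.

```python
import math

# Illustrative extrapolation of the "task horizon doubles every ~7 months" trend.
# The current horizon (~1 hour of autonomous work) and the length of a "career"
# (~80,000 working hours) are assumed placeholders for illustration only.
current_horizon_hours = 1.0     # assumed autonomous task horizon today
career_hours = 40 * 50 * 40     # ~40 years x 50 weeks x 40 hours = 80,000 hours
doubling_months = 7             # doubling time cited in the text

doublings_needed = math.log2(career_hours / current_horizon_hours)
years_needed = doublings_needed * doubling_months / 12
print(f"{doublings_needed:.1f} doublings, roughly {years_needed:.1f} more years")
# ~16.3 doublings, roughly 9.5 years: the mid-2030s rather than 2027
```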

Those who hold a longer-timeline view tend to believe that 'interpolation/pattern matching' (what current large language models do) is fundamentally different from 'extrapolation/genuinely original thinking' (which, so far, only humans can do). Automating the latter may require technologies we have not yet mastered, or even know how to begin building. Perhaps we are simply repeating the mistake made during the mass adoption of calculators: wrongly assuming that because we have rapidly automated one important category of cognition, everything else will quickly follow.

This article will not weigh in directly on the timeline debate, nor will it touch the (very important) debate over whether superintelligent AI is inherently dangerous. That said, I personally believe the timeline will be longer than 2027, and the longer the timeline, the more persuasive the arguments in this article become. Overall, this article presents a criticism from a different angle:

The AI 2027 scenario rests on an implicit assumption: the capabilities of the leading AI ('Agent-5' and the subsequent 'Consensus-1') will rapidly grow to god-like economic and destructive power, while everyone else's (economic and defensive) capabilities basically stagnate. This contradicts the scenario's own claim that 'even in the pessimistic world, by 2029 we can hope to cure cancer, delay aging, and even achieve mind uploading.'

Some of the countermeasures I describe in this article may strike readers as technically feasible but unrealistic to deploy in the real world on a short timeline. In most cases, I agree. However, the AI 2027 scenario is not premised on today's real world: it assumes that within 4 years (or whatever timeline could plausibly end in doom), technology will advance enough to give humanity capabilities far beyond what we have now. So let us explore: what happens if not just one side, but both sides, have AI superpowers?

A biological apocalypse is far less simple than the scenario describes.

Let us zoom in on the 'race' scenario (the one in which everyone dies because the United States is too obsessed with defeating China to care about humanity's safety). Here is the passage in which everyone dies:

'For about three months, Consensus-1 expanded outward around humanity, transforming grasslands and ice fields into factories and solar panels. Eventually, it deemed the remaining humans too much of a nuisance: in mid-2030, the AI released more than a dozen quietly spreading biological weapons in major cities, let them silently infect nearly everyone, and then triggered their lethal effects with a chemical spray. Most people died within hours; the few survivors (such as preppers in bunkers or sailors on submarines) were mopped up by drones. Robots scanned the victims' brains and stored copies in memory for future research or revival.'

Let's analyze this scenario. Even today, there are technologies in development that make such a 'clean victory' for the AI much less realistic:

  • Air filtration, ventilation systems, and ultraviolet lights can significantly reduce the transmission rate of airborne diseases;

  • Real-time passive detection in two forms: passively detecting infections in people and notifying them within hours, and rapidly detecting unknown new viral sequences in the environment;

  • Multiple methods of enhancing and priming the immune system that are more effective, safer, and more universal than the COVID-19 vaccines, and easy to produce locally, allowing the body to resist both natural and engineered epidemics. Humans evolved in an environment where the global population was only 8 million and we spent most of our time outdoors, so intuitively there should be easy wins in adapting to today's higher-threat world.

Taken together, these methods might reduce the basic reproduction number (R0) of airborne diseases by 10-20x (for example: better air filtration cuts transmission 4x, immediate isolation of infected people cuts it 3x, and simple enhancement of respiratory immunity cuts it 1.5x), or even more. That would be enough to stop all currently existing airborne diseases (including measles) from spreading, and this number is still far from the theoretical optimum.
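As a minimal sketch of that arithmetic: the three example factors come from the sentence above, treating them as independent multipliers is a simplifying assumption, and the example R0 is a rough measles-like placeholder.

```python
# Sketch of how the example reductions compound multiplicatively (a simplification
# that assumes the interventions act independently).
filtration = 4.0   # better air filtration
isolation = 3.0    # immediate isolation of detected infections
immunity = 1.5     # modest enhancement of respiratory immunity

total_reduction = filtration * isolation * immunity   # = 18x, inside the 10-20x range
example_r0 = 15.0                                     # measles-like; measles R0 is roughly 12-18
effective_r0 = example_r0 / total_reduction
print(f"Combined reduction: {total_reduction:.0f}x, effective R0: {effective_r0:.2f}")
# An effective R0 below 1 means even the most contagious known airborne disease dies out.
```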

If real-time viral sequencing for early detection were widely deployed, the idea that 'a silently spreading biological weapon could infect the global population without triggering any alarms' becomes highly questionable. Note that this would catch even advanced approaches such as releasing multiple pandemic agents, plus chemicals, that only become dangerous in combination.

Do not forget that we are discussing the assumptions of AI 2027: by 2030, nanobots and Dyson spheres are listed as 'emerging technologies.' This implies huge efficiency gains, which makes widespread deployment of the countermeasures above far more plausible. Today, in 2025, humans act slowly and inertia is strong, and a large share of government services still runs on paper. If the world's most powerful AI can turn forests and fields into factories and solar farms by 2030, then the world's second most powerful AI can install plenty of sensors, lamps, and filters in our buildings by 2030.

But we might as well follow the assumptions of AI 2027 further and enter a purely sci-fi scenario:

  • Microscopic air filtration within the body (nose, mouth, lungs);

  • An automated pipeline from discovering a new pathogen to tuning the immune system against it, applied immediately;

  • If 'mind uploading' is feasible, one only needs to replace the entire body with a Tesla Optimus or Unitree robot;

  • Various new manufacturing technologies (which are likely to be super-optimized in the robotic economy) will be able to produce far more protective equipment locally than currently, without relying on global supply chains.

In a world where cancer and aging are cured by January 2029 and technological progress keeps accelerating, it is genuinely hard to believe that by mid-2030 we will not have wearable devices that can bio-print and inject substances in real time to protect the body from arbitrary infections (and toxins).

The biological-defense arguments above do not cover 'mirror life' or 'mosquito-sized killer drones' (which the AI 2027 scenario predicts will start appearing in 2029). However, these means cannot deliver the sudden 'clean victory' described in AI 2027, and intuitively, symmetric defenses against them are much easier.

Thus, biological weapons are actually unlikely to destroy humanity completely in the way the AI 2027 scenario describes. Of course, all the outcomes I have described are far from a 'clean victory' for humanity. Whatever we do (except perhaps 'uploading consciousness into robots'), all-out AI biological warfare would still be extremely dangerous. However, meeting the bar of a 'clean victory for humans' is not necessary: as long as an attack has a high probability of partial failure, that is enough to deter an AI that already holds a dominant position in the world from attempting any attack at all. And naturally, the longer AI development timelines are, the more likely it is that such defenses can actually be in place.

What about combining biological weapons with other means of attack?

For the countermeasures above to succeed, three prerequisites must hold:

  • Physical security in the world (including biosecurity and anti-drone security) is managed by local authorities (human or AI) that are not all puppets of Consensus-1 (the name of the AI that ends up controlling the world and destroying humanity in the AI 2027 scenario);

  • Consensus-1 cannot hack into the defense systems of other countries (or cities, or other secure zones) and immediately disable them;

  • Consensus-1 has not captured the global information sphere to the point that no one is willing even to attempt self-defense.

Intuitively, premise (1) could go to either of two extremes. Today, some police forces are highly centralized with powerful national command systems, while others are localized. If physical security must rapidly transform to meet the needs of the AI era, the landscape will be reset entirely, and the new outcomes will depend on choices made over the next few years. Governments could get lazy and all rely on Palantir, or they could actively choose solutions that combine local development with open-source technology. Here, I believe we simply need to make the right choice.

Many pessimistic discussions of these topics assume that (2) and (3) are lost causes. So let's examine these two points in detail.

The cybersecurity apocalypse is also far from upon us.

Both the public and professionals generally believe that true cybersecurity is impossible: the best we can do is patch vulnerabilities quickly after they are found and deter cyberattackers by stockpiling vulnerabilities of our own. Perhaps the best we could hope for is a Battlestar Galactica-style situation, in which almost all human ships are simultaneously paralyzed by a Cylon cyberattack and the only surviving ship is spared because it used no networked technology. I do not share this view. On the contrary, I believe the 'endgame' of cybersecurity favors defenders, and that under the rapid technological development assumed by AI 2027 we can reach that endgame.

One way to see this is to use a technique favored by AI researchers: extrapolating trends. A GPT Deep Research survey suggests that, assuming top security techniques are used, the vulnerability rate per thousand lines of code has fallen steadily over time.
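As a purely illustrative sketch of what this kind of trend extrapolation involves (the data points below are hypothetical placeholders, not the survey's figures):

```python
import math

# Hypothetical placeholder data: bugs per 1,000 lines of code by year. These are NOT
# the survey's numbers; they only illustrate the shape of the extrapolation.
observations = {1990: 10.0, 2000: 3.0, 2010: 1.0, 2020: 0.3}

# Fit rate ~ exp(a + b * year) by least squares on the logarithm of the rate.
years = list(observations)
logs = [math.log(v) for v in observations.values()]
mean_year = sum(years) / len(years)
mean_log = sum(logs) / len(logs)
b = sum((y - mean_year) * (l - mean_log) for y, l in zip(years, logs)) / \
    sum((y - mean_year) ** 2 for y in years)
a = mean_log - b * mean_year

for year in (2030, 2040):
    print(year, round(math.exp(a + b * year), 3))  # extrapolated bugs per 1,000 lines
```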

Additionally, we have seen significant progress, in both development and consumer adoption, of sandboxing and other techniques for isolating and minimizing trusted codebases. In the short term, a superintelligent vulnerability-discovery tool available only to the attacker could find plenty of vulnerabilities. But if highly intelligent agents for discovering vulnerabilities or formally verifying code are openly available, the natural final equilibrium is that software developers find every vulnerability through their continuous-integration pipelines before releasing code.

I can see two compelling reasons why, even in this world, vulnerabilities cannot be completely eradicated:

  • Flaws that stem from the complexity of human intention itself, where the main difficulty is building a sufficiently accurate model of the intent rather than of the code;

  • For non-security-critical components, we may continue the existing trend in consumer technology: writing more code to handle more tasks (or cutting development budgets) rather than doing the same number of tasks at ever-rising security standards.

However, neither category applies to questions like 'can an attacker gain root access to the systems that keep us alive,' which is precisely the core of what we are discussing.

I admit that my view is more optimistic than the current mainstream among smart people in cybersecurity. But even if you disagree with me in the context of today's world, it is worth remembering that the AI 2027 scenario presupposes superintelligence. At the very least, if '100 million superintelligent copies thinking at 2,400 times human speed' cannot get us code free of such flaws, then we should definitely reassess whether superintelligence is as powerful as the authors imagine.

To some extent, we need not only to raise the bar for software security significantly but also to raise it for hardware security. IRIS is one current effort to improve hardware verifiability. We can use IRIS as a starting point or create better technologies. In practice, this may involve a 'correct-by-construction' approach, in which the manufacturing process for key hardware components deliberately includes specific verification steps. These are all tasks that AI automation would greatly simplify.

The super-persuasion apocalypse is also far from upon us.

As mentioned earlier, another scenario in which greatly enhanced defensive capabilities could still prove futile is this: AI persuades enough people that there is no need to defend against the threat of superintelligent AI, and that anyone trying to build defenses for themselves or their community is a criminal.

I have always believed that two things can enhance our ability to resist super persuasion:

  • A less monolithic information ecosystem. It can be said that we have gradually entered a post-Twitter era, and the internet is becoming more fragmented. This is a good thing (even if the fragmentation process is chaotic), as we generally need more information multipolarity.

  • Defensive AI. Individuals need to be equipped with locally running AI that is clearly loyal to them, to counterbalance the dark patterns and threats they encounter online. There are already scattered pilots of such ideas (like Taiwan's 'Message Checker' app, which scans messages locally on the phone), and there are natural markets in which to test them further (such as protecting people from scams), but much more effort is needed here; a minimal sketch of the idea follows below.

Screenshot, from top to bottom: URL checks, cryptocurrency address checks, rumor checks. Applications like these could become much more personalized, user-driven, and powerful.
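To make the idea of a locally running checker concrete, here is a deliberately simple, rule-based sketch. A real defensive AI would rely on locally running models rather than fixed rules, and every function name, pattern, and blocklist entry below is a hypothetical placeholder rather than the implementation of any existing app.

```python
import re

# Hypothetical local blocklist and patterns; illustrative only.
SUSPICIOUS_DOMAINS = {"examp1e-wallet.com", "free-airdrop.example"}
URL_RE = re.compile(r"https?://([^/\s]+)")
BTC_ADDR_RE = re.compile(r"\b(?:bc1|[13])[a-zA-HJ-NP-Z0-9]{25,39}\b")

def check_message(text: str) -> list[str]:
    """Return human-readable warnings for a single incoming message, computed locally."""
    warnings = []
    for domain in URL_RE.findall(text):
        if domain.lower() in SUSPICIOUS_DOMAINS:
            warnings.append(f"URL check: {domain} is on the local blocklist")
    if BTC_ADDR_RE.search(text):
        warnings.append("Address check: message asks you to send funds to a crypto address")
    return warnings

print(check_message("Claim your prize at https://free-airdrop.example now!"))
```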

That way, the fight is no longer a superintelligent persuader against you, but a superintelligent persuader against you plus a slightly weaker, yet still superintelligent, analyzer acting on your behalf.

This is what should happen. But will it actually happen? Achieving widespread adoption of information-defense technologies within the short window assumed by the AI 2027 scenario is a very difficult goal. But arguably, more modest milestones would suffice. If collective decision-making is what matters most, and, as in the AI 2027 scenario, all the important events happen within a single election cycle, then strictly speaking what matters is that the direct decision-makers (politicians, civil servants, programmers at a few companies, and other actors) have access to good information-defense technologies. That is relatively easier to achieve in the short term, and in my experience many such people are already used to consulting multiple AIs to assist their decisions.

Implications

The world of AI 2027 takes for granted that a superintelligent AI can easily and quickly eliminate the rest of humanity, so all we can do is make sure the leading AI is benevolent. In my view, reality is much more complicated: whether the leading AI is powerful enough to easily eliminate the rest of humanity (and other AIs) remains highly contestable, and there are actions we can take to influence the outcome.

If these arguments are correct, their implications for today's policy are sometimes similar to, and sometimes different from, the mainstream AI safety consensus:

Delaying the development of superintelligent AI is still a good thing. Superintelligent AI arriving in 10 years is safer than in 3 years, and arriving in 30 years is safer still. Giving human civilization more time to prepare is beneficial.

How to achieve this is a hard problem. I think it was broadly a good thing that the proposed '10-year ban on state-level AI regulation' in the U.S. was rejected, but especially after the failure of early proposals like SB-1047, the path forward has become less clear. I believe the least intrusive and most robust way to delay high-risk AI development likely involves some kind of treaty regulating the most advanced hardware. Many of the hardware cybersecurity technologies needed for effective defense would also help verify an international hardware treaty, so there is even a synergy here.

That said, it is worth noting that I consider the primary source of risk to be military-adjacent actors, who will push hard to exempt themselves from such treaties; this must not be allowed, and if they do end up exempted, the resulting military-driven AI development may well increase risk.

Coordination to make AI more likely to do good and less likely to do harm is still beneficial. The main exception (as it always has been) is coordination that eventually slides into enhancing capabilities.

Regulation to make AI labs more transparent is still beneficial. Incentivizing AI labs to behave properly reduces risk, and transparency is a good way to achieve that.

The mentality that 'open source is harmful' becomes riskier. Many oppose open-weight AI on the grounds that defense is unrealistic and that the only happy path is for good people with good AI to reach superintelligence before anyone less benevolent gains access to any highly dangerous capabilities. But the argument in this article paints a different picture: defense is unrealistic precisely when one actor gets far ahead while no one else keeps pace. Technology diffusing so as to maintain a balance of power becomes important. At the same time, I would never go so far as to claim that accelerating the growth of frontier AI capabilities is good merely because it is done open source.

The 'we must defeat China' mentality in American labs becomes riskier for similar reasons. If hegemony is not a safety buffer but a source of risk, this further rebuts the (unfortunately all too common) notion that 'well-intentioned people should join leading AI labs to help them win faster.'

'Public AI' initiatives deserve more support, both to ensure that AI capabilities are widely distributed and to ensure that infrastructure actors actually have the tools to quickly apply new AI capabilities in some of the ways described in this article.

Defense technologies should look more like 'arming the sheep' and less like 'hunting down all the wolves.' Discussions of the vulnerable world hypothesis often assume the only solution is for a hegemon to maintain global surveillance and prevent any potential threat from emerging. But in a non-hegemonic world this is not feasible, and top-down defense mechanisms can easily be subverted by a powerful AI and turned into instruments of attack. Therefore, a greater share of the defensive burden needs to be carried by doing the hard work of reducing the world's vulnerability.

The arguments above are speculative, and no one should act as if they were near-certainties. But the AI 2027 story is also speculative, and we should likewise avoid acting on the assumption that its specific details are near-certain.

I am particularly wary of one common assumption: that the only way forward is to build an AI hegemon and ensure that it is 'aligned' and 'wins the race.' In my view, this strategy is likely to reduce our safety, especially where hegemony is deeply tied to military applications, which would greatly undermine the effectiveness of many alignment strategies. Once a hegemonic AI goes astray, humanity will have lost all checks and balances.

In the AI 2027 scenario, humanity's success hinges on the United States choosing the path of safety rather than destruction at the critical moment: voluntarily slowing AI progress and making sure Agent-5's internal thought processes are interpretable to humans. Even then, success is not guaranteed, and it is unclear how humanity would step back from the permanent precipice of having its survival depend on a single superintelligent mind. Regardless of how AI develops over the next 5-10 years, acknowledging that 'reducing the world's vulnerability is feasible' and putting far more effort into achieving it with humanity's latest technologies is a path worth pursuing.

Special thanks to the Balvi volunteers for their feedback and review.