# Existential risk in Gothenburg

This fall I have been chairing a programme at the Gothenburg Centre for Advanced Studies on existential risk, thanks to Olle Häggström. Visiting researchers come and participate in seminars and discussions on existential risk, ranging from the very theoretical (how do future people count?) to the very applied (should we put existential risk on the school curriculum? How?). I gave a Petrov Day talk about how to calculate risks of nuclear war and how observer selection might mess this up, beside seminars on everything from the Fermi paradox to differential technology development. In short, I have been very busy.

To open the programme we had a workshop on existential risk September 7-8 2017. Now we have the videos up of our talks:

I think so far a few key realisations and themes have in my opinion been

(1) the pronatalist/maximiser assumptions underlying some of the motivations for existential risk reduction were challenged; there is an interesting issue of how “modest futures” rather than “grand futures” play a role and non-maximising goals imply existential risk reduction.

(2) the importance of figuring out how “suffering risks”, potential states of astronomical amounts of suffering, relate to existential risks. Allocating effort between them rationally touches on some profound problems.

(3) The under-determination problem of inferring human values from observed behaviour (a talk by Stuart) resonated with the under-determination of AI goals in Olle’s critique of the convergent instrumental goal thesis and other discussions. Basically, complex agent-like systems might be harder to succinctly describe than we often think.

(4) Stability of complex adaptive systems – brains, economies, trajectories of human history, AI. Why are some systems so resilient in a reliable way, and can we copy it?

(5) The importance of estimating force projection abilities in space and as the limits of physics are approached. I am starting to suspect there is a deep physics answer to the question of attacker advantage, and a trade-off between information and energy in attacks.

We will produce an edited journal issue with papers inspired by our programme, stay tuned. Avancez!

# Dampening theoretical noise by arguing backwards

Science has the adorable headline Tiny black holes could trigger collapse of universe—except that they don’t, dealing with the paper Gravity and the stability of the Higgs vacuum by Burda, Gregory & Moss. The paper argues that quantum black holes would act as seeds for vacuum decay, making metastable Higgs vacua unstable. The point of the paper is that some new and interesting mechanism prevents this from happening. The more obvious explanation that we are already in the stable true vacuum seems to be problematic since apparently we should expect a far stronger Higgs field there. Plenty of theoretical issues are of course going on about the correctness and consistency of the assumptions in the paper.

# Don’t mention the war

What I found interesting is the treatment of existential risk in the Science story and how the involved physicists respond to it:

Moss acknowledges that the paper could be taken the wrong way: “I’m sort of afraid that I’m going to have [prominent theorist] John Ellis calling me up and accusing me of scaremongering.

Ellis is indeed grumbling a bit:

As for the presentation of the argument in the new paper, Ellis says he has some misgivings that it will whip up unfounded fears about the safety of the LHC once again. For example, the preprint of the paper doesn’t mention that cosmic-ray data essentially prove that the LHC cannot trigger the collapse of the vacuum—”because we [physicists] all knew that,” Moss says. The final version mentions it on the fourth of five pages. Still, Ellis, who served on a panel to examine the LHC’s safety, says he doesn’t think it’s possible to stop theorists from presenting such argument in tendentious ways. “I’m not going to lose sleep over it,” Ellis says. “If someone asks me, I’m going to say it’s so much theoretical noise.” Which may not be the most reassuring answer, either.

There is a problem here in that physicists are so fed up with popular worries about accelerator-caused disasters – worries that are often second-hand scaremongering that takes time and effort to counter (with marginal effects) – that they downplay or want to avoid talking about things that could feed the worries. Yet avoiding topics is rarely the best idea for finding the truth or looking trustworthy. And given the huge importance of existential risk even when it is unlikely, it is probably better to try to tackle it head-on than skirt around it.

# Theoretical noise

“Theoretical noise” is an interesting concept. Theoretical physics is full of papers considering all sorts of bizarre possibilities, some of which imply existential risks from accelerators. In our paper Probing the Improbable we argue that attempts to bound accelerator risks have problems due to the non-zero probability of errors overshadowing the probability they are trying to bound: an argument that there is zero risk is actually just achieving the claim that there is about 99% chance of zero risk, and 1% chance of some risk. But these risk arguments were assumed to be based on fairly solid physics. Their errors would be slips in logic, modelling or calculation rather than being based on an entirely wrong theory. Theoretical papers are often making up new theories, and their empirical support can be very weak.

An argument that there is some existential risk with probability P actually means that, if the probability of the argument is right is Q, there is risk with probability PQ plus whatever risk there is if the argument is wrong (which we can usually assume to be close to what we would have thought if there was no argument in the first place) times 1-Q. Since the vast majority of theoretical physics papers never go anywhere, we can safely assume Q to be rather small, perhaps around 1%. So a paper arguing for P=100% isn’t evidence the sky is falling, merely that we ought to look more closely to a potentially nasty possibility that is likely to turn into a dud. Most alarms are false alarms.

However, it is easier to generate theoretical noise than resolve it. I have spent some time working on a new accelerator risk scenario, “dark fire”, trying to bound the likelihood that it is real and threatening. Doing that well turned out to be surprisingly hard: the scenario was far more slippery than expected, so ruling it out completely turned out to be very hard (don’t worry, I think we amassed enough arguments to show the risk to be pretty small). This is of course the main reason for the annoyance of physicists: it is easy for anyone to claim there is risk, but then it is up to the physics community to do the laborious work of showing that the risk is small.

The vacuum decay issue has likely been dealt with by the Tegmark and Bostrom paper: were the decay probability high we should expect to be early observers, but we are fairly late ones. Hence the risk per year in our light-cone is small (less than one in a billion). Whatever is going on with the Higgs vacuum, we can likely trust it… if we trust that paper. Again we have to deal with the problem of an argument based on applying anthropic probability (a contentious subject where intelligent experts disagree on fundamentals) to models of planet formation (based on elaborate astrophysical models and observations): it is reassuring, but it does not reassure as strongly as we might like. It would be good to have a few backup papers giving different arguments bounding the risk.

# Backward theoretical noise dampening?

The lovely property of the Tegmark and Bostrom paper is that it covers a lot of different risks with the same method. In a way it handles a sizeable subset of the theoretical noise at the same time. We need more arguments like this. The cosmic ray argument is another good example: it is agnostic on what kind of planet-destroying risk is perhaps unleashed from energetic particle interactions, but given the past number of interactions we can be fairly secure (assuming we patch its holes).

One shared property of these broad arguments is that they tend to start with the risky outcome and argue backwards: if something were to destroy the world, what properties does it have to have? Are those properties possible or likely given our observations? Forward arguments (if X happens, then Y will happen, leading to disaster Z) tend to be narrow, and depend on our model of the detailed physics involved.

While the probability that a forward argument is correct might be higher than the more general backward arguments, it only reduces our concern for one risk rather than an entire group. An argument about why quantum black holes cannot be formed in an accelerator is limited to that possibility, and will not tell us anything about risks from Q-balls. So a backwards argument covering 10 possible risks but just being half as likely to be true as a forward argument covering one risk is going to be more effective in reducing our posterior risk estimate and dampening theoretical noise.

In a world where we had endless intellectual resources we would of course find the best possible arguments to estimate risks (and then for completeness and robustness the second best argument, the third, … and so on). We would likely use very sharp forward arguments. But in a world where expert time is at a premium and theoretical noise high we can do better by looking at weaker backwards arguments covering many risks at once. Their individual epistemic weakness can be handled by making independent but overlapping arguments, still saving effort if they cover many risk cases.

Backwards arguments also have another nice property: they help dealing with the “ultraviolet cut-off problem“. There is an infinite number of possible risks, most of which are exceedingly bizarre and a priori unlikely. But since there are so many of them, it seems we ought to spend an inordinate effort on the crazy ones, unless we find a principled way of drawing the line. Starting from a form of disaster and working backwards on probability bounds neatly circumvents this: production of planet-eating dragons is among the things covered by the cosmic ray argument.

Risk engineers will of course recognize this approach: it is basically a form of fault tree analysis, where we reason about bounds on the probability of a fault. The forward approach is more akin to failure mode and effects analysis, where we try to see what can go wrong and how likely it is. While fault trees cannot cover every possible initiating problem (all those bizarre risks) they are good for understanding the overall reliability of the system, or at least the part being modelled.

Deductive backwards arguments may be the best theoretical noise reduction method.

# The end of the worlds

George Dvorsky has a piece on Io9 about ways we could wreck the solar system, where he cites me in a few places. This is mostly for fun, but I think it links to an important existential risk issue: what conceivable threats have big enough spatial reach to threaten a interplanetary or even star-faring civilization?

This matters, since most existential risks we worry about today (like nuclear war, bioweapons, global ecological/societal crashes) only affect one planet. But if existential risk is the answer to the Fermi question, then the peril has to strike reliably. If it is one of the local ones it has to strike early: a multi-planet civilization is largely immune to the local risks. It will not just be distributed, but it will almost by necessity have fairly self-sufficient habitats that could act as seeds for a new civilization if they survive. Since it is entirely conceivable that we could have invented rockets and spaceflight long before discovering anything odd about uranium or how genetics work it seems unlikely that any of these local risks are “it”. That means that the risks have to be spatially bigger (or, of course, that xrisk is not the answer to the Fermi question).

Of the risks mentioned by George physics disasters are intriguing, since they might irradiate solar systems efficiently. But the reliability of them being triggered before interstellar spread seems problematic. Stellar engineering, stellification and orbit manipulation may be issues, but they hardly happen early – lots of time to escape. Warp drives and wormholes are also likely late activities, and do not seem to be reliable as extinctors. These are all still relatively localized: while able to irradiate a largish volume, they are not fine-tuned to cause damage and does not follow fleeing people. Dangers from self-replicating or self-improving machines seems to be a plausible, spatially unbound risk that could pursue (but also problematic for the Fermi question since now the machines are the aliens). Attracting malevolent aliens may actually be a relevant risk: assuming von Neumann probes one can set up global warning systems or “police probes” that maintain whatever rules the original programmers desire, and it is not too hard to imagine ruthless or uncaring systems that could enforce the great silence. Since early civilizations have the chance to spread to enormous volumes given a certain level of technology, this might matter more than one might a priori believe.

So, in the end, it seems that anything releasing a dangerous energy effect will only affect a fixed volume. If it has energy $E$ and one can survive it below a deposited energy $e$, if it just radiates in all directions the safe range is $r = \sqrt{E/4 \pi e} \propto \sqrt{E}$ – one needs to get into supernova ranges to sterilize interstellar volumes. If it is directional the range goes up, but smaller volumes are affected: if a fraction $f$ of the sky is affected, the range increases as $\propto \sqrt{1/f}$ but the total volume affected scales as $\propto f\sqrt{1/f}=\sqrt{f}$.

Self-sustaining effects are worse, but they need to cross space: if their space range is smaller than interplanetary distances they may destroy a planet but not anything more. For example, a black hole merely absorbs a planet or star (releasing a nasty energy blast) but does not continue sucking up stuff. Vacuum decay on the other hand has indefinite range in space and moves at lightspeed. Accidental self-replication is unlikely to be spaceworthy unless is starts among space-moving machinery; here deliberate design is a more serious problem.

The speed of threat spread also matters. If it is fast enough no escape is possible. However, many of the replicating threats will have sublight speed and could hence be escaped by sufficiently paranoid aliens. The issue here is if lightweight and hence faster replicators can always outrun larger aliens; given the accelerating expansion of the universe it might be possible to outrun them by being early enough, but our calculations do suggest that the margins look very slim.

The more information you have about a target, the better you can in general harm it. If you have no information, merely randomizing it with enough energy/entropy is the only option (and if you have no information of where it is, you need to radiate in all directions). As you learn more, you can focus resources to make more harm per unit expended, up to the extreme limits of solving the optimization problem of finding the informational/environmental inputs that cause desired harm (=hacking). This suggests that mindless threats will nearly always have shorter range and smaller harms than threats designed by (or constituted by) intelligent minds.

In the end, the most likely type of actual civilization-ending threat for an interplanetary civilization looks like it needs to be self-replicating/self-sustaining, able to spread through space, and have at least a tropism towards escaping entities. The smarter, the more effective it can be. This includes both nasty AI and replicators, but also predecessor civilizations that have infrastructure in place. Civilizations cannot be expected to reliably do foolish things with planetary orbits or risky physics.

# Galactic duck and cover

How much does gamma ray bursts (GRBs) produce a “galactic habitable zone”? Recently the preprint “On the role of GRBs on life extinction in the Universe” by Piran and Jimenez has made the rounds, arguing that we are near (in fact, inside) the inner edge of the zone due to plentiful GRBs causing mass extinctions too often for intelligence to arise.

This is somewhat similar to James Annis and Milan Cirkovic’s phase transition argument, where a declining rate of supernovae and GRBs causes global temporal synchronization of the emergence of intelligence. However, that argument has a problem: energetic explosions are random, and the difference in extinctions between lucky and unlucky parts of the galaxy can be large – intelligence might well erupt in a lucky corner long before the rest of the galaxy is ready.

I suspect the same problem is true for the Piran and Jimenez paper, but spatially. GRBs are believed to be highly directional, with beams typically a few degrees across. If we have random GRBs with narrow beams, how much of the center of the galaxy do they miss?

I made a simple model of the galaxy, with a thin disk, thick disk and bar population. The model used cubical cells 250 parsec long; somewhat crude, but likely good enough. Sampling random points based on star density, I generated GRBs. Based on Frail et al. 2001 I gave them lognormal energies and power-law distributed jet angles, directed randomly. Like Piran and Jimenez I assumed that if the fluence was above 100 kJ/m^2 it would be extinction level. The rate of GRBs in the Milky Way is uncertain, but a high estimate seems to be one every 100,000 years. Running 1000 GRBs would hence correspond to 100 million years.

If we look at the galactic plane we find that the variability close to the galactic centre is big: there are plenty of lucky regions with many stars.

When integrating around the entire galaxy to get a measure of risk at different radii and altitudes shows a rather messy structure:

One interesting finding is that the most dangerous place may be above the galactic plane along the axis: while few GRBs happen there, those in the disk and bar can reach there (the chance of being inside a double cone is independent of distance to the center, but along the axis one is within reach for the maximum number of GRBs).

Integrating the density of stars that are not affected as a function of radius and altitude shows that there is a mild galactic habitable zone hole within 4 kpc. That we are close to the peak is neat, but there is a significant number of stars very close to the center.

This is of course not a professional model; it is a slapdash Matlab script done in an evening to respond to some online debate. But I think it shows that directionality may matter a lot by increasing the variance of star fates. Nearby systems may be irradiated very differently, and merely averaging them will miss this.

If I understood Piran and Jimenez right they do not use directionality; instead they employ a scaled rate of observed GRBs, so they do not have to deal with the iffy issue of jet widths. This might be sound, but I suspect one should check the spatial statistics: correlations are tricky things (and were GRB axes even mildly aligned with the galactic axis the risk reduction would be huge). Another way of getting closer to their result is of course to bump up the number of GRBs: with enough, the centre of the galaxy will naturally be inhospitable. I did not do the same careful modelling of the link between metallicity and GRBs, nor the different sizes.

In any case, I suspect that GRBs are weak constraints on where life can persist and too erratic to act as a good answer to the Fermi question – even a mass extinction is forgotten within 10 million years.

# Happy Petrov Day!

On Practical Ethics I blog about Petrov Day: the anniversary of an avoided nuclear cataclysm.

The lovely thing about this incident is that there is a person to focus on, making existential risk dramatically real. The LessWrong community has developed a ritual to commemorate the event and make our individual responsibility for reducing existential risk more vivid.

Averted disasters are hard to see, so we need more and bigger monuments to people who averted things.