Wired has an article about the CSER Existential Risk Conference in December 2016, rather flatteringly comparing us to superheroes. Plus a list of more or less likely risks we discussed. Calling them the “10 biggest threats” is perhaps exaggerating a fair bit: nobody is seriously worried about simulation shutdowns. But some of the others are worth working a lot more on.

## High-energy demons

I am cited as talking about existential risk from demon summoning. Since this is bound to be misunderstood, here is the full story:

As noted in the Wired list, we wrote a paper looking at the risk from the LHC, finding that there is a problem with analysing very unlikely (but high impact) risks: the probability of a mistake in the analysis overshadows the risk itself, making the analysis bad at bounding the risk. This can be handled by doing multiple independent risk bounds, which is a hassle, but it is the only (?) way to reliably conclude that things are safe.

I blogged a bit about the LHC issue before we wrote the paper, bringing up the problem of estimating probabilities for unprecedented experiments through the case of Taleb’s demon (which properly should be Taylor’s demon, but Stigler’s law of eponymy strikes again). That probably got me to have a demon association to the wider physics risk issues.

The issue of how to think about unprecedented risks without succumbing to precautionary paralysis is important: we cannot avoid doing new things, yet we should not be stupid about it. This is extra tricky when considering experiments that create things or conditions that are not found in nature.

## Not so serious?

A closely related issue is when it is reasonable to regard a proposed risk as non-serious. Predictions of risk from strangelets, black holes, vacuum decay and other “theoretical noise” caused by theoretical physics theories at least is triggered by some serious physics thinking, even if it is far out. Physicists have generally tended to ignore such risks, but when forced by anxious acceleratorphobes the arguments had to be nontrivial: the initial dismissal was not really well founded. Yet it seems totally reasonable to dismiss some risks. If somebody worries that the alien spacegods will take exception to the accelerator we generally look for a psychiatrist rather than take them seriously. Some theories have so low prior probability that it seems rational to ignore them.

But what is the proper de minimis boundary here? One crude way of estimating it is to say that risks of destroying the world with lower probability than one in 10 billion can safely be ignored – they correspond to a risk of less than one person in expectation. But we would not accept that for an individual chemistry experiment: if the chance of being blown up if someone did it was “less than 100%” but still far above some tiny number, they would presumably want to avoid risking their neck. And in the physics risk case the same risk is borne by every living human. Worse, by Bostrom’s astronomical waste argument, existential risks risks more than 1046 possible future lives. So maybe we should put the boundary at less than 10-46: any risk more likely must be investigated in detail. That will be a lot of work. Still, there are risks far below this level: the probability that all humans were to die from natural causes within a year is around 10-7.2e11, which is OK.

One can argue that the boundary does not really exist: Martin Peterson argues that setting it at some fixed low probability, that realisations of the risk cannot be ascertained, or that it is below natural risks do not truly work: the boundary will be vague.

## Demons lurking in the priors

Be as it may with the boundary, the real problem is that estimating prior probabilities is not always easy. They can vault over the vague boundary.

Hence my demon summoning example (from a blog post near Halloween I cannot find right now): what about the risk of somebody summoning a demon army? It might cause the end of the world. The theory “Demons are real and threatening” is not a hugely likely theory: atheists and modern Christians may assign it zero probability. But that breaks Cromwell’s rule: once you assign 0% to a probability no amount of evidence – including a demon army parading in front of you – will make you change your mind (or you are not applying probability theory correctly). The proper response is to assume some tiny probability $\epsilon$, conveniently below the boundary.

…except that there are a lot of old-fashioned believers who do think the theory “Demons are real and threatening” is a totally fine theory. Sure, most academic readers of this blog will not belong to this group and instead to the $\epsilon$ probability group. But knowing that there are people out there that think something different from your view should make you want to update your view in their direction a bit – after all, you could be wrong and they might know something you don’t. (Yes, they ought to move a bit in your direction too.) But now suppose you move 1% in the direction of the believers from your $\epsilon$ belief. You will now believe in the theory to $\epsilon + 1\% \approx 1\%$. That is, now you have a fairly good reason not to disregard the demon theory automatically. At least you should spend effort on checking it out. And once you are done with that you better start with the next crazy New Age theory, and the next conspiracy theory…

## Reverend Bayes doesn’t help the unbeliever (or believer)

One way out is to argue that the probability of believers being right is so low that it can be disregarded. If they have probability $\epsilon$ of being right, then the actual demon risk is of size $\epsilon$ and we can ignore it – updates due to the others do not move us. But that is a pretty bold statement about human beliefs about anything: humans can surely be wrong about things, but being that certain that a common belief is wrong seems to require better evidence.

The believer will doubtlessly claim seeing a lot of evidence for the divine, giving some big update $\Pr[belief|evidence]=\Pr[evidence|belief]\Pr[belief]/\Pr[evidence]$, but the non-believer will notice that the evidence is also pretty compatible with non-belief: $\frac{\Pr[evidence|belief]}{\Pr[evidence|nonbelief]}\approx 1$ – most believers seem to have strong priors for their belief that they then strengthen by selective evidence or interpretation without taking into account the more relevant ratio $\Pr[belief|evidence] / \Pr[nonbelief|evidence]$. And the believers counter that the same is true for the non-believers…

Insofar we are just messing around with our own evidence-free priors we should just assume that others might know something we don’t know (maybe even in a way that we do not even recognise epistemically) and update in their direction. Which again forces us to spend time investigating demon risk.

## OK, let’s give in…

Another way of reasoning is to say that maybe we should investigate all risks somebody can make us guess a non-negligible prior for. It is just that we should allocate our efforts proportional to our current probability guesstimates. Start with the big risks, and work our way down towards the crazier ones. This is a bit like the post about the best problems to work on: setting priorities is important, and we want to go for the ones where we chew off most uninvestigated risk.

If we work our way down the list this way it seems that demon risk will be analysed relatively early, but also dismissed quickly: within the religious framework it is not a likely existential risk in most religions. In reality few if any religious people hold the view that demon summoning is an existential risk, since they tend to think that the end of the world is a religious drama and hence not intended to be triggered by humans – only divine powers or fate gets to start it, not curious demonologists.

## That wasn’t too painful?

Have we defeated the demon summoning problem? Not quite. There is no reason for all those priors to sum to 1 – they are suggested by people with very different and even crazy views – and even if we normalise them we get a very long and heavy tail of weird small risks. We can easily use up any amount of effort on this, effort we might want to spend on doing other useful things like actually reducing real risks or doing fun particle physics.

There might be solutions to this issue by reasoning backwards: instead of looking at how X could cause Y that could cause Z that destroys the world we ask “If the world would be destroyed by Z, what would need to have happened to cause it?” Working backwards to Y, Y’, Y” and other possibilities covers a larger space than our initial chain from X. If we are successful we can now state what conditions are needed to get to dangerous Y-like states and how likely they are. This is a way of removing entire chunks of the risk landscape in efficient ways.

This is how I think we can actually handle these small, awkward and likely non-existent risks. We develop mental tools to efficiently get rid of lots of them in one fell sweep, leaving the stuff that needs to be investigated further. But doing this right… well, the devil lurks in the details. Especially the thicket of totally idiosyncratic risks that cannot be handled in a general way. Which is no reason not to push forward, armed with epsilons and Bayes’ rule.

That the unbeliever may have to update a bit in the believer direction may look like a win for the believers. But they, if they are rational, should do a small update into the unbeliever direction. The most important consequence is that now they need to consider existential risks due to non-supernatural causes like nuclear war, AI or particle physics. They would assign them a lower credence than the unbeliever, but as per the usual arguments for the super-importance of existential risk this still means they may have to spend effort on thinking about and mitigating these risks that they otherwise would have dismissed as something God would have prevented. This may be far more annoying to them than unbelievers having to think a bit about demonology.

Emlyn O’Regan makes some great points over at Google+, which I think are worth analyzing:

1. “Should you somehow incorporate the fact that the world has avoided destruction until now into your probabilities?”
2. “Ideas without a tech angle might be shelved by saying there is no reason to expect them to happen soon.” (since they depend on world properties that have remained unchanged.)
3. ” Ideas like demon summoning might be limited also by being shown to be likely to be the product of cognitive biases, rather than being coherent free-standing ideas about the universe.”

In the case of (1), observer selection effects can come into play. If there are no observers on a post demon-world (demons maybe don’t count) then we cannot expect to see instances of demon apocalypses in the past. This is why the cosmic ray argument for the safety of the LHC need to point to the survival of the Moon or other remote objects rather than the Earth to argue that being hit by cosmic rays over long periods prove that it is safe. Also, as noted by Emlyn, the Doomsday argument might imply that we should expect a relatively near-term end, given the length of our past: whether this matters or not depends a lot on how one handles observer selection theory.

In the case of (2), there might be development in summoning methods. Maybe medieval methods could not work, but modern computer-aided chaos magick is up to it. Or there could be rare “the stars are right” situations that made past disasters impossible. Still, if you understand the risk domain you may be able to show that the risk is constant and hence must have been low (or that we are otherwise living in a very unlikely world). Traditions that do not believe in a growth of esoteric knowledge would presumably accept that past failures are evidence of future inability.

(3) is an error theory: believers in the risk are believers not because of proper evidence but from faulty reasoning of some kind, so they are not our epistemic peers and we do not need to update in their direction. If somebody is trying to blow up a building with a bomb we call the police, but if they try to do it by cursing we may just watch with amusement: past evidence of the efficacy of magic at causing big effects is nonexistent. So we have one set of evidence-supported theories (physics) and another set lacking evidence (magic), and we make the judgement that people believing in magic are just deluded and can be ignored.

(Real practitioners may argue that there sure is evidence for magic, it is just that magic is subtle and might act through convenient coincidences that look like they could have happened naturally but occur too often or too meaningfully to be just chance. However, the skeptic will want to actually see some statistics for this, and in any case demon apocalypses look like they are way out of the league for this kind of coincidental magic).

Emlyn suggests that maybe we could scoop all the non-physics like human ideas due to brain architecture into one bundle, and assign them one epsilon of probability as a group. But now we have the problem of assigning an idea to this group or not: if we are a bit uncertain about whether it should have $\epsilon$ probability or a big one, then it will get at least some fraction of the big probability and be outside the group. We can only do this if we are really certain that we can assign ideas accurately, and looking at how many people psychoanalyse, sociologise or historicise topics in engineering and physics to “debunk” them without looking at actual empirical content, we should be wary of our own ability to do it.

So, in short, (1) and (2) do not reduce our credence in the risk enough to make it irrelevant unless we get a lot of extra information. (3) is decent at making us sceptical, but our own fallibility at judging cognitive bias and mistakes (which follows from claiming others are making mistakes!) makes error theories weaker than they look. Still, the really consistent lack of evidence of anything resembling the risk being real and that claims internal to the systems of ideas that accept the possibility imply that there should be smaller, non-existential, instances that should be observable (e.g. individual Fausts getting caught on camera visibly succeeding in summoning demons), and hence we can discount these systems strongly in favor of more boring but safe physics or hard-to-disprove but safe coincidental magic.

# Dampening theoretical noise by arguing backwards

Science has the adorable headline Tiny black holes could trigger collapse of universe—except that they don’t, dealing with the paper Gravity and the stability of the Higgs vacuum by Burda, Gregory & Moss. The paper argues that quantum black holes would act as seeds for vacuum decay, making metastable Higgs vacua unstable. The point of the paper is that some new and interesting mechanism prevents this from happening. The more obvious explanation that we are already in the stable true vacuum seems to be problematic since apparently we should expect a far stronger Higgs field there. Plenty of theoretical issues are of course going on about the correctness and consistency of the assumptions in the paper.

# Don’t mention the war

What I found interesting is the treatment of existential risk in the Science story and how the involved physicists respond to it:

Moss acknowledges that the paper could be taken the wrong way: “I’m sort of afraid that I’m going to have [prominent theorist] John Ellis calling me up and accusing me of scaremongering.

Ellis is indeed grumbling a bit:

As for the presentation of the argument in the new paper, Ellis says he has some misgivings that it will whip up unfounded fears about the safety of the LHC once again. For example, the preprint of the paper doesn’t mention that cosmic-ray data essentially prove that the LHC cannot trigger the collapse of the vacuum—”because we [physicists] all knew that,” Moss says. The final version mentions it on the fourth of five pages. Still, Ellis, who served on a panel to examine the LHC’s safety, says he doesn’t think it’s possible to stop theorists from presenting such argument in tendentious ways. “I’m not going to lose sleep over it,” Ellis says. “If someone asks me, I’m going to say it’s so much theoretical noise.” Which may not be the most reassuring answer, either.

There is a problem here in that physicists are so fed up with popular worries about accelerator-caused disasters – worries that are often second-hand scaremongering that takes time and effort to counter (with marginal effects) – that they downplay or want to avoid talking about things that could feed the worries. Yet avoiding topics is rarely the best idea for finding the truth or looking trustworthy. And given the huge importance of existential risk even when it is unlikely, it is probably better to try to tackle it head-on than skirt around it.

# Theoretical noise

“Theoretical noise” is an interesting concept. Theoretical physics is full of papers considering all sorts of bizarre possibilities, some of which imply existential risks from accelerators. In our paper Probing the Improbable we argue that attempts to bound accelerator risks have problems due to the non-zero probability of errors overshadowing the probability they are trying to bound: an argument that there is zero risk is actually just achieving the claim that there is about 99% chance of zero risk, and 1% chance of some risk. But these risk arguments were assumed to be based on fairly solid physics. Their errors would be slips in logic, modelling or calculation rather than being based on an entirely wrong theory. Theoretical papers are often making up new theories, and their empirical support can be very weak.

An argument that there is some existential risk with probability P actually means that, if the probability of the argument is right is Q, there is risk with probability PQ plus whatever risk there is if the argument is wrong (which we can usually assume to be close to what we would have thought if there was no argument in the first place) times 1-Q. Since the vast majority of theoretical physics papers never go anywhere, we can safely assume Q to be rather small, perhaps around 1%. So a paper arguing for P=100% isn’t evidence the sky is falling, merely that we ought to look more closely to a potentially nasty possibility that is likely to turn into a dud. Most alarms are false alarms.

However, it is easier to generate theoretical noise than resolve it. I have spent some time working on a new accelerator risk scenario, “dark fire”, trying to bound the likelihood that it is real and threatening. Doing that well turned out to be surprisingly hard: the scenario was far more slippery than expected, so ruling it out completely turned out to be very hard (don’t worry, I think we amassed enough arguments to show the risk to be pretty small). This is of course the main reason for the annoyance of physicists: it is easy for anyone to claim there is risk, but then it is up to the physics community to do the laborious work of showing that the risk is small.

The vacuum decay issue has likely been dealt with by the Tegmark and Bostrom paper: were the decay probability high we should expect to be early observers, but we are fairly late ones. Hence the risk per year in our light-cone is small (less than one in a billion). Whatever is going on with the Higgs vacuum, we can likely trust it… if we trust that paper. Again we have to deal with the problem of an argument based on applying anthropic probability (a contentious subject where intelligent experts disagree on fundamentals) to models of planet formation (based on elaborate astrophysical models and observations): it is reassuring, but it does not reassure as strongly as we might like. It would be good to have a few backup papers giving different arguments bounding the risk.

# Backward theoretical noise dampening?

The lovely property of the Tegmark and Bostrom paper is that it covers a lot of different risks with the same method. In a way it handles a sizeable subset of the theoretical noise at the same time. We need more arguments like this. The cosmic ray argument is another good example: it is agnostic on what kind of planet-destroying risk is perhaps unleashed from energetic particle interactions, but given the past number of interactions we can be fairly secure (assuming we patch its holes).

One shared property of these broad arguments is that they tend to start with the risky outcome and argue backwards: if something were to destroy the world, what properties does it have to have? Are those properties possible or likely given our observations? Forward arguments (if X happens, then Y will happen, leading to disaster Z) tend to be narrow, and depend on our model of the detailed physics involved.

While the probability that a forward argument is correct might be higher than the more general backward arguments, it only reduces our concern for one risk rather than an entire group. An argument about why quantum black holes cannot be formed in an accelerator is limited to that possibility, and will not tell us anything about risks from Q-balls. So a backwards argument covering 10 possible risks but just being half as likely to be true as a forward argument covering one risk is going to be more effective in reducing our posterior risk estimate and dampening theoretical noise.

In a world where we had endless intellectual resources we would of course find the best possible arguments to estimate risks (and then for completeness and robustness the second best argument, the third, … and so on). We would likely use very sharp forward arguments. But in a world where expert time is at a premium and theoretical noise high we can do better by looking at weaker backwards arguments covering many risks at once. Their individual epistemic weakness can be handled by making independent but overlapping arguments, still saving effort if they cover many risk cases.

Backwards arguments also have another nice property: they help dealing with the “ultraviolet cut-off problem“. There is an infinite number of possible risks, most of which are exceedingly bizarre and a priori unlikely. But since there are so many of them, it seems we ought to spend an inordinate effort on the crazy ones, unless we find a principled way of drawing the line. Starting from a form of disaster and working backwards on probability bounds neatly circumvents this: production of planet-eating dragons is among the things covered by the cosmic ray argument.

Risk engineers will of course recognize this approach: it is basically a form of fault tree analysis, where we reason about bounds on the probability of a fault. The forward approach is more akin to failure mode and effects analysis, where we try to see what can go wrong and how likely it is. While fault trees cannot cover every possible initiating problem (all those bizarre risks) they are good for understanding the overall reliability of the system, or at least the part being modelled.

Deductive backwards arguments may be the best theoretical noise reduction method.

# Somebody think of the electrons!

Brian Tomasik has a fascinating essay: Is there suffering in fundamental physics?

He admits from the start that “Any sufficiently advanced consequentialism is indistinguishable from its own parody.” And it would be easy to dismiss this as taking compassion way too far: not just caring about plants or rocks, but the possible suffering of electrons and positrons.

I think he has enough arguments to show that the idea is not entirely crazy: we do not understand the ontology of phenomenal experience well enough that we can easily rule out small systems having states, panpsychism is a view held by some rational people, it seems a priori unlikely that there is some mid-sized systems that have all the value in the universe rather than the largest or the smallest scale, we have strong biases towards our kind of system, and information physics might actually link consciousness with physics.

None of these are great arguments, but there are many of them. And the total number of atoms or particles is huge: even assigning a tiny fraction of human moral consideration to them or a tiny probability of them mattering morally will create a large expected moral value. The smallness of moral consideration or the probability needs to be far outside our normal reasoning comfort zone: if you assign a probability lower than $10^{-10^{56}}$ to a possibility you need amazingly strong reasons given normal human epistemic uncertainty.

I suspect most readers will regard this outside their “ultraviolett cutoff” for strange theories: just as physicists successfully invented/discovered a quantum cutoff to solve the ultraviolet catastrophe, most people have a limit where things are too silly or strange to count. Exactly how to draw it rationally (rather than just base it on conformism or surface characteristics) is a hard problem when choosing between the near infinity of odd but barely possible theories.

One useful heuristic is to check whether the opposite theory is equally likely or important: in that case they balance each other (yes, the world could be destroyed by me dropping a pen – but it could also be destroyed by not dropping it). In this case giving greater weight to suffering than neutral states breaks the symmetry: we ought to investigate this possibility since the theory that there is no moral considerability in elementary physics implies no particular value is gained from discovering this fact, while the suffering theory implies it may matter a lot if we found out (and could do something about it). The heuristic is limited but at least a start.

Another way of getting a cutoff for theories of suffering is of course to argue that there must be a lower limit of the system that can have suffering (this is after all how physics very successfully solved the classical UV catastrophe). This gets tricky when we try to apply it to insects, small brains, or other information processing systems. But in physics there might be a better argument: if suffering happens on the elementary particle level, it is going to be quantum suffering. There would be literal superpositions of suffering/non-suffering of the same system. Normal suffering is classical: either it exists or not to some experiencing system, and hence there either is or isn’t a moral obligation to do something. It is not obvious how to evaluate quantum suffering. Maybe we ought to perform a quantum-action that moves the wavefunction to a pure non-suffering state (a bit like quantum game theory: just as game theory might have ties to morality, quantum game theory might link to quantum morality), but this is constrained by the tough limits in quantum mechanics on what can be sensed and done. Quantum suffering might simply be something different from suffering, just as quantum states do not have classical counterparts. Hence our classical moral obligations do not relate to it.

But who knows how molecules feel?