# Catastrophizing for not-so-fun and non-profit

Oren Cass has an article in Foreign Affairs about the problem of climate catastrophizing. It is basically about how catastrophizing becomes driven by motivated reasoning and in turn drives more motivated reasoning, in a vicious circle. Regardless of whether he himself has motivated reasoning too, I think the text is relevant beyond the climate domain.

Some FHI research and reports are mentioned in passing. Their role is mainly to show that there could be very bright futures or other existential risks, which undercuts the climate catastrophists that he is really criticising:

Several factors may help to explain why catastrophists sometimes view extreme climate change as more likely than other worst cases. Catastrophists confuse expected and extreme forecasts and thus view climate catastrophe as something we know will happen. But while the expected scenarios of manageable climate change derive from an accumulation of scientific evidence, the extreme ones do not. Catastrophists likewise interpret the present-day effects of climate change as the onset of their worst fears, but those effects are no more proof of existential catastrophes to come than is the 2015 Ebola epidemic a sign of a future civilization-destroying pandemic, or Siri of a coming Singularity

I think this is an important point for the existential risk community to be aware of. We are mostly interested in existential risks and global catastrophes that look possible but could be impossible (or avoided), rather than trying to predict risks that are going to happen. We deal in extreme cases that are intrinsically uncertain, and leave the more certain things to others (unless maybe they happen to be very under-researched). Siri gives us some singularity-evidence, but we think it is weak evidence, not proof (a hypothetical AI catastrophist would instead say “so, it begins”).

Confirmation bias is easy to fall for. If you are looking for signs of your favourite disaster emerging you will see them, and presumably loudly point at them in order to forestall the disaster. That suggests extra value in checking what might not be xrisks and shouldn’t be emphasised too much.

## Catastrophizing is not very effective

The nuclear disarmament movement also used a lot of catastrophizing, with plenty of archetypal cartoons showing Earth blowing up as a result of nuclear war, and with common claims that it would end humanity. The fact that the likely outcome would merely be mega- or gigadeath and untold suffering was apparently not regarded as rhetorically punchy enough. Ironically, Threads, The Day After or the Charlottesville scenario in Effects of Nuclear War may have been far more effective in driving home the horror and undesirability of nuclear war, largely by giving smaller-scale, more relatable scenarios. Scope insensitivity, psychic numbing, compassion fade and related effects make catastrophizing a weak, perhaps even counterproductive, tool.

Another take-home message: when arguing for the importance of xrisk we should make sure we do not end up in the stupid loop he describes. If something is the most important thing ever, we had better argue for it well, backed up with as much evidence and reason as can possibly be mustered. Turning it all into a game of overcoming cognitive bias through marketing or attributing psychological explanations to opposing views is risky.

The catastrophizing problem for very important risks is related to Janet Radcliffe-Richards’ analysis of what is wrong with political correctness (in an extended sense). A community argues for some high-minded ideal X using some arguments or facts Y. Someone points out a problem with Y. The rational response would be to drop Y and replace it with better arguments or facts Z (or, if it is really bad, drop X). The typical human response is to (implicitly or explicitly) assume that since Y is used to argue for X, then criticising Y is intended to reduce support for X. Since X is good (or at least of central tribal importance) the critic must be evil or at least a tribal enemy – get him! This way bad arguments or unlikely scenarios get embedded in a discourse.

Standard groupthink, where people with doubts figure out that they had better keep their heads down if they want to remain in the group, strengthens the effect and makes criticism even less common (and hence more salient and out-groupish when it happens).

## Reasons to be cheerful?

An interesting detail about the opening: the GCR/Xrisk community seems to be way more optimistic than the climate community as described. I mentioned Warren Ellis’ little novel Normal earlier on this blog, which is about a mental asylum for futurists affected by looking into the abyss. I suspect he was maybe modelling them on the moody climate people but adding an overlay of other futurist ideas/tropes for the story.

Assuming climate people really are that moody.

# The capability caution principle and the principle of maximal awkwardness

Capability Caution Principle: There being no consensus, we should avoid strong assumptions regarding upper limits on future AI capabilities.

It is an important meta-principle in careful design to avoid assuming the most reassuring possibility and instead design based on the most awkward possibility.

When inventing a cryptosystem, do not assume that the adversary is stupid and has limited resources: try to make something that can withstand a computationally and intellectually superior adversary. When testing a new explosive, do not assume it will be weak – stand as far away as possible. When trying to improve AI safety, do not assume AI will be stupid or weak, or that whoever implements it will be sane.

Often we think that the conservative choice is the pessimistic choice where nothing works. This is because “not working” is usually the most awkward possibility when building something. If I plan a project I should ensure that I can handle unforeseen delays and the possibility that my original plans and pathways have to be scrapped and replaced with something else. But from a safety or social impact perspective the most awkward situation is if something succeeds radically, in the near future, and we have to deal with the consequences.

Assuming the principle of maximal awkwardness is a form of steelmanning, and of assuming the least convenient possible world.

This is an approach based on potential loss rather than probability. Most AI history tells us that wild dreams rarely, if ever, come true. But were we to get very powerful AI tools tomorrow it is not too hard to foresee a lot of damage and disruption. Even if you do not think the risk is existential you can probably imagine that autonomous hedge funds smarter than human traders, automated engineering in the hands of anybody and scalable automated identity theft could mess up the world system rather strongly. The fact that it might be unlikely is not as important as that the damage would be unacceptable. It is often easy to think that in uncertain cases the burden of proof is on the other party, rather than on the side where a mistaken belief would be dangerous.

As FLI stated it the principle goes both ways: do not assume the limits are super-high either. Maybe there is a complexity scaling making problem-solving systems unable to handle more than 7 things in “working memory” at the same time, limiting how deep their insights could be. Maybe social manipulation is not a tractable task. But this mainly means we should not count on the super-smart AI as a solution to problems (e.g. using one smart system to monitor another smart system). It is not an argument to be complacent.

People often misunderstand uncertainty:

• Some think that uncertainty implies that non-action is reasonable, or at least action should wait till we know more. This is actually where the precautionary principle is sane: if there is a risk of something bad happening but you are not certain it will happen, you should still try to prevent it from happening or at least monitor what is going on.
• Obviously some uncertain risks are unlikely enough that they can be ignored by rational people, but you need to have good reasons to think that the risk is actually that unlikely – uncertainty alone does not help.
• Gaining more information sometimes reduces uncertainty in valuable ways, but the price of information can sometimes be too high, especially when there are intrinsically unknowable factors and noise clouding the situation.
• Looking at the mean or expected case can be a mistake if there is a long tail of relatively unlikely but terrible possibilities: on the average day your house does not have a fire, but having insurance, a fire alarm and a fire extinguisher is a rational response.
• Combinations of uncertain factors do not become less uncertain as they are combined (even if you describe them carefully and with scenarios): typically you get broader and heavier-tailed distributions, and should act on the tail risk.
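
The last two bullets can be made concrete with a small sketch. It assumes, purely for illustration, that an outcome is the product of several independent lognormal factors, each with the same log-standard-deviation `sigma`; the function name and the numbers are hypothetical and not taken from any particular risk model. Under that assumption the log of the product is normal with standard deviation `sigma * sqrt(n)`, so the gap between the typical case and the tail widens as factors are combined:

```python
from math import exp, sqrt
from statistics import NormalDist


def tail_to_median_ratio(n_factors, sigma, quantile=0.99):
    """Ratio of a high quantile to the median for a product of n_factors
    independent lognormal factors (an assumed toy model)."""
    z = NormalDist().inv_cdf(quantile)  # standard normal quantile
    # The median of the product cancels out, leaving only the spread term.
    return exp(z * sigma * sqrt(n_factors))


# Illustrative values only: the ratio grows as more factors are combined.
for n in (1, 2, 4, 8):
    print(n, round(tail_to_median_ratio(n, sigma=0.5), 1))
```

Planning around the median here would miss an ever larger share of the bad outcomes as the chain of uncertain factors gets longer.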

FLI asks the intriguing question of how smart AI can get. I really want to know that too. But it is relatively unimportant for designing AI safety unless the ceiling is shockingly low; it is safer to assume it can be as smart as it wants to. Some AI safety schemes involve smart systems monitoring each other or performing very complex counterfactuals: these do hinge on an assumption of high intelligence (or whatever it takes to accurately model counterfactual worlds). But then the design criteria should be to assume that these things are hard to do well.

Under high uncertainty, assume Murphy’s law holds.

(But remember that good engineering and reasoning can bind Murphy – it is just that you cannot assume somebody else will do it for you.)

Wired has an article about the CSER Existential Risk Conference in December 2016, rather flatteringly comparing us to superheroes. Plus a list of more or less likely risks we discussed. Calling them the “10 biggest threats” is perhaps exaggerating a fair bit: nobody is seriously worried about simulation shutdowns. But some of the others are worth working a lot more on.

## High-energy demons

I am cited as talking about existential risk from demon summoning. Since this is bound to be misunderstood, here is the full story:

As noted in the Wired list, we wrote a paper looking at the risk from the LHC, finding that there is a problem with analysing very unlikely (but high impact) risks: the probability of a mistake in the analysis overshadows the risk itself, making the analysis bad at bounding the risk. This can be handled by doing multiple independent risk bounds, which is a hassle, but it is the only (?) way to reliably conclude that things are safe.
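
A minimal sketch of that argument, with made-up numbers: the total probability of disaster is split into the case where the safety argument is sound and the case where it is flawed. The function names, the pessimistic default of treating a flawed argument as giving no protection at all, and the assumption that separate arguments fail independently are all mine, for illustration only.

```python
from math import prod


def risk_bound(p_risk_if_sound, p_flawed, p_risk_if_flawed=1.0):
    """Total probability of disaster when the safety argument may be wrong.

    P(X) = P(X | sound) * P(sound) + P(X | flawed) * P(flawed).
    The default p_risk_if_flawed=1.0 is a deliberately pessimistic assumption.
    """
    return p_risk_if_sound * (1.0 - p_flawed) + p_risk_if_flawed * p_flawed


def p_all_flawed(flaw_probabilities):
    """Chance that every argument fails, assuming they fail independently."""
    return prod(flaw_probabilities)


# Illustrative numbers: a 1e-12 risk estimate from an argument with a
# one-in-a-thousand chance of containing a serious mistake.
single = risk_bound(p_risk_if_sound=1e-12, p_flawed=1e-3)
triple = risk_bound(p_risk_if_sound=1e-12, p_flawed=p_all_flawed([1e-3] * 3))
print(f"one argument:    {single:.3g}")
print(f"three arguments: {triple:.3g}")
```

With a single argument the bound is dominated by the chance of a mistake rather than by the estimated risk; stacking independent arguments pushes the bound down only as far as the independence assumption can be trusted.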

I blogged a bit about the LHC issue before we wrote the paper, bringing up the problem of estimating probabilities for unprecedented experiments through the case of Taleb’s demon (which properly should be Taylor’s demon, but Stigler’s law of eponymy strikes again). That probably got me to have a demon association to the wider physics risk issues.

The issue of how to think about unprecedented risks without succumbing to precautionary paralysis is important: we cannot avoid doing new things, yet we should not be stupid about it. This is extra tricky when considering experiments that create things or conditions that are not found in nature.

## Not so serious?

A closely related issue is when it is reasonable to regard a proposed risk as non-serious. Predictions of risk from strangelets, black holes, vacuum decay and other “theoretical noise” generated by theoretical physics are at least triggered by some serious physics thinking, even if it is far out. Physicists have generally tended to ignore such risks, but when forced by anxious acceleratorphobes the arguments had to be nontrivial: the initial dismissal was not really well founded. Yet it seems totally reasonable to dismiss some risks. If somebody worries that the alien spacegods will take exception to the accelerator we generally look for a psychiatrist rather than take them seriously. Some theories have so low prior probability that it seems rational to ignore them.

But what is the proper de minimis boundary here? One crude way of estimating it is to say that risks of destroying the world with lower probability than one in 10 billion can safely be ignored – they correspond to a risk of less than one person in expectation. But we would not accept that for an individual chemistry experiment: if the chance of being blown up if someone did it was “less than 100%” but still far above some tiny number, they would presumably want to avoid risking their neck. And in the physics risk case the same risk is borne by every living human. Worse, by Bostrom’s astronomical waste argument, existential risks put more than $10^{46}$ possible future lives at stake. So maybe we should put the boundary at less than $10^{-46}$: any risk more likely must be investigated in detail. That will be a lot of work. Still, there are risks far below this level: the probability that all humans were to die from natural causes within a year is around $10^{-7.2\times 10^{11}}$, which is OK.

One can argue that the boundary does not really exist: Martin Peterson argues that setting it at some fixed low probability, at the level where realisations of the risk cannot be ascertained, or below the level of natural risks does not truly work: the boundary will be vague.
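
The arithmetic behind these boundaries is just expected lives lost, probability times lives at stake. A small sketch, where the function names are hypothetical, the population is rounded up to ten billion for convenience, and "one expected death" is taken as the tolerance:

```python
def expected_deaths(p_catastrophe, lives_at_stake):
    """Expected number of lives lost for a given catastrophe probability."""
    return p_catastrophe * lives_at_stake


def de_minimis_probability(lives_at_stake, tolerated_expected_deaths=1.0):
    """Largest probability keeping expected deaths within the tolerance."""
    return tolerated_expected_deaths / lives_at_stake


CURRENT_POPULATION = 1e10  # round figure, assumed for illustration
FUTURE_LIVES = 1e46        # the lower bound quoted in the text

print(expected_deaths(1e-10, CURRENT_POPULATION))  # about one person
print(de_minimis_probability(FUTURE_LIVES))        # about 1e-46
```

The boundary moves by 36 orders of magnitude depending only on whose lives are counted, which is one reason to expect it to be vague.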

## Demons lurking in the priors

Be that as it may with the boundary, the real problem is that estimating prior probabilities is not always easy. They can vault over the vague boundary.

Hence my demon summoning example (from a blog post near Halloween I cannot find right now): what about the risk of somebody summoning a demon army? It might cause the end of the world. The theory “Demons are real and threatening” is not a hugely likely theory: atheists and modern Christians may assign it zero probability. But that breaks Cromwell’s rule: once you assign probability 0 to a theory no amount of evidence – including a demon army parading in front of you – will make you change your mind (or you are not applying probability theory correctly). The proper response is to assume some tiny probability $\epsilon$, conveniently below the boundary.

…except that there are a lot of old-fashioned believers who do think the theory “Demons are real and threatening” is a totally fine theory. Sure, most academic readers of this blog will not belong to this group and instead to the $\epsilon$ probability group. But knowing that there are people out there that think something different from your view should make you want to update your view in their direction a bit – after all, you could be wrong and they might know something you don’t. (Yes, they ought to move a bit in your direction too.) But now suppose you move 1% in the direction of the believers from your $\epsilon$ belief. You will now believe in the theory to $\epsilon + 1\% \approx 1\%$. That is, now you have a fairly good reason not to disregard the demon theory automatically. At least you should spend effort on checking it out. And once you are done with that you better start with the next crazy New Age theory, and the next conspiracy theory…
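
Both steps can be sketched in a few lines. The code below is only an illustration under my own assumptions: the likelihood numbers are invented, and "moving 1% in the direction of the believers" is read as a linear mix of credences with weight 0.01, with the believers' credence set to 1.

```python
def bayes_update(prior, p_evidence_if_true, p_evidence_if_false):
    """Posterior probability of a theory after seeing one piece of evidence."""
    numerator = p_evidence_if_true * prior
    denominator = numerator + p_evidence_if_false * (1.0 - prior)
    if denominator == 0.0:
        raise ValueError("the evidence has probability zero under this prior")
    return numerator / denominator


def move_toward(own_credence, other_credence, weight):
    """Linear mix of two credences; weight is how far to move toward the other."""
    return (1.0 - weight) * own_credence + weight * other_credence


# A demon army parading past: near-certain if demons exist, wildly unlikely
# otherwise (both numbers are invented for the example).
print(bayes_update(0.0, 0.99, 1e-9))    # a zero prior never moves
print(bayes_update(1e-12, 0.99, 1e-9))  # a tiny prior moves a lot

# Moving 1% of the way from epsilon toward a fully convinced believer.
print(move_toward(1e-12, 1.0, 0.01))
```

The first call shows Cromwell's rule in action, and the last shows why a tiny $\epsilon$ does not survive even a small concession to people who disagree.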

## Reverend Bayes doesn’t help the unbeliever (or believer)

One way out is to argue that the probability of believers being right is so low that it can be disregarded. If they have probability $\epsilon$ of being right, then the actual demon risk is of size $\epsilon$ and we can ignore it – updates due to the others do not move us. But that is a pretty bold statement about human beliefs about anything: humans can surely be wrong about things, but being that certain that a common belief is wrong seems to require better evidence.

The believer will doubtlessly claim seeing a lot of evidence for the divine, giving some big update $\Pr[belief|evidence]=\Pr[evidence|belief]\Pr[belief]/\Pr[evidence]$, but the non-believer will notice that the evidence is also pretty compatible with non-belief: $\frac{\Pr[evidence|belief]}{\Pr[evidence|nonbelief]}\approx 1$ – most believers seem to have strong priors for their belief that they then strengthen by selective evidence or interpretation without taking into account the more relevant ratio $\Pr[belief|evidence] / \Pr[nonbelief|evidence]$. And the believers counter that the same is true for the non-believers…
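
The disagreement is easiest to see in the odds form of Bayes' rule, which follows from dividing the update for belief by the corresponding update for non-belief so that $\Pr[evidence]$ cancels:

$$\frac{\Pr[belief|evidence]}{\Pr[nonbelief|evidence]} = \frac{\Pr[evidence|belief]}{\Pr[evidence|nonbelief]} \cdot \frac{\Pr[belief]}{\Pr[nonbelief]}$$

If the likelihood ratio in the middle is about 1, the posterior odds are about the same as the prior odds: each side ends up roughly where it started, however much evidence it thinks it has collected.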

Insofar as we are just messing around with our own evidence-free priors we should just assume that others might know something we don’t know (maybe even in a way that we do not even recognise epistemically) and update in their direction. Which again forces us to spend time investigating demon risk.

## OK, let’s give in…

Another way of reasoning is to say that maybe we should investigate all risks somebody can make us guess a non-negligible prior for. It is just that we should allocate our efforts proportional to our current probability guesstimates. Start with the big risks, and work our way down towards the crazier ones. This is a bit like the post about the best problems to work on: setting priorities is important, and we want to go for the ones where we chew off most uninvestigated risk.
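
As a sketch of what proportional allocation means in practice, the snippet below splits a fixed effort budget according to current credences and returns the risks in the order they would be worked through. The function name, the risk labels and the numbers are placeholders, not estimates anyone has made.

```python
def allocate_effort(credences, total_effort):
    """Split total_effort in proportion to credences, largest first."""
    total = sum(credences.values())
    if total <= 0:
        raise ValueError("need at least one positive credence")
    ranked = sorted(credences.items(), key=lambda item: item[1], reverse=True)
    return {name: total_effort * p / total for name, p in ranked}


# Placeholder guesstimates; they do not need to sum to one.
guesses = {"risk A": 0.03, "risk B": 0.01, "risk C": 1e-6}
for name, hours in allocate_effort(guesses, total_effort=100.0).items():
    print(f"{name}: {hours:.4f} hours")
```

Because the function divides by the sum of the guesses, it quietly normalises them, which is the step that gets harder the longer the list of candidate risks becomes.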

If we work our way down the list this way it seems that demon risk will be analysed relatively early, but also dismissed quickly: within the religious framework it is not a likely existential risk in most religions. In reality few if any religious people hold the view that demon summoning is an existential risk, since they tend to think that the end of the world is a religious drama and hence not intended to be triggered by humans – only divine powers or fate gets to start it, not curious demonologists.

## That wasn’t too painful?

Have we defeated the demon summoning problem? Not quite. There is no reason for all those priors to sum to 1 – they are suggested by people with very different and even crazy views – and even if we normalise them we get a very long and heavy tail of weird small risks. We can easily use up any amount of effort on this, effort we might want to spend on doing other useful things like actually reducing real risks or doing fun particle physics.

There might be solutions to this issue by reasoning backwards: instead of looking at how X could cause Y that could cause Z that destroys the world we ask “If the world would be destroyed by Z, what would need to have happened to cause it?” Working backwards to Y, Y’, Y” and other possibilities covers a larger space than our initial chain from X. If we are successful we can now state what conditions are needed to get to dangerous Y-like states and how likely they are. This is a way of removing entire chunks of the risk landscape in efficient ways.

This is how I think we can actually handle these small, awkward and likely non-existent risks. We develop mental tools to efficiently get rid of lots of them in one fell swoop, leaving the stuff that needs to be investigated further. But doing this right… well, the devil lurks in the details. Especially the thicket of totally idiosyncratic risks that cannot be handled in a general way. Which is no reason not to push forward, armed with epsilons and Bayes’ rule.

That the unbeliever may have to update a bit in the believer direction may look like a win for the believers. But they, if they are rational, should do a small update into the unbeliever direction. The most important consequence is that now they need to consider existential risks due to non-supernatural causes like nuclear war, AI or particle physics. They would assign them a lower credence than the unbeliever, but as per the usual arguments for the super-importance of existential risk this still means they may have to spend effort on thinking about and mitigating these risks that they otherwise would have dismissed as something God would have prevented. This may be far more annoying to them than unbelievers having to think a bit about demonology.

Emlyn O’Regan makes some great points over at Google+, which I think are worth analyzing:

1. “Should you somehow incorporate the fact that the world has avoided destruction until now into your probabilities?”
2. “Ideas without a tech angle might be shelved by saying there is no reason to expect them to happen soon.” (since they depend on world properties that have remained unchanged.)
3. “Ideas like demon summoning might be limited also by being shown to be likely to be the product of cognitive biases, rather than being coherent free-standing ideas about the universe.”

In the case of (1), observer selection effects can come into play. If there are no observers in a post-demon world (demons maybe don’t count) then we cannot expect to see instances of demon apocalypses in the past. This is why the cosmic ray argument for the safety of the LHC needs to point to the survival of the Moon or other remote objects rather than the Earth to argue that being hit by cosmic rays over long periods proves that it is safe. Also, as noted by Emlyn, the Doomsday argument might imply that we should expect a relatively near-term end, given the length of our past: whether this matters or not depends a lot on how one handles observer selection theory.

In the case of (2), there might be development in summoning methods. Maybe medieval methods could not work, but modern computer-aided chaos magick is up to it. Or there could be rare “the stars are right” situations that made past disasters impossible. Still, if you understand the risk domain you may be able to show that the risk is constant and hence must have been low (or that we are otherwise living in a very unlikely world). Traditions that do not believe in a growth of esoteric knowledge would presumably accept that past failures are evidence of future inability.

(3) is an error theory: believers in the risk are believers not because of proper evidence but from faulty reasoning of some kind, so they are not our epistemic peers and we do not need to update in their direction. If somebody is trying to blow up a building with a bomb we call the police, but if they try to do it by cursing we may just watch with amusement: past evidence of the efficacy of magic at causing big effects is nonexistent. So we have one set of evidence-supported theories (physics) and another set lacking evidence (magic), and we make the judgement that people believing in magic are just deluded and can be ignored.

(Real practitioners may argue that there sure is evidence for magic, it is just that magic is subtle and might act through convenient coincidences that look like they could have happened naturally but occur too often or too meaningfully to be just chance. However, the skeptic will want to actually see some statistics for this, and in any case demon apocalypses look like they are way out of the league for this kind of coincidental magic).

Emlyn suggests that maybe we could scoop all the non-physics-like human ideas due to brain architecture into one bundle, and assign them one epsilon of probability as a group. But now we have the problem of assigning an idea to this group or not: if we are a bit uncertain about whether it should have $\epsilon$ probability or a big one, then it will get at least some fraction of the big probability and be outside the group. We can only do this if we are really certain that we can assign ideas accurately, and looking at how many people psychoanalyse, sociologise or historicise topics in engineering and physics to “debunk” them without looking at actual empirical content, we should be wary of our own ability to do it.

So, in short, (1) and (2) do not reduce our credence in the risk enough to make it irrelevant unless we get a lot of extra information. (3) is decent at making us sceptical, but our own fallibility at judging cognitive bias and mistakes (which follows from claiming others are making mistakes!) makes error theories weaker than they look. Still, there is a really consistent lack of evidence of anything resembling the risk being real, and claims internal to the systems of ideas that accept the possibility imply that there should be smaller, non-existential instances that would be observable (e.g. individual Fausts getting caught on camera visibly succeeding in summoning demons). Hence we can discount these systems strongly in favor of more boring but safe physics or hard-to-disprove but safe coincidental magic.

# Best problems to work on?

80,000 hours has a lovely overview of “What are the biggest problems in the world?” The best part is that each problem gets its own profile with a description, arguments in favor and against, and what already exists. I couldn’t resist plotting the table in 3D:

There are of course plenty of problems not listed; even if these are truly the most important there will be a cloud of smaller scale problems to the right. They list a few potential ones like cheap green energy, peace, human rights, reducing migration restrictions, etc.

I recently got the same question, and here are my rough answers:

• Fixing our collective epistemic systems. Societies work as cognitive systems: acquiring information, storing, filtering and transmitting it, synthesising it, making decisions, and implementing actions. This is done through individual minds, media and institutions. Recently we have massively improved some aspects through technology, but it looks like our ability to filter, organise and jointly coordinate has not improved – in fact, many feel it has become worse. Networked media means that information can bounce around multiple times acquiring heavy bias, while filtering mechanisms relying on authority have lost credibility (rightly or wrongly). We are seeing all sorts of problems of coordinating diverse, polarised, globalised or confused societies. Decision-making that is not reality-tracking due to (rational or irrational) ignorance, bias or misaligned incentives is at best useless, at worst deadly. Figuring out how to improve these systems seems to be something with tremendous scale (good coordination and governance helps solve most problems above), it is fairly neglected (people tend to work on small parts rather than figuring out better systems), and looks decently solvable (again, many small pieces may be useful together rather than requiring a total perfect solution).
• Ageing. Ageing kills 100,000 people per day. It is a massive cause of suffering, from chronic diseases to loss of life quality. It causes loss of human capital at nearly the same rate as all education and individual development together. A reduction in the health toll from ageing would not just save life-years, it would have massive economic benefits. While this would necessitate changes in society, the cost and trouble of the most plausible shifts (changing pensions, the concepts of work and life-course, how families are constituted, some fertility reduction and institutional reform) are pretty microscopic compared to the ongoing death toll and losses. The solvability is improving: 20 years ago it was possible to claim that there were no anti-ageing interventions, while today there exist enough lab examples to make this untenable. Transferring these results into human clinical practice will however be a lot of hard work. It is also fairly neglected: far more work is being spent on symptoms and age-related illness and infirmity than root causes, partially for cultural reasons.
• Existential risk reduction: I lumped together all the work to secure humanity’s future into one category. Right now I think reducing nuclear war risk is pretty urgent (not because of the current incumbent of the White House, but simply because the state risk probability seems to dominate the other current risks), followed by biotechnological risks (where we still have some time to invent solutions before the Collingridge dilemma really bites; I think it is also somewhat neglected) and AI risk (I put it as #3 for humanity, but it may be #1 for research groups like FHI that can do something about the neglectedness while we figure out better how much priority it truly deserves). But a lot of the effort might be on the mitigation side: alternative food to make the world food system more resilient and sun-independent, distributed and more robust infrastructure (whether better software security, geomagnetic storm/EMP-safe power grids, local energy production, distributed internet solutions etc.), refuges and backup solutions. The scale is big, most are neglected and many are solvable.

Another interesting set of problems is Robin Hanson’s post about neglected big problems. They are in a sense even more fundamental than mine: they are problems with the human condition.

As a transhumanist I do think the human condition entails some rather severe problems – ageing and stupidity are just two of them – and that we should work to fix them. Robin’s list may not be the easiest to solve, though (although there might be piecemeal solutions worth doing). Many enhancements, like moral capacity and well-being, have great scope and are very neglected but lose out to ageing because of the currently low solvability level and the higher urgency of coordination and risk reduction. As I see it, if we can ensure that we survive (individually and collectively) and are better at solving problems, then we will have better chances at fixing the tougher problems of the human condition.

# Survivorship curves and existential risk

In a discussion Dennis Pamlin suggested that one could make a mortality table/survival curve for our species subject to existential risk, just as one can do for individuals. This also allows demonstrations of how changes in risk affect the expected future lifespan. This post is a small internal FHI paper I did just playing around with survivorship curves and other tools of survival analysis to see what they add to considerations of existential risk. The outcome was more qualitative than quantitative: I do not think we know enough to make a sensible mortality table. But it does tell us a few useful things:

• We should try to reduce ongoing “state risks” as early as possible
• Discrete “transition risks” that do not affect state risks matter less; we may want to put them off indefinitely.
• Indefinite survival is possible if we make hazard decrease fast enough.

# Simple model

A first, very simple model: assume a fixed population and power-law sized disasters that randomly kill a number of people proportional to their size every unit of time (if there are survivors, then they repopulate until next timestep). Then the expected survival curve is an exponential decay.

This is in fact independent of the distribution, and just depends on the chance of exceedance. If disasters happen at a rate $\lambda$ and the probability of extinction $\Pr(X>\mathrm{population}) = p$, then the curve is $S(t) = \exp(-p \lambda t).$
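
A quick way to check the formula is to simulate it. The sketch below treats disasters as a Poisson process with rate `rate`, each one independently fatal with probability `p_extinction`, and compares the simulated mean time to extinction with the $1/(p\lambda)$ implied by $S(t)$. The function names and the parameter values are illustrative choices, not estimates.

```python
import random
from math import exp, log


def survival(t, rate, p_extinction):
    """S(t) = exp(-p * lambda * t) for the constant-hazard model."""
    return exp(-p_extinction * rate * t)


def half_life(rate, p_extinction):
    """Time at which the survival probability has dropped to one half."""
    return log(2.0) / (p_extinction * rate)


def simulate_extinction_time(rate, p_extinction, rng):
    """Draw disasters one by one until one exceeds the population."""
    t = 0.0
    while True:
        t += rng.expovariate(rate)       # waiting time to the next disaster
        if rng.random() < p_extinction:  # this disaster is big enough
            return t


rng = random.Random(0)  # fixed seed so the run is repeatable
times = [simulate_extinction_time(1.0, 0.1, rng) for _ in range(20000)]
print("simulated mean:", sum(times) / len(times))
print("theoretical mean:", 1.0 / (0.1 * 1.0))
```

Only the product $p\lambda$ matters: many small-chance disasters and a few high-chance ones give the same curve, which is the sense in which the result is independent of the size distribution.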

This can be viewed as a simple model of state risks, the ongoing background of risk to our species from e.g. asteroids and supernovas.

## Correlations

What if the population rebound is slower than the typical inter-disaster interval? During the rebound the population is more vulnerable to smaller disasters. However, if we average over longer time than the rebound time constant we end up with the same situation as before: an adjusted, slightly higher hazard, but still an exponential.

In ecology there have been a fair number of papers analyzing how correlated environmental noise affects extinction probability, generally concluding that correlated (“red”) noise is bad (e.g. (Ripa and Lundberg 1996), (Ovaskainen and Meerson 2010)) since the adverse conditions can be longer than the rebound time.

If events behave in a sufficiently correlated manner, then the basic survival curve may be misleading since it only shows the mean ensemble effect rather than the tail risks. Human societies are also highly path dependent over long timescales: our responses can create long memory effects, both positive and negative, and this can affect the risk autocorrelation.

## Population growth

If population increases exponentially at a rate $G$ and is reduced by disasters, then initially some instances will be wiped out, but many realizations achieve takeoff where they grow essentially forever. As the population becomes larger, risk declines as $\exp(- \alpha G t).$

This is somewhat similar to Stuart’s and my paper on indefinite survival using backups: when we grow fast enough there is a finite chance of surviving indefinitely. The growth may be in terms of individuals (making humanity more resilient to larger and larger disasters), or in terms of independent groups (making humanity more resilient to disasters affecting a location). If risks change in size in proportion to population or occur in different locations in a correlated manner this basic analysis may not apply.
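To see why an exponentially declining risk leaves a finite chance of surviving forever, here is a small sketch. It assumes the hazard starts at some level $h_0$ (a hypothetical parameter, not in the text above) and decays as $h_0\exp(-\alpha G t)$; the cumulative hazard then converges to $h_0/(\alpha G)$, so the survival curve levels off at $\exp(-h_0/(\alpha G))$ instead of going to zero.

```python
import math

def survival_declining_hazard(t, h0, decay):
    """Survival when hazard is h0 * exp(-decay * t); `decay` plays the role of alpha * G."""
    cumulative_hazard = (h0 / decay) * (1.0 - math.exp(-decay * t))
    return math.exp(-cumulative_hazard)

def indefinite_survival_probability(h0, decay):
    """Limit of the survival curve as t goes to infinity."""
    return math.exp(-h0 / decay)

# Hypothetical illustration values (assumptions):
h0 = 0.01     # initial hazard per unit time
decay = 0.02  # alpha * G

print(survival_declining_hazard(1000.0, h0, decay))
print(indefinite_survival_probability(h0, decay))
```

The faster the hazard declines relative to its starting level, the closer the plateau is to 1.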

# General cases

Overall, if there is a constant rate of risk, then we should expect exponential survival curves. If the rate grows or declines as a power of time, $t^{k-1}$, we get a Weibull distribution of time to extinction, which has a “stretched exponential” survival curve: $\exp(-(t/\lambda)^k).$

If we think of risk increasing from some original level to a new higher level, then the survival curve will essentially be piece-wise exponential with a more or less softly interpolating “knee”.

## Transition risks

A transition risk is essentially an impulse of hazard. We can treat it as a Dirac delta function in the hazard at a certain time $t$, survived with probability $w$, in which case it just rescales the survival curve so that $\frac{S(\mathrm{after}\ t)}{S(\mathrm{before}\ t)}=w$ (a delta of hazard weight $h$ corresponds to $w=e^{-h}$). If $t$ is randomly distributed it produces a softer decline, but with the same magnitude.

## Rectangular survival curves

Human individual survival curves are rectangularish because of exponentially increasing hazard plus some constant hazard (the Gompertz-Makeham law of mortality). The increasing hazard is due to ageing: old people are more vulnerable than young people.

Do we have any reason to expect a similarly increasing hazard for humanity? Considering the invention of new dangerous technologies as adding more state risk we should expect at least enough of an increase to get a more convex shape of the survival curve in the present era, possibly with transition risk steps added in the future. This was counteracted by the exponential growth of human population until recently.

## How do species survival curves look in nature?

There is “van Valen’s law of extinction” claiming the normal extinction rate remains constant at least within families, finding exponential survivorship curves (van Valen 1973). It is worth noting that the extinction rate is different for different ecological niches and types of organisms.

However, fits with Weibull distributions seem to work better for Cenozoic foraminifera than exponentials (Arnold, Parker and Hansard 1995), suggesting the probability of extinction increases with species age. The difference in shape is however relatively small (k≈1.2), making the probability increase from 0.08/Myr at 1 Myr to 0.17/Myr at 40 Myr. Other data hint at slightly slowing extinction rates for marine plankton (Cermeno 2011).
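As a quick check on these numbers, the sketch below uses the standard Weibull hazard $h(t)=\frac{k}{\lambda}(t/\lambda)^{k-1}$. Only the ratio of hazards at two ages is needed, so the scale parameter cancels; the value `scale = 1.0` is an arbitrary placeholder, not a fitted value from the cited paper.

```python
import math

def weibull_hazard(t, shape, scale):
    """Hazard rate of a Weibull distribution with the given shape k and scale lambda."""
    return (shape / scale) * (t / scale) ** (shape - 1)

def weibull_survival(t, shape, scale):
    """Stretched-exponential survival curve exp(-(t / scale) ** shape)."""
    return math.exp(-((t / scale) ** shape))

shape = 1.2   # the k quoted above
scale = 1.0   # arbitrary: it cancels in the ratio below

# How much higher is the extinction rate at 40 Myr than at 1 Myr?
ratio = weibull_hazard(40.0, shape, scale) / weibull_hazard(1.0, shape, scale)
late_rate = 0.08 * ratio  # starting from 0.08/Myr at 1 Myr

print(ratio, late_rate)
```

The ratio is $40^{0.2}$, roughly a doubling, which matches the rise from 0.08/Myr to about 0.17/Myr.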

In practice there are problems associated with speciation and time-varying extinction rates, not to mention biased data (Pease 1988). In the end, the best we can say at present appears to be that natural species survival is roughly exponentially distributed.

# Conclusions for xrisk research

Survival curves contain a lot of useful information. The median lifespan is easy to read off by checking the intersection with the 50% survival line. The life expectancy is the area under the curve.

In a semilog-diagram an exponentially declining survival probability is a line with negative slope. The slope is set by the hazard rate. Changes in hazard rate make the line a series of segments.
An early reduction in hazard (i.e. the line slope becomes flatter) clearly improves the outlook at a later time more than a later equal improvement: to have a better effect the late improvement needs to reduce hazard significantly more.
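A small sketch of this comparison, using a piecewise-constant hazard. All the numbers (baseline hazard 0.01, halving at time 10 or 50, evaluation at time 100) are hypothetical and only meant to show the shape of the argument.

```python
import math

def survival_piecewise(t, segments):
    """Survival at time t for piecewise-constant hazard.

    `segments` is a list of (start_time, hazard) pairs sorted by start_time,
    with the first segment starting at 0.
    """
    cumulative_hazard = 0.0
    for i, (start, hazard) in enumerate(segments):
        end = segments[i + 1][0] if i + 1 < len(segments) else math.inf
        if t <= start:
            break
        cumulative_hazard += hazard * (min(t, end) - start)
    return math.exp(-cumulative_hazard)

# Hypothetical scenarios: hazard halves either early (t=10) or late (t=50).
early = survival_piecewise(100.0, [(0.0, 0.01), (10.0, 0.005)])
late = survival_piecewise(100.0, [(0.0, 0.01), (50.0, 0.005)])

# To match the early improvement, the late one must cut hazard to a tenth.
late_but_deeper = survival_piecewise(100.0, [(0.0, 0.01), (50.0, 0.001)])

print(early, late, late_but_deeper)
```

In this toy case a halving of hazard at time 10 is matched only by a tenfold reduction at time 50.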

A transition risk causes a vertical displacement of the line (or curve) downwards: the weight determines the distance. From a given future time, it does not matter when the transition risk occurs as long as the subsequent hazard rate is not dependent on it. If the weight changes depending on when it occurs (hardware overhang, technology ordering, population) then the position does matter. If there is a risky transition that reduces state risk we should want it earlier if it does not become worse.
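The timing claims can be illustrated with a toy function, under the assumption that the transition is a single event survived with probability `survive_prob` and that the state hazard is constant before and after it. The parameter names and values are hypothetical.

```python
import math

def survival_with_transition(t, hazard_before, hazard_after, transition_time, survive_prob):
    """Survival at time t with one transition risk at `transition_time`."""
    if t < transition_time:
        return math.exp(-hazard_before * t)
    cumulative_hazard = hazard_before * transition_time + hazard_after * (t - transition_time)
    return survive_prob * math.exp(-cumulative_hazard)

# Case 1: the transition does not change the state risk, so its timing is irrelevant.
same_early = survival_with_transition(100.0, 0.01, 0.01, 20.0, 0.9)
same_late = survival_with_transition(100.0, 0.01, 0.01, 80.0, 0.9)

# Case 2: the transition halves the state risk, so earlier is better.
safer_early = survival_with_transition(100.0, 0.01, 0.005, 20.0, 0.9)
safer_late = survival_with_transition(100.0, 0.01, 0.005, 80.0, 0.9)

print(same_early, same_late, safer_early, safer_late)
```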

### Acknowledgments

Thanks to Toby Ord for pointing out a mistake in an earlier version.

# Appendix: survival analysis

The main object of interest is the survival function $S(t)=\Pr(T>t)$ where $T$ is a random variable denoting the time of death. In engineering it is commonly called the reliability function. It declines over time, and will approach zero unless indefinite survival is possible with a finite probability.

The event density $f(t)=\frac{d}{dt}(1-S(t))$ denotes the rate of death per unit time.

The hazard function $\lambda(t)$ is the event rate at time $t$ conditional on survival until time $t$ or later. It is $\lambda(t) = - S'(t)/S(t)$. Note that unlike the event density function this does not have to decline as the number of survivors gets low: this is the overall force of mortality at a given time.

The expected future lifetime given survival to time $t_0$ is $\frac{1}{S(t_0)}\int_{t_0}^\infty S(t)dt.$ Note that for exponential survival curves (i.e. constant hazard) it remains constant.
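A rough numerical sketch of this formula, using a plain trapezoid rule and a finite integration horizon (both are simplifying assumptions; the horizon must be far beyond where the curve has effectively reached zero). It compares a constant hazard with an increasing Weibull hazard; the parameter values are hypothetical.

```python
import math

def expected_future_lifetime(survival, t0, horizon=1000.0, steps=100_000):
    """Approximate (1 / S(t0)) * integral of S(t) from t0 to `horizon`."""
    dt = (horizon - t0) / steps
    total = 0.5 * (survival(t0) + survival(horizon))
    for i in range(1, steps):
        total += survival(t0 + i * dt)
    return total * dt / survival(t0)

def exponential_curve(t):
    """Constant hazard 0.1 per unit time (hypothetical)."""
    return math.exp(-0.1 * t)

def weibull_curve(t):
    """Increasing hazard: shape 2, scale 10 (hypothetical)."""
    return math.exp(-((t / 10.0) ** 2))

exp_at_0 = expected_future_lifetime(exponential_curve, 0.0)
exp_at_5 = expected_future_lifetime(exponential_curve, 5.0)
wei_at_0 = expected_future_lifetime(weibull_curve, 0.0)
wei_at_10 = expected_future_lifetime(weibull_curve, 10.0)

print(exp_at_0, exp_at_5)    # both close to 1 / 0.1
print(wei_at_0, wei_at_10)   # shrinks as the species "ages"
```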

# The case for Mars

On Practical Ethics I have posted about the goodness of being multi-planetary: is it rational to try to settle Mars as a hedge against existential risk?

The problem is not that it is absurd to care about existential risks or the far future (which was the Economist’s unfortunate claim), nor that it is morally wrong to have a separate colony, but that there might be better risk reduction strategies with more bang for the buck.

One interesting aspect is that making space more accessible makes space refuges a better option. At some point in the future, even if space refuges are currently not the best choice, they may well become that. There are of course other reasons to do this too (science, business, even technological art).

So while existential risk mitigation right now might rationally aim at putting out the current brushfires and trying to set the long-term strategy right, doing the groundwork for eventual space colonisation seems to be rational.

# What makes a watchable watchlist?

Stefan Heck managed to troll a lot of people into googling “how to join ISIS”. Very amusing, and now a lot of people think they are on an NSA watchlist.

This kind of prank is of course why naive keyword-based watch lists are total failures. One prank and the list gets overloaded. I would be shocked if any serious intelligence agency actually used them for real. Given that people’s Facebook likes give pretty good predictions of who they are (indeed, better than many friends know them) there are better methods if you happen to be a big intelligence agency.

Still, while text and other online behavior signal a lot about a person, it might not be a great tool for making proper watchlists since there is a lot of noise. For example, this paper extracts personality dimensions from online texts and looks at civilian mass murderers. They state:

Using this ranking procedure, it was found that all of the murderers’ texts were located within the highest ranked 33 places. It means that using only two simple measures for screening these texts, we can reduce the size of the population under inquiry to 0.013% of its original size, in order to manually identify all of the murderers’ texts.

At first, this sounds great. But for the US, that means the watchlist for being a mass murderer would currently have 41,000 entries. Given that over the past 150 years there have been about 150 mass murders in the US, this suggests that the precision is not going to be that great – most of those people are just normal people. The base rate problem crops up again and again when trying to find rare, scary people.
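The arithmetic behind this can be written out as a short sketch. The US population figure of 318 million is my assumption, chosen because it reproduces the roughly 41,000 entries; the other inputs are the numbers quoted above. The precision is a best case, since it generously assumes every future mass murderer actually ends up on the list.

```python
def watchlist_size(population, flagged_fraction):
    """Number of people left after screening down to `flagged_fraction` of the population."""
    return population * flagged_fraction

def best_case_precision(true_positives_per_year, list_size):
    """Share of list entries that are real positives in a year, if none are missed."""
    return true_positives_per_year / list_size

us_population = 318_000_000       # assumed population figure
flagged_fraction = 0.013 / 100    # 0.013% from the quoted paper
murders_per_year = 150 / 150      # about 150 mass murders over about 150 years

size = watchlist_size(us_population, flagged_fraction)
precision = best_case_precision(murders_per_year, size)

print(round(size))   # about 41,000 entries
print(precision)     # a few in 100,000
```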

The deep problem is that there are not enough positive data points (the above paper used seven people) to make a reliable algorithm. The same issue cropped up with NSA’s SKYNET program – they also had seven positive examples and hundreds of thousands of negatives, and hence had massive overfitting (suggesting the Islamabad Al Jazeera bureau chief was a prime Al Qaeda suspect).

## Rational watchlists

The rare positive data point problem strikes any method, no matter what it is based on. Yes, looking at the social network around people might give useful information, but if you only have a few examples of bad people the system will now pick up on networks like the ones they had. This is also true for human learning: if you look too much for people like the ones that in the past committed attacks, you will focus too much on people like them and not enemies that look different. I was told by an anti-terrorism expert about a particular sign for veterans of Afghan guerrilla warfare: great if and only if such veterans are the enemy, but rather useless if the enemy can recruit others. Even if such veterans are a sizable fraction of the enemy the base rate problem may make you spend your resources on innocent “noise” veterans if the enemy is a small group. Add confirmation bias, and trouble will follow.

Note that actually looking for a small set of people on the watchlist gets around the positive data point problem: the system can look for them and just them, and this can be made precise. The problem is not watching, but predicting who else should be watched.

The point of a watchlist is that it represents a subset of something (whether people or stocks) that merits closer scrutiny. It should essentially be an allocation of attention towards items that need higher level analysis or decision-making. The U.S. Government’s Consolidated Terrorist Watch List requires nomination from various agencies, who presumably decide based on reasonable criteria (modulo confirmation bias and mistakes). The key problem is that attention is a limited resource, so adding extra items has a cost: less attention can be spent on the rest.

This is why automatic watchlist generation is likely to be a bad idea, despite much research. Mining intelligence to help an analyst figure out if somebody might fit a profile or merit further scrutiny is likely more doable. As long as analyst time is expensive it can easily be overwhelmed if something fills the input folder: HUMINT is less likely to do it than SIGINT, even if the analyst is just doing the preliminary nomination for a watchlist.

## The optimal Bayesian watchlist

One can analyse this in a Bayesian framework: assume each item has a value $x_i$ distributed as $f(x_i)$. The goal of the watchlist is to spend expensive investigatory resources to figure out the true values; say the cost is 1 per item. Then a watchlist of randomly selected items will have a mean value $V=E[x]-1$. Suppose a cursory investigation costing much less gives some indication about $x_i$, so that it is now known with some error: $y_i = x_i+\epsilon$. One approach is to select all items above a threshold $\theta$, making $V=E[x_i|y_i>\theta]-1$.

If we imagine that everything is Gaussian $x_i \sim N(\mu_x,\sigma_x^2), \epsilon \sim N(0,\sigma_\epsilon^2)$, then  $V=\int_\theta^\infty t \phi(\frac{t-\mu_x}{\sigma_x}) \Phi\left(\frac{t-\mu_x}{\sqrt{\sigma_x^2+\sigma_\epsilon^2}}\right)dt$. While one can ram through this using Owen’s useful work, here is a Monte Carlo simulation of what happens when we use $\mu_x=0, \sigma_x^2=1, \sigma_\epsilon^2=1$ (the correlation between x and y is 0.707, so this is not too much noise):

Note that in this case the addition of noise forces a far higher threshold than without noise (1.22 instead of 0.31). This is just 19% of all items, while in the noise-less case 37% of items would be worth investigating. As noise becomes worse the selection for a watchlist should become stricter: a really cursory inspection should not lead to insertion unless it looks really relevant.
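One reading that seems to reproduce these numbers (an assumption on my part, since the simulation itself is not shown here) is that the threshold is the break-even point where the mean value of a listed item, $E[x_i|y_i>\theta]-1$, reaches zero. For Gaussian $x$ and $\epsilon$ with $\mu_x=0$ this has the closed form $\frac{\sigma_x^2}{\sigma_y}\frac{\phi(a)}{1-\Phi(a)}-1$ with $a=\theta/\sigma_y$ and $\sigma_y^2=\sigma_x^2+\sigma_\epsilon^2$, so the threshold can be found by bisection rather than Monte Carlo. The function names are hypothetical.

```python
import math

def normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def normal_sf(z):
    """Upper tail probability 1 - Phi(z) of the standard normal."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def mean_value_above(theta, var_x=1.0, var_eps=0.0, cost=1.0):
    """E[x | y > theta] - cost for x ~ N(0, var_x), y = x + N(0, var_eps)."""
    sd_y = math.sqrt(var_x + var_eps)
    a = theta / sd_y
    return (var_x / sd_y) * normal_pdf(a) / normal_sf(a) - cost

def break_even_threshold(var_eps, lo=-5.0, hi=5.0):
    """Bisection for the theta where the mean listed value is zero."""
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if mean_value_above(mid, var_eps=var_eps) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def fraction_listed(theta, var_x=1.0, var_eps=0.0):
    """Share of all items whose noisy score exceeds theta."""
    return normal_sf(theta / math.sqrt(var_x + var_eps))

theta_clean = break_even_threshold(var_eps=0.0)
theta_noisy = break_even_threshold(var_eps=1.0)
print(theta_clean, fraction_listed(theta_clean, var_eps=0.0))
print(theta_noisy, fraction_listed(theta_noisy, var_eps=1.0))
```

Under this reading the noise-free threshold comes out near 0.30 with a bit under 40% of items listed, and the noisy one near 1.22 with about 19% listed, close to the simulated values.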

Here we used a mild Gaussian distribution. In terms of danger, I think people or things are more likely to be lognormally distributed, since danger is a product of many relatively independent factors. Using lognormal x and y leads to a situation where there is a maximum utility for some threshold. This is likely a problematic model, but clearly the shape of the distributions matters a lot for where the threshold should be.

Note that having huge resources can be a bane: if you build your watchlist from the top priority down as long as you have budget or manpower, the lower priority (but still above threshold!) entries will be more likely to be a waste of time and effort. The average utility will decline.

## Predictive validity matters more?

In any case, a cursory and cheap decision process is going to give so many so-so evaluations that one shouldn’t build the watchlist on it. Instead one should aim for a series of filters of increasing sophistication (and cost) to wash out the relevant items from the dross.

But even there there are pitfalls, as this paper looking at the pharma R&D industry shows:

We find that when searching for rare positives (e.g., candidates that will successfully complete clinical development), changes in the predictive validity of screening and disease models that many people working in drug discovery would regard as small and/or unknowable (i.e., an 0.1 absolute change in correlation coefficient between model output and clinical outcomes in man) can offset large (e.g., 10 fold, even 100 fold) changes in models’ brute-force efficiency.

Just like for drugs (an example where the watchlist is a set of candidate compounds), it might be more important for terrorist watchlists to aim for signs with predictive power of being a bad guy, rather than being correlated with being a bad guy. Otherwise anti-terrorism will suffer the same problem of declining productivity, despite ever more sophisticated algorithms.

# The hazard of concealing risk

Review of Man-made Catastrophes and Risk Information Concealment: Case Studies of Major Disasters and Human Fallibility by Dmitry Chernov and Didier Sornette (Springer).

I have recently begun to work on the problem of information hazards: when spreading true information is causing danger. Since we normally regard information as a good thing this is a bit unusual and understudied, and in the case of existential risk it is important to get things right at the first try.

However, concealing information can also produce risk. This book is an excellent series of case studies of major disasters, showing how the practice of hiding information helped make them possible, made them worse, and hindered rescue and recovery.

Chernov and Sornette focus mainly on technological disasters such as the Vajont Dam, Three Mile Island, Bhopal, Chernobyl, the Ufa train disaster, Fukushima and so on, but they also cover financial disasters, military disasters, production industry failures and concealment of product risk. In all of these cases there was plentiful concealment going on at multiple levels, from workers blocking alarms to reports being classified or deliberately mislaid to active misinformation campaigns.

When summed up, many patterns of information concealment recur again and again. They sketch out a model of the causes of concealment, with about 20 causes grouped into five major clusters: the external environment enticing concealment, risk communication channels blocked, an internal ecology stimulating concealment or ignorance, faulty risk assessment and knowledge management, and people having personal incentives to conceal.

The problem is very much systemic: having just one or two of the causative problems can be counteracted by good risk management, but when several causes start to act together they become much harder to deal with – especially since many corrode the risk management ability of the entire organisation. Once risks are hidden, it becomes harder to manage them (management, after all, is done through information). Conversely, they list examples of successful risk information management: risk concealment may be something that naturally tends to emerge, but it can be counteracted.

Chernov and Sornette also apply their model to some technologies they think show signs of risk concealment: shale energy, GMOs, real debt and liabilities of the US and China, and the global cyber arms race. They are not arguing that a disaster is imminent, but the patterns of concealment are a reason for concern: if they persist, they have potential to make things worse the day something breaks.

Is information concealment the cause of all major disasters? Definitely not: some disasters are just due to exogenous shocks or surprise failures of technology. But as Fukushima shows, risk concealment can make preparation brittle and handling the aftermath inefficient. There is also likely plentiful risk concealment in situations that will never come to attention because there is no disaster necessitating and enabling a thorough investigation. There is little to suggest that the examined disasters were all uniquely bad from a concealment perspective.

From an information hazard perspective, this book is an important rejoinder: yes, some information is risky. But lack of information can be dangerous too. Many of the reasons for concealment like national security secrecy, fear of panic, prevention of whistle-blowing, and personnel being worried about personally being held accountable for a serious fault are maladaptive information hazard management strategies. The worker not reporting a mistake is handling a personal information hazard, at the expense of the safety of the entire organisation. Institutional secrecy is explicitly intended to contain information hazards, but tends to compartmentalize and block relevant information flows.

A proper information hazard management strategy needs to take the concealment risk into account too: there is a risk cost of not sharing information. How these two risks should be rationally traded against each other is an important question to investigate.

# All models are wrong, some are useful – but how can you tell?

Our whitepaper about the systemic risk of risk modelling is now out. The topic is how the risk modelling process can make things worse – and ways of improving things. Cognitive bias meets model risk and social epistemology.

The basic story is that in insurance (and many other domains) people use statistical models to estimate risk, and then use these estimates plus human insight to come up with prices and decisions. It is well known (at least in insurance) that there is a measure of model risk due to the models not being perfect images of reality; ideally the users will take this into account. However, in reality (1) people tend to be swayed by models, (2) they suffer from various individual and collective cognitive biases that make their model usage imperfect and correlate their errors, and (3) the markets for models, industrial competition and regulation lead to fewer models being used than there could be. Together this creates a systemic risk: everybody makes correlated mistakes and decisions, which means that when a bad surprise happens – a big exogenous shock like a natural disaster or a burst of hyperinflation, or some endogenous trouble like a reinsurance spiral or financial bubble – the joint risk of a large chunk of the industry failing is much higher than it would have been if everybody had had independent, uncorrelated models. Cue bailouts or skyscrapers for sale.

Note that this is a generic problem. Insurance is just unusually self-aware about its limitations (a side effect of convincing everybody else that Bad Things Happen, not to mention seeing the rest of the financial industry running into major trouble). When we use models the model itself (the statistics and software) is just one part: the data fed into the model, the processes of building and tuning the model, how people use it in their everyday work, how the output leads to decisions, and how the eventual outcomes become feedback to the people involved – all of these factors are important parts in making model use useful. If there is no or too slow feedback people will not learn what behaviours are correct or not. If there are weak incentives to check errors of one type, but strong incentives for other errors, expect the system to become biased towards one side. It applies to climate models and military war-games too.

The key thing is to recognize that model usefulness is not something that is directly apparent: it requires a fair bit of expertise to evaluate, and that expertise is also not trivial to recognize or gain. We often compare models to other models rather than reality, and a successful career in predicting risk may actually be nothing more than good luck in avoiding rare but disastrous events.

What can we do about it? We suggest a scorecard as a first step: comparing oneself to some ideal modelling process is a good way of noticing where one could find room for improvement. The score does not matter as much as digging into one’s processes and seeing whether they have cruft that needs to be fixed – whether it is following standards mindlessly, employees not speaking up, basing decisions on single models rather than more broad views of risk, or having regulators push one into the same direction as everybody else. Fixing it may of course be tricky: just telling people to be less biased or to do extra error checking will not work, it has to be integrated into the organisation. But recognizing that there may be a problem and getting people on board is a great start.

In the end, systemic risk is everybody’s problem.

# Dampening theoretical noise by arguing backwards

Science has the adorable headline Tiny black holes could trigger collapse of universe—except that they don’t, dealing with the paper Gravity and the stability of the Higgs vacuum by Burda, Gregory & Moss. The paper argues that quantum black holes would act as seeds for vacuum decay, making metastable Higgs vacua unstable. The point of the paper is that some new and interesting mechanism prevents this from happening. The more obvious explanation that we are already in the stable true vacuum seems to be problematic since apparently we should expect a far stronger Higgs field there. Plenty of theoretical issues are of course going on about the correctness and consistency of the assumptions in the paper.

# Don’t mention the war

What I found interesting is the treatment of existential risk in the Science story and how the involved physicists respond to it:

Moss acknowledges that the paper could be taken the wrong way: “I’m sort of afraid that I’m going to have [prominent theorist] John Ellis calling me up and accusing me of scaremongering.”

Ellis is indeed grumbling a bit:

As for the presentation of the argument in the new paper, Ellis says he has some misgivings that it will whip up unfounded fears about the safety of the LHC once again. For example, the preprint of the paper doesn’t mention that cosmic-ray data essentially prove that the LHC cannot trigger the collapse of the vacuum – “because we [physicists] all knew that,” Moss says. The final version mentions it on the fourth of five pages. Still, Ellis, who served on a panel to examine the LHC’s safety, says he doesn’t think it’s possible to stop theorists from presenting such argument in tendentious ways. “I’m not going to lose sleep over it,” Ellis says. “If someone asks me, I’m going to say it’s so much theoretical noise.” Which may not be the most reassuring answer, either.

There is a problem here in that physicists are so fed up with popular worries about accelerator-caused disasters – worries that are often second-hand scaremongering that takes time and effort to counter (with marginal effects) – that they downplay or want to avoid talking about things that could feed the worries. Yet avoiding topics is rarely the best idea for finding the truth or looking trustworthy. And given the huge importance of existential risk even when it is unlikely, it is probably better to try to tackle it head-on than skirt around it.

# Theoretical noise

“Theoretical noise” is an interesting concept. Theoretical physics is full of papers considering all sorts of bizarre possibilities, some of which imply existential risks from accelerators. In our paper Probing the Improbable we argue that attempts to bound accelerator risks have problems due to the non-zero probability of errors overshadowing the probability they are trying to bound: an argument that there is zero risk is actually just achieving the claim that there is about 99% chance of zero risk, and 1% chance of some risk. But these risk arguments were assumed to be based on fairly solid physics. Their errors would be slips in logic, modelling or calculation rather than being based on an entirely wrong theory. Theoretical papers are often making up new theories, and their empirical support can be very weak.

An argument that there is some existential risk with probability P actually means that, if the probability that the argument is right is Q, there is risk with probability PQ plus 1-Q times whatever risk there is if the argument is wrong (which we can usually assume to be close to what we would have thought if there was no argument in the first place). Since the vast majority of theoretical physics papers never go anywhere, we can safely assume Q to be rather small, perhaps around 1%. So a paper arguing for P=100% isn’t evidence the sky is falling, merely that we ought to look more closely at a potentially nasty possibility that is likely to turn into a dud. Most alarms are false alarms.
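Written out as a tiny sketch, with a purely hypothetical prior of one in a billion standing in for “what we would have thought if there was no argument”:

```python
def risk_given_argument(p_claimed, q_argument_right, p_prior):
    """Overall risk estimate: P * Q + prior * (1 - Q)."""
    return p_claimed * q_argument_right + p_prior * (1.0 - q_argument_right)

# Hypothetical illustration: a paper claims certain doom (P = 1),
# we give it a 1% chance of being right, and our prior risk was 1e-9.
estimate = risk_given_argument(p_claimed=1.0, q_argument_right=0.01, p_prior=1e-9)

print(estimate)  # dominated by P * Q, so about 0.01
```

The estimate is set almost entirely by Q: it is far above the prior, which is why the possibility deserves a closer look, yet far below the claimed P.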

However, it is easier to generate theoretical noise than resolve it. I have spent some time working on a new accelerator risk scenario, “dark fire”, trying to bound the likelihood that it is real and threatening. Doing that well turned out to be surprisingly hard: the scenario was far more slippery than expected, so ruling it out completely proved very difficult (don’t worry, I think we amassed enough arguments to show the risk to be pretty small). This is of course the main reason for the annoyance of physicists: it is easy for anyone to claim there is risk, but then it is up to the physics community to do the laborious work of showing that the risk is small.

The vacuum decay issue has likely been dealt with by the Tegmark and Bostrom paper: were the decay probability high we should expect to be early observers, but we are fairly late ones. Hence the risk per year in our light-cone is small (less than one in a billion). Whatever is going on with the Higgs vacuum, we can likely trust it… if we trust that paper. Again we have to deal with the problem of an argument based on applying anthropic probability (a contentious subject where intelligent experts disagree on fundamentals) to models of planet formation (based on elaborate astrophysical models and observations): it is reassuring, but it does not reassure as strongly as we might like. It would be good to have a few backup papers giving different arguments bounding the risk.

# Backward theoretical noise dampening?

The lovely property of the Tegmark and Bostrom paper is that it covers a lot of different risks with the same method. In a way it handles a sizeable subset of the theoretical noise at the same time. We need more arguments like this. The cosmic ray argument is another good example: it is agnostic on what kind of planet-destroying risk is perhaps unleashed from energetic particle interactions, but given the past number of interactions we can be fairly secure (assuming we patch its holes).

One shared property of these broad arguments is that they tend to start with the risky outcome and argue backwards: if something were to destroy the world, what properties does it have to have? Are those properties possible or likely given our observations? Forward arguments (if X happens, then Y will happen, leading to disaster Z) tend to be narrow, and depend on our model of the detailed physics involved.

While the probability that a forward argument is correct might be higher than the more general backward arguments, it only reduces our concern for one risk rather than an entire group. An argument about why quantum black holes cannot be formed in an accelerator is limited to that possibility, and will not tell us anything about risks from Q-balls. So a backwards argument covering 10 possible risks but just being half as likely to be true as a forward argument covering one risk is going to be more effective in reducing our posterior risk estimate and dampening theoretical noise.
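A toy version of this comparison follows. It assumes ten hypothetical risks with equal, small prior probabilities that simply add up, and that an argument, if correct, rules out every risk it covers. The risk names, priors, and correctness probabilities are made up for illustration.

```python
def expected_residual_risk(prior_risks, covered, q_correct):
    """Expected total risk left after an argument that is right with probability
    `q_correct` and, if right, rules out every risk named in `covered`."""
    total = 0.0
    for name, risk in prior_risks.items():
        if name in covered:
            total += risk * (1.0 - q_correct)
        else:
            total += risk
    return total

# Ten hypothetical risks, each with prior probability 1e-6 (assumed values).
prior_risks = {f"risk_{i}": 1e-6 for i in range(10)}

# Forward argument: sharp, but covers a single risk.
forward = expected_residual_risk(prior_risks, {"risk_0"}, q_correct=0.8)

# Backward argument: half as likely to be right, but covers all ten.
backward = expected_residual_risk(prior_risks, set(prior_risks), q_correct=0.4)

print(forward, backward)
```

Here the forward argument removes 0.8 of one risk while the backward one removes 0.4 of each of ten, so the weaker but broader argument leaves less total risk.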

In a world where we had endless intellectual resources we would of course find the best possible arguments to estimate risks (and then for completeness and robustness the second best argument, the third, … and so on). We would likely use very sharp forward arguments. But in a world where expert time is at a premium and theoretical noise high we can do better by looking at weaker backwards arguments covering many risks at once. Their individual epistemic weakness can be handled by making independent but overlapping arguments, still saving effort if they cover many risk cases.

Backwards arguments also have another nice property: they help dealing with the “ultraviolet cut-off problem”. There is an infinite number of possible risks, most of which are exceedingly bizarre and a priori unlikely. But since there are so many of them, it seems we ought to spend an inordinate effort on the crazy ones, unless we find a principled way of drawing the line. Starting from a form of disaster and working backwards on probability bounds neatly circumvents this: production of planet-eating dragons is among the things covered by the cosmic ray argument.

Risk engineers will of course recognize this approach: it is basically a form of fault tree analysis, where we reason about bounds on the probability of a fault. The forward approach is more akin to failure mode and effects analysis, where we try to see what can go wrong and how likely it is. While fault trees cannot cover every possible initiating problem (all those bizarre risks) they are good for understanding the overall reliability of the system, or at least the part being modelled.

Deductive backwards arguments may be the best theoretical noise reduction method.