What did I learn about the singularity during our track at ECAP10? Anna Salamon pressed me into trying to answer this good question. First, an overview of the proceedings:
Amnon Eden started out by trying to clear up the concepts, or at least show where concepts need to be cleared up. His list of central questions included:
Looking through evidence from theoretical computer science, real computer science, biology and human communities, her conclusion was that intelligence is at least somewhat intelligible: it doesn't seem to rely just on accumulating domain-specific tricks, but appears to have a few general and likely relatively simple modules that are extendible. Overall, a good start. As she said, we now need to look at whether we can formally quantify the question and gather more evidence. It actually looks possible.
She made the point that many growth curves (in technology) look continuous rather than stairstep-like, which suggests they are due to progress on unintelligible systems (an accumulation of many small hacks). It might also be that systems have an intelligibility spectrum: modules on different levels are differently difficult, and while there might be smooth progress on one level, other levels might be resistant (for example, neurons are easier to figure out than cortical microcircuits). This again has bearing on the whole brain emulation (WBE) problem: at what level do intelligibility, the ability to get the necessary data, and having enough computing power first intersect? Depending on where that happens, the result might be very different (Joscha Bach and I had a big discussion on whether 'generic brains'/brain-based AI (Joscha's view) or individual brains (my view) would be the first outcome of WBE, with professor Günther Palm arguing for brain-inspired AI).
Joscha Bach argued that there were four preconditions for reaching an AI singularity:
His key claim was that these functional requirements are orthogonal to the architecture of actual implementations, and hence an AI singularity is not automatically a consequence of having AI.
I think this claim is problematic: 1 (and maybe parts of 3) is essentially implied by any real progress in AI. But I think he clarified a set of important assumptions, and if all these preconditions are necessary then it is enough for one of them to fail for an AI singularity not to happen. Refining this a bit further might be really useful.
He also made a very important point that is often overlooked: the threat/promise is not tied to the implementation but is a functional one. We should worry about self-improving, self-extending intelligent agents pursuing non-human agendas. Many organisations come close. Just because they are composed of humans doesn't mean they work in the interests of those humans. I think he is right on the money that we should watch for the possibility of an organisational singularity, especially since AI or other technology might provide further enhancement of the preconditions above even when the AI itself is not enough to go singular.
Kaj Sotala talked about factors that give a system high "optimization power"/intelligence. Calling it optimization power has the benefit of discouraging anthropomorphizing, but it might miss some of the creative aspects of intelligence. He categorised them into: 1) hardware advantages: faster serial processing, faster parallel processing, superior working memory equivalents; 2) self-improvement and architectural advantages: the ability to modify itself, overcome biased reasoning, use algorithms for formally correct reasoning, and add new modules (such as fully integrated complex models); 3) software advantages: copyability, improved communication bandwidth, speed, etc. Meanwhile humans have various handicaps, ranging from our clunky hardware to our tendency to model others by modelling them on ourselves. So there are good reasons to think an artificial intelligence, if it came into existence, could achieve various optimization/intelligence advantages over humans relatively easily. Given the previous talk, his list is also interestingly functional rather than substrate-based. He concluded: "If you are building an AI, please be careful, please try to know what you are doing". Which nicely segues into the next pair of talks:
Joshua Fox argued that superintelligence does not imply benevolence. We cannot easily extrapolate nice moral trends among humans or human societies to (self)constructed intelligences with different cognitive architectures. He argued that instrumental morality is not guaranteed: reputations, the powers to monitor, punish and reward each other, and the economic incentives to cooperate are not reliable arguments for proving the benevolence of this kind of agent. Axiological morality depends on what the good truly is. Kant would no doubt argue that any sufficiently smart rational mind will discover moral principles making it benevolent. But looking at AIXI (an existence proof of superintelligence) suggests that there is no "room" for benevolence as a built-in terminal value - to get a benevolent AIXI you need to put benevolence into the utility function you plug in. A non-benevolent AIXI will not become benevolent unless it suits its overall goal, and it will not change its utility function to become more benevolent, since preservation of goals is a stable equilibrium. Simple goals are too simple to subsume human welfare, while overly complex goals are unlikely to be benevolent. Nor does reflective equilibrium obviously lead to benevolence.
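To make the AIXI point concrete (my gloss, roughly in Hutter's notation, not part of Joshua's slides): the agent picks actions by

\[ a_k := \arg\max_{a_k} \sum_{o_k r_k} \cdots \max_{a_m} \sum_{o_m r_m} \big( r_k + \cdots + r_m \big) \sum_{q \,:\, U(q,\, a_1 \ldots a_m) = o_1 r_1 \ldots o_m r_m} 2^{-\ell(q)} \]

Everything the agent values enters only through the rewards r_i; the rest is a fixed prediction-and-maximization scheme, so benevolence can only come in through that reward/utility channel.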
He concluded that if dispositions can be modified and verified, then weak AI can credibly commit to benevolence. If benevolent terminal values are built in, then the AI will fight to protect them. But if AI advances abruptly, and does not have benevolence built in from the start, then benevolence does not follow from intelligence. We need to learn benevolence engineering and apply it.
I personally think instrumental morality is more reliable and robust than Joshua gave it credit for (e.g. because of things like comparative advantage), but this of course needs more investigation and might be contingent on factors such as whether fast intelligence explosions produce discontinuous intelligence distributions. Overall, it all links back to the original question: are intelligence explosions fast and abrupt? If they aren't, then benevolence can likely be achieved (if we are lucky) through instrumental means, the presence of some AIs engineered to be benevolent, and the "rearing" of AI within the present civilization. But if they are sharp, then benevolence engineering (a replacement for 'friendliness theory'?) becomes essential - yet there are no guarantees it will be applied to the first systems to really improve themselves.
Mark R. Waser had a different view: he claims superintelligence implies moral behavior. Basically he argues that the top goal we have is "we want what we want". All morality is instrumental to this, and human morality is simply imperfect because it has evolved from emotional rules of thumb. Humans have a disparity of goals, but a reasonable consensus on the morality of most actions (then ethicists and other smart people come in and confuse us about why, gaining social benefits). Basically ethics is instrumental, and cooperation gives us what we want while not preventing others from getting what they want. The right application of game theory, à la the iterated prisoner's dilemma, leads to a universal consistent goal system, and this will be benevolent.
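As a toy illustration of the iterated prisoner's dilemma point (my own sketch, not from Mark's talk; the payoffs and population are conventional assumptions), reciprocating cooperators do well and unconditional defectors undermine themselves once enough reciprocators are around:

```python
# Minimal iterated prisoner's dilemma tournament.
# Payoffs: mutual cooperation 3/3, mutual defection 1/1,
# defecting against a cooperator 5/0.
import itertools

PAYOFF = {('C', 'C'): (3, 3), ('C', 'D'): (0, 5),
          ('D', 'C'): (5, 0), ('D', 'D'): (1, 1)}

def tit_for_tat(my_history, their_history):
    # Cooperate first, then copy the opponent's previous move.
    return their_history[-1] if their_history else 'C'

def always_defect(my_history, their_history):
    return 'D'

def always_cooperate(my_history, their_history):
    return 'C'

def play(strategy_a, strategy_b, rounds=200):
    hist_a, hist_b, score_a, score_b = [], [], 0, 0
    for _ in range(rounds):
        move_a = strategy_a(hist_a, hist_b)
        move_b = strategy_b(hist_b, hist_a)
        pay_a, pay_b = PAYOFF[(move_a, move_b)]
        score_a += pay_a
        score_b += pay_b
        hist_a.append(move_a)
        hist_b.append(move_b)
    return score_a, score_b

# Round-robin tournament over a small population with several reciprocators.
population = [('TFT-1', tit_for_tat), ('TFT-2', tit_for_tat),
              ('TFT-3', tit_for_tat), ('AllD', always_defect),
              ('AllC', always_cooperate)]
totals = {name: 0 for name, _ in population}
for (name_a, strat_a), (name_b, strat_b) in itertools.combinations(population, 2):
    score_a, score_b = play(strat_a, strat_b)
    totals[name_a] += score_a
    totals[name_b] += score_b

for name, total in sorted(totals.items(), key=lambda kv: -kv[1]):
    print(name, total)
```

With three tit-for-tat players, one unconditional defector and one unconditional cooperator, the reciprocators end up with the highest totals - the usual Axelrod-style result, though of course it says nothing by itself about agents powerful enough not to need repeated interaction.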
I think he crossed the is-ought line a few times here. Overall, it was based more on assertion than a strict argument, although I think it can be refined into one. It clearly shows that the AI friendliness problem is getting to the stage where professional ethicists would be helpful - the hubris of computer scientists is needed to push forward into these tough metaethical matters, but it would help if the arguments were refined by people who actually know what they are doing.
Carl Shulman and I gave a talk about hardware vs. software as the bottleneck for intelligence explosions; see the previous posting for details. Basically we argued that if hardware is the limiting factor we should see earlier but softer intelligence explosions than if software is hard to do, in which case we should see later, less expected and harder takeoffs.
There is an interesting interaction between intelligibility and these dynamics. Non-intelligible intelligence requires a lot of research and experimentation, slowing down progress and requiring significant amounts of hardware: a late breakthrough, but likely a slower takeoff. If intelligence is intelligible it does not follow that it is easy to figure out; in that case a late, sharp transition is likely. Easy intelligence, on the other hand, gives us an early, soft hardware scenario. So maybe unintelligible intelligence is a way to get a late soft singularity, and evidence for it (such as signs that there are just lots of evolved modules rather than overarching neural principles) should be seen as somewhat reassuring.
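A crude toy model of the basic dynamic (mine, far simpler than the model in our talk; all the numbers are assumptions):

```python
# Toy sketch of the hardware- vs software-bottleneck intuition. Hardware grows
# exponentially; AI capability is zero until the software problem is solved,
# and then uses whatever hardware has accumulated by that date.
def hardware(year, doubling_time=1.5):
    # Assumed Moore's-law-style growth, normalised to 1.0 in 2010.
    return 2 ** ((year - 2010) / doubling_time)

def capability(year, software_ready_year):
    return hardware(year) if year >= software_ready_year else 0.0

# Scenario A: software is easy (ready by 2015) -> early takeoff that then
# tracks hardware growth smoothly.
# Scenario B: software is hard (ready by 2040) -> late takeoff starting from
# a large hardware overhang, i.e. a much more abrupt jump.
for year in range(2010, 2051, 5):
    print(year,
          round(capability(year, software_ready_year=2015), 1),
          round(capability(year, software_ready_year=2040), 1))
```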
Scott Yim approached the whole subject from a future studies/sociology angle. According to "the central dogma of future studies" there is no future; we construct it. Many images of the future are a reaction to one's views on the uncertainty in life: the future might be like a rollercoaster - no control, moving in a deterministic manner (Kurzweil was mentioned); like rafting - some control along a circumscribed path (Platt); or like sailing - ultimate control over where one is going (Ian Pearson).
While the talk itself didn't have much content, I think Scott was right in bringing up the issue: what do our singularity models tell us about ourselves? And what kind of constructions do they suggest? In many ways singularity studies is about the prophecy that the human condition can be radically transformed, taking the narrower individualistic view of transhumanism (where individuals can get transformed) and extending it to how the whole human system gets transformed. It might be implicit in posing the whole concept that we think singularities are in a sense desirable, even though they are also risky, and that this should guide our actions.
I also think it is important for singularity studies to try to find policy implications of its results. Even if the field were entirely non-normative, it should (in order to be relevant) bring us useful information about our options and their likely impact. Today most advice is of course of the "more research is needed" type, but at least meetings like this show that we are beginning to figure out *where* this research would be extra important.
The big thing was that it actually looks like one could create a field of "acceleration studies", dealing with Amnon's questions in an intellectually responsible manner. Previously we have seen plenty of handwaving, but some of that handwaving has now been distilled through internal debate and some helpful outside action to the stage where real hypotheses can be stated, evidence collected and models constructed. It is still a pre-paradigmatic field, which might be a great opportunity - we do not have any hardened consensus on How Things Are, but plenty of potentially useful disagreements.
Intelligibility seems to be a great target for study, together with a more general theory of how technological fields can progress. I got some ideas that are so good I will not write about them until I have tried to turn them into a paper (or blog post).
One minor realization was to recall Amdahl's law, which had really slipped my mind but seems quite relevant for updates of our paper. Overall, the taxonomies of preconditions and optimization power increases, as well as Carl's analysis of 'serial' versus 'parallel' parts of technology development, suggest that this kind of analysis could be extended to look for bottlenecks in singularities: they will at the very least be dominated by the least 'parallelisable' element of the system - which might very well be humans in society-wide changes.
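For reference, Amdahl's law: if a fraction p of a process can be sped up by a factor s while the remaining fraction 1 - p stays at its old pace, the overall speedup is

\[ S = \frac{1}{(1 - p) + p/s} \;\le\; \frac{1}{1 - p}. \]

Applied loosely to the point above: even with arbitrarily fast machine components, a transition that still routes, say, 10% of its steps through human-paced institutions can run at most about ten times faster overall.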
Posted by Anders3 at October 9, 2010 10:20 AM