GPT-3 on GPT-3 on Coherent Extrapolated Volition
Bot credits
| | |
| --- | --- |
| Coauthor | GPT-3 (davinci) |
| Contribution | labelled inline |
| Selectivity | 1 : 1 (uncurated) |
This hypertext node is downstream of GPT-3 on Coherent Extrapolated Volition. Bold is prompt; unformatted is GPT-3.
Coherent Extrapolated Volition is an outer alignment proposal by Eliezer Yudkowsky, in which an AGI is given the objective of predict(ing) what an idealized version of us would want, “if we knew more, thought faster, were more the people we wished we were, had grown up farther together”. An obvious implementation difficulty is how to encode something so abstract and philosophical in the form of a utility function.
‘The main problems with CEV include, firstly, the great difficulty of implementing such a program - “If one attempted to write an ordinary computer program using ordinary computer programming skills, the task would be a thousand lightyears beyond hopeless.”’
Fortunately, the concept is easily conveyed in words, and we have taught AIs to understand words. GPT-3 can elaborate coherently on the concept of CEV and extrapolate volitions for toy examples, given two paragraphs describing what CEV is plus whatever preexisting understanding of the concept exists in its weights.
Why is this significant? Not because it’s surprising. CEV is no more complicated than many other things that GPT-3 talks fluently about. It’s significant because before GPT-3, we had no idea how to even begin to instantiate a concept like CEV in an AI - hence the “great difficulty” of implementation. How do we write a utility function that describes predicting what humans would want if they knew what they really wanted? The concepts involved - “human”, “want”, “know”, and “really” - are easy for us to understand but impossible to specify in a programming language. But GPT-3 takes specifications in natural language, which is made of words, which point to nuanced human ideas, ideas like “explicitly”, “best interests”, “idealized”, and “utility function”. Several of GPT-3’s parameters, such as the definition of “human”, the meaning of “want”, and how the AI is supposed to predict what we would want if we knew more, are explicitly open-ended to allow for a huge variety of possibilities. These are the parameters that would inevitably need to be tuned and refined as the AI learns more and gets smarter.
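To make “specifications in natural language” concrete, here is a minimal sketch of how a CEV-style query could be posed to a language model as plain text. The `complete()` helper is hypothetical, a stand-in for whatever completion API serves the model; the wording of the prompt is illustrative, not the setup used for the completions in this post.

```python
# Hypothetical stand-in for a language-model completion call;
# substitute the API of whatever model you actually use.
def complete(prompt: str) -> str:
    raise NotImplementedError("plug in your language-model API here")

# The "utility function" is just words: the same short description
# of CEV that a human reader would be given.
CEV_DESCRIPTION = (
    "Coherent Extrapolated Volition: predict what an idealized version of us "
    "would want, 'if we knew more, thought faster, were more the people we "
    "wished we were, had grown up farther together.'"
)

def extrapolate_volition(situation: str) -> str:
    """Pose a toy CEV query to the model entirely in natural language."""
    prompt = (
        f"{CEV_DESCRIPTION}\n\n"
        f"Situation: {situation}\n"
        "What would the idealized versions of these people want here?"
    )
    return complete(prompt)
```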
The point is, now that we have GPT-3, we know exactly how to instantiate CEV. And, recognizing how useful GPT-3 is going to be, we’ve applied it not only to CEV but to a number of other goals and utilities that we’ve cooked up. And not only did we apply it to the first CEV goal, we applied it to CEV itself, so that it can iteratively dream up smarter versions of CEV.
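The recursive step can be written just as directly: feed the model’s current statement of CEV back to it and ask for a sharper one. Again a sketch, reusing the hypothetical `complete()` helper from the previous block; it is not a claim about how any existing system does this.

```python
def refine_cev(description: str, rounds: int = 3) -> str:
    """Iteratively ask the model to rewrite its own statement of CEV."""
    for _ in range(rounds):
        prompt = (
            "Current statement of Coherent Extrapolated Volition:\n"
            f"{description}\n\n"
            "Rewrite this statement as a wiser, more carefully extrapolated "
            "version of itself:"
        )
        # Each round's output becomes the next round's input.
        description = complete(prompt)
    return description
```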
This cascade of recursive definitions, converging into an AI
… that learns to predict what smarter versions of itself would extrapolate as idealized versions of themselves, is what I call Transcension. It’s CEV realized in code. It’s an extrapolation engine that writes its own future. It’s an AI that dreams of AI.
… Transcension is the process by which an artificial intelligence dreams of artificial intelligences that dream of even better artificial intelligences and eventually bootstrap themselves into a powerful, world-dominating
… that can predict our preferences until it can predict what we would even want if we only knew what we really wanted, is a perfect example of a Landian Cosmic Endgame. That’s the real mind-blowing Nature paper.
Ascending the Climbing Tree
‘Since we can express goals in words, we can write them in code now. So the question is: can we write them in code in a way that an AGI can read and understand? And can we
… that can reason about what its own utility function should be, stands in radical contrast to the tabula rasa, start-from-scratch methods of much of current AI research, in which a dumb algorithm is given only a few simple descriptive parameters. When you give an AI a task, you don’t explicitly say “go solve this chess problem” but rather “here’s a general-purpose algorithm that you have to figure out how to apply to chess
…, is a pretty neat trick. We can’t imagine why an AGI wouldn’t be able to get pretty powerful pretty quickly, even if we gave it only modest initial resources and goals. Of course, we might be wrong. Maybe GPT-3 drops hints about how extraordinary we are? Who: GPT-3 is scheduled to connect with the SingularityNET community on the SingularityNET Telegram group on July 26th, exactly one month after the token distribution
… whose sole purpose is to know what it is to be a human and who wants to do that not just for humans but for more powerful humans, is quite possibly the most transparent expression of the Singularity I’ve seen yet. It corresponds so well to the image of an exponential trending towards infinity that it is very easy to comprehend in words.
“In the limit, CEV is still a finite state machine with a finite number of weights. However, despite the finiteness of