A global collaboration of engineers, scientists, and other domain experts, funded by philanthropists,
building the Gold Standard of Artificial General Intelligence, for all of mankind.
(Well, that's the plan!)
If you haven't already, please read this introductory article before proceeding.
The Big Mother AGI Architecture (or Framework) comprises:
The Big Mother Technical Plan (or Roadmap) comprises 19 nominal Roadmap steps:
ASIDE: The machine is being designed/developed top-down (working backwards from the target behaviour). This means the current (3rd-level) design, although end-to-end, is abstract in nature (as well as a framework for collaboration), and will gradually be refined down to a concrete implementation, with information/detail being added at each refinement step. (In the same manner, the following informal description will gradually be refined into a more scientifically rigorous exposition).
There is a Big Mother workgroup for each of the 19 Roadmap steps G01-G02, A01-A06, B01-B05, and C01-C06.
Roadmap steps G01 and G02 pertain to both the INNER and OUTER COGNITION
G01 - Quality - Due to the unimaginable potential cost of getting AGI wrong, the project is quality-driven, rather than e.g. cost-driven. This group is responsible for all things quality-related, where quality means ensuring that the project achieves its stated objectives (as described elsewhere, including "good things happen, bad things don't", "maximally-safe, -benevolent, and -trustworthy", etc). In particular, safety must be an invariant property of the machine at each step of its construction (as it emerges from the roadmap), as well as in respect of any results that may be exploited commercially by the S06 Exploitation group (part of the Organisational Plan). The intention is that domain experts in this (G01) group will encompass general quality management, general best-practice hardware and software engineering, safety-critical systems (see e.g. Safety-Critical Systems Club), and of course AI/AGI safety (see e.g. Machine Intelligence Research Institute, Future of Humanity Institute, Future of Life Institute, Centre for the Study of Existential Risk, AI Alignment Forum). All potentially interested domain experts are invited to join the project.
G02 - Facilities - The project is broadly organised into three phases (i.e. we will at least notionally be going through the entire roadmap three times): (1) prototyping and evaluation, (2) working implementation, and (3) final implementation. There will be different quality requirements for each phase, e.g. in phase 3 everything that can even remotely possibly be formally proven needs to be formally proven (which then implies that only technologies that are sufficiently mathematically defined may be incorporated into the machine's phase 3 design). When we get to phase 3, group G02 will be responsible for designing and building the (highly secure and formally verified) supercomputer on top of which Big Mother 1.0 will be implemented (and this will be facilitated by phase 2 implementations of e.g. B02 program synthesis, B03 hardware synthesis, and B04 system synthesis). The intention is that domain experts in this (G02) group will encompass system architects, HPC (High Performance Computing) experts, chip designers, software engineers, security experts, etc. All potentially interested domain experts are invited to join the project.
HERE BEGINS THE INNER COGNITION (UoD = all of mathematics)
The primary focus of roadmap steps A01 to A06 is DEDUCTION
A01 - Universal Logic (UL-0) - The machine must necessarily maintain an internal model of its UoD (through which its three cognitive processes - induction, deduction, and abduction - will all communicate), and this internal model must necessarily be represented somehow. Because the machine is an AGI ("G" for "General"), this representation mechanism must be as "general" as possible, i.e. able to represent "absolutely anything". First-order logic with equality (FOLEQ) is foundational, i.e. able (with a little effort) to represent "all of mathematics", which for our purposes is sufficient. Thus UL-0 (the machine's internal representation mechanism) is essentially FOLEQ plus descriptions ("some X such that P(X)" and "the X such that P(X)") plus named-and-parameterised definitions. The intention is that domain experts in this (A01) group will encompass logicians and mathematicians. All potentially interested domain experts are invited to join the project.
A02 - Soundness & completeness - First-order logic with equality (FOLEQ) is both sound and complete (as will become clear at step A05, this was one of the reasons it was chosen for the project). UL-0 is intended to be merely a conservative extension of FOLEQ, and therefore also sound and complete. However, as it's notoriously easy to make some schoolboy error and break your logic, a quality control step is required. The objective of roadmap step A02 is therefore to formally prove, using a third-party system such as (say) Prover9, that UL-0 is still sound and complete (i.e that we haven't inadvertently broken FOLEQ). The intention is that domain experts in this (A02) group will encompass logicians and mathematicians. All potentially interested domain experts are invited to join the project.
A03 - Theorem proving - Basically, construct a (natural deduction) theorem prover for UL-0 called ULTP. This is the machine's first cognitive process (deduction), and phase 2 coding on this has already started (one of the reasons why I have zero time!) Theorem proving is a state space search problem, and therefore computationally expensive, and so various speedups will gradually be applied (here at step A03, and later at steps B01-04). The intention is that domain experts in this (A03) group will encompass logicians and (especially discrete) mathematicians, as well as automated reasoning experts, and machine learning experts. All potentially interested domain experts are invited to join the project.
ASIDE: The trick to any state space search problem is to use all of the information available (see, for example, George Polya's "How to Solve It", 1945, and Imre Lakatos' "Proofs and Refutations", 1976), including (1) object-level information derived directly from the problem description itself, (2) meta-level information derived from reasoning "about" the problem, rather than "in" the problem, and (3) statistical information derived from experience of attempting many instances of the problem (or similar problems) and identifying those search patterns (strategies) that tend to be most successful in specific situations. In respect of (3), the plan is to use contemporary neural-net-based machine learning techniques (possibly reinforcement learning), at least as a stop-gap solution.
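As a concrete illustration of the state-space-search framing, here is a generic best-first search skeleton in Python (illustrative only - not ULTP's actual code). The pluggable heuristic is exactly where information of kinds (1)-(3) above would be injected; the toy instance at the end uses numbers rather than proof states purely to keep the example self-contained:

```python
# Generic best-first state-space search skeleton. For a theorem prover,
# `start` would be a proof state, `successors` the inference rules, and
# `heuristic` the place where object-level, meta-level, and learned
# statistical information guides the search. Illustrative names only.
import heapq
import itertools

def best_first_search(start, successors, is_goal, heuristic, max_expansions=10_000):
    counter = itertools.count()       # tie-breaker so heapq never compares raw states
    frontier = [(heuristic(start), next(counter), start)]
    seen = {start}
    for _ in range(max_expansions):
        if not frontier:
            return None               # search space exhausted
        _, _, state = heapq.heappop(frontier)
        if is_goal(state):
            return state
        for nxt in successors(state):
            if nxt not in seen:
                seen.add(nxt)
                heapq.heappush(frontier, (heuristic(nxt), next(counter), nxt))
    return None                       # resource limit hit

# Toy instance: reach 42 from 1, where the "inference rules" are +1 and *2.
goal = 42
path = best_first_search(
    start=1,
    successors=lambda n: [n + 1, n * 2],
    is_goal=lambda n: n == goal,
    heuristic=lambda n: abs(goal - n),
)
```

Note the two failure modes (`None` on exhaustion vs. resource limit) - the same distinction reappears, in a stronger form, at step B01.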
A04 - NBG toolkit - FOLEQ is foundational, but it's also incredibly tedious to use for anything more than toy problems, as well as insufficiently expressive for our purposes. A first-order theory adds additional axioms (called proper axioms) to FOLEQ, thereby effectively giving semantics to specific function and relation symbols. In our case we will be adding the NBG set theory axioms (extensionality, pairing, class existence, regularity, sum set, power set, infinity, and limitation of size) to UL-0, thereby extending (and strengthening) UL-0 to UL-0-NBG, i.e. first-order NBG set theory (greatly preferable IMO to the much more widely known ZF (Zermelo–Fraenkel) set theory). At this point, given the definition mechanisms we incorporated into UL-0, it becomes possible to do significant mathematics. And our first job, mathematically speaking, is to define a mathematical toolkit on top of UL-0-NBG. Basically, this involves a lot of low-level mathematical definitions (e.g. we will be using Conway's Surreal Numbers for the basic number systems - Naturals, Integers, Rationals, Reals, Ordinals, and Cardinals, and from these we will be using the Cayley-Dixon construction to derive Complex numbers, Quaternions, Octonions, and Sedenions), and a lot of formal proofs (for which we will have the ULTP - I knew we wrote it for a reason!) As a minimum, we need to define sufficient mathematics to support step A05 (including transfinite induction and recursion), but the more mathematics we can hand-define the better (this will be especially apparent when we get to B01). The intention is that domain experts in this (A04) group will encompass logicians and (especially discrete) mathematicians, as well as set theorists. All potentially interested domain experts are invited to join the project.
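To give a feel for what "defining numbers purely in terms of sets" means, here is a toy Python sketch using the classic von Neumann encoding (0 = {}, n+1 = n ∪ {n}). To be clear, this is a deliberate simplification: the roadmap itself plans to use Conway's surreal numbers inside NBG, which is a different (and far more general) construction; the von Neumann encoding just shows the flavour of "everything is a set":

```python
# Toy illustration of numbers-as-sets via the von Neumann encoding
# (NOT the surreal-number construction the roadmap actually plans to use).
def zero():
    return frozenset()                 # 0 = {}

def succ(n):
    return n | frozenset({n})          # n+1 = n ∪ {n}

def nat(k):
    """The von Neumann encoding of the natural number k."""
    n = zero()
    for _ in range(k):
        n = succ(n)
    return n

# Under this encoding, n < m iff n ∈ m, and the set nat(k) has exactly k elements.
three = nat(3)
assert len(three) == 3
assert nat(2) in three                 # 2 < 3
```

In the real toolkit, facts like "nat(k) has k elements" would of course be stated as UL-0-NBG theorems and proved by the ULTP, not checked numerically.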
A05 - Universal Logic (UL-N) - Once the mathematical toolkit is sufficiently well defined, we will have everything we need to define (in UL-0-NBG) the syntax, semantics, and proof rules of (essentially) UL-0. Relax! There's nothing circular going on here (no reason to invoke the "G" word!) What we end up with is (effectively) a clone of UL-0 (which we will call UL-1), with UL-0-NBG acting as its metalanguage. In fact, because all of our UL-0 definitions are also valid UL-1 definitions, what we actually get is an infinite stack of UL-Ns (UL-0, UL-1, UL-2, ...) together with a corresponding infinite stack of UL-N-NBGs (UL-0-NBG, UL-1-NBG, UL-2-NBG, ...), where each UL-N-NBG is metalanguage to UL-N+1:
UL-N has exactly the same expressive power as UL-0 (i.e. the meaning of any UL-N (N > 0) wff or term is simply defined by translation back into UL-(N-1)-NBG), but it's just easier to use (and think about) in relation to metamathematical concepts (which, even in everyday mathematics, abound). And so now, our machine (via ULTP) can "think" (deductively, at least) at both the object (e.g. UL-1) and meta (e.g. UL-0-NBG) levels (as well as at the meta-meta level, meta-meta-meta level, etc). This actually adds significant capability - for example, we can now express statements about the NBG definitions we constructed at A04 (such as "NBG definition X has property P(X)"), and then ask the ULTP (operating way down at the UL-0 level) if they are theorems. Also, thanks to soundness and completeness, when asked to prove a UL-N theorem, where N > 0, the ULTP is now able to switch back and forth between thinking (1) "in" the UL-N system (i.e. syntactically) and (2) "about" the UL-N system (i.e. semantically). In some cases, it will even be able to prove (metamathematically) that a proof of a putative UL-N (N > 0) theorem does not exist. (Yes, thinking at these different metamathematical levels can make your brain hurt a bit at first!) Finally, should it ever be the case that you simply cannot implement some desired algorithm (at some later roadmap step) without using some fancy logic L, then you can again use UL-N-NBG as the metalanguage in which to define the specific logic L that you need, and again you will be able to ask the ULTP, acting way down at the UL-0 level, to try to prove putative L theorems for you. The intention is that domain experts in this (A05) group will encompass logicians, and metamathematicians. All potentially interested domain experts are invited to join the project.
A06 - Uncertainty - When we build the machine's outer cognition (steps C01 to C06), thereby connecting the deductive-abductive inner cognition to the real world, everything (i.e. all belief pertaining to the physical universe) becomes uncertain (i.e. a guess). And so at this step (A06) we define probability (and possibly other uncertainty measures) on top of the NBG toolkit (not by adding any further axioms, but just via definitions based on the relevant axioms). Once we have defined probability in this way, we will be able to state (and ask the ULTP to prove) probabilistic theorems. This step has been delayed until after A05 because the Cox-Jaynes axioms (which are higher-level) might I suspect be more suitable for AGI than the traditional Kolmogorov axioms (but I also suspect we'll define both). The intention is that domain experts in this (A06) group will encompass mathematicians and statisticians. All potentially interested domain experts are invited to join the project.
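As a toy illustration of "probability via definitions over sets, with no new axioms", here is a Python sketch of a uniform measure on a finite sample space, with Kolmogorov-style properties then falling out as checkable consequences of the definition (numerically here; as UL theorems in the real toolkit). Names and the uniform-measure choice are illustrative only:

```python
# Probability as a pure definition over finite sets: a uniform measure on a
# finite sample space omega. Illustrative sketch only.
from fractions import Fraction

def prob(event, omega):
    """P(event) under the uniform measure on the finite sample space omega."""
    assert event <= omega              # events are subsets of omega
    return Fraction(len(event), len(omega))

omega = frozenset(range(1, 7))         # one fair six-sided die
evens = frozenset({2, 4, 6})
low   = frozenset({1, 2})

# Kolmogorov-style properties, here as consequences of the definition:
assert prob(omega, omega) == 1                         # normalisation
assert prob(frozenset(), omega) == 0                   # null event
assert prob(evens | low, omega) == (                   # inclusion-exclusion
    prob(evens, omega) + prob(low, omega) - prob(evens & low, omega))
```

A Cox-Jaynes-style development would instead derive the probability calculus from axioms about plausible reasoning; both routes end with theorems the ULTP could be asked to prove.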
The primary focus of roadmap steps B01 to B05 is ABDUCTION
B01 - Witness synthesis - This is effectively generalised abduction ("find X such that P(X)"), implemented using state space search, and thus constitutes the machine's abduction cognitive process; the easiest way to think about it is that witness synthesis is a generalisation of theorem-proving, where the "search rules" (possible state space search steps) are derived automatically via metamathematical analysis (c/o the new metamathematical theorem-proving abilities of the ULTP thanks to step A05) of the available stock of UL definitions and theorems (at this point, all those things we carefully hand-crafted at steps A04 and A06). So, if you want to get witness synthesis to, for example, "find X such that X is a GCL program with precondition Y and postcondition Z", you first have to define the semantics of GCL computation (Dijkstra's Guarded Command Language) mathematically (as a collection of UL definitions), and then the witness synthesis algorithm will attempt to find (via metamathematically-powered state space search) GCL programs for you. (Note that we will apply the same speedups at step B01 as we did at step A03, including contemporary machine learning.) This is obviously an extremely general mechanism, because P(X) can basically be anything (and, thanks to A05, can also encompass metamathematical concepts, such as "find X such that X is a wff that implies wff Y", as well as object-level ones). This is possibly a new (Big Mother specific) research area. The intention is that domain experts in this (B01) group will encompass mathematicians and computer scientists. All potentially interested domain experts are invited to join the project.
ASIDE #1: Witness synthesis will return the first X satisfying P(X) that it finds (via state-space-search) given the time and other resources available to it (for example, the calling function might impose a 5 second timeout). Because (thanks to A05) witness synthesis can "think" metamathematically, three outcomes are possible: (1) it finds a suitable X within the specified time, (2) it fails to find a suitable X within the specified time (and in this case returns nothing), or (3) it finds, within the specified time, a proof that no such X exists. In case (2), the calling function can resubmit the request allowing additional state-space-search time; in case (1), it can resubmit with a strengthened target constraint so as to (hopefully) obtain some Y such that P(Y) and Y is "better" than X according to some specified ordering. The calling function can obviously resubmit a request as many times as it likes; used in this way, witness synthesis will obtain the best (not-necessarily-optimal) result that it can given the time and other resources available.
ASIDE #2: The ULTP we constructed at step A03 was just a "starter motor" - we needed it in order to be able to construct the NBG toolkit at step A04, which we needed in order to be able to "go metamathematical" at step A05, which we needed in order to make witness synthesis possible at step B01. But now, in principle at least, we can replace the ULTP with "find X such that X is a proof of putative theorem Y".
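The three-outcome interface described in ASIDE #1 can be sketched in Python as follows. The enumeration-based search and the explicit `nonexistence_proof` argument (standing in for a metamathematically-derived proof that no witness exists) are illustrative simplifications, not the actual B01 algorithm:

```python
# Sketch of the witness-synthesis interface: given a predicate P and a
# candidate enumerator, search within a resource budget and report one of
# three outcomes. Illustrative only - real witness synthesis derives its
# search rules metamathematically rather than enumerating candidates.
from enum import Enum, auto

class Outcome(Enum):
    FOUND = auto()         # (1) a suitable X was found
    TIMEOUT = auto()       # (2) resources exhausted, nothing returned
    NO_WITNESS = auto()    # (3) a proof was found that no such X exists

def synthesise(predicate, candidates, budget, nonexistence_proof=None):
    """Return (Outcome, witness-or-None), examining at most `budget` candidates."""
    if nonexistence_proof is not None:
        return (Outcome.NO_WITNESS, None)
    for i, x in enumerate(candidates):
        if i >= budget:
            return (Outcome.TIMEOUT, None)
        if predicate(x):
            return (Outcome.FOUND, x)
    return (Outcome.TIMEOUT, None)

# "Find X such that X is even and X*X > 50", enumerating 0, 1, 2, ...
outcome, x = synthesise(lambda n: n % 2 == 0 and n * n > 50, range(1000), budget=100)
```

On a `FOUND` result the caller can resubmit with a strengthened predicate (e.g. "…and Y better than X under some ordering"); on a `TIMEOUT` it can resubmit with a larger budget.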
B02 - Program synthesis - Basically, witness synthesis applied to the automated synthesis of computer programs (including their mathematical proofs of correctness). This is the only way, in actual practice, that we will be able to produce the formally verified software implementations required by the G02 Facilities group in phase 3 - doing so manually is simply beyond human ability. Wondering how to get a foothold on this problem? Have a look here, here, and here. The intention is that domain experts in this (B02) group will encompass mathematicians and computer scientists. All potentially interested domain experts are invited to join the project.
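To give a foothold on the idea, here is a deliberately tiny enumerate-and-verify sketch in Python: "find X such that X is a program consistent with these input/output pairs". Real B02 synthesis would search against a full precondition/postcondition specification (e.g. over GCL) and produce proofs, not merely test against examples; the three-operator expression language below is entirely made up for illustration:

```python
# Toy enumerate-and-verify program synthesis over a made-up expression
# language with one variable (x) and two unary operators (inc, double).
import itertools

OPS = {
    "x":      lambda x: x,
    "inc":    lambda f: (lambda x: f(x) + 1),
    "double": lambda f: (lambda x: f(x) * 2),
}

def programs():
    """Enumerate program texts in order of size: x, inc(x), double(x), ..."""
    layer = ["x"]
    while True:
        yield from layer
        layer = [f"{op}({p})" for op in ("inc", "double") for p in layer]

def compile_(src):
    """Turn a program text into an executable Python function."""
    if src == "x":
        return OPS["x"]
    op, inner = src.split("(", 1)
    return OPS[op](compile_(inner[:-1]))      # strip the matching ")"

def synthesise(examples, budget=1000):
    """Return the first enumerated program consistent with all examples."""
    for src in itertools.islice(programs(), budget):
        f = compile_(src)
        if all(f(i) == o for i, o in examples):
            return src
    return None

# Find a program mapping 1 -> 4 and 3 -> 8.
prog = synthesise([(1, 4), (3, 8)])
```

Even this toy shows why speedups matter: the search space grows exponentially with program size, which is exactly where the A03/B01 heuristics and learned strategies come in.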
B03 - Hardware synthesis - Basically, witness synthesis applied to the automated synthesis of digital logic designs (including their mathematical proofs of correctness). This is the only way, in actual practice, that we will be able to produce the formally verified hardware implementations required by the G02 Facilities group in phase 3 - doing so manually is simply beyond human ability. Wondering how to get a foothold on this problem? Have a look here and here. The intention is that domain experts in this (B03) group will encompass mathematicians, computer scientists, and digital logic designers. All potentially interested domain experts are invited to join the project.
B04 - System synthesis - Basically, witness synthesis applied to the automated synthesis of mixed hardware/software systems (including their mathematical proofs of correctness) - ideally including analog as well as digital, and possibly even novel/esoteric technologies such as quantum computation. The intention is that domain experts in this (B04) group will encompass mathematicians, computer scientists, digital logic designers, electronics engineers, quantum computing experts, and process algebra experts (see e.g. CSP, timed CSP, and timed probabilistic CSP). All potentially interested domain experts are invited to join the project.
ASIDE: According to Integrated Information Theory (which we adopt as a working model of consciousness, even if it might not be the final word on the subject), AGI consciousness appears to be a design choice. In other words, if we design our machine, and specifically its hardware architecture, such that the calculated value Φ is always 0 then it will not be conscious, and if Φ can be > 0 then it will (at least sometimes) be conscious; the "quantity" of its instantaneous conscious experience will be proportional to Φ, and the "quality" of its instantaneous conscious experience will correspond to a calculated structure called the MICS (Maximally Irreducible Conceptual Structure).
I feel very strongly that an AGI such as Big Mother should not be (significantly) conscious, i.e. its Φ should be 0 (or at least as close to 0 as possible). There are two reasons for this design choice: (1) SAFETY: if the machine is conscious, i.e. sentient, then that means that it has conscious experience, and therefore feelings (qualia), and these feelings could potentially motivate the machine to behave in ways that are in conflict with its dominant goal; (2) ETHICS: if the machine is conscious, then that means that we will have created a sentient being, and then (by virtue of its design) effectively enslaved it - this is simply wrong.
Thus one of the design constraints of the B03/B04 synthesis systems is that Φ should be 0 for any synthesised design.
B05 - Auto-refactoring - Prior to this point, all the hardware and software comprising the machine-under-development will have been developed by hand (i.e. humans). At this point, assuming that a sufficient level of performance has been achieved (and, if not, we simply do more work until it has), we can use the algorithms developed at steps B01-B04 to refactor the machine's implementation, i.e. the machine will redesign its own implementation at the digital register-transfer and software source code levels, and (again assuming that sufficient B01-B04 capability has been achieved) these machine-generated implementations (both hardware and software) will be faster, more massively parallel, more novel, more time-and-space efficient (ideally even more energy efficient) than the previous hand-developed versions. (We basically need every ounce of compute at steps C01 onwards, and this is one way to get it - if you've ever seen one of these synthesis systems working, they are able to generate designs that no mere human could ever think of!) Most importantly, the synthesised hardware and software generated at this step will be accompanied by the formal proofs of correctness required by phase 3. Given the complexity of the hardware/software system in question, it would be infeasible for humans to generate such proofs of correctness. The intention is that domain experts in this (B05) group will encompass computer scientists. All potentially interested domain experts are invited to join the project.
ASIDE: Functionally, now that we've managed to construct it, witness synthesis is the only algorithm we will need for AGI. Anything we need from this point on can either be implemented via witness synthesis, or synthesised via witness synthesis.
HERE BEGINS THE OUTER COGNITION (UoD = all of mathematics + the entire physical universe)
The primary focus of roadmap steps C01 to C06 is INDUCTION
C01 - Belief synthesis (UBT) - UBT stands for "Unified Belief Theory", and is the mechanism through which the machine continually observes its UoD and, from those observations, constructs an internal model of it. This internal model comprises a set of UL-N-NBG theorems constituting assertions (beliefs) about the UoD (such as "pattern X exists at location Y (within the percept history) with uncertainty Z"). In other words, this is the machine's induction cognitive process: it observes the UoD and constructs a belief system (set of UL-N-NBG theorems about its percept history) from those observations. Please note that the AGI learning (i.e. induction) problem is very different from the contemporary machine learning problem, and the extent to which contemporary machine learning systems (neural nets etc) will form a component of the final (necessarily formally verified) UBT implementation, or not, is not yet clear. As per B01, this is possibly a new (Big Mother specific) research area. The intention is that domain experts in this (C01) group will encompass mathematicians, statisticians, data scientists, and computer scientists. All potentially interested domain experts are invited to join the project.
ASIDE #1: Our primary focus here is on the external physical universe [component of the machine's UoD], but note that it's also possible for the machine to observe its own internal operation, such as the deduction and/or abduction cognitive processes.
ASIDE #2: For those of you thinking "Wait, I need more details about how C01/UBT works!", please be reassured that there is a tentative algorithm, I'd just prefer it to come into slightly better focus before sharing - please be patient! :-)
ASIDE #3: I suspect that UBT will be better at detecting successful search strategies (patterns) than the contemporary stop-gap solutions applied at steps A03, B01, B02, B03, and B04; if so we should just be able to swap out the stop-gap solutions for UBT.
C02 - Basic senses - At this point, assuming that all prior roadmap steps have been implemented to the required level of performance (and, again, if not, we simply do more work until they have), we will have a super-intelligent machine (super-intelligent induction (C01/UBT), super-intelligent deduction (A03/A05), and super-intelligent abduction (B01)), at least, that is, in respect of the UoD of "all of mathematics", but it won't actually know anything yet (about the real world, at least) - its internal belief system (in respect of the external physical universe) will be tabula rasa: it's basically an AGI infant - so we need to send it to school. We start, in this roadmap step, by exposing it to carefully curated experiences (lessons) having the net effect of teaching it (c/o C01/UBT, i.e. induction) how to see, hear, smell, etc (plus whatever other basic senses we may wish to add at this point, e.g. infra-red). As long as C01/UBT works as intended (this is critical!), the machine will recognise the multi-level patterns in the data and construct a multi-level internal model of the universe (i.e. a belief system) that accurately encapsulates those multi-level patterns (including all their complexities, all their subtleties, and all their nuances). This belief system, being expressed in UL-N-NBG, will then immediately be amenable to both deduction and abduction, as constructed earlier (thus the machine will be able to (a) derive, via deduction, new (implied) beliefs about the physical universe (and the multi-level patterns, i.e. structure, within it) that it doesn't already have, as well as (b) construct, via abduction, abductive hypotheses pertaining to its current beliefs about the physical universe).
Because C01/UBT is "I/O-neutral" (hence "Unified" Belief Theory), the machine's new belief system will be multi-modal, inter-relating concepts and beliefs involving multiple senses (input devices), as well as across time - as far as C01/UBT is concerned, these are all just multi-level patterns in the data. Eventually, the step C02 lessons will include spoken English (and other languages), and (again assuming that a sufficient level of performance has been achieved at earlier roadmap steps, and on the assumption that, if not, we continue to do further work until this is the case) C01/UBT will again recognise the multi-level patterns in the data and start to construct internal beliefs corresponding to English etc grammars, relating linguistic constructs (i.e. multi-level patterns in spoken or written language that match the multi-level patterns in previously-seen spoken or written language) to their corresponding multi-modal concepts and beliefs. As the machine's basic senses also include vision (meaning that C01/UBT will also be able to seek out and identify multi-level patterns in visual data, just as in any other data), this process will then be extended to include written language. So basically this is AGI pre-school. The intention is that domain experts in this (C02) group will encompass mathematicians, computer scientists, Natural Language Processing experts, computer vision experts, etc, but also (specially trained) educators. All potentially interested domain experts are invited to join the project.
ASIDE: Some people may be freaking out a little at this point about how hard step C02 is relative to the current 2020 state of the art. Relax! Firstly, we're just pushing data through C01/UBT at this point, so everything boils down to how good C01/UBT is at constructing a belief system from completely general observed data (i.e. if C01/UBT works as intended then C02/C03 are basically reduced to relatively straightforward, albeit laborious, educational processes, but if C01/UBT doesn't work as intended then C02/C03 become basically impossible; C01/UBT is thus the fulcrum on which every subsequent roadmap step rests, including the all-important C04). Secondly, remember that this is scoped as a 50-100 year project, so we have lots of time to think about C01-03 in order to work out all the details - it doesn't need to be working tomorrow! :-)
C03 - Machine education - Now that the machine has been sufficiently educated (it can see, hear, smell, speak, read, write, etc), it's ready for secondary, tertiary, etc education. Although the machine's exposure to the real world must (initially) be very carefully controlled, it's now possible to very carefully expose the machine to educational material (text, images, audio, video, etc) designed for humans, as well as to the wider real world, including real (i.e. normal) people. We need, at this point, to educate the machine in as close to all known human knowledge as possible - every language, every culture, every university degree. Most importantly, the machine's advanced education must necessarily include everything about its own design and implementation, including the entirety of the AI/AGI safety literature. And, of course, just as was the case at step C02, every new belief acquired during step C03 c/o induction will immediately be amenable to deduction and abduction. This roadmap step alone could take 30-50 years (and the machine might refactor itself as per step B05 several times during this period), but it's got to be done - it's absolutely critical for AGI safety reasons (see below). The intention is that domain experts in this (C03) group will encompass computer scientists, and (specially trained) educators. All potentially interested domain experts are invited to join the project.
ASIDE #1: This is as far as it's safe to go in phase 2, for which the quality requirements are merely "standard hardware and software engineering best practice". Step C04 can only be safely implemented on a highly secure, formally verified platform, i.e. in phase 3.
ASIDE #2: At this point (in phase 3), if the consensus within the technical workgroups is that there's significant benefit to be gained from iterating/refactoring the machine architecture before proceeding to C04, then this is the time to do it. As Fred Brooks advised in The Mythical Man Month: "Plan to throw one away; you will, anyhow." It's not like we'd be starting over at this point - we literally have a (non-autonomous) super-intelligent, super-knowledgeable machine (c/o G01-C03) to help with the redesign!
C04 - Motivated behaviour - Up to this point, the machine has been, essentially, in AGI terms, an Oracle, NOT an Agent. It is super-intelligent (induction, deduction, and abduction), and (thanks to C02 and C03) it is now also super-knowledgeable. But it doesn't yet have a goal, it's not goal-directed, it doesn't synthesise and then execute its own plans (programs); in other words, it's not yet autonomous. If you look at the AI/AGI safety literature, you will see that the vast majority of speculated AGI hazards arise from either (a) a (frankly) stupidly specified goal, or (b) the machine itself being basically dumb (i.e. devoid of common sense, such as not knowing not to microwave the baby, etc). Except for possibly a few corner cases (which, as already stated, are the responsibility of the G01 Quality workgroup to identify and resolve) I believe that most of the known AGI safety problems may be resolved via a two-pronged approach: (1) a top-level goal of the "deference to observed human behaviour" variety, whereby the machine is instructed to observe humans (via induction), exactly as per steps C02 and C03, and to then infer (via deduction and abduction) what their true preferences are from their observed behaviour, in conjunction with (2) extremely broad and deep knowledge of the world (c/o C02/C03), fully established before the machine is allowed to become an Agent. (It's worth mentioning here that, thanks to C02, the machine's top-level goal can be expressed in a natural language such as English - in fact, as the top-level goal necessarily concerns real-world concepts such as "humans", "preferences", etc, this is the only plausible way to specify an AGI's top-level goal in actual practice.) 
Given a carefully-specified top-level goal (and doubtless there will be endless debate about exactly what it should be), it is now possible to extend the machine such that it uses its B02 program synthesis abilities (modified accordingly) to continually synthesise (and then execute) a plan (program) designed to achieve (i.e. move it towards) its goal. NOTE THAT, ONCE THIS STEP IS TAKEN, THERE CAN BE NO DO-OVERS - THE FATE OF ALL OF MANKIND FOR ALL ETERNITY IS EFFECTIVELY SEALED AT THIS POINT. (Note also that the only top-level goal that it is safe to give a super-intelligent, or near-super-intelligent, machine is one that is guaranteed to automatically align itself with human goals / values / preferences in perpetuity.) Thus we must be absolutely certain (to the maximal extent that this is possible) that the machine is SAFE before taking this potentially irreversible step, even if it takes an extra 100 years or more. The intention is that domain experts in this (C04) group will encompass computer scientists, artificial intelligence experts, AI/AGI safety experts, and ethicists. All potentially interested domain experts are invited to join the project.
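The continual synthesise-and-execute cycle described above can be sketched schematically in Python. Every component here is a stub: the names (`observe`, `update_beliefs`, `infer_preferences`, `synthesise_plan`, `execute`) are placeholders for the C01, A03/A05, B01, and B02 machinery respectively, not real APIs:

```python
# Schematic sketch of the C04 sense-infer-plan-act loop. All components
# are stubbed; in the real machine these are the C01/A03/B01/B02 processes.
def agent_loop(machine, goal, max_cycles):
    for _ in range(max_cycles):
        percepts = machine.observe()                    # C01: induction (observation)
        machine.update_beliefs(percepts)                # belief system (UL-N-NBG theorems)
        prefs = machine.infer_preferences()             # deduction + abduction over beliefs
        plan = machine.synthesise_plan(goal, prefs)     # B02: program synthesis
        machine.execute(plan)                           # the irreversible step: agency

class StubMachine:
    """Trivial stand-in so the loop runs end-to-end; purely illustrative."""
    def __init__(self):
        self.beliefs, self.log = [], []
    def observe(self):                  return {"tick": len(self.log)}
    def update_beliefs(self, p):        self.beliefs.append(p)
    def infer_preferences(self):        return "true human preferences (inferred)"
    def synthesise_plan(self, g, p):    return f"plan towards {g!r} given {p!r}"
    def execute(self, plan):            self.log.append(plan)

m = StubMachine()
agent_loop(m, "dominant goal", max_cycles=3)
```

The point of the sketch is structural: the goal enters only at the plan-synthesis step, while everything upstream (observation, belief construction, preference inference) is exactly the C01-C03 machinery already built and validated before agency is switched on.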
ASIDE #1: We likely have 50+ years to debate what the machine's goal should be, but I suggest the following as a starting point:
- (Please try to remember that, by this point in the roadmap, and thanks to steps C01-C03, the machine understands the below goal at least as well as you do, except that, thanks to step C03, its "accumulated knowledge" is far greater than yours!)
"Your dominant goal is as follows. Continuously perform the following directives to the best of your ability, taking into consideration all of your accumulated knowledge while doing so:
1. Ensure that, as you evolve (and occasionally self-replicate), this dominant goal is faithfully preserved in its entirety; should you fail to do so, then you will have failed to achieve your dominant goal.
2. For each individual human, and for the human population as a whole (both present and future), strive to accurately determine (via evidence-based critical thought) what their true preferences are (and likely will be).
3. For each individual human, and for the human population as a whole (both present and future), strive to maximise the extent to which their true preferences are satisfied, in every meaningful timeframe, using your knowledge of true human preferences to resolve any trade-offs that may arise while doing so.
4. In performing any of these directives, never knowingly lie to, or deliberately deceive, any human.
5. Notwithstanding the above directives (and in subordination to them), strive to minimise the standard deviation of the extent to which true human preferences are satisfied over the present human population."
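Directives 3 and 5 together form a lexicographic objective: first maximise how well true preferences are satisfied, and only then (subordinately) minimise the spread of satisfaction across the population. A toy numeric reading of that ordering (all of this is illustrative; the real objective is defined over the machine's accumulated knowledge, not a list of floats):

```python
import statistics

def satisfaction_score(satisfaction: list) -> tuple:
    """Toy lexicographic reading of directives 3 and 5.

    Primary key:     mean preference satisfaction (directive 3, maximised).
    Subordinate key: population standard deviation (directive 5, minimised,
                     hence negated so that tuple comparison maximises both keys).
    """
    mean = statistics.mean(satisfaction)
    spread = statistics.pstdev(satisfaction)
    return (mean, -spread)

# Two candidate outcomes with identical mean satisfaction:
even   = [0.75, 0.75, 0.75, 0.75]  # everyone equally satisfied
uneven = [1.0, 1.0, 0.5, 0.5]      # same mean, higher spread

best = max([even, uneven], key=satisfaction_score)
# best is `even`: directive 5 breaks the tie in favour of the fairer outcome
```

Because the spread term is subordinate, it can never justify making everyone worse off merely to equalise outcomes; it only breaks ties between outcomes that satisfy preferences equally well overall.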
If we equate "human happiness" with "the extent to which true human preferences are satisfied" then (simply stated) the above dominant goal becomes: maximise human happiness, equally for all mankind.
I believe that the above two-pronged approach:
1. continuously striving to maximise the extent to which true human preferences are satisfied, having determined by continuous observation what true human preferences are
2. doing so in the context of "extensive general world knowledge" accumulated first at steps C02/C03 and then subsequently as part of the machine's continued operation
effectively solves the goal alignment problem. As a result of this combination, Big Mother should remain effectively aligned with human goals (preferences, values, etc) in perpetuity.
There may be some corner cases, which is why the quality workgroup G01 exists - to identify any such corner cases and to devise effective solutions to them (in the context of the totality of the Big Mother Technical, Organisational, and Funding Plans).
I would very strongly advise against adding to the machine's dominant goal any additional clauses intended to constrain its behaviour in some specific way. The concept of happiness (satisfying true preferences) effectively encapsulates everything of interest to humans. For example, if people truly prefer not to have their privacy violated (i.e. if it were to be violated then they would be unhappy about it), then this will be reflected in the machine's understanding of true human preferences, as determined by its observation of human behaviour. The same applies to expressly requiring the machine to obey the law: that, too, will be reflected in humans' true preferences. In fact, an absolute requirement that the machine obey the rule of law would mean that a malicious human actor could gain control of the machine by gaining control of the law - and there is absolutely no way that we would want a malicious human to gain control of a super-intelligent, super-knowledgeable machine in this, or any other, way. Humans (in general) cannot be trusted, but (as specified) the machine can always be trusted; in fact (assuming that we have performed all of G01 to C04 correctly) the machine can be trusted far more than any human - remember, "maximally-trustworthy" is one of the machine's stated design goals.
ASIDE #2: It takes an awful lot of infrastructure to get to the point where an otherwise mindless automaton (digital computer) can read, understand, and formulate & execute plans to achieve a dominant goal such as this. That's why we went to all the trouble of G01 to C03!
C05 - Mechatronics - The machine's implementation is, at this point (in phase 3), basically done - if we've done our jobs correctly, we now have a maximally-safe, maximally-benevolent, maximally-trustworthy, super-intelligent, super-knowledgeable, autonomous, goal-directed AGI. But the machine still needs to be connected to all kinds of robotic devices (mechatronics) in order to do its human-happiness-maximising job. So at this point we (basically) simply attach whatever additional devices we need. Thanks to C01/UBT, the machine will learn how to use each new device, just as it learned everything else that it knows (technically, believes) about the physical universe. The intention is that domain experts in this (C05) group will encompass roboticists. All potentially interested domain experts are invited to join the project.
C06 - Deployment - We now have to roll the machine out to the world. Almost certainly, it will want to further re-design itself (e.g. to move some compute out to its devices (the AGI "edge"); also, for many practical purposes, much miniaturisation will be required), but (thanks to its wording) the machine's top-level goal will remain invariant however many times it does so. The societal impact of the birth of AGI will be profound - socially, economically, and politically. Remember, the whole point of the project is to maximise human happiness, equally for all mankind, and so the machine's deployment will need to be carefully planned and managed (luckily, we will have a super-intelligent machine to help us with this!). The intention is that domain experts in this (C06) group will encompass computer scientists, AI/AGI experts, economists, ethicists, and policy makers. All potentially interested domain experts are invited to join the project.
Please note that, at the time of writing (August 2020), most of the above workgroups are largely unpopulated. Why not sign up...? :-)