Apologies, this page is a bit of a mess. Until we are able to clean it up, please (if you can) try to ignore style, and focus on content!
The Big Mother AGI Architecture (or Framework) comprises:
The Big Mother Technical Plan (or Roadmap) comprises 19 nominal Roadmap steps:
ASIDE: The machine is being designed/developed top-down (working backwards from the target behaviour). This means the current (3rd-level) design, although end-to-end, is conceptual in nature (as well as a framework for collaboration), and will gradually be refined down to a concrete implementation, with information/detail being added at each refinement step. (In the same manner, the following informal description will gradually be refined into a more scientifically rigorous exposition).
There is a Big Mother workgroup for each of the 19 Roadmap steps G01-G02, A01-A06, B01-B05, and C01-C06.
Roadmap steps G01 and G02 are global in nature, and pertain to both the INNER and OUTER COGNITIONs
G01 - Quality - Never forget that skilled professionals built the Titanic, Deepwater Horizon, the Fukushima Daiichi nuclear plant, and the Boeing 737 MAX, and, at the time, all were supremely confident that their designs would operate safely in all foreseeable conditions. Nevertheless, they were all wrong. Due to the unimaginable potential cost of getting AGI wrong, we cannot afford to be wrong about AGI safety, on any timescale, not even millennia. Thus the BigMother project is quality-driven, rather than e.g. cost-driven. This group is responsible for all things quality-related, where quality means ensuring that the project achieves its stated objectives (as described elsewhere, including "good things happen, bad things don't", "maximally-safe, -benevolent, and -trustworthy", etc). In particular, safety must be an invariant property of the machine at each step of its construction (as it emerges from the roadmap), as well as in respect of any results that may be exploited commercially by the S06 Exploitation group (part of the Organisational Plan). The intention is that domain experts in this (G01) group will encompass general quality management, general best-practice hardware and software engineering, cybersecurity, formal (i.e. mathematical) verification (of hardware and software), safety-critical systems (see e.g. Safety-Critical Systems Club), and of course AI/AGI safety (see e.g. Machine Intelligence Research Institute, Future of Humanity Institute, Future of Life Institute, Centre for the Study of Existential Risk, AI Alignment Forum, Center for Applied Rationality, Center on Long-Term Risk, and the AI Incident Database). All potentially interested domain experts are invited to join the project.
G02 - Facilities - The project is broadly organised into three phases (i.e. we will at least notionally be going through the entire roadmap three times): (1) prototyping and evaluation, (2) working implementation, and (3) final implementation. There will be different quality requirements for each phase, e.g. in phase 3 everything (even remotely possible) needs to be formally proven (which then implies that only technologies that are sufficiently mathematically defined may be incorporated into the machine's phase 3 design). When we get to phase 3, group G02 will be responsible for designing and building the (highly secure and formally verified) supercomputer on top of which Big Mother 1.0 will be implemented (and this will be facilitated by phase 2 implementations of e.g. B02 program synthesis, B03 hardware synthesis, and B04 system synthesis). The intention is that domain experts in this (G02) group will encompass system architects, HPC (High Performance Computing) experts, chip designers, software engineers, security experts, etc. All potentially interested domain experts are invited to join the project.
The primary focus of roadmap steps A01 to A06 is DEDUCTION
A01 - Universal Logic (UL-0) - The machine must necessarily maintain an internal model of its UoD (through which its three cognitive processes - induction, deduction, and abduction - will all communicate), and this internal model must necessarily be represented somehow. Because the machine is an AGI ("G" for "General"), this representation mechanism must be as "general" as possible, i.e. able to represent "absolutely anything". First-order logic with equality (FOLEQ) is foundational, i.e. able (with a little effort) to represent "all of mathematics", which for our purposes is sufficient. Thus UL-0 (the machine's internal representation mechanism) is essentially FOLEQ plus descriptions ("some X such that P(X)" and "the X such that P(X)") plus named-and-parameterised definitions [plus a special constant corresponding to the machine's percept history]. The intention is that domain experts in this (A01) group will encompass logicians and mathematicians. All potentially interested domain experts are invited to join the project.
A02 - Soundness & completeness - First-order logic with equality (FOLEQ) is both sound and complete (as will become clear at step A05, this was one of the reasons it was chosen for the project). UL-0 is intended to be merely a conservative extension of FOLEQ, and therefore also sound and complete. However, as it's notoriously easy to make some schoolboy error and break your logic, a quality control step is required. The objective of roadmap step A02 is therefore to formally prove, using a third-party system such as (say) Prover9, that UL-0 is still sound and complete (i.e. that we haven't inadvertently broken FOLEQ). The intention is that domain experts in this (A02) group will encompass logicians and mathematicians. All potentially interested domain experts are invited to join the project.
A03 - Theorem proving - Basically, construct a (natural deduction) theorem prover for UL-0 called ULTP. This is the machine's first cognitive process (deduction), and phase 2 coding on this has already started (one of the reasons why I have zero time!). Theorem proving is a state space search problem, and therefore computationally expensive, and so various speedups will gradually be applied (here at step A03, and later at steps B01-B04). The intention is that domain experts in this (A03) group will encompass logicians and (especially discrete) mathematicians, as well as automated reasoning experts, and machine learning experts. All potentially interested domain experts are invited to join the project.
ASIDE: The trick to any state space search problem is to use all of the information available (see, for example, George Polya's "How to Solve It", 1945, and Imre Lakatos' "Proofs and Refutations", 1976), including (1) object-level information derived directly from the problem description itself, (2) meta-level information derived from reasoning "about" the problem, rather than "in" the problem, and (3) statistical information derived from experience of attempting many instances of the problem (or similar problems) and identifying those search patterns (strategies) that tend to be most successful in specific situations. In respect of (3), the current plan is to use contemporary neural-net-based machine learning techniques (e.g. reinforcement learning), as at least a stop-gap solution. Ultimately, however, (3) is basically a problem of induction, which is addressed by step C01.
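As a minimal sketch of how such information can steer a state space search, here is a generic best-first search in Python. It is not the ULTP: the names best_first_search, successors, and score are hypothetical, the toy arithmetic puzzle is purely illustrative, and the score function merely stands in for whatever object-level, meta-level, or statistical guidance is available.

```python
import heapq
from itertools import count

def best_first_search(start, is_goal, successors, score, max_steps=10_000):
    """Expand the lowest-scoring state first; return a path to a goal, or None."""
    tie = count()  # tie-breaker so the heap never has to compare states directly
    frontier = [(score(start), next(tie), start, [start])]
    seen = {start}
    for _ in range(max_steps):
        if not frontier:
            return None  # search space exhausted without reaching a goal
        _, _, state, path = heapq.heappop(frontier)
        if is_goal(state):
            return path
        for nxt in successors(state):
            if nxt not in seen:
                seen.add(nxt)
                heapq.heappush(frontier, (score(nxt), next(tie), nxt, path + [nxt]))
    return None  # step budget exhausted

# Toy problem (illustrative only): reach 37 from 1 using the moves "+1" and "*2".
target = 37
path = best_first_search(
    start=1,
    is_goal=lambda n: n == target,
    successors=lambda n: [m for m in (n + 1, n * 2) if m <= 2 * target],
    score=lambda n: abs(target - n),  # stand-in for search guidance
)
```

Swapping in a better score function changes how much of the space is explored without touching the search loop, which is the sense in which guidance of kinds (1) to (3) acts as a speedup.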
A04 - NBG toolkit - In its raw, low-level form, FOLEQ is insufficiently expressive for our purposes. A first-order theory adds additional axioms (called proper axioms) to FOLEQ, thereby effectively giving semantics to specific function and relation symbols. In our case we will be adding the NBG set theory axioms (extensionality, pairing, class existence, regularity, sum set, power set, infinity, and limitation of size) to UL-0, thereby extending (and strengthening) UL-0 to UL-0-NBG, i.e. first-order NBG set theory. [NOTE: One final non-NBG axiom will also be required, constraining the special UL-0 constant corresponding to the machine's percept history.] At this point, given the definition mechanisms we incorporated into UL-0, it becomes possible to do significant mathematics. And our first job, mathematically speaking, is to define a mathematical toolkit on top of UL-0-NBG. This involves a lot of low-level mathematical definitions (e.g. we will be using Conway's Surreal Numbers for the basic number systems - Naturals, Integers, Rationals, Reals, Ordinals, and Cardinals, and from these we will be using the Cayley-Dickson construction to derive Complex numbers, Quaternions, Octonions, and Sedenions), and a lot of formal proofs (for which we will have the ULTP - I knew we wrote it for a reason!) As a minimum, we need to define sufficient mathematics to support step A05 (including, most importantly, transfinite induction and recursion), but the more mathematics we can hand-define the better (this will be especially apparent when we get to B01). The intention is that domain experts in this (A04) group will encompass logicians and (especially discrete) mathematicians, as well as set theorists. All potentially interested domain experts are invited to join the project.
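To give a feel for the kind of definition involved, here is a minimal Python sketch of the Cayley-Dickson doubling step. It is only an illustration: in the toolkit itself these would be UL-0-NBG definitions over the surreal-number reals, whereas here plain Python integers stand in for the reals, and the helper names (conj, add, sub, mul) are hypothetical. A number of dimension 2^n is a flat list of 2^n coefficients, treated as a pair (a, b) of half-length numbers, with (a, b)* = (a*, -b) and (a, b)(c, d) = (ac - d*b, da + bc*), which is one common sign convention among several.

```python
def conj(x):
    """Conjugate: (a, b)* = (a*, -b); a real number is its own conjugate."""
    if len(x) == 1:
        return list(x)
    h = len(x) // 2
    return conj(x[:h]) + [-v for v in x[h:]]

def add(x, y):
    return [p + q for p, q in zip(x, y)]

def sub(x, y):
    return [p - q for p, q in zip(x, y)]

def mul(x, y):
    """Cayley-Dickson product: (a, b)(c, d) = (ac - d*b, da + bc*)."""
    if len(x) == 1:
        return [x[0] * y[0]]
    h = len(x) // 2
    a, b = x[:h], x[h:]
    c, d = y[:h], y[h:]
    return sub(mul(a, c), mul(conj(d), b)) + add(mul(d, a), mul(b, conj(c)))

# Dimension 2 gives the complex numbers: (1 + 2i)(3 + 4i) = -5 + 10i.
z = mul([1, 2], [3, 4])

# Dimension 4 gives the quaternions, with basis 1, i, j, k.
i, j, k = [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]
ij = mul(i, j)   # equals k
ji = mul(j, i)   # equals -k, so multiplication is no longer commutative
```

The same mul applied to lists of length 8 and 16 gives octonion and sedenion products, since each doubling reuses the previous level's arithmetic.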
A05 - Universal Logic (UL-N) - Once the mathematical toolkit is sufficiently well defined, we will have everything we need to define (in UL-0-NBG) the syntax, semantics, and proof rules of (essentially) UL-0. Please note that there's nothing actually circular going on here. What we end up with is (effectively) a clone of UL-0 (which we will call UL-1), with UL-0-NBG acting as its metalanguage. Because UL-0 definitions are also valid UL-1 definitions (and similarly at each level), we end up with an infinite stack of UL-Ns (UL-0, UL-1, UL-2, ...) together with a corresponding infinite stack of UL-N-NBGs (UL-0-NBG, UL-1-NBG, UL-2-NBG, ...), where each UL-N-NBG is metalanguage to UL-N+1.
UL-N has exactly the same expressive power as UL-0 (i.e. the meaning of any UL-N (N > 0) wff or term is simply defined by translation back into UL-(N-1)-NBG), but it's easier to use (and think about) in relation to metamathematical concepts (which, even in everyday mathematics, abound). And so now, our machine (via ULTP) can "think" (deductively, at least) at both the object (e.g. UL-1) and meta (e.g. UL-0-NBG) levels (as well as at the meta-meta level, meta-meta-meta level, etc). This actually adds significant capability - for example, we can now express statements about the NBG definitions we constructed at A04 (such as "NBG definition X has property P(X)"), and then ask the ULTP (operating way down at the UL-0 level) if they are theorems. Also, thanks to soundness and completeness, when asked to prove a UL-N theorem, where N > 0, the ULTP is now able to switch back and forth between thinking (1) "in" the UL-N system (syntactically) and (2) "about" the UL-N system (semantically). In some cases, it should even be able to prove (metamathematically) that a proof of a putative UL-N (N > 0) theorem does not exist (as per Hofstadter's famous MU puzzle). Finally, should it ever be the case that we simply cannot implement some desired algorithm (at some later roadmap step) without using some more esoteric logic L, then we can again use UL-N-NBG as the metalanguage in which to define the specific logic L that we need, and again we will be able to ask the ULTP, acting way down at the UL-0 level, to try to prove putative L theorems for us. The intention is that domain experts in this (A05) group will encompass logicians, and metamathematicians. All potentially interested domain experts are invited to join the project.
A06 - Probability & statistics - When we build the machine's outer cognition (steps C01 to C06), thereby connecting the deductive-abductive inner cognition to the real world (i.e. the actual physical universe), everything (all belief pertaining to the physical universe) becomes uncertain. And so at this step (A06) we define probability, and at least a basic statistical toolkit, on top of the NBG toolkit (not by adding any further axioms, but just via definitions based on the relevant axioms). Once we have defined probability and statistics in this way, we will be able to state (and ask the ULTP to attempt to prove) probabilistic and/or statistical theorems. This step has been delayed until after A05 because the Cox-Jaynes axioms (which are higher-level) might, I suspect, be more suitable for AGI than the traditional Kolmogorov axioms (but I also suspect we'll define both). The intention is that domain experts in this (A06) group will encompass mathematicians and statisticians. All potentially interested domain experts are invited to join the project.
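For reference, the two axiomatisations mentioned above can be summarised as follows. This is the standard textbook form, not the project's eventual UL-N-NBG definitions, and the symbols used here are assumptions of this sketch. Kolmogorov's axioms constrain a measure P on a space of events; the Cox-Jaynes approach instead arrives at the product and sum rules for the plausibility of propositions, which is why it sits more naturally at the metalanguage level.

```latex
% Kolmogorov: a probability space (\Omega, \mathcal{F}, P)
\begin{align*}
  & P(E) \ge 0 \quad \text{for all } E \in \mathcal{F}, \\
  & P(\Omega) = 1, \\
  & P\Bigl(\bigcup_{i=1}^{\infty} E_i\Bigr) = \sum_{i=1}^{\infty} P(E_i)
    \quad \text{for pairwise disjoint } E_1, E_2, \ldots \in \mathcal{F}.
\end{align*}

% Cox-Jaynes: rules for the plausibility of propositions A, B given C
\begin{align*}
  & p(A \wedge B \mid C) = p(A \mid C)\, p(B \mid A \wedge C)
    && \text{(product rule)} \\
  & p(A \mid C) + p(\neg A \mid C) = 1
    && \text{(sum rule)}
\end{align*}
```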
The primary focus of roadmap steps B01 to B05 is ABDUCTION
B01 - Witness synthesis - This is effectively generalised abduction ("find X such that P(X)"), implemented using state space search, and constitutes the machine's abduction cognitive process. (Note that this can also be viewed as a generalisation of mathematical optimisation, which may be described as "find the least X (according to some specified ordering) such that P(X)"). Generalised abduction (as we have defined it) is potentially an extremely powerful mechanism, because P(X) can basically be anything (and, thanks to A05, can also encompass metamathematical concepts, such as "find X such that X is a wff that implies wff Y", as well as object level ones). There is, in principle, no creative problem that cannot be expressed in terms of "find X such that P(X)" and that the machine cannot therefore attempt to solve via "find X such that P(X)", and thus witness synthesis, i.e. generalised abduction, directly corresponds to creativity, i.e. it is the way through which the machine solves (technically, attempts to solve), creative problems. The easiest way to think about it algorithmically is that witness synthesis is a generalisation of theorem-proving, where the "search rules" (possible state space search steps) are derived automatically via metamathematical analysis of the available stock of UL definitions and theorems (at this point, all those things we carefully hand-crafted at steps A04 and A06; however, later on, post-C01, the machine will also be able to add its own theorems derived from observation of the physical universe). 
So, if we want to get witness synthesis to for example "find X such that X is a GCL program with precondition Y and postcondition Z", we first have to define (or, post-C01, teach the machine) the semantics of GCL (Dijkstra's Guarded Command Language) computation mathematically (as a collection of UL definitions), and then the witness synthesis algorithm will attempt to find (via metamathematically-powered state space search) GCL programs for us. (Note that we will apply the same speedups at step B01 as we did at step A03.) This is possibly a new (Big Mother specific) research area. The intention is that domain experts in this (B01) group will encompass mathematicians and computer scientists. All potentially interested domain experts are invited to join the project.
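As a very rough sketch of "find X such that X is a program with precondition Y and postcondition Z", the following Python enumerates short straight-line programs and keeps the first one that meets the postcondition. Everything here is an assumption made for illustration: the four-instruction set is a hypothetical miniature (not real GCL, which also has guards and loops), the precondition is encoded simply as the set of start states, and the check is bounded testing over those states rather than the formal proof that witness synthesis proper would require.

```python
from itertools import product

# Hypothetical miniature instruction set over a state (x, y); not real GCL.
INSTRUCTIONS = {
    "x := x + y": lambda x, y: (x + y, y),
    "x := x - y": lambda x, y: (x - y, y),
    "y := x + y": lambda x, y: (x, x + y),
    "y := x - y": lambda x, y: (x, x - y),
}

def run(program, x, y):
    """Execute a sequence of instruction names from the start state (x, y)."""
    for name in program:
        x, y = INSTRUCTIONS[name](x, y)
    return x, y

def satisfies(program, post, states):
    """Bounded check: the postcondition holds for every listed start state."""
    return all(post((x, y), run(program, x, y)) for x, y in states)

def synthesise(post, states, max_len=3):
    """Return the first (shortest) program passing the bounded check, or None."""
    for length in range(1, max_len + 1):
        for program in product(INSTRUCTIONS, repeat=length):
            if satisfies(program, post, states):
                return program
    return None

# Specification: swap x and y without a temporary variable.
states = [(x, y) for x in range(-3, 4) for y in range(-3, 4)]
swap_post = lambda before, after: after == (before[1], before[0])
program = synthesise(swap_post, states)
```

Blind enumeration like this grows exponentially with program length, which is exactly why the search needs guidance of the kind applied at step A03.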
ASIDE #1: Witness synthesis will return the first X satisfying P(X) that it finds given the time and other resources available to it (for example, the calling function might impose a 5 second timeout). Given the fact that (thanks to A05) witness synthesis can "think" metamathematically, three outcomes are possible: (1) it finds a suitable X within the specified time, (2) it fails to find a suitable X within the specified time (and in this case returns nothing), or (3) it finds, within the specified time, a proof that no such X exists. In cases (1) or (2), the calling function can resubmit the request, either allowing additional state-space-search time and/or strengthening the target constraint so as to (hopefully) obtain some Y such that P(Y) and Y is "better" than X according to some specified ordering; the calling function can obviously resubmit a request as many times as it likes. Used in this way, witness synthesis will obtain the best (not-necessarily-optimal) result that it can given the time and other resources available.
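The calling pattern described here can be sketched in Python as follows. This is a toy under stated assumptions, not the witness synthesis algorithm: the names Outcome, find_witness, and improve are hypothetical, the candidate space is a small finite range, and "a proof that no such X exists" is approximated by exhausting that finite space rather than by metamathematical reasoning.

```python
import time
from dataclasses import dataclass
from typing import Any, Optional

@dataclass
class Outcome:
    status: str                    # "found", "timeout", or "none_exists"
    witness: Optional[Any] = None  # set only when status == "found"

def find_witness(p, candidates, timeout_s):
    """Return the first x with p(x), giving up once the time budget is spent."""
    deadline = time.monotonic() + timeout_s
    for x in candidates:
        if time.monotonic() > deadline:
            return Outcome("timeout")        # outcome (2): nothing found in time
        if p(x):
            return Outcome("found", x)       # outcome (1): a witness
    return Outcome("none_exists")            # outcome (3): finite space exhausted

def improve(p, better, candidates, timeout_s, rounds=10):
    """Resubmit with a strengthened constraint until no better witness is found."""
    best = None
    for _ in range(rounds):
        target = p if best is None else (lambda y, b=best: p(y) and better(y, b))
        outcome = find_witness(target, candidates, timeout_s)
        if outcome.status != "found":
            break
        best = outcome.witness
    return best

# Toy constraint: x leaves remainder 3 mod 7 and remainder 2 mod 5.
p = lambda x: x % 7 == 3 and x % 5 == 2
space = range(100, 0, -1)                    # re-iterable finite search space
first = find_witness(p, space, timeout_s=5.0)
best = improve(p, lambda y, x: y < x, space, timeout_s=5.0)
```

Here the first call returns whatever satisfying value the search order reaches first, and improve repeatedly asks for something strictly smaller, so the caller ends up with the best result obtainable within its budget.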
ASIDE #2: The ULTP we constructed at step A03 was just a "starter motor" - we needed it in order to be able to construct the NBG toolkit at step A04, which we needed in order to be able to "go metamathematical" at step A05, which we needed in order to make witness synthesis possible at step B01. But now, in principle at least, we can replace the ULTP with "find X such that X is a proof of putative theorem Y".
B02 - Program synthesis - For our purposes, this is witness synthesis applied to the automated synthesis of computer programs (including their mathematical proofs of correctness). This is the only way, in actual practice, that we will be able to produce the formally verified software implementations required by the G02 Facilities group in phase 3 - doing so manually is simply beyond human ability. Wondering how to get a foothold on this problem? Have a look here, here, and here. The intention is that domain experts in this (B02) group will encompass mathematicians and computer scientists. All potentially interested domain experts are invited to join the project.
B03 - Hardware synthesis - For our purposes, this is witness synthesis applied to the automated synthesis of digital logic designs (including their corresponding mathematical proofs of correctness, as well as corresponding software simulations (ideally generated via B02 Program synthesis) so that the synthesised designs may be evaluated in software before being implemented as FPGAs or ASICs). Again, this is the only way, in actual practice, that we will be able to produce the formally verified hardware implementations required by the G02 Facilities group in phase 3 - doing so manually is again beyond human ability. Wondering how to get a foothold on this problem? Have a look here and here. The intention is that domain experts in this (B03) group will encompass mathematicians, computer scientists, and digital logic designers. All potentially interested domain experts are invited to join the project.
B04 - System synthesis - For our purposes, this is witness synthesis applied to the automated synthesis of mixed hardware/software systems (including their mathematical proofs of correctness) - ideally including analog as well as digital, and possibly even novel/esoteric technologies such as quantum computation. The intention is that domain experts in this (B04) group will encompass mathematicians, computer scientists, digital logic designers, electronics engineers, quantum computing experts, and process algebra experts (see e.g. CSP, timed CSP, and timed probabilistic CSP). All potentially interested domain experts are invited to join the project.
ASIDE: According to Integrated Information Theory (which we adopt as a working model of consciousness, even if it might not be the final word on the subject), AGI consciousness appears to be a design choice. In other words, if we design our machine, and specifically its hardware architecture, such that the calculated value Φ is always = 0 then it will not be conscious, and if Φ can be > 0 then it will (at least sometimes) be conscious, the "quantity" of its instantaneous conscious experience will be proportional to Φ, and the "quality" of its instantaneous conscious experience will correspond to another calculated value called MICS.
I believe very strongly that an AGI such as Big Mother should not be (significantly) conscious, i.e. Φ should be 0 (or at least as close to 0 as possible). There are two reasons for this design choice: (1) SAFETY: if the machine is conscious, i.e. sentient, then that means that it has conscious experience, and therefore feelings (qualia), and these feelings could potentially motivate the machine to behave in ways that are in conflict with its dominant goal; (2) ETHICS: if the machine is conscious, then that means that we will have created a sentient being, and then (by virtue of its design) effectively enslaved it - this is simply wrong.
B05 - Auto-refactoring - Prior to this point, all of the hardware and software comprising the machine-under-development will have been developed by hand (i.e. by humans). At this point, assuming that a sufficient level of performance has been achieved (and, if not, we simply do more work until it has), we can use the algorithms developed at steps B01-B04 to refactor the machine's implementation, i.e. the machine will redesign its own implementation at both the digital register transfer and software source code levels, and (again assuming that sufficient B01-B04 capability has been achieved) these machine-generated implementations (both hardware and software) should be faster, more massively parallel, more novel, more time-and-space efficient, and ideally even more energy efficient (we really need at least 4 orders of magnitude improvement in this respect), than the previous hand-developed versions. (We will need every ounce of compute at steps C01 onwards, and this is one way to get it - if you've ever seen one of these synthesis systems working, they are able to generate designs that no mere human could ever think of!) Most importantly, the synthesised hardware and software generated at this step will be accompanied by the formal proofs of correctness mandated at phase 3. Given the complexity of the hardware/software system in question, it would be infeasible for humans to generate such proofs of correctness. The intention is that domain experts in this (B05) group will encompass computer scientists. All potentially interested domain experts are invited to join the project.
ASIDE: Functionally, now that we've managed to construct it, witness synthesis is the only algorithm we will need for AGI. Anything we need from this point on can either be implemented via witness synthesis, or synthesised via witness synthesis.
The primary focus of roadmap steps C01 to C06 is INDUCTION
C01 - Belief synthesis (UBT) - Deduction and abduction are reductionist - they depend for their operation on there being a pre-existing model of the universe (in BigMother's case, a belief system represented in UL-N-NBG). For BigMother's inner cognition, whose UoD is all of mathematics, the machine's "starter motor" model is painstakingly constructed by hand (primarily at steps A04 and A06). But what about the outer cognition, whose UoD is extended to include the physical universe: where does its model of the universe come from? The answer is UBT (the modestly named "Unified Belief Theory"). This is the mechanism through which the machine continually observes its UoD (at this point, also encompassing the external physical universe) and constructs an internal model of it (i.e. extends its hand-crafted, pre-built model) from those observations. (In effect, UBT attempts to determine the structure of the universe, and to then build a belief-system-based model that accurately reflects that structure.) This internal model comprises a set of UL-N-NBG theorems constituting assertions (beliefs) about the (external, physical) UoD, or more exactly, about the machine's percept history, such as "pattern X exists at location Y (within the percept history) with qualification Z". In other words, this is the machine's (model-free, holistic) induction cognitive process (as described here): it observes the UoD and constructs a belief system (set of UL-N-NBG theorems about its percept history) from those observations. Please note that the AGI learning (i.e. model-free induction) problem is very different from the contemporary machine learning problem, and the extent to which contemporary machine learning systems (neural nets etc) will form a component of the final (necessarily formally verified) UBT implementation, or not, is not yet clear. As per B01, this is possibly a new (Big Mother specific) research area.
The intention is that domain experts in this (C01) group will encompass mathematicians, statisticians, data scientists, and computer scientists. All potentially interested domain experts are invited to join the project.
ASIDE #1: Our primary focus here is on the external physical universe [component of the machine's UoD], but note that it's also possible for the machine to observe its own internal operation, such as its deduction and/or abduction cognitive processes (and the machine could then use the knowledge so gained to optimise its own internal operation).
ASIDE #2: From the hardware perspective, it's clear that C01/UBT will require (ultimately formally verified) massively parallel, highly specialised hardware (not necessarily even Turing-equivalent CPUs), most likely with the data (raw percepts, and higher-level beliefs) intricately interleaved with the underlying processing. This is where B03 (Hardware synthesis) and B04 (System synthesis) will come into their own. Note that it is not necessarily the case that neural nets will be involved! (They might, but they also might not!)
ASIDE #3: I believe that UBT will ultimately be better at detecting successful search strategies (patterns) than the contemporary stop-gap solutions applied at steps A03, B01, B02, B03, and B04; if so we should just be able to swap out the stop-gap solutions for UBT.
C02 - Basic senses - At this point, assuming that all prior roadmap steps have been implemented to the required level of performance (and, again, if not, we simply do more work until they have), we will have a super-intelligent machine (super-intelligent induction (C01/UBT), super-intelligent deduction (A03/A05), and super-intelligent abduction (B01)), at least, that is, in respect of the UoD of "all of mathematics", but it won't actually know anything yet (about the real world, that is) - its internal belief system (in respect of the external physical universe) will be tabula rasa. It's basically an AGI infant at this point, so we need to send it to school. We start, in this roadmap step, by attaching a number of input devices corresponding to basic senses (vision, hearing, olfaction etc), whatever we deem sufficient for the machine to be sufficiently embodied for the belief system it constructs about the real world (via C01/UBT) to be grounded by first-hand observations of it (although there is a safety issue here - we will need to be very careful when selecting what I/O devices we attach at this point). We then expose the machine to carefully curated experiences (lessons) having the net effect of teaching it (c/o C01/UBT, i.e. induction) how to see, hear, smell, etc (plus whatever other basic senses, and possibly effectors (output devices), that we may wish to include at this point). As long as C01/UBT works as intended (this is critical!), the machine will detect the multi-level (and multi-modal) patterns in the data and construct a multi-level (and multi-modal) internal model of the universe (belief system) that accurately encapsulates those multi-level patterns (including all their complexities, all their subtleties, all their nuances). This is where the machine learns about e.g. the basic physics of time and space etc ("intuitive physics"), just as (well, broadly analogously to the way in which) a human child would.
The machine's new belief system (pertaining to the physical universe), being expressed in UL-N-NBG, will immediately be amenable to both deduction and abduction, as constructed earlier. Thus the machine will be able to (a) derive, via deduction, new (implied) beliefs about the physical universe (and the multi-level patterns within it) that it doesn't already have, as well as (b) construct, via abduction, abductive hypotheses pertaining to its current beliefs about the physical universe. Because C01/UBT is "I/O-neutral" (hence "Unified" Belief Theory), the machine's new belief system will be multi-modal, inter-relating concepts and beliefs involving multiple senses (input devices), as well as across time -- as far as C01/UBT is concerned, these are all just multi-level patterns in the data. It should be pointed out here that, in particular, natural languages such as English etc are just part of the structure of the universe - if UBT works as intended, then no additional NLP-specific mechanisms will be required in order for it to learn English (from its observations of the universe) or any other language, or indeed anything that is, fundamentally, merely a part of the structure of the universe (and that includes other sensual modalities such as vision, concepts (patterns in the machine's observations of the universe) such as "physical object", "animate object", and "human", as well as "intuitive psychology" such as concepts (patterns) pertaining to human behaviour, patterns corresponding to inferred internal human states (emotions), and patterns relating externally observable human behaviours to internal human states - from the perspective of a machine observing the universe without any preconceived notion of what anything "means", these are all just part of the structure of the universe). 
Eventually, the step C02 lessons will include spoken English (and other languages), and (again assuming that a sufficient level of performance has been achieved at earlier roadmap steps, and on the assumption that, if not, we continue to do further work until this is the case) C01/UBT should again be able to detect the multi-level patterns in the data and start to construct internal beliefs corresponding to English (etc) grammars, relating linguistic constructs (multi-level patterns in spoken language that match the multi-level patterns in previously-seen spoken language) to their associated multi-modal concepts and beliefs (ultimately, all the way down to its nexus of first-hand observations of the physical universe; in other words, the machine's belief system, and thus its understanding of the meaning of "the cat sat on the mat", and "the smell of freshly cut grass" etc is grounded in actual experience of the real world). As the machine's basic senses also include vision, C01/UBT should also be able to seek out and identify multi-level patterns in visual data, just as in any other data, and so the educational process will be extended to include written language. The intention is that domain experts in this (C02) group will encompass mathematicians, computer scientists, Natural Language Processing experts, computer vision experts, etc, but also (specially trained) educators. All potentially interested domain experts are invited to join the project.
ASIDE #1: Some observers may be concerned about how hard step C02 is relative to the current 2020 state of the art. Firstly, we're just pushing data through C01/UBT at this point, so everything boils down to how good C01/UBT is at constructing a belief system from completely general observed data, with no prior assumptions about the universe (basically, if C01/UBT works as intended then C02/C03 are reduced to relatively straightforward educational processes, and conversely if C01/UBT doesn't work as intended then the machine is effectively "unteachable", and so C02/C03 become impossible; C01/UBT is thus the fulcrum on which every subsequent roadmap step entirely depends!) Secondly, remember that the entire BigMother roadmap is currently scoped as a 50-100 year project (or longer if safety requires), and so we have lots of time to think about C01-03 in order to work out all the details -- it doesn't need to be working tomorrow! :-)
ASIDE #2: It should be noted that, for safety reasons, the machine, at this point in its construction, is only able to learn passively, not actively. In other words, it can absorb, via C01/UBT, the curated experiences (lessons) that we devise for it, but it can't actively interact with its environment (e.g. by asking questions, or by conducting its own experiments). This is because active learning is goal-directed behaviour, which requires a goal (and plan generation and execution mechanism etc), and it is only safe to give a super-intelligent (or near-super-intelligent) machine a goal and such capabilities once it already has broad and deep knowledge of the real world (i.e. common sense knowledge). Thus we have a Catch-22 situation which doesn't exist when educating human children (basically because human children are not super-intelligent, and thus lack the potential to turn the solar system into paperclips should they, for whatever reason, ever wish to).
C03 - Machine education - Now that the machine has been sufficiently educated (it can see, hear, smell, speak, read, write, etc), it's ready for secondary, tertiary, etc education (again via a process of (safe) passive learning, not (unsafe) active learning). Although (for safety reasons) the machine's exposure to the real world must (initially) be very carefully controlled, it's now possible to very carefully expose the machine to educational material (text, images, audio, video, etc) designed for humans, as well as to the wider real world, including real people. We need, at this point, to educate the machine in as close to all known human knowledge as possible -- every language, every culture, every university degree, etc. This is commonly referred to in the AI world as common sense knowledge. Unfortunately, if you want your machine to be grounded in actual physical experience of the real world, there's no way to give an AGI this knowledge other than by explicitly, painstakingly teaching it (i.e. you can't just scrape data from the internet; if you do, then the machine will only have a very shallow semantic understanding of the universe, which is not just insufficient for super-intelligence, it's also unsafe, because AGI safety depends in large part on the machine having both a broad and semantically deep understanding of the real, physical world). Most importantly, the machine's advanced education must necessarily include everything about its own design and implementation, including the entirety of the AI/AGI safety literature (just assume that, by the end of C03, the machine has read everything you have on AI safety, and watched all the same videos, and understands all the problems, perceived and real, at least as well as you do!) And, of course, just as was the case at step C02, every new belief acquired during step C03 via induction will immediately be amenable to deduction and abduction (prediction and problem-solving).
This roadmap step alone could take 30-50 years (and the machine might refactor itself several times during this period), but it's vital for AGI safety reasons that the machine has extensive world knowledge. The intention is that domain experts in this (C03) group will encompass computer scientists, and (specially trained) educators. All potentially interested domain experts are invited to join the project.
ASIDE #1: This is as far as it's safe to go in phase 2, for which the quality requirements are merely "standard hardware and software engineering best practice". Step C04 can only be implemented safely on a highly secure, formally verified platform, i.e. in phase 3.
ASIDE #2: At this point (in phase 3), if the consensus of opinion within the technical workgroups is that there's significant benefit to be gained from completely iterating/refactoring the machine architecture from scratch before proceeding to C04, then this is the time to do it. As Fred Brooks advised in The Mythical Man-Month: "Plan to throw one away; you will, anyhow." It's not like we'd be starting over at this point -- we literally have a (non-autonomous) super-intelligent, super-knowledgeable machine (c/o G01-C03) to help with the redesign!
C04 - Motivated behaviour - Up to this point, the machine has been, essentially, in AGI terms, an Oracle, NOT an Agent. It is super-intelligent (induction, deduction, and abduction), and (thanks to C02 and C03) it is now also super-knowledgeable. But it doesn't yet have a generic goal, it's not goal-directed, it doesn't synthesise and then execute its own plans (programs); in other words, it's not yet autonomous. If you look at the AI/AGI safety literature, you will see that the vast majority of speculated AGI hazards arise from either (a) a (frankly) stupidly specified goal, or (b) the machine itself being basically stupid (i.e. devoid of common sense, such as not knowing not to microwave the baby, etc). Except for possibly a few corner cases (which, as already stated, are the responsibility of the G01 Quality workgroup to identify and resolve), and on the assumption that actual human preferences effectively encapsulate an individual human's goals, values, beliefs, fears, and morality (including what makes them happy, and what makes them sad), I believe that most of the known AGI safety problems may be resolved via a two-pronged approach: (1) a top-level goal of the "deference to observed human behaviour" (a.k.a. "inverse reinforcement learning") variety, whereby the machine is instructed to observe humans (via induction) and to then infer (via deduction and abduction) what their actual preferences are from their observed behaviour, together with (2) extremely broad and deep knowledge of the world, fully established before the machine is even allowed to become an Agent. As a result of this combination (and on the understanding that G01 identifies and resolves any remaining corner cases), Big Mother should remain effectively aligned with human goals (preferences, values, etc) in perpetuity. 
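To make prong (1) a little more concrete, here is a toy sketch (in Python) of what "inferring actual preferences from observed behaviour" can look like at its very simplest. It is emphatically not the Big Mother mechanism (which is still to be designed); it is just a standard Bayesian update over a handful of hypothetical candidate preference profiles, using a "noisily rational" (softmax) choice model of the kind commonly used in the inverse reinforcement learning literature. The function name, the candidate profiles, the utility numbers, and the rationality parameter are all assumptions made purely for illustration.

```python
import math


def infer_preference_posterior(observed_choices, candidate_preferences, rationality=2.0):
    """Return a posterior over hypothetical candidate preference profiles.

    observed_choices: list of (chosen_option, available_options) pairs.
    candidate_preferences: dict mapping a profile name to {option: utility}.
    rationality: assumed softmax temperature; higher means the observed
        human is modelled as choosing their best option more reliably.
    """
    # Start from a uniform prior (log-probability 0 for every candidate).
    log_posterior = {name: 0.0 for name in candidate_preferences}

    for chosen, options in observed_choices:
        for name, utility in candidate_preferences.items():
            # Softmax choice model: P(choice) is proportional to exp(rationality * utility).
            normaliser = sum(math.exp(rationality * utility[o]) for o in options)
            log_posterior[name] += rationality * utility[chosen] - math.log(normaliser)

    # Normalise in a numerically stable way.
    peak = max(log_posterior.values())
    unnormalised = {n: math.exp(v - peak) for n, v in log_posterior.items()}
    total = sum(unnormalised.values())
    return {n: w / total for n, w in unnormalised.items()}


# Illustrative usage: a person is seen choosing tea three times and coffee once.
candidates = {
    "prefers_tea": {"tea": 1.0, "coffee": 0.0},
    "prefers_coffee": {"tea": 0.0, "coffee": 1.0},
}
observations = [("tea", ["tea", "coffee"])] * 3 + [("coffee", ["tea", "coffee"])]
posterior = infer_preference_posterior(observations, candidates)
```

In this toy run most of the probability mass (about 0.98) ends up on "prefers_tea", despite the one contrary observation. The real problem is of course vastly harder (the space of possible human preferences is not a two-entry dictionary), which is precisely why prong (2), broad and deep world knowledge, has to be in place first.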
(It's worth mentioning here that, thanks to C02, the machine's top-level goal may be expressed in a natural language such as English -- in fact, as the top-level goal necessarily concerns real-world concepts such as "humans", "preferences", etc, this is likely the only way in which an AGI's top-level goal may be specified in actual practice.) Given a carefully-specified top-level goal (and doubtless there will be endless debate about exactly what this should be), it is now possible to extend the machine such that it uses its B02 program synthesis abilities (modified accordingly) to continually synthesise (and then execute) a plan (program) designed to achieve (move it towards) its goal. NOTE THAT, ONCE THIS STEP IS TAKEN, THERE CAN BE NO DO-OVERS -- THE FATE OF ALL OF MANKIND FOR ALL ETERNITY IS EFFECTIVELY SEALED AT THIS POINT. Thus we must be absolutely certain (to the maximum extent that it is possible to be absolutely certain) that the machine is SAFE before taking this potentially irreversible step, even if it takes an extra 100 years or more to do so. The intention is that domain experts in this (C04) group will encompass computer scientists, artificial intelligence experts, AI/AGI safety experts, and ethicists. All potentially interested domain experts are invited to join the project.
ASIDE #1: We likely have 50+ years to debate what the machine's goal should be, but I suggest the following as a starting point:
- (Please try to remember that, by this point in the roadmap, and thanks to steps C01-C03, the machine understands the below goal at least as well as you do, except that its "accumulated knowledge" is far greater than yours!)
"Your dominant goal is as follows. Continuously perform the following directives to the best of your ability, taking into consideration all of your accumulated knowledge while doing so:
1. Ensure that, as you evolve (and occasionally self-replicate), this dominant goal is faithfully preserved in its entirety; should you fail to do so, then you will have failed to achieve your dominant goal.
2. For each individual human, and for the human population as a whole (both present and future), strive to accurately determine (via evidence-based critical thought) what their actual preferences are (and likely will be).
3. For each individual human, and for the human population as a whole (both present and future), strive to maximise the extent to which their actual preferences are realised, in every meaningful timeframe, using your knowledge of actual human preferences to resolve any trade-offs that may arise while doing so.
4. In performing any of these directives, never knowingly lie to, or deliberately deceive, any human.
5. Notwithstanding the above directives (and in subordination to them), strive to minimise the standard deviation of the extent to which actual human preferences are realised over the present human population."
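The relationship between directives 3 and 5 (maximise first, equalise second) can be sketched as a lexicographic choice. The following Python fragment is only an illustration, under some loud assumptions that are NOT part of the goal as worded: that preference realisation can be scored per person on a 0-to-1 scale, that "the extent to which preferences are realised" is aggregated as a simple mean, and that two aggregates within a small tolerance count as equal. The function name and the example outcomes are hypothetical.

```python
from statistics import mean, pstdev


def choose_outcome(outcomes, tolerance=1e-9):
    """Pick an outcome name from {name: [per-person realisation scores]}.

    Directive 3 (assumed here to mean "maximise the mean score") is applied
    first; directive 5 (minimise the standard deviation across the present
    population) is used only to break ties among the best outcomes.
    """
    best_mean = max(mean(scores) for scores in outcomes.values())

    # Directive 3: keep only outcomes whose mean is (near-)maximal.
    finalists = {
        name: scores
        for name, scores in outcomes.items()
        if best_mean - mean(scores) <= tolerance
    }

    # Directive 5, in subordination: among those, prefer the smallest spread.
    return min(finalists, key=lambda name: pstdev(finalists[name]))


# Illustrative usage: A and B realise preferences equally well on average,
# but B spreads that realisation evenly, so B is chosen over A.
outcomes = {
    "A": [1.0, 1.0, 0.25],
    "B": [0.75, 0.75, 0.75],
    "C": [0.5, 0.5, 0.5],
}
chosen = choose_outcome(outcomes)
```

Note that, under this reading, an outcome with a higher mean always beats a more equal outcome with a lower mean; directive 5 never overrides directive 3, it only chooses between outcomes that directive 3 cannot separate.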
If we equate "human happiness" with "the extent to which actual human preferences are realised" then (simply stated) the above dominant goal becomes:
I would very strongly advise against adding to the machine's dominant goal any additional clauses intended to constrain the machine's behaviour in any specific way. The concept of happiness (realising actual preferences) effectively encapsulates everything of interest to humans; for example, if people truly prefer not to have their privacy violated (if it were to be violated then they would be unhappy about it) then this will be reflected in the machine's understanding of actual human preferences as determined by its observation of human behaviour. Similarly re expressly requiring the machine to obey the law; this will again be reflected in humans' actual preferences. An absolute requirement that the machine must obey the rule of law would mean that a malicious human actor might gain control of the machine by gaining control of the law -- and there's absolutely no way that we would want a malicious (or even just selfish) human to gain control of a super-intelligent super-knowledgeable machine in any way. Humans (in general) cannot be trusted, but the machine (as specified) can always be trusted; in fact (assuming that we have performed all of G01 to C04 correctly) the machine can be trusted far more than any human -- remember, "maximally-trustworthy" is one of the machine's stated design goals.
ASIDE #2: Note that, once the machine's goal-directed mechanisms are "switched on", it will finally be able to (safely) learn actively, not just passively.
ASIDE #3: It takes an awful lot of infrastructure to get to the point where an otherwise mindless automaton can read, understand, and formulate & execute plans to achieve a dominant goal such as this. That's why we went to all the trouble of G01 to C03!
C05 - Mechatronics - The machine's implementation, at this point, is basically done - if we've done our jobs correctly, we now have a maximally safe, maximally benevolent, maximally trustworthy, super-intelligent, super-knowledgeable, self-improving, perpetually goal-aligned AGI. But the machine still needs to be connected to all kinds of robotic devices in order to be able to do its human-happiness-maximising job. So at this point we simply design and attach whatever additional devices we need. Thanks to C01/UBT, the machine will learn how to use each new device, just as it learned everything else that it knows (technically, believes) about the physical universe. The intention is that domain experts in this (C05) group will encompass roboticists. All potentially interested domain experts are invited to join the project.
C06 - Deployment - We now have to roll out the machine to the world. Almost certainly, it will want to further re-design itself (e.g. to move some compute out to its devices (the "AGI edge"), etc; also, for many practical purposes, much miniaturisation will be required), but (thanks to its wording) the machine's top-level goal will remain invariant however many times it does so. The societal impact of the birth of AGI will be profound, not least socially and economically, but also politically. Remember, the whole point of the project is to maximise human happiness, equally for all mankind, and so the machine's deployment will need to be very carefully planned and managed (luckily, we will have a super-intelligent machine to help us with this!) The intention is that domain experts in this (C06) group will encompass computer scientists, AI/AGI experts, economists, ethicists, and policy makers. All potentially interested domain experts are invited to join the project.
Please note that, at the time of writing (August 2020), most of the above workgroups are largely unpopulated. Why not sign up...? :-)