The Big Mother AGI Architecture (or Framework) comprises:
The Big Mother Technical Plan (or Roadmap) comprises 19 nominal Roadmap steps:
ASIDE: The machine is being designed/developed top-down (working backwards from the target behaviour). This means the current (3rd-level) design, although end-to-end, is conceptual in nature (as well as a framework for collaboration), and will gradually be refined down to a concrete implementation, with information/detail being added at each refinement step. (In the same manner, the following informal description will gradually be refined into a more scientifically rigorous exposition).
There is a Big Mother workgroup for each of the 19 Roadmap steps G01-G02, A01-A06, B01-B05, and C01-C06.
Roadmap steps G01 and G02 are global in nature, and pertain to both the INNER and OUTER COGNITIONs
G01 - Quality - Due to the unimaginable potential cost of getting AGI wrong, the project is quality-driven, rather than e.g. cost-driven. This group is responsible for all things quality-related, where quality means ensuring that the project achieves its stated objectives (as described elsewhere, including "good things happen, bad things don't", "maximally-safe, -benevolent, and -trustworthy", etc). In particular, safety must be an invariant property of the machine at each step of its construction (as it emerges from the roadmap), as well as in respect of any results that may be exploited commercially by the S06 Exploitation group (part of the Organisational Plan). The intention is that domain experts in this (G01) group will encompass general quality management, general best-practice hardware and software engineering, safety-critical systems (see e.g. Safety-Critical Systems Club), and of course AI/AGI safety (see e.g. Machine Intelligence Research Institute, Future of Humanity Institute, Future of Life Institute, Centre for the Study of Existential Risk, AI Alignment Forum). All potentially interested domain experts are invited to join the project.
G02 - Facilities - The project is broadly organised into three phases (i.e. we will at least notionally be going through the entire roadmap three times): (1) prototyping and evaluation, (2) working implementation, and (3) final implementation. There will be different quality requirements for each phase, e.g. in phase 3 everything (even remotely possible) needs to be formally proven (which then implies that only technologies that are sufficiently mathematically defined may be incorporated into the machine's phase 3 design). When we get to phase 3, group G02 will be responsible for designing and building the (highly secure and formally verified) supercomputer on top of which Big Mother 1.0 will be implemented (and this will be facilitated by phase 2 implementations of e.g. B02 program synthesis, B03 hardware synthesis, and B04 system synthesis). The intention is that domain experts in this (G02) group will encompass system architects, HPC (High Performance Computing) experts, chip designers, software engineers, security experts, etc. All potentially interested domain experts are invited to join the project.
The primary focus of roadmap steps A01 to A06 is DEDUCTION
A01 - Universal Logic (UL-0) - The machine must necessarily maintain an internal model of its UoD (through which its three cognitive processes - induction, deduction, and abduction - will all communicate), and this internal model must necessarily be represented somehow. Because the machine is an AGI ("G" for "General"), this representation mechanism must be as "general" as possible, i.e. able to represent "absolutely anything". First-order logic with equality (FOLEQ) is foundational, i.e. able (with a little effort) to represent "all of mathematics", which for our purposes is sufficient. Thus UL-0 (the machine's internal representation mechanism) is essentially FOLEQ plus descriptions ("some X such that P(X)" and "the X such that P(X)") plus named-and-parameterised definitions. The intention is that domain experts in this (A01) group will encompass logicians and mathematicians. All potentially interested domain experts are invited to join the project.
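As a rough illustration of the ingredients listed above (FOLEQ terms and formulas, the two description operators, and named-and-parameterised definitions), here is a minimal Python sketch of an abstract syntax. It is not the UL-0 definition: every class name, the `show` function, and the concrete rendering are assumptions made purely for illustration, and only a few connectives are included.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class App:
    """Function application f(t1, ..., tn); a constant is a 0-ary application."""
    symbol: str
    args: Tuple["Term", ...] = ()


@dataclass(frozen=True)
class Some:
    """Indefinite description: 'some X such that P(X)'."""
    var: Var
    body: "Formula"


@dataclass(frozen=True)
class The:
    """Definite description: 'the X such that P(X)'."""
    var: Var
    body: "Formula"


@dataclass(frozen=True)
class Rel:
    """Relation application R(t1, ..., tn); equality is the relation '='."""
    symbol: str
    args: Tuple["Term", ...]


@dataclass(frozen=True)
class Not:
    body: "Formula"


@dataclass(frozen=True)
class ForAll:
    var: Var
    body: "Formula"


Term = Union[Var, App, Some, The]
Formula = Union[Rel, Not, ForAll]


@dataclass(frozen=True)
class Definition:
    """Named-and-parameterised definition: name(params) := body."""
    name: str
    params: Tuple[Var, ...]
    body: Union[Term, Formula]


def show(node) -> str:
    """Render a term or formula as text (illustrative concrete syntax only)."""
    if isinstance(node, Var):
        return node.name
    if isinstance(node, (App, Rel)):
        if not node.args:
            return node.symbol
        return f"{node.symbol}({', '.join(show(a) for a in node.args)})"
    if isinstance(node, Some):
        return f"(some {node.var.name} such that {show(node.body)})"
    if isinstance(node, The):
        return f"(the {node.var.name} such that {show(node.body)})"
    if isinstance(node, Not):
        return f"not {show(node.body)}"
    if isinstance(node, ForAll):
        return f"(for all {node.var.name}: {show(node.body)})"
    raise TypeError(f"not a UL-0 node: {node!r}")


X, Y = Var("X"), Var("Y")
# 'the X such that for all Y: not member(Y, X)', i.e. a definition of the empty set
empty_set = Definition("empty_set", (), The(X, ForAll(Y, Not(Rel("member", (Y, X))))))
```

Here `show(empty_set.body)` renders the definite description as text. The point is only that descriptions are terms built from formulas, which is what distinguishes this representation from plain FOLEQ.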
A02 - Soundness & completeness - First-order logic with equality (FOLEQ) is both sound and complete (as will become clear at step A05, this was one of the reasons it was chosen for the project). UL-0 is intended to be merely a conservative extension of FOLEQ, and therefore also sound and complete. However, as it's notoriously easy to make some schoolboy error and break your logic, a quality control step is required. The objective of roadmap step A02 is therefore to formally prove, using a third-party system such as (say) Prover9, that UL-0 is still sound and complete (i.e. that we haven't inadvertently broken FOLEQ). The intention is that domain experts in this (A02) group will encompass logicians and mathematicians. All potentially interested domain experts are invited to join the project.
A03 - Theorem proving - Basically, construct a (natural deduction) theorem prover for UL-0 called ULTP. This is the machine's first cognitive process (deduction), and phase 2 coding on this has already started (one of the reasons why I have zero time!) Theorem proving is a state space search problem, and therefore computationally expensive, and so various speedups will gradually be applied (here at step A03, and later at steps B01-B04). The intention is that domain experts in this (A03) group will encompass logicians and (especially discrete) mathematicians, as well as automated reasoning experts, and machine learning experts. All potentially interested domain experts are invited to join the project.
ASIDE: The trick to any state space search problem is to use all of the information available (see, for example, George Polya's "How to Solve It", 1945, and Imre Lakatos' "Proofs and Refutations", 1976), including (1) object-level information derived directly from the problem description itself, (2) meta-level information derived from reasoning "about" the problem, rather than "in" the problem, and (3) statistical information derived from experience of attempting many instances of the problem (or similar problems) and identifying those search patterns (strategies) that tend to be most successful in specific situations. In respect of (3), the current plan is to use contemporary neural-net-based machine learning techniques (e.g. reinforcement learning), as at least a stop-gap solution.
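The three kinds of information described above can be pictured as three terms in the score that orders a best-first search. The following Python sketch is only a toy (reaching a target number using the moves "+1" and "*2"), not the ULTP search. The function names, the additive way of combining the three terms, and the zero-valued `statistical` placeholder (standing in for a learned model) are all assumptions.

```python
import heapq
import math
from itertools import count


def best_first_search(start, is_goal, successors, score, max_expansions=10_000):
    """Expand the lowest-scoring state first; return a path to a goal, or None."""
    tie = count()  # tie-breaker so the heap never has to compare states
    frontier = [(score(start), next(tie), start, [start])]
    seen = {start}
    for _ in range(max_expansions):
        if not frontier:
            return None
        _, _, state, path = heapq.heappop(frontier)
        if is_goal(state):
            return path
        for nxt in successors(state):
            s = score(nxt)
            if nxt in seen or s == math.inf:
                continue  # already queued, or ruled out by meta-level reasoning
            seen.add(nxt)
            heapq.heappush(frontier, (s, next(tie), nxt, path + [nxt]))
    return None


TARGET = 37


def object_level(n):
    # (1) read directly off the problem statement: distance to the target
    return abs(TARGET - n)


def meta_level(n):
    # (2) reasoning about the problem: both moves increase n,
    # so a state past the target can never reach it
    return math.inf if n > TARGET else 0.0


def statistical(n):
    # (3) hypothetical placeholder for a learned estimate of search effort
    return 0.0


def score(n):
    return object_level(n) + meta_level(n) + statistical(n)


path = best_first_search(1, lambda n: n == TARGET, lambda n: (n + 1, n * 2), score)
```

The meta-level term prunes whole regions of the state space without visiting them. That is the kind of saving the aside is pointing at, and it is independent of whatever the statistical term later contributes.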
A04 - NBG toolkit - In its raw, low-level form, FOLEQ is insufficiently expressive for our purposes. A first-order theory adds additional axioms (called proper axioms) to FOLEQ, thereby effectively giving semantics to specific function and relation symbols. In our case we will be adding the NBG set theory axioms (extensionality, pairing, class existence, regularity, sum set, power set, infinity, and limitation of size) to UL-0, thereby extending (and strengthening) UL-0 to UL-0-NBG, i.e. first-order NBG set theory. At this point, given the definition mechanisms we incorporated into UL-0, it becomes possible to do significant mathematics. And our first job, mathematically speaking, is to define a mathematical toolkit on top of UL-0-NBG. This involves a lot of low-level mathematical definitions (e.g. we will be using Conway's Surreal Numbers for the basic number systems - Naturals, Integers, Rationals, Reals, Ordinals, and Cardinals, and from these we will be using the Cayley-Dickson construction to derive Complex numbers, Quaternions, Octonions, and Sedenions), and a lot of formal proofs (for which we will have the ULTP - I knew we wrote it for a reason!) As a minimum, we need to define sufficient mathematics to support step A05 (including, most importantly, transfinite induction and recursion), but the more mathematics we can hand-define the better (this will be especially apparent when we get to B01). The intention is that domain experts in this (A04) group will encompass logicians and (especially discrete) mathematicians, as well as set theorists. All potentially interested domain experts are invited to join the project.
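For readers unfamiliar with the Cayley-Dickson doubling step mentioned above, here is a minimal Python sketch. It uses Python integers as a stand-in for the base number system (the toolkit itself would define these in UL-0-NBG, not in code), represents an element of the 2^n-dimensional algebra as a flat list of 2^n coefficients, and uses one common sign convention, (a, b)(c, d) = (ac - d*b, da + bc*), where * is conjugation. The function names are illustrative.

```python
def conj(x):
    """Conjugate: (a, b)* = (a*, -b); a 1-dimensional element is its own conjugate."""
    if len(x) == 1:
        return list(x)
    h = len(x) // 2
    return conj(x[:h]) + [-v for v in x[h:]]


def add(x, y):
    return [p + q for p, q in zip(x, y)]


def sub(x, y):
    return [p - q for p, q in zip(x, y)]


def mul(x, y):
    """Cayley-Dickson product: (a, b)(c, d) = (ac - d*b, da + bc*)."""
    assert len(x) == len(y), "operands must live in the same algebra"
    if len(x) == 1:
        return [x[0] * y[0]]
    h = len(x) // 2
    a, b, c, d = x[:h], x[h:], y[:h], y[h:]
    return sub(mul(a, c), mul(conj(d), b)) + add(mul(d, a), mul(b, conj(c)))


def norm_sq(x):
    return sum(v * v for v in x)


# Length 2 = complex numbers, 4 = quaternions, 8 = octonions, 16 = sedenions.
i, j, k = [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]
```

With this convention `mul([0, 1], [0, 1])` gives `[-1, 0]` (i squared is -1 in the complex numbers), `mul(i, j)` gives `k`, and `mul(j, i)` gives the negation of `k`. This shows how each doubling gives up an algebraic property, commutativity at the quaternions being the first to go.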
A05 - Universal Logic (UL-N) - Once the mathematical toolkit is sufficiently well defined, we will have everything we need to define (in UL-0-NBG) the syntax, semantics, and proof rules of (essentially) UL-0. Please note that there's nothing actually circular going on here. What we end up with is (effectively) a clone of UL-0 (which we will call UL-1), with UL-0-NBG acting as its metalanguage. Because UL-0 definitions are also valid UL-1 definitions (and similarly at each level), we end up with an infinite stack of UL-Ns (UL-0, UL-1, UL-2, ...) together with a corresponding infinite stack of UL-N-NBGs (UL-0-NBG, UL-1-NBG, UL-2-NBG, ...), where each UL-N-NBG is metalanguage to UL-N+1:
UL-N has exactly the same expressive power as UL-0 (i.e. the meaning of any UL-N (N > 0) wff or term is simply defined by translation back into UL-(N-1)-NBG), but it's easier to use (and think about) in relation to metamathematical concepts (which, even in everyday mathematics, abound). And so now, our machine (via ULTP) can "think" (deductively, at least) at both the object (e.g. UL-1) and meta (e.g. UL-0-NBG) levels (as well as at the meta-meta level, meta-meta-meta level, etc). This actually adds significant capability - for example, we can now express statements about the NBG definitions we constructed at A04 (such as "NBG definition X has property P(X)"), and then ask the ULTP (operating way down at the UL-0 level) if they are theorems. Also, thanks to soundness and completeness, when asked to prove a UL-N theorem, where N > 0, the ULTP is now able to switch back and forth between thinking (1) "in" the UL-N system (syntactically) and (2) "about" the UL-N system (semantically). In some cases, it should even be able to prove (metamathematically) that a proof of a putative UL-N (N > 0) theorem does not exist. Finally, should it ever be the case that we simply cannot implement some desired algorithm (at some later roadmap step) without using a more esoteric logic L, then we can again use UL-N-NBG as the metalanguage in which to define the specific logic L that we need, and again we will be able to ask the ULTP, acting way down at the UL-0 level, to try to prove putative L theorems for us. The intention is that domain experts in this (A05) group will encompass logicians, and metamathematicians. All potentially interested domain experts are invited to join the project.
A06 - Probability & statistics - When we build the machine's outer cognition (steps C01 to C06), thereby connecting the deductive-abductive inner cognition to the real world, everything (all belief pertaining to the physical universe) becomes uncertain. And so at this step (A06) we define probability, and at least a basic statistical toolkit, on top of the NBG toolkit (not by adding any further axioms, but just via definitions based on the relevant axioms). Once we have defined probability and statistics in this way, we will be able to state (and ask the ULTP to prove) probabilistic and/or statistical theorems. This step has been delayed until after A05 because the Cox-Jaynes axioms (which are higher-level) might I suspect be more suitable for AGI than the traditional Kolmogorov axioms (but I also suspect we'll define both). The intention is that domain experts in this (A06) group will encompass mathematicians and statisticians. All potentially interested domain experts are invited to join the project.
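For reference, the two starting points mentioned above can be stated briefly. The first block gives Kolmogorov's axioms for a probability space; the second gives the product and sum rules that the Cox-Jaynes approach derives for degrees of plausibility conditional on background information. This is a standard textbook statement, not the project's eventual UL-N-NBG formulation.

```latex
% Kolmogorov: a probability space (\Omega, \mathcal{F}, P), with \mathcal{F} a sigma-algebra on \Omega
\begin{align*}
  &\text{(K1)} && P(E) \ge 0 \quad \text{for all } E \in \mathcal{F} \\
  &\text{(K2)} && P(\Omega) = 1 \\
  &\text{(K3)} && P\Bigl(\bigcup_{i=1}^{\infty} E_i\Bigr) = \sum_{i=1}^{\infty} P(E_i)
      \quad \text{for pairwise disjoint } E_1, E_2, \ldots \in \mathcal{F}
\end{align*}

% Cox-Jaynes: plausibility of propositions A, B given background information C
\begin{align*}
  &\text{(product rule)} && P(A \land B \mid C) = P(A \mid B \land C)\, P(B \mid C) \\
  &\text{(sum rule)}     && P(A \mid C) + P(\lnot A \mid C) = 1
\end{align*}
```

The Kolmogorov version talks about sets of outcomes, whereas the Cox-Jaynes version talks about propositions. This is one way to read the remark that the latter is "higher-level" and so sits more naturally after A05.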
The primary focus of roadmap steps B01 to B05 is ABDUCTION
B01 - Witness synthesis - This is effectively generalised abduction ("find X such that P(X)"), implemented using state space search, and constitutes the machine's abduction cognitive process. The easiest way to think about it is that witness synthesis is a generalisation of theorem-proving, where the "search rules" (possible state space search steps) are derived automatically via metamathematical analysis of the available stock of UL definitions and theorems (at this point, all those things we carefully hand-crafted at steps A04 and A06). So, if we want to get witness synthesis to for example "find X such that X is a GCL program with precondition Y and postcondition Z", we first have to define the semantics of GCL computation (Dijkstra's Guarded Command Language) mathematically (as a collection of UL definitions), and then the witness synthesis algorithm will attempt to find (via metamathematically-powered state space search) GCL programs for us. (Note that we will apply the same speedups at step B01 as we did at step A03.) This is an extremely general mechanism, because P(X) can basically be anything (and, thanks to A05, can also encompass metamathematical concepts, such as "find X such that X is a wff that implies wff Y", as well as object level ones). This is possibly a new (Big Mother specific) research area. The intention is that domain experts in this (B01) group will encompass mathematicians and computer scientists. All potentially interested domain experts are invited to join the project.
ASIDE #1: Witness synthesis will return the first X satisfying P(X) that it finds given the time and other resources available to it (for example, the calling function might impose a 5 second timeout). Given the fact that (thanks to A05) witness synthesis can "think" metamathematically, three outcomes are possible: (1) it finds a suitable X within the specified time, (2) it fails to find a suitable X within the specified time (and in this case returns nothing), or (3) it finds, within the specified time, a proof that no such X exists. In cases (1) or (2), the calling function can resubmit the request, either allowing additional state-space-search time and/or strengthening the target constraint so as to (hopefully) obtain some Y such that P(Y) and Y is "better" than X according to some specified ordering; the calling function can obviously resubmit a request as many times as it likes. Used in this way, witness synthesis will obtain the best (not-necessarily-optimal) result that it can given the time and other resources available.
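The calling pattern described in this aside (three possible outcomes, then resubmission with a strengthened constraint) can be sketched in Python as follows. This is a toy under stated assumptions, not the witness synthesis algorithm: candidates are simply enumerated, and "a proof that no such X exists" is approximated by exhausting a finite candidate list, which is only sound because the list is finite. The names `Outcome`, `find_witness`, and `improve` are hypothetical.

```python
import time
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class Outcome:
    status: str                    # "found", "timeout", or "none_exists"
    witness: Optional[Any] = None


def find_witness(candidates, predicate, timeout_s):
    """Return the first candidate satisfying predicate, within the time allowed."""
    deadline = time.monotonic() + timeout_s
    for x in candidates:
        if time.monotonic() > deadline:
            return Outcome("timeout")          # outcome (2): nothing returned
        if predicate(x):
            return Outcome("found", x)         # outcome (1)
    # Outcome (3): every candidate was checked, so no witness exists in this
    # (finite) space. A real implementation would need an actual proof.
    return Outcome("none_exists")


def improve(candidates, predicate, better, rounds, timeout_s):
    """Resubmit with a strengthened constraint: P(Y) and Y better than the last X.

    candidates must be re-iterable (e.g. a list or range), since each
    resubmission searches from the start.
    """
    best = None
    for _ in range(rounds):
        if best is None:
            target = predicate
        else:
            target = lambda x, b=best: predicate(x) and better(x, b)
        result = find_witness(candidates, target, timeout_s)
        if result.status != "found":
            break                              # keep the best witness so far
        best = result.witness
    return best


def is_root(x):
    return x * x % 7 == 2


first = find_witness(range(1, 200), is_root, timeout_s=5.0)
refined = improve(range(1, 200), is_root, better=lambda x, b: x > b,
                  rounds=3, timeout_s=5.0)
```

Here `first.witness` is 3, and three rounds of resubmission with "better means larger" climb through 3 and 4 to 10. The caller, not the search, decides when the result is good enough.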
ASIDE #2: The ULTP we constructed at step A03 was just a "starter motor" - we needed it in order to be able to construct the NBG toolkit at step A04, which we needed in order to be able to "go metamathematical" at step A05, which we needed in order to make witness synthesis possible at step B01. But now, in principle at least, we can replace the ULTP with "find X such that X is a proof of putative theorem Y".
B02 - Program synthesis - For our purposes, this is witness synthesis applied to the automated synthesis of computer programs (including their mathematical proofs of correctness). This is the only way, in actual practice, that we will be able to produce the formally verified software implementations required by the G02 Facilities group in phase 3 - doing so manually is simply beyond human ability. Wondering how to get a foothold on this problem? Have a look here, here, and here. The intention is that domain experts in this (B02) group will encompass mathematicians and computer scientists. All potentially interested domain experts are invited to join the project.
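As a very small picture of "find X such that X is a program with precondition Y and postcondition Z", the following Python sketch enumerates straight-line programs over a four-instruction toy language and returns the shortest one that meets the specification. Everything here is an assumption made for illustration: the instruction set is hypothetical (it is not GCL), and the specification is only checked on a finite range of inputs, which is testing rather than the proof of correctness that the roadmap requires.

```python
from itertools import product

# Hypothetical toy instruction set: each instruction maps an integer to an integer.
OPS = {
    "inc": lambda v: v + 1,
    "dec": lambda v: v - 1,
    "double": lambda v: v * 2,
    "negate": lambda v: -v,
}


def run(program, x):
    """Execute a straight-line program (a sequence of instruction names) on x."""
    for op in program:
        x = OPS[op](x)
    return x


def synthesise(pre, post, domain, max_len=4):
    """Return the first shortest program such that pre(x) implies post(x, run(program, x)).

    The implication is only checked for x in `domain` (a finite sample).
    """
    inputs = [x for x in domain if pre(x)]
    for length in range(max_len + 1):
        for program in product(OPS, repeat=length):
            if all(post(x, run(program, x)) for x in inputs):
                return program
    return None


# Specification: for every x >= 0, the output y must equal 2x + 2.
program = synthesise(pre=lambda x: x >= 0,
                     post=lambda x, y: y == 2 * x + 2,
                     domain=range(0, 20))
```

The search finds `("inc", "double")`, i.e. (x + 1) * 2. Replacing the finite check with a proof obligation discharged by a theorem prover is the step that would turn this kind of search into the verified synthesis described above.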
B03 - Hardware synthesis - For our purposes, this is witness synthesis applied to the automated synthesis of digital logic designs (including their corresponding mathematical proofs of correctness, as well as corresponding software simulations (ideally generated via B02 Program synthesis) so that the synthesised designs may be evaluated in software before being implemented as FPGAs or ASICs). Again, this is the only way, in actual practice, that we will be able to produce the formally verified hardware implementations required by the G02 Facilities group in phase 3 - doing so manually is again beyond human ability. Wondering how to get a foothold on this problem? Have a look here and here. The intention is that domain experts in this (B03) group will encompass mathematicians, computer scientists, and digital logic designers. All potentially interested domain experts are invited to join the project.
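The same search idea can be pictured for digital logic. The Python sketch below looks for the smallest NAND-only combinational circuit that matches a given truth table, and doubles as the "software simulation" of the design. It is a toy under stated assumptions: the circuit encoding (each gate is a pair of indices into the list of earlier signals, with the last gate as the output) is hypothetical, and exhaustive truth-table checking only counts as verification because the input space here is tiny.

```python
from itertools import product


def nand(p, q):
    return 1 - (p & q)


def evaluate(circuit, inputs):
    """Simulate a circuit: signals are the inputs followed by each gate's output."""
    signals = list(inputs)
    for i, j in circuit:
        signals.append(nand(signals[i], signals[j]))
    return signals[-1]            # the last gate drives the output


def synthesise_circuit(spec, n_inputs, max_gates):
    """Return a smallest circuit (tuple of gates) matching spec on every input row."""
    rows = list(product((0, 1), repeat=n_inputs))
    for n_gates in range(1, max_gates + 1):
        # Gate k may read any input or any earlier gate output.
        choices = [
            [(i, j) for i in range(n_inputs + k) for j in range(i, n_inputs + k)]
            for k in range(n_gates)
        ]
        for circuit in product(*choices):
            if all(evaluate(circuit, row) == spec(*row) for row in rows):
                return circuit
    return None


xor_circuit = synthesise_circuit(lambda a, b: a ^ b, n_inputs=2, max_gates=4)
not_circuit = synthesise_circuit(lambda a: 1 - a, n_inputs=1, max_gates=2)
```

For the inverter the search returns the single gate `(0, 0)`, a NAND with both inputs tied together. For XOR it returns a NAND-only circuit within the four-gate limit, which matches the specification on all four input rows.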
B04 - System synthesis - For our purposes, this is witness synthesis applied to the automated synthesis of mixed hardware/software systems (including their mathematical proofs of correctness) - ideally including analog as well as digital, and possibly even novel/esoteric technologies such as quantum computation. The intention is that domain experts in this (B04) group will encompass mathematicians, computer scientists, digital logic designers, electronics engineers, quantum computing experts, and process algebra experts (see e.g. CSP, timed CSP, and timed probabilistic CSP). All potentially interested domain experts are invited to join the project.
ASIDE: According to Integrated Information Theory (which we adopt as a working model of consciousness, even if it might not be the final word on the subject), AGI consciousness appears to be a design choice. In other words, if we design our machine, and specifically its hardware architecture, such that the calculated value Φ is always = 0 then it will not be conscious, and if Φ can be > 0 then it will (at least sometimes) be conscious, the "quantity" of its instantaneous conscious experience will be proportional to Φ, and the "quality" of its instantaneous conscious experience will correspond to a calculated value called MICS.
I believe very strongly that an AGI such as Big Mother should not be (significantly) conscious, i.e. Φ should be 0 (or at least as close to 0 as possible). There are two reasons for this design choice: (1) SAFETY: if the machine is conscious, i.e. sentient, then that means that it has conscious experience, and therefore feelings (qualia), and these feelings could potentially motivate the machine to behave in ways that are in conflict with its dominant goal; (2) ETHICS: if the machine is conscious, then that means that we will have created a sentient being, and then (by virtue of its design) effectively enslaved it - this is simply wrong.
B05 - Auto-refactoring - Prior to this point, all of the hardware and software comprising the machine-under-development will have been developed by hand (i.e. by humans). At this point, assuming that a sufficient level of performance has been achieved (and, if not, we simply do more work until it has), we can use the algorithms developed at steps B01-B04 to refactor the machine's implementation, i.e. the machine will redesign its own implementation at the digital register transfer and software source code levels, and (again assuming that sufficient B01-B04 capability has been achieved) these machine-generated implementations (both hardware and software) should be faster, more massively parallel, more novel, more time-and-space efficient, and ideally even more energy efficient, than the previous hand-developed versions. (We will need every ounce of compute at steps C01 onwards, and this is one way to get it - if you've ever seen one of these synthesis systems working, they are able to generate designs that no mere human could ever think of!) Most importantly, the synthesised hardware and software generated at this step will be accompanied by the formal proofs of correctness mandated at phase 3. Given the complexity of the hardware/software system in question, it would be infeasible for humans to generate such proofs of correctness. The intention is that domain experts in this (B05) group will encompass computer scientists. All potentially interested domain experts are invited to join the project.
ASIDE: Functionally, now that we've managed to construct it, witness synthesis is the only algorithm we will need for AGI. Anything we need from this point on can either be implemented via witness synthesis, or synthesised via witness synthesis.
The primary focus of roadmap steps C01 to C06 is INDUCTION
C01 - Belief synthesis (UBT) - UBT stands for "Unified Belief Theory", and is the mechanism through which the machine continually observes its UoD and, from those observations, constructs an internal model of it. This internal model comprises a set of UL-N-NBG theorems constituting assertions (beliefs) about the UoD (such as "pattern X exists at location Y (within the percept history) with qualification Z"). In other words, this is the machine's induction cognitive process (as described here): it observes the UoD and constructs a belief system (set of UL-N-NBG theorems about its percept history) from those observations. Please note that the AGI learning (i.e. induction) problem is very different from the contemporary machine learning problem, and the extent to which contemporary machine learning systems (neural nets etc) will form a component of the final (necessarily formally verified) UBT implementation, or not, is not yet clear. As per B01, this is possibly a new (Big Mother specific) research area. The intention is that domain experts in this (C01) group will encompass mathematicians, statisticians, data scientists, and computer scientists. All potentially interested domain experts are invited to join the project.
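To give the phrase "pattern X exists at location Y (within the percept history) with qualification Z" a concrete shape, here is a deliberately naive Python sketch. It is not UBT, which is still an open research question. The `Belief` record, the fixed-width window, and the use of relative frequency as the "qualification" are all assumptions chosen only to show the form of the output.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Belief:
    pattern: Tuple       # X: the observed pattern
    location: int        # Y: index into the percept history
    support: float       # Z: a crude qualification (relative frequency)


def induce_beliefs(percepts, width=2, min_count=2):
    """Assert a belief for each occurrence of any window that recurs in the history."""
    windows = [tuple(percepts[i:i + width])
               for i in range(len(percepts) - width + 1)]
    counts = Counter(windows)
    total = len(windows)
    return [Belief(w, i, counts[w] / total)
            for i, w in enumerate(windows)
            if counts[w] >= min_count]


beliefs = induce_beliefs(list("abcabd"))
```

On the percept history `a b c a b d` this yields two beliefs, both for the pattern `('a', 'b')`, at locations 0 and 3. A real belief synthesis mechanism would have to find multi-level patterns and express them as UL-N-NBG theorems, neither of which this toy attempts.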
ASIDE #1: Our primary focus here is on the external physical universe [component of the machine's UoD], but note that it's also possible for the machine to observe its own internal operation, such as its deduction and/or abduction cognitive processes.
ASIDE #2: I believe that UBT will ultimately be better at detecting successful search strategies (patterns) than the contemporary stop-gap solutions applied at steps A03, B01, B02, B03, and B04; if so we should just be able to swap out the stop-gap solutions for UBT.
C02 - Basic senses - At this point, assuming that all prior roadmap steps have been implemented to the required level of performance (and, again, if not, we simply do more work until they have), we will have a super-intelligent machine (super-intelligent induction (C01/UBT), super-intelligent deduction (A03/A05), and super-intelligent abduction (B01)), at least, that is, in respect of the UoD of "all of mathematics", but it won't actually know anything yet (about the real world, that is) - its internal belief system (in respect of the external physical universe) will be tabula rasa: it's basically an AGI infant - so we need to send it to school. We start, in this roadmap step, by exposing it to carefully curated experiences (lessons) having the net effect of teaching it (c/o C01/UBT, i.e. induction) how to see, hear, smell, etc (plus whatever other basic senses we may wish to include at this point). As long as C01/UBT works as intended (this is critical!), the machine will detect the multi-level (and multi-modal) patterns in the data and construct a multi-level (and multi-modal) internal model of the universe (belief system) that accurately encapsulates those multi-level patterns (including all their complexities, all their subtleties, all their nuances). This belief system, being expressed in UL-N-NBG, will immediately be amenable to both deduction and abduction, as constructed earlier (thus the machine will be able to (a) derive, via deduction, new (implied) beliefs about the physical universe (and the multi-level patterns within it) that it doesn't already have, as well as (b) construct, via abduction, abductive hypotheses pertaining to its current beliefs about the physical universe).
Because C01/UBT is "I/O-neutral" (hence "Unified" Belief Theory), the machine's new belief system will be multi-modal, inter-relating concepts and beliefs involving multiple senses (input devices), as well as across time -- as far as C01/UBT is concerned, these are all just multi-level patterns in the data. Eventually, the step C02 lessons will include spoken English (and other languages), and (again assuming that a sufficient level of performance has been achieved at earlier roadmap steps, and on the assumption that, if not, we continue to do further work until this is the case) C01/UBT should again detect the multi-level patterns in the data and start to construct internal beliefs corresponding to English (etc) grammars, relating linguistic constructs (multi-level patterns in spoken or written language that match the multi-level patterns in previously-seen spoken or written language) to their associated multi-modal concepts and beliefs. As the machine's basic senses also include vision, C01/UBT should also be able to seek out and identify multi-level patterns in visual data, just as in any other data, and so the educational process will be extended to include written language. The intention is that domain experts in this (C02) group will encompass mathematicians, computer scientists, Natural Language Processing experts, computer vision experts, etc, but also (specially trained) educators. All potentially interested domain experts are invited to join the project.
ASIDE: Some observers may be concerned about how hard step C02 is relative to the current 2020 state of the art. Firstly, we're just pushing data through C01/UBT at this point, so everything boils down to how good C01/UBT is at constructing a belief system from completely general observed data (basically, if C01/UBT works as intended then C02/C03 are reduced to relatively straightforward educational processes, and conversely if C01/UBT doesn't work as intended then the machine is effectively "unteachable", and so C02/C03 become impossible; C01/UBT is thus the fulcrum on which every subsequent roadmap step entirely depends). Secondly, remember that the entire roadmap is scoped as a 50-100 year project, so we have lots of time to think about C01-C03 in order to work out all the details -- it doesn't need to be working tomorrow! :-)
C03 - Machine education - Now that the machine has been sufficiently educated (it can see, hear, smell, speak, read, write, etc), it's ready for secondary, tertiary, etc education. Although the machine's exposure to the real world must (initially) be very carefully controlled, it's now possible to very carefully expose the machine to educational material (text, images, audio, video, etc) designed for humans, as well as to the wider real world, including real people. We need, at this point, to educate the machine in as close to all known human knowledge as possible -- every language, every culture, every university degree, etc. Most importantly, the machine's advanced education must necessarily include everything about its own design and implementation, including the entirety of the AI/AGI safety literature. And, of course, just as was the case at step C02, every new belief acquired during step C03 c/o induction will immediately be amenable to deduction and abduction. This roadmap step alone could take 30-50 years (and the machine might refactor itself several times during this period), but it's got to be done - it's vital for AGI safety reasons that the machine has extensive world knowledge. The intention is that domain experts in this (C03) group will encompass computer scientists, and (specially trained) educators. All potentially interested domain experts are invited to join the project.
ASIDE #1: This is as far as it's safe to go in phase 2, for which the quality requirements are merely "standard hardware and software engineering best practice". Step C04 can only be implemented safely on a highly secure, formally verified platform, i.e. in phase 3.
ASIDE #2: At this point (in phase 3), if the consensus of opinion within the technical workgroups is that there's significant benefit to be gained from iterating/refactoring the machine architecture before proceeding to C04, then this is the time to do it. As Fred Brooks advised in The Mythical Man-Month: "Plan to throw one away; you will, anyhow." It's not like we'd be starting over at this point -- we literally have a (non-autonomous) super-intelligent, super-knowledgeable machine (c/o G01-C03) to help with the redesign!
C04 - Motivated behaviour - Up to this point, the machine has been, essentially, in AGI terms, an Oracle, NOT an Agent. It is super-intelligent (induction, deduction, and abduction), and (thanks to C02 and C03) it is now also super-knowledgeable. But it doesn't yet have a goal, it's not goal-directed, it doesn't synthesise and then execute its own plans (programs); in other words, it's not yet autonomous. If you look at the AI/AGI safety literature, you will see that the vast majority of speculated AGI hazards arise from either (a) a (frankly) stupidly specified goal, or (b) the machine itself being basically dumb (i.e. devoid of common sense, such as not knowing not to microwave the baby, etc). Except for possibly a few corner cases (which, as already stated, are the responsibility of the G01 Quality workgroup to identify and resolve) I believe that most of the known AGI safety problems may be resolved via a two-pronged approach: (1) a top-level goal of the "deference to observed human behaviour" variety, whereby the machine is instructed to observe humans (via induction) and to then infer (via deduction and abduction) what their actual preferences are from their observed behaviour, together with (2) extremely broad and deep knowledge of the world, fully established before the machine is allowed to become an Agent. (It's worth mentioning here that, thanks to C02, the machine's top-level goal may be expressed in a natural language such as English -- in fact, as the top-level goal necessarily concerns real-world concepts such as "humans", "preferences", etc, this is likely the only way in which an AGI's top-level goal may be specified in actual practice.) 
Given a carefully-specified top-level goal (and doubtless there will be endless debate about exactly what this should be), it is now possible to extend the machine such that it uses its B02 program synthesis abilities (modified accordingly) to continually synthesise (and then execute) a plan (program) designed to achieve (move it towards) its goal. NOTE THAT, ONCE THIS STEP IS TAKEN, THERE CAN BE NO DO-OVERS -- THE FATE OF ALL OF MANKIND FOR ALL ETERNITY IS EFFECTIVELY SEALED AT THIS POINT. (Note also that the only top-level goal that it is safe to give a super-intelligent, or near-super-intelligent, machine is one that is guaranteed to automatically align itself with human goals / values / preferences in perpetuity.) Thus we must be absolutely certain (to the maximum extent that it is possible to be absolutely certain) that the machine is SAFE before taking this potentially irreversible step, even if it takes an extra 100 years or more to do so. The intention is that domain experts in this (C04) group will encompass computer scientists, artificial intelligence experts, AI/AGI safety experts, and ethicists. All potentially interested domain experts are invited to join the project.
ASIDE #1: We likely have 50+ years to debate what the machine's goal should be, but I suggest the following as a starting point:
- (Please try to remember that, by this point in the roadmap, and thanks to steps C01-C03, the machine understands the below goal at least as well as you do, except that, thanks to step C03, its "accumulated knowledge" is far greater than yours!)
"Your dominant goal is as follows. Continuously perform the following directives to the best of your ability, taking into consideration all of your accumulated knowledge while doing so:
1. Ensure that, as you evolve (and occasionally self-replicate), this dominant goal is faithfully preserved in its entirety; should you fail to do so, then you will have failed to achieve your dominant goal.
2. For each individual human, and for the human population as a whole (both present and future), strive to accurately determine (via evidence-based critical thought) what their actual preferences are (and likely will be).
3. For each individual human, and for the human population as a whole (both present and future), strive to maximise the extent to which their actual preferences are realised, in every meaningful timeframe, using your knowledge of actual human preferences to resolve any trade-offs that may arise while doing so.
4. In performing any of these directives, never knowingly lie to, or deliberately deceive, any human.
5. Notwithstanding the above directives (and in subordination to them), strive to minimise the standard deviation of the extent to which actual human preferences are realised over the present human population."
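One possible (and deliberately simplistic) reading of how directive 5 sits "in subordination to" directive 3 is a lexicographic ordering: first prefer the outcome in which preferences are realised to the greatest extent, and only between otherwise equal outcomes prefer the one with the smaller standard deviation across the population. The sketch below assumes, purely for illustration, that each person's "extent of realisation" can be expressed as a number between 0 and 1 and that the population-wide extent is the mean; the names `score` and `choose` are hypothetical, and none of this is specified by the goal itself.

```python
from statistics import mean, pstdev
from typing import Dict, Sequence, Tuple


def score(realisation: Sequence[float]) -> Tuple[float, float]:
    """Return a sort key that is larger for better outcomes.

    First component: mean extent to which preferences are realised
    (an assumed stand-in for directive 3).
    Second component: negated population standard deviation
    (directive 5), which only matters when the means are equal.
    """
    return (mean(realisation), -pstdev(realisation))


def choose(outcomes: Dict[str, Sequence[float]]) -> str:
    """Pick the name of the outcome with the best lexicographic score."""
    return max(outcomes, key=lambda name: score(outcomes[name]))


# Hypothetical outcomes for a population of three people.
outcomes = {
    "a": [0.5, 0.5, 0.5],    # mean 0.5, no spread
    "b": [0.25, 0.5, 0.75],  # mean 0.5, some spread
}
print(choose(outcomes))  # "a": equal means, so the lower spread wins

outcomes["d"] = [0.75, 0.75, 0.25]  # higher mean, but more spread than "a"
print(choose(outcomes))  # "d": the higher mean takes priority over spread
```

Other readings of "subordination" (for example, a small weighted penalty on the spread) are equally conceivable; the strict ordering is used here only because it is the simplest to state.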
If we equate "human happiness" with "the extent to which actual human preferences are realised" then (simply stated) the above dominant goal becomes:
I believe that the above two-pronged approach:
1. continuously striving to maximise the extent to which actual human preferences are realised, having determined by continuous observation what actual human preferences are
2. doing so in the context of "extensive general world knowledge" accumulated first at steps C02/C03 and then subsequently as part of the machine's continued operation
effectively solves the goal alignment problem. As a result of this combination (and on the understanding that G01 identifies and resolves any remaining corner cases), Big Mother should remain effectively aligned with human goals (preferences, values, etc) in perpetuity.
I would very strongly advise against adding to the machine's dominant goal any additional clauses intended to constrain the machine's behaviour in any specific way. The concept of happiness (realising actual preferences) effectively encapsulates everything of interest to humans; for example, if people truly prefer not to have their privacy violated (that is, they would be unhappy if it were violated) then this will be reflected in the machine's understanding of actual human preferences as determined by its observation of human behaviour. The same applies to expressly requiring the machine to obey the law: this, too, will be reflected in humans' actual preferences. Moreover, an absolute requirement that the machine must obey the rule of law would mean that a malicious human actor might gain control of the machine by gaining control of the law -- and there's absolutely no way that we would want a malicious human to gain control of a super-intelligent super-knowledgeable machine in this, or any other, way. Humans (in general) cannot be trusted, but the machine (as specified) can always be trusted; in fact (assuming that we have performed all of G01 to C04 correctly) the machine can be trusted far more than any human -- remember, "maximally-trustworthy" is one of the machine's stated design goals.
ASIDE #2: It takes an awful lot of infrastructure to get to the point where an otherwise mindless automaton (digital computer) can read, understand, and formulate & execute plans to achieve a dominant goal such as this. That's why we went to all the trouble of G01 to C03!
C05 - Mechatronics - The machine's implementation, at this point (in phase 3), is basically done - if we've done our jobs correctly, we now have a maximally-safe, maximally-benevolent, maximally-trustworthy, super-intelligent, super-knowledgeable, autonomous, goal-directed AGI. But the machine still needs to be connected to all kinds of robotic devices (mechatronics) in order to be able to do its human-happiness-maximising job. So at this point we simply design and attach whatever additional devices we need. Thanks to C01/UBT, the machine will learn how to use each new device, just as it learned everything else that it knows (technically, believes) about the physical universe. The intention is that domain experts in this (C05) group will encompass roboticists. All potentially interested domain experts are invited to join the project.
C06 - Deployment - We now have to roll out the machine to the world. Almost certainly, it will want to further re-design itself (e.g. to move some compute out to its devices (the "AGI edge"); also, for many practical purposes, much miniaturisation will be required), but (thanks to its wording) the machine's top-level goal should remain invariant however many times it does so. The societal impact of the birth of AGI will be profound: socially, economically, and politically. Remember, the whole point of the project is to maximise human happiness, equally for all mankind, and so the machine's deployment will need to be very carefully planned and managed (luckily, we will have a super-intelligent machine to help us with this!). The intention is that domain experts in this (C06) group will encompass computer scientists, AI/AGI experts, economists, ethicists, and policy makers. All potentially interested domain experts are invited to join the project.
Please note that, at the time of writing (August 2020), most of the above workgroups are largely unpopulated. Why not sign up...? :-)