An international collaboration of engineers, scientists, and other domain experts, funded by philanthropists,

building the gold standard of Artificial General Intelligence, for all mankind.

- About
- 02 - (Major design choices, project management, formal systems)

PREREQUISITE: It is recommended that you watch Part 01 before watching this video!

- Clip 01 (1:10)
- Intro
- [Hello & audience photo]
- Clip 02 (12:13)
- The next 700 AGIs
- AGI design choices
- what knowledge is hard-coded vs learned?
- to what extent is the machine design based on an archetype (e.g. humans)?
- Clip 03 (4:13)
- We're not building a human!
- the perils of anthropomorphisation
- Clip 04 (16:55)
- Top-down design
- iPhone example
- hierarchical (tree-like) structure
- top-down recursive decomposition
- multiple design teams working in parallel
- resolving design choices
- quality control & quality assurance
- backtracking (rip-up-and-retry)
- ... applied to Big Mother
- Clip 05 (9:17)
- The unbreakable iron triangle
- functionality vs quality vs cost
- Big Mother is quality-driven
- Clip 06 (24:56)
- Deduction
- Formal systems
- the MHW (Man Horse Wheel) system
- syntax
- alphabet
- formula
- wff (well-formed formula)
- semantics
- proof rules
- axioms and inference rules
- theorems
- proofs
- theorem-proving
- theorem-provers
- homework - can you prove Man Wheel...?

Transcript 01:

Aaron Turner:

I'm Daddy Bear, or you can call me Aaron. I'm nervous. This doesn't get easier the more often you do it. I have to take an audience photo. Is that okay? Are you okay with that? Yeah? Cool. He's putting his jacket on look! Okay, brilliant. Three, two, one. Excellent. Thank you very much.

Aaron Turner:

Okay. Now, as I say, this is a very long presentation. It's not just today; there'll be 10 more of these, so we can't have questions at the end. If you have a question, just interrupt me at any time. From my perspective, any feedback, any comment is useful information, okay?

Transcript 02:

Aaron Turner:

In particular, of course, the machine should be maximally safe, maximally benevolent and maximally trustworthy. I know it might sound impossible, might sound crazy, because we're just talking about a computer here. How do you achieve that? But it's just done in many, many, many, many levels. Okay? You just start with the hardware, you add many, many, many, many levels until you get that level of behavior.

Aaron Turner:

Okay, so the next 700 AGIs, right. There's this chap called Seth Baum, and I've forgotten which organization he's with ... The Global Catastrophic Risk Institute, I think. He did a survey recently of AGI projects, and there are about 45 apparently, and I've asked him to include us in his next survey.

Aaron Turner:

Now, if you imagine the space of all possible AGI designs, all possible ways you might build an intelligent machine, there are probably thousands of them, perhaps even an infinite number, only some of which will be technically viable. Okay? And, in particular, only some of which will be safe, benevolent and trustworthy.

Aaron Turner:

So, when you are designing an AGI, you have to make a number of design choices. So if for one particular design choice, you've got three choices, that essentially partitions the space of all possible AGI designs. So, there's AGI designs the way you made that choice down there, and then AGI designs the way you made that choice there, and AGI designs the way you made that choice there.

Aaron Turner:

I'm just going to talk now about a couple of the design choices, major design choices, that I made. One of the key design choices is, what knowledge do you actually build into the machine, do you hard code into the machine? And what is learned by the machine?

Aaron Turner:

So, for example, there are some AGI designs, like reinforcement learning, where basically everything is learned by the machine. They just build this learning engine, switch it on, and hope the machine learns absolutely everything from scratch; nothing is hard-coded. The same could be said maybe of the whole brain emulation approach, where the idea is you have a big computer that emulates a hundred billion neurons or something. And if it's similar to the human brain, then it will learn similarly and it will learn everything. So, note again, in that instance, no knowledge about anything is hard-coded into the machine.

Aaron Turner:

Now this design, the Big Mother design, is deliberately split into two levels of cognition. There's an inner cognition where the universe of discourse is all of mathematics. So, it's not interfacing with the real world. When the inner cognition looks out at the world, all it sees is mathematics, but the universe of discourse is rich enough to represent essentially all of current-day mathematics. It's basically set theory, on which all of mathematics is based.

Aaron Turner:

And it only has two cognitive processes, deduction and abduction. So, there's no learning cognitive process here. [Sorry, that's not quite true - please see below!] So, there's no induction [in the sense of observing the physical universe and synthesising beliefs from those observations]. So, most of the knowledge, because the universe of discourse is math, so most of the knowledge is hard-coded during the design process. We'll be encoding the [mathematical] knowledge that we need and we'll be talking about that later on. Some [level of knowledge] about mathematical objects will be learned, and I'll be describing that later. Actually, quite a bit later. Because that's a speed up. It's a speed up mechanism.

[AMT - editor]:

[Apologies for the confusion, let me try again! If you think of the hard-coded mathematical knowledge as a set of definitions, then we can think of the process of solving a particular mathematical problem as essentially one of using those definitions as "stepping stones" in the search for a solution. But in which order should the definitions be applied? Is there some strategy that will find a path (sequence of stepping stones) to a solution more quickly than just random search? Well, one of the speedups we will talk about later involves applying learning to any state space search problem. So, internally to the mathematics-based inner cognition, the machine will observe itself trying to find solutions to mathematical problems, and gradually synthesise statistically-derived strategies for choosing which stepping stone to try next. This is similar to how AlphaGo (etc) learned how to play Go (etc).]
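[A minimal sketch of the speedup idea just described: a best-first state-space search whose move-ordering scores are reinforced whenever a solution is found, so that later searches try historically successful "stepping stones" first. This is an illustration of the general technique only, not the Big Mother design; the function names and the toy arithmetic domain are invented for the example.]

```python
import heapq
from collections import defaultdict

def best_first_search(start, is_goal, moves, score, max_expansions=10_000):
    """Best-first search: always expand the state reached by the
    highest-scoring move ("stepping stone") seen so far."""
    frontier = [(0.0, start, [])]            # (negated score, state, move path)
    seen = {start}
    expansions = 0
    while frontier and expansions < max_expansions:
        _, state, path = heapq.heappop(frontier)
        expansions += 1
        if is_goal(state):
            return path                      # sequence of moves that solved it
        for move, nxt in moves(state):
            if nxt not in seen:
                seen.add(nxt)
                heapq.heappush(frontier, (-score[move], nxt, path + [move]))
    return None                              # gave up within the budget

def reinforce(score, path, reward=1.0):
    """Self-observation step: after a success, credit every move on the
    solution path so future searches prefer those stepping stones."""
    for move in path:
        score[move] += reward

# Toy domain: reach a target number using the moves "+3" and "*2".
score = defaultdict(float)
moves = lambda s: [("+3", s + 3), ("*2", s * 2)]
path = best_first_search(1, lambda s: s == 11, moves, score)
reinforce(score, path)
```

[Over many problems, `reinforce` gradually biases the search toward strategies that have worked before, which is the AlphaGo-flavoured point being made above.]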

Aaron Turner:

The outer cognition is basically built on top of the inner cognition, and the universe of discourse [of the outer cognition] is the entire physical universe. So, when that level of machine looks out at its universe, it basically sees everything we see. Okay? And there is a primary cognitive process which is called induction. And all knowledge, all belief about the physical universe is learned by the machine. Okay.

Aaron Turner:

So there's those two levels of cognition. The inner cognition is, it's just deductive [and abductive], it's just mathematical, and it essentially grounds the machine logically. And this is a big part of making the machine safe. Okay? And then the outer cognition is built on top of that, and that's where it's looking at, that's where you connect the inner cognition to the physical universe, and once you're looking at the physical universe, everything's uncertain. Everything's a guess. Okay? But rather than hard-code belief about the physical universe, the machine will learn essentially everything "from bare metal" about the universe.
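[As a purely illustrative sketch (an assumed structure, not the actual Big Mother architecture), the two-level split might be pictured as an inner deductive core that only ever manipulates mathematical statements, wrapped by an outer layer that owns the percept history and synthesises all belief about the physical world:]

```python
class InnerCognition:
    """Universe of discourse: mathematics. Processes: deduction (and
    abduction). Its knowledge base is hard-coded at design time."""
    def __init__(self, axioms):
        self.theorems = set(axioms)          # hard-coded mathematical knowledge

    def deduce(self, rule):
        """Apply an inference rule: a function from the current theorem
        set to a new theorem (or None if the rule doesn't fire)."""
        new = rule(self.theorems)
        if new is not None:
            self.theorems.add(new)
        return new

class OuterCognition:
    """Universe of discourse: the physical universe. Primary process:
    induction. Nothing about the world is hard-coded; every belief is
    synthesised from the (finite, subjective) percept history."""
    def __init__(self, inner):
        self.inner = inner                   # logically grounded on the core
        self.percept_history = []
        self.beliefs = {}

    def observe(self, percept):
        self.percept_history.append(percept)

    def induce(self, synthesise):
        """Strong induction: beliefs must be justified by the evidence
        actually observed, i.e. computed only from percept_history."""
        self.beliefs = synthesise(self.percept_history)
        return self.beliefs
```

[The point of the shape: `OuterCognition` can only form beliefs via `induce` over what it has actually observed, mirroring the claim that everything about the physical universe is learned "from bare metal".]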

Aaron Turner:

Because anything we might tell it [could be] wrong. Anything we think we know [about the physical universe] is actually a guess. Everybody's belief system is different and subjective. So, we might think we know what cats are, but actually other people have slightly different beliefs; "cat" is a slightly different concept [from person to person]. So, basically, the machine, the outer cognition, will learn everything. All we do is build this mechanism for observing the universe and synthesizing beliefs from it.

Andrew:

Isn't there something about a model is not more or less accurate, it's more or less useful, I can't remember who said that [inaudible 00:07:50] I think. So it would depend what direction, your AGI is coming from as to what kind of models it builds. There isn't a right or a wrong model other than the context. You see what I mean?

Aaron Turner:

Well, there is such a thing, if you look at the problem of induction there is, there is something called weak induction, where you can synthesize beliefs which are not justified by evidence. And then there's strong induction, where you synthesize beliefs which are in fact justified by evidence. So what we want to achieve, of course, is strong induction. So when it synthesizes a belief, it does actually have genuine justification for doing so.

Aaron Turner:

But just like any entity, when it looks out at the world, at the universe, and collects percepts, it will have a percept history, which is its history of observation of the world. It's always going to be finite, and it's always going to be different from anybody else's percept history. Everybody's experience of the world is different. And that's the information that they use to synthesize their beliefs from.

Aaron Turner:

The most important thing is to make sure that when it synthesizes beliefs, it's done so on the basis of strong induction. Okay. So the beliefs, you can actually have confidence in them relative to its percept history.

[AMT - editor]:

[The next problem of course is that, like an infant, the machine's percept history will initially be very limited and, even with strong induction, the quality of its beliefs will be constrained by the size of that percept history, i.e. by the range and depth of its experience of the world. The solution to that of course is to comprehensively educate the machine about the world (once it's capable of being so educated!) so that its beliefs are synthesised from a massive and varied corpus of observation and experience. But that's for much later on in the roadmap - steps C02 and C03.]

Aaron Turner:

So, again, this is something we were talking about a second ago. Another key design decision is, to what extent do you base the machine architecture on, for example, the human archetype as opposed to trying to just think mathematically about what intelligence is.

Aaron Turner:

Now, there's a veteran AI researcher [sadly no longer with us] called Patrick Henry Winston, and he [loved] to point out that if you read any neuroscience textbook, the most common thing you will see expressed is: we don't know how this bit works yet. So really, for a start, we really don't know enough about how the brain works to be able to reverse engineer one, in truth. You know, we have some partial ideas, but we're not at the point yet where we are able to reverse engineer human intelligence, human level intelligence, by studying the brain.

Aaron Turner:

However, even if we did, this is what I was saying earlier on and what I said in the introduction: human cognition is seriously flawed relative to the logico-mathematical ideal. All right? Now, it's easy to think that humans are the most intelligent species on earth, and they're the most intelligent example that we have of intelligence. That doesn't necessarily mean that they are [perfect] ... that you can't improve on that. And in fact, once you start studying intelligence from the logico-mathematical perspective, it is immediately obvious that humans are flawed along these various dimensions. So if human cognition is flawed, why reproduce those flaws in an AGI? You see?

Aaron Turner:

So, for that reason, I've decided to take the logico-mathematical approach, and sometimes this is called machine intelligence rather than artificial intelligence. Artificial intelligence kind of suggests that you're starting with some archetype, typically humans, and trying to construct an artificial version of it. But machine intelligence suggests that you're just studying intelligence as a concept in its own right, starting from a blank piece of paper. And that's more the approach that I've taken.

Transcript 03:

Aaron Turner:

So, for example, you were mentioning emotion, human emotion involved in planning. The mistake would be to assume that just because humans use emotional information in planning, that any intelligent machine would have to do so. It's really, really easy to anthropomorphize like that because, like I say, it's the only example we have. Humans are the only example we have.

Aaron Turner:

So these traits, consciousness, emotion, feelings, empathy, fear, self-preservation, attention (the human brain has an attention mechanism), selfish ambition, envy, jealousy, anger, deceitfulness, these are all human traits. They are not necessarily going to be reproduced in an artificial machine, in a designed machine.

Aaron Turner:

As long as there's no spiritual component to humans, as long as they are made of the same stuff of the physical universe, then if you wanted to, we could, in principle, build any of these things into a machine, once we understand how the mechanism works. Because you're built out of the stuff the physical universe gives us, and we could just use those same properties to build a machine. If we want to build a conscious machine [AMT edit: not that I think we would!], we can do so. If we want to build an emotional machine [AMT edit: not that I think we would!], we can do so. But, again, we don't have to. It's not necessarily a prerequisite for intelligent behavior.

Aaron Turner:

So, yeah, like I was saying, the natural tendency to anthropomorphize AGI is misguided, and it's absolutely not the case that an AGI must possess any of these traits or that an AGI will necessarily develop any of these traits.

Andrew:

[inaudible 00:02:56] They might be useful, but they might, but they might [inaudible 00:02:59].

Aaron Turner:

Yeah, they might be. [crosstalk 00:03:00] They might be but [crosstalk 00:03:00]-

Andrew:

We might have consciousness for a reason. There may be a reason why we have that, and it may be it makes other stuff work better, but we don't actually understand [crosstalk 00:03:07]-

Aaron Turner:

It might be, or it might be, like I said, that we only have three pounds of brain matter, and it's built in ... it only consumes 20 watts of power, and it's built in a particular way. It's built in a biological way.

Aaron Turner:

But as you saw in the introduction, BigMother 1.0 will be a supercomputer: it will weigh 400 tons and consume four megawatts of power. We don't have the same constraints that humans have. These computers, just by the way they're constructed, also have a precision of thought that humans simply do not have, okay?

Aaron Turner:

But you're right. But we will work out which of them are necessary and which of them aren't. But it is unsafe to assume that they are necessarily requirements for AGI. We basically have to start with a blank piece of paper.

Transcript 04:

Aaron Turner:

But if you look inside it, open it up, you will see that it breaks down into these various subassemblies, and you can go even further. If you see the boards labeled five and six, I think it is, right, those boards themselves have a number of sub-subassemblies, in other words silicon chips. And we can keep going down. The large chip on one of the boards, board five, is basically the processor, a six-core CPU and three-core GPU with 4.3 billion transistors.

Aaron Turner:

And if we looked very close, even more deeply, we would see loads of logic gates like this made out of individual transistors and we would see an array of transistors like that. This isn't actually that chip. This is just another example chip, but they all look roughly the same at this level.

Aaron Turner:

Okay. Now, the CPUs, in particular, without software... I mean, CPUs exist to execute programs, to execute software, and without software, all they do is generate heat. Okay? So, typically, in any system that's got a CPU in it, you also need software as a component. So that's a little listing of some C code.

Aaron Turner:

By taking apart each level, we end up with a tree of subassemblies, sub-subassemblies, sub-sub-subassemblies. And at the bottom, we have actual implementable hardware. Sorry, Pawel, this is going to be very boring for you. So, we have actual physical things, physical transistors. We have physical cases and springs and things like that, and we have software. And if we start at the leaves, if we start with all those bottom-level components and gradually integrate them, okay, all the way up to the top, then boom, we have a phone. Okay?

Aaron Turner:

The systems themselves have this tree-like structure, so the design process also has a tree-like structure. So we start with a very, very high-level specification. It's purely abstract; it just says, in abstract terms, how we want the system to interact with its environment. Okay. There's nothing concrete here. We just have a behavior in mind. And then we can decompose that into subcomponents such that we know that if you combine the behavior of the subcomponents, it refines the desired top-level behavior.

Aaron Turner:

Each time you do that decomposition, which is also called a refinement, you're effectively adding more information to your spec or your design-in-progress, and you're getting closer and closer to it being concretely implementable. And you just keep going. So, here, we've got three levels.

Aaron Turner:

Okay. While you're partway through your design, you can imagine future levels of design that you haven't done yet. So, here, the blue part of the tree is the bits we've done so far, and the yellow bits are the bits that we have yet to do, okay? I mean, you can't see them physically, but obviously, we know that we've still got those levels to do. And we keep recursively doing this until we get to the stage where everything we've got is now low enough level that we can actually implement it concretely in hardware or software.

Aaron Turner:

And then we build the hardware, software, then we go back up the tree, integrating it all. As long as we did it all correctly, we get the final system with the behavior that we initially... the target behavior that we initially specified.

Aaron Turner:

So, the overall process is: specify the very, very top level, then you refine, refine, refine down each level. Then when you get to the bottom, you just build, build, build, going back up the leaves, integrate, integrate, integrate. And finally, you've got the top-level system, and you deploy it. And this process is virtually ubiquitous in engineering, or just the intuitive idea of it, anyway. So, it doesn't matter if you're building planes, trains, whatever, space stations, smartphones, computers, or AI, you can adopt this top-down, recursive decomposition process.
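[The specify → refine → build → integrate shape described above can be sketched as a tree walk. Everything here (the `Component` type, the phone-flavoured example tree) is invented purely to illustrate the process:]

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Component:
    spec: str                                # abstract behaviour to refine
    children: list = field(default_factory=list)
    implementation: Optional[str] = None     # only leaves are directly buildable

def build(node: Component) -> str:
    """Post-order walk: build the leaves, then integrate back up the tree,
    mirroring 'refine down, build at the bottom, integrate up'."""
    if not node.children:                    # leaf: concrete hardware/software
        assert node.implementation is not None, f"unrefined spec: {node.spec}"
        return node.implementation
    parts = [build(child) for child in node.children]
    return f"{node.spec}({', '.join(parts)})"  # integration step

# A toy decomposition, loosely echoing the phone example.
phone = Component("phone", [
    Component("board5", [Component("cpu", implementation="silicon"),
                         Component("gpu", implementation="silicon")]),
    Component("case", implementation="aluminium"),
    Component("firmware", implementation="C code"),
])
```

[The assertion in `build` is the point where a still-yellow node, one with no refinement yet, would stop the integration: you can only go back up the tree once every leaf is concretely implementable.]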

Aaron Turner:

So, a couple of additional points. So because of the way it's organized, you can actually have multiple teams working on different subtrees, okay? So, this is a way of having more than one person, more than one team working on the design in parallel, okay?

Aaron Turner:

Another slight complication is you might get to a particular point where you need to decompose your design, but there's more than one way of doing it, okay? There might be half a dozen ways, and maybe you don't know which one is best. So, you might actually implement, or partially implement, more than one of them. And [when] you have enough information to work out which choice is going to be better, you just throw the others away.

Aaron Turner:

So, there's a quality dimension to this, which is, of course, important when you're building AGI because we want it to work, right? Okay. We want it to be safe. So if we correctly decompose each level and then correctly integrate when we're coming back up again, we will get a final system that displays the behavior that we want. But humans make mistakes. So, if any errors occur, the final system is not going to work, okay? So, we have to have quality control and quality assurance. And I know this is really boring, but this is real engineering.

Aaron Turner:

Like I say, we want to actually build this machine, so we need to have design teams. We need to have quality control and quality assurance. And in engineering, generally speaking, the cost of fixing an error increases exponentially with the time since its introduction, so you want to have quality control and quality assurance processes that catch errors essentially as quickly as possible because ...

Andrew:

Of course the other thing is it assumes that you do know exactly what you're doing, which isn't quite so easy on these sort of state-of-the-art projects. But when you come to build it, you kind of realize it's not quite as simple as you thought it was. And then you have to start changing what you're doing, and then it doesn't work with all the other stuff. And it all gets really difficult.

Aaron Turner:

Well, yes, but the more methodical you are during the design process, the fewer problems you're going to have. I'm not saying that problems aren't going to occur, but the more methodical you are, if you design the interfaces between components properly and you have defined ... it's called design-by-contract, and if you have quality processes, you will find these problems more often than if you don't, okay?

Aaron Turner:

So, if you make any errors, so if you make a wrong design choice, if you have a design choice and you think, "I can do that or that or that," and you choose that one and then later on, you decide that you made a mistake, that is essentially a form of error. You've made a design error. But when you've built enough of it and you can evaluate it, then you realize that you've made this mistake. You've chosen the suboptimal path. But it doesn't matter because you can just... if you designed your interfaces correctly, you can just swap that design out and then substitute it for a different design, okay? It's called backtracking, or sometimes, it's called rip-up-and-retry.

Aaron Turner:

And basically, if you have a top-down design process that's got quality control and quality assurance and backtracking, if you find that you've made these design choice errors, the process is effectively self-correcting, okay, because if you make a mistake, the quality control, as long as it's strong enough, will pick it up. If you make a design mistake, it doesn't matter. You can just cut that bit out and iterate and fix it, in principle, anyway.
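[The self-correcting loop just described (quality control catches an error, you rip up that branch and retry the next alternative) is essentially depth-first search with backtracking. A minimal sketch, with `choice_points` and `passes_qc` standing in for real design alternatives and real QC processes:]

```python
def design(choice_points, passes_qc, partial=()):
    """Depth-first design with backtracking (rip-up-and-retry): at each
    choice point try the alternatives in turn; if quality control rejects
    the partial design, discard it and retry the next alternative."""
    if len(partial) == len(choice_points):
        return partial                       # fully refined, QC-clean design
    for alternative in choice_points[len(partial)]:
        candidate = partial + (alternative,)
        if passes_qc(candidate):             # catch errors as early as possible
            result = design(choice_points, passes_qc, candidate)
            if result is not None:
                return result                # success somewhere below
        # otherwise: rip up this branch and try the next alternative
    return None                              # every alternative failed QC
```

[Because QC runs on every partial design, a bad choice is caught at the level where it was made rather than after full integration, which is the "errors are cheapest to fix early" point in code form.]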

Aaron Turner:

So obviously, Big Mother is a big machine, so we can look at the machine through the same lens. The top-level specification is maximally-safe, maximally-benevolent and maximally-trustworthy AGI. Simply stated, that is the top-level goal, and then that goal decomposes into a number of other goals. Actually, there were four, but two of them (how you represent the knowledge inside the machine, and deduction) are so closely related that you might as well think of them as one subtask. So, here, I've got deduction, abduction and induction, the three main cognitive processes, and representation: how you represent the knowledge inside the machine, how you represent the machine's belief system.

Aaron Turner:

Okay. Now the next level, okay, so those three sub-designs then essentially refine into... This is the 19-step Big Mother Roadmap, okay, so 19 technical steps. And if implemented, they will implement the second layer, and then if they're integrated, they will implement the top layer.

Aaron Turner:

So, that's essentially how far we've got. And, of course, like I said, there are future levels that we haven't done yet. So in the introduction, I mentioned that there will be a prototyping and evaluation phase and a working implementation phase and then the final implementation phase. You can think of those maybe as one more level or three more levels. It doesn't matter. The point is that there's just further refinement, further information and detail that needs to be added from where we are now. So, where we are now, that's why I call it a framework because it's essentially a design, okay, which if implemented, will have the behavior that we're aiming for. And then, of course, at the end, Big Mother will bottom out with actual hardware and software, which we can implement.

Aaron Turner:

Now, just to put it in perspective, it took roughly one person, i.e. roughly me, about a year to refine the top level into the second level, okay? And it took roughly 34 years to refine the second level into the third level. And, in fact, up until March 2019, if you can see the roadmap step called C01, that was still yellow, okay, which meant there wasn't an end-to-end design at that level. But as of March 2019, I basically saw a path to completing C01, and now the design is essentially complete at that level of abstraction.

Aaron Turner:

So, now, it will take a collaboration of however many people it takes, 50 to 100 years, to refine the design from where we are now down to final hardware and software. And that's why, because I've run out of lifetime, we need a collaboration... we need volunteers of all kinds. We need philanthropic donors, because we are essentially a nonprofit. The machine is intended to be publicly owned by all mankind, so we want to avoid private equity, and we want to avoid the machine being owned by private corporations or even individual countries.

Aaron Turner:

And, of course, we need what I call vertical domain experts. If you look at each of those 19 steps, each one is a domain of expertise. And it's essentially impossible for somebody like me [or in fact anybody] to be a world-class scientist in 19 different disciplines. I mean, I'm just a horizontal expert. I know enough about each of those things to be able to stitch them together, but now we need input from vertical domain experts in each one of those disciplines. So, it's like putting a band together, really.

Transcript 05:

Aaron Turner:

So basically if you change one, at least one of the others has to move. So if you want more functionality, either you pay for it or you decrease quality. If you want more quality, you're either going to have increased costs or decreased functionality. And if you want to reduce the costs, you've either got to decrease the functionality or decrease the quality.

Aaron Turner:

That's why I say it's an unbreakable iron triangle. They're all linked. You can't change one without changing at least one of the others. Now, in the industrial world, the vast majority of projects proceed on the basis that cost can be reliably estimated in advance. People produce... Sorry?

Andrew:

"Ha ha!"

Aaron Turner:

Yeah, of course. But it's a delusion. Okay. Every single project I've ever worked on in 40 years, the initial Gantt chart, which says this is when we're going to complete all these things, it lasts at most a week before it's out of date. And they're always wrong. Okay. So there's this law called Hofstadter's Law, which always applies.

Aaron Turner:

"It always takes longer than you think, even when you take into account Hofstadter's Law." Okay. So all non-trivial projects always take longer than the original guess. And in reality, projects are finished when they're finished. I mean it takes a certain amount of effort to build something to a certain level of quality and it's going to take that long no matter what you do.

Aaron Turner:

Now, a lot of commercial organizations, of course, because of the way they're organized, they quite often enforce hard deadlines. So they make an estimate of how long they think it's going to take, and then they enforce it even when the project overruns. Or they try to enforce it. Okay.

Aaron Turner:

And because of the unbreakable iron triangle, well, either functionality or quality has to be sacrificed. Okay. And politically it's easier to sacrifice quality because it's kind of invisible, right? You can cut corners on quality, you can maybe not do a design iteration. You know that you should backtrack and throw some bit of design away and change it with another one.

Aaron Turner:

But if you don't do that other iteration, well all that's going to happen is your system is going to be suboptimal in some way further down the line. And it might be years before the problems start emerging. And also the managers that made that decision, they're not going to be connected to it anymore. You see? So it's really, really easy to cut quality, to sacrifice quality because it's invisible. You can get away with it.

Aaron Turner:

Obviously quite often people cut functionality as well. Probably the best project I've ever seen managed 75% functionality. But this is the reality. And the danger is cutting quality, especially in AGI. Safety, remember? So when you embark on a technical project like ours, you have to decide in advance, when you come to these hard decisions, which of functionality, quality, and cost are fixed and which are flexible. Which can you sacrifice when things take longer than you had hoped?

Aaron Turner:

Now, it's not normal for the fate of all mankind for all eternity to rest on the outcome of your project. But in our case, this is the reality with AGI, okay? I discussed this in the first talk. By definition an AGI has maximal cognition along all cognitive dimensions. And one of those dimensions is experience, meaning knowledge. Meaning it knows everything we know, including everything about computer science and artificial intelligence and how it's built. Okay?

Aaron Turner:

And it's smarter than us. Okay? So once you've built it, that's it. You will never put the genie back in the bottle because it's smarter than you. And it can self-replicate and everything. So what we build when we build an AGI, one way or another, either through this project or AGI-by-stealth, determines the fate of mankind for the rest of eternity. Okay.

[AMT editor]:

[No pressure! :-)]

Aaron Turner:

So for that reason, the Big Mother Project is quality-driven. It's quality, not cost, that has priority. Okay. And this is unusual. This is not normal in the engineering world. Okay. Normally cost has priority. But in AGI, because the potential [downside] consequences are so unimaginable, we have to make quality (safety and benevolence and trustworthiness) our priority.

Aaron Turner:

So we specify the quality requirements. And the quality requirements are essentially that the target behavior is satisfied. And it doesn't matter how much time is required. So if it takes an extra 200 years or whatever, we have to allow it to take that time. I don't think it will take an extra 200 years. But I'm just saying quality takes precedence over everything.

Aaron Turner:

I mean, the benefits of AGI [i.e. the potential upside] are virtually immeasurable. I mean, by my estimate, AGI will amplify gross world [product]...

Andrew:

Gross national product?

Aaron Turner:

Yeah. It's global GDP; gross world product, I think it's called. [Amplified] by an order of magnitude, that's $800 trillion a year. The potential benefits are vast. So who cares if it costs $10 billion to implement rather than $5 billion? Relative to the potential benefits, it's chicken feed. And also relative to the potential consequences: the Terminator scenario, or other scenarios which are possibly not quite as bad. Again...

Andrew:

Did you say there are 45 projects? So isn't there a bit of a chance, [with] all those 45, that something might go wrong with [one of] them?

Aaron Turner:

A virtual certainty. Yeah, because they [at least one such AI/AGI project] will cut corners.

Andrew:

Probably won't be around to finish it because someone else has messed it up.

Aaron Turner:

Yeah. It doesn't matter.

Andrew:

Like you said, we shouldn't.

Aaron Turner:

But like I said last time, we can't control what those other people [AI/AGI projects] do. We can only control what we do. Okay. So we're building this machine trying to improve the final [AGI] outcome. Okay. And we will do it in a quality-driven way, irrespective of what other people are doing. And hopefully they won't cut too many corners and they won't have too many disasters.

Aaron Turner:

They [at least some AI/AGI projects] will cut corners. I can guarantee that because they're [at least some of them are] profit motivated, and they will make sacrifices to the God of First-Mover Advantage. It's certain to happen. But anyway, that's them. I mean, but what we do, we can control. So like I say, the potential negative consequences if we don't do this right, are unimaginable.

Transcript 06:

Aaron Turner:

Now, a formal system of symbol shuffling rules has three parts. There's the syntax, which is a set of strings of symbols. And they're also called wffs or well-formed formulae. I hope the rest of the world pronounces it "woof", because that's how I pronounce it. So there's syntax, there's also semantics. Which is, you give each wff some meaning, a mathematical meaning. And then there are the proof rules, which are the ways in which these wffs can be shuffled. And we're going to look at a very simple formal system called MHW, Man Horse Wheel, which is purely educational.

Aaron Turner:

Remember I mentioned Hofstadter's Law? Okay? That's from this guy called Douglas Hofstadter, who wrote this book, Gödel, Escher, Bach. In that book he described a system called the MIU system. So MHW is basically the same as the MIU system; I've just used different symbols, more fun symbols. Okay, so the first part, remember, is syntax. So the first thing we have to do is define an alphabet, which is the set of symbols from which the formulas are constructed. So here are some example alphabets. Now, this alphabet is an alphabet of symbols that American hobos [used to] use. They write these little chalk symbols [e.g. on fenceposts] to mean, "Oh yeah, you'll get a good meal here." Or, "Beware, they've got a bad dog," or whatever. So, that's one particular alphabet.

Aaron Turner:

This is the ASCII table, I think. Anyway, a computer character set. Emojis, that's an alphabet. That's the Greek alphabet. Those must be Egyptian hieroglyphs, I think. And I don't know... It could also be Egyptian hieroglyphs.

Ersin:

It looks like Hebrew.

Aaron Turner:

What is it?

Ersin:

Hebrew.

Marta:

Hebrew.

[AMT - editor]:

[Actually, I double-checked. They're Linear B syllabic signs (as opposed to Linear B ideograms). So, not Hebrew - sorry guys!]

Aaron Turner:

Oh, okay, there you go. So I've chosen as MHW's alphabet the set of Linear B ideograms. Okay? Now, any culture that has symbols for footstool and bathtub is my kind of people. That's why I chose it. So that's the alphabet. And a formula of MHW... When we're talking about lots of different formal systems, quite often we'll put the name of the formal system at the beginning. So MHW-formula, MHW-wff, MHW-theorem, et cetera.

Aaron Turner:

So anyway, a formula is a possibly empty finite sequence of symbols, a finite string of symbols taken from that alphabet. That's what a formula is, okay? Just a string of symbols, but always finite. Now a well-formed formula for MHW is a non-empty finite sequence of the symbols man, horse, and wheel. Hence MHW. So here are some example wffs... I've made this really simple, okay? Man; horse-man-wheel-man; man-horse-horse-man-horse-horse-wheel-horse-wheel-wheel. Okay, you get the idea. So those are all wffs. You can see that they're a subset of formulas, because there are lots of symbols from the alphabet that we haven't used, that we're not allowed to use in a well-formed formula.
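[AMT editor: to make the formula/wff distinction concrete, here's a minimal sketch in Python, using a single-letter encoding of my own (M = man, H = horse, W = wheel) rather than the Linear B glyphs:]

```python
# Encoding assumed for illustration only: M = man, H = horse, W = wheel.
WFF_SYMBOLS = {"M", "H", "W"}

def is_wff(s: str) -> bool:
    """An MHW-wff is a non-empty finite string over {M, H, W}."""
    return len(s) > 0 and all(c in WFF_SYMBOLS for c in s)

print(is_wff("M"))       # True
print(is_wff("HMWM"))    # True  (a wff, though, as we'll see, not a theorem)
print(is_wff(""))        # False (empty, so not well-formed)
print(is_wff("MXW"))     # False (X is outside MHW's three wff symbols)
```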

Aaron Turner:

So there you go. So there's a set of all formulas, and the wffs are a subset of the set of all formulas. So every wff is a formula, but only some formulas are wffs. Now the semantics... MHW doesn't have semantics. Okay? So we can basically skip this section for this particular formal system. It means the symbols of MHW-wffs are meaningless. We could also say that they're uninterpreted, and that will have some meaning later on. So when we're shuffling these wffs, we are shuffling meaningless symbols. Okay? It's just a symbol-shuffling game, and we can just switch our brains off and shuffle these symbols.

Aaron Turner:

Now we're going to describe the proof rules, the symbol-shuffling rules, for MHW. Now, normally, in any system like this, the rules are presented like this: the rule has a name, obviously, and you have a line drawn, and above the line there are zero or more premises, and one conclusion underneath. And sometimes there's a side condition. Okay? And what it means is: if the optional side condition is satisfied and you've already derived all of the premises, then you can derive the conclusion. Now, MHW doesn't have side conditions, but later systems will.

Aaron Turner:

Now, a wff is a theorem (this is a new term now) if and only if it can be derived using the proof rules. And a proof is a non-empty finite sequence of theorems, each of which may be derived from theorems appearing earlier in the proof. These are fundamental mathematical concepts, theorem and proof. And it's just symbol shuffling. When you do it fully formally, it is nothing other than symbol shuffling. Okay? So the proof rules are here. Okay?

Aaron Turner:

So the first rule is an axiom. Basically, if a rule doesn't have any premises, so an empty top line, it's called an axiom. So rule zero basically says you can always derive the wff man-horse. Okay? Rule one: the top line says X horse, and the bottom line says X horse-wheel. Now, the X here is what's called a syntactic variable. It's a placeholder for, in this case, any [well-formed] formula, so any sequence of man, horse, and wheel. It's basically a way of specifying patterns. So if you have a wff that ends with a horse, irrespective of what the X is, then you can derive an identical wff with a wheel added onto the end. I'm going to give some examples of these.

Aaron Turner:

Rule two: if you have a wff starting with a man followed by some string X, then you can derive the wff starting with a man followed by two copies of X. Rule three: if you have a wff with three horses in the middle, then you can derive the wff obtained by replacing those three horses with a wheel. Okay?

Aaron Turner:

Rule four is: if you have a wff that includes a double wheel, then you can basically delete it. So you can derive the wff obtained by deleting the two wheels. Okay? I'm going to give you some examples. What have I done? Oh, okay. Here are some examples of each rule. So from nothing we can derive horse wheel [wrong - I meant man horse!] Do you see that? Yes? Now, in our collection of theorems, we've got the theorem horse wheel [wrong again - I meant man horse!] So from horse wheel [wrong!], using rule one we can... Sorry, not horse wheel, man horse [finally!] From man horse, we can derive man horse wheel. Do you see that? Because we've instantiated X with man. Okay? So the top line is man horse, and the bottom line is man horse wheel.

Aaron Turner:

So now we've got as theorems man horse and man horse wheel. Now rule two: from man horse we can derive man horse horse, because here we've instantiated X, the placeholder, with horse. So we can derive a theorem where we've basically doubled that horse to horse horse. But then we can apply it again: from man horse horse, we can derive man horse horse horse horse. Okay? These are just examples of pattern matching, where we're instantiating the X with a particular pattern.

Aaron Turner:

So rule three: from man, and I think that's eight horses, we can derive man horse wheel horse horse horse [horse - there should be four horses at the end!] using that rule, because we've basically taken three of the horses in the middle of that string of eight horses and replaced them with a wheel. And we can do the same thing again: from man horse wheel horse horse horse horse, we can derive man horse wheel wheel horse. This is a sentence I never thought I'd ever say.

Aaron Turner:

And the final rule, rule four... Yeah, we just derived man horse wheel wheel horse. And from man horse wheel wheel horse, using rule four, we can derive man horse horse. Do you get the idea? It's just mechanical symbol shuffling. There is no real thought involved. We're just matching patterns of symbols and... Yes?
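[AMT editor: all four rules are pure pattern matching, so they are easy to mechanize. A sketch, using an illustrative encoding of my own (M = man, H = horse, W = wheel), which computes every wff reachable from a given wff in one rule application:]

```python
def successors(w: str) -> set[str]:
    """All wffs derivable from w by one application of rules 1-4.
    Illustrative encoding: M = man, H = horse, W = wheel."""
    out = set()
    if w.endswith("H"):                      # rule 1: xH -> xHW
        out.add(w + "W")
    if w.startswith("M"):                    # rule 2: Mx -> Mxx
        out.add(w + w[1:])
    for i in range(len(w) - 2):              # rule 3: replace any HHH with W
        if w[i:i+3] == "HHH":
            out.add(w[:i] + "W" + w[i+3:])
    for i in range(len(w) - 1):              # rule 4: delete any WW
        if w[i:i+2] == "WW":
            out.add(w[:i] + w[i+2:])
    return out

# Replaying the talk's examples:
print(successors("MH") == {"MHW", "MHH"})              # True (rules 1 and 2)
print("MHWHHHH" in successors("M" + "H" * 8))          # True (rule 3)
print("MHH" in successors("MHWWH"))                    # True (rule 4)
```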

Marta:

How are the rules generated?

Aaron Turner:

Oh no, you can just make up whatever rules you want when you're designing a formal system. Okay? I mean, Douglas Hofstadter just made up this particular formal system for educational purposes, for teaching the concepts of theorem and proof and things like that. Okay? Now, later, more sophisticated formal systems will have another purpose, and we will design them in order to achieve that particular purpose. But right now, we just... Okay? These are just simple symbol-shuffling games, just to demonstrate how axioms and inference rules (as these are called) work. Okay? By shuffling symbols. And it's amazing: all of mathematics can be built on top of this. A different formal system, but you get the idea.

Aaron Turner:

So... What's this now? Okay. Now, when you start off, you haven't got any theorems yet. So the only way to start the process off is with the axiom, because that doesn't have any premises. Okay? So in any proof, that is always going to be the first rule. Okay? So the first theorem you generate will always be man horse, which starts with a man, okay? Now, if you look at all the rules, all the rules have the effect of preserving the initial man, okay? You can go back and check later, but they do. Okay?

Aaron Turner:

So, if you combine those two facts, the first theorem you always generate is man horse, and every rule after that preserves the initial man. That means every theorem starts with the man symbol. Okay? Now, some wffs... For example, this is a wff but not a theorem. So horse man wheel man is a wff, but it's obviously not a theorem, because it doesn't start with man. The rules guarantee that every theorem they generate starts with man. So by seeing that horse man... It's so hard to say... horse man wheel man doesn't start with man, we know that it's not a theorem.
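[AMT editor: this invariant can be spot-checked mechanically. A sketch (again an illustrative encoding of my own, M = man, H = horse, W = wheel): apply each rule to a sample wff and confirm the leading man symbol survives. This is a spot check on one wff, not a proof of the invariant; the real argument is the one in the talk, namely that no rule ever touches or removes the leading symbol.]

```python
# Each rule, applied where its pattern matches (None where it doesn't).
def rule1(w):    return w + "W" if w.endswith("H") else None
def rule2(w):    return w + w[1:] if w.startswith("M") else None
def rule3(w, i): return w[:i] + "W" + w[i+3:] if w[i:i+3] == "HHH" else None
def rule4(w, i): return w[:i] + w[i+2:] if w[i:i+2] == "WW" else None

w = "MHHHWWH"                    # an arbitrary wff starting with man
for result in (rule1(w), rule2(w), rule3(w, 1), rule4(w, 4)):
    assert result is not None and result.startswith("M")
print("leading man symbol preserved by every rule")
```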

Aaron Turner:

So therefore, we know that there are at least some wffs that are not theorems. And they're called non-theorems. And those non-theorems are unreachable by the proof rules. Okay? So, there you go. The theorems are a subset of the set of wffs. So only some formulas are wffs and only some wffs are theorems. So only some wffs are reachable by the proof rules. It's really hard to tell if I've made this too simple or too complicated. So the set of all... if you imagine the set of all theorems, we can visualize them as a tree. It's actually an infinite tree.

Aaron Turner:

So if we start at the top by rule zero, we get man horse and then we can split off. If we go from there with rule one, we get one thing. If we go from there with rule two, we get another thing. And you can imagine the set of all MHW theorems as an infinite tree. It's infinite. It goes on forever. Okay? The individual theorems are all finite in length, but there's an infinite number of them. Yeah. So for each theorem, if you look at the path from the root down to that theorem that corresponds to its proof, because it's basically the list of rules that you use to get to that theorem. Okay?

Aaron Turner:

And traditionally, when you write a proof out, you'll do it as numbered steps, and you will include the justification, how you got there. So step one is always rule zero, which is the axiom, and the theorem is man horse. Step two, from one by rule two. Step three, from two by rule two. And then step four, we got to that from step three by rule one, et cetera. Okay? If you're doing it really, really formally, that's how you would write a proof out. Now, the great thing about this level of formality is that it's completely unambiguous. Okay?
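[AMT editor: in that fully formal style, the four steps just described might be written out like this (a sketch; M = man, H = horse, W = wheel is my own shorthand):]

```python
# A fully formal proof: numbered steps, each with its justification.
proof = [
    ("MH",     "rule 0 (axiom)"),
    ("MHH",    "from 1 by rule 2"),
    ("MHHHH",  "from 2 by rule 2"),
    ("MHHHHW", "from 3 by rule 1"),
]
for n, (wff, justification) in enumerate(proof, 1):
    print(f"{n}. {wff:<7} {justification}")
```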

Aaron Turner:

And actually in a future talk, I'll describe how these systems came to be designed. And it's because philosophers, centuries ago, were trying to reason about things in natural language with common sense reasoning. And they would tie themselves up in knots and never reach any [firm] conclusions. But this is completely unambiguous. If somebody says to you, this is a theorem, right? You can check. Well, if they give you a proof, you can check their proof mechanically and determine if they've made a mistake or not when applying the rules. Okay? So the proofs can be checked completely mechanically if you don't believe the final theorem.

Aaron Turner:

Right. Now... That only works, obviously, if we're given a proof. But what if somebody instead gives us a target wff, like man wheel, and they say to us: is it a theorem or not? Okay? So in order to answer this question, we have to basically search the infinite tree of theorems for that pattern, man wheel. And we don't know in advance if it's there or not. If it is there, in other words, if it is a theorem, and we search the tree systematically, we are guaranteed to find it. We will eventually find it in a finite amount of time. If it isn't a theorem, in other words, it's not on the tree, then we'll just search forever and never find it. Obviously, we can't find it, because it's not there. Okay?

Aaron Turner:

So in this formal system, if something is a theorem, we can confirm that it is a theorem, because we will eventually find a proof. But if it isn't a theorem, we won't be able to confirm it either way. Because we might search the tree for a year and not find it, but who knows, if we kept going for another week, we might find it. So just because it's taken us a year and we haven't found it, we still can't categorically say that it's not a theorem.
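[AMT editor: the systematic search just described can be sketched as a breadth-first walk of the theorem tree (assuming, for illustration, the encoding M = man, H = horse, W = wheel; the step budget and length cap are mine, added only so the toy demo terminates, whereas a genuine semi-decision procedure would have neither):]

```python
from collections import deque

def successors(w):
    """One-step rewrites of wff w under MHW's rules 1-4 (M/H/W encoding)."""
    out = set()
    if w.endswith("H"):   out.add(w + "W")                      # rule 1
    if w.startswith("M"): out.add(w + w[1:])                    # rule 2
    for i in range(len(w) - 2):
        if w[i:i+3] == "HHH": out.add(w[:i] + "W" + w[i+3:])    # rule 3
    for i in range(len(w) - 1):
        if w[i:i+2] == "WW":  out.add(w[:i] + w[i+2:])          # rule 4
    return out

def prove(target, max_steps=10_000, max_len=40):
    """Breadth-first search of the theorem tree from the axiom MH.
    Returns a proof (list of wffs) if found within the budget, else None.
    The budget and length cap are artificial: without them, a non-theorem
    target would make the search run forever."""
    parent = {"MH": None}
    queue = deque(["MH"])
    while queue and max_steps > 0:
        w = queue.popleft()
        max_steps -= 1
        if w == target:                      # found: walk back up the tree
            proof = []
            while w is not None:
                proof.append(w)
                w = parent[w]
            return proof[::-1]               # root-to-target path = the proof
        for nxt in successors(w):
            if len(nxt) <= max_len and nxt not in parent:
                parent[nxt] = w
                queue.append(nxt)
    return None

print(prove("MHW"))     # ['MH', 'MHW'] -- found, so MHW is a theorem
```

[Note the asymmetry the talk describes: when the target is a theorem, the search halts with a proof; when it isn't, only the artificial budget stops it, and a None result tells you nothing conclusive.]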

Marta:

Isn't that a limitation for AI?

Aaron Turner:

For AI? No. There are a number of limitations of formal systems, a number of limitations of logic, which is basically just part of the structure of mathematics. But does that imply a limitation for AI? No, it doesn't. Because humans, for example, are subject to exactly the same mathematical constraints. Right? If something is not a theorem, you won't be able to prove that it isn't a theorem by using these rules. You will be able to prove that it isn't a theorem using another mechanism, which we will talk about in a future talk. But the machine can use that same mechanism too. It's just much more complicated. Okay? So, basically, it is a limitation of logic, but it doesn't constrain AI any more than it constrains humans.

Aaron Turner:

So yeah, the process of searching a tree for a particular wff, a target wff, is called theorem-proving. We're trying to prove a theorem. And we can do it manually, which is really tedious. Or we can try to write a computer program to at least help us. So, there are... Back in the seventies and eighties, there were programs which were called, theorem-proving assistants, which would help you in the process of proving a theorem. They would keep track of everything.

Aaron Turner:

And then more recently, from the 80s to basically now, there are programs that are essentially fully automated, just because computers are so much faster now. Now, even a fully automated theorem-prover, if you give it a wff that isn't a theorem, will still never find it. Okay? But they are capable enough to find proofs in many, many cases. And the whole point, of course, is utility. You want utility, you want value. So it doesn't necessarily have to find a proof in every single case for there to be value. Okay? So maybe, in a reasonable time, it can find all proofs that are up to a million lines long, but it can't find proofs that are longer than that. Well, that has still got a lot of use in engineering.

Aaron Turner:

So, we have introduced the following concepts: what an alphabet is; what a formula and a well-formed formula are; proof rules and axioms; theorems and non-theorems; the concept of a proof, or derivation; the process of generating theorems from the rules; the process of proving a theorem; and the concept of a theorem-prover, which is a program to actually automate that. And these concepts recur in any logic that you look at. And that's obviously what we're going to look at later on: more sophisticated formal systems that are actually logics and that have actual real utility. But it's much easier to start off with this little toy formal system to introduce these various things, like symbol shuffling and things like that.

Aaron Turner:

So, there is homework. And the task is, can you prove man wheel? There's the rules. Now this will be up... I'll put the video up so you'll be able to see these rules. The goal is, can you prove man wheel? Can you find a proof of man wheel using those rules? Okay?

Aaron Turner:

So that's it for today. Not bad timing. Quarter past eight, an hour and a quarter. So, like I say, next time we're going to pick up from here, and we're going to be looking at propositional logic. That's a different formal system, a little bit more complicated, with more complicated rules, but we will actually be doing proper deduction. Remember, that's our first cognitive process. We will be doing deduction using a set of proof rules called natural deduction, actually. But it starts here.

Aaron Turner:

Okay. Thank you for your attention. Oh! Any questions? Do you have a question? No. Okay. So, thank you for your attention and we'll see you all next month. Hopefully.