An international collaboration of engineers, scientists, and other experts, funded by philanthropists,

building the gold standard of Artificial General Intelligence, for all of mankind.

- About
- Publications
- Videos
- 03 - (Propositional logic, natural deduction theorem-proving)

PREREQUISITE: It is recommended that you watch Part 02 before watching this video!

- Clip 01 (3:34)
- Intro
- [Hello & audience photo]
- Why study logic?
- reductio ad penguin
- Clip 02 (6:25)
- Propositional logic - Syntax
- Clip 03 (18:45)
- Propositional logic - Semantics
- Clip 04 (15:16)
- Propositional logic - Proof rules
- Clip 05 (2:37)
- Propositional logic - Soundness & completeness
- Clip 06 (12:57)
- Propositional logic - Theorem-proving
- Clip 07 (3:32)
- Outro
- Homework!

Transcript 01:

Aaron Turner:

Hi, I'm Daddy Bear, or you can call me Aaron. I am nervous, even though it's a very small crowd. I need to take an audience photo. Is that okay with everybody? Yes? That's just fantastic. Thank you. [Edit: I'm joking around here a little because the audience on this occasion comprised only two people: my friend Pawel, plus my housemate Annie (who was manning the registration desk) who sat in in order to boost the numbers]. Here we go. Big crowd. Big smile. Thanks very much. Okay. This is a very long presentation, so interrupt me as we go along. Ask any questions you want. Any question you ask is good information for me. Useful information. Why study logic? Okay. This is an example of a logical fallacy I call reductio ad penguin. Penguins are black and white. Some old TV shows are black and white. Therefore some penguins are old TV shows. Obviously that's a logical fallacy. And I have argued in the introductory talk that human cognition is seriously flawed. Humans are prone to forming beliefs that are unsupported by evidence. They're prone to reaching invalid conclusions via logical fallacy. And they're prone to believing unverified speculative hypotheses. So obviously [reductio ad penguin] is an example of an invalid conclusion, reached via a logical fallacy.

Aaron Turner:

Flawed reasoning leads to flawed conclusions. And flawed conclusions lead to flawed or reduced utility. So if you design something (a plane, a computer, a phone, an AGI), if you produce a business plan, or formulate public policies, if they're based on flawed reasoning and flawed conclusions, then they're unlikely to deliver the desired utility. In other words they just won't work. So formal logic (for example, propositional logic that we're going to look at today, or first-order logic with equality that we'll look at next month) ... was developed from around the fifth century BC (a guy called Zeno of Elea came up with the first inference rule, which we'll be seeing later). And around 1929 (surprisingly recently) a mathematician called Kurt Gödel proved the [semantic] completeness of first-order logic.

Aaron Turner:

[Formal logic has] been developed in order to facilitate logical deduction with far greater utility than the common sense logic that is built into humans, which as I've said is flawed. And all you have to do is go down to the pub and listen to some, almost any conversation or go onto the interweb, and you will see millions of examples of logical fallacies being used in arguments.

Transcript 02:

Aaron Turner:

As you will no doubt remember from the talk on formal systems, we need to define, for propositional logic, we need to define its syntax, its semantics and its proof rules. And we shall also be looking at a couple of properties of propositional logic called soundness and [semantic] completeness. And then we'll look at theorem-proving, the actual process of proving theorems. Any formal system has an alphabet, and the symbols of propositional logic are the lower case Latin letters, the underscore, the digits 0 to 9, the left and right parentheses and the comma. And these symbols [∧, ∨, ¬, ⇒, ⇔] are typically used for and [∧], or [∨], not [¬], implies [⇒], and equivalence [⇔].

Aaron Turner:

This is a little bit confusing, but when we talk about a formal language like this, there are actually two languages being used. So there's the language that we're actually defining [in this case PROP], and the language that we're using to define it, which is basically "mathematician's English". And they share many symbols, so it can be very confusing which symbol belongs to which language. So, in order to make things clear, I've put the symbols of PROP in pink in all these slides. So you can tell which symbols are part of the PROP language and which are part of the metalanguage (which is the informal mathematician's English).

Aaron Turner:

This is essentially the same as the formal system we looked at last time. A formula of PROP is simply a possibly-empty finite sequence of symbols taken from that alphabet. So, like before, we're just shuffling symbols here. It's really mechanical. The [set of] well-formed formulae [wffs] is a subset of the set of all formulae, and we can define the wffs of PROP in two stages. Lowercase names are defined by this regular expression. And it basically just means one instance of the symbols a to z, followed by zero or more instances of one of the symbols underscore, a to z, and 0 to 9. In other words, a lowercase name is just a lowercase letter followed by zero or more underscores, lowercase letters, or digits.

Aaron Turner:

It might seem really tedious, but this is how carefully defined a formal language has to be in order for it to work. So [now we define] the set of all wffs. Now we know what a ... lowercase name is, we can define the set of all wffs recursively. So a proposition symbol is just a lowercase name, and a wff can either be a proposition symbol, an and [∧] symbol followed by a comma-separated list of wffs, an or [∨] symbol followed by a comma-separated list of wffs, the negation symbol [¬] followed by a wff, one wff implies [⇒] another wff, or the equivalence symbol [⇔] and a comma-separated list of wffs. Remember, these are just strings. We're just defining strings of symbols at this point.

Aaron Turner:

There can also be spaces and things like that [e.g. newlines] in between these characters, just for formatting or whatever, but they don't affect the meaning. So here are some example wffs. Simple ones at the top, just the lowercase names. And they get more and more complicated as we go down. But they're fairly straightforward I think.

Aaron Turner:

The PROP-wffs are defined recursively. It's easier to visualize if we draw [a wff] as a concrete syntax tree. So there in pink [at] the top right, there's a reasonably complex wff, but there's [also] the tree representing its structure. So you can see it's just a, at the top left, it's just a wff, and then underneath that there's how it's structured as sub-wffs and things.

Aaron Turner:

And despite the fact that wffs are defined recursively, it doesn't iterate forever, obviously. I mean, it always bottoms out with either a proposition symbol or an empty comma-separated list. If we remove white space and punctuation, then we can simplify a concrete syntax tree as an abstract syntax tree. And you can see it's much simpler. All it shows is the essential features. So again, right at the top we just have a wff and then I think that's an and [∧], and then two wffs. Because all of the punctuation is just there to make parsing easier. But the semantics depends just on this abstract structure. [Finally], two different concrete syntax trees can map onto the same abstract syntax tree, because things like spaces they are of no consequence.

Transcript 03:

Aaron Turner:

[Now] we look at the semantics. We've defined essentially the syntax (the grammar) of the language PROP. When we looked at Man Horse Wheel, the formal system last time, it didn't have a semantics, but PROP does. Which means, for every wff in the language, we will give that wff a meaning. All we need to do is we define some mechanism that assigns a mathematical value to any given wff, okay? And that value is then the meaning of that wff. We have a set of wffs and we define the semantic domain W [for wffs] to be [the set {⟙, ⟘}]. These symbols are typically called top and bottom in mathematics, but we can think of them as true and false. Okay?

Aaron Turner:

I've put true and false in quotes because there's a difference here between truth in the real world, and mathematical truth. This is a mathematical truth that we're defining for PROP, for this particular language. And they're not the same thing. Mathematical truth and truth in the real world are not the same thing. The value assigned to a wff by the semantics will be a member of [{⟙, ⟘}]. In other words, it will be top or bottom, which I'm just going to call true and false. Now we give each wff form a semantic definition, and when you see w, w1, w2, whatever, they're just placeholders for wffs. So you can basically just plug in any of the wffs that we've seen, any wff that you can construct from the syntax, you just plug it into one of those placeholders.

Aaron Turner:

Okay, now this is a conjunction. So do you see that and [∧] symbol, so a wff of the form and [∧], and then a comma separated list of wffs is true if every member of that list, w1 to wN, evaluates to true, and is false otherwise. So that's essentially the meaning of and [∧], the logical meaning of and [∧]. The way I've formulated PROP, is slightly different to what you'll see in most of the textbooks. In most of the textbooks and [∧] would just take two parameters, w1 and w2, but I've formulated it so that it can take any number, including zero.

Aaron Turner:

So if you go and [∧], and then an empty list, that's a base case, and it evaluates to true, because the semantics is: every member of that wff has to be true in order for the and [∧] to be true, but because it's an empty set it's vacuously true. So it's basically a shorthand for the constant true. So the intended meaning of and [∧] for any number of parameters can be summarized by truth tables, which you'll often see. So here are some truth tables for and [∧]. Right at the top, there's the truth table for the vacuous case when it doesn't have any arguments.

Aaron Turner:

And then the next one down is if it just takes one argument, but the truth table of real interest is the one at the bottom where it takes two arguments and you'll see and [∧] is only true if all of its arguments are true. So the fourth case at the bottom. Have you got that? For and [∧] to be true, for the two parameter case, both of the parameters have to be true. Okay? And we're going to do this for each of the constructs in the syntax. A disjunction is a just a posh name for or [∨], which is that symbol there, looks like a v, and it's very similar. But an or [∨] is true (a disjunction is true) if at least one member of the list is true and false otherwise. Okay? And in this case the base case evaluates to false.

Aaron Turner:

Okay? Because in order for it to be true, at least one member has to be true, but there aren't any members to be true, therefore it evaluates to false, and it's essentially a way of writing the constant false. And again, we can write truth tables for or [∨], and (if we just drop down to the truth table for the two parameter case) you can see if either w1 or w2 is true then the disjunction is true. So it's only false when both w1 and w2 are false, which is the logical meaning of or [∨]. Negation [¬] is really easy. Basically negation [¬] just inverts the truth value of whatever it's given. So it converts false to true and true to false.

Aaron Turner:

Implication. w1 implies [⇒] w2 if w1 is true and w2 is false. This is quite confusing for a lot of beginners to logic. Note if w1 is false, the implication [⇒] always evaluates to true. So basically false implies anything. This is what one of my lecturers used to say: if false is true, then monkeys ride elephants. So if the thing to the left of the implication sign [⇒] is false, then the implication as a whole is true.

Aaron Turner:

So if you ever manage to prove false, if you're making some argument and you are able to derive false, then from that you can conclude anything, because false implies [⇒] anything. So it indicates that the assumptions that you started with are inconsistent. Equivalence. This is basically w1 implies [⇒] w2 and w2 implies [⇒] w1. So the two things are logically equivalent [⇔].

Aaron Turner:

An equivalence [⇔] is true, basically, if every member of w1 to wN is true, or every member is false. Okay? So they basically just have to have the same logical value. And again, we can put that in a truth table. So again, if you look at the bottom table, for two parameters, the first and last entries are true, because both of the inputs are the same, but the middle two entries are false because the inputs are different from each other.

Aaron Turner:

We've now given a semantics to every wff form except proposition symbols. Those were the lower case names. We need to use the concept of an interpretation; an interpretation is a mapping from the set of proposition symbols to the set {⟙, ⟘}. It's just a function ... this will become clear in a minute. Given an interpretation, we can evaluate any wff to either true or false, because if it's of the propositional form, we can just use the interpretation to decide if it's true or false, and if it's any of the other forms, we can use the truth tables to decide whether it's true or false.

Aaron Turner:

Any wff is going to be finite, so it's only going to have a finite number of different proposition symbols. So when we think about interpretations, we only actually need to assign a value to the proposition symbols that actually occur in a wff. So here's an example and this will be really simple. This interpretation assigns true to p, false to q, and true to r. So, in that interpretation, this wff ∧(p, q, r) evaluates to false. And this other wff, q ⇒ r, evaluates to true (in that interpretation). Okay? So I'm wishing I hadn't put this in now!

Aaron Turner:

Here's an interpretation: p is false and q is true, and there's a more complicated wff involving p and q and we want to evaluate it to true or false. And you basically do it just by writing it out. And initially you mark everything as unknown and gradually you fill it in. So you label all the p's as false, because that's what the interpretation says, you label all the q's as true. Then you can start to evaluate the operators. So you can evaluate the negations [¬], you can evaluate the innermost implications [⇒]. And then finally you can evaluate the outermost and [∧]. And then there's the outermost implication [⇒], which evaluates to true.

Aaron Turner:

So that reasonably complicated wff evaluates to true in that particular interpretation. Now imagine the set of all possible interpretations for a particular wff. A wff is called unsatisfiable if it evaluates to false in every possible interpretation. It's called invalid if it evaluates to false in at least one interpretation. It's called satisfiable if it evaluates to true in at least one interpretation. And it's called valid if it evaluates to true in every interpretation.

Aaron Turner:

Unsatisfiable wffs are also called contradictions, they can basically never be true, and valid wffs are called tautologies, they're always true. The valid wffs are particularly important because it doesn't matter what statements you plug in for the propositional symbols, they always work, they always evaluate to true. So for example, ∨(p, ¬p) is a tautology. It doesn't matter what p stands for, ∨(p, ¬p) is always going to evaluate to true. Okay?

Aaron Turner:

Just being thorough here, I've got an example of a wff that is both invalid and unsatisfiable. Now it's only got two propositional symbols within it. So there are only actually four possible interpretations, and I've just listed them all out. So you can see that in interpretation 1, it evaluates to false, and in all the others it evaluates to true.

Aaron Turner:

So because it evaluates to false in interpretation 1 it's invalid. And because it evaluates to true in at least some interpretations, it's therefore satisfiable. So it's neither unsatisfiable nor valid.

Aaron Turner:

Another example, of a valid wff, a tautology. Here are the four possible interpretations. And you see it evaluates to true in every single interpretation. It doesn't matter whether you plug true or false in for p and q, it always comes out true. And for that reason it's valid according to the definition.

Aaron Turner:

Okay. So there's a funny symbol [⊨] here. If you have a comma separated list of wffs, then that funny symbol [⊨] which is called a double turnstile, and then w on the right hand side, that is read "wff w is a semantic consequence of the wffs on the left hand side of the turnstile".

Aaron Turner:

Just as a note here, I've put the turnstile [⊨] and then the subscript PROP. This is to make it clear that we're talking about this particular formal system. Usually if you know what formal system you're talking about, then you can usually drop the subscript. But sometimes you're talking about more than one formal system, and you have to be unambiguous about which one you mean.

Aaron Turner:

The wffs to the left of the turnstile [⊨] are called antecedents or assumptions or hypotheses. And the one on the right is called the consequent or succedent. This is the definition of semantic consequence: the consequent w is a semantic consequence of the antecedents w1 to wN if and only if, for all possible interpretations (remember there's only a finite number of them), if all of w1 to wN evaluate to true (in that particular interpretation) then w also evaluates to true [in that particular interpretation].

Aaron Turner:

This is the most important concept in all of logic. This is basically the idea of when certain beliefs, certain assumptions, necessarily entail a conclusion. And that is basically the fundamental property that we're trying to understand [capture]. So yeah, lots of examples of this. So here's an example. There's two wffs on the left [of the turnstile ⊨], p ⇒ q and p ⇒ ¬q. And the conclusion [consequent] is ¬p.

Aaron Turner:

And we can see from this truth table, for every line in the truth table in which all of the antecedents are true, the consequent w is also true. So it's only the top two, and in the top two both of the antecedents are always true, and the consequent is always true. In the bottom ones, sometimes one of the antecedents is false. In that case, we don't care, okay? We only care [about the consequent] when the assumptions, the antecedents on the left hand side [of the turnstile ⊨], are true. Okay?

Aaron Turner:

In this case, whenever all of the antecedents on the left hand side of the turnstile [ ⊨] are true, the consequent is true. And so the semantic consequence holds. So this is a counter example. This is a semantic non-consequence, and you see I've put a stroke through the double turnstile [⊨]. So from this truth table, we can see that, in at least one line of the truth table, in at least one interpretation in which all of the antecedents are true, the consequent is a false, and I've highlighted it as the top one. So all the antecedents are true, yet the consequent is false. Therefore, that logical consequence doesn't hold.

Transcript 04:

Aaron Turner:

Okay. So, that's the semantics of propositional logic, where we give each wff a meaning, either true or false, top or bottom. So now we're going to look at the proof rules, which is just where we're doing symbol shuffling, just mechanical symbol shuffling. There are lots of ways to define the proof rules for propositional logic or first-order logic. The easiest really to understand is called natural deduction, which was I think was first formulated by Gerhard Gentzen in 1935. So this is called a syntactic sequent. Remember the other one was called a semantic sequent. So this looks very, very similar. We've got a list of wffs on the left. We've got a funny turnstile thing [⊢] and then we've got the conclusion w, the consequent. So this means wff w is a syntactic consequence of w1 to wN. And this symbol [⊢] in this case, it's called a single turnstile [⊢] obviously. But the others are called the same as in the semantic case. The wffs on the left are called the antecedents, assumptions or hypotheses. And the w on the right of the turnstile [⊢] is called the consequent or succedent.

Aaron Turner:

In Man Horse Wheel, that we looked at when we looked at formal systems, the theorems were all wffs. But in PROP the theorems are syntactic sequents. And a syntactic sequent is a theorem if and only if it can be derived using the proof rules, which I haven't given you yet, but I'm just going to describe in a minute. And a proof is a non-empty finite sequence of theorems [i.e. syntactic sequents], each of which may be derived from theorems [syntactic sequents] appearing earlier in the proof. We're going to see lots of examples, but these are the formal definitions. In the proof rules that follow, we're going to see two rules for each of these logical connectives, we're going to see one rule for introducing the connective called something-introduction, and one rule for eliminating the connective called something-elimination. And we're also going to see two structural rules called Hyp and Weaken.

Aaron Turner:

Saying w1 to wN all the time is a bit tedious. So sometimes we will just say A's, B's and P's, etc, to mean a list of wffs, or a set of wffs. The first proof rule is as follows. Do you remember from the discussion of formal systems when we looked at Man Horse Wheel that an inference rule is typically drawn as a set of premises above a line and a single conclusion beneath the line. The number of premises can be zero as in this case. So this rule has no premises and it has a conclusion obviously underneath the line. And because it has no premises, it's called an axiom. There's [also] a side condition here.

Aaron Turner:

Basically, ... we can conclude that C is a syntactic consequence of A's, as long as C is a member of A's. The examples will make it clear. Okay. And this is how we write proofs. So in step 1 you can see to the left of turnstile [⊢] we see p and q and to the right [of the turnstile ⊢] we can have p, because p is a member of the set of wffs on the left of the turnstile [⊢]. And we justify that step by saying that we've used the Hyp rule. And the second step is basically exactly the same, okay? So the Weaken proof rule just means that if we have a set of assumptions to the left of the turnstile [⊢], we can always add things to it, okay?

Aaron Turner:

Again, if you look at the example. Step 1, we've got p, q ⊢ p. And we can just add an r to the assumption. So we go p, q, r ⊢ p, and we can justify that as being derived from step 1 using the Weaken rule, okay? So we'll just go through them all. So this is the ∧-intro proof rule and it's called introduction, because if you see on the bottom line, the conclusion, we are introducing a conjunction, we're introducing the ∧. So, basically if we've already proved C for every C in some set of C's, then we can derive, we can conclude ∧ of all of those C's. It's fairly straightforward from the semantics.

Aaron Turner:

There's an example at the bottom. So we have two hypothesis steps where we've concluded p and q. And because we've already [separately] derived p and q, we can derive ∧(p, q). We can go the other way. So if we already know for example, ∧(p, q) is true, then we can then we can derive that either p or q is true as I showed in this example. So we've derived ∧(p, q) and from that we can derive p using ∧-elim. For the ∨-intro, if we've got, if we've already derived an ∨ with a certain number of wffs in it, we can always add to that list.

Aaron Turner:

So as an example, if we've already derived ∨(p, q) then we can derive ∨(p, q, r) using ∨-intro as the justification. ∨-elim is a little bit more complicated. This rule [has] got two premises. So we actually have to prove two things in order to be able to derive the conclusion. So basically if we've already proved some disjunction, so ∨(C's), [and] if we've also proved, for every C in that list of C's, ... that C ⇒ D, then we can conclude D, okay? Because it doesn't matter which of the C's is true, we know whichever one it is, we can deduce D, and we know that at least one C is true. There's an example at the bottom again, so we've already proved ∨(p, q), we know that p ⇒ r, we know that q ⇒ r, so we can deduce r.

Aaron Turner:

So ¬-intro is called reductio ad absurdum, sometimes proof by contradiction. And this is actually the first proof rule that was ever formulated (by Zeno of Elea in the 5th century BC). So that was really the start of the process of the formulation of modern logic. This is the first rule they came up with. So if we've proved that C ⇒ D and we've also proved that C ⇒ ¬D well something's wrong, because they can't both be true. So from that we can conclude, ¬C, i.e. C is not true, because if C is true we can conclude a contradiction. There's an example of the application of that rule at the bottom. So if p ⇒ q and p ⇒ ¬q, then we can conclude ¬p. ¬-elim is probably the easiest one. So ¬¬C is C. So if we have a double negation we can just eliminate it, okay? So if we derive ¬¬p, we can derive p.

Aaron Turner:

⇒-intro. Now this is a little bit more complicated. You see on the top line, on the left of the turnstile [⊢], we've got A's and we've got a wff C. So if the assumptions include C and from [those] assumptions we can derive D, then we can derive C ⇒ D but with just A's on the left of the turnstile [⊢] and without the C to the left of the turnstile. So there's an example: we've assumed p and q, and we've derived p using the Hyp rule. And then we can derive the sequent which has just got p to the left of the turnstile [⊢] and q ⇒ p to the right.

Aaron Turner:

This rule, ⇒-elim, is also called modus ponens. Really important when you're making an argument, if you've proved that C is true and you have also proved that C ⇒ D, then you can conclude D. Sounds very simple, but it's really, really powerful. Really useful. There's an example [where you've] derived p, derived p ⇒ q, and so therefore you can derive q using ⇒-elim as justification. The last operator we've got is equivalence [⇔]. There's an ⇔-intro proof rule. It's a little bit more complicated because we've got a list. On the bottom line, we want to conclude that all the C's are equivalent and we can do so if on the top line we've already derived C ⇒ D for every C and D in C's.

Aaron Turner:

You only have to consider distinct C's and D's, otherwise you end up unnecessarily proving that C ⇒ C. There is an example at the bottom which will make it clear. We've already derived p ⇒ q and we've derived q ⇒ p. And consequently on the third step of the proof, we can derive that p and q are equivalent [⇔(p, q)] using the ⇔-intro rule. ⇔-elim is very simple. If we've already proved that a bunch of wffs C's are equivalent, then we can take any two [C and D] out of that list, and we can just say C ⇒ D. So here at the bottom if we've proved that p and q are equivalent [⇔(p, q)], then we know that p ⇒ q, justified by ⇔-elim.

Aaron Turner:

That is the complete set of proof rules. I think there's 12.

Transcript 05:

Aaron Turner:

We've [now] looked at both the semantics (with all the truth tables and things, and the interpretations), and we've looked at the proof rules (which are the rules for shuffling symbols around, mechanically shuffling symbols). They're very different and reasonably complicated. What's quite incredible is that these things are equivalent. So there's a semantic [consequence - sorry!], and there's a syntactic [consequence - sorry again!], and you can see the similarity. And, for this sequent at least, we can use either [the] semantics or the proof rules to confirm that ¬p is a logical consequence of the [assumptions ... on the left of the turnstile]. So, [propositional logic] has the property of being strongly sound.

Aaron Turner:

So by this I mean for any set of wffs A's and any wff C, if C is a syntactic consequence of A's, then it is also a semantic consequence of A's. And the reverse is also true, which is called [strongly semantically complete - got a bit tongue-tied there!] So, this is going [in] the other direction. For all sets of wffs A's and all wffs C, if C is a semantic consequence of A's, then C is a syntactic consequence of A's. So they are equivalent. For any set of wffs A's and any wff C, C is a semantic consequence of A's if and only if it is a syntactic consequence of A's. In other words, the two ways of thinking about things, either thinking about [the] semantics, or the proof rules (the mechanical symbol shuffling), are interchangeable. And later on, this will be very valuable.

Transcript 06:

Aaron Turner:

Theorem proving. Why are we doing all this? Because it's a bit tedious! There is actually a reason, there is utility. So here, from earlier, this is our logical architecture of Big Mother, the artificial generally intelligent machine, and PROP, propositional logic, is a potential concrete representation mechanism we can use for the [machine's] beliefs A, B, and C that you see in that architecture. We can use PROP to represent the machine's beliefs and we can use PROP's proof rules as a mechanism for performing deduction on those beliefs. This is a way of getting the machine to think. Deduction is a form of cognition. It's only very simple at this stage because PROP is very simple, but it's a start. So, how can we actually implement this cognitive process? Well, we need to automate the process of searching the infinite tree of all theorems for a target sequent. In other words, by writing a theorem-prover, a computer program to search for theorems. We're going to do an example now. We're going to prove an example theorem by hand in the same style that a theorem-prover would.

Aaron Turner:

Believe it or not, when you prove a theorem, you actually work backwards. There's the conclusion that we're trying to prove at the end. The assumptions are ∨(p, q) and ¬p, and the consequent is q. Is this [sequent] a theorem or not? We don't know. Working backwards, we can use the HYP rule to get us started. Because ∨(p, q) is on the left of the turnstile [⊢], we can conclude ∨(p, q) on the right of the turnstile [⊢], and that is justified by using the Hyp rule. We now have a theorem at line 2 that includes the ∨ symbol, i.e. it's a disjunction. So that gives us a hint that maybe we can use ∨-elim to prove our target theorem on the bottom line. I've put the justification at the bottom: 2, 3, 4, ∨-elim, but that's assuming of course that we can prove lines 3 and 4.

Aaron Turner:

So these now become new theorems to be proved, i.e. sub-theorems as it were. Incidentally, this is a mechanism that is quite often called a blackboard. We have a blackboard of goals and achievements. The achievements are the theorems we've already proved (with a green tick), and the goals are the theorems that we haven't yet proved (with the red question mark). So now we've proved one theorem (line 2) and in order to discharge the goal at the bottom (line 1), we now have these sub-theorems, i.e. sub-goals (lines 3 and 4). We need to try to prove both p ⇒ q and q ⇒ q. So again it's the actual structure of the theorem we're trying to prove that gives us a hint as to which rule to use. Here we know that if we can prove line 5, i.e. if we can prove q having added q as an extra assumption, then we will be able to use the ⇒-intro rule to prove q ⇒ q. We can discharge line 5 because that's very simple.

Aaron Turner:

We've got q on both the left and right sides of the turnstile [⊢], so we can use the Hyp rule, and because we've just proved line 5 then we've proved all the premises we need to be able to prove line 4. So line 5 is a theorem and we now know that line 4 is a theorem. So now we turn our attention to line 3, p ⇒ q, and again, because of the structure of what we're trying to prove (it includes an ⇒ symbol) we can maybe use ⇒-intro. So if we can prove line 6 with a new p on the left of the turnstile [⊢] and q on the right of the turnstile [⊢], then we'll be able to use ⇒-intro to derive p ⇒ q. So, again, now we're trying to prove line 6, we've just got a q on the right hand side [of the turnstile ⊢] it doesn't have any structure. So it's not giving us a hint like in the other cases where we had an ∨, or we had an ⇒, and so it gave us a hint as to what rule we might be able to use.

Aaron Turner:

When we're kind of stuck, we can always use ¬-elim [thereby setting up ¬-intro, i.e. reductio ad absurdum]. So that's what we've done here. If we can prove ¬¬q then we can use ¬-elim to prove q. And again, now, if we can prove lines 8 and 9 (¬q ⇒ p and ¬q ⇒ ¬p), then we can use reductio ad absurdum (Zeno's proof rule, i.e. ¬-Intro) to prove line 7. So now we have two more sub-goals to try and prove, (i.e. lines 8 and 9). Again [at line 8] we have ¬q ⇒ p, so we have an ⇒ symbol which suggests that maybe we can use the ⇒-intro rule. So now, at line 10, if we add ¬q as a new assumption, can we derive p? If we manage to do that then we can use ⇒-intro to prove line 8. Line 10 we can discharge easily because we've got p on the left hand side of the turnstile [⊢] and p on the right hand side of the turnstile [⊢], so we can use Hyp.

Aaron Turner:

So we know that line 10 is a theorem, and because we know that line 10 is a theorem, we also know that line 8 is a theorem by ⇒-intro. Turning to line 9 we have basically the same thing. It's got an ⇒, so we try to use ⇒-intro. Looking at line 11, because we've got ¬p on both sides of the turnstile [⊢], we can use Hyp, and then now we get a cascade [of goals converting into achievments]. We've proved that line 11 is a theorem, which therefore means that line 9 is a theorem, and then from that we know that line 7 is a theorem, and then line 6 is a theorem, line 3 is a theorem, and then line 1. So basically we've proved everything, and finally we just renumber everything. And that's our proof. With each line we show the assumptions, we show the consequent, and we show the justification (i.e. how we got to it from earlier lines in the proof). And that's what a proof looks like - a formal proof.

Aaron Turner:

So this is pretty obviously a very mechanical process, the rules are very, very precisely defined and we're just shuffling symbols according to these unambiguous rules. So obviously we can write a computer program to do it because this is the sort of thing computers do. There are two obvious problems. The infinite tree of all theorems that we're searching in any particular case is obviously infinite. And, when we're trying to search for a proof, the number of possible paths through it is well, vast, also infinite. This is called the combinatorial explosion problem. And therefore it could take even a computer basically forever to find a proof. The other thing is that when we were proving even that simple theorem, we did seem to use some human ingenuity when deciding which proof rule to use. Sometimes the proof steps were suggested by the structure of the target wff, but sometimes they weren't. So when we used ¬-intro [via ¬-elim], so there was some human ingenuity going on.

Aaron Turner:

So how do you reproduce that in a computer program? Both of these problems, combinatorial explosion and the problem of human ingenuity, apply to any AI search problem, and there are lots of them (so, not just theorem-proving). We'll be looking at both of these issues because we want our machine to reason effectively, to be able to do these search problems despite the combinatorial explosion. And we need to find some way of reproducing the human ingenuity that a human mathematician would apply.

Transcript 07:

Aaron Turner:

So there is homework. I knew you'd be pleased. There's some suggested reading. This is really good. It's on the internet. The Stanford Encyclopedia of Philosophy. There are millions of books on logic. Probably the clearest one I've found is called The Logic Manual, and there's really quite a classic book called How To Prove It, which gives you all these little tricks and techniques for deriving proofs by hand.

Aaron Turner:

I have to warn you that all of these books will define whichever logic it is, propositional logic or first-order logic, slightly differently. The syntax is always slightly different. The semantics is always slightly different. The proof rules are always slightly different. But they are essentially equivalent. So the homework is try to prove a bunch theorems by hand, and while you're doing so, ask yourself what information did I just use to prove that theorem? Because this is the key, okay? You have to use all the information that is available, but, of course, to do that, you have to know what information there is. So think about what information you use to prove the theorem, where it came from, how that information might be represented in a computer, how might that computer derive that information for itself rather than being programmed to do it.

Aaron Turner:

So if you ask these questions enough times, you will begin to see solutions to these problems. I spent an entire year doing nothing but proving theorems by hand. I'm not joking. I did. To ask myself what information did I use? Where did it come from? And I don't think there's any shortcut. If you're a programmer, have a go at writing a program, say in Python or something, that takes a PROP syntactic sequent as input and tries to find a proof of that sequent. You know what the syntax is for PROP. You can write a little parser to parse it in. You know what the proof rules are. They're very mechanical. They're very simple. And you can write a little program to use those rules to search the infinite tree of all theorems looking for a proof.

Aaron Turner:

So, that's it, after only an hour and 10 minutes. Thank you very much for your attention. Next month we'll be taking it up a level to first-order logic. Okay. Thank you.

Annie:

Thank you.