Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

The Impossibility of Trisecting the Angle and Doubling the Cube (metamath 100 #8) #4246

Open
tirix opened this issue Sep 25, 2024 · 20 comments

Comments

@tirix
Copy link
Contributor

tirix commented Sep 25, 2024

An elegant proof would certainly rely on Galois Theory.

Such a proof is provided Chapter 7.4 of [Ian Stweart - Galois Theory], p.99, theorems 7.13 and 7.14.

@savask
Copy link
Contributor

savask commented Sep 28, 2024

Maybe worth mentioning that the proof doesn't require a great deal of Galois theory, in fact the only crucial bit seems to be that field extensions are multiplicative, and that is already proved in set.mm in extdgmul. The fact that the cubic root of 2 gives us a field extension of degree 3 can be proved by elementary means, by mimicking the argument about minimal polynomials.

Probably the most laborious part would be to define constructible numbers and prove that they give field extensions of degree a power of 2. I took a look at how other formalization systems do this, and both HOL Light and Isabelle go with the geometrical definition. That is, we start with two points (if we identify points on the plane with vectors, then usually (0, 0) and (1, 0)) and call them constructible by default. Then:

  1. if points A, B, C, D are constructible, then the intersection of proper non-parallel lines AB and CD is again constructible,
  2. if points A, B, C, D, E, F are constructible, then the intersection points of a proper line AB with a circle with center at C and radius |EF| are again constructible,
  3. finally, if A, B, C, D are constructible, then the intersection points of two distinct circles with centers A, C and radii |AB|, |CD|, respectively, are also constructible.

I think it would be beneficial to model constructible numbers in set.mm as a subset of complex numbers, as we do in section "Theorems of Pythagoras, isosceles triangles, and intersecting chords" of set.mm. Geometry of CC is more developed in set.mm than the Tarskian geometry, and this gives an additional benefit that we may identify points on the plane with points of CC and hence a real number is constructible if the corresponding complex number (i.e. point) is constructible.

I unfortunately don't know how recursive definitions work in set.mm, so I cannot draft a preliminary definition of constructible (complex) numbers, but it seems that we should start with 0 and 1 as being constructible by default and properties 1)-3) can be easily stated with standard operations with complex numbers, for example, one can use something in the spirit of sigarcol to test if points lie on the same line.

The next step would be to show that construcible numbers form a field (again, I think the fact that we would be able to use + and x on complex numbers should simplify the proof), and then perhaps the most difficult step would be to show that if a number is constructible then the corresponding field extension degree is a power of 2. I don't know if set.mm has notation for a field generated by a subfield and some element, but more importantly, we would have to prove that intersections of circles and lines require only quadratic field extensions.

I would be glad to participate and discuss the details if someone is willing to try formalizing some of the definitions and intermediate steps required.

@icecream17
Copy link
Contributor

Something like ( rec ( ( s e. _V |-> AllPointsConstructibleFrom_s ) , { <. 0 , 0 >. , <. 1 , 0 >. } ) " _om ) would work

@savask
Copy link
Contributor

savask commented Jan 7, 2025

I tried giving a definition of constructible points (and numbers) using icecream17's rec idea:

$c Constr $.
cconstr $a class Constr $.

df-constr $a |- Constr = ( rec ( ( s e. _V |-> { x e. CC |
 ( ( E. a e. s E. b e. s E. c e. s E. d e. s E. t e. RR E. r e. RR ( x = ( a + ( t x. ( b - a ) ) ) /\ x = ( c + ( r x. ( d - c ) ) ) /\ -. ( Im ` ( ( * ` ( b - a ) ) x. ( d - c ) ) ) = 0 ) \/
 E. a e. s E. b e. s E. c e. s E. d e. s E. e e. s E. f e. s E. t e. RR ( x = ( a + ( t x. ( b - a ) ) ) /\ ( abs ` ( x - c ) ) = ( abs ` ( e - f ) ) ) ) \/
 E. a e. s E. b e. s E. c e. s E. d e. s E. e e. s E. f e. s E. t e. RR ( -. a = d /\ ( abs ` ( x - a ) ) = ( abs ` ( b - c ) ) /\ ( abs ` ( x - d ) ) = ( abs ` ( e - f ) ) ) )
 } ) , { 0 , 1 } ) " _om ) $.

Here constructible points are understood as a subset of complex numbers CC. We start with 0 and 1 being constructible by default, and the first line says that the intersection of two not-parallel constructible lines is constructible (notice how the expression with Im checks for collinearity, it was used in sigarcol for the same purpose), second line is for the intersection of a line and a circle, and the third line is for the intersection of two circles with different centers. This is pretty much the same definition as Isabelle or HOL Light use, but inside the complex plane.

I guess one could similarly give a more general definition of "points constructible from some set", but I don't think this is important enough to complicate the definition.

What do you think? Does this definition look right? If we agree on this definition, we could try writing down intermediate statements required to prove the metamath 100 question. Norman suggested formalizing statements (without proofs) of metamath 100 theorems in the past, so this could be a good idea.

@tirix
Copy link
Contributor Author

tirix commented Jan 7, 2025 via email

@savask
Copy link
Contributor

savask commented Jan 8, 2025

so it looks correct

Nice, so at very least we can express the statement of the theorem now (cubic root of 2 and cos(pi/9) don't lie in Constr).

I don't think we have a notation yet for a field generated by a subfield and and some additional element (like e.g. ℚ[√2]), that would certainly be a nice addition, and I'd be interested to contribute!

That's an interesting question whether we really need a full-fledged definition for the proof. We certainly want to work inside CC, so let's say s is a subfield of CC (viewed as a subset), and c e. CC is such that ( c x. c ) e. s, so a square root of some element from s. Then

{ x e. CC | E. a e. s E. b e. s x = ( a + ( b x. c ) ) }

is also a subfield of CC. We want to prove that any constuctible number can be obtained by a finite series of field extensions like that. Maybe we could define "algebraically constructible numbers" as

$c ConstrAlg $.
cconstralg $a class ConstrAlg $.

df-constralg $a |- ConstrAlg = ( rec ( ( s e. _V |-> { x e. CC | E. a e. s E. b e. s E. c e. CC ( ( c x. c ) e. s /\ x = ( a + ( b x. c ) ) ) } ) , QQ ) " _om ) $.

and then we need to prove that Constr C_ ContrAlg (they are equal, of course, but we require only inclusion for impossibility proofs). Now, I'm not sure if it's straightforward enough to relate quadratic field extensions inside CC I described earlier with this definition of ConstrAlg, so your thoughts are appreciated. Certainly the recursive operation in the definition of ConstrAlg doesn't always give a field, so one would need some sort of proof by induction, showing that if a number can be obtained from QQ by k iterations of this operation, then it lies in a finite field extension of QQ of index a power of 2.

@tirix
Copy link
Contributor Author

tirix commented Jan 8, 2025

I believe this approach would work, and we don't really need the full-fledged definition of a generated field for the proof. But it would be sooo nice to use this opportunity to formalize more of the Galois Theory and have some basic bricks for e.g. the insolvability of the quintic...

@tirix
Copy link
Contributor Author

tirix commented Jan 8, 2025

My thought was defining something like:

df-fldgen $a |- fldGen = ( f e. _V |-> ( e e. ~P ( Base ` f ), s e. ~P ( Base ` f ) |-> |^| { a | ( s C_ a /\ f /FldExt a /\ e /FldExt a ) } ) )

Then given a universe field F, a subfield E and a generating set S, ( E ( fldGen ` F ) S ) would be the field extention of E generated by S in F. E.g. we could write ( ( CCfld |`s QQ ) ( fldGen ` CCfld ) { ( sqrt ` 2 ) } ).

@savask
Copy link
Contributor

savask commented Jan 9, 2025

But it would be sooo nice to use this opportunity to formalize more of the Galois Theory and have some basic bricks for e.g. the insolvability of the quintic...

I agree, then let's see if we can formulate it in Galois theory terms.

My thought was defining something like: ...

Any reason why we need a subfield E at all? Why not define fldGen for a universe field F and a subset S, so F fldGen S will be the field generated by S inside F. In theorems where we extend a subfield E with elements from S we could use an idiom F fldGen ( ( Base ` E ) u. S ).

Also, in your definition you intersect fields with |^| operation, is it going to work? I don't understand extensible structures in set.mm very well, but I would assume that one first has to intersect sets of the form ( Base ` a ), and then repackage the result back into a structure.

@tirix
Copy link
Contributor Author

tirix commented Jan 9, 2025

In your definition you intersect fields with |^| operation, is it going to work? I don't understand extensible structures in set.mm very well, but I would assume that one first has to intersect sets of the form ( Base ` a ), and then repackage the result back into a structure.

You're right, this would not work. For an intersection-based definition, the resulting object has to be a subset of the base. We can then recover a field by using the structure restriction operator: ( F |`s X )

Any reason why we need a subfield E at all? Why not define fldGen for a universe field F and a subset S, so F fldGen S will be the field generated by S inside F. In theorems where we extend a subfield E with elements from S we could use an idiom F fldGen ( ( Base ` E ) u. S ).

That's a good remark, this would simplify the definition to really "a field generated by a set" without even mentioning field extension.
A definition could be:

df-fldgen $a |- fldGen = ( f e. _V , s e. ~P ( Base ` f )  |-> |^| { a e. ( ( SubRing ` f ) i^i Field ) | s C_ a } ) )

This resembles the definition of sigma algebra generated by a set (df-sigagen), or that of the ring spanned by a set (df-rgspn)

@tirix
Copy link
Contributor Author

tirix commented Jan 10, 2025

See #4553 for my concrete proposal.

@tirix
Copy link
Contributor Author

tirix commented Jan 10, 2025

I think another thing we would need to formalize is the idea of "tower" of field extensions. Ideally, we should find a way that is generic enough so that it could also be used for towers of normal subgroups, as these are useful for the definition of solvable groups.

I believe the best is to see such towers as chains, i.e. see the subfield (resp. normal subgroup) relation as an order relation, so that the chains are subsets where the order is total.

We already have definitions for Posets and Tosets, so I think we could prove some generic facts in the settings of order theory, and use them for such chains (I don't think we need a lot of them)

@tirix
Copy link
Contributor Author

tirix commented Jan 10, 2025

I believe the best is to see such towers as chains, i.e. see the subfield (resp. normal subgroup) relation as an order relation, so that the chains are subsets where the order is total.

(That's actually what I had in mind when I defined the Field Extension as a relation)

@savask
Copy link
Contributor

savask commented Jan 10, 2025

I agree that we may need towers for general Galois theory, and it looks very logical to model them as chains in posets. Nevertheless, I suggest we choose the simplest path towards trisecting the angle and doubling the cube, so we don't spread ourselves thin :-) For these two theorems, one doesn't need to do a detour into group theory, and I think one doesn't need towers of fields/subgroups.

Since we are using field extensions now, we could give the following definition of "algebraic" constructible numbers:

df-constralg $a |- ConstrAlg = { x e. CC | E. k e. NN E. f e. Field ( ( CCfld /FldExt f /\ f /FldExt ( CCfld |`s QQ ) ) /\ ( f [:] ( CCfld |`s QQ ) ) = ( 2 ^ k ) /\ x e. ( Base ` f ) ) } $.

I.e. a number is constructible, if it lies in some finite field extension of QQ of degree a power of 2. Two big things to prove would be that Constr C_ ConstrAlg and -. ( 2 ^c ( 1 / 3 ) ) e. ConstrAlg (and similarly for trisecting the angle). It seems that to prove the first part we'll have to use induction over the definition of Constr, for example, if two nonparallel lines pass through points which lie in fields F1, F2, F3, F4, and |F_i : Q|is a power of 2, then the composite field K generated by F_i is also an extension of Q of degree a power of 2, and the intersection point lies in K (and thus in ConstrAlg). Here we will definitely use fldGen, and extdgmul. We'll also need a theorem saying that adjoining a square root gives you a field extension of degree dividing 2.

For the second part we'll again use extdgmul and the fact that adjoining a cubic root of 2 gives an extension of Q of degree 3 (and similarly for trisecting the angle). What do you think?

@savask
Copy link
Contributor

savask commented Jan 14, 2025

Since fldGen was merged into set.mm, maybe it's time to move on and see what we can do with it. Spoiler - it seems that we need field towers after all.

I'll use the definition of Constr from #4246 (comment) Now, let's use the following, shorter definition of ConstrAlg (later we'll see that it will have to be redefined, but bear with me for now):

df-constralg $a |- ConstrAlg = U. { f e. ~P CC | ( ( CCfld |`s f ) /FldExt ( CCfld |`s QQ ) /\ E. k e. NN ( ( CCfld |` f ) [:] ( CCfld |`s QQ ) ) = ( 2 ^ k ) ) } $.

Here f is a subset of CC which gives a subfield of C, and the degree |f : Q| is a power of two.

Let's use F for all such subsets of CC:

2flds $e |- F = { f e. ~P CC | ( ( CCfld |`s f ) /FldExt ( CCfld |`s QQ ) /\ E. k e. NN ( ( CCfld |`s f ) [:] ( CCfld |`s QQ ) ) = ( 2 ^ k ) ) } $.

Adding a square root to a field A from the class F will give us a (potentially larger) field from F:

2fldssqr.a $e |- ( ph -> A e. F ) $.
2fldssqr.x $e |- ( ph -> X e. CC ) $.
2fldssqr.s $e |- ( ph -> ( X x. X ) e. A ) $.
2fldssqr $p |- ( ph -> ( CCfld fldGen ( A u. { X } ) ) e. F ) $= ? $.

This should be relatively easy to prove from the definition of field extension degree. Now, the following statement seems to be more problematic: given two fields A and B from F, their composite field is again in that class.

2fldscomp.a $e |- ( ph -> A e. F ) $.
2fldscomp.b $e |- ( ph -> B e. F ) $.
2fldscomp $p |- ( ph -> ( CCfld fldGen ( A u. B ) ) e. F ) $= ? $.

I think to prove this we need to redefine ConstrAlg to use towers :-( The idea is that for B there should exist a tower of intermediate subfields Q = B1, B2, ... Bk = B, such that |B_{i+1} : B_i| = 2, or, equivalently, B_{i+1} is generated by B_i and some square root x_i of an element from B_i. Then one can consider fields A(x_1), A(x_1, x_2), ..., A(x_1, ..., x_k), and derive that the corresponding degrees are at most 2. Now, if AB is the composite field then it quickly follows that |AB : Q| is a power of 2 (and in fact there's also a tower of required type).

Now, the rest of what I'm going to say still holds. The intersection X of two lines defined over the "good" field A also lies in a good field:

constr2fldlem1.a $e |- ( ph -> A e. F ) $.
constr2fldlem1.x $e |- ( ph -> X e. CC ) $.
constr2fldlem1.s $e |- ( ph -> E. a e. A E. b e. A E. c e. A E. d e. A E. t e. RR E. r e. RR ( X = ( a + ( t x. ( b - a ) ) ) /\ X = ( c + ( r x. ( d - c ) ) ) /\ -. ( Im ` ( ( * ` ( b - a ) ) x. ( d - c ) ) ) = 0 ) ) $.
constr2fldlem1 $p |- ( ph -> E. s e. F X e. s ) $= ? $.

An intersection of a line and a circle:

constr2fldlem2.a $e |- ( ph -> A e. F ) $.
constr2fldlem2.x $e |- ( ph -> X e. CC ) $.
constr2fldlem2.s $e |- ( ph -> E. a e. A E. b e. A E. c e. A E. d e. A E. e e. A E. f e. A E. t e. RR ( X = ( a + ( t x. ( b - a ) ) ) /\ ( abs ` ( X - c ) ) = ( abs ` ( e - f ) ) ) ) $.
constr2fldlem2 $p |- ( ph -> E. s e. F X e. s ) $= ? $.

And an intersection of two circles:

constr2fldlem3.a $e |- ( ph -> A e. F ) $.
constr2fldlem3.x $e |- ( ph -> X e. CC ) $.
constr2fldlem3.s $e |- ( ph -> E. a e. A E. b e. A E. c e. A E. d e. A E. e e. A E. f e. A E. t e. RR ( -. a = d /\ ( abs ` ( X - a ) ) = ( abs ` ( b - c ) ) /\ ( abs ` ( X - d ) ) = ( abs ` ( e - f ) ) ) ) $.
constr2fldlem3 $p |- ( ph -> E. s e. F X e. s ) $= ? $.

The proofs of these will probably require only some arithmetic and 2fldssqr. As a result we should be able to prove

constr2fld $p |- Constr C_ ConstrAlg $= ? $.

Now, to prove the impossibility of doubling the cube, we will first show that adjoining a cubic root of 2 gives an extension of degree 3:

cr2fld.x $e |- ( ph -> X e. CC ) $.
cr2fld.c $e |- ( ph -> ( X ^c 3 ) = 2 ) $.
cr2fld $p |- ( ph -> ( ( CCfld |`s ( CCfld fldGen ( QQ u. { X } ) ) ) [:] ( CCfld |`s QQ ) ) = 3 ) $= ? $.

hence this root is not (algebraically) constructible (here we use multiplicativity of degrees):

cr2ncnstr $p |- ( ph -> -. X e. ConstrAlg ) $= ? $.

And finally the desired result:

cantdouble $p |- -. ( 2 ^c ( 1 / 3 ) ) e. Constr $= ? $.

All labels and statements are of course very preliminary.

So what do you think?

  1. How do we define towers? You mentioned chains in posets, but now I'm thinking that maybe a sequence of fields (i.e. a map from some interval) is more appropriate.
  2. Maybe you see a way to avoid towers?
  3. Should we create some file in the set.mm repository like constructible.mm or even problems.mm to collect unproved statements? Once something is formalized, it can be moved into a mathbox.

@tirix
Copy link
Contributor Author

tirix commented Jan 14, 2025

It's really nice to see this slowing taking form! (though the way is still very long!)

As for the method:

Should we create some file in the set.mm repository like constructible.mm or even problems.mm to collect unproved statements? Once something is formalized, it can be moved into a mathbox.

I don't know what's best, I don't think this was ever done before.

At some point in time, there was a suggestion to have a kind of purgatory (between main's heaven and mathboxes hell !) where common work which was not quite ready for main could live. The advantage of staying in one file is that we benefit from CI and all kind of label renamings (there seem to be a lot lately). The advantage of having our own file is that we are not bound to CI, collaboration directly from the GitHub interface is possible and easy (e.g. merges).

I think Lean is working with different GitHub repositories, but they have a different approach altogether.

In any case this could be discussed more largely.

@tirix
Copy link
Contributor Author

tirix commented Jan 15, 2025

How do we define towers? You mentioned chains in posets, but now I'm thinking that maybe a sequence of fields (i.e. a map from some interval) is more appropriate.

I believe towers could be defined from Word; words are sequences of anything and we can add a condition. Interestingly, UpWord is very recent and resembles the kind of ordering condition we need (but is limited to RR)

@tirix
Copy link
Contributor Author

tirix commented Jan 15, 2025

Since fldGen was merged into set.mm, maybe it's time to move on and see what we can do with it.

I've added a few theorems in #4574 , including that QQ is generated by 1. This could be part of the initial step of the induction, since we start with { 0 , 1 } only, the first field will be QQ.

@savask
Copy link
Contributor

savask commented Jan 15, 2025

I don't know what's best, I don't think this was ever done before. ...

I think we can't do it in the main set.mm file, since CI won't accept unproved statements (as far as I understand that's the reason why many people keep "future theorems" commented out). I suggest we create a file "constructible.mm" starting with $[ set.mm $] and containing all the unproved statements. Once some chunk is proved, it can be moved to a mathbox. It shouldn't create problems for other contributors, but if others refuse, we could keep the file elsewhere.

As for my own question

Maybe you see a way to avoid towers?

The answer is apparently no, there are elements x in C such that |Q(x) : Q| = 2^k, yet x isn't constructible, so we really need towers after all.

I believe towers could be defined from Word; words are sequences of anything and we can add a condition. Interestingly, UpWord is very recent and resembles the kind of ordering condition we need (but is limited to RR)

Okay, what about the following definition of algebraically constructible numbers:

df-constralg $a |- ConstrAlg = U. { f e. ~P CC | ( ( CCfld |`s f ) /FldExt ( CCfld |`s QQ ) /\ E. w e. Word ( SubDRing ` CCfld ) ( ( w ` 0 ) = QQ /\ ( w ` ( # ` w ) ) = f /\ A. k e. ( 0 ..^ ( ( # ` w ) - 1 ) ) ( ( CCfld |`s ( w ` ( k + 1 ) ) ) [:] ( CCfld |`s ( w ` k ) ) ) = 2 ) ) } $.

We take all subsets f of CC, such that it gives a field extension of QQ, and there exists a word w consisting of subfields of CCfld, where the first "letter" is QQ and the last is f itself, while any two adjacent letters give a field extension of degree 2. Does it look correct? If yes, then we more or less have an initial proof plan.

I've added a few theorems in #4574 , including that QQ is generated by 1. This could be part of the initial step of the induction, since we start with { 0 , 1 } only, the first field will be QQ.

Nice! Maybe we'll be able to avoid using this fact, since we should be fine with Constr C_ ConstrAlg, in particular, the initial step of the induction simply becomes { 0 , 1 } C_ QQ.

@tirix
Copy link
Contributor Author

tirix commented Jan 15, 2025

I suggest we create a file "constructible.mm"

Ok, let's try it and see how the community reacts! Feel free to launch a PR if you wish, otherwise I might do it in my next PR.

Does it look correct? If yes, then we more or less have an initial proof plan.

Yes it looks correct, with the only difference that it should be ( w ` ( ( # ` w ) - 1 ) ) = f (since we start at index 0, the last index is the length minus one). @avekens has made a nice shortcut for that, lastS, which we could use here.

Globally this seems like a very long definition though, I think I (still!) would like to abstract out the "tower" part, since it should be reusable. I might try to make a proposal.

@savask
Copy link
Contributor

savask commented Jan 15, 2025

... otherwise I might do it in my next PR.

Sure, I'll be happy if you do it, especially since you propose to abstract out the tower part. Feel free to change the labels etc as you wish.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: No status
Development

No branches or pull requests

3 participants