Tierry, Alexander, and Benoit have asked for clarification of the goals of 
set.mm.  Here are some of my opinions.  I am moving the discussion in 
https://github.com/metamath/set.mm/issues/1431 to here for wider readership.

*1.* There is no "goal" per se for set.mm.  People work on what they are 
interested in.  Typically work starts off in mathboxes, where people pretty 
much have freedom to do whatever they want.  Two situations that typically 
will result in the work being moved to the main part of set.mm are (1) more 
than one person wants to use or develop the work and (2) the work is an 
mm100 theorem.  There are other factors, though, that I'll discuss.  And in 
any case it is a judgment call by me and others that may not be perfect.  
Just because we choose not to import from a mathbox right away doesn't mean 
the value of the work is less, it just means that in our possibly flawed 
judgment it either doesn't quite fit in with the set.mm philosophy or it 
hasn't found its place there yet.

So what I will primarily discuss is what I see as the "philosophy" that has 
been generally used so far in set.mm.  Hopefully that can guide the goals 
that people set for themselves as they contribute to set.mm.


*2.* A very important thing to me is the philosophy of striving for 
simplicity wherever possible.  This particularly impacts what definitions 
we choose.

*"Rules" for introducing new definitions*

Definitions are at the heart of achieving simplicity.  While developing 
set.mm, two rules of thumb that evolved in my own work were (1) there 
shouldn't be two definitions that mean almost the same thing and (2) 
definitions shouldn't be introduced unless there is an actual need in the 
work at hand.  Maybe I should add (3), don't formally define something that 
doesn't help proofs and can be stated simply in English in a theorem's 
comment.

Concerning the 1st rule, very early on, when I was starting to work with 
ordinals, I noticed that many books would redefine "e."  (epsilon) as "<" 
and "C_" (subset) as "<_" (less than or equal) since those were the 
ordering roles they played in von Neumann ordinal theory.  I figured since 
books did it, it must be important, so I also did that in the early 
set.mm.  (I may have even used those symbols because reals were a distant 
vision at that point.)  Books casually interchanged them in theorems and 
proofs, but formalization meant I might have 2 or more versions of many 
theorems.  It was a constant nuisance to have to convert back and forth 
inside of proofs, making proofs longer.  I eventually decided to use just 
"e." and "C_", and it made formalizations much more pleasant with fewer 
theorems and shorter proofs, sometimes providing more general theorems that 
could be used outside of ordinal theory.  I never missed not having the 
ordering symbols to remind me of their roles.  I liked the feeling of being 
closer to the underlying set theory rather than being an abstraction layer 
away from it.  If the "less than" role of "e." was important for human 
understanding in a particular case, I could simply use the English language 
"less than" the comment (3rd rule) while continuing to use "e." in the $p 
statement.

As for the 2nd rule, there are often different choices that can be made in 
the details of how something is defined.  It is sometimes hard or 
impossible to anticipate what choice is optimal until we actually start 
using the definition for serious work.  There is also the question of 
whether the definition is even necessary.  More than once I've found that a 
definition I thought might be needed (for example was used in the 
literature proof) actually wasn't.  Sometimes, like the "e." vs.  "<" case 
above, the definition was even a hindrance and made proofs longer.  Other 
times an intermediate definition of my own invention that isn't in the 
literature turned out to be advantageous for shorter proofs.  It can be 
best to let the work drive the need for the definition and its precise 
details.

Another example of a literature definition we purposely don't use and 
illustrates the 3rd rule is Takeuti/Zaring's Russell class with symbol 
"Ru", which they formally define as "{ x | x e/ x }".  The only theorem 
that uses it (in set.mm and in T/K) is ~ ru.  For the purposes of set.mm, 
it is wasteful and pointless since we can just define the Russell class 
verbally in the comment of ~ ru.

A problem that has arisen in the past is where a person has added a set of 
definitions from the literature, proved some simple property theorems, then 
is disappointed that the work isn't imported into the main set.mm.  
Sometimes we do and sometimes we don't, but the main issue is whether the 
2nd rule above is being followed.  Without knowing exactly how the 
definition is going to be applied, very often it won't be optimal and will 
have to be adjusted, and sometimes it is more efficient just to start from 
scratch with a definition in the form that is needed.  I prefer to see some 
"serious" theorems proved from a definition before importing it.

As a somewhat crude example of why a definition in isolation may not be 
optimal, suppose someone defined sine as "$a |- sin A = ..."  (not a straw 
man; this kind of thing has been attempted in the past) i.e. a 1-place 
function-like "sin A" rather than a stand-alone object "sin".  While this 
can be used to prove a few things, it is very limiting since the symbol 
"sin" in isolation has no meaning, and we can't prove e.g.  "sin :  CC --> 
CC".  It is an easy mistake to make if you don't understand set theoretical 
picture but are blindly copying a textbook definition.  Here it is pretty 
obvious, but in general such issues can be subtle and may depend on the 
application.  Also, there are often multiple ways to define something, some 
of which are easy to start from and others which would require a lot of 
background development.  The best choice can't always be anticipated 
without knowing what background will be available.


*3.* Let me comment on the recent proposal to import df-mgm, df-asslaw, 
df-sgrp.  These are my opinions, and correct me if I am wrong.

df-mgm:  A magma is nothing but an empty shell with a binary operation.  
There are no significant theorems that follow from it.  All the properties 
of binary operations that we have needed have already been proven.  If 
additional properties are needed in the future it seems it would be better 
to state them directly because they will be more useful generally.  Just 
because it appears in Bourbaki or other texts is not to me a 
justification:  it also needs to be useful and be driven by an actual need 
for it.  There are many definitions in textbooks that have turned out to be 
unnecessary.

df-asslaw:  All this is doing is encapsulating the class of associative 
operations.  The only thing that can be derived from it AFAIK is the 
associative law.  I don't see how it would benefit proofs since additional 
work is needed to convert to and from it.  I don't see how it helps human 
understanding because it is easier for a human to see "show the associative 
law" in the description of a theorem than be asked to learn yet another 
definition (an example of the 3rd rule).

df-sgrp:  This is basically df-asslaw converted to an extensible 
structure.  So in essence it duplicates df-asslaw (or vice versa depending 
on your point of view), violating the first rule (no redundancy).  More 
importantly, AFAIK there is nothing that can be derived from it except the 
associative law, unlike say df-grp where deep and non-obvious theorems are 
possible.  Yes, it can be used as a "layer" on top of which we can slightly 
simplify df-mnd and df-rng, but it is really worth having an entire, 
otherwise useless definitional layer just to avoid having to state 
"(x+y)+z=x+(y+z)"?  The fact that it simplifies two definitions is a slight 
positive, but I'm still torn about it.  It doesn't quite pay for itself in 
terms of reducing set.mm size, and it increases the burden on the reader 
who has to learn another definition and drill down another layer to 
understand df-mnd and df-rng.  Personally I'd prefer to see 
"(x+y)+z=x+(y+z)" directly because it is immediately obvious without having 
to look at a deeper layer.  In the case of the description in df-rng0, it 
would be more direct and less obscure to state "associative operation" 
instead of "semigroup operation", since most people (at the level of 
studying that definition) would know what "associative" means, but fewer 
would would know what "semigroup" means.

I can't say that these will never be imported.  Possibly something will 
drive a need for them, maybe something related to category theory where 
even a magma might be a useful object.  But I would want to see the 
definitions be driven by that need when it arises; we don't even know yet 
whether the present extensible structure form can be adapted for that 
purpose.  It is fine to store them in a mathbox indefinitely, but I'd like 
to see a true need driving their use before importing.


*4.* As for adding an exhaustive list of all possible definitions one can 
find in Bourbaki or whatever, as someone suggested, I don't think something 
like that belongs in set.mm, for all the reasons above.  There are other 
resources that already list these (Wikipedia, Planetmath, nLab).  Their 
precise formalization will depend on the context in which they are needed 
in set.mm, which is unknown until they are used.  In isolation (perhaps 
with a couple of property theorems that basically check syntax) there is no 
guarantee that they are correct, even in set.mm.  To restate, I think the 
philosophy should be that definitions should be added by need, not as 
busywork.  Adding a definition when needed is one of the smallest parts of 
building a series of significant proofs.  It doesn't have to be done in 
advance, and I don't think it is generally productive or useful to do so.


*5.* Finally, let me touch on the issue of > (greater than).

There are many non-symmetrical relations that use reversed symbols to swap 
arguments: "B contains A" with reversed epsilon, "B includes A" with 
reversed subset symbol, "Q if P" with reversed arrow, etc.  I've seen all 
of these in the literature.  If we really feel the reader will encounter 
them and expect set.mm to explain their meaning (which is likely explained 
in their textbook anyway), we could mention informally the reversed symbol 
usage when we introduce the forward symbol.  But we don't add $a's for them 
because we will never use them.  That is because either we would have to 
add a huge number of theorems containing all possible forward and reversed 
combinations of the symbols, or we would have to constantly convert between 
them inside of proofs.  Both of those are contrary to a philosophy of 
simplicity.

IMO the same should be done with >, mentioning what it means in the 
description for <.  Introducing a formal $a statement for > that will never 
be used is unnecessary and wasteful of resources.  If we want to be 
excessively pedantic, we could mention in the description for < that the 
the formal definition would be "|- > = `' >" , although that seems less 
intuitive than simply saying that "in theorem descriptions, we occasionally 
use the terminology 'A is greater than B' to mean 'B is less than A'."  A 
grade school student can (and does) easily understand that.

Basically, ">" violates all 3 "rules" for new definitions I proposed above.

Norm

-- 
You received this message because you are subscribed to the Google Groups 
"Metamath" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion on the web visit 
https://groups.google.com/d/msgid/metamath/6e9144b6-5b8f-488d-bf19-83e5a2a250e0%40googlegroups.com.

Reply via email to