Re: [Metamath] Codex wrote an MM verifier in Go pretty easily [AI Progress Report]

Jim Kingdon Wed, 17 Sep 2025 14:48:10 -0700

It isn't either/or. Testsuites specific to one tool can have more assertions on 
error messages and can also test features (e.g. the proof minimizer or web page 
generation) other than proof verification.


I do appreciate David's encouragement of contributions to metamath-test though, 
and am thinking of taking a closer look when I get some time.

On September 17, 2025 2:26:54 PM PDT, Matthew House <[email protected]> 
wrote:
>Coincidentally, by running it on one of my test files, I just noticed that
>mmverify.py suffers from the exact same issue
><https://github.com/david-a-wheeler/mmverify.py/issues/30> of allowing
>proofs to reference inactive hypotheses belonging to other theorems.
>Centralizing all the test suites probably would be a good idea.
>
>On Wed, Sep 17, 2025 at 10:32 AM David A. Wheeler <[email protected]>
>wrote:
>
>>
>>
>> > On Sep 16, 2025, at 6:42 PM, Jim Kingdon <[email protected]> wrote:
>> >
>> > Those all sound like good things to add to the testsuite.
>>
>> Patches welcome :-) to:
>> https://github.com/david-a-wheeler/metamath-test
>>
>>
>> > (The tests in the metamath-exe and metamath-knife repos are I think
>> better developed, although in principle some of their tests could be fed
>> back to a testsuite which could be run on any verifier).
>>
>> That was my goal for metamath-test. I also think it's useful to have a
>> "general" test suite.
>>
>> I'd be happy to contribute that repo to the  https://github.com/metamath
>> organization. They're all MIT licensed, allowing anyone do practically
>> anything except sue the authors. Anyone want me to do that?
>>
>> > On September 16, 2025 12:44:47 PM PDT, Matthew House <
>> [email protected]> wrote:
>> > A few issues I spotted, many of which are nitpicks and some of which are
>> more serious:...
>> ...
>> >  Writing a correct Metamath verifier requires very careful attention to
>> detail, which it seems that LLMs still struggle with, short of directed
>> hand-holding or extraordinarily thorough test suites. The original
>> mmverify.py code is quite compact for how many details it captures.
>>
>>
>> LLMs can be a *great* productivity aid, but only an *aid*. The emerging
>> term is "AI Code Assistant" which I think gives the right flavor. They are
>> helpful *assistants* to humans, but at least so far, they are not
>> *replacements* for humans.
>>
>> --- David A. Wheeler
>>
>>
>>
>
>-- 
>You received this message because you are subscribed to the Google Groups 
>"Metamath" group.
>To unsubscribe from this group and stop receiving emails from it, send an 
>email to [email protected].
>To view this discussion visit 
>https://groups.google.com/d/msgid/metamath/CADBQv9bSGMebPM3QtCDFqfwfobCXj63c%3DyijPc65kXLq9g0vcQ%40mail.gmail.com.

-- 
You received this message because you are subscribed to the Google Groups 
"Metamath" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion visit 
https://groups.google.com/d/msgid/metamath/4DA14186-A18F-4931-8C53-CB8298568917%40panix.com.

Re: [Metamath] Codex wrote an MM verifier in Go pretty easily [AI Progress Report]

Reply via email to