Cameron,

Your suggestion makes me shudder!
Removing all earlier lines of code is all but guaranteed to generate errors, since variables you are using are no longer declared or initialized, modules are not imported, and so on. Removing just the line or three where the previous error happened would also have a good chance of invalidating something.

Someone who really wants to isolate large parts of their code, so that an error in one part does not compromise lots of remaining code, might build their code in small units on the level of a single function per file and do lots of imports. They can then ask for all the files to be pseudo-compiled to byte-code, and that might provide lots of errors to look at in one pass. But asking for a one-file version to find errors, somehow go past them and look for more is more daunting. It can of course be done, but with partial accuracy and usefulness at best.

As an analogy, if tolerated, think of a spell-checker on a document that can find oodles of words spelled wrong. Unfortunately, a spell corrector can drive us nuts if it knows little about context. If it sees a word like "reid", should it just change it to "read" or "red" or perhaps "reed", or look to see if the real problem is that it is supposed to be unified (no space) with a word before or after? Will it know if the word appears in a context where a language like Latin or French or German or Hungarian is being quoted and perhaps it is spelled right or, if wrong, has other more likely corrections? Now if you add a grammar detector, and it knows you are looking for an adjective or a verb or a noun, it may do better.

I use Google translate quite a bit as a tool, as I often have to type in various languages, and it provides a handy keyboard or lets me check if I used the right grammar, especially in languages with silly ideas that objects can have two or even three genders. So putting in phrases like "this xyz" can result in language-specific text that tells me if it is masculine or feminine or perhaps neuter.
But the reason I mention it is how often it is WRONG. Many languages have multiple words that are spelled the same but used and pronounced differently in various contexts. The English word "read" can sound like "reed" or like "red", so the past tense sounds different, as in "I read that book last week" versus "please read it to me now".

But some languages, such as Hebrew, which generally may not show the vowels, can get totally confused in this program, as even humans often need lots of context to figure out whether the current short word means "you" (feminine and singular) and is pronounced "aht", or is a way of showing that what follows is a direct object, loosely means "the" in a redundant way, and is pronounced "eht". Quite a few words have three or more possible ways to pronounce the same letters and, without vowel guides, need context and sometimes some spreadsheet-like ingenuity, as multiple other words are also in limbo and, once resolved, can impact what other words may now mean. Obviously adding back the vowels makes things clear, so people who are used to seeing books written that old way can get hopelessly lost reading a modern newspaper.

End of digression. Just assume I could have gone on for many pages describing my annoyances at what Google translate does to many other languages, which show the imperfections in what is really a great and powerful tool.

Well, parsing a program in most languages can be equally complex and require lots of context. For example, you can often use the same identifier as the name of a regular variable or the name of a function, and sometimes other things such as the name of a module. They can often be disambiguated in context. Perhaps a name followed by parentheses should be a function call, while a name followed by :: or ::: might in that language be required to be the name of a module/package. If followed by [ it might need to be something indexable such as an array or list, and so on.
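To make that concrete for Python, here is a minimal sketch using the standard `ast` module. The helper `describe_uses` and the `SAMPLE` text are made up for illustration. It shows the parser deciding what role a name plays purely from what follows it, without knowing or caring whether that role makes sense for the object the name is bound to.

```python
import ast

# Hypothetical sample: the same name used five different ways.
SAMPLE = """\
alpha = [1, 2, 3]
alpha[0]
alpha()
alpha.sort
print(alpha)
"""

def describe_uses(source, name):
    """Return sorted (line, role) pairs saying how `name` is used in `source`."""
    tree = ast.parse(source)
    # ast nodes do not record their parent, so build a child -> parent map.
    parent = {}
    for node in ast.walk(tree):
        for child in ast.iter_child_nodes(node):
            parent[child] = node

    found = []
    for node in ast.walk(tree):
        if not (isinstance(node, ast.Name) and node.id == name):
            continue
        up = parent.get(node)
        if isinstance(up, ast.Call) and up.func is node:
            role = "called like a function"
        elif isinstance(up, ast.Subscript) and up.value is node:
            role = "indexed like a list"
        elif isinstance(up, ast.Attribute) and up.value is node:
            role = "used like a module or object"
        elif isinstance(node.ctx, ast.Store):
            role = "assigned to"
        else:
            role = "used as a plain value"
        found.append((node.lineno, role))
    return sorted(found)

if __name__ == "__main__":
    for lineno, role in describe_uses(SAMPLE, "alpha"):
        print(lineno, role)
```

Note that all five lines parse without complaint, even though calling a list would fail at run time. The syntax alone cannot say which of the five uses is the mistake.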
So say there is an error in a variable name. Can the interpreter or linter figure out what the error is and almost repair it? Can it see a variable name like "alpXha" and note there is no such identifier in the current namespace, but there is one called "alpha" that might be the one without the X? But what if what is missing is an open paren, or maybe the matching close paren? Does it know if the problem is a bad variable name or a bad function invocation or one of many other possible problems?

Code with a random blemish is often not easily figured out. If I type the name of a function without parentheses, it could be an attempt to call the function with no arguments (an error, though, in many languages), or it could be that I want to pass the function itself as an argument, as in functional programming. But if I have another variable of type array, might it not be parentheses that are missing but square brackets?

The compiler or interpreter often cannot fix it, so it often tries to skip forward until it finds something unambiguous that marks the beginning of a new section. That might be something like an unquoted semicolon at the end of a line or a matching close bracket. Depending on such choices, again, varying amounts of the program may be ignored in evaluating what follows.

But this is not the same as a human speedreading or daydreaming who misses a bit here and there and just hopes it was not crucial and that what follows probably remains worthy and valid. I have sometimes missed something like a name, then seen pages of pronouns like "she", and eventually given up as no more hints arrived, and had to go back or ask someone lest a big bunch of the text make no sense to me.

Someone is wanting to treat code from a spelling-checker perspective and wants all possible mistakes thrown at them at once.
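The "alpXha" guess can be sketched with the standard library's `difflib.get_close_matches`. This is only an illustration of the idea, not how any particular interpreter or linter does it. The helper name `suggest_names` and the list of known names are invented for the example.

```python
import difflib

def suggest_names(unknown, known_names, cutoff=0.6):
    """Return up to three known names that look similar to `unknown`, best first.

    `cutoff` is difflib's similarity threshold between 0 and 1; a lower
    value yields more candidates, and so more wrong guesses.
    """
    return difflib.get_close_matches(unknown, known_names, n=3, cutoff=cutoff)

if __name__ == "__main__":
    # Hypothetical namespace for the example.
    namespace = ["alpha", "alphabet", "beta", "delta"]
    print(suggest_names("alpXha", namespace))
    print(suggest_names("alpXha", namespace, cutoff=0.8))
```

With the looser cutoff, more than one candidate can come back, which is exactly the problem: the tool can rank guesses by spelling, but it has no idea which one I meant, or whether the real mistake was a missing paren somewhere else.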
As I pointed out, in real life many kinds of context can matter, and a really good checker might even consult a personal list of words it has learned you want ignored, like people's names or some abbreviations like LOL. It may even read marked-up text in, say, HTML or XML or similar formats, where each region is marked with the language it supposedly contains, and call up a spell-checker appropriate for each region.

But if they want a really intelligent program that recovers enough from errors to reliably continue, that may not be easy. They have explained and amended that they understand some of these issues and are willing to get lots of false positives or red herrings, and that their real goal is to have a chance to detect and maybe fix a few things per round rather than just one.

Not a bad wish. Just not a trivial wish to grant and satisfy.

-----Original Message-----
From: Python-list <python-list-bounces+avi.e.gross=gmail....@python.org> On Behalf Of Cameron Simpson
Sent: Sunday, October 9, 2022 6:45 PM
To: python-list@python.org
Subject: Re: What to use for finding as many syntax errors as possible.

On 09Oct2022 21:46, Antoon Pardon <antoon.par...@vub.be> wrote:
>>Is it that onerous to fix one thing and run it again? It was once when
>>you handed in punch cards and waited a day or on very busy machines.
>
>Yes I find it onerous, especially since I have a pipeline with unit
>tests and other tools that all have to redo their work each time a bug
>is corrected.

It is easy to get the syntax right before submitting to such a pipeline. I usually run a linter on my code for serious commits, and I've got a `lint1` alias which basically runs the short fast flavour of that which does a syntax check and the very fast less thorough lint phase.

I say this just to ease your write/run-tests cycle.

Regarding your main request, had you considered writing your own wrapper tool?
Something which ran something like:

  python -We:invalid -m py_compile your_python_file.py

If there's an error, report it, then make a new file commencing with the next unindented line after the error, with all preceding lines commented out (to keep the line numbers the same). Then run the check again. Repeat until the file's empty or there are no errors.

This doesn't sound very complex.

Cheers,
Cameron Simpson <c...@cskk.id.au>
--
https://mail.python.org/mailman/listinfo/python-list
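For reference, the wrapper described in the quoted message might be sketched roughly as follows. This is an assumption-laden sketch, not Cameron's tool. It calls the built-in `compile()` in-process instead of shelling out to `py_compile`, it treats any non-blank, non-comment line starting in column one as "the next unindented line", and the names `collect_syntax_errors` and `SAMPLE` are invented for the example.

```python
# Hypothetical input with two independent syntax errors (lines 2 and 5).
SAMPLE = """\
a = 1
b = = 2
c = 3
def f(x):
    return x +
d = 4
"""

def collect_syntax_errors(source, filename="<sample>", max_rounds=100):
    """Return a list of (line number, message) for each syntax error found.

    After each error, every line before the next unindented line is
    commented out, so later errors keep their original line numbers.
    """
    lines = source.splitlines()
    errors = []
    for _ in range(max_rounds):  # guard against any unexpected looping
        try:
            compile("\n".join(lines) + "\n", filename, "exec")
        except SyntaxError as exc:
            lineno = exc.lineno or 1  # lineno can be None for some errors
            errors.append((lineno, exc.msg))
            # Find the next unindented line after the one that failed.
            resume = len(lines)
            for i in range(lineno, len(lines)):
                text = lines[i]
                if text.strip() and not text[0].isspace() and not text.startswith("#"):
                    resume = i
                    break
            # Comment out everything before it, including the bad line.
            for i in range(resume):
                lines[i] = "#" + lines[i]
        else:
            break  # the remainder compiles cleanly
    return errors

if __name__ == "__main__":
    for lineno, msg in collect_syntax_errors(SAMPLE):
        print(f"line {lineno}: {msg}")
```

This only catches what the compiler calls a syntax error, and the restart point is a guess. If the next unindented line happens to be an `else:`, an `except:` or a stray closing bracket, the next round reports a spurious error there, which is the kind of red herring discussed above.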