On 18/06/2018 11:45, Chris Angelico wrote:
> On Mon, Jun 18, 2018 at 8:33 PM, Bart <b...@freeuk.com> wrote:
>> You're right in that neither task is that trivial.
>> I can remove comments by writing a tokeniser which scans Python source and
>> re-outputs tokens one at a time. Such a tokeniser normally ignores comments.
>> But to remove type hints, a deeper understanding of the input is needed. I
>> would need a parser rather than a tokeniser. So it is harder.
> They would actually both end up the same. To properly recognize
> comments, you need to understand enough syntax to recognize them. To
> properly recognize type hints, you need to understand enough syntax to
> recognize them. And in both cases, you need to NOT discard important
> information like consecutive whitespace.
No. If syntax is defined on top of tokens, then at the token level you
don't need to know any syntax. The process that scans characters looking
for the next token will usually discard comments. Job done.
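As a rough sketch of what I mean, assuming Python's standard tokenize module stands in for the tokeniser (the name strip_comments is made up for illustration, and the output spacing is not guaranteed to match the input exactly):

```python
import io
import tokenize


def strip_comments(source: str) -> str:
    """Return `source` with every COMMENT token left out."""
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    kept = [tok for tok in tokens if tok.type != tokenize.COMMENT]
    # untokenize() pads the gaps left by dropped tokens with spaces, so the
    # result may carry trailing whitespace where a comment used to be.
    return tokenize.untokenize(kept)


# The "#" inside the string literal is part of a STRING token and survives;
# only the trailing comment is dropped.
print(strip_comments('s = "# not a comment"  # a real comment\n'))
```

No grammar rules are consulted here; the only decision is "is this token a comment?".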
It is very different for type-hints as you will need to properly parse
the source code.
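For contrast, a sketch of the type-hint case, assuming the standard ast module and Python 3.9+ for ast.unparse. StripHints and strip_hints are hypothetical names, and this is only one possible approach, not a finished tool. Note that going through the AST throws away comments and the original layout as a side effect.

```python
import ast


class StripHints(ast.NodeTransformer):
    """Drop annotations from functions and annotated assignments."""

    def visit_FunctionDef(self, node):
        self.generic_visit(node)
        node.returns = None
        args = node.args
        for arg in args.posonlyargs + args.args + args.kwonlyargs:
            arg.annotation = None
        for arg in (args.vararg, args.kwarg):
            if arg is not None:
                arg.annotation = None
        return node

    visit_AsyncFunctionDef = visit_FunctionDef

    def visit_AnnAssign(self, node):
        self.generic_visit(node)
        if node.value is None:
            # A bare declaration such as "x: int" is removed outright; a real
            # tool would have to insert "pass" if this empties the block.
            return None
        plain = ast.Assign(targets=[node.target], value=node.value,
                           type_comment=None)
        return ast.copy_location(plain, node)


def strip_hints(source: str) -> str:
    tree = StripHints().visit(ast.parse(source))
    return ast.unparse(ast.fix_missing_locations(tree))


print(strip_hints("def f(a: int, b: str = 'x') -> bool:\n"
                  "    n: int = 1\n"
                  "    return a > n\n"))
```

Here the code has to know what a function definition and an annotated assignment are, which is exactly the knowledge a tokeniser does not have.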
As a simpler example, if the task was to eliminate the "+" symbol, that
would be one kind of token; it would just be skipped when encountered.
But if the requirement is to eliminate only unary "+" and leave binary
"+", then that can't be done at the tokeniser level; the tokeniser does
not know the context.
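To illustrate with Python's own tools (a small sketch, assuming the standard tokenize and ast modules; the variable names are arbitrary):

```python
import ast
import io
import tokenize

source = "y = +a + b\n"

# Token level: both "+" signs come out as the same kind of OP token.
plus_tokens = [
    tok
    for tok in tokenize.generate_tokens(io.StringIO(source).readline)
    if tok.type == tokenize.OP and tok.string == "+"
]
print([tokenize.tok_name[tok.exact_type] for tok in plus_tokens])

# Parser level: one "+" belongs to a UnaryOp node, the other to a BinOp node.
tree = ast.parse(source)
kinds = [
    type(node).__name__
    for node in ast.walk(tree)
    if isinstance(node, (ast.UnaryOp, ast.BinOp))
]
print(kinds)
```

The token list contains two indistinguishable PLUS entries; only the tree says which one is unary.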
(That leading white space is sometimes significant is a minor detail.
It just becomes a token of its own.)
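Python's own tokenize module works this way, for instance. A short sketch that prints the token names for a two-line snippet:

```python
import io
import tokenize

source = "if x:\n    y = 1\n"

# Collect the name of each token type in the order the tokeniser emits them.
names = [
    tokenize.tok_name[tok.type]
    for tok in tokenize.generate_tokens(io.StringIO(source).readline)
]
print(names)
```

The change in leading white space shows up as INDENT and DEDENT entries in the list, alongside NAME, OP and the rest.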
> So in both cases, you would probably end up with something like 2to3.
> The effective work is going to be virtually identical. And.... there's
> another complication, if you want any form of generic tool. You have
> to ONLY remove certain comments, not others. For instance, you
> probably should NOT remove copyright/license comments.
What will those look like? If copyright/licence comments have their own
specific syntax, then they just become another token which has to be
recognised.
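If they can be recognised by their text, that is still a token-level test. A sketch, assuming the standard tokenize module; the KEEP pattern and the function name are invented for illustration, since nobody has said what such comments actually look like:

```python
import io
import re
import tokenize

# Assumed pattern for comments that must survive; adjust to the real convention.
KEEP = re.compile(r"copyright|licen[cs]e", re.IGNORECASE)


def strip_comments_except(source: str, keep: "re.Pattern[str]" = KEEP) -> str:
    """Drop COMMENT tokens unless their text matches `keep`."""
    kept = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.COMMENT and not keep.search(tok.string):
            continue  # an ordinary comment: skip it
        kept.append(tok)
    return tokenize.untokenize(kept)


print(strip_comments_except("# Copyright notice goes here\nx = 1  # scratch\n"))
```

The first comment is kept and the second is dropped, again without consulting any grammar.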
The main complication I can see is that, if this is really a one-time
source-to-source translator so that you will be working with the result,
then usually you will want to keep the comments.
Then it is a question of more precisely defining the task that such a
translator is to perform.
--
bart