On Sat, Apr 24, 2010 at 8:07 PM, Bruce Momjian <br...@momjian.us> wrote:
> Jehan-Guillaume (ioguix) de Rorthais wrote:
>> A simple example of a tokenizer is the php one:
>> http://fr.php.net/token_get_all
>>
>> And here is a basic example which return pseudo rows here :
>>
>> => TOKENIZE $script$
>>     SELECT 1;
>>     UPDATE test SET "a"=2;
>>   $script$;
>>
>>     type     | pos |  value   | line
>> -------------+-----+----------+------
>>  SQL_COMMAND | 1   | 'SELECT' | 1
>>  CONSTANT    | 8   | '1'      | 1
>>  DELIMITER   | 9   | ';'      | 1
>>  SQL_COMMAND | 11  | 'UPDATE' | 2
>>  IDENTIFIER  | 18  | 'test'   | 2
>>  SQL_KEYWORD | 23  | 'SET'    | 2
>>  IDENTIFIER  | 27  | '"a"'    | 2
>>  OPERATOR    | 30  | '='      | 2
>>  CONSTANT    | 31  | '1'      | 2
>
> Sounds useful to me, though as a function like suggested in a later
> email.

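To make the proposed row shape concrete for tool builders, here is a minimal Python sketch of a tokenizer that emits (type, pos, value, line) rows like the ones quoted above. It is not PostgreSQL's lexer and does not reflect any existing server function: the names `Token`, `tokenize`, `COMMANDS` and `KEYWORDS`, the tiny word lists, and the regular expression are all assumptions made up for illustration, and it handles only a small subset of SQL.

```python
import re
from typing import Iterator, NamedTuple


class Token(NamedTuple):
    type: str
    pos: int    # 1-based character offset into the whole script
    value: str
    line: int   # 1-based line number where the token starts


# Hypothetical, deliberately tiny word lists; a real lexer knows far more.
COMMANDS = {"SELECT", "UPDATE", "INSERT", "DELETE"}
KEYWORDS = {"SET", "FROM", "WHERE", "INTO", "VALUES"}

# One named group per lexical class; the first alternative that matches wins.
TOKEN_RE = re.compile(r"""
    (?P<WS>\s+)
  | (?P<CONSTANT>\d+|'(?:[^']|'')*')
  | (?P<QUOTED>"(?:[^"]|"")*")
  | (?P<WORD>[A-Za-z_][A-Za-z_0-9]*)
  | (?P<DELIMITER>;)
  | (?P<OPERATOR>[-+*/<>=!]+)
""", re.VERBOSE)


def tokenize(script: str) -> Iterator[Token]:
    """Yield one Token per lexical unit, skipping whitespace."""
    pos = 0
    line = 1
    while pos < len(script):
        m = TOKEN_RE.match(script, pos)
        if m is None:
            raise ValueError(
                f"unexpected character {script[pos]!r} at offset {pos + 1}"
            )
        kind, text = m.lastgroup, m.group()
        if kind != "WS":
            if kind == "WORD":
                # Classify bare words by looking them up case-insensitively.
                upper = text.upper()
                if upper in COMMANDS:
                    kind = "SQL_COMMAND"
                elif upper in KEYWORDS:
                    kind = "SQL_KEYWORD"
                else:
                    kind = "IDENTIFIER"
            elif kind == "QUOTED":
                # Double-quoted names are identifiers, quotes kept in the value.
                kind = "IDENTIFIER"
            yield Token(kind, pos + 1, text, line)
        # Advance the line counter past any newlines inside this match.
        line += text.count("\n")
        pos = m.end()


if __name__ == "__main__":
    for tok in tokenize('SELECT 1;\nUPDATE test SET "a"=2;'):
        print(tok)
```

Run on the two-statement script from the quoted example (without the leading newline), this sketch reports the same types and positions as the table, except that the last constant comes out as '2' at position 31 and is followed by a DELIMITER row for the final ';' at position 32.
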
If tool-builders think this is useful, I have no problem with making
it available.  It should be suitably disclaimed: "We reserve the right
to rip out the entire flex/yacc-based lexer and parser at any time and
replace them with a hand-coded system written in Prolog that emits
tokenization information only in ASN.1-encoded pig latin.  If massive
changes in the way this function works - or its complete disappearance
- are going to make you grumpy, don't call it."

But having said that, assuming there is a real use case for this, I
think it's better to let people get at it rather than forcing them to
roll their own.  Because frankly, if we do rip out the whole thing,
then people are going to have to adjust their stuff anyway, regardless
of whether they're using some API we provide or something they've
cooked up from scratch.  And in practice, most changes on our end are
likely to be incremental, though, again, we're not guaranteeing that
in any way.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers