> It's not easy to explain properly why I need the tokens; the general reason
> is that the preexisting application, written long ago by several other
> persons, is designed to use them, and changing its design would be too big an
> undertaking.

Yeah, I still don't understand why the code would care to poke inside the parser and deal directly with tokens.

> I will see if I can use Andrus' pointers to extract the tokens from the
> Expression instance.

I am afraid you won't find any *tokens* in an Expression instance. Expression is just a tree of objects that can be used to evaluate stuff. If you need it to match something, you can. But a parsed expression is devoid of any links to the original lexical structure.

Andrus

> On Nov 17, 2014, at 11:46 AM, Davide Vecchi <d...@amc.dk> wrote:
>
> Thanks for your input.
>
> I'm probably showing my technological age here, but I certainly admit that I
> have this tendency to avoid repeating complex operations as a matter of
> principle when it's known in advance that the second process will produce
> exactly the same result as the first one. When I catch myself doing that I
> always feel that my design is not OK.
>
> However in this case I am quite sure I need to get rid of the double parsing,
> although I did not demonstrate in a particularly strict way that that's the
> cause of the slowdown. It's more like a qualified (in my opinion) guess,
> reinforced by the fact that method Expression.fromString(String) has a TODO
> saying "TODO: cache expression strings, since this operation is pretty slow"
> (I'm using version 3.0.2). So it looks like the Cayenne coders too had
> reasons to worry to some extent about optimization in this area.
>
> I just used JVisualVM to profile the execution and two of the methods where
> by far most of the time is spent are Expression.fromString(String) and
> ExpressionParser.getNextToken(). Since I have to cut down the processing
> time I do have to focus on them first.
>
> The situation here is that I modified a preexisting application which was
> doing some basic parsing, and after creating the tokens from the parsing it
> was using them to match the expression against objects.
> That parsing is basic in that it can only parse simple expressions, e.g. it
> doesn't support parentheses grouping.
>
> My changes consisted of removing that parsing code from the application and
> replacing it with calls to Cayenne, because we need real parsing. Of course
> the parsing done by Cayenne is way more powerful and that might be the real
> and fair reason why it takes longer, but even if this is the case it's
> important for me not to do that parsing twice.
>
> It's not easy to explain properly why I need the tokens; the general reason
> is that the preexisting application, written long ago by several other
> persons, is designed to use them, and changing its design would be too big an
> undertaking. Since all that needs to be improved is the parsing and matching
> I thought I'd just use a powerful tool to replace only those parts.
>
> I will see if I can use Andrus' pointers to extract the tokens from the
> Expression instance.
>
> -----Original Message-----
> From: Andrus Adamchik [mailto:and...@objectstyle.org]
> Sent: Sunday, November 16, 2014 14:57
> To: user@cayenne.apache.org
> Subject: Re: Extracting tokens from an expression and matching an object
> against that expression without parsing twice
>
> I second John's assessment.
>
> BTW, what are the tokens for? Do you actually need to have access to the
> lexical structure of the String? Of course, a parsed Expression object is a
> tree itself and gives you access to its own structure either directly
> ('getOperand(int)') or via 'traverse' and 'transform' methods.
>
> Andrus
>
>> On Nov 14, 2014, at 9:54 PM, John Huss <johnth...@gmail.com> wrote:
>>
>> This looks like a serious micro optimization. Is the performance for
>> this really that critical? Have you demonstrated that this is your
>> application's crucial hot spot?
>>
>> On Fri, Nov 14, 2014 at 7:35 AM, Davide Vecchi <d...@amc.dk> wrote:
>>
>>> Hi all,
>>>
>>> I have an expression in a string, and I use Cayenne to parse the
>>> expression into tokens, which are needed for a specific purpose.
>>>
>>> However in addition to having the tokens I also need to evaluate an
>>> object against that expression, to see if that object matches the
>>> expression.
>>>
>>> My problem is that the way I'm doing it causes the parsing to be done
>>> twice on the same expression, and I would like to avoid parsing the
>>> same expression twice.
>>>
>>> I'm doing the token creation like this:
>>>
>>> -----------------------------------
>>> String where = "myField=0";
>>>
>>> Reader reader = new StringReader(where);
>>>
>>> ExpressionParser parser = new ExpressionParser(reader);
>>>
>>> List<Token> tokens = new ArrayList<>();
>>>
>>> Token token = parser.getNextToken();
>>>
>>> while (token != null) {
>>>
>>>     tokens.add(token);
>>>
>>>     token = parser.getNextToken();
>>> }
>>> -----------------------------------
>>>
>>> I'm doing the object matching like this:
>>>
>>> -----------------------------------
>>> String where = "myField=0";
>>>
>>> Expression expression = Expression.fromString(where);
>>>
>>> boolean matches = expression.match(object);
>>> -----------------------------------
>>>
>>> The call to Expression.fromString made in the object matching
>>> operation parses the string, but the same expression had already
>>> been parsed in the token creation operation.
>>>
>>> Is there a way to redesign this process in order to get the tokens
>>> and also match an object against the expression without parsing the
>>> same expression twice?
>>>
>>> For example, I believe that the call to Expression.fromString must
>>> have created the tokens, because it has parsed the string.
>>> So I thought I could reverse the order and do the object matching first,
>>> keep the Expression instance created in that process and use it to
>>> extract the tokens. But I can't see how to extract the tokens from an
>>> Expression instance instead of from an ExpressionParser instance as I'm
>>> currently doing.
>>>
>>> Or another possibility could be that I keep creating the tokens
>>> first, and then I match my object against them, instead of against
>>> the string expression that generated those tokens. But I can't see
>>> how to match an object against tokens.
>>>
>>> So I'm looking for some ideas.
>>>
>>> Thanks in advance.
>>>
>>> Davide Vecchi
>>>
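
Editorial sketch, not part of the original messages: the TODO quoted in the thread ("cache expression strings, since this operation is pretty slow") suggests one way to limit repeated parsing when the same expression string shows up many times. The Java below is a minimal sketch of that idea under stated assumptions. The names ParsedExpressionCache and ParsedExpressionCacheDemo are hypothetical and are not Cayenne classes, and the parser passed in is a stand-in so that the example runs without Cayenne on the classpath. With Cayenne 3.0.x one would presumably pass Expression::fromString as the parser function. This does not recover tokens from an Expression; it only avoids parsing an identical string more than once.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/** Hypothetical helper: parses each distinct expression string at most once. */
final class ParsedExpressionCache<E> {

    private final Map<String, E> cache = new ConcurrentHashMap<>();
    private final Function<String, E> parser;

    /** @param parser the slow parse step; assumed to be e.g. Expression::fromString */
    ParsedExpressionCache(Function<String, E> parser) {
        this.parser = parser;
    }

    /** Returns the cached parse result, invoking the parser only on a cache miss. */
    E get(String expressionString) {
        return cache.computeIfAbsent(expressionString, parser);
    }
}

final class ParsedExpressionCacheDemo {
    public static void main(String[] args) {
        AtomicInteger parseCount = new AtomicInteger();

        // Stand-in parser (an assumption for illustration, not Cayenne code):
        // it counts how often it is called and returns a placeholder result.
        ParsedExpressionCache<String> cache = new ParsedExpressionCache<>(s -> {
            parseCount.incrementAndGet();
            return "parsed(" + s + ")";
        });

        String first = cache.get("myField=0");
        String second = cache.get("myField=0");

        System.out.println(first == second);   // prints "true": same cached instance
        System.out.println(parseCount.get());  // prints "1": the parser ran once
    }
}
```

A second cache of the same shape could hold the token list per string, so each of the two passes described in the original question would run once per distinct expression rather than once per object. Sharing one cached instance between callers is only safe if nobody mutates it afterwards, which is an assumption that would need checking against how the application uses Expression.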