Re: Binary Automaton

2017-10-02 Thread José Tomás Atria
Mike, could you clarify what you meant by the int comment at the end of your last message? I fail to see the significance of having multibyte transition labels for the format of the payloads the automaton will run on... Thanks! Jta On Mon, Oct 2, 2017, 12:41 Cristian Lorenzetto < cristian.lorenz

Re: Binary Automaton

2017-10-02 Thread Cristian Lorenzetto
It sounds like a good way :) Maybe the code to develop it is not so huge. Thanks for the suggestions :) 2017-10-02 12:27 GMT+02:00 Michael McCandless : > I'm not sure this is exactly what you are asking, but Lucene's terms are > already byte[] (default UTF-8 encoded from char[] terms), and the automat

Re: Binary Automaton

2017-10-02 Thread Michael McCandless
I'm not sure this is exactly what you are asking, but Lucene's terms are already byte[] (default UTF-8 encoded from char[] terms), and the automata that are created for searching (e.g. by WildcardQuery, PrefixQuery, FuzzyQuery, AutomatonQuery) are also byte based (see the crazy UTF32ToUTF8.java con
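
To illustrate the point about byte-based terms and automata, here is a minimal sketch in Python. It is not Lucene code and does not use any Lucene API; the names build_prefix_dfa and matches are hypothetical, made up for this example. It only shows the general idea: a term is encoded to UTF-8 bytes, and an automaton whose transition labels are single bytes is run over that byte sequence, so a non-ASCII character such as "é" is consumed as two transitions.

```python
def build_prefix_dfa(prefix: str):
    """Build a tiny byte-labeled DFA that accepts any term starting with `prefix`.

    Hypothetical helper for illustration only (not a Lucene API).
    Returns (transitions, accept_state), where transitions maps
    (state, byte_value) -> next_state.
    """
    prefix_bytes = prefix.encode("utf-8")
    transitions = {}
    for state, byte_value in enumerate(prefix_bytes):
        # One transition per UTF-8 byte, not per character.
        transitions[(state, byte_value)] = state + 1
    accept_state = len(prefix_bytes)
    return transitions, accept_state


def matches(term: str, transitions, accept_state) -> bool:
    """Run the DFA over the UTF-8 bytes of `term`."""
    state = 0
    for byte_value in term.encode("utf-8"):
        if state == accept_state:
            # The whole prefix has been consumed; any suffix is accepted.
            return True
        next_state = transitions.get((state, byte_value))
        if next_state is None:
            return False
        state = next_state
    return state == accept_state


if __name__ == "__main__":
    transitions, accept_state = build_prefix_dfa("caf\u00e9")
    print(accept_state)                                   # number of bytes in the prefix
    print(matches("caf\u00e9s", transitions, accept_state))
    print(matches("cafe", transitions, accept_state))
```

Because the labels are bytes, the same machinery would work on arbitrary binary terms, not only on UTF-8 text; that is an observation about this sketch, not a statement about how Lucene implements it.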