Oh, I was simply explaining that the Lucene Automaton API uses "int" labels,
and so if you want an automaton operating in byte space, you just need to
ensure those ints only use the range supported by unsigned bytes (0 - 255).
Mike McCandless
http://blog.mikemccandless.com
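A minimal sketch of the point above, in plain Java rather than against the Lucene API itself. Java's byte type is signed (-128 to 127), so a byte has to be masked with 0xFF before it can serve as an int label in the 0 to 255 range. The class ByteLabels and its methods are hypothetical names invented for this illustration; in Lucene, such a label would be the kind of value passed as the label argument when adding a transition to an Automaton.

```java
/** Hypothetical helper (not part of Lucene): maps signed Java bytes to int labels in 0..255. */
class ByteLabels {

    /** Masking with 0xFF turns a signed byte (-128..127) into its unsigned value (0..255). */
    public static int toLabel(byte b) {
        return b & 0xFF;
    }

    /** Converts a byte[] payload into the sequence of int labels a byte-space automaton would consume. */
    public static int[] toLabels(byte[] payload) {
        int[] labels = new int[payload.length];
        for (int i = 0; i < payload.length; i++) {
            labels[i] = toLabel(payload[i]);
        }
        return labels;
    }

    public static void main(String[] args) {
        byte[] payload = {(byte) 0x00, (byte) 0x7F, (byte) 0x80, (byte) 0xFF};
        for (int label : toLabels(payload)) {
            System.out.println(label); // prints 0, 127, 128, 255 on separate lines
        }
    }
}
```

Without the mask, the bytes 0x80 and 0xFF would widen to -128 and -1, which fall outside the unsigned byte range described above.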
On Mon, Oct 2, 2017 at
Mike, could you clarify what you meant by the int comment at the end of
your last message? I fail to see the significance of having multibyte
transition labels for the format of the payloads the automaton will run
on...
Thanks!
Jta
On Mon, Oct 2, 2017, 12:41 Cristian Lorenzetto <
cristian.lorenz
It sounds like a good way :) Maybe the code needed to develop it is not so
huge. Thanks for the suggestions :)
2017-10-02 12:27 GMT+02:00 Michael McCandless :
> I'm not sure this is exactly what you are asking, but Lucene's terms are
> already byte[] (default UTF-8 encoded from char[] terms), and the automat
I'm not sure this is exactly what you are asking, but Lucene's terms are
already byte[] (default UTF-8 encoded from char[] terms), and the automata
that are created for searching (e.g. by WildcardQuery, PrefixQuery,
FuzzyQuery, AutomatonQuery) are also byte based (see the crazy
UTF32ToUTF8.java con
> Preface: I don't know how the automaton is implemented deep inside Lucene,
Well, you can take a look, it's open source. :) There are two
different finite state automata inside Lucene: one is pretty much a
"read-only" transducer from unique input sequences (of bytes) into an
output. This is the FST
*to @Uwe Schindler*
Thanks, it is very interesting :)
*to @Dawid*
Preface: I don't know how the automaton is implemented deep inside Lucene,
but (considering the automaton is built on the fly when the index is already
present) I imagine that the automaton is scanning the lexicons/tokens
present in th
> Hi, is it possible to create an Automaton in Lucene by parsing a byte
> array rather than a string?
Can you state what problem you are trying to solve? This seems to be a
question stripped of a more general context -- why do you need those
byte-based automata?
Dawid
--
Lucene Users
> Subject: Binary Automaton
>
> Hi, is it possible to create an Automaton in Lucene by parsing a byte
> array rather than a string?
Hi, is it possible to create an Automaton in Lucene by parsing a byte
array rather than a string?