It is (b): the dictionary hasn't changed, but the files change when the FST format changes.

D.

On Fri, Aug 7, 2015 at 3:05 AM, Trejkaz <trej...@trypticon.org> wrote:
> I have recently done updates from Lucene 3.6 to 4.x and 4.x to 5.2.
>
> During this process, I noticed that the FST used by the Japanese
> analyser (AKA Kuromoji) was changing between releases. As I fear
> breakages in backwards compatibility, I worried that the dictionary
> had changed, so I wrote a little program to read it in and print the
> words out in order.
>
> What I find is that in all three releases, the list of words is
> exactly the same - even though the files have changed subtly from
> release to release.
>
> What's up with that? I can think of a few possibilities:
>
> (a) the dictionary _has_ actually changed, and merely printing the
> list of words was not enough (e.g., the parts of speech changed)
>
> (b) the dictionary hasn't changed, but the files change when the FST
> format changes
>
> (c) the dictionary hasn't changed, but the files change because
> they're built on demand every time Lucene is built and there is
> something non-deterministic about the process (e.g. something is using
> a HashMap internally.)
>
> I'm hoping that it's (b), but does anybody know?
>
> TX
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java-user-h...@lucene.apache.org
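
For anyone who wants to rule out (a) independently, one option is to dump each word together with whatever is stored alongside it (not the word alone) and compare the dumps from two releases. The following is only a minimal sketch: it assumes each dump is already available as (word, output) pairs, and the helper name diff_dumps and the sample data are made up for illustration and are not part of Lucene.

```python
def diff_dumps(old, new):
    """Compare two dictionary dumps given as iterables of (word, output) pairs.

    Returns (removed, added, changed): words only in old, words only in new,
    and words present in both whose associated output differs.
    """
    old_map, new_map = dict(old), dict(new)
    removed = sorted(old_map.keys() - new_map.keys())
    added = sorted(new_map.keys() - old_map.keys())
    changed = sorted(
        w for w in old_map.keys() & new_map.keys() if old_map[w] != new_map[w]
    )
    return removed, added, changed


# Made-up sample data standing in for dumps taken from different releases.
dump_a = [("東京", 101), ("大阪", 102)]
dump_b = [("大阪", 102), ("東京", 101)]  # same entries, different order
dump_c = [("東京", 101), ("大阪", 999)]  # same words, one output differs

print(diff_dumps(dump_a, dump_b))  # ([], [], [])
print(diff_dumps(dump_a, dump_c))  # ([], [], ['大阪'])
```

The second comparison shows the case a words-only listing would miss: the word lists match, but one entry's associated data differs. Three empty lists would be consistent with (b) or (c) rather than (a), provided the dump really covers everything stored for each word.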