I doubt anyone has tested it. I'd compile it under Java 8 and see if
all of the tests run.
Best,
Erick
On Sat, Sep 16, 2017 at 7:41 AM, Lisheng Zhang wrote:
> Hi, in one of our products we are still using Lucene 2.3; is Lucene 2.3
> compatible with Java 1.8?
>
> Thanks very much for your help, Lisheng
Hi, in one of our products we are still using Lucene 2.3; is Lucene 2.3
compatible with Java 1.8?
Thanks very much for your help, Lisheng
Whoa, thank you Uwe! I will have a look; too bad about the licensing, but
I know dictionaries are often licensed under the LGPL.
Mike McCandless
http://blog.mikemccandless.com
On Sat, Sep 16, 2017 at 7:03 AM, Uwe Schindler wrote:
> Hi,
>
> I deduped it. Thanks for the hint!
>
> Uwe
>
Hi,
I deduped it. Thanks for the hint!
Uwe
-
Uwe Schindler
Achterdiek 19, D-28357 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de
> -----Original Message-----
> From: Uwe Schindler [mailto:u...@thetaphi.de]
> Sent: Saturday, September 16, 2017 12:51 PM
> To: java-user@lucene.apache.or
OK, sorting and deduping should be easy with a simple command line. The
reason is that it was created from two files of Björn Jacke's data. I thought
that I had deduped it...
Uwe
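The command-line dedupe Uwe mentions can be sketched with `sort -u`, which merges, sorts, and drops duplicate lines in one pass; the file names here are illustrative stand-ins, not the actual dictionary files:

```shell
# Merge two word lists, sort them, and remove duplicate lines.
# words1.txt / words2.txt stand in for the two source files.
printf 'haus\nbaum\nhaus\n' > words1.txt
printf 'tisch\nbaum\n' > words2.txt

sort -u words1.txt words2.txt > dictionary.txt
cat dictionary.txt
```

Note that `sort -u` compares whole lines; add `-f` for case-insensitive deduping, and set `LC_ALL=C` for a stable byte-order sort of non-ASCII German words.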
On 16 September 2017 12:46:29 CEST, Markus Jelsma wrote:
>Sorry, I would if I were on GitHub, but I am not.
>
>Thanks again
Sorry, I would if I were on GitHub, but I am not.
Thanks again!
Markus
-----Original message-----
> From:Uwe Schindler
> Sent: Saturday 16th September 2017 12:45
> To: java-user@lucene.apache.org
> Subject: RE: German decompounding/tokenization with Lucene?
>
> Send a pull request. :)
>
> Uwe
Send a pull request. :)
Uwe
On 16 September 2017 12:42:30 CEST, Markus Jelsma wrote:
>Hello Uwe,
>
>Thanks for getting rid of the compounds. The dictionary can be smaller:
>it still has about 1500 duplicates, and it is also unsorted.
>
>Regards,
>Markus
>
>
>-----Original message-----
>> From:Uwe
Hello Uwe,
Thanks for getting rid of the compounds. The dictionary can be smaller: it
still has about 1500 duplicates, and it is also unsorted.
Regards,
Markus
-----Original message-----
> From:Uwe Schindler
> Sent: Saturday 16th September 2017 12:16
> To: java-user@lucene.apache.org
> Subject: R
Hi,
I published my work on Github:
https://github.com/uschindler/german-decompounder
Have fun. I am not yet 100% sure about the license of the data file. The
original author (Björn Jacke) did not publish any license, but LibreOffice
publishes his files under LGPL. So to be safe, I applied the
Hi Michael,
I had this issue just yesterday. I did that several times and I built a good
dictionary in the meantime.
I have an example for Solr or Elasticsearch with the same data. It uses the
HyphenationCompoundTokenFilter, but with ZIP file *and* dictionary (it's
important to have both). The
+1, some time ago I also used the decompounder mentioned by Dawid and was
satisfied back then.
Regards,
Tommaso
On Sat, Sep 16, 2017 at 09:29, Dawid Weiss wrote:
> Hi Mike. Search lucene dev archives. I did write a decompounder with Daniel
> Naber. The quality was not ideal but
Hi Mike. Search lucene dev archives. I did write a decompounder with Daniel
Naber. The quality was not ideal but perhaps better than nothing. Also,
Daniel works on languagetool.org? They should have something in there.
Dawid
On Sep 16, 2017 1:58 AM, "Michael McCandless"
wrote:
> Hello,
>
> I ne