Rafael Turk wrote:
*Great idea! Berkeley DB is definitely worth a try, simple and effective, but
I'll have to preprocess the data first.
JDBM has a more appealing license, if you ask the ASF.
karl
Rafael Turk wrote:
Hi Mathieu,
*What do you want to do?*
A spell checker and related keyword suggestions
Here is a spell checker which I am trying to finalize:
https://admin.garambrogne.net/projets/revuedepresse/browser/trunk/src/java
If you want an ngram => popularity map, just use a berkl
Hi Mathieu,
*What do you want to do?*
A spell checker and related keyword suggestions
If you want an ngram => popularity map, just use a Berkeley DB, and use this
information in your Lucene application. Lucene is an inverted index; Berkeley
DB is a plain key-value index.
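For illustration, the ngram => popularity lookup suggested here could be sketched roughly as follows. This is only a sketch under assumptions: a plain in-memory HashMap stands in for Berkeley DB (no Berkeley DB API is used), and the class name PopularityRanker, its methods, and the counts in main are made up for the example, not taken from the corpus.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: ranks spelling candidates by n-gram popularity.
class PopularityRanker {
    // Stand-in for the key-value store: n-gram -> observed frequency count.
    private final Map<String, Long> counts = new HashMap<>();

    void put(String ngram, long count) {
        counts.put(ngram, count);
    }

    // N-grams that are not in the map are treated as having count 0.
    long popularity(String ngram) {
        return counts.getOrDefault(ngram, 0L);
    }

    // Returns a copy of the candidates ordered from most to least popular.
    // List.sort is stable, so candidates with equal counts keep input order.
    List<String> rank(List<String> candidates) {
        List<String> sorted = new ArrayList<>(candidates);
        sorted.sort(Comparator.comparingLong((String c) -> popularity(c)).reversed());
        return sorted;
    }

    public static void main(String[] args) {
        PopularityRanker ranker = new PopularityRanker();
        ranker.put("their", 500L); // made-up count, for illustration only
        ranker.put("thier", 3L);   // made-up count, for illustration only
        // prints [their, thier, theyr]
        System.out.println(ranker.rank(Arrays.asList("thier", "their", "theyr")));
    }
}
```

The candidate list itself would still come from the Lucene side (or any other suggester); the map only supplies the ordering.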
*Great idea! Berkeley DB is definitely a t
Thanks Julien,
I'll definitely give it a try!
[]s
Rafael
On Wed, Apr 23, 2008 at 8:38 AM, Julien Nioche <
[EMAIL PROTECTED]> wrote:
> Hi Raphael,
>
> We initially tried to do the same but ended up developing our own API for
> querying the Web 1T. You can find more details on
> http://digita
Rafael Turk wrote:
Hi Folks,
I'm trying to load the Google Web 1T 5-gram corpus into Lucene. (This corpus contains
English word n-grams and their observed frequency counts. The length of the
n-grams ranges from unigrams (single words) to five-grams.)
I'm loading each n-gram (each row is an n-gram) as an
Hi Raphael,
We initially tried to do the same but ended up developing our own API for
querying the Web 1T. You can find more details on
http://digitalpebble.com/resources.html
There could be a way to reuse elements from Lucene e.g. the Term index only
but I could not find an obvious way to achieve
Hi Folks,
I'm trying to load the Google Web 1T 5-gram corpus into Lucene. (This corpus contains
English word n-grams and their observed frequency counts. The length of the
n-grams ranges from unigrams (single words) to five-grams.)
I'm loading each n-gram (each row is an n-gram) as an individual Document.
Th
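As a rough sketch of the "one row, one document" step, each row first has to be split into the values one would store. The code below assumes each row has the form `token1 token2 ... tokenN<TAB>count`; that layout, the class name NgramRow, its field names, and the sample row are assumptions made for this example. No Lucene API is used here: the resulting fields are simply what one might put on a Document.

```java
// Hypothetical value holder for one corpus row.
class NgramRow {
    final String ngram; // the space-separated tokens, unchanged
    final int order;    // number of tokens (1 for unigrams, up to 5)
    final long count;   // observed frequency count

    NgramRow(String ngram, int order, long count) {
        this.ngram = ngram;
        this.order = order;
        this.count = count;
    }

    // Parses one row, assuming "tokens<TAB>count" (an assumption, see above).
    static NgramRow parse(String line) {
        int tab = line.lastIndexOf('\t');
        if (tab < 0) {
            throw new IllegalArgumentException("no tab separator in row: " + line);
        }
        String ngram = line.substring(0, tab);
        long count = Long.parseLong(line.substring(tab + 1).trim());
        int order = ngram.split(" ").length;
        return new NgramRow(ngram, order, count);
    }

    public static void main(String[] args) {
        // Sample row with a made-up count, for illustration only.
        NgramRow row = NgramRow.parse("serve as the initial\t5331");
        System.out.println(row.ngram + " | order=" + row.order + " | count=" + row.count);
    }
}
```

Keeping the order and the count as separate values would make it possible to filter by n-gram length or sort by frequency later, whichever store ends up holding them.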