On 4/18/07, William Mee <[EMAIL PROTECTED]> wrote:
I'd like to add metadata which I get *after* indexing a document's contents to
the index. To be more specific: I'm implementing shingling (detection of
near-duplicate documents) and want to add the document fingerprint (which is
based on the sequence of tokens) to the index.
William Mee wrote:
I'd like to add metadata which I get *after* indexing a document's
contents to the index. To be more specific: I'm implementing
shingling (detection of near-duplicate documents) and want to add the
document fingerprint (which is based on the sequence of tokens) to
the index.
T
On 18 Apr 2007, at 18:25, William Mee wrote:
The only way I could get this information *before* adding a
document to an index is to create a token stream manually (and then
have this happen all over again when the document is indexed). This
isn't a satisfying solution.
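
For readers unfamiliar with shingling, here is a minimal sketch of a
fingerprint computed from a token sequence before the document is added to
an index. It is plain Java and does not use any Lucene API. The class name
ShingleSketch, the shingle width w, and the choice of FNV-1a as the hash are
illustrative assumptions, not anything taken from the thread. In the setting
described above, the token list would presumably come from the same
analysis that the indexer applies.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper class: names and parameters are assumptions for illustration.
public class ShingleSketch {

    // Every run of w consecutive tokens, joined by a single space.
    static List<String> shingles(List<String> tokens, int w) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i + w <= tokens.size(); i++) {
            out.add(String.join(" ", tokens.subList(i, i + w)));
        }
        return out;
    }

    // 64-bit FNV-1a hash of the UTF-8 bytes of s (an assumed hash choice).
    static long fnv1a64(String s) {
        long h = 0xcbf29ce484222325L;
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            h ^= (b & 0xff);
            h *= 0x100000001b3L;
        }
        return h;
    }

    // The document fingerprint in this sketch: the set of hashed shingles.
    static Set<Long> hashedShingles(List<String> tokens, int w) {
        Set<Long> set = new TreeSet<>();
        for (String s : shingles(tokens, w)) {
            set.add(fnv1a64(s));
        }
        return set;
    }

    // Jaccard resemblance of two fingerprints: |A ∩ B| / |A ∪ B|.
    static double resemblance(Set<Long> a, Set<Long> b) {
        if (a.isEmpty() && b.isEmpty()) {
            return 1.0;
        }
        Set<Long> inter = new TreeSet<>(a);
        inter.retainAll(b);
        int union = a.size() + b.size() - inter.size();
        return (double) inter.size() / union;
    }
}
```

Two documents whose fingerprints have a resemblance close to 1.0 would be
treated as near-duplicates. Storing the full hash set per document is the
simplest variant; a real deployment would more likely keep only a small
sample of the hashes.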
Why is it not a satisf
I'd like to add metadata which I get *after* indexing a document's contents to
the index. To be more specific: I'm implementing shingling (detection of
near-duplicate documents) and want to add the document fingerprint (which is
based on the sequence of tokens) to the index.
There doesn't seem