[ https://issues.apache.org/jira/browse/NIFI-2787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15507466#comment-15507466 ]
Michael Moser commented on NIFI-2787:
-------------------------------------
In NiFi 1.1.0-SNAPSHOT, I wrote a unit test that reproduces this. It manifests as:
[Index Provenance Events] ERROR org.apache.nifi.provenance.PersistentProvenanceRepository - Failed to index Provenance Event for target/storage/722797da-d510-4cee-b1ed-cd497df6052a/0.prov.gz to target/storage/722797da-d510-4cee-b1ed-cd497df6052a/index-1474396176476
java.lang.IllegalArgumentException: Document contains at least one immense term in field="immense" (whose UTF8 encoding is longer than the max length 32766), all of which were skipped. Please correct the analyzer to not produce such terms. The prefix of the first immense term is: <junk>, original message: bytes can be at most 32766 in length; got 36000
at <snip>
at org.apache.nifi.provenance.lucene.IndexingAction.index(IndexingAction.java:126)
at org.apache.nifi.provenance.PersistentProvenanceRepository$12.call(PersistentProvenanceRepository.java:1742)
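
The limit in the error above is measured in UTF-8 bytes, not characters. As an illustration only, here is a minimal sketch of one way to guard against it: cap a value at 32766 encoded bytes before it is handed to the indexer. The class and method names (TermLengthGuard, truncateToUtf8Bytes) are hypothetical and are not part of NiFi or Lucene. This is not necessarily the fix that belongs in the repository. Only the 32766 figure comes from the exception message.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not NiFi or Lucene code.
public final class TermLengthGuard {
    // Maximum UTF-8 encoded term length, taken from the exception message above.
    public static final int MAX_TERM_BYTES = 32766;

    /**
     * Returns the longest prefix of value whose UTF-8 encoding fits in maxBytes.
     * Truncation happens on code point boundaries, so a surrogate pair is never split.
     */
    public static String truncateToUtf8Bytes(String value, int maxBytes) {
        int bytes = 0;
        int i = 0;
        while (i < value.length()) {
            int cp = value.codePointAt(i);
            // UTF-8 width of this code point (an unpaired surrogate is counted as 3,
            // which can only over-estimate, so the result still fits).
            int cpBytes = cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
            if (bytes + cpBytes > maxBytes) {
                break;
            }
            bytes += cpBytes;
            i += Character.charCount(cp);
        }
        return value.substring(0, i);
    }

    public static void main(String[] args) {
        // Rebuild the 36,000-byte attribute from the report, using ASCII only.
        StringBuilder sb = new StringBuilder();
        for (int n = 0; n < 36000; n++) {
            sb.append('a');
        }
        String truncated = truncateToUtf8Bytes(sb.toString(), MAX_TERM_BYTES);
        System.out.println(truncated.getBytes(StandardCharsets.UTF_8).length); // prints 32766
    }
}
```

One consequence of the byte-based limit: capping an attribute at 32766 characters would not be enough if the value contains multi-byte characters, since each of those can take up to 4 bytes in UTF-8.
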
> PersistentProvenanceRepository rollover can fail on immense indexed attributes
> ------------------------------------------------------------------------------
>
> Key: NIFI-2787
> URL: https://issues.apache.org/jira/browse/NIFI-2787
> Project: Apache NiFi
> Issue Type: Bug
> Components: Core Framework
> Affects Versions: 1.0.0, 0.7.0
> Reporter: Michael Moser
>
> I accidentally created an immense attribute (36,000 bytes) and indexed it
> via nifi.provenance.repository.indexed.attributes. The rollover then failed
> with this error:
> ERROR [Provenance Repository Rollover Thread-1]
> o.a.n.p.PersistentProvenanceRepository Failed to rollover Provenance
> repository due to java.lang.IllegalArgumentException: Document contains at
> least one immense term in field="FOO" (whose UTF8 encoding is longer than the
> max length 32766), all of which were skipped. Please correct the analyzer to
> not produce such terms.
> Perhaps the fix is as simple as changing the value defined at
> https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-provenance-repository-bundle/nifi-persistent-provenance-repository/src/main/java/org/apache/nifi/provenance/RepositoryConfiguration.java#L37
> to 32766 to match Lucene's limit. Investigation and testing are needed.
> For background, this Lucene ticket made exceeding the term size limit throw
> an IllegalArgumentException: https://issues.apache.org/jira/browse/LUCENE-5472