Okay, I've had a go at this and have found the solution, or at least why it's
breaking.
Running testGetAttachmentContent() on TikaSearchProviderTest, it successfully
creates the two attachments:
./target/test-classes/testrepository1557439078377/test-tika-att/aaa-diagram.pdf-dir/attachment.properties
./target/test-classes/testrepository1557439078377/test-tika-att/aaa-diagram.pdf-dir/1.pdf
./target/test-classes/testrepository1557439078377/test-tika-att/favicon.png-dir/attachment.properties
./target/test-classes/testrepository1557439078377/test-tika-att/favicon.png-dir/1.png
and correctly updates the Lucene index for both attachments:
2019-05-10 10:06:26,979 [JSPWiki Lucene Indexer] DEBUG
org.apache.wiki.search.LuceneSearchProvider \
- Done updating Lucene index for page 'test-tika/aaa-diagram.pdf'.
2019-05-10 10:07:23,203 [JSPWiki Lucene Indexer] DEBUG
org.apache.wiki.search.LuceneSearchProvider \
- Done updating Lucene index for page 'test-tika/favicon.png'.
and successfully creates the TikaSearchProvider ('tsp'), but then runs into
this code:
    // If the page cannot be determined, we cannot possibly find the
    // attachments.
    //
    if( currentPage == null || currentPage.getName().length() == 0 )
    {
        return null;
    }
and hence returns a null page and then a null attachment on:
Attachment attPdf = engine.getAttachmentManager().getAttachmentInfo(
"test-tika/aaa-diagram.pdf" );
Looking into why: the parent page name is "Test-tika", not "test-tika", i.e.,
the two names differ only in case. Changing the string to "Test-tika" allows
the test to pass.
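
To illustrate the case mismatch in isolation, here is a minimal sketch. It is
not JSPWiki code: the class CaseSensitiveLookup and its map are hypothetical
stand-ins, and it assumes only what is reported above, namely that the parent
page is registered as "Test-tika" and that names are matched exactly.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a case-sensitive attachment lookup; NOT JSPWiki code.
class CaseSensitiveLookup {

    // Attachments keyed by "ParentPage/fileName". Assumption: the parent page
    // is registered as "Test-tika" (capital T), as reported in this message.
    private final Map<String, String> attachments = new HashMap<>();

    CaseSensitiveLookup() {
        attachments.put("Test-tika/aaa-diagram.pdf", "1.pdf");
        attachments.put("Test-tika/favicon.png", "1.png");
    }

    // Returns the stored file name, or null when no key matches exactly.
    String getAttachmentInfo(String name) {
        return attachments.get(name);
    }

    public static void main(String[] args) {
        CaseSensitiveLookup repo = new CaseSensitiveLookup();
        // Lower-case parent name: no exact match, so the lookup yields null.
        System.out.println(repo.getAttachmentInfo("test-tika/aaa-diagram.pdf")); // null
        // Parent name with the registered capitalisation: found.
        System.out.println(repo.getAttachmentInfo("Test-tika/aaa-diagram.pdf")); // 1.pdf
    }
}
```

The real lookup goes through the attachment manager and the page check quoted
above, but the effect is the same: a name that differs only in case is treated
as a different page.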
Cheers,
Murray
On 10/05/19 8:35 AM, Murray Altheim wrote:
Hi Juan Pablo,
Part of what fails with Lucene is not only which version the code uses, but
also what's available in the classpath. So you may also need to eliminate the
conflicting versions, because if Lucene can find a newer version on the
classpath, that can also cause a conflict. Or set both to the most recent one,
Version.LATEST.
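
One rough way to see whether more than one copy of Lucene is visible on the
classpath is to ask the class loader for every location of a known class file.
This is only a sketch: the class name ClasspathCopies and the method findCopies
are made up for illustration, and it assumes the copies are reachable through
the thread's context class loader.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.util.Collections;
import java.util.List;

// Hypothetical helper: lists every classpath location that provides a resource.
class ClasspathCopies {

    // Returns all URLs at which the given resource path is visible.
    static List<URL> findCopies(String resourcePath) {
        ClassLoader cl = Thread.currentThread().getContextClassLoader();
        if (cl == null) {
            cl = ClassLoader.getSystemClassLoader();
        }
        try {
            // getResources() reports every match, not just the first one found.
            return Collections.list(cl.getResources(resourcePath));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // More than one line of output suggests several Lucene jars are visible.
        for (URL url : findCopies("org/apache/lucene/util/Version.class")) {
            System.out.println(url);
        }
    }
}
```

Run from within the test classpath, each printed URL names the jar that
supplies the class, which should make conflicting versions easy to spot.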
From my own Lucene version unit test:
Version.LATEST,
Version.LUCENE_7_5_0,
Version.LUCENE_7_4_0,
Version.LUCENE_7_0_0,
Version.LUCENE_6_6_5,
Version.LUCENE_6_5_0,
Version.LUCENE_6_0_0,
Cheers,
Murray
On 10/05/19 12:25 AM, Juan Pablo Santos Rodríguez wrote:
Hi Murray,
Thanks for taking a look! I was able to make an additional test, setting
jspwiki's lucene version to 7.5 (the same brought by default by tika) and
had the same outcome.
I did look at other tests retrieving attachments and PageRenamerTest grabs
them the same way. I still have to take a better look at it, though.
best regards,
juan pablo
...........................................................................
Murray Altheim <murray18 at altheim dot com> = = ===
http://www.altheim.com/murray/ === ===
= = ===
In the evening
The rice leaves in the garden
Rustle in the autumn wind
That blows through my reed hut.
-- Minamoto no Tsunenobu