fetching content from archives and images

Maciej Liżewski Fri, 04 Jan 2013 04:09:50 -0800

Hi,

I have two questions:


1. Does Tika recursively extract content from archives (zip, rar, etc.)? I
have found some configuration options that suggest it does, but I want to be
sure. Also, is there a description of how it is done (does it process the
whole virtual file system in the archive, or just some files)?

2. Are there any plugins or interfaces that would allow Tika to extract
content from images via an OCR library? If so, is a particular library
required, or can it be replaced with a different one that better suits my
needs?

TIA
Maciek
