Hi,
I tried adjusting the Tika configuration through a config file and loading
it in the program where I parse the text, but that didn't work and I am
still getting the same result.
Basically, I want my program (which uses Tika for parsing) to treat any
data it is given as plain text and nothing else.

Could you please suggest how I can solve this?
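
One possible direction, sketched here only as an assumption and not as a confirmed fix from this thread: if every input really is plain text, the type detection step can be skipped by calling Tika's TXTParser directly instead of going through the auto-detecting parser. The class name ForcePlainText and the helper parseAsText are hypothetical, and the sketch assumes a Tika 2.x text parser module (for example tika-parsers-standard-package) is on the classpath.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

import org.apache.tika.metadata.Metadata;
import org.apache.tika.parser.ParseContext;
import org.apache.tika.parser.txt.TXTParser;
import org.apache.tika.sax.BodyContentHandler;

// Hypothetical helper: always parse the input as plain text.
final class ForcePlainText {

    // Runs TXTParser directly, so no MIME type detection takes place
    // and a leading "P1"/"P2" cannot redirect the input to an image parser.
    static String parseAsText(byte[] data) throws Exception {
        // -1 disables the default write limit of the handler.
        BodyContentHandler handler = new BodyContentHandler(-1);
        Metadata metadata = new Metadata();
        try (InputStream in = new ByteArrayInputStream(data)) {
            new TXTParser().parse(in, handler, metadata, new ParseContext());
        }
        return handler.toString().trim();
    }

    public static void main(String[] args) throws Exception {
        byte[] data = "P2P He has Asthma".getBytes(StandardCharsets.UTF_8);
        System.out.println(parseAsText(data));
    }
}
```

The trade-off is that genuinely binary input would also be pushed through the text parser, so this only makes sense when the caller already knows the data is text.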

-Kashif
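
For readers following the thread, the behaviour explained in the quoted replies below (a leading "P2" or "P1" being treated as an image magic number) can be illustrated with a deliberately simplified sketch. This is not Tika's actual detection code. The class MagicPrefixSketch and the method detectByPrefix are hypothetical, and only the two magic numbers mentioned in the thread are modelled.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical, simplified model of prefix-based (magic number) detection.
final class MagicPrefixSketch {

    // Looks only at the first two bytes, which is why ordinary text
    // that happens to start with "P1" or "P2" is mistaken for an image.
    static String detectByPrefix(byte[] data) {
        String head = new String(data, 0, Math.min(2, data.length),
                StandardCharsets.US_ASCII);
        switch (head) {
            case "P1":
                return "image/x-portable-bitmap";
            case "P2":
                return "image/x-portable-graymap";
            default:
                return "text/plain";
        }
    }

    public static void main(String[] args) {
        String[] samples = {
            "P2P He has Asthma",
            "P18-8610 He has Asthma",
            "He has Asthma"
        };
        for (String s : samples) {
            byte[] bytes = s.getBytes(StandardCharsets.US_ASCII);
            System.out.println(s + " -> " + detectByPrefix(bytes));
        }
    }
}
```

In this sketch the first sample maps to image/x-portable-graymap, the second to image/x-portable-bitmap, and the third to text/plain, which matches the pattern of results reported in the thread.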

On Sun, Mar 17, 2024 at 10:23 PM Tilman Hausherr <[email protected]>
wrote:

> Hi,
>
> The best would of course be that you don't make it look as if your text
> files are something else.
>
> The second best: fine-tune the Tika configuration:
> https://tika.apache.org/2.9.1/configuring.html
>
> Tilman
>
> On 17.03.2024 17:46, Kashif Khan wrote:
>
> Do you think it is an issue to be fixed? And also, is there a workaround
> for this to work?
>
> On Sun, Mar 17, 2024, 5:03 PM Tilman Hausherr <[email protected]>
> wrote:
>
>> The first one is recognized as image/x-portable-graymap because "P2" is a
>> magic number for that type.
>>
>> "P1" is a magic number for image/x-portable-bitmap.
>>
>> Tilman
>>
>> On 16.03.2024 12:37, Kashif Khan wrote:
>>
>> Hello Tim/Forum,
>>
>> When I try to parse the content below, the result is null/empty:
>> *"P2P He has Asthma"*
>> OR
>> *"P18-8610 He has Asthma"*
>> OR
>> *"P2P Scheduled as He had breathing issues *for the last* 1 year."*
>>
>> Whereas, the below gets parsed without any issues:
>> *"He has Asthma"*
>> *"Appointment Scheduled as He had breathing issues for last 1 year."*
>>
>> Could you please help me understand the exact issue and how to
>> resolve it?
>>
>> -Kashif Khan
>> [email protected]
>>
>>
>>
>
