Re: [Podofo-users] Better handling of xref entries shorter than 20 chars

Michal Sudolsky Tue, 05 Mar 2019 16:51:39 -0800

Maybe it would be better to check whether both empty1 and empty2 are
whitespace characters.
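
A minimal sketch of that check, not the actual PoDoFo code. The names XRefEntry, IsPdfWhitespace and ParseXRefEntry are made up for illustration, and the sscanf format is only assumed to resemble what the parser uses:

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical holder for one parsed xref entry (illustration only).
struct XRefEntry {
    long long offset;
    long generation;
    char type;
};

// White-space characters as listed in the PDF reference:
// NUL, TAB, LF, FF, CR and SPACE.
inline bool IsPdfWhitespace(char c)
{
    return c == '\0' || c == '\t' || c == '\n' ||
           c == '\f' || c == '\r' || c == ' ';
}

// Parses one entry and accepts it only if the two characters after the
// type character are both whitespace. A short entry pulls the first digit
// of the next entry into empty2 and is therefore rejected.
inline bool ParseXRefEntry(const char* buffer, XRefEntry& out)
{
    char empty1 = 0;
    char empty2 = 0;
    const int read = std::sscanf(buffer, "%10lld %5ld %c%c%c",
                                 &out.offset, &out.generation, &out.type,
                                 &empty1, &empty2);
    return read == 5 && IsPdfWhitespace(empty1) && IsPdfWhitespace(empty2);
}

// Usage example covering the cases from this thread.
inline void ExampleUsage()
{
    XRefEntry e{};
    assert(ParseXRefEntry("0000000017 00000 n\r\n", e));   // CR+LF
    assert(ParseXRefEntry("0000000017 00000 n \n", e));    // SPACE+LF
    assert(!ParseXRefEntry("0000000017 00000 n\n0000000081 00000 n\n", e));
}
```

Note that this also accepts LF+CR, since both characters are whitespace; it only rejects entries where a non-whitespace character shows up in one of the two EOL positions.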


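For the line-based reading mentioned in the quoted message below, here is a rough sketch of how a tolerant reader might do it with std::getline. This is an assumption about how such readers work, not code taken from any of them; LineEntry and ReadXRefEntriesByLine are hypothetical names:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical holder for one parsed xref entry (illustration only).
struct LineEntry {
    long long offset;
    long generation;
    char type;
};

// Reads `count` entries line by line instead of in fixed 20-byte records,
// so entries shorter or longer than 20 chars are accepted.
// Limitation: std::getline splits on LF only, so CR-only files would need
// extra handling.
inline bool ReadXRefEntriesByLine(std::istream& in, std::size_t count,
                                  std::vector<LineEntry>& out)
{
    std::string line;
    while (out.size() < count && std::getline(in, line))
    {
        LineEntry e{};
        // Trailing SPACE or CR left on the line is simply not consumed.
        const int read = std::sscanf(line.c_str(), "%lld %ld %c",
                                     &e.offset, &e.generation, &e.type);
        if (read != 3)
        {
            // Skip lines holding only whitespace; fail on anything else.
            if (line.find_first_not_of(" \t\r\f") == std::string::npos)
                continue;
            return false;
        }
        if (e.type != 'n' && e.type != 'f')
            return false;
        out.push_back(e);
    }
    return out.size() == count;
}

// Usage example: three entries with SPACE+LF, CR+LF and a bare LF.
inline void ExampleLineUsage()
{
    std::istringstream in("0000000000 65535 f \n"
                          "0000000017 00000 n\r\n"
                          "0000000081 00000 n\n");
    std::vector<LineEntry> entries;
    assert(ReadXRefEntriesByLine(in, 3, entries));
    assert(entries[1].offset == 17);
}
```

The trade-off is that this no longer validates the 20-byte record layout at all, which is exactly the strictness question discussed in this thread.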

On Wed, Mar 6, 2019 at 9:48 AM Michal Sudolsky <[email protected]> wrote:

> Hi,
>
> No, this is really not a good idea. Many PDFs that currently work with
> PoDoFo would probably stop working. The first PDF I checked just now has
> SPACE+LF, and it opens fine in any PDF viewer. OpenOffice also generates
> SPACE+LF (probably version 4.1.2 on Windows).
>
> Also, the PDF reference states that if the end-of-line marker is a single
> character, it is preceded by a space; see page 94.
>
> On the contrary, I think PoDoFo is too strict. Maybe it would be better if
> it could actually open these files with xref entries shorter or longer than
> 20 chars, as the readers from that page do. I think some readers are
> implemented to read the xref table line by line (for example, by using the
> C++ function getline).
>
>
> On Tue, Mar 5, 2019 at 1:47 AM F. E. <[email protected]> wrote:
>
>>> ... if I understand that correctly, then the proposed change allows
>>> "\r\n" and "\n\r" as a line separator, which I do not think is a good
>>> idea, thus I'd be rather more strict, than less
>>
>> You understand correctly, and I would prefer the stricter approach as
>> well.
>> But the current parsing, which has no check, does not distinguish between
>> CR+LF and LF+CR; both variants work transparently.
>> So by using the less strict check, we limit the failure return to the
>> cases we know are actually wrong, without rejecting this 'semi-legit'
>> LF+CR case.
>> That was my reasoning for the patch. Nevertheless, I have added the
>> strict patch to this mail.
>>
>> Greetings,
>> F.E.
>>
>>
>> On Mon, 4 Mar 2019 at 17:12, zyx <[email protected]> wrote:
>>
>>> On Mon, 2019-03-04 at 16:21 +0100, F. E. wrote:
>>> > empty1 and empty2 are two characters for holding CR ('\r') and
>>> > LF ('\n'). So when an xref entry
>>> > is missing the CR (or the LF), empty2 will hold the first character of
>>> > the next xref entry. So why not check whether the two empty variables
>>> > hold the required characters, and fail when they don't:
>>>
>>>         Hi,
>>> I'm fine with the change, but...
>>>
>>> > > if ( read != 5 || ( empty1 != '\r' ) ||  ( empty2 != '\n' ) )
>>> > >
>>> >
>>> > Or a little bit less strict (allowing LF+CR):
>>> >
>>> > > if ( read != 5 || ( empty1 != '\r' && empty2 != '\r' ) ||  ( empty1
>>> > > != '\n' && empty2 != '\n' )   )
>>>
>>> ... if I understand that correctly, then the proposed change allows
>>> "\r\n" and "\n\r" as a line separator, which I do not think is a good
>>> idea, thus I'd be rather more strict, than less. There might be some
>>> tricks for recovering from broken XRef tables, of which PoDoFo has
>>> few, if any at all.
>>>
>>> I do not know what others' opinions are.
>>>         Bye,
>>>         zyx
>>>
>>>
>>>
>>> _______________________________________________
>>> Podofo-users mailing list
>>> [email protected]
>>> https://lists.sourceforge.net/lists/listinfo/podofo-users
>>>
>>
>

