Thanks, Tilman.

"This happens twice during build tests, of over 100 text extractions."

Thanks for explaining... From the code comment, I thought this was a
general behavior of JDK7+ sort, but it sounds like it is only a problem in
a rare edge case of specific compares. How exciting.
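
For the archive, here is a minimal sketch of the fallback pattern as I
understand it from the explanation: try the built-in sort first, and only
if it throws IllegalArgumentException fall back to a plain merge sort that
never checks the comparator contract. The class and method names are
hypothetical, and this is not the actual PDFBox code.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper for illustration only; not PDFBox's implementation.
final class FallbackSort {
    private FallbackSort() {
    }

    // Sorts 'list' in place. The built-in sort (TimSort) is tried first on a
    // copy; it may throw IllegalArgumentException when it detects that the
    // comparator is inconsistent (for example, not transitive).
    static <T> void sort(List<T> list, Comparator<? super T> cmp) {
        List<T> result = new ArrayList<>(list);
        try {
            result.sort(cmp); // fast path, used almost all of the time
        } catch (IllegalArgumentException e) {
            // Slow path: restart from the untouched original order.
            result = mergeSort(new ArrayList<>(list), cmp);
        }
        for (int i = 0; i < result.size(); i++) {
            list.set(i, result.get(i));
        }
    }

    // Simple top-down merge sort. It never throws for an inconsistent
    // comparator; the resulting order is then just "best effort".
    static <T> List<T> mergeSort(List<T> in, Comparator<? super T> cmp) {
        if (in.size() < 2) {
            return in;
        }
        int mid = in.size() / 2;
        List<T> left = mergeSort(new ArrayList<>(in.subList(0, mid)), cmp);
        List<T> right = mergeSort(new ArrayList<>(in.subList(mid, in.size())), cmp);
        List<T> out = new ArrayList<>(in.size());
        int i = 0;
        int j = 0;
        while (i < left.size() && j < right.size()) {
            // "<= 0" takes from the left half on ties, which keeps the sort stable.
            if (cmp.compare(left.get(i), right.get(j)) <= 0) {
                out.add(left.get(i++));
            } else {
                out.add(right.get(j++));
            }
        }
        while (i < left.size()) {
            out.add(left.get(i++));
        }
        while (j < right.size()) {
            out.add(right.get(j++));
        }
        return out;
    }
}
```

Sorting a copy costs an extra allocation, but it means a sort that fails
halfway cannot leave the caller's list in a half-sorted state.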

Cheers,

K

On Tue, Dec 17, 2024, 2:22 AM Tilman Hausherr <thaush...@t-online.de> wrote:

> Hi,
>
> Thank you, the change has been committed.
> re 1: we'll see what happens... re "but it is code that needs to be
> maintained" - that is a general problem. Sometimes it's even difficult
> to maintain one's own code.
> re 2: No, because most of the time, the faster built-in sort works fine.
> The slower mergesort is only used when the exception is thrown. This
> happens twice during build tests, of over 100 text extractions.
>
> Tilman
>
> On 16.12.2024 15:55, Kevin Day wrote:
> > I am attaching the patch file.
> >
> > And yes, this patch is simply PDFBOX-3774 as an option, a small
> > cosmetic change to use idiomatic Java for PDFBOX-5487, and a unit test
> > that demonstrates the overlapping.
> >
> >
> > A couple of additional thoughts:
> >
> > 1.  I feel that PDFBOX-5487 isn't doing very much.  The PDFBOX-3774
> > feature will address the problem fixed by PDFBOX-5487, and the
> > "problem" of having a space glyph entirely within the previous
> > character is a very restricted edge-case.  In the end, the performance
> > hit is not a big deal, but it is code that needs to be maintained.  I
> > thought I'd mention it in case the PDFBOX-5487 requester would be
> > happy with PDFBOX-3774 as a solution.
> >
> > 2.  I noticed that there is a note about JDK7+ sorting
> > requiring transitive comparators.  Given that the build requires
> > JDK8+, I wonder if it is time to remove the Collections.sort path (and
> > get rid of an exception throw, etc...)?
> >
> > - K
> >
> >
> >
> > On Mon, Dec 16, 2024 at 6:21 AM Tilman Hausherr
> > <thaush...@t-online.de> wrote:
> >
> >     On 16.12.2024 14:02, Kevin Day wrote:
> >     > I just realized that there is an incorrect note in the
> >     > getter/setter Javadocs about the setting only taking effect if
> >     > sorting is enabled.
> >     >
> >     > That note can be removed. The new setting is valid regardless of
> >     > whether sorting is enabled.
> >
> >     Hi,
> >
> >     Could you please resend the patch as text attachment? Somehow the
> >     mail program messed this up.
> >
> >     From what I understand, the patch is the suggestion from
> >     PDFBOX-3774 but as an option, plus a test. The other change (re
> >     PDFBOX-5487) is a (useful) cosmetic change. I wonder why I missed
> >     that when I committed it.
> >
> >     Tilman
> >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail:users-unsubscr...@pdfbox.apache.org
> > For additional commands, e-mail:users-h...@pdfbox.apache.org
>
>
