Branch: refs/heads/blead
  Home:   https://github.com/Perl/perl5
  Commit: 5a82da71ad75ec4f39c02d3b4818f97b79503595
      
https://github.com/Perl/perl5/commit/5a82da71ad75ec4f39c02d3b4818f97b79503595
  Author: Karl Williamson <[email protected]>
  Date:   2024-12-04 (Wed, 04 Dec 2024)

  Changed paths:
    M utf8.c

  Log Message:
  -----------
  ruler


  Commit: efff4c066d5cc28427c7406b6e0791b341177b2b
      
https://github.com/Perl/perl5/commit/efff4c066d5cc28427c7406b6e0791b341177b2b
  Author: Karl Williamson <[email protected]>
  Date:   2024-12-04 (Wed, 04 Dec 2024)

  Changed paths:
    M utf8.c

  Log Message:
  -----------
  utf8_to_uv_msgs: Hoist common paradigm out of switch

Previously this variable was initialized to 0 and then set to another
value in each case in the switch.  But most of the cases set it to the
same value.  Might as well initialize it to that value, remove the
statements that merely repeat that, and leave in the cases that set it
to something else.
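
As a rough illustration of that kind of hoisting (not the actual utf8.c
code; the enum and function names below are hypothetical), the change
has this general shape:

```c
#include <stdbool.h>

/* Hypothetical problem kinds; these are NOT the identifiers in utf8.c. */
enum problem {
    PROBLEM_SHORT,
    PROBLEM_OVERLONG,
    PROBLEM_SURROGATE,
    PROBLEM_NONCHAR
};

/* Before: start from a dummy value and assign in every single case. */
bool disallowed_before(enum problem p)
{
    bool disallowed = false;

    switch (p) {
        case PROBLEM_SHORT:     disallowed = true;  break;
        case PROBLEM_OVERLONG:  disallowed = true;  break;
        case PROBLEM_SURROGATE: disallowed = true;  break;
        case PROBLEM_NONCHAR:   disallowed = false; break;
    }

    return disallowed;
}

/* After: initialize to the most common value; only the exceptions
 * assign anything. */
bool disallowed_after(enum problem p)
{
    bool disallowed = true;

    switch (p) {
        case PROBLEM_NONCHAR: disallowed = false; break;
        default:              break;
    }

    return disallowed;
}
```

Both functions return the same result for every input; the second just
has fewer statements to read.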


  Commit: 79c191f131ec8235608676365349c86a80fc2852
      
https://github.com/Perl/perl5/commit/79c191f131ec8235608676365349c86a80fc2852
  Author: Karl Williamson <[email protected]>
  Date:   2024-12-04 (Wed, 04 Dec 2024)

  Changed paths:
    M utf8.c

  Log Message:
  -----------
  utf8_to_uv_msgs: Reorder case statements in switch()

This just moves these around so that they are roughly in increasing
order of complexity, which makes it easier to grok.

This doesn't change the order the cases are called in when there are
multiple ones.  The order is dictated by what the switch() does first,
which isn't affected by this commit.
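
A minimal sketch of why the textual order of the cases doesn't matter,
assuming the switch() sits inside a loop that picks one pending problem
bit at a time (the log implies this but doesn't show it; every name
below is hypothetical):

```c
#include <stddef.h>

/* Hypothetical problem bits; not the flags defined in utf8.h. */
#define PROB_A 0x1u
#define PROB_B 0x2u
#define PROB_C 0x4u

/* Handles each set bit in 'problems', lowest bit first, recording the
 * order of handling in 'order'.  Returns how many were recorded. */
size_t handle_problems(unsigned problems, unsigned order[], size_t max)
{
    size_t n = 0;

    while (problems != 0 && n < max) {
        /* Isolate the lowest set bit; this, not the case order below,
         * decides which problem is handled next. */
        unsigned lowest = problems & (~problems + 1u);
        problems &= ~lowest;

        /* The cases are deliberately listed out of numeric order. */
        switch (lowest) {
            case PROB_C: order[n++] = PROB_C; break;
            case PROB_A: order[n++] = PROB_A; break;
            case PROB_B: order[n++] = PROB_B; break;
            default:     break;
        }
    }

    return n;
}
```

Shuffling the case labels leaves the recorded order unchanged, because
the loop chooses the bit before the switch() is entered.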


  Commit: f174f6e1bbe1da189497c1ea868e5b9dfd0fa3d4
      
https://github.com/Perl/perl5/commit/f174f6e1bbe1da189497c1ea868e5b9dfd0fa3d4
  Author: Karl Williamson <[email protected]>
  Date:   2024-12-04 (Wed, 04 Dec 2024)

  Changed paths:
    M utf8.c

  Log Message:
  -----------
  utf8_to_uv_msgs: Hoist common paradigm out of switch()

Instead of initializing this to 0 and then setting it in each case of
the switch, initialize it to the most common value, remove those
statements that set it to that, leaving the rest alone.


  Commit: e6954d821e8d09d72c02ccc5da5a72710d9e5409
      
https://github.com/Perl/perl5/commit/e6954d821e8d09d72c02ccc5da5a72710d9e5409
  Author: Karl Williamson <[email protected]>
  Date:   2024-12-04 (Wed, 04 Dec 2024)

  Changed paths:
    M utf8.c
    M utf8.h

  Log Message:
  -----------
  utf8_to_uv_msgs: Use a single bit flag

This condition was expressed as a combination of two bits, which makes
things a little awkward and isn't really needed.
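
The log doesn't say which flags are involved, so the following is only
a sketch with made-up flag names, contrasting a condition encoded as
two bits with one encoded as a dedicated bit:

```c
#include <stdbool.h>

/* Hypothetical flags; the real ones live in utf8.h and differ. */
#define FLAG_DISALLOW_X 0x01u
#define FLAG_WARN_X     0x02u

/* Before: the condition is "both of these bits are set". */
#define FLAG_X_COMBINED (FLAG_DISALLOW_X | FLAG_WARN_X)

/* After: the condition has a bit of its own. */
#define FLAG_X_SINGLE   0x04u

/* Testing a two-bit combination needs a mask AND a comparison, since
 * having only one of the two bits set must not count. */
bool combined_is_set(unsigned flags)
{
    return (flags & FLAG_X_COMBINED) == FLAG_X_COMBINED;
}

/* Testing a single bit is a plain mask. */
bool single_is_set(unsigned flags)
{
    return (flags & FLAG_X_SINGLE) != 0;
}
```

A single bit can also be used directly as a case label in a switch()
over isolated bits, which a two-bit combination cannot.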


  Commit: a3b31bf912380d4d7f5a87237d1e1d7a3e8d538e
      
https://github.com/Perl/perl5/commit/a3b31bf912380d4d7f5a87237d1e1d7a3e8d538e
  Author: Karl Williamson <[email protected]>
  Date:   2024-12-04 (Wed, 04 Dec 2024)

  Changed paths:
    M utf8.c

  Log Message:
  -----------
  utf8n_to_uv_msgs: Avoid unnecessary work

The outermost block here is executed when any of three types of
problematic Unicode code points is encountered, and the caller has
indicated special handling of at least one of those types.  Before this
commit, we set a flag so that later code would check whether what was
encountered matched a type the caller specified.  This commit instead
does that check at the point where the flag used to be set, and sets
the flag only if necessary.  This may completely avoid the later work,
which has set-up overhead, and it will make future commits simpler.
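
In outline, and with hypothetical names standing in for the real flags
and variables, the change amounts to moving the "does the caller care?"
test forward:

```c
/* Hypothetical kinds of problematic code points. */
#define KIND_SURROGATE 0x1u
#define KIND_NONCHAR   0x2u
#define KIND_SUPER     0x4u

/* Before: always record what was found; later code must work out
 * whether the caller asked about it. */
unsigned pending_before(unsigned found, unsigned wanted)
{
    (void) wanted;      /* not consulted until the later pass */
    return found;
}

/* After: consult the caller's request immediately, and record the
 * problem only if it is one the caller asked about. */
unsigned pending_after(unsigned found, unsigned wanted)
{
    unsigned pending = 0;

    if (found & wanted) {
        pending |= found;
    }

    return pending;
}
```

When pending_after() returns 0, the later, more expensive pass has
nothing to do and can be skipped altogether.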


  Commit: 8195a3462db52605271265cdc4bd2e8dcb59df68
      
https://github.com/Perl/perl5/commit/8195a3462db52605271265cdc4bd2e8dcb59df68
  Author: Karl Williamson <[email protected]>
  Date:   2024-12-04 (Wed, 04 Dec 2024)

  Changed paths:
    M utf8.c

  Log Message:
  -----------
  utf8_to_uv_msgs: Comments, white-space only

This adds extensive comments and adjusts white space a bit.


  Commit: c2009f14c2bbabd101f0bb87a40562f95aadae1e
      
https://github.com/Perl/perl5/commit/c2009f14c2bbabd101f0bb87a40562f95aadae1e
  Author: Karl Williamson <[email protected]>
  Date:   2024-12-04 (Wed, 04 Dec 2024)

  Changed paths:
    M utf8.c

  Log Message:
  -----------
  utf8_to_uv_msgs: Avoid extra deref pointer writes

This function is passed an address of where to write some values.
Instead of updating the dereferenced pointer each time, do it once after
all the information is accumulated.

This allows the values to be updated without updating the pointed-to
variable, which lets some cases in a switch() be more uniform.
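
A small sketch of the pattern, using invented names rather than the
function's real parameters: accumulate into a local and store through
the caller's pointer exactly once.

```c
#include <stddef.h>

/* Hypothetical error bits. */
#define ERR_A 0x1u
#define ERR_B 0x2u

/* 'errors' may be NULL if the caller doesn't want the information
 * (an assumption made for this sketch). */
void collect_errors(unsigned problems, unsigned *errors)
{
    unsigned this_errors = 0;   /* local accumulator */

    /* Each step updates only the local, so no step needs to know
     * whether 'errors' is NULL. */
    if (problems & ERR_A) {
        this_errors |= ERR_A;
    }
    if (problems & ERR_B) {
        this_errors |= ERR_B;
    }

    /* Single write through the pointer, after everything is known. */
    if (errors != NULL) {
        *errors = this_errors;
    }
}
```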


  Commit: 25edbe888b0c1e4b668bb874584cb944d98c12d5
      
https://github.com/Perl/perl5/commit/25edbe888b0c1e4b668bb874584cb944d98c12d5
  Author: Karl Williamson <[email protected]>
  Date:   2024-12-04 (Wed, 04 Dec 2024)

  Changed paths:
    M utf8.c

  Log Message:
  -----------
  utf8_to_uv_msgs: Move premature setting to later

This commit moves the final two cases of setting up the return to be the
REPLACEMENT_CHARACTER to later in the code, where all such malformations
are handled.

This makes the handling uniform for a bunch of cases, which will enable
a future commit to combine them.


  Commit: 37abfd580010f08d754302854b4c5d375607aefa
      
https://github.com/Perl/perl5/commit/37abfd580010f08d754302854b4c5d375607aefa
  Author: Karl Williamson <[email protected]>
  Date:   2024-12-04 (Wed, 04 Dec 2024)

  Changed paths:
    M utf8.c

  Log Message:
  -----------
  utf8.c: Create a #define

This creates a #define for the string held in a string array constant.
The constant exists to avoid having this common string appear in
multiple places, but there is also a benefit to using the string
separately in one place, which this commit does: it makes a call to
Perl_form() consistent with two other calls that the next commit will
combine.
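
One way such a #define can help, sketched here with snprintf() and a
placeholder string rather than Perl_form() and the real message: a
macro that expands to a string literal can be pasted into a format
string at compile time, while an array has to go through "%s".

```c
#include <stdio.h>

/* Placeholder text and names, chosen for this sketch only. */
#define EXAMPLE_TEXT "Malformed example"

/* The array is still defined once, now from the macro. */
static const char example_text[] = EXAMPLE_TEXT;

/* Using the array: the text is supplied as a "%s" argument. */
int format_with_array(char *buf, size_t size, int detail)
{
    return snprintf(buf, size, "%s (detail %d)", example_text, detail);
}

/* Using the macro: adjacent string literals are concatenated by the
 * compiler, so the text becomes part of the format itself. */
int format_with_define(char *buf, size_t size, int detail)
{
    return snprintf(buf, size, EXAMPLE_TEXT " (detail %d)", detail);
}
```

Both produce the same output; the second form lets several calls share
one argument list shape, which is the kind of consistency described
above.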


  Commit: cc59aa43431eac24596c2a38d59b568b54fcf132
      
https://github.com/Perl/perl5/commit/cc59aa43431eac24596c2a38d59b568b54fcf132
  Author: Karl Williamson <[email protected]>
  Date:   2024-12-04 (Wed, 04 Dec 2024)

  Changed paths:
    M utf8.c

  Log Message:
  -----------
  utf8_to_uv_msgs: Revamp above Unicode code points handling

This is the most complicated of the problematic UTF-8 conditions.  There
are three types of these, each with a different warning message, and any
combination of them can be set to be warned about or not, or to have an
error bit returned.  More rigorous testing, yet to be committed, has
indicated that there are bugs in the existing implementation, which this
commit fixes.

The three cases are partially combined, so that if one case finds it is
not authorized to handle things, it drops to the next lower severity
one.
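
The "drop to the next lower severity" idea can be sketched with case
fall-through.  The levels and flags below are hypothetical and say
nothing about what the three real types are:

```c
/* Hypothetical severity levels, most severe last. */
enum level { LEVEL_NONE, LEVEL_LOW, LEVEL_MID, LEVEL_HIGH };

/* Hypothetical bits saying which levels the caller enabled. */
#define HANDLE_LOW  0x1u
#define HANDLE_MID  0x2u
#define HANDLE_HIGH 0x4u

/* 'worst' is the most severe level that applies to the input.  Returns
 * the level whose handling is actually used, or LEVEL_NONE. */
enum level pick_handler(enum level worst, unsigned enabled)
{
    switch (worst) {
        case LEVEL_HIGH:
            if (enabled & HANDLE_HIGH) {
                return LEVEL_HIGH;
            }
            /* FALLTHROUGH: not authorized here, try the next lower */
        case LEVEL_MID:
            if (enabled & HANDLE_MID) {
                return LEVEL_MID;
            }
            /* FALLTHROUGH */
        case LEVEL_LOW:
            if (enabled & HANDLE_LOW) {
                return LEVEL_LOW;
            }
            /* FALLTHROUGH */
        case LEVEL_NONE:
            break;
    }

    return LEVEL_NONE;
}
```

Entering at the most severe applicable case means a less severe
handler is only reached when every more severe one declined.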


Compare: https://github.com/Perl/perl5/compare/30a8f9523f06...cc59aa43431e
