On Fri, Jun 19, 2026 at 19:54, سپهر محمودی <[email protected]> wrote:
>
> Hello everyone,
> Over the past few weeks I have been exploring a common pattern that
> frequently appears in PHP applications: masking sensitive parts of
> strings such as credit card numbers, email addresses, phone numbers,
> and personal identifiers.
>
> In many real-world codebases, developers typically implement masking
> using combinations of functions like substr(), strlen(), str_repeat(),
> substr_replace(), or their multibyte equivalents. While these
> approaches work, they often lead to repetitive, error‑prone, and
> sometimes inefficient user‑land implementations. Handling edge
> cases—especially when offsets are negative, lengths are omitted, or
> when working with Unicode text—can make these snippets unnecessarily
> complex.
>
> While thinking about this problem, I designed a function concept
> called grapheme_mask(). The goal of this function is to provide a
> clear, native, and Unicode‑safe way to mask sections of a string.
>
> The key idea is that the function operates on grapheme clusters,
> rather than raw bytes or individual code points. This allows it to
> correctly handle modern Unicode text, including composed characters
> and emoji sequences, without breaking them apart.
>
> Conceptually, the function replaces a range of grapheme clusters with
> a masking string.
>
> Example:
>
> grapheme_mask("[email protected]", "*", 2, -12);
> // result: se****@example.com
>
> Example with emoji sequences:
>
> grapheme_mask("👨🏽‍👩‍👧‍👦 family", "*", 0, 1);
> // result: * family
>
> The intention is not to replace existing string functions, but to
> provide a dedicated and expressive helper for a task that developers
> routinely implement themselves.
>
> If there is interest from the community, I would be happy to draft a
> formal RFC describing the proposed behavior, edge cases, and potential
> implementation details.
>
> I would greatly appreciate any feedback, thoughts, or suggestions.
>
> Best regards,
>
> Sepehr

Hi Sepehr and Internals,

Thank you for starting this discussion.
The proposal looks good to me.
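
To make the discussion concrete, here is a rough userland sketch of how
I read the proposed behavior. It is only an approximation under my own
assumptions, not the proposed implementation: the name
userland_grapheme_mask() is hypothetical, offset and length are assumed
to clamp the way substr() does, and the mask string is assumed to be
repeated once per masked grapheme cluster (which is what the examples in
the proposal suggest). It requires the intl extension.

```php
<?php
// Hypothetical userland approximation of the proposed grapheme_mask().
// Assumptions: substr()-style offset/length clamping, and one copy of
// $mask per masked grapheme cluster.
function userland_grapheme_mask(
    string $string,
    string $mask,
    int $offset = 0,
    ?int $length = null
): string {
    $total = grapheme_strlen($string);
    if ($total === false || $total === null) {
        throw new InvalidArgumentException('Input is not valid UTF-8');
    }

    // Resolve the start index into the range [0, $total].
    $start = $offset < 0 ? max($total + $offset, 0) : min($offset, $total);

    // Resolve the end index into the range [$start, $total].
    if ($length === null) {
        $end = $total;
    } elseif ($length < 0) {
        $end = max($total + $length, $start);
    } else {
        $end = min($start + $length, $total);
    }

    $count = $end - $start;
    if ($count === 0) {
        return $string; // nothing to mask
    }

    // Cut on grapheme cluster boundaries, never inside a cluster.
    $head = $start > 0 ? grapheme_substr($string, 0, $start) : '';
    $tail = $end < $total ? grapheme_substr($string, $end) : '';

    return $head . str_repeat($mask, $count) . $tail;
}

// Keep only the last four digits of a (made-up) card number.
echo userland_grapheme_mask('4111111111111111', '*', 0, -4), "\n";
// ************1111
```

A native function would avoid the three separate passes over the string
that this sketch needs.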

One more point in favor of adding this function:
a character with a diacritical mark can be encoded either as a single
precomposed code point or as a base character followed by a separate
combining code point.
Examples include the umlaut (ä, or a + ¨) and the dakuten (が, or か + ゛),
among others.
grapheme_mask() needs to treat both forms as a single character.
For that reason, I would like to have this function.
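
A small illustration of why this matters (a minimal sketch, assuming
the intl and mbstring extensions are loaded; the sample strings are my
own and use the combining forms U+0308 and U+3099):

```php
<?php
// "äbc" written as a + COMBINING DIAERESIS (U+0308), then "bc".
$decomposed = "a\u{0308}bc";
// "が" written as か (U+304B) + COMBINING DAKUTEN (U+3099).
$ga = "\u{304B}\u{3099}";

// Code points versus grapheme clusters.
var_dump(mb_strlen($decomposed, 'UTF-8'));  // int(4)
var_dump(grapheme_strlen($decomposed));     // int(3)
var_dump(mb_strlen($ga, 'UTF-8'));          // int(2)
var_dump(grapheme_strlen($ga));             // int(1)

// Masking the first "character" by code point leaves the combining
// mark behind, so it attaches to the mask character.
$byCodePoint = '*' . mb_substr($decomposed, 1, null, 'UTF-8');
var_dump($byCodePoint === "*\u{0308}bc");   // bool(true)

// Masking by grapheme cluster removes the base and the mark together.
$byGrapheme = '*' . grapheme_substr($decomposed, 1);
var_dump($byGrapheme === '*bc');            // bool(true)
```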

Regards
Yuya


-- 
---------------------------
Yuya Hamada (tekimen)
- https://tekitoh-memdhoi.info
- https://github.com/youkidearitai
-----------------------------
