> On 04/19/2025 1:14 AM PDT Hal Murray via devel <devel@ntpsec.org> wrote:
>
> We allow/require UTF-8 rather than simple ASCII.  I know we need that to
> get the character for micro, as in microseconds.  Do we need it for
> anything else?

We should be able to get away with something closer to ASCII if we encode
micro and the like as escape sequences or code points, such as "\u00b5" or
"\xb5". We might still want Unicode for contributor names later.

> I saw a note recently about AI being susceptible to hiding evil code in
> invisible unicode.
>
> New Vulnerability in GitHub Copilot and Cursor: How Hackers Can Weaponize
> Code Agents
> https://www.pillar.security/blog/new-vulnerability-in-github-copilot-and-cursor-how-hackers-can-weaponize-code-agents
>
> -----
>
> Is there a package we should be using that checks code for invisible unicode?

I feel compelled to mention (dang NIH*) filescan[1], which is something I
wrote for gpsd to detect higher code points, tabs, and trailing whitespace.
I have not looked at that blog post yet, but a more focussed tool written by
someone else would generally be more appropriate.

* Not Invented Here
[1] https://gitlab.com/gpsd/gpsd/-/blob/master/devtools/filescan
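
To make the kind of check discussed above concrete, here is a minimal Python
sketch. It is not filescan and it is not taken from the linked blog post; the
names scan_text, scan_file, and MICRO, and the choice of which Unicode
categories to flag, are assumptions made purely for illustration. It reports
trailing whitespace, tabs, invisible or control characters, and any other
non-ASCII character that is not explicitly allowed.

```python
import unicodedata

# MICRO SIGN written as an escape, so this source file itself stays ASCII.
# "\u00b5" and "\xb5" denote the same character in Python.
MICRO = "\u00b5"

# Unicode general categories treated as invisible or control characters
# (an assumption for this sketch): format, control, private use, unassigned.
HIDDEN_CATEGORIES = ("Cf", "Cc", "Co", "Cn")


def scan_text(text, allowed=frozenset()):
    """Return a list of (line, column, message) tuples, both 1-based."""
    findings = []
    for line_no, line in enumerate(text.split("\n"), start=1):
        line = line.rstrip("\r")  # tolerate CRLF line endings
        stripped = line.rstrip(" \t")
        if stripped != line:
            findings.append((line_no, len(stripped) + 1, "trailing whitespace"))
        for col_no, ch in enumerate(line, start=1):
            if ch in allowed:
                continue
            if ch == "\t":
                findings.append((line_no, col_no, "tab"))
            elif unicodedata.category(ch) in HIDDEN_CATEGORIES:
                findings.append(
                    (line_no, col_no,
                     "invisible or control character U+%04X" % ord(ch)))
            elif ord(ch) > 0x7F:
                findings.append(
                    (line_no, col_no, "non-ASCII character U+%04X" % ord(ch)))
    return findings


def scan_file(path, allowed=frozenset()):
    """Scan one UTF-8 file and return its findings."""
    with open(path, encoding="utf-8") as handle:
        return scan_text(handle.read(), allowed)
```

Calling scan_file(path, allowed={MICRO}) on each source file would let the
micro sign through while still flagging zero-width and bidirectional control
characters. A real tool would likely also want a per-file allow list, for
example for contributor names.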