> On Mar 25, 2025, at 4:06 PM, Dennis Snell <dennis.sn...@automattic.com> wrote:
> 
> 
>> On Mar 25, 2025, at 3:23 PM, Máté Kocsis <kocsismat...@gmail.com> wrote:
>> 
>> 
>> Hi Dennis,
>> 
>> 
>>> I am myself also a bit lost on the countless names that I tried out in the 
>>> implementation, but I think I had toHumanFriendlyString() and 
>>> toDisplayFriendlyString() methods at some point. These then ended up being 
>>> toString() and toDisplayString() after some iterations. I would be ok with 
>>> renaming getHost() and toString() so that their names suggest they don't 
>>> use IDNA, but I'd clearly need a good enough suggestion, since neither 
>>> "MachineFriendly", nor "NonDisplayable" sound like the best alternative for 
>>> me. I was also considering using getIdnaHost() and toIdnaString(), but I 
>>> realized these are the worst looking names I have come up with so far.
>>> 
>>> 
>>> 
>> 
>> What about getPunycodeHost(), getUnicodeHost(), toPunycodeString(), 
>> toUnicodeString()? Or getAsciiHost() and toAsciiString() may also work. 
>> These are the best names I managed to come up with so far.
>> 
>> 
>> In the meantime, I renamed RFC 3986's toString() methods too according to 
>> another suggestion:
>> - toString() became toRawString()
>> - toNormalizedString() became toString()
>> 
>> 
>> The new names mirror exactly what their getter counterparts do.
>> 
>> 
>> Máté 
>> 
>> 
> 
> Hi Máté,
> 
> 
> I’ve been pondering these names for the past week and a half and I couldn’t 
> think of anything, but at first glance I like getUnicodeHost() and 
> getAsciiHost(). These communicate the nuance a little, though they aren’t 
> totally in-your-face (in this case I wish there were a more obvious pair 
> that is).
> 
> 
> Other pairs I was toying with but don’t like are:
>  - getPrintHost() / getDataHost()
>  - getDisplayHost() / getAPIHost()
>  - getDisplayHost() / getEncodedHost()
>  - getDisplayHost() / getEscapedHost()
> 
> 
> (the same pairs would apply to the other methods, like toDisplayString() / 
> toEncodedString())
> 
> 
> This seems to be taking a lot of effort and time, but thank you still for 
> engaging with it — naming is hard! But it’s worth it.
> 
> 

Just for fun I tossed this prompt into DeepSeek-R1 671B:


> WHATWG URLs have two representations: one for humans and one for machines.
> The reason for having two is that URLs may have IDNA domains which are
> punycode encoded and there are security issues around showing that to humans.
> For example, if a person reads "https://xn--google.com" they may assume that
> the domain belongs to Google, when in fact it points to "https://䕮䕵䕶䕱.com".
> You are a modern programming language designer working on a standard library
> to expose a URL parser and you want the interface of this library to educate
> developers on where to use the appropriate representation. Given a URL object
> $u of class URL, propose two methods for converting that URL to a string. The
> names of the methods should communicate their use, and when a developer
> searches for the right method to get the string form, they should not be
> presented with a non-prefixed and prefixed pair like toString() and
> toHumanString(). Instead, the method names should form a kind of symmetric
> pair like toEncodedString() and toDisplayString(). Use your knowledge of
> WHATWG URL nuances, browser security issues, human developers making typical
> mistakes, and propose at least ten pairs of words that could be used for
> returning these two different representations.


A few of the ideas it returned that stood out were:


 - toDataString() / toViewString() and getDataHost() / getViewHost()
 - toSerializedString() / toReadableString() and getSerializedHost() / getReadableHost()
 - toProcessingString() / toSafeDisplayString() and getProcessingHost() / getSafeDisplayHost()
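
To make the idea of a symmetric pair a bit more concrete, here is a rough sketch in Node rather than PHP. The `HostNames` class is hypothetical and exists only for illustration; the two method names are the ones floated earlier in this thread, and `domainToUnicode()` is the existing helper from Node's `url` module. It is not a proposal for how the PHP implementation should work.

```javascript
const { domainToUnicode } = require("node:url");

// Hypothetical wrapper: neither accessor is the "default", so the caller
// has to pick a representation on purpose.
class HostNames {
  constructor(input) {
    // The WHATWG parser already stores the host in its ASCII (Punycode) form.
    this.url = new URL(input);
  }

  // Machine-facing form: safe for comparisons, storage, and network calls.
  getAsciiHost() {
    return this.url.hostname;
  }

  // Human-facing form: decoded for display only.
  getUnicodeHost() {
    return domainToUnicode(this.url.hostname);
  }
}

const names = new HostNames("https://münchen.de/");
console.log(names.getAsciiHost());   // xn--mnchen-3ya.de
console.log(names.getUnicodeHost()); // münchen.de
```

With no bare `getHost()` on offer, there is nothing to reach for by accident.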


After checking in the Gecko source code, I sadly only found helper methods 
which take a URL/URI and transform them:


 - prepareUrlForDisplay()
 - unEscapeURIForUI()


Node seems to punt on this by providing `url.format()`, which accepts a URL 
object and a `{ unicode: boolean }` option. These all seem to miss the mark, 
in my opinion, because of how easy it is to assume that `toString()` or 
`.host` is what you’re after.
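
For reference, a small sketch of what that looks like in Node. The example domain and the variable names are chosen only for illustration. The default serialization silently gives the ASCII form, and the Unicode form is only reachable through an option on a separate function:

```javascript
const url = require("node:url");

const u = new URL("https://münchen.de/path");

// The "obvious" accessors all return the Punycode-encoded form.
const asciiForm = u.toString();
console.log(asciiForm); // https://xn--mnchen-3ya.de/path
console.log(u.host);    // xn--mnchen-3ya.de

// The display form requires knowing about url.format() and its option.
const displayForm = url.format(u, { unicode: true });
console.log(displayForm); // https://münchen.de/path
```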


Thanks for entertaining the extra follow-up here.


Warmly,
Dennis Snell
