> On Mar 25, 2025, at 4:06 PM, Dennis Snell <dennis.sn...@automattic.com> wrote:
>
>
>> On Mar 25, 2025, at 3:23 PM, Máté Kocsis <kocsismat...@gmail.com> wrote:
>>
>>
>> Hi Dennis,
>>
>>
>>> I am myself also a bit lost on the countless names that I tried out in the
>>> implementation, but I think I had toHumanFriendlyString() and
>>> toDisplayFriendlyString() methods at some point. These then ended up being
>>> toString() and toDisplayString() after some iterations. I would be ok with
>>> renaming getHost() and toString() so that their names suggest they don't
>>> use IDNA, but I'd clearly need a good enough suggestion, since neither
>>> "MachineFriendly", nor "NonDisplayable" sound like the best alternative for
>>> me. I was also considering using getIdnaHost() and toIdnaString(), but I
>>> realized these are the worst looking names I have come up with so far.
>>>
>>>
>>>
>>
>> What about getPunycodeHost(), getUnicodeHost(), toPunycodeString(),
>> toUnicodeString()? Or getAsciiHost() and toAsciiString() may also work.
>> These are the best names I managed to come up with so far.
>>
>>
>> In the meantime, I renamed RFC 3986's toString() methods too according to
>> another suggestion:
>> - toString() became toRawString()
>> - toNormalizedString() became toString()
>>
>>
>> The new names mirror exactly what their getter counterparts do.
>>
>>
>> Máté
>>
>>
>
> Hi Máté,
>
>
> I’ve been pondering these names for the past week and a half and couldn’t
> think of anything, but at first glance I like getUnicodeHost() and
> getAsciiHost(). These communicate a bit of the nuance, though they aren’t
> totally in-your-face (in this case I wish there were a more obvious pair
> that is).
>
>
> Other pairs I was toying with but don’t like are:
> - getPrintHost() / getDataHost()
> - getDisplayHost() / getAPIHost()
> - getDisplayHost() / getEncodedHost()
> - getDisplayHost() / getEscapedHost()
>
>
> (the same pairs would apply to the other methods, like toDisplayString() /
> toEncodedString())
>
>
> This seems to be taking a lot of effort and time, but thank you still for
> engaging with it — naming is hard! But it’s worth it.
>
>
Just for fun I have tossed this into DeepSeek-R1 671B with the following
prompt:

> WHATWG URLs have two representations: one for humans and one for machines.
> The reason for having two is that URLs may have IDNA domains which are
> punycode encoded and there are security issues around showing that to humans.
> For example, if a person reads "https://xn--google.com" they may assume that
> the domain belongs to Google, when in fact it points to "https://䕮䕵䕶䕱.com".
> You are a modern programming language designer working on a standard library
> to expose a URL parser and you want the interface of this library to educate
> developers on where to use the appropriate representation. Given a URL object
> $u of class URL, propose two methods for converting that URL to a string. The
> names of the methods should communicate their use, and when a developer
> searches for the right method to get the string form, they should not be
> presented with a non-prefixed and prefixed pair like toString() and
> toHumanString(). Instead, the method names should form a kind of symmetric
> pair like toEncodedString() and toDisplayString(). Use your knowledge of
> WHATWG URL nuances, browser security issues, human developers making typical
> mistakes, and propose at least ten pairs of words that could be used for
> returning these two different representations.

A few of the ideas it returned that stood out were:
- toDataString() / toViewString() and getDataHost() / getViewHost()
- toSerializedString() / toReadableString() and getSerializedHost() /
  getReadableHost()
- toProcessingString() / toSafeDisplayString() and getProcessingHost() /
  getSafeDisplayHost()
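
To make the first pair concrete, here is a minimal sketch of how such a symmetric API could read in calling code. It is written in JavaScript against Node's built-in `url` module purely as a convenient test bed. The `PairedUrl` class is hypothetical and is not part of the RFC or of any library; only the method names come from the list above.

```javascript
const { domainToUnicode, format } = require('node:url');

// Hypothetical wrapper class, for illustration only. The method names are
// taken from the first suggested pair; nothing here is a proposed PHP API.
class PairedUrl {
  constructor(input) {
    // The WHATWG parser stores the host in its ASCII (Punycode) form.
    this.inner = new URL(input);
  }

  // Machine-facing forms: safe to store, compare, and send over the wire.
  getDataHost() {
    return this.inner.hostname;
  }

  toDataString() {
    return this.inner.href;
  }

  // Human-facing forms: the host is decoded back to Unicode.
  getViewHost() {
    return domainToUnicode(this.inner.hostname);
  }

  toViewString() {
    return format(this.inner, { unicode: true });
  }
}

const example = new PairedUrl('https://español.com/path');

console.log(example.getDataHost());  // xn--espaol-zwa.com
console.log(example.getViewHost());  // español.com
console.log(example.toDataString()); // https://xn--espaol-zwa.com/path
console.log(example.toViewString()); // https://español.com/path
```

With a pair like this there is no unprefixed default to fall back on, so the caller has to pick a representation every time.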

After checking in the Gecko source code, I sadly only found helper methods
which take a URL/URI and transform it:
- prepareUrlForDisplay()
- unEscapeURIForUI()

Node seems to punt on this by providing `url.format()` with a `{ unicode:
boolean }` option. These all seem to miss the mark, in my opinion, because of
how easy it is to assume that `toString()` or `.host` is what you’re after.
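
For reference, a small JavaScript sketch of the Node behavior described above, using the hostname from the Node.js documentation for `url.format()`. The variable names are mine, and the snippet only illustrates the discoverability problem; it is not a recommendation.

```javascript
const url = require('node:url');

// Hostname borrowed from the Node.js documentation example for url.format().
const parsed = new URL('https://測試/?abc');

// The accessors a developer reaches for first return the ASCII (Punycode) form.
const asciiHref = parsed.toString(); // same value as parsed.href
const asciiHost = parsed.host;

// The Unicode form only appears when it is asked for through an option flag.
const unicodeHref = url.format(parsed, { unicode: true });

console.log(asciiHref);   // https://xn--g6w251d/?abc
console.log(asciiHost);   // xn--g6w251d
console.log(unicodeHref); // https://測試/?abc
```

Nothing in the names `toString()` or `.host` hints that a second representation exists, which is the gap a symmetric pair of method names is meant to close.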

Thanks for entertaining the extra follow-up here.

Warmly,
Dennis Snell