Just my two cents on the R side.

On 1/28/21 10:00 PM, Nicholas Chammas wrote:
> On Thu, Jan 28, 2021 at 3:40 PM Sean Owen <sro...@gmail.com> wrote:
>
>     It isn't that regexp_extract_all (for example) is useless outside
>     SQL, just, where do you draw the line? Supporting 10s of random
>     SQL functions across 3 other languages has a cost, which has to be
>     weighed against benefit, which we can never measure well except
>     anecdotally: one or two people say "I want this" in a sea of
>     hundreds of thousands of users.
>
>
> +1 to this, but I will add that Jira and Stack Overflow activity can
> sometimes give good signals about API gaps that are frustrating users.
> If there is an SO question with 30K views about how to do something
> that should have been easier, then that's an important signal about
> the API.
>
>     For this specific case, I think there is a fine argument
>     that regexp_extract_all should be added simply for consistency
>     with regexp_extract. I can also see the argument
>     that regexp_extract was a step too far, but, what's public is now
>     a public API.
>
>
> I think in this case a few references to where/how people are having
> to work around missing a direct function for regexp_extract_all could
> help guide the decision. But that itself means we are making these
> decisions on a case-by-case basis.
>
> From a user perspective, it's definitely conceptually simpler to have
> SQL functions be consistent and available across all APIs.
>
> Perhaps if we had a way to lower the maintenance burden of keeping
> functions in sync across SQL/Scala/Python/R, it would be easier for
> everyone to agree to just have all the functions be included across
> the board all the time.

Python aligns quite well with Scala, so that might be fine, but R is a
bit trickier. In particular, the lack of proper namespaces makes it
rather risky to have packages that export hundreds of functions.
sparklyr handles this neatly with NSE, but I don't think we're going to
go this way.

> Would, for example, some sort of automatic testing mechanism for SQL
> functions help here? Something that uses a common function testing
> specification to automatically test SQL, Scala, Python, and R
> functions, without requiring maintainers to write tests for each
> language's version of the functions. Would that address the
> maintenance burden?

With R we don't really test most of the functions beyond simple
"callability". Only the complex ones, which require some non-trivial
transformations of arguments, are fully tested.

-- 
Best regards,
Maciej Szymkiewicz

Web: https://zero323.net
Keybase: https://keybase.io/zero323
Gigs: https://www.codementor.io/@zero323
PGP: A30CEF0C31A501EC