On Fri, Aug 30, 2019 at 9:03 AM Fabien COELHO <coe...@cri.ensmp.fr> wrote:
> I have found this thread:
>
> https://www.postgresql.org/message-id/flat/E1cq93r-0004ey-Mp%40gemulon.postgresql.org
>
> It seems that comments from committers discouraged me to go on… :-) For
> instance Robert wanted a "checker", which is basically harder than a
> generator because you have to parse both sides and then compare.
Well, I don't think I intended to ask for something that was more difficult than a full generator. I think it's more that I had the idea that a checker would be simpler. It's true that you'd have to parse both sides and compare. On the other hand, a checker can be incomplete -- only checking certain things -- whereas a generator has to work completely -- including all of the strange cases. So it seemed to me that a checker would allow for tolerating more in the way of exceptions than a generator. A generator also has to integrate properly into the build system, which can be tricky.

It seems like the approach Andres is proposing here could work pretty well. I think the biggest possible problem is that any semi-serious developer will basically have to have LLVM installed. To build the software, you wouldn't need LLVM unless you want to build with JIT support. But to modify the software, you'll need LLVM for any modification that touches node definitions. I don't know how much of a nuisance that's likely to be for people, especially people developing on less-mainstream platforms.

One concern I have is about whether the code that uses LLVM is likely to be dependent on specific LLVM versions. If I can just type something like 'yum/port/brew/apt-get install llvm' on any semi-modern platform and have that be good enough, it won't bother me much at all. On the other hand, if I have to hand-compile it because RHEL version $X only ships older LLVM $Y (or ships unexpectedly-newer version $YY) then that's going to be annoying.

We already have the same annoyance with autoconf; at times, I've needed to have multiple versions installed locally to cater to all the branches. However, that's less of a problem than this would be, because (1) updating configure is a substantially less-common need than updating node definitions and (2) autoconf is a much smaller piece of software than libclang. autoconf builds in about 1 second, which I bet LLVM does not.
To point to an analogous case, note that we pretty much have to adjust a bunch of things every few years to be able to support new versions of Visual Studio, and until we do, it Just Doesn't Work. That stinks. In contrast, new versions of gcc often cause new warnings, but those are easier to work around until such time as somebody gets around to cleaning them up. But most developers get to ignore Windows most of the time, whereas if this breaks for somebody, they can't really work at all until they either work around it on their side or it gets fixed upstream. So it's a significant potential inconvenience.

As a benchmark, I'd propose this: if the LLVM interfaces that this new code would use work in all versions of LLVM released in the last 3 years and there's no indication that they will change in the next release, then I'd feel pretty comfortable. If they've changed once, that'd probably still be OK. If they've changed more than once, perhaps we should think about a Perl script under our own control as an alternative, so as to avoid having to keep adjusting the C code every time LLVM whacks the interfaces around.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company