Can I get some PostgreSQL developer feedback on these five general issues I have with PostgreSQL and its ecosystem?

tutiluren Mon, 14 Sep 2020 13:23:21 -0700

Even though I highly appreciate that PostgreSQL, a database software which 
doesn't cost money, exists *at all*, that fact is oftentimes overshadowed by a 
small but important number of very frustrating issues which I consider to 
largely ruin the overall "experience" of using PostgreSQL. I'd call them 
"almost show-stopping". I realize very well that not everyone "has the same 
priorities" (clearly not) or the expertise, will and free time to work on a 
certain "area" of the overall project/ecosystem, but these 
issues/bugs/limitations are so problematic to me that I have to express them 
directly to the PostgreSQL developers:


1. All non-ASCII characters are turned into "?"s for application_name. My 
administration panel is thus full of gibberish such as: "?a??? ???? ? ?m??? 
??i???u?" and it always looks as if something is awfully broken. I would not 
dare to show it to a CEO or other important person, as they'd just go: "We're 
switching to IBM, effective immediately. Throw this open source rubbish out at 
once!" The explanation I've heard for this is that it's basically a security 
issue, as it's possible to set the application_name before the 
something-something (safe Unicode handler code?) has kicked in, but I have no 
problems with setting the application_name to Unicode characters *after* the 
database connection has already been fully established, in a separate query, as 
I already do, and I doubt that anyone else would have, either. So that 
explanation, while probably technically true, doesn't seem to make any sense.
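
To make the symptom concrete, here is a minimal Python sketch of the kind of masking that would produce such output. It assumes the name is UTF-8 encoded and that every byte outside printable ASCII (32 to 126) is replaced with "?". The function name is hypothetical and this is not the server's actual code, only an illustration of the visible effect.

```python
def mask_non_ascii(name: str) -> str:
    """Replace every byte outside printable ASCII with '?'.

    Assumption: the name is UTF-8 encoded and masked byte by byte, so a
    single non-ASCII character can turn into several question marks.
    """
    masked = []
    for byte in name.encode("utf-8"):
        if 32 <= byte <= 126:
            masked.append(chr(byte))  # printable ASCII is kept as-is
        else:
            masked.append("?")        # everything else is masked
    return "".join(masked)


if __name__ == "__main__":
    # Hypothetical application name containing one non-ASCII letter.
    print(mask_non_ascii("Ärende-panel"))
```

Under that assumption, a name written entirely in a non-Latin script would come out as nothing but question marks and spaces, which matches what shows up in the administration panel.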

2. pg_dump misinterprets non-ANSI values for the "--exclude-*" options (at 
least the --exclude-table-data one, which is the one I've tested) on Windows, 
resulting in it being impossible to make more "sophisticated" backups of 
PostgreSQL databases; it's either all or nothing. Other programs, including my 
own test scripts and commands, are perfectly able to use any Unicode character 
sent from/through both cmd.exe and PHP CLI, but not pg_dump, so the idea that 
"Windows is at fault" here just doesn't seem true. (Although I don't doubt for 
a second that it often *is* the case... Microsoft is not a nice entity in any 
way.) I spent a lot of time and effort experimenting with and asking about 
this, but eventually gave up and concluded that it was yet another bug in an 
open source project "only" on Windows with no real/pressing interest in fixing 
it. For me, this means that I lose a ton of fresh data every day, or have to 
make *gigantic* backups. (I have several huge "temporary debug log" tables 
whose data have zero long-term value but tons of short-term value.) It makes me 
feel crippled and excluded in an uncomfortable manner.
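
For reference, the kind of backup described in point 2 can be sketched in Python as below. The database name, output file and table names are hypothetical; `--format`, `--file` and `--exclude-table-data` are regular pg_dump options. The sketch only shows how the command is put together. It does not establish whether non-ASCII table names survive the trip to pg_dump on Windows.

```python
import subprocess


def build_pg_dump_args(dbname, outfile, skip_data_tables):
    """Build a pg_dump argument list that dumps everything, but skips the
    row data (not the definitions) of the listed tables."""
    args = ["pg_dump", "--format=custom", f"--file={outfile}"]
    for table in skip_data_tables:
        # One --exclude-table-data option per table (or pattern).
        args.append(f"--exclude-table-data={table}")
    args.append(dbname)
    return args


def run_backup(dbname, outfile, skip_data_tables):
    # Passing a list without shell=True means no shell re-parses the arguments.
    return subprocess.run(
        build_pg_dump_args(dbname, outfile, skip_data_tables), check=True
    )


if __name__ == "__main__":
    # Hypothetical names, printed rather than executed.
    print(build_pg_dump_args("mydb", "backup.dump", ["debug_log", "tillfällig_logg"]))
```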

3. There is no way to embed PG to run in an automatic, quiet manner as part of 
something else. I know about SQLite, but it's extremely limited to the point of 
being virtually useless IMO, which is why I cannot use that for anything 
nontrivial. I want my familiar PostgreSQL, only without requiring it to be manually 
and separately installed on the machine where it is to run as part of some 
"application". If I could just "embed" it, this would allow me to create a 
single EXE which I can simply put on a different machine to run my entire 
"system" which otherwise takes *tons* of tedious, error-prone manual labor to 
install, set up and maintain. Of course, this is probably much easier said than 
done, but I don't understand why PG's architecture necessarily dictates that PG 
must be a stand-alone, separate thing. Or rather, why some "glue" cannot enable 
it to be used just like SQLite from a *practical* perspective, even if it still 
is a "server-client model" under the hood. (Which doesn't matter at all to 
me, nor should it matter to anyone else.)

4. There is no built-in means to have PG manage (or even suggest) indexes on 
its own. Trying to figure out what indexes to create/delete/fine-tune, and 
determine all the extremely complex rules for this art (yes, I just called 
index management an *art*, because it is!), is just utterly hopeless to me. It 
never gets any easier. Not even after many years. It's by far the worst part of 
databases to me (combined with point five). Having to use third-party solutions 
ensures that it isn't done in practice, at least for me. I don't trust, nor do 
I want to deal with, external software and extensions in my databases. I still 
have nightmares from PostGIS, which I only keep around, angrily, out of 
absolute necessity. I fundamentally don't like third-party add-ons to things, 
but want the core product to properly support things. Besides, this 
(adding/managing indexes) is not even some niche/obscure use-case, but 
something which is crucial for basically any nontrivial database of any kind!

5. Ever since my early days with PG in the mid-2000s, I've tried numerous times 
to read the manual, wikis and comments for the configuration files, 
specifically the performance directives, and asked many, many times for help 
about that, yet I have never been able to figure out what they want me to enter for 
all the numerous options. At this point, it isn't me being lazy/stupid; it's 
objectively very difficult to understand all of that. The practical end result 
of this is that I've always gone back to using the untouched default 
configuration file (except for the logging-related options), which, especially 
in the past on FreeBSD, *severely* crippled my PG database to not even come 
close to taking advantage of the full power of the hardware. Instead, it felt 
like I was using maybe 1% of the machine's power, even with a proper database 
design and indexes and all of that stuff, simply because the default config was 
so "conservative" and it couldn't be just set to "use whatever resources are 
available". I wish so much for PG to have a mode where it self-tunes as 
needed, over time, based on the actual workload, or at least to allow some kind 
of abstract "performance mode" such as: "you are allowed to use significant 
system resources, PG", or: "You are one of my most important applications. Just 
use as much power as you currently need, but at least save about 10% for the 
rest of the system, will you?" Maybe this is also harder than it sounds to 
accomplish, but as somebody who has zero funding, I cannot hire some 
professional to sit down with me and fine-tune my system for $899/hour. Also, 
besides the purely monetary issue, there are serious privacy implications with 
that scenario. I wouldn't want an outsider to have intimate knowledge of my 
database system, which is more than likely a requirement for them to be able 
to do their job properly.
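
As a rough illustration of the "performance mode" idea in point 5, the Python sketch below derives two memory settings from the machine's total RAM. It assumes a machine mostly dedicated to the database and uses commonly cited rules of thumb (about a quarter of RAM for shared_buffers, about half for effective_cache_size). The function name is hypothetical, and the fractions are starting points under those assumptions, not tuned values for any real workload.

```python
def suggest_memory_settings(total_ram_mb: int) -> dict:
    """Suggest starting values for two postgresql.conf memory settings.

    Assumptions (rules of thumb, not measured results):
      - shared_buffers: about 25% of total RAM
      - effective_cache_size: about 50% of total RAM
    """
    if total_ram_mb <= 0:
        raise ValueError("total_ram_mb must be positive")
    return {
        "shared_buffers": f"{total_ram_mb // 4}MB",
        "effective_cache_size": f"{total_ram_mb // 2}MB",
    }


if __name__ == "__main__":
    # Print the suggestions as postgresql.conf lines for a 16 GB machine.
    for name, value in suggest_memory_settings(16 * 1024).items():
        print(f"{name} = {value}")
```

Something along these lines, built in and adjusted over time from the actual workload, is what the wish above amounts to.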

I'm sorry if any of the above sounds insulting/"entitled". These are the main 
things which truly bother me about PG and the "PG ecosystem", and I'd love to 
hear some first-hand comments on them. At least points 1 and 2 seem like they 
would be almost trivial to fix, at least compared to the rest.
