[HACKERS] The case for preserving case.
Hello, postgresql hackers.

I am working with a client who has a 20k-record MySQL database (which will shortly expand to 100k/1m records) and a few thousand lines of PHP code that do a lot of DB interaction. Their application, with many relationships between data and a number of data integrity requirements, is perfectly suited to postgresql.

The PHP code follows a coding standard in which variables are given CamelCase identifiers. All of the objects persist themselves to the DB, with one variable per column; on object initialization, the columns are read from the DB and added as attributes of the object. All of this breaks when I start to use postgresql, because all of the attribute names become lowercased.

Fixing this problem involves one of three things:

1.) Rewriting all the code to use lowercased identifiers. This is effectively renaming everything, as long camel-case attributes become much harder to read when they are lowercased. It also changes the client's preferred coding standard.

2.) Using double quotes around all identifiers in SQL statements. As you're probably aware, there's no string format in PHP that lets you write double quote marks unescaped (and do variable substitution), so this involves rewriting hundreds of lines and imposing ongoing overhead on every SQL query.

3.) Commenting out 4 lines in src/backend/parser/scansup.c, where identifiers are lowercased.

I understand that the reason for lowercasing is that ODBC connections, etc. expect case insensitivity, but the current behaviour is neither what the SQL standard specifies nor really case insensitivity. I would love case insensitivity with case preservation, but since that is evidently the more complicated option, I would like to know how I can formulate the 'case preserving' option so that it is palatable for inclusion.

--
nothing can happen inside a sphere that you could not inscribe upon it.
~mindlace http://mindlace.net
---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
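The behaviour the message above complains about, and the quoting workaround in option 2, can be modelled in a few lines. This is only an illustrative Python sketch, not PostgreSQL's C code: the function names `normalize_identifier` and `quote_identifier` are made up for this example, and the sketch ignores identifier truncation and locale details.

```python
def normalize_identifier(token: str) -> str:
    """Return the name a token refers to under lowercase folding.

    Assumption: a token is either a bare identifier or a double-quoted one.
    """
    if len(token) >= 2 and token.startswith('"') and token.endswith('"'):
        # Quoted: strip the delimiters, undouble embedded quotes, keep case.
        return token[1:-1].replace('""', '"')
    # Unquoted: fold to lowercase, which is what loses the CamelCase.
    return token.lower()


def quote_identifier(name: str) -> str:
    """Wrap a name in double quotes, doubling any embedded double quote."""
    return '"' + name.replace('"', '""') + '"'
```

With this model, an unquoted `FirstName` resolves to `firstname`, while `quote_identifier("FirstName")` yields a token that resolves back to `FirstName` unchanged. Routing every identifier through one such helper is one way to limit the per-query overhead of option 2.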
Re: [HACKERS] The case for preserving case.
On Apr 24, 2004, at 00:48, Tom Lane wrote:

> You do realize that any such patch would be at least a thousand times larger than that?

I am coming from a state of ignorance beyond the fact that commenting out four lines of code appeared to create the behaviour I desired. I knew that just changing it to match the behaviour *I* wanted isn't the same thing as making a change that could work for everyone; that's why I asked what sort of implementation of this behaviour would be acceptable.

> And have vast repercussions on existing client code?

I don't want to impose this on anyone else; I just want a postgresql that doesn't mangle my case, as case carries meaning in my application. From what I've seen online, other people migrating away from MySQL would like this behaviour to be an option as well.

> I'm willing to debate this, but not with people who claim it's a four-line change. Do some research.

You are welcome not to pay attention to what I have to say; I will probably never be deeply involved in the PostgreSQL codebase. I am willing to do more work to make this option, which is very useful to me, more widely acceptable.

--
Living on earth and in space are the same class of problem. In one, the environment is harshly inimical to humans: in the other, the inverse is true.
~mindlace http://mindlace.net
Re: [HACKERS] The case for preserving case.
On Apr 24, 2004, at 11:17, Jan Wieck wrote:

> I don't think that we will break backward compatibility for existing PostgreSQL specific code in order to gain CamelCase+MySQL porting ease by adopting an even less standard compliant behaviour than we currently have.

I understand and agree that breaking backward compatibility is not an option.

> As things are today, we are case insensitive for unquoted identifiers and breaking that is not an option.

What you do to unquoted identifiers is not case insensitivity, but lowercase folding.

> I see a chance for getting your desired behaviour, case preservation, only as a side effect of a larger move towards the standard.

That sounds great! I'd like to help if I can.

> This would not be a simple per postmaster config option or even a compiletime setting, but rather a per database option in the pg_database system catalog, chosen at CREATE DATABASE time.

This also sounds good, but with my vast ignorance of postgresql, I have no idea of the proper way to tell scansup.c to knock it off (or fold up, or fold down) based on something in the pg_database system catalog.

> The real problem with this is that it has far greater side effects than you seem to imagine yet. [snip described problem]

I am probably still not understanding, since if the internals always quote identifiers in their queries, it would seem that the internals could continue to use lowercase identifiers regardless of the DB setting.

> I am certain that most of us are open for a more complete proposal that includes moving towards the ANSI standard, but the change you outlined below is not acceptable.

I understand that. I am willing to do the work to make a more complete proposal, but I would appreciate some guidance on how to code something that would be more acceptable. I read in another thread that the code in scansup.c isn't allowed to (or shouldn't) talk to the database, so I freely admit I don't know how to approach a palatable solution.
In the worst case, I'm content to keep my hacked version of postgresql so that I can get this application ported faster. Making that happen quickly and taking advantage of postgresql's superior features will help convince my client that the thousands of dollars he's spending on this port were worthwhile. Once I've done that, gradually transitioning to case-insensitive identifiers is possible, but right now all he sees is big transition pain for a gain he hasn't seen yet.

--
nothing can happen inside a sphere that you could not inscribe upon it.
~mindlace http://mindlace.net
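The exchange above turns on what a per-database folding option would do to names that are already stored in lowercase. The following Python sketch is a toy model only: `FoldMode`, `CATALOG`, and `resolves` are hypothetical names invented here, and nothing in it reflects how PostgreSQL actually stores or looks up catalog entries.

```python
from enum import Enum


class FoldMode(Enum):
    """Hypothetical per-database setting for unquoted identifiers."""
    LOWER = "lower"        # fold to lowercase (the behaviour discussed above)
    UPPER = "upper"        # fold to uppercase
    PRESERVE = "preserve"  # leave the spelling alone


# Toy stand-in for names that already exist in lowercase.
CATALOG = {"pg_class", "pg_database"}


def resolves(token: str, mode: FoldMode) -> bool:
    """Report whether a token names something in the toy CATALOG."""
    if len(token) >= 2 and token[0] == '"' and token[-1] == '"':
        # Quoted tokens bypass folding in every mode.
        name = token[1:-1].replace('""', '"')
    elif mode is FoldMode.LOWER:
        name = token.lower()
    elif mode is FoldMode.UPPER:
        name = token.upper()
    else:
        name = token
    return name in CATALOG
```

In this model, an unquoted `pg_class` stops resolving once the mode is `UPPER`, while the quoted `"pg_class"` resolves in every mode. That is the reasoning behind the remark that internal queries could keep lowercase names if they always quote; existing code that relies on unquoted names is what such a setting would affect.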
[HACKERS] case folding and postgres
Hello,

I have a project I'm moving from MySQL to postgresql. It has both a fair amount of code and a moderate amount of data. In MySQL the identifiers are all MixedCase, but the identifiers in the query strings are never quoted.

I would like to change the default behaviour of postgresql so that it does not fold case to lower. If I change downcase_truncate_identifier() in scansup.c to not lowercase identifiers, will I break anything (other than case insensitivity)?

Furthermore, is there any way I could package this patch such that it would be accepted? A suggestion I received from #postgresql was to implement upper casing, lower casing, and leave-it-alone casing, with a per-db setting to choose between them. Another approach I wouldn't mind adding is a start-time option.

Thank you for CCing me, as I am not subscribed to the postgresql-hackers list.

--
nothing can happen inside a sphere that you could not inscribe upon it.
~mindlace http://mindlace.net
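For a port like the one described above, it can help to know up front which MixedCase names lowercase folding would change, and whether any two names would collide once folded. A minimal Python sketch follows; the function name `folding_report` and the sample column names in the usage note are invented for illustration and do not come from the original schema.

```python
from collections import defaultdict


def folding_report(identifiers):
    """Summarize what lowercase folding would do to a list of names.

    Returns (changed, collisions):
      changed    - sorted names whose spelling differs after folding
      collisions - folded name -> sorted distinct spellings that map to it
    """
    changed = sorted(n for n in set(identifiers) if n != n.lower())
    groups = defaultdict(set)
    for name in identifiers:
        groups[name.lower()].add(name)
    collisions = {k: sorted(v) for k, v in groups.items() if len(v) > 1}
    return changed, collisions
```

Given `["CustomerId", "customerid", "orders", "OrderDate"]`, the report lists `CustomerId` and `OrderDate` as changed by folding and flags `CustomerId` and `customerid` as colliding on `customerid`. Names in the first list are the ones that would need quoting, or renaming, to keep their case.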