[HACKERS] The case for preserving case.

2004-04-23 Thread emf
Hello, postgresql hackers.

I am working with a client with a 20k record MySQL database (that will 
shortly expand to 100k/1m) and a few thousand lines of PHP code that 
does a lot of DB interaction.

Their application, with a lot of relationships between data and a bunch 
of data integrity requirements, is perfectly suited to postgresql.

The PHP code follows a coding standard wherein variables are assigned 
CamelCase identifiers. All of the objects persist themselves to the DB, 
with a variable per column; on object initialization db columns are 
read from the db and added as attributes of the object.

All of this breaks when I start to use postgresql, because all of the 
attributes become lowercased.

Fixing this problem involves one of three things:

1.) rewriting all the code to have lowercased identifiers. This is 
effectively renaming everything, as long camel case attributes become 
much harder to read when they're lowercased. This also changes the 
client's preferred coding standard.

2.) using double quotes around all identifiers in sql statements. As 
you're probably aware, there's no string format in PHP that lets you 
write double quote marks unescaped (and do variable substitution), so 
this involves rewriting hundreds of lines and imposing ongoing overhead 
for every SQL query.

3.) commenting out 4 lines in src/backend/parser/scansup.c, where 
identifiers are lowercased.

I understand that the reason for lowercasing is that ODBC 
connections, etc., expect case insensitivity, but the current behaviour 
is neither SQL-standard nor really case insensitivity. I would love 
case insensitivity with case preservation, but since that evidently is 
a more complicated option, I would like to know how I can formulate the 
'case preserving' option in a way to make it palatable for inclusion.
--
nothing can happen inside a sphere
that you could not inscribe upon it.
~mindlace http://mindlace.net

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
   (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])


Re: [HACKERS] The case for preserving case.

2004-04-24 Thread emf
On Apr 24, 2004, at 00:48, Tom Lane wrote:
> You do realize that any such patch would be at least a thousand times
> larger than that?
I am coming from a state of ignorance past the fact that commenting out 
four lines of code appeared to create the behaviour I desired. I knew 
that just changing it to match the behaviour *I* wanted isn't the same 
thing as making a change that could work for everyone; that's why I 
asked what sort of implementation of this behaviour would be 
acceptable.

> And have vast repercussions on existing client code?
I don't want to impose this on anyone else, I just want a postgresql 
that doesn't mangle my case, as case carries meaning in my application. 
From what I've seen online, other people migrating away from MySQL 
would like this behaviour to be an option as well.

> I'm willing to debate this, but not with people who claim it's a
> four-line change.  Do some research.
You are welcome to not pay attention to what I have to say; I will 
probably never be deeply involved in the PostgreSQL codebase.

I am willing to do more work to make this option that is very useful to 
me more widely acceptable.
--
Living on earth and in space are the same class of
problem. In one, the environment is harshly inimical to
humans: in the other, the inverse is true.
~mindlace http://mindlace.net



Re: [HACKERS] The case for preserving case.

2004-04-24 Thread emf
On Apr 24, 2004, at 11:17, Jan Wieck wrote:
> I don't think that we will break backward compatibility for existing 
> PostgreSQL specific code in order to gain CamelCase+MySQL porting ease 
> by adopting an even less standard compliant behaviour than we 
> currently have.
I understand and agree that breaking backward compatibility is not an 
option.

> As things are today, we are case insensitive for unquoted identifiers 
> and breaking that is not an option.
What you do to unquoted identifiers is not case insensitivity, but 
lowercase folding.
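
A rough sketch of that distinction, as a toy Python model rather than anything taken from the PostgreSQL source. The function name `resolve_identifier` and the set-based `catalog` are hypothetical, and the model ignores details such as identifier length limits:

```python
def resolve_identifier(token: str) -> str:
    """Return the name a toy parser would look up for an identifier token.

    Double-quoted tokens keep their exact spelling; unquoted tokens are
    folded to lower case (a simplification of the behaviour discussed here).
    """
    if len(token) >= 2 and token.startswith('"') and token.endswith('"'):
        # Strip the surrounding quotes and un-double any embedded quotes.
        return token[1:-1].replace('""', '"')
    return token.lower()


# A column created with a quoted name keeps its case in the toy catalog.
catalog = {resolve_identifier('"FirstName"')}

# Unquoted references that differ only in case fold to the same name...
assert resolve_identifier("FirstName") == resolve_identifier("FIRSTNAME")

# ...but none of them finds the quoted, case-preserved column.
assert resolve_identifier("FirstName") not in catalog
assert resolve_identifier('"FirstName"') in catalog
```

Under true case insensitivity with case preservation, the unquoted lookup would still find the column; under folding it only matches names that were themselves stored in lower case.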

> I see a chance for getting your desired behaviour, case preservation, 
> only as a side effect of a larger move towards the standard.
That sounds great! I'd like to help if I can.

> This would not be a simple per postmaster config option or even a 
> compiletime setting, but rather a per database option in the 
> pg_database system catalog, chosen at CREATE DATABASE time.
This also sounds good, but with my vast ignorance of postgresql, I have 
no idea of the proper way to tell scansup.c to knock it off (or fold up, 
or fold down) based on something in the pg_database system catalog.

> The real problem with this is that it has far greater side effects 
> than you seem to imagine yet.
[snip described problem] I am probably still not understanding, since 
if the internals always quote in their queries, it would seem that the 
internals could continue to use lowercase identifiers regardless of the 
DB setting.

> I am certain that most of us are open for a more complete proposal 
> that includes moving towards the ANSI standard, but the change you 
> outlined below is not acceptable.
I understand that. I am willing to do work to make a more complete 
proposal, but I would appreciate some guidance as to how to code 
something that would be more acceptable. I read in another thread that 
the stuff going on in scansup.c isn't allowed/shouldn't talk to the 
database, so I freely admit I don't know how to approach a palatable 
solution.

Worst case scenario, I'm content with keeping my hacked version of 
postgresql so that I can get this application ported faster. Making 
that happen fast and taking advantage of postgresql's superior features 
will help convince my client that the thousands of dollars he's 
spending in this port were worthwhile ... once I've done that, 
gradually transitioning to case-insensitive identifiers is possible, 
but right now all he sees is big transition pain for gain he hasn't 
seen yet.
--
nothing can happen inside a sphere
that you could not inscribe upon it.
~mindlace http://mindlace.net



[HACKERS] case folding and postgres

2004-04-26 Thread emf
Hello,
I have a project I'm moving from mysql to postgresql. It has both a 
fair amount of code and a moderate amount of data. In MySQL the 
identifiers are all MixedCase, but the query strings are never quoted.

I would like to change the default behaviour of postgresql to not fold 
the case to lower. If I change scansup.c's 
downcase_truncate_identifier() to not lowercase identifiers, will I 
break anything (other than case insensitivity)?

Furthermore, is there any way I could package this patch such that it 
would be accepted? A suggestion I received from #postgresql was to 
implement upper casing, lower casing, and leave-it-alone casing and to 
have a per-db setting for that. Another approach I wouldn't mind adding 
is a start-time option.
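
To illustrate the three-way suggestion, here is a minimal Python sketch of per-database folding. It is only a model of the idea, not backend code: `FoldMode`, `DB_FOLD_MODE`, `fold_unquoted`, and the database names are all hypothetical, and the dictionary merely stands in for whatever per-database setting would actually hold the mode.

```python
from enum import Enum


class FoldMode(Enum):
    LOWER = "lower"        # current behaviour: fold unquoted names down
    UPPER = "upper"        # fold unquoted names up
    PRESERVE = "preserve"  # leave unquoted names exactly as written


# Hypothetical per-database settings, standing in for a catalog entry.
DB_FOLD_MODE = {
    "legacy_app": FoldMode.PRESERVE,
    "ansi_db": FoldMode.UPPER,
    "default_db": FoldMode.LOWER,
}


def fold_unquoted(ident: str, dbname: str) -> str:
    """Fold an unquoted identifier according to the database's mode."""
    # Databases without an explicit setting keep the lower-casing default.
    mode = DB_FOLD_MODE.get(dbname, FoldMode.LOWER)
    if mode is FoldMode.UPPER:
        return ident.upper()
    if mode is FoldMode.PRESERVE:
        return ident
    return ident.lower()
```

For example, `fold_unquoted("UserName", "legacy_app")` leaves the name untouched, while the same call against `default_db` yields `username`, so existing databases would see no change in behaviour.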

Thank you for CCing me, as I am not subbed to postgresql-hackers list.
--
nothing can happen inside a sphere
that you could not inscribe upon it.
~mindlace http://mindlace.net