On 3/18/19 5:18 AM, Morris de Oryx wrote:
Sounds like I may have touched a nerve with some. If so, no offense
intended!
There are cases where case-sensitivity is required or desirable, it
would be silly to argue otherwise. Where you have such cases, then
case-sensitive queries are great. Some RDBMS systems default to
case-sensitive searches, others default to case-blind searches. I'll
take that to mean that either choice has merit, and it's not foolish
to prefer either. I need case-blind searches vanishingly close to 100%
of the time, but other people have different conditions and may find
that they almost never need case-blind searches. I find this extreme
hard to imagine, but find it easy to imagine people who need
case-sensitive searches quite often.
What I've been thinking about most are user-driven searches on
user-oriented data. Users, at least any user I've ever had, don't want
case-sensitive searches. They also don't care about diacritical
characters, at least in English. I worked for a French company for
many years. Diacritical searches were not always preferred, but
sometimes were. It depends on your user community and their
requirements and norms. Sometimes it comes down to an individual user.
Options are good! I was really just trying to warn someone coming from
a base-blind default about Postgres behavior because, well, it hurts
if you aren't expecting it. That doesn't make Postgres wrong (it's not
that kind of a choice), but it is important to know about.
I'm new to Postgres (only about a year in), and it's great. But I'm
used to a case-blind search as a default. And, honestly, I can *never*
remember a case when a user asked for a case-sensitive search. Ever.
In 30+ years. Maybe it's just me. Just kidding, it's not just me. If
you're presenting users with a search interface, you can find out by
asking them. Or you can AB test a search UI where there is the option
of case-sensitive/blind searching, but you randomly flip which is the
default. For users, case-sensitive searches are assumed. That's what
Google does. Seriously, Google === Search. It's not a hard test to
run. If you find that with a case-blind search, 30% of user tick the
box to make it case-sensitive, then you've got users that often do
care about case-sensitive search.
And since it seems to be unclear at a few places in the discussion
above: *It absolutely makes sense to store data in its original form*
and to allow for case-sensitive searches, when required. It would be
very weird to store
/ Call me back Ishmael, I've gotta go.../
and get back anything else, be that
/ call me back ishmael, i've gotta go...
/
or
/ CALL ME BACK ISHMAEL, I'VE GOTTA GO.../
As far as I understand it in Postgres, you can:
* Use something like UPPER or LOWER in Every. Single. Search.
* Fold text to one case consistently in an index to make searches
case-blind.
* Use citext.
* Teach and convince every user to enter and search for data
case-sensitively correctly Every. Single. Time.
On that last point, good luck. Here's an example, I'm keen on birds.
Do you write it:
Black-shouldered Kite
Black-Shouldered Kite
Since you say you're new(ish) to postgres I have to ask have you bumped
into the tilde(~) operator for text? And the critical tildeAstersik ( ~* )?