[SQL] Simple method to format a string
Good morning,

Is there a simple method in psql to format a string?

For example, adding a space after every three consecutive letters:

abcdefgh -> *** *** ***

Thanks a lot!
Emi

--
Sent via pgsql-sql mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-sql
[SQL] Is there a similarity-function that minds national charsets?
Hi,

Is there a similarity-function that minds national charsets?

Over here we've got some special cases that screw up the results of
similarity().

Our characters: ä, ö, ü, ß
could as well be written as: ae, oe, ue, ss

e.g.
select similarity ( 'Müller', 'Mueller' )
results in: 0.363636

In normal cases everything below 0.5 would be too far apart to be
considered a match.

As it is, I had to transfer the contents of the table into a temporary
table where I translate every ambiguous char to its 2-char representation.

Is there a solution so that this detour is not necessary?
Re: [SQL] Simple method to format a string
On Wed, Jun 20, 2012 at 8:42 AM, Emi Lu wrote:
> Good morning,
>
> Is there a simply method in psql to format a string?
>
> For example, adding a space to every three consecutive letters:
>
> abcdefgh -> *** *** ***
>
> Thanks a lot!
> Emi
>
>
I looked at "format" here:

http://www.postgresql.org/docs/9.1/static/functions-string.html

but didn't see a way.

This function might do what you need:

CREATE FUNCTION spaced3 (text) RETURNS text AS $$
DECLARE
    -- Declare aliases for function arguments.
    arg_string ALIAS FOR $1;

    -- Declare variables
    row record;
    res text;
BEGIN
    res := '';
    -- With the 'g' flag, regexp_matches returns one text[] row per chunk
    -- of up to three characters; chunk[1] is the matched text.
    FOR row IN SELECT regexp_matches(arg_string, '.{1,3}', 'g') AS chunk LOOP
        res := res || ' ' || row.chunk[1];
    END LOOP;
    -- Drop the separator that was prepended before the first chunk.
    RETURN substr(res, 2);
END;
$$ LANGUAGE plpgsql;

# SELECT spaced3('abcdefgh');
  spaced3
------------
 abc def gh
(1 row)

# SELECT spaced3('0123456789');
    spaced3
---------------
 012 345 678 9
(1 row)

To remove the function, run this:

# drop function spaced3(text);

-wes
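
The loop in spaced3 can also be written as a single query. The following is only a sketch, assuming PostgreSQL 9.0 or later for string_agg; the alias t(chunk) is an illustrative name that does not appear in the thread. Without an ORDER BY inside the aggregate, the chunk order is not formally guaranteed, although in practice it follows the order in which regexp_matches returns its rows.

```sql
-- Set-based variant of the loop: one row per chunk, then aggregate.
-- t(chunk) is a hypothetical alias; chunk[1] is the matched text.
SELECT string_agg(chunk[1], ' ') AS spaced
FROM regexp_matches('abcdefgh', '.{1,3}', 'g') AS t(chunk);
```

This avoids a PL/pgSQL function altogether when the formatting is needed in only one query.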
Re: [SQL] Simple method to format a string
On Wed, Jun 20, 2012 at 12:08:24PM -0600, Wes James wrote:
> This function might do what you need:
>
> CREATE FUNCTION spaced3 (text) RETURNS text AS $$
> [...]

A small optimization would be to use a backreference with regexp_replace
instead of regexp_matches:

select regexp_replace('foobarbaz', '(...)', E'\\1 ', 'g');
  regexp_replace
------------------
 foo bar baz
(1 row)

regards,
Ken
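
One detail worth checking with the regexp_replace approach: when the input length is an exact multiple of three, the pattern '(...)' also matches the final group, so a separator is appended at the end. The sketch below makes that visible with brackets and shows one possible variant using a lookahead, assuming PostgreSQL's advanced regular expressions (the default flavor). The bracket wrapping is purely illustrative.

```sql
-- Input whose length is a multiple of three: the plain pattern appends
-- a separator after the last group as well.
SELECT '[' || regexp_replace('foobarbaz', '(...)', E'\\1 ', 'g') || ']';
-- → [foo bar baz ]

-- Variant with a lookahead: a space is added only when at least one
-- more character follows the three-character group, so the result
-- should carry no trailing space.
SELECT '[' || regexp_replace('foobarbaz', '(...)(?=.)', E'\\1 ', 'g') || ']';
```

Wrapping the first form in rtrim() is another option, at the cost of also removing trailing spaces that were already present in the input.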
Re: [SQL] Simple method to format a string
> Just a small optimization would be to use a backreference with regexp_replace
> instead of regexp_matches:
>
> select regexp_replace('foobarbaz', '(...)', E'\\1 ', 'g');
>  regexp_replace
> ----------------
>  foo bar baz

Great.

Combined with several more replace() calls, regexp_replace gives me the
expected result.

Thanks!
Emi
--
select
    regexp_replace(
        replace(
            replace(col-val, ' ', ''), '-', ''),
        replace...
        '(...)', E'\\1 ', 'g')
from tn;
Re: [SQL] Is there a similarity-function that minds national charsets?
On 06/21/2012 12:30 AM, Andreas wrote:
> Hi,
>
> Is there a similarity-function that minds national charsets?
>
> Over here we've got some special cases that screw up the results of
> similarity().
>
> Our characters: ä, ö, ü, ß
> could as well be written as: ae, oe, ue, ss
>
> e.g.
> select similarity ( 'Müller', 'Mueller' )
> results in: 0.363636
>
> In normal cases everything below 0.5 would be too far apart to be
> considered a match.

That's not just charset aware, that's looking for awareness of
language-and-dialect specific transliteration rules for representing
accented chars in 7-bit ASCII. My understanding was that these rules and
conventions vary and are specific to each language - or even region.
tsearch2 has big language dictionaries to try to handle some issues like
this (though I don't know about this issue specifically). It's possible
you could extend the tsearch2 dictionaries with synonyms, possibly
algorithmically generated.

If you have what you consider to be an acceptable 1:1 translation rule,
you could build a functional index on it and test against that, e.g.:

CREATE INDEX blah ON thetable ( (flatten_accent(target_column)) );

SELECT similarity( flatten_accent('Müller'), flatten_accent(target_column) )
FROM thetable;

Note that the flatten_accent function must be IMMUTABLE and can't access
or refer to data in other tables, columns, etc., or to SET (GUC) variables
that might change at runtime.
--
Craig Ringer
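
The thread does not show a definition of flatten_accent. A minimal sketch of what such a function might look like for the German rules Andreas lists (ä, ö, ü, ß) follows. The function body, the uppercase mappings, and the index name thetable_flat_trgm are assumptions made for illustration; similarity(), the % operator, and gist_trgm_ops come from the pg_trgm contrib module, which has to be installed first.

```sql
-- Assumes the pg_trgm contrib module is available (PostgreSQL 9.1+ syntax).
CREATE EXTENSION pg_trgm;

-- Hypothetical implementation: plain nested replace() calls, which are
-- immutable, so the function can safely be declared IMMUTABLE.
CREATE FUNCTION flatten_accent(text) RETURNS text AS $$
    SELECT replace(replace(replace(replace(replace(replace(replace($1,
           'ä', 'ae'), 'ö', 'oe'), 'ü', 'ue'),
           'Ä', 'Ae'), 'Ö', 'Oe'), 'Ü', 'Ue'),
           'ß', 'ss');
$$ LANGUAGE sql IMMUTABLE;

-- Both spellings flatten to the same string, so they compare as identical.
SELECT similarity(flatten_accent('Müller'), flatten_accent('Mueller'));

-- A trigram index on the flattened expression (index name is made up);
-- the % operator can then use it for similarity searches.
CREATE INDEX thetable_flat_trgm ON thetable
    USING gist (flatten_accent(target_column) gist_trgm_ops);

SELECT target_column
FROM thetable
WHERE flatten_accent(target_column) % flatten_accent('Müller');
```

As Craig points out, these transliteration rules are language-specific, so a mapping like this one would only suit data known to follow the German convention.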
