[SQL] Simple method to format a string

2012-06-20 Thread Emi Lu

Good morning,

Is there a simple method in psql to format a string?

For example, adding a space to every three consecutive letters:

abcdefgh -> *** *** ***

Thanks a lot!
Emi


--
Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-sql


[SQL] Is there a similarity-function that minds national charsets?

2012-06-20 Thread Andreas

Hi,

Is there a similarity-function that minds national charsets?

Over here we've got some special cases that screw up the results on 
similarity().


Our characters: ä, ö, ü, ß
could as well be written as:  ae, oe, ue, ss

e.g.

select similarity ( 'Müller', 'Mueller' )
results in:  0.363636

In normal cases everything below 0.5 would be too far apart to be 
considered a match.


As it is, I had to copy the contents of the table into a temporary 
table in which I translate every ambiguous character to its two-character 
representation.


Is there a solution so that detour is not necessary?



Re: [SQL] Simple method to format a string

2012-06-20 Thread Wes James
On Wed, Jun 20, 2012 at 8:42 AM, Emi Lu wrote:

> Good morning,
>
> Is there a simple method in psql to format a string?
>
> For example, adding a space to every three consecutive letters:
>
> abcdefgh -> *** *** ***
>
> Thanks a lot!
> Emi
>
>
I looked at "format" here:

http://www.postgresql.org/docs/9.1/static/functions-string.html

but didn't see a way.

This function might do what you need:


CREATE FUNCTION spaced3 (text) RETURNS text AS $$
DECLARE
  -- Declare aliases for function arguments.
  arg_string ALIAS FOR $1;

  -- Declare variables
  row record;
  res text;

BEGIN
  res := '';
  FOR row IN SELECT regexp_matches(arg_string, '.{1,3}', 'g') as chunk LOOP
    res := res || ' ' || btrim(row.chunk::text, '{}');
  END LOOP;
  RETURN res;
END;
$$ LANGUAGE 'plpgsql';


# SELECT spaced3('abcdefgh');

   spaced3
-------------
  abc def gh
(1 row)

# SELECT spaced3('0123456789');
    spaced3
----------------
  012 345 678 9
(1 row)

To remove the function, run this:

# drop function spaced3(text);
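
The same chunking can also be done without a loop. This is only a sketch and not part of the original post: it assumes PostgreSQL 9.4 or later (for WITH ORDINALITY), and the aliases t, m and n are made up for the example. Joining the chunks with string_agg also avoids the leading space that the loop above produces.

```sql
-- regexp_matches returns one text[] row per chunk of up to 3 characters;
-- WITH ORDINALITY numbers the rows so the chunks can be joined in order.
SELECT string_agg(m[1], ' ' ORDER BY n) AS spaced
FROM regexp_matches('abcdefgh', '.{1,3}', 'g') WITH ORDINALITY AS t(m, n);
-- spaced: 'abc def gh'
```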

-wes


Re: [SQL] Simple method to format a string

2012-06-20 Thread Ken

Just a small optimization would be to use a backreference with regexp_replace
instead of regexp_matches:

select regexp_replace('foobarbaz', '(...)', E'\\1 ', 'g');
 regexp_replace 
----------------
 foo bar baz 
(1 row)
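
Note that this pattern appends a space after every complete group, so an input whose length is a multiple of three ends with a trailing space (as in the output above). A minimal sketch of one way around that, assuming the lookahead syntax of PostgreSQL's advanced regular expressions: only match a group when at least one more character follows it.

```sql
-- (?=.) is a lookahead: the group must be followed by another character,
-- so the last group never gets a space appended.
SELECT regexp_replace('foobarbaz', '(...)(?=.)', E'\\1 ', 'g') AS grouped;
-- grouped: 'foo bar baz'

-- A trailing partial group is left as it is.
SELECT regexp_replace('abcdefgh', '(...)(?=.)', E'\\1 ', 'g') AS grouped;
-- grouped: 'abc def gh'
```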

regards,
Ken



Re: [SQL] Simple method to format a string

2012-06-20 Thread Emi Lu



> Just a small optimization would be to use a backreference with regexp_replace
> instead of regexp_matches:
>
> select regexp_replace('foobarbaz', '(...)', E'\\1 ', 'g');
>  regexp_replace
> ----------------
>  foo bar baz


Great.

Combined with several more replace() calls, regexp_replace gives me 
the expected result.


Thanks!
Emi

--
select
regexp_replace(
   replace(
   replace(col-val, ' ', ''), '-', ''),
 replace...
'(...)', E'\\1 ', 'g')
from tn;
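
A runnable sketch of the query outlined above, assuming the values only need spaces and hyphens stripped before grouping. The column name col_val and the sample rows are hypothetical stand-ins, translate() is swapped in for the nested replace() calls, and the (?=.) lookahead is an addition that avoids a trailing space.

```sql
-- Hypothetical sample data standing in for the real table.
CREATE TEMP TABLE tn (col_val text);
INSERT INTO tn VALUES ('ab-cd ef-gh'), ('012 345-6789');

-- translate() with an empty third argument deletes every space and hyphen;
-- regexp_replace then adds a space after each group of three characters
-- that is followed by at least one more character.
SELECT col_val,
       regexp_replace(translate(col_val, ' -', ''),
                      '(...)(?=.)', E'\\1 ', 'g') AS formatted
FROM tn;
-- 'ab-cd ef-gh'  -> 'abc def gh'
-- '012 345-6789' -> '012 345 678 9'
```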





Re: [SQL] Is there a similarity-function that minds national charsets?

2012-06-20 Thread Craig Ringer

On 06/21/2012 12:30 AM, Andreas wrote:

> Hi,
>
> Is there a similarity-function that minds national charsets?
>
> Over here we've got some special cases that screw up the results on
> similarity().
>
> Our characters: ä, ö, ü, ß
> could as well be written as:  ae, oe, ue, ss
>
> e.g.
>
> select similarity ( 'Müller', 'Mueller' )
> results in:  0.363636
>
> In normal cases everything below 0.5 would be too far apart to be
> considered a match.


That's not just charset aware, that's looking for awareness of 
language-and-dialect specific transliteration rules for representing 
accented chars in 7-bit ASCII. My understanding was that these rules and 
conventions vary and are specific to each language - or even region.


tsearch2 has big language dictionaries to try to handle some issues like 
this (though I don't know about this issue specifically). It's possible 
you could extend the tsearch2 dictionaries with synonyms, possibly 
algorithmically generated.


If you have what you consider to be an acceptable 1:1 translation rule, 
you could build a functional index on it and test against that, e.g.:


CREATE INDEX blah ON thetable ( flatten_accent(target_column) );
SELECT similarity( flatten_accent('Müller'), flatten_accent(target_column) )
FROM thetable;

Note that the flatten_accent function must be IMMUTABLE: it can't access 
or refer to data in other tables or columns, nor to SET (GUC) variables 
that might change at runtime.
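
As an illustration of such a rule, here is a minimal sketch of flatten_accent for the German case from the original question. The function body is an assumption (the thread does not define it), and similarity() requires the pg_trgm extension to be installed.

```sql
CREATE EXTENSION IF NOT EXISTS pg_trgm;  -- provides similarity()

-- Assumed 1:1 rule: German umlauts and ß become their ASCII digraphs.
-- Declared IMMUTABLE so it can be used in an expression index.
CREATE FUNCTION flatten_accent(text) RETURNS text AS $$
  SELECT replace(replace(replace(replace(replace(replace(replace($1,
         'ä', 'ae'), 'ö', 'oe'), 'ü', 'ue'),
         'Ä', 'Ae'), 'Ö', 'Oe'), 'Ü', 'Ue'), 'ß', 'ss');
$$ LANGUAGE sql IMMUTABLE STRICT;

-- Both spellings flatten to 'Mueller', so the similarity is 1.
SELECT similarity(flatten_accent('Müller'), flatten_accent('Mueller'));
```

The function has to be applied to both sides of the comparison; flattening only the search term would still compare it against unflattened column values.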

--
Craig Ringer
