Re: [GENERAL] GiST indices on range types

2015-04-01 Thread Rémi Cura
As far as I understand it (not much), a GiST index over spatial data is in
fact a GiST index over range(x), range(y).

This is why GiST works in n dimensions: conceptually, it always works on
ranges.

In fact a rectangle is literally the intersection of a range on x and a
range on y;
likewise, a 3D box is the intersection of ranges on x, y and z.
You could go further by adding time, etc.
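
As a quick illustration (a minimal sketch only, with made-up table and
column names; the built-in range types and their GiST support are documented
in the link quoted below):

CREATE TABLE reservation (
    room   int,
    during tsrange               -- built-in range type
);

-- GiST index directly on the range column
CREATE INDEX reservation_during_gist ON reservation USING gist (during);

-- the index can then serve overlap queries such as:
SELECT room
FROM reservation
WHERE during && tsrange('2015-04-01 10:00', '2015-04-01 12:00');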

Cheers,
Rémi-C

2015-04-01 9:00 GMT+02:00 Magnus Hagander :

> On Sat, Mar 28, 2015 at 7:52 AM, Rebecca Zahra 
> wrote:
>
>> Good morning,
>>
>> I am Rebecca Zahra and I am currently in my final year of Masters studies
>> at the University of Malta. My thesis is about the usage of indexes for
>> multi-dimensional data.
>>
>> I was going through the posts regarding GIST indexes and I came across
>> the following
>> http://dba.stackexchange.com/questions/39589/optimizing-queries-on-a-range-of-timestamps-two-columns
>>
>> I was wondering if maybe you can help me with a question.  I know that an
>> R-Tree index implementation is used on top of GIST to index spatial data.
>> Can you please tell me what type of index is used on top of GIST to index 
>> *range
>> types*?
>>
>>
> PostgreSQL has had indexable range types for quite some time now:
> http://www.postgresql.org/docs/current/static/rangetypes.html#RANGETYPES-GIST
>
> Indexable with gist or spgist. I don't think the docs cover the actual
> implementation internals though - you'll probably have to go to the source
> if you need that.
>
> --
>  Magnus Hagander
>  Me: http://www.hagander.net/
>  Work: http://www.redpill-linpro.com/
>


[GENERAL] partitioning expert: Partitioning with specialization of one column type

2015-04-01 Thread Rémi Cura
Hey, postgres 9.3 here.
This one is for partitioning experts.

I have several child tables with columns (gid serial, patch pcpatch(n)),
where n may change depending on the table, and pcpatch is a type from the
pgpointcloud extension
(here is the definition):
-
CREATE TYPE pcpatch
   (INPUT=pcpatch_in,
   OUTPUT=pcpatch_out,
   RECEIVE=-,
   SEND=-,
   TYPMOD_IN=pc_typmod_in,
   TYPMOD_OUT=pc_typmod_out,
   ANALYZE=-,
   CATEGORY='U', DEFAULT='',
   INTERNALLENGTH=-1, ALIGNMENT=int4, STORAGE=MAIN);
-

The question is, how do I create an inheritance scheme?

If the father table is

CREATE TABLE father(
gid serial,
patch pcpatch
) ;

trying to make child_1, child_2, ... inherit from father raises an error


child table "test_child_1" has different type for column "patch"


So my question is: how would it be possible (if at all) to inherit from the
father table, while specializing the column type of the father table in the child tables?

Thanks,
Cheers,
Rémi-C


Re: [GENERAL] partitioning expert: Partitioning with specialization of one column type

2015-04-01 Thread Rémi Cura
Hm, sorry, I wasn't very clear.

The child tables already have types pcpatch(3), pcpatch(4), etc.
This is the normal behaviour and can't be changed.

The pcpatch(n) types are specializations of the pcpatch type.

pcpatch(3) is widely different from pcpatch(4).

For those who know PostGIS, pcpatch is like "geometry", pcpatch(3) is like
"geometry(point)" and pcpatch(5) is like "geometry(polygon)",
for instance.
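
To make the analogy concrete, here is a sketch with PostGIS geometry
(assuming PostGIS is installed; since only the typmod differs, I would
expect the same error as quoted above):

CREATE TABLE father_geom (
    gid  serial,
    geom geometry                -- generic type on the father
);

CREATE TABLE child_geom (
    gid  serial,
    geom geometry(Point)         -- "specialized" type (different typmod)
);

-- expected to fail with:
--   child table "child_geom" has different type for column "geom"
ALTER TABLE child_geom INHERIT father_geom;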

Cheers,
Rémi-C


2015-04-01 16:21 GMT+02:00 Steven Erickson :

>  Your child could be:
>
>
>
> CREATE TABLE child-1(
>
> patchn pcpatchn
>
> ) INHERITS (father);
>
>
>
>
>
>
>
> *From:* pgsql-general-ow...@postgresql.org [mailto:
> pgsql-general-ow...@postgresql.org] *On Behalf Of *Rémi Cura
> *Sent:* Wednesday, April 01, 2015 8:41 AM
> *To:* PostgreSQL General
> *Subject:* [GENERAL] partitoning expert : Partitonning with
> specialization of one column type
>
>
>
> Hey, postgres 9.3 here.
>
> for partitionning expert.
>
> I have several table child with columns (gid serial, patch pcpatch(n))
>
> where n may change depending on the tables, and pcpatch is a type from
> pgpointcloud extension
>
> (here is the definition)
>
> -
> CREATE TYPE pcpatch
>(INPUT=pcpatch_in,
>OUTPUT=pcpatch_out,
>RECEIVE=-,
>SEND=-,
>TYPMOD_IN=pc_typmod_in,
>TYPMOD_OUT=pc_typmod_out,
>ANALYZE=-,
>CATEGORY='U', DEFAULT='',
>INTERNALLENGTH=-1, ALIGNMENT=int4, STORAGE=MAIN);
> -
>
> The question is, how do I create an inheritance scheme?
>
> If the father table is
>
> CREATE TABLE father(
>
> gid serial,
>
> patch pcpatch
>
> ) ;
>
> trying to put child_1, child_2 ... to inherit father raise an error
>
> 
> child table "test_child_1" has different type for column "patch"
> 
>
> So my question is, how would it be possible (if at all) to inherit of
> father table, while specializing the type of father table in child table?
>
> Thanks,
>
> Cheers,
>
> Rémi-C
>
> --
>
>


Re: [GENERAL] partitioning expert: Partitioning with specialization of one column type

2015-04-01 Thread Rémi Cura
Okay,
thanks for the answer.
I take it the answer is "do it differently, then".

I might have an idea but it is not working yet, so I'll post another
message.

Thanks dear co-list user ^^
Cheers,
Rémi-C


2015-04-01 16:56 GMT+02:00 David G. Johnston :

> On Wed, Apr 1, 2015 at 7:26 AM, Rémi Cura  wrote:
>
>> pcpatch(n) are specialization of pcpatch type.
>>
>
> ​While this may be true PostgreSQL doesn't understand "specialization" of
> data types.  If you cannot do as Steven suggested you do not get to
> leverage inheritance directly and will need to devise your own work-arounds
> for the capabilities that you require.
>
> I'm somewhat surprised that what you describe works at all - mainly given
> that functions do not allow typemod specifications to pass through the
> function invocation...but alas I'm not all that familiar with PostGIS and
> whatever is being done does indeed seem to be working...
>
> David J.
> ​
>


[GENERAL] Partitioning using geometry

2015-04-01 Thread Rémi Cura
Hey dear list,

I'd like to partition geographical (geometry) data with the postgres partitioning mechanism.
(My usage is in fact related to pointcloud, but I use geometry as a
workaround.)
From the examples I have read on constraints, nothing should prevent it from working.
Here is a self-contained example; the planner doesn't seem to use the
constraint_exclusion mechanism, whatever the constraint.

Thanks,
Cheers,
Rémi-C

--

CREATE SCHEMA IF NOT EXISTS test_partitionning;
SET search_path TO test_partitionning, public ;

DROP TABLE IF  EXISTS test_father CASCADE;
CREATE TABLE test_father  (
gid SERIAL PRIMARY KEY
, geom geometry
);

create table test_child_1 (
check (geometry_overlaps(geom,ST_Expand(ST_MakePoint(10,10),10  ) ) )
,check ( geom&&ST_Expand(ST_MakePoint(10,10),10  ) )
, CHECK (ST_X(geom) BETWEEN 0 AND 20)
, CHECK (ST_Y(geom) BETWEEN 0 AND 20)
, CHECK (  ST_Intersects(geom, ST_Expand(ST_MakePoint(10,10),10  ))  )
) inherits (test_father);
--CREATE INDEX ON test_child_1 USING GIST(geom);

create table test_child_2 (
check (geometry_overlaps(geom,ST_Expand(ST_MakePoint(30,10),10  ) ) )
,check ( geom&&ST_Expand(ST_MakePoint(30,10),10  ) )
, CHECK (ST_X(geom) BETWEEN 20 AND 40)
, CHECK (ST_Y(geom) BETWEEN 0 AND 20)
, CHECK (  ST_Intersects(geom, ST_Expand(ST_MakePoint(30,10),10  ))  )
) inherits (test_father);
--CREATE INDEX ON test_child_2 USING GIST(geom);


INSERT INTO test_child_1 (geom)
SELECT ST_MakePoint(s1/10.0+random(),s2/10.0+random())
FROM generate_series(1,90) AS s1, generate_series(1,90) AS s2;

INSERT INTO test_child_2 (geom)
SELECT ST_MakePoint(s1/10.0+random(),s2/10.0+random())
FROM generate_series(200,300) AS s1, generate_series(1,90) AS s2;


SHOW constraint_exclusion;
SET constraint_exclusion TO partition;


WITH area_of_interest AS (
SELECT ST_Buffer(ST_MakePoint(5,5),1) as buf
)
SELECT *
FROM area_of_interest, test_father
WHERE  -- geom && buf
ST_X(geom) BETWEEN ST_XMin(buf) AND ST_Xmax(buf)
AND ST_Y(geom) BETWEEN ST_YMin(buf) AND ST_Ymax(buf) ;


SELECT *
FROM  test_father , ST_Buffer(ST_MakePoint(5,5),1) as buf
WHERE
ST_X(geom) BETWEEN ST_XMin(buf) AND ST_Xmax(buf)
AND ST_Y(geom) BETWEEN ST_YMin(buf) AND ST_Ymax(buf);
--


[GENERAL] Python 3.2 XP64 and Numpy...

2015-05-27 Thread Rémi Cura
Hey dear List,

This is on a Windows XP 64 machine.

I installed Python (64-bit),
and it works.
Creating plpython3u works, and Python works within the database.

I installed numpy (manually compiled, 64-bit);
it works outside of Postgres,
but inside a plpython3u function, simply doing
'import numpy' raises an error saying that python 32 is not a valid win32
application.

I'm really stuck and have checked everything I could (path, rights,
depends...).

Does somebody have an idea of how to make it work?

Thanks
Cheers,
Rémi-C


Re: [GENERAL] Python 3.2 XP64 and Numpy...

2015-05-28 Thread Rémi Cura
Hey, thanks for the help!

> > Hey dear List,
> >
> > On a windows XP 64.
> >
> > I installed python (64b),
> > it works.
>
> What version of Python 2 or 3 or both?
>
> What does python -V show at the command line?

Python 3.2 and Python 2.6, both 64-bit, are installed on the PC.
When I return "sys.version" from inside a plpython3u function, I get:
3.2 (r32:88445, Feb 20 2011, 21:30:00) [MSC v.1500 64 bit (AMD64)]

> > CReating plpython3u works, and python works within database.
> >
> > I installed numpy (manually compiled,64b),
> > it works outside of Postgres,
> > but inside a plpython3u function, simply doing
> > 'import numpy' raises an error saying that python 32 is not a valid
> > win32 application.
>
> Is there a 32 bit version of numpy on your machine?

Nope! I freshly installed all of this.
Numpy is installed only in my Python 3.2.
My Python 2.6 does not have numpy.

I tried:

C:\Python32>python.exe
Python 3.2 (r32:88445, Feb 20 2011, 21:30:00) [MSC v.1500 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy
>>> print(numpy.__version__)
1.9.2
>>>

but when I try to import numpy from a plpython3u function it raises the
error...

> > I'm really stuck and have checked everything I could (path, rights,
> > depends...)
>
> Does that include PYTHONPATH?

Yes! I tried to print the Python path from a plpython3u function (return
sys.path), and I get:

"['C:\\WINDOWS\\system32\\python32.zip', 'C:\\Python32\\Lib',
'C:\\Python32\\DLLs', 'F:\\postgresData', 'C:\\Program
Files\\PostgreSQL\\9.3\\bin', 'C:\\Python32',
'C:\\Python32\\lib\\site-packages']"

I looked in all those folders, and the only numpy I found is the one I have
recompiled in "C:\\Python32\\lib\\site-packages".
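
A minimal sketch of the kind of diagnostic function that can be run from
plpython3u to compare the embedded interpreter with the external terminal
(the function name is made up; only standard modules are used):

CREATE OR REPLACE FUNCTION rc_diag_python() RETURNS void AS $$
# print what the embedded interpreter itself sees
import sys, os, platform
plpy.notice('version : ' + sys.version)
plpy.notice('arch    : ' + str(platform.architecture()))
plpy.notice('sys.path: ' + str(sys.path))
plpy.notice('PATH    : ' + os.environ.get('PATH', ''))
$$ LANGUAGE plpython3u;

SELECT rc_diag_python();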


Re: [GENERAL] Python 3.2 XP64 and Numpy...

2015-06-01 Thread Rémi Cura
Hey, thanks for helping me with that.
I started fresh to have a truly reproducible process,
so you can have all information and rule out some error possibilities.

 - Uninstall all python.
 - Check that PythonPath doesn't exist anymore
 - check that python doesn't exist anymore

 - install python 3.2.5 64 bit from official python website into C/Python32

 - Reload configuration for server.

 - create plpython3u , create a python function, test it (show path)

   * It works, python path is
'C:\\Windows\\system32\\python32.zip', 'C:\\Python32\\Lib',
'C:\\Python32\\DLLs', 'E:\\9.3\\data', 'C:\\Program
Files\\PostgreSQL\\9.3\\bin', 'C:\\Python32',
'C:\\Python32\\lib\\site-packages'

 - Download latest numpy from the website.
 - On another PC:
   * Compile numpy with Visual 2008, 64 bit
   * Create a binary installer for windows (using python.exe setup.py)
with the proper arguments

 - On the server :
 - install numpy with the compiled installer.


 - check that numpy is correctly installed in C:\Python32\Lib\site-packages
 - using an external terminal, check that numpy works (import numpy -> OK)

 - Now, define a plpython3u function containing "import numpy"

 - Run the function --> error is
  "ERREUR:  ImportError: DLL load failed: Le module spécifié est
introuvable.",
   which roughly translates to
  "ERROR: ImportError : DLL load failed : the specified module couldn't be
found".

 - Create a plpython3u function returning sys.path
the path is "C:\\Windows\\system32\\python32.zip', 'C:\\Python32\\Lib',
'C:\\Python32\\DLLs', 'E:\\9.3\\data', 'C:\\Program
Files\\PostgreSQL\\9.3\\bin', 'C:\\Python32',
'C:\\Python32\\lib\\site-packages"
numpy is in this path, in C:\\Python32\\lib\\site-packages
All user of the computer have all rights on the
C:\\Python32\\lib\\site-packages folder


 - execute `import imp; imp.find_package('numpy')` within the plpython3u
function
   -> returns "None, 'C:\\Python32\\lib\\site-packages\\numpy', ('', '', 5)"

 - create a helloworld module , put it next to numpy, try to call it
   -> it gets called

I really don't see what I can do more.

Cheers,
Rémi-C


Re: [GENERAL] Python 3.2 XP64 and Numpy...

2015-06-01 Thread Rémi Cura
Here is the test code
---
--creating plpython3u
DROP LANGUAGE plpython3u CASCADE;
CREATE LANGUAGE plpython3u ;


--create a test function
DROP FUNCTION IF EXISTS rc_test_python(   );
CREATE FUNCTION rc_test_python(  )
RETURNS void
AS $$"""
@brief This function test python
"""
import sys
import imp
#sys.path.insert(0, 'C:\\Python32\\Lib\\site-packages\\numpy')
plpy.notice(sys.path)
plpy.notice('importing numpy')
#import numpy #first way to do it

file, pathname, description = imp.find_module('numpy')
plpy.notice(file, pathname, description) # numpy was found
imp.load_module('numpy',file, pathname, description ) #second way to do it
plpy.notice('end of importing numpy, this message won t show, an exception is raised before')
#plpy.notice(numpy.__version__)
import helloworld as h
imp.reload(h)
s = h.helloworld()
plpy.notice(s)
return
$$ LANGUAGE plpython3u IMMUTABLE STRICT;

--call function
SELECT *
FROM rc_test_python(   ) ;
-
---

2015-06-01 18:09 GMT+02:00 Rémi Cura :

> Hey, thanks to help me with that.
> I started fresh to have a truly reproducible process,
> so you can have all information and rule out some error possibilities.
>
>  - Uninstall all python.
>  - Check that PythonPath doesn't exist anymore
>  - check that python doesn't exist anymore
>
>  - install python 3.2.5 64 bit from official python website into C/Python32
>
>  - Reload configuration for server.
>
>  - create plpython3u , create a python function, test it (show path)
>
>* It works, python path is
> 'C:\\Windows\\system32\\python32.zip', 'C:\\Python32\\Lib',
> 'C:\\Python32\\DLLs', 'E:\\9.3\\data', 'C:\\Program
> Files\\PostgreSQL\\9.3\\bin', 'C:\\Python32',
> 'C:\\Python32\\lib\\site-packages'
>
>  - Donwload latest numpy from website.
>  - ON antoher PC
>* Compile numpy with visual 2008 , 64 bit
>* Create an binary installer for windows (using python.exe setup.py
> )with proper argument
>
>  - On the server :
>  - install numpy with the compiled installer.
>
>
>  - check that numpy is correctly installer in C:\Python32\Lib\site-packages
>  - using an external terminal, check that numpy works (import numpy -> OK)
>
>  - Now, define a plpython3u function containing "import numpy"
>
>  - Run the function --> error is
>   "ERREUR:  ImportError: DLL load failed: Le module spécifié est
> introuvable.",
>which roughly translate to
>   "ERROR: ImportError : DLL load failed : the specified module couldn't be
> found".
>
>  - Create a plpython3u function returning sys.path
> the path is "C:\\Windows\\system32\\python32.zip', 'C:\\Python32\\Lib',
> 'C:\\Python32\\DLLs', 'E:\\9.3\\data', 'C:\\Program
> Files\\PostgreSQL\\9.3\\bin', 'C:\\Python32',
> 'C:\\Python32\\lib\\site-packages"
> numpy is in this path, in C:\\Python32\\lib\\site-packages
> All user of the computer have all rights on the
> C:\\Python32\\lib\\site-packages folder
>
>
>  - execute `import imp; imp.find_package('numpy')` within the plpython3u
> function
>-> returns "None, 'C:\\Python32\\lib\\site-packages\\numpy', ('', '',
> 5)"
>
>  - create a helloworld module , put it next to numpy, try to call it
>-> it gets called
>
> I really don't see what I can do more.
>
> Cheers,
> Rémi-C
>
>


Re: [GENERAL] Python 3.2 XP64 and Numpy...

2015-06-02 Thread Rémi Cura
Hey,
python is installed from the official binary, 64-bit for windows,
in C:\Python32.

I can't remember the argument, but it might be irrelevant.
The problem doesn't seem to be installing numpy; it works perfectly in the
regular terminal.

The problem seems to be that postgres can't use numpy correctly.

I found no scipy installer for win 64 with python 3.2;
I tried several, but not having the exact python version always ends in
failure.
The binaries you linked to depend on the Intel Math Kernel Library,
which I don't have.


Cheers,
Rémi-C


2015-06-01 19:41 GMT+02:00 Adrian Klaver :

> On 06/01/2015 09:09 AM, Rémi Cura wrote:
>
>> Hey, thanks to help me with that.
>> I started fresh to have a truly reproducible process,
>> so you can have all information and rule out some error possibilities.
>>
>>   - Uninstall all python.
>>   - Check that PythonPath doesn't exist anymore
>>   - check that python doesn't exist anymore
>>
>>   - install python 3.2.5 64 bit from official python website into
>> C/Python32
>>
>>   - Reload configuration for server.
>>
>>   - create plpython3u , create a python function, test it (show path)
>>
>> * It works, python path is
>>  'C:\\Windows\\system32\\python32.zip', 'C:\\Python32\\Lib',
>> 'C:\\Python32\\DLLs', 'E:\\9.3\\data', 'C:\\Program
>> Files\\PostgreSQL\\9.3\\bin', 'C:\\Python32',
>> 'C:\\Python32\\lib\\site-packages'
>>
>>   - Donwload latest numpy from website.
>>   - ON antoher PC
>>
>
> So what is the Python setup on this machine?
>
>  * Compile numpy with visual 2008 , 64 bit
>> * Create an binary installer for windows (using python.exe setup.py
>> )with proper argument
>>
>
> The argument would be?
>
>
>>   - On the server :
>>   - install numpy with the compiled installer.
>>
>
> Best guess is that the numpy compilation you are doing on Machine A is not
> compatible with what you have installed on Machine B(the server).
>
> Have you looked at this:
> http://www.lfd.uci.edu/~gohlke/pythonlibs/
>
> or
>
> http://www.scipy.org/install.html
>
>
>>
>>   - check that numpy is correctly installer in
>> C:\Python32\Lib\site-packages
>>   - using an external terminal, check that numpy works (import numpy ->
>> OK)
>>
>>   - Now, define a plpython3u function containing "import numpy"
>>
>>   - Run the function --> error is
>>"ERREUR:  ImportError: DLL load failed: Le module spécifié est
>> introuvable.",
>> which roughly translate to
>>"ERROR: ImportError : DLL load failed : the specified module couldn't
>> be found".
>>
>>   - Create a plpython3u function returning sys.path
>> the path is "C:\\Windows\\system32\\python32.zip', 'C:\\Python32\\Lib',
>> 'C:\\Python32\\DLLs', 'E:\\9.3\\data', 'C:\\Program
>> Files\\PostgreSQL\\9.3\\bin', 'C:\\Python32',
>> 'C:\\Python32\\lib\\site-packages"
>> numpy is in this path, in C:\\Python32\\lib\\site-packages
>> All user of the computer have all rights on the
>> C:\\Python32\\lib\\site-packages folder
>>
>>
>>   - execute `import imp; imp.find_package('numpy')` within the
>> plpython3u function
>> -> returns "None, 'C:\\Python32\\lib\\site-packages\\numpy', ('',
>> '', 5)"
>>
>>   - create a helloworld module , put it next to numpy, try to call it
>> -> it gets called
>>
>> I really don't see what I can do more.
>>
>> Cheers,
>> Rémi-C
>>
>>
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com
>


Re: [GENERAL] pl/python composite type array as input parameter

2015-06-02 Thread Rémi Cura
Hey,
the only straight workaround I know (which is pretty bad)
is to cast down your record to text.
Then you have an array of text, which is manageable.

For this you can either 'flatten' your record into a unique text,
or cast each part of your record to text, then emulate an array of array
(you need to know the length of the inner array in your function though).

I used this to emulate a 2D numpy vector (N*3)(for numpy).

You'll need a custom aggregate, like this one

.

The other, saner solution is to pass the information about the rows you
want to retrieve, and retrieve the rows directly within the Python function.
For instance, here you would pass an array of ids of the employees you want
to work with.
This is saner, but as a design I don't really like having specific SQL
code inside a generic python function.
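
A minimal sketch of that second approach, assuming a hypothetical
employee(id, name, salary, age) table (function and table names are only
illustrative):

CREATE OR REPLACE FUNCTION testp2_by_id(employee_ids int[]) RETURNS text AS $$
# fetch the rows inside the function instead of passing composite values in
plan = plpy.prepare(
    "SELECT name, salary, age FROM employee WHERE id = ANY($1)",
    ["int[]"])
rows = plpy.execute(plan, [employee_ids])
return ", ".join(r["name"] for r in rows)
$$ LANGUAGE plpython3u;

-- usage: SELECT testp2_by_id(ARRAY[1, 2, 3]);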

I agree it is cumbersome, and I also badly miss more powerful inputs for
python functions (after all, plpython can already return composite types,
which is awesome).


Cheers,
Rémi-C

2015-06-02 2:44 GMT+02:00 Adrian Klaver :

> On 06/01/2015 07:42 AM, Filipe Pina wrote:
>
>> Thanks for the reply anyway, it's a pity though, it'd be useful..
>>
>> Another bump I've found along the pl/python road: insert ROWTYPE in
>> table..
>> Maybe you have some hint on that? :)
>>
>> So, in PLPGSQL I can:
>>
>> DECLARE
>>my_var my_table;
>> BEGIN
>>my_var.col1 := 'asd';
>>INSERT INTO my_table VALUES(my_table.*);
>> END;
>>
>> How would I do something like that in pl/python?
>>
>> First, how to declare a ROW-TYPE variable, as they're all python mappings?
>>
>> my_var = { 'col1': 'asd' } enough? it'd would miss all the other
>> columns...
>>
>> Second, how to insert it?
>>
>> plpy.prepare and .execute say they don't support composite types, so I
>> cannot simply pass
>>
>> pl = plpy.prepare('INSERT INTO core_customer VALUES ($1)', ['my_table'])
>>
>> Any workarounds for this? (meaning I wouldn't have to specify any
>> columns in the insert statement)
>>
>
> http://www.postgresql.org/docs/9.4/interactive/sql-insert.html
>
> pl = plpy.prepare('INSERT INTO core_table SELECT * FROM my_table')
>
>
>> Thanks
>>
>> On Sex, Mai 29, 2015 at 2:00 , Peter Eisentraut  wrote:
>>
>>> On 5/18/15 10:52 AM, Filipe Pina wrote:
>>>
>>> But one of the functions I need to create needs to accept an array
>>> of records.
>>>
>>> PL/Python doesn't support that. Some more code needs to be written to
>>> support that. You did everything correctly. I don't know of a good
>>> workaround.
>>>
>>
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com
>
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>


Re: [GENERAL] pl/python composite type array as input parameter

2015-06-02 Thread Rémi Cura
Oups,

I forgot another strategy I used:
instead of having
testp2(es employee[])

you can use
testp2(names text[], salaries integer[], ages integer[])

This might be the solution with the least work, but it is absolutely
terrible practice,
because it will make your record type hard to change (difficult evolution),
and having many columns will make you create functions with many arguments,
which is often a bad idea.
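
A minimal sketch of what that signature could look like on the PL side,
zipping the parallel arrays back into rows (names are only illustrative):

CREATE OR REPLACE FUNCTION testp2(names text[], salaries integer[], ages integer[])
RETURNS text AS $$
# rebuild the "records" by zipping the parallel arrays
rows = list(zip(names, salaries, ages))
for name, salary, age in rows:
    plpy.notice("%s earns %s at age %s" % (name, salary, age))
return "processed %d employees" % len(rows)
$$ LANGUAGE plpython3u;

-- usage: SELECT testp2(ARRAY['alice','bob'], ARRAY[1000,1200], ARRAY[30,40]);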

Cheers,
Rémi-C

2015-06-02 10:36 GMT+02:00 Rémi Cura :

> Hey,
> the only straight workaround I know (which is pretty bad)
> is to cast down your record to text.
> Then you have an array of text, which is manageable.
>
> For this you can either 'flatten' your record into a unique text,
> or cast each part of your record to text, then emulate an array of array
> (you need to know the length of the inner array in your function though).
>
> I used this to emulate a 2D numpy vector (N*3)(for numpy).
>
> You'll need a custom aggregate, like this one
> <https://github.com/Remi-C/_utilities/blob/master/postgres/array_of_array.sql>
> .
>
> The other more sane solution is to pass the information about the row you
> want to retrieve, and retrieve the row directly within the python.
> For instance, here you would pass an array of id of the employee you want
> to work with.
> This is saner, but as a design I don't really like to have specific SQL
> code into a generic python function.
>
> I agree it is cumbersome, and I also badly miss more powerful input for
> python function (after all, plpython can already return composite types,
> which is awesome)
>
>
> Cheers,
> Rémi-C
>
> 2015-06-02 2:44 GMT+02:00 Adrian Klaver :
>
>> On 06/01/2015 07:42 AM, Filipe Pina wrote:
>>
>>> Thanks for the reply anyway, it's a pity though, it'd be useful..
>>>
>>> Another bump I've found along the pl/python road: insert ROWTYPE in
>>> table..
>>> Maybe you have some hint on that? :)
>>>
>>> So, in PLPGSQL I can:
>>>
>>> DECLARE
>>>my_var my_table;
>>> BEGIN
>>>my_var.col1 := 'asd';
>>>INSERT INTO my_table VALUES(my_table.*);
>>> END;
>>>
>>> How would I do something like that in pl/python?
>>>
>>> First, how to declare a ROW-TYPE variable, as they're all python
>>> mappings?
>>>
>>> my_var = { 'col1': 'asd' } enough? it'd would miss all the other
>>> columns...
>>>
>>> Second, how to insert it?
>>>
>>> plpy.prepare and .execute say they don't support composite types, so I
>>> cannot simply pass
>>>
>>> pl = plpy.prepare('INSERT INTO core_customer VALUES ($1)', ['my_table'])
>>>
>>> Any workarounds for this? (meaning I wouldn't have to specify any
>>> columns in the insert statement)
>>>
>>
>> http://www.postgresql.org/docs/9.4/interactive/sql-insert.html
>>
>> pl = plpy.prepare('INSERT INTO core_table SELECT * FROM my_table')
>>
>>
>>> Thanks
>>>
>>> On Sex, Mai 29, 2015 at 2:00 , Peter Eisentraut  wrote:
>>>
>>>> On 5/18/15 10:52 AM, Filipe Pina wrote:
>>>>
>>>> But one of the functions I need to create needs to accept an array
>>>> of records.
>>>>
>>>> PL/Python doesn't support that. Some more code needs to be written to
>>>> support that. You did everything correctly. I don't know of a good
>>>> workaround.
>>>>
>>>
>>
>> --
>> Adrian Klaver
>> adrian.kla...@aklaver.com
>>
>>
>>
>> --
>> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
>> To make changes to your subscription:
>> http://www.postgresql.org/mailpref/pgsql-general
>>
>
>


[GENERAL] (expert) "insert into VIEW returning" inside an instead of trigger returns nothing

2015-09-02 Thread Rémi Cura
Hey list,
I'm stuck on a problem that I can't figure out (postgres 9.3).
In short, using an
INSERT INTO __view_with_trigger__ ...  RETURNING gid INTO _gid;
returns nothing.
I need this feature because I use views on tables as user interface.

This must have to do with the postgres order of execution,
because inserting directly into the table, instead of into the view of the
table, returns the expected result.

Here is a synthetic example (of course the real use case really requires this
kind of architecture);
any help is much appreciated,
because I don't see any work-around (except not using views at all, which
would be terrible data duplication in my case).

Cheers,
Rémi-C


-- test inserting and instead of trigger --
---

CREATE SCHEMA IF NOT EXISTS test ;
SET search_path to test, public ;

DROP TABLE IF EXISTS generic_object CASCADE;
CREATE TABLE generic_object (
gid SERIAL PRIMARY KEY
, orientation float
) ;

DROP VIEW IF EXISTS editing_generic_object ;
CREATE VIEW editing_generic_object AS(
SELECT gid,
degrees(orientation) AS orientation
FROM generic_object
) ;


DROP TABLE IF EXISTS specific_object CASCADE ;
CREATE TABLE specific_object (
gid int references generic_object (gid) ON DELETE CASCADE
, width float
) ;

DROP VIEW IF EXISTS editing_specific_object ;
CREATE VIEW editing_specific_object AS(
SELECT g.gid
, g.orientation
, so.width
FROM specific_object AS so LEFT OUTER JOIN
generic_object AS g USING (gid)
) ;




DROP FUNCTION IF EXISTS test.rc_editing_generic_object() CASCADE ;
CREATE OR REPLACE FUNCTION test.rc_editing_generic_object(  )
RETURNS  trigger  AS $BODY$
/** @brief : this trigger deals with editing generic object*/
DECLARE
BEGIN
IF TG_OP = 'DELETE' THEN DELETE FROM test.generic_object WHERE gid
= OLD.gid ; RETURN OLD ;
ELSIF TG_OP = 'INSERT' THEN INSERT INTO test.generic_object
(orientation) VALUES (radians(NEW.orientation) ) ;
ELSE UPDATE test.generic_object SET orientation =
radians(NEW.orientation) ;
END IF ;

RETURN NEW ;
END ;
$BODY$  LANGUAGE plpgsql VOLATILE;


DROP TRIGGER IF EXISTS rc_editing_generic_object ON
test.editing_generic_object ;
CREATE TRIGGER rc_edit_street_object_pedestrian INSTEAD OF UPDATE OR INSERT
OR DELETE
ON test.editing_generic_object
FOR ROW  EXECUTE PROCEDURE rc_editing_generic_object(  ) ;




DROP FUNCTION IF EXISTS test.rc_editing_specific_object() CASCADE ;
CREATE OR REPLACE FUNCTION test.rc_editing_specific_object(  )
RETURNS  trigger  AS $BODY$
/** @brief : this trigger deals with editing specific object*/
DECLARE
_gid int;
BEGIN
IF TG_OP = 'DELETE' THEN DELETE FROM test.generic_object WHERE gid
= OLD.gid ; RETURN OLD ;
ELSIF TG_OP = 'INSERT' THEN
--does not work
INSERT INTO test.editing_generic_object (orientation) VALUES (
NEW.orientation)  RETURNING gid INTO _gid;
--does work
--INSERT INTO test.generic_object (orientation) VALUES ( radians(NEW.orientation) ) RETURNING gid INTO _gid;

RAISE WARNING 'here is the gid deduced after insertion : %',
_gid ;
INSERT INTO test.specific_object (gid, width) VALUES (_gid,
NEW.width) ;
ELSE
UPDATE test.editing_generic_object  AS e SET orientation =
NEW.orientation WHERE e.gid = NEW.gid;
UPDATE test.specific_object AS s SET width = NEW.width WHERE
s.gid = NEW.gid;
END IF ;
RETURN NEW ;
END ;
$BODY$  LANGUAGE plpgsql VOLATILE;


DROP TRIGGER IF EXISTS rc_editing_specific_object ON
test.editing_specific_object ;
CREATE TRIGGER rc_editing_specific_object INSTEAD OF UPDATE OR INSERT OR
DELETE
ON test.editing_specific_object
FOR ROW  EXECUTE PROCEDURE rc_editing_specific_object(  ) ;

--testing

--inserting into generic : works
INSERT INTO editing_generic_object ( orientation) VALUES (180) ;
SELECT *
FROM generic_object ;

-- insert into specific : doesn't work
INSERT INTO editing_specific_object ( orientation,width) VALUES (180,
123) ;
SELECT *
FROM specific_object ;


[GENERAL] Re: (expert) "insert into VIEW returning" inside an instead of trigger returns nothing

2015-09-02 Thread Rémi Cura
I think I got it:
I always have to return something (like NEW) in the INSTEAD OF trigger,
but fill NEW
with the RETURNING values of the INSERT into the regular table.
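
In other words, something like this for the INSERT branch of the specific
trigger (a sketch of the fix against the synthetic example below, not a
tested patch):

CREATE OR REPLACE FUNCTION test.rc_editing_specific_object() RETURNS trigger AS $BODY$
BEGIN
    IF TG_OP = 'INSERT' THEN
        -- insert into the base table and copy the generated key back into NEW
        INSERT INTO test.generic_object (orientation)
            VALUES (radians(NEW.orientation))
            RETURNING gid INTO NEW.gid;
        INSERT INTO test.specific_object (gid, width)
            VALUES (NEW.gid, NEW.width);
    END IF;
    -- returning NEW (with gid filled in) lets "INSERT ... RETURNING gid" on the view work
    RETURN NEW;
END;
$BODY$ LANGUAGE plpgsql VOLATILE;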
Cheers,
Rémi-C

2015-09-02 13:44 GMT+02:00 Rémi Cura :

> Hey list,
> I'm stuck on a problem that I can't figure out (postgres 9.3).
> In short, using an
> INSERT INTO __view_with_trigger__ ...  RETURNING gid INTO _gid;
> returns nothing.
> I need this feature because I use views on tables as user interface.
>
> ​This must have to do with the postgres order of execution,
> because inserting into a table instead of the view of the table returns
> the expected result.
>
> Here is a synthetic example (of course the real use really requires this
> kind of architecture),
> any help is much appreciated,
> because I don't see any work-around (except not using view at all, which
> would be terrible data duplication in my case)
>
> Cheers,
> Rémi-C​
>
> ​
>
> 
> -- test inserting and instead of trigger --
> ---
>
> CREATE SCHEMA IF NOT EXISTS test ;
> SET search_path to test, public ;
>
> DROP TABLE IF EXISTS generic_object CASCADE;
> CREATE TABLE generic_object (
> gid SERIAL PRIMARY KEY
> , orientation float
> ) ;
>
> DROP VIEW IF EXISTS editing_generic_object ;
> CREATE VIEW editing_generic_object AS(
> SELECT gid,
> degrees(orientation) AS orientation
> FROM generic_object
> ) ;
>
>
> DROP TABLE IF EXISTS specific_object CASCADE ;
> CREATE TABLE specific_object (
> gid int references generic_object (gid) ON DELETE CASCADE
> , width float
> ) ;
>
> DROP VIEW IF EXISTS editing_specific_object ;
> CREATE VIEW editing_specific_object AS(
> SELECT g.gid
> , g.orientation
> , so.width
> FROM specific_object AS so LEFT OUTER JOIN
> generic_object AS g USING (gid)
> ) ;
>
>
>
>
> DROP FUNCTION IF EXISTS test.rc_editing_generic_object() CASCADE ;
> CREATE OR REPLACE FUNCTION test.rc_editing_generic_object(  )
> RETURNS  trigger  AS $BODY$
> /** @brief : this trigger deals with editing generic object*/
> DECLARE
> BEGIN
> IF TG_OP = 'DELETE' THEN DELETE FROM test.generic_object WHERE gid
> = OLD.gid ; RETURN OLD ;
> ELSIF TG_OP = 'INSERT' THEN INSERT INTO test.generic_object
> (orientation) VALUES (radians(NEW.orientation) ) ;
> ELSE UPDATE test.generic_object SET orientation =
> radians(NEW.orientation) ;
> END IF ;
>
> RETURN NEW ;
> END ;
> $BODY$  LANGUAGE plpgsql VOLATILE;
>
>
> DROP TRIGGER IF EXISTS rc_editing_generic_object ON
> test.editing_generic_object ;
> CREATE TRIGGER rc_edit_street_object_pedestrian INSTEAD OF UPDATE OR
> INSERT OR DELETE
> ON test.editing_generic_object
> FOR ROW  EXECUTE PROCEDURE rc_editing_generic_object(  ) ;
>
>
>
>
> DROP FUNCTION IF EXISTS test.rc_editing_specific_object() CASCADE ;
> CREATE OR REPLACE FUNCTION test.rc_editing_specific_object(  )
> RETURNS  trigger  AS $BODY$
> /** @brief : this trigger deals with editing specific object*/
> DECLARE
> _gid int;
> BEGIN
> IF TG_OP = 'DELETE' THEN DELETE FROM test.generic_object WHERE gid
> = OLD.gid ; RETURN OLD ;
> ELSIF TG_OP = 'INSERT' THEN
> --does not works
> INSERT INTO test.editing_generic_object (orientation) VALUES (
> NEW.orientation)  RETURNING gid INTO _gid;
> --does works
> --INSERT INTO test.generic_object (orientation) VALUES (
> radians(NEW.orientation) )  RETURNING gid INTO _gid;
>
> RAISE WARNING 'here is the gid deduced after insertion : %',
> _gid ;
> INSERT INTO test.specific_object (gid, width) VALUES (_gid,
> NEW.width) ;
> ELSE
> UPDATE test.editing_generic_object  AS e SET orientation =
> NEW.orientation WHERE e.gid = NEW.gid;
> UPDATE test.specific_object AS s SET width = NEW.width WHERE
> s.gid = NEW.gid;
> END IF ;
> RETURN NEW ;
> END ;
> $BODY$  LANGUAGE plpgsql VOLATILE;
>
>
> DROP TRIGGER IF EXISTS rc_editing_specific_object ON
> test.editing_specific_object ;
> CREATE TRIGGER rc_editing_specific_object INSTEAD OF UPDATE OR INSERT OR
> DELETE
> ON test.editing_specific_object
> FOR ROW  EXECUTE PROCEDURE rc_editing_specific_object(  ) ;
>
> --testing
>
> --inserting into generic : works
> INSERT INTO editing_generic_object ( orientation) VALUES (180) ;
> SELECT *
> FROM generic_object ;
>
> -- insert into specific : don't work
> INSERT INTO editing_specific_object ( orientation,width) VALUES (180,
> 123) ;
> SELECT *
> FROM specific_object ;
>


Re: [GENERAL] Update statement results in Out of memory

2016-07-06 Thread Rémi Cura
You could check the max number of points in your geometries :

SELECT max(ST_NumPoints(geom))
FROM ...

Of course you could still have invalid / aberrant geometries,
which you could also check (ST_IsValid, ST_IsSimple).

You could work around both of those hypotheses by performing your buffer in
batches.
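
A minimal sketch of such a batched update, assuming the table has an integer
key (here called gid, which is hypothetical) and using "my_table" to stand in
for the real table name; each batch could also be run as a separate statement
from a script if memory needs to be released between batches:

DO $$
DECLARE
    batch_size int := 50000;
    max_gid    int;
    lo         int := 0;
BEGIN
    SELECT max(gid) INTO max_gid FROM my_table;
    WHILE lo <= max_gid LOOP
        UPDATE my_table
           SET buffer = ST_Buffer(simplified_geometry, 0.005)
         WHERE gid > lo AND gid <= lo + batch_size;
        lo := lo + batch_size;
        RAISE NOTICE 'buffered up to gid %', lo;
    END LOOP;
END
$$;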

Cheers,
Rémi-C

2016-07-06 15:36 GMT+02:00 Paul Ramsey :

> Running a multi-million row update will take a long time.
> It's possible you've exposed a memory leak in ST_Buffer (the older
> your version of GEOS, the more likely that is) but it's also possible
> you're just running a really long update.
> I find for batch processing purposes that creating fresh tables is far
> preferable:
>
> CREATE TABLE newtable AS SELECT ST_Buffer(geom) ... FROM oldtable;
>
> If you still see memory issues with the above then you probably do
> have a leak, *or* you're just running buffer on a sufficiently large
> input geometry or with a large enough radius to blow up the memory
> naturally.
>
> P
>
>
> On Wed, Jul 6, 2016 at 2:13 AM, Ivan Bianchi  wrote:
> > Hello,
> >
> > I am trying to update a column using a PostGIS ST_Buffer function into a
> > table of 4.257.769 rows, but after 6 hours, an Out of memory error
> appears
> > and the kernel starts killing processes until a Kernel Panic shows up.
> >
> > I have simplified the buffer target geometry and also added a gist index
> to
> > that column.
> >
> > The statement is the following:
> >>
> >> psql -h host -U user -W -d database -c "UPDATE table SET buffer =
> >> ST_Buffer(simplified_geometry, 0.005);"
> >
> >
> > After reading and tunning the configuration, I still have the same
> result.
> >
> > Here's the initial memory stats:
> >
> >>               total    used    free   shared  buff/cache   available
> >> Mem:            15G    1.5G     12G     503M        1.4G         13G
> >> Swap:          7.8G      0B    7.8G
> >
> >
> >
> > I'm running out of ideas, as I think the postgresql.conf memory
> parameters
> > are quite low for the machine specs. I understand I can split the process
> > and paginate the rows, but I can't see why I can't deal with this full
> > statement right now.
> >
> > Do you think this issue is related with the postgres memory parameters
> > configuration? Why is not respecting the shared_buffers or
> > effective_cache_size parameters and keeps growing?
> >
> >
> > Here's some info:
> >
> > Machine specs
> >
> > Intel(R) Core(TM) i7-4790K CPU @ 4.00GHz (8 cores)
> > 16 GB of memory
> > Fedora release 23 (Twenty Three)
> > Kernel - 4.5.7-202.fc23.x86_64
> >
> > postgresql.conf
> >
> > effective_cache_size = 5GB
> > shared_buffers = 3GB
> > work_mem = 10MB
> >
> > maintenance_work_mem = 800MB
> > wal_buffers = 16MB
> >
> > Kernel parameters
> >
> > vm.overcommit_memory=2
> >
> > kernel.shmmax = 8340893696
> > kernel.shmall = 2036351
> >
> > Versions:
> >
> > PostgreSQL 9.5.3 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 5.3.1
> > 20160406 (Red Hat 5.3.1-6), 64-bit
> > POSTGIS="2.2.2 r14797" GEOS="3.5.0-CAPI-1.9.0 r4084" PROJ="Rel. 4.9.1, 04
> > March 2015" GDAL="GDAL 2.0.2, released 2016/01/26" LIBXML="2.9.3"
> > LIBJSON="0.12" RASTER
> >
> >
> > Many thanks,
> >
> > --
> > Ivan
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>


[GENERAL] forcing a table (parent in inheritance) to remain empty

2016-10-11 Thread Rémi Cura
Hey dear list,
I can't find a nice solution to enforce a necessary behaviour in my case :
I want a parent table to remain empty.

Of course I could define a trigger and return NULL in any case, but I'd
like a more elegant approach using checks or constraints.

Any thought appreciated,
Cheers,
Rémi C


Re: [GENERAL] forcing a table (parent in inheritance) to remain empty

2016-10-11 Thread Rémi Cura
This solution is very nice.
Sadly the check is inherited by the children
(I only want the parent to be empty, not the children).

It seems the elements that are not inherited are:

   - Indexes
   - Unique constraints
   - Primary Keys
   - Foreign keys
   - Rules and Triggers


thanks anyway for the fast answer,
Cheers,
Rémi C

2016-10-11 18:33 GMT+02:00 Manuel Gómez :

> On Tue, Oct 11, 2016 at 12:27 PM, Rémi Cura  wrote:
> > Hey dear list,
> > I can't find a nice solution to enforce a necessary behaviour in my case
> :
> > I want a parent table to remain empty.
> >
> > Of course I could define a trigger and return NULL in any case, but I'd
> like
> > a more elegant approach using check or constraints.
>
> You could probably do it with a simple constraint:
>
> postgres=# create table dum(check (false));
> CREATE TABLE
> postgres=# insert into dum default values;
> ERROR:  new row for relation "dum" violates check constraint "dum_check"
> DETAIL:  Failing row contains ().
>


Re: [GENERAL] forcing a table (parent in inheritance) to remain empty

2016-10-11 Thread Rémi Cura
Perfect !

Cheers,
Rémi C

2016-10-11 19:12 GMT+02:00 Julien Rouhaud :

> On 11/10/2016 19:04, Rémi Cura wrote:
> > This solution is very nice.
> > Sadly the check is inherited by the children
> > (I only want the parent to be empty, not the children).
> >
> > It seems the element that are not inherited are
> >
> >   * Indexes
> >   * Unique constraints
> >   * Primary Keys
> >   * Foreign keys
> >   * Rules and Triggers
> >
>
> you can specify a "NO INHERIT" on the check constraint, that should
> solve your issue.
>
> --
> Julien Rouhaud
> http://dalibo.com - http://dalibo.org
>
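
For the archive, a minimal sketch of the NO INHERIT variant Julien describes
(the parent rejects every row, the children are unaffected):

CREATE TABLE father_empty (
    gid serial,
    CHECK (false) NO INHERIT     -- not copied to the children
);

CREATE TABLE child_empty () INHERITS (father_empty);

INSERT INTO child_empty DEFAULT VALUES;   -- ok
INSERT INTO father_empty DEFAULT VALUES;  -- fails the check constraint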


Re: [GENERAL] Improve PostGIS performance with 62 million rows?

2017-01-05 Thread Rémi Cura
Hey,
1 sec seems really good in this case,
and I'm assuming you tuned postgres so the main index fits into ram
(work_mem and all other stuff).

You could avoid a CTE by merging both CTEs:

WITH pts AS (
SELECT (pt).geom, (pt).path[1] as vert
FROM
ST_DumpPoints(
ST_Segmentize(
ST_GeographyFromText('SRID=4326;LINESTRING(-150.008056
61.179167,-156.77 71.285833)'),
600
)::geometry
) as pt
)
SELECT elevation
FROM data
INNER JOIN (SELECT
ST_MakeLine(ARRAY[a.geom, b.geom]) as short_line
FROM pts a
INNER JOIN pts b
ON a.vert=b.vert-1 AND b.vert>1) segments
ON  ST_DWithin(location, segments.short_line, 600)
ORDER BY elevation DESC limit 1;


Then you could remove the useless (and potentially explosive, if you have a
large number of dumped points) inner join on points:
"FROM pts a
INNER JOIN pts b "

You could simply use a window function to generate the segments.
The idea is to dump the points, order them by path, and then link each point
with the previous one (window function lag).
Assuming you don't want to use the available function,
this would be something like :



WITH segments AS (
SELECT ST_MakeLine( lag((pt).geom , 1, NULL) OVER (ORDER BY (pt).path)
  ,(pt).geom) AS short_line
FROM ST_DumpPoints(
  ST_Segmentize(
ST_GeographyFromText('SRID=4326;LINESTRING(-150.008056
61.179167,-156.77 71.285833)'),
600
)::geometry
) as pt
)
SELECT elevation
FROM data ,segments
WHERE segments.short_line IS NOT NULL -- the first segment is null by design (lag function)
  AND  ST_DWithin(location, segments.short_line, 600) = TRUE
ORDER BY elevation DESC
limit 1;


I don't know if you can further improve this query after that,
but I'd guess it would reduce your runtime and be more robust with respect to
scaling.


If you want to further improve your result,
you'll have to reduce the number of rows in your index,
that is, partition your table into several tables!

This is not easy to do with current postgres partitioning methods as far
as I know
(partitioning is easy, automatic efficient querying is hard).

Another way would be to reduce your requirements, and consider that in some
cases you may want less detail in the altimetry, which would allow you to
use a Level Of Detail approach.

Congrats on the well-explained query/problem anyway!
Cheers,
Rémi-C

2017-01-05 23:09 GMT+01:00 Paul Ramsey :

> Varying the segment length upwards might have a salutary effect for a
> while, as the efficiency improvement of fewer inner loops battles with the
> inefficiency of having more points selected by the index filter. Worth an
> experiment.
>
> P
>
> On Thu, Jan 5, 2017 at 1:00 PM, Israel Brewster 
> wrote:
>
>>
>> On Jan 5, 2017, at 10:38 AM, Paul Ramsey 
>> wrote:
>>
>> Yes, you did. You want a query that spits out a tupleset of goemetries
>> (one each for each wee segment), and then you can join that set to your
>> main table using st_dwithin() as the join clause.
>> So start by ditching the main table and just work on a query that
>> generates a pile of wee segments.
>>
>>
>> Ahhh, I see you've done this sort of thing before (
>> http://blog.cleverelephant.ca/2015/02/breaking-linestring-
>> into-segments.html) :-)
>>
>> So following that advice I came up with the following query:
>>
>> WITH dump AS (SELECT
>> ST_DumpPoints(
>> ST_Segmentize(
>> ST_GeographyFromText('SRID=4326;LINESTRING(-150.008056
>> 61.179167,-156.77 71.285833)'),
>> 600
>> )::geometry
>> ) as pt
>> ),
>> pts AS (
>> SELECT (pt).geom, (pt).path[1] as vert FROM dump
>> )
>> SELECT elevation
>> FROM data
>> INNER JOIN (SELECT
>> ST_MakeLine(ARRAY[a.geom, b.geom]) as short_line
>> FROM pts a
>> INNER JOIN pts b
>> ON a.vert=b.vert-1 AND b.vert>1) segments
>> ON  ST_DWithin(location, segments.short_line, 600)
>> ORDER BY elevation DESC limit 1;
>>
>> Which yields the following EXPLAIN ANALYZE (https://explain.depesz.com/s/
>> RsTD ):
>>
>>
>>QUERY PLAN
>>
>>
>> 
>> 
>> 
>> 
>>  Limit  (cost=11611706.90..11611706.91 rows=1 width=4) (actual
>> time=1171.814..1171.814 rows=1 loops=1)
>>CTE dump
>>  ->  Result  (cost=0.00..5.25 rows=1000 width=32) (actual
>> time=0.024..1.989 rows=1939 loops=1)
>>CTE pts
>>  ->  CTE Scan on dump  (cost=0.00..20.00 rows=1000 width=36) (actual
>> time=0.032..4.071 rows=1939 loops=1)
>>->  Sort  (cost=11611681.65..11611768.65 rows=34800 width=4) (actual
>> time=1171.813..1171.813 rows=1 loops=1)
>>  Sort Key: data.elevatio

Re: [GENERAL] Improve PostGIS performance with 62 million rows?

2017-01-09 Thread Rémi Cura
Hey,
I like your curiosity !

At the billion-point range, you __have__ to use pgpointcloud,
a pyramid raster solution (actually the more common way to perform this task),
or another database (hello MonetDB).
Cheers,
Rémi-C

2017-01-09 20:11 GMT+01:00 Jonathan Vanasco :

>
> On Jan 9, 2017, at 12:49 PM, Israel Brewster wrote:
>
> >  Planning time: 4.554 ms
> >  Execution time: 225998.839 ms
> > (20 rows)
> >
> > So a little less than four minutes. Not bad (given the size of the
> database), or so I thought.
> >
> > This morning (so a couple of days later) I ran the query again without
> the explain analyze to check the results, and noticed that it didn't take
> anywhere near four minutes to execute. So I ran the explain analyze again,
> and got this:
>
> ...
>
> >  Planning time: 0.941 ms
> >  Execution time: 9636.285 ms
> > (20 rows)
> >
> > So from four minutes on the first run to around 9 1/2 seconds on the
> second. Presumably this difference is due to caching? I would have expected
> any caches to have expired by the time I made the second run, but the data
> *is* static, so I guess not. Otherwise, I don't know how to explain the
> improvement on the second run - the query plans appear identical (at least
> to me). *IS* there something else (for example, auto vacuum running over
> the weekend) that could explain the performance difference?
>
>
> This may sound crazy, but I suggest running each of these scenarios 3+
> times:
>
> # cold explain
> stop postgres
> start postgres
> explain analyze SELECT
>
> # cold select
> stop postgres
> start postgres
> enable \t for query timing
> SELECT
>
> # cold explain to select
> stop postgres
> start postgres
> explain analyze SELECT
> enable \t for query timing
> SELECT
>
> # cold select to explain
> stop postgres
> start postgres
> enable \t for query timing
> SELECT
> explain analyze SELECT
>
> # cold select to select
> stop postgres
> start postgres
> enable \t for query timing
> SELECT
> SELECT
>
> I've found the timing for "Explain Analyze" to be incredibly different
> from an actual SELECT on complex/large dataset queries... and the
> differences don't seem to correlate to possible speedups from index/table
> caching.
>
>
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>


[GENERAL] bloated postgres data folder, clean up

2016-02-29 Thread Rémi Cura
Hey dear list,
after a few years of experiments and crashes,
I ended up with a grossly bloated postgres folder.
I had about 8 GB of useless files.
Everything is in a VirtualBox, so I'm sure I can reproduce it exactly, and I
fried my postgres folder a couple of times before getting it right.

Julien (Rouhaud) helped me find those useless files via SQL.
The idea is to list the files in the postgres data directory with `pg_ls_dir`,
then check whether the file name corresponds to something useful (using
pg_relation_filenode()).
--
https://gist.github.com/Remi-C/926eaee04d61a7245eb8
--
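
A heavily simplified sketch of the idea (default tablespace only; the gist
above has the real, more careful query):

SELECT f.file_name
FROM pg_ls_dir(
       'base/' || (SELECT oid FROM pg_database WHERE datname = current_database())
     ) AS f(file_name)
WHERE substring(f.file_name FROM '^\d+') NOT IN (
        SELECT pg_relation_filenode(c.oid::regclass)::text
        FROM pg_class AS c
        WHERE pg_relation_filenode(c.oid::regclass) IS NOT NULL
      );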

To be extra sure, I export the list of found files,
then use oid2name to check that no file is recognized.

The files can then be deleted (using plpythonu in my case).

So far a vacuum full analyze raises no errors.

Warning: for this to work, the SQL query must be run while connected to
the database you want to clean.

Hope this may be useful
Cheers,
Rémi-C


Re: [GENERAL] bloated postgres data folder, clean up

2016-03-02 Thread Rémi Cura
Would gladly do it,
but there is still this "wiki cool-off" thing;
I can't create a page yet.
Cheers,
Rémi-C

2016-02-29 20:44 GMT+01:00 Alvaro Herrera :

> Rémi Cura wrote:
> > Hey dear list,
> > after a fex years of experiments and crash,
> > I ended up with a grossly bloated postgres folder.
> > I had about 8 Go of useless files.
>
> Would you add a new page to the wiki with this?
>
> https://wiki.postgresql.org/wiki/Category:Administrative_Snippets
>
> --
> Álvaro Herrerahttp://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>


Re: [GENERAL] bloated postgres data folder, clean up

2016-03-02 Thread Rémi Cura
Hey,
this is quite the *opposite*.
The function finds files in the postgres data folder that are not used
by the database.

To use it :
 * connect to the database you want to analyse ( **mandatory** ).
 * create the function (execute function definition)
 * Execute `SELECT * FROM find_useless_postgres_file('your_database_name')`

This will output a list of files that are on the disk but not used by
postgres,
and so can be removed.

To be extra sure, you should use the oid2name program to check that the
useless files are really useless.


For this :
 * output the list of potentially useless files with COPY, for instance:
  COPY ( SELECT file_name
         FROM find_useless_postgres_file('your_database_name')
       ) TO 'path_to_your_database_folder/potential_useless.txt'

   Now you've got a file with a list of potentially erroneous files.

 * Then use oid2name
  `$ su postgres
   $ cd path_to_your_database_folder
   $ while read i; do oid2name -f "$i" -i -S -q -d your_database_name; done
< potential_useless.txt
  `

  Nothing should show up, meaning that none of the potentially erroneous files
   has been recognized by oid2name!
  If you feel unconvinced, you can manually try oid2name on some
   of the potentially erroneous files, to be extra sure.
  It should not find anything.

 * Now delete all the files in `potential_useless.txt`.
  It could be wiser not to delete the files but to rename them
  (for instance, adding `.potentially_useless` as a suffix)
  so that if it breaks something, you have an easy way to revert everything.

Anyway, use *extra extra* caution if you delete.
Short of a backup, there would be no easy way to correct a mistake.

Cheers,
Rémi-C

2016-03-02 15:38 GMT+01:00 Johnny Morano :

> Hi Remi!
>
>
>
> This SQL function you have provided, seems to return all valid files, is
> that correct? In my case, it returned all my ‘base/’ files. Is that normal?
>
> If yes, maybe you rename the function to ‘find_useful_postgres_files’ ;-)
>
>
>
> Could you explain in steps how to use this function to make a cleanup of
> bloated data? (like in an example with commands and example output, if
> possible of course)
>
>
>
> Thanks!
>
>
>
>
>
> Mit besten Grüßen / With best regards,
>
> Johnny Morano
>
> 
>
>
>
> *Johnny Morano  | Principal Systems Engineer*
>
>
>
> PAY.ON GmbH  |  AN ACI WORLDWIDE COMPANY  |  WWW.PAYON.COM
> <http://www.payon.com/>
>
> Jakob-Haringer-Str. 1  |  5020 Salzburg  |  Austria
>
> Registered at: LG Salzburg  |  Company number: FN 315081 f  |  VAT-ID:
> ATU64439405
>
> Managing Director: Christian Bamberger
>
>
>
>
>
>
>
>
>
>
>
>
>
>
> *From:* pgsql-general-ow...@postgresql.org [mailto:
> pgsql-general-ow...@postgresql.org] *On Behalf Of *Rémi Cura
> *Sent:* Mittwoch, 2. März 2016 14:58
> *To:* Alvaro Herrera
> *Cc:* PostgreSQL General
> *Subject:* Re: [GENERAL] bloated postgres data folder, clean up
>
>
>
> Would gladly do it,
>
> but still this "wiki cooloff" stuff,
>
> can't create a page
>
> Cheers,
>
> Rémi-C
>
>
>
> 2016-02-29 20:44 GMT+01:00 Alvaro Herrera :
>
> Rémi Cura wrote:
> > Hey dear list,
> > after a fex years of experiments and crash,
> > I ended up with a grossly bloated postgres folder.
> > I had about 8 Go of useless files.
>
> Would you add a new page to the wiki with this?
>
> https://wiki.postgresql.org/wiki/Category:Administrative_Snippets
>
> --
> Álvaro Herrerahttp://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>
>
>


Re: [GENERAL] bloated postgres data folder, clean up

2016-03-03 Thread Rémi Cura
ile_name
>
> # The CSV header only
>
>
>
> So, no bloated files.
>
> But, using the function from
> https://wiki.postgresql.org/wiki/Show_database_bloat I get:
>
>
>
> live=# select tbloat,wasted_space from table_bloat order by wasted_space
> desc limit 25;
>
> tbloat │ wasted_space
>
> ┼──
>
> 1.0 │ 9976 kB
>
> 1.2 │ 98 GB
>
> 1.0 │ 97 MB
>
> 1.4 │ 96 kB
>
> 1.2 │ 920 kB
>
> 1.2 │ 88 kB
>
> 1.1 │ 88 kB
>
> 2.0 │ 8192 bytes
>
> 0.0 │ 8192 bytes
>
> 1.3 │ 8192 bytes
>
> 2.0 │ 8192 bytes
>
> 1.3 │ 8192 bytes
>
> 1.5 │ 8192 bytes
>
> 1.5 │ 8192 bytes
>
> 2.0 │ 8192 bytes
>
> 1.1 │ 8192 bytes
>
> 1.0 │ 8192 bytes
>
> 1.1 │ 8192 bytes
>
> 1.3 │ 8192 bytes
>
> 1.5 │ 8192 bytes
>
> 1.1 │ 80 kB
>
> 1.0 │ 7584 kB
>
> 1.6 │ 71 MB
>
> 1.0 │ 704 kB
>
> 1.1 │ 6968 kB
>
> (25 rows)
>
>
>
> So actually, quite a lot of bloated data :-)
>
> What am I doing wrong?
>
>
>
> Mit freundlichen Grüßen / With kind regards,
>
> Johnny Morano
>
> 
>
>
>
> *Johnny Morano  |  Principal Systems Engineer*
>
>
>
> PAY.ON GmbH  |  AN ACI WORLDWIDE COMPANY  |  WWW.PAYON.COM
> <http://www.payon.com/>
>
> Jakob-Haringer-Str. 1  |  5020 Salzburg  |  Austria
>
>
>
>
>
>
> *From:* Rémi Cura [mailto:remi.c...@gmail.com]
> *Sent:* Mittwoch, 2. März 2016 17:49
> *To:* Johnny Morano
> *Cc:* Alvaro Herrera; PostgreSQL General
>
> *Subject:* Re: [GENERAL] bloated postgres data folder, clean up
>
>
>
> Hey,
>
> this is quite the *opposite*.
>
> The function find files in the postgres database folder that are not used
> by the database.
>
> To use it :
>
>  * connect to the database you want to analyse ( **mandatory** ).
>
>  * create the function (execute function definition)
>
>  * Execute `SELECT * FROM find_useless_postgres_file('your_database_name')`
>
>
> This will output a list of files that are on the disk but not used by
> postgres,
>
> and so can be removed.
>
> To be extra sure, you should use oid2name programme to check that the
> useless files are really useless.
>
>
> For this :
>  * output the list of potential useless files with copy for instance
>   ex :
>   COPY ( SELECT file_name
>
>  FROM find_useless_postgres_file('your_database_name')
>
>) TO 'path_to_you_database_folder/potential_useless.txt'
>
>now you've got a file with a list of potential erroneous files.
>
>  * Then use oid2name
>
>   `$su postgres
>$cd path_to_you_database_folder
>
>$while read i; do oid2name -f "$i" -i -S -q -d your_database_name; done
> < potential_useless.txt
>   `
>
>   Nothing should show, meaning that every potential erroneous file
>has not been recognized by oid2name !
>
>   If you feel unconvinced, you can manually try oid2name on some
>of the potential erroneous files, to be extra sure.
>   It should not find anything.
>
>
>
>  * Now delete all the files in `potential_useless.txt`.
>
>   It could be wiser to not delete the files but rename those
>
>   (for instance, adding `.potentially_useless` as a postfix)
>
>   so if it breaks something, you have an easy way to revert everything.
>
>
>
> Anyway, use *-*extra extra*-* caution if you delete.
> Except a backup, there would be no easy way to correct a mistake.
>
> Cheers,
>
> Rémi-C
>
>
>
> 2016-03-02 15:38 GMT+01:00 Johnny Morano :
>
> Hi Remi!
>
>
>
> This SQL function you have provided, seems to return all valid files, is
> that correct? In my case, it returned all my ‘base/’ files. Is that normal?
>
> If yes, maybe you rename the function to ‘find_useful_postgres_files’ ;-)
>
>
>
> Could you explain in steps how to use this function to make a cleanup of
> bloated data? (like in an example with commands and example output, if
> possible of course)
>
&

[GENERAL] PLPythonu for production server

2016-03-03 Thread Rémi Cura
Hey List,

would it be considered safe to use plpythonu for a production database?
What would be the limitations/dangers?

Thanks,
Cheers,
Rémi-C


Re: [GENERAL] PLPythonu for production server

2016-03-04 Thread Rémi Cura
Thanks for the answer guys.

I should have mentioned that I had read the doc,
and was looking for non-explicit knowledge,
like:
- what is the reputation of plpython for a DBA?
- are there actual production systems that use it?
- what would be the recommended scope of use?
  (only administration-script-like functions, advanced processing, etc ...)

Cheers,
Rémi-C

2016-03-03 20:55 GMT+01:00 David G. Johnston :

> On Thu, Mar 3, 2016 at 12:35 PM, Adrian Klaver 
> wrote:
>
>> On 03/03/2016 10:09 AM, Rémi Cura wrote:
>>
>>> Hey List,
>>>
>>> would it be considered safe to use plpythonu for a production database?
>>> What would be the limitations/ dangers?
>>>
>>
>> They are explained here:
>>
>> http://www.postgresql.org/docs/9.5/interactive/plpython.html
>>
>> "PL/Python is only available as an "untrusted" language, meaning it does
>> not offer any way of restricting what users can do in it and is therefore
>> named plpythonu. A trusted variant plpython might become available in the
>> future if a secure execution mechanism is developed in Python. The writer
>> of a function in untrusted PL/Python must take care that the function
>> cannot be used to do anything unwanted, since it will be able to do
>> anything that could be done by a user logged in as the database
>> administrator. Only superusers can create functions in untrusted languages
>> such as plpythonu."
>>
>
> ​See also:
>
> http://www.postgresql.org/docs/9.5/interactive/sql-grant.html
>
> GRANT { USAGE | ALL [ PRIVILEGES ] }
> ON LANGUAGE lang_name [, ...]
> TO role_specification [, ...] [ WITH GRANT OPTION ]
>
> ​and
>
> ​GRANT { EXECUTE | ALL [ PRIVILEGES ] }
> ON { FUNCTION function_name ( [ [ argmode ] [ arg_name ] arg_type [,
> ...] ] ) [, ...]
>  | ALL FUNCTIONS IN SCHEMA schema_name [, ...] }
> TO role_specification [, ...] [ WITH GRANT OPTION ]
>
> David J.
>
>


Re: [GENERAL] bloated postgres data folder, clean up

2016-03-04 Thread Rémi Cura
​Hey Julien,
thanks for the original query!

There was a slight mistake in the query:
it was comparing the file name with
pg_class.relfilenode.
That is not safe in some cases (see doc : "caution" in here
<http://www.postgresql.org/docs/current/static/storage-file-layout.html>)
, so better use the pg_relation_filenode() function.
Because of that mistake, this database could not be started anymore (no worry, I had a
copy).
However, using pg_relation_filenode() seems to be safe (the database passes VACUUM FULL
ANALYZE).

I'll modify the query as soon as I have access to gist.
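In the meantime, here is a minimal sketch of the kind of check the corrected
query does (simplified: main-fork files only, default tablespace, current
database; run it as superuser, and still double-check the output with oid2name
before touching anything):

SELECT file_name
FROM pg_ls_dir('base/' || (SELECT oid FROM pg_database
                           WHERE datname = current_database())::text
              ) AS file_name
WHERE file_name ~ '^[0-9]+$'   -- keep only bare filenode files (no fork/segment suffix)
  AND file_name::oid NOT IN (SELECT pg_relation_filenode(c.oid)
                             FROM pg_class c
                             WHERE pg_relation_filenode(c.oid) IS NOT NULL);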

I agree the warning about tablespaces is also important.
I'll put all of this on the wiki as soon as I have permission to create a
new page
(man, how long is this "cool-off" period? It's been several days already!).

Cheers,
Rémi-C



2016-03-03 20:10 GMT+01:00 Julien Rouhaud :

> On 03/03/2016 18:15, Rémi Cura wrote:
> > Hey,
>
> Hello Rémi,
>
> > first I forgot something in the querry to remove the annoying .XXX :
> > ---
> > SELECT distinct substring(file_name from '\d+' )
> > FROM find_useless_postgres_file('your_database_name') ;
> > ---
> >
>
> Thanks for working on this :)
>
> I added a comment on the gist URL you provided. It's a simplified
> version of the main query that should work fine and detect more orphan
> files. Double checking it would be a good idea though.
>
> Also, as you can't check other databases than the one you're connected
> to, I used current_database() instead of user defined database name.
>
> It's also important to warn that all of this only work for finding
> orphan files on the default
> ​​
> tablespace (and to never blindly remove
> files in the PGDATA of course).
> ​​
>
>


Re: [GENERAL] PLPythonu for production server

2016-03-05 Thread Rémi Cura
Thanks !
Cheers,
Rémi-C

2016-03-05 0:38 GMT+01:00 Adrian Klaver :

> On 03/04/2016 01:46 AM, Rémi Cura wrote:
>
>> Thanks for the answer guys.
>>
>> I should have mentionned that I had read the doc,
>> and was looking for non explicit knowledge,
>> like :
>> - what is the reputation of plpython for a dba?
>> - are there actual production system that use it
>> - what would be the recommended usage perimeter ?
>>(only administration script like function, advanced processing, etc
>> ...)
>>
>
> An example:
>
> http://bonesmoses.org/2016/03/04/pg-phriday-being-a-tattletale/
>
>
>
>> Cheers,
>> Rémi-C
>>
>>
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com
>


[GENERAL] python modul pre-import to avoid importing each time

2014-06-19 Thread Rémi Cura
Hey List,

I use plpython with postgis and 2 python modules (numpy and shapely).
Sadly, importing such modules in a plpython function is very slow (several
hundred milliseconds).

I also don't know whether this overhead applies each time the function is
called in the same session.

Is there a way to pre-import those modules once and for all,
so that the python functions are faster?

Thanks,

Cheers,
Rémi-C


Re: [GENERAL] python modul pre-import to avoid importing each time

2014-06-26 Thread Rémi Cura
Hey,
thanks for your answer !

Yep, you are right, the functions I would like to test are going to be called
a lot (100k times), so even 15 ms per call matters.

I'm still a bit confused by a topic I found here :
http://stackoverflow.com/questions/15023080/how-are-import-statements-in-plpython-handled
The answer gives a trick to avoid importing each time, so somehow it must
be useful.

On another internet page (can't find it anymore) somebody mentioned loading
modules at server startup, one way or another, but gave no
details. It seems that the "plpy" python module gets loaded by default;
wouldn't it be possible to hack this module to add other imports inside it?

I also use PL/R (untrusted I guess) and you can create a special table to
indicate which modules to load at startup.

Cheers,
Rémi-C



2014-06-25 21:46 GMT+02:00 Jeff Janes :

> On Thu, Jun 19, 2014 at 7:50 AM, Rémi Cura  wrote:
> > Hey List,
> >
> > I use plpython with postgis and 2 python modules (numpy and shapely).
> > Sadly importing such module in the plpython function is very slow
> (several
> > hundreds of milliseconds).
>
> Is that mostly shapely (which I don't have)?  numpy seems to be pretty
> fast, like 16ms.  But that is still slow for what you want, perhaps.
>
> >
> > I also don't know if this overhead is applied each time the function is
> > called in the same session.
>
> It is not.  The overhead is once per connection, not once per call.
> So using a connection pooler could be really be a help here.
>
> > Is there a way to pre-import those modules once and for all,
> > such that the python function are accelerated?
>
> I don't think there is.  With plperl you can do this by loading the
> module in plperl.on_init and by putting plperl into
> shared_preload_libraries so that this happens just at server start up.
> But I don't see a way to do something analogous for plpython due to
> lack of plpython.on_init.  I think that is because the infrastructure
> to do that is part of making a "trusted" version of the language,
> which python doesn't have.  (But it could just be that no one has ever
> gotten around to adding it.)
>
> Cheers,
>
> Jeff
>


Re: [GENERAL] python modul pre-import to avoid importing each time

2014-06-27 Thread Rémi Cura
Hey,
thanks, now we have good information:

the python packages are really loaded once per connection, so no
optimization is needed.
Unlike plperl or PL/R, there is no easy way to preload packages.
There may be some solutions to make this import happen at connection start, but it
would involve C modifications (I found no trace of a python file or a hackable sql
script in the postgres source and install directories).

After that,
further optimization is possible by skipping the redundant 'import' (because
the module is already loaded) (see the trick here
<http://stackoverflow.com/questions/15023080/how-are-import-statements-in-plpython-handled>
), however the benefits are not proven.
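For the archive, a minimal sketch of that caching trick (the function name and
body are made up for illustration): the imported module is stashed in the
session-global GD dictionary, so later calls skip the import statement.

CREATE OR REPLACE FUNCTION py_mean(vals float8[])
RETURNS float8 AS $$
# hypothetical example: cache the numpy module in GD (shared by all
# plpython functions of this session) so "import" runs only once
if 'numpy' not in GD:
    import numpy
    GD['numpy'] = numpy
numpy = GD['numpy']
return float(numpy.mean(vals))
$$ LANGUAGE plpythonu;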



My use case is simple geometry manipulation functions. It is easier to use
plpython rather than plpgsql thanks to numpy for vector manipulation.
Usually the functions are called inside complex queries with many CTEs, and
execute over 100k rows. Total execution time is on the order of minutes.
(example query at the end)

Thanks everybody,
Rémi

Example query:
CREATE TABLE holding_result AS
WITH the_geom AS (
SELECT gid, geom
FROM my_big_table --200k rows
)
SELECT gid, my_python_function(geom) AS result
FROM the_geom;







2014-06-27 4:27 GMT+02:00 Adrian Klaver :

> On 06/26/2014 02:14 AM, Rémi Cura wrote:
>
>> Hey,
>> thanks for your answer !
>>
>> Yep you are right, the function I would like to test are going to be
>> called a lot (100k times), so even 15 ms per call matters.
>>
>>
> I got to thinking about this.
>
> 100K over what time frame?
>
> How is it being called?
>
>
>
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com
>


Re: [GENERAL] how to create multiple databases running in different dirs

2014-07-01 Thread Rémi Cura
Hey,
postgres already takes care of multiple clients writing/reading,
so you don't really need to be afraid of concurrency (for most things).

If you so desire, you could also have multiple servers on the same
machine (although on different ports).
This way each server would have its own data directory.

Cheers,
Rémi-C


2014-07-01 4:59 GMT+02:00 John R Pierce :

> On 6/30/2014 4:58 PM, frank ernest wrote:
>
>> Hi, I'm new to postgresql and sql in general. I desired to write a
>> program in C that used an sql data base for IPC and because multiple copies
>> of my program might run on the same machine I wanted a way to ensure that
>> only one copy of each multithreaded program got one database but I'm
>> uncertain how to go about this. As the program might be run several
>> different times and each time it should only get the data base from it's
>> predisesor I can't use the pid as a unique data base name. I thought that I
>> might start multiple pgsql instances but somehow that seems wrong. Any
>> ideas?
>>
>
> how would an instance of your program know what to connect to, or which
> previous instance its 'predecessor' was ?
>
> normally, you have ONE database for a given set of applications, and all
> the applications share the same database tables and such.
>
>
> --
> john r pierce  37N 122W
> somewhere on the middle of the left coast
>
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>


Re: [GENERAL] Design ? table vs. view?

2014-07-16 Thread Rémi Cura
Hey,
I guess you know all about PL/R,
the R language extension for postgres.
It is very convenient, though be careful as it sometimes crashed my server.

Cheers,
Rémi-C


2014-07-16 3:42 GMT+02:00 John McKown :

> On Tue, Jul 15, 2014 at 8:46 AM, David G Johnston <
> david.g.johns...@gmail.com> wrote:
>
>> John McKown wrote
>> > I have a table which has some "raw" data in it. By "raw", I mean it is
>> > minimally processed from a log file. Every week, I update this table by
>> > processing the weekly log using awk to create a "psql script" file which
>> > looks similar to:
>> >
>> 
>> > So the overhead may be quite high, because to SELECT from RUNINFO,
>> > PostgreSQL must realize all four views.
>> >
>> > I appreciate your thoughts on if this is OK, given that performance is
>> > currently acceptable. Mainly because this work is basically only done
>> one
>> > a
>> > week, on Sundays. And I don't do it myself, it is done via a scheduler
>> > (not
>> > cron, but similar) which runs some scripts.
>>
>> I would likely make "jobrun.runinfo" into a table while leaving
>> "jobrun.rawdata" as-is.  I would have a function that populates "runinfo"
>> from "rawdata" that I would call after performing the copy to "rawdata".
>> There would be no views - unless you desire a view interface over
>> "runinfo"
>> for API or permission reasons.
>>
>> In 9.4 you can (probably) make "runinfo" an explicit MATERIALIZED VIEW and
>> perform REFRESH command to accomplish the same thing - though I am not
>> particularly familiar with the mechanics of that feature.
>>
>> David J.
>>
>>
> Being the indecisive nut that I am, I am going to do both . I will
> keep the current view. But when I update the rawdata, what I will then do
> is:
>
> drop table runinfo_table;
> create table runinfo_table as select distinct * from runinfo;
>
> I am fairly confident that there cannot be any duplicates in runinfo. But,
> being paranoid as well, I will do the DISTINCT just to be sure. I may
> change the VIEW to do that in the future, and remove it from the
> preceeding. Since the process which updates the rawdata table is automated
> and runs on a Sunday, the time needed to recreate runinfo_table is not
> relevant to me. So I get what I want, unless I update rawdata off schedule.
> I cannot imagine why I would do that since the logs from which I create it
> are generally only available after 17:00 local time on Sunday. Getting the
> iogs-to-date information for the time since the last dump is basically a
> PITA and my current use is not critical. Actually, it is more a
> "skunkworks" project of my own to produce a set of nice graphs, using R,
> which _might_ turn out to be interesting to management, but the production
> of which _will_ help me learn PostgreSQL and R better (hopefully).
>
> Many thanks.
>
> --
> There is nothing more pleasant than traveling and meeting new people!
> Genghis Khan
>
> Maranatha! <><
> John McKown
>


[GENERAL] pg_advisory_lock problem

2014-08-10 Thread Rémi Cura
Hey dear list,

following the advice of Depesz I'm trying to use advisory locks.

I'm trying to perform parallel upserts.
I have 2 different sessions in which I run a very complicated query (lots
of CTEs) concurrently. In fact this complicated query is simply put inside
a plpgsql function for ease of calling.

the query performs upserts into 3 different tables, each time using a
dedicated plpgsql function that looks like this:
-
PERFORM pg_advisory_lock(#1 , 2 or 3 depending on the table#,id of the
stuff I want to upsert) ;
WITH stuff_to_upsert ()
,updating AS (update returning id)
,inserting AS (insert if not updated)
PERFORM pg_advisory_unlock(same as above).


The query calls such a plpgsql function like this
CTE_N(
SELECT r.*
FROM result_to_be_upserted, function_upserting(...) as r
)

Yet I still get duplicate primary key errors because the query
tries to insert the same thing twice.
ERROR:  duplicate key value violates unique constraint
"result_intersection_pkey"
DETAIL:  Key (node_id)=(*12621*) already exists.

Of course when performing the query sequentially I don't have any errors,
even when performing it several times.

I have read the 2 pages relevant to pg_advisory locks, and I clear all the
advisory locks before executing the test that gives those errors.

After the error happens (which means that 1 process completed and the
other failed), I can see that there is a lock in pg_locks with the id of
the row that caused the error when being upserted.
advisory;2953366;;3;*12621*;2;8/0;11380;ExclusiveLock;t;f

Any help is greatly appreciated, I have tried everything I could think of.

Thanks,
Cheers,
Rémi-C


Re: [GENERAL] pg_advisory_lock problem

2014-08-11 Thread Rémi Cura
2014-08-11 5:33 GMT+02:00 John R Pierce :

>
> -
>
>> PERFORM pg_advisory_lock(#1 , 2 or 3 depending on the table#,id of the
>> stuff I want to upsert) ;
>> WITH stuff_to_upsert ()
>> ,updating AS (update returning id)
>> ,inserting AS (insert if not updated)
>> PERFORM pg_advisory_unlock(same as above).
>> 
>>
>
>
> ah, you're releasing the lock before the insert is committed, since this
> is all within a function call, its entirely within a single transaction.
>
>
>
> --
> john r pierce  37N 122W
> somewhere on the middle of the left coast
>
>
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>


Hey,
thanks your two !

Oh no, I hoped it was something fixable,
because I don't think I can fix this transaction problem.
I guess because the big query is called in one plpgsql function it will be
in one transaction,
so even if I resorted to plpython in the upserting functions to issue a
commit it shouldn't work.
The only way would be to transform the big plpgsql query function into
another pl function,
but I can't really afford it :-(

Damn, it is really hard to use a table from 2 different sessions at the same
time!

Thanks for your answers, and for the much needed typo correction ;-)

Cheers,
Rémi-C


Re: [GENERAL] pg_advisory_lock problem

2014-08-11 Thread Rémi Cura
Hey,
as you (both?) suggested, it works using advisory locks taken at the beginning
and released at the end of the transaction. This way there is no upsert at all if the
element is locked? (I used regular advisory locks but in the same way as
transactional locks)
The issue in this case is simple: I have to use about 100k advisory locks,
which is a big memory requirement for my hardware :-(

Merlin, I'm afraid I don't understand what "vanilla" LOCK TABLE is.

I can't really use a table lock because each query upserts sequentially into
3 tables, doing lots of computing in between.
I use parallel queries to compute faster (load dividing). I guess it would be
very slow with about 8 parallel queries holding table locks.

I should test this lock approach to be sure.

Thanks both of you !
Cheers,
Rémi-C



2014-08-11 17:51 GMT+02:00 Merlin Moncure :

> On Mon, Aug 11, 2014 at 9:49 AM, Kevin Grittner  wrote:
> > Rémi Cura  wrote:
> >
> >> 2014-08-11 5:33 GMT+02:00 John R Pierce :
> >
> >>> ah, you're releasing the lock before the insert is committed,
> >>> since this is all within a function call, its entirely within a
> >>> single transaction.
> >
> >> Oh no I hoped it was something fixable
> >
> > Well, it might be.  Try using a transactional advisory lock and
> > letting it expire at the end of the transaction, rather than
> > explicitly releasing it before the transaction commits.  Depending
> > on some other details, that might get it to do what you want.
>
> Better to use vanilla LOCK TABLE statement in my opinion for this purpose.
>
> merlin
>


Re: [GENERAL] pg_advisory_lock problem

2014-08-12 Thread Rémi Cura
2014-08-11 22:48 GMT+02:00 Kevin Grittner :

> Rémi Cura  wrote:
>
> > as you (both?) suggested it works using advisory lock used at the
> > beginning and end of the transaction. This way there is no upsert
> > at all if the element is locked? (I used general advisory lockbut
> > in the same way as transactionnal lock)
>
> This is too vague to comment on.
>
> > The issue in this case is simple : I have to use about 100k
> > advisory locks, which is a big memory requirement for my hardware
> > :-(
>
> ... and that doesn't seem to make any sense.  Either you are not
> understanding advisory locks or you are doing something very, very
> unusual.
>
> > Merlin I'm afraid I don't understand what is "vanilla" LOCK
> > TABLE.
>
> See the LOCK TABLE command.
>
> http://www.postgresql.org/docs/current/interactive/sql-lock.html
>
> http://www.postgresql.org/docs/current/interactive/explicit-locking.html
>
> > I can't really use a lock table because each query upsert
> > sequentially into 3 tables, doing lots of computing between.
>
> Now *that* I understand.  :-)  It's not an unusual requirement,
> but can be a challenge when using snapshot isolation (where writes
> don't block reads and reads don't block anything).  There are two
> main approaches -- introduce blocking to serialize some of the
> operations, or use the SERIALIZABLE transaction isolation level to
> ensure that the behavior of all concurrent transactions is
> consistent with the behavior you would see if they were run one at
> a time.  The latter approach doesn't introduce any new blocking,
> but it can cause transactions to get an ERROR with a SQLSTATE of
> 40001 at just about any point, so you need to be prepared to
> recognize that and retry those transactions from the beginning (not
> just the last statement of the transaction), ignoring any data read
> during the failed attempt.
>
> You may want to read the entire chapter on concurrency control:
>
> http://www.postgresql.org/docs/current/interactive/mvcc.html
>
> If you are considering using SERIALIZABLE transactions, you should
> probably review the examples in the Wiki, to get an idea of how it
> behaves in various cases:
>
> http://wiki.postgresql.org/wiki/SSI
>
> > I use parallel query to compute faster (load dividing). I guess
> > it would be very slow with about 8 parallel queries with locks.
>
> Well, if you introduce blocking you reduce your parallelism, but if
> you use serializable transactions and there are actually a lot of
> conflicts you can see poor performance because of the errors
> rolling back transactions and the need to retry them from the
> start.  The techniques used to implement serializable transactions
> in PostgreSQL are basically a refinement of the Optimistic
> Concurrency Control (OCC) techniques, but generally with far fewer
> retries needed -- the point being that it optimistically assumes
> that there will not be a conflict so that concurrency is better,
> but has to cancel things if that optimism proves to be unfounded.
>
> To make related to changes to multiple tables and maintain coherent
> data, you probably will need to do one or the other.
>
> --
> Kevin Grittner
> EDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>


Hey, thanks for your detailed answer.

The particularity here is that I use a big query with CTEs instead of a more
procedural approach.
I do sophisticated geometric computing using postGIS. I guess it is a hack
of both postgres and postgis.

Let me explain better the pg_advisory_lock uses I have tried.

*First classic use of pg_advisory, not working :*
CTE_1 (find what rows will be upserted in table_1)
CTE_2 (find what rows will be upserted in table_2)
CTE_3 (find what rows will be upserted in table_3)
CTE_4 (compute the result to be upserted into table_1)
CTE_5 (upsert into table_1 using custom upsert_function)
CTE_6 (compute the result to be upserted into table_2)
CTE_7 (upsert into table_2 using custom upsert_function)
CTE_8 (compute the result to be upserted into table_2)
CTE_9 (upsert into table_2 using custom upsert_function)
CTE_10 (end of query)
each of the upserting functions is plpgsql and does something like
pg_advisory_lock(table_number, id of row to be upserted)
with updating AS (update table)
insert into table if not updated
pg_advisory_unlock(table_number, id of row to be upserted)

According to what the list said, it doesn't work because of visibility
issues: the locking works, so we know the processes will upsert the same
thing sequentially. However it will fail because each process has no
visibility on the inserts done by the others. So when the second process
upserts the same thing, it will try to insert it again.

Re: [GENERAL] pg_advisory_lock problem

2014-08-20 Thread Rémi Cura
Hey,
just a quick follow-up for the archive:

problem solved with pg_try_advisory...
using a table lock (ROW EXCLUSIVE) doesn't work (same duplicate key error)
using a table lock (ACCESS EXCLUSIVE) works (no error) but stops the parallelism (one
thread has to wait for the other to go on).
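For the record, a minimal sketch of what one of the upserting helpers looks
like with this approach, shown here with the transaction-scoped try-lock
variant (the column list is made up; only the table name and node_id come from
the real error message):

CREATE OR REPLACE FUNCTION upsert_result_intersection(_node_id int, _val float8)
RETURNS boolean AS $$
BEGIN
    -- lock key = (table number, row id); held until the enclosing
    -- transaction ends, so no explicit unlock is needed
    IF NOT pg_try_advisory_xact_lock(1, _node_id) THEN
        RETURN false;  -- another session already works on this row: skip it
    END IF;
    UPDATE result_intersection SET val = _val WHERE node_id = _node_id;
    IF NOT FOUND THEN
        INSERT INTO result_intersection(node_id, val) VALUES (_node_id, _val);
    END IF;
    RETURN true;
END;
$$ LANGUAGE plpgsql;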

The optimal solution would be to slice the big CTE into several successive
temp table creations, using a procedural language allowing control of
transactions.
This would allow controlling visibility and using the powerful MVCC features.
(I'm unwilling to do that.)

Cheers everybody and many thanks for the help,
Rémi-C


2014-08-12 14:45 GMT+02:00 Merlin Moncure :

> On Tue, Aug 12, 2014 at 5:07 AM, Rémi Cura  wrote:
> >
> >
> >
> > 2014-08-11 22:48 GMT+02:00 Kevin Grittner :
> >
> >> Rémi Cura  wrote:
> >>
> >> > as you (both?) suggested it works using advisory lock used at the
> >> > beginning and end of the transaction. This way there is no upsert
> >> > at all if the element is locked? (I used general advisory lockbut
> >> > in the same way as transactionnal lock)
> >>
> >> This is too vague to comment on.
> >>
> >> > The issue in this case is simple : I have to use about 100k
> >> > advisory locks, which is a big memory requirement for my hardware
> >> > :-(
> >>
> >> ... and that doesn't seem to make any sense.  Either you are not
> >> understanding advisory locks or you are doing something very, very
> >> unusual.
> >>
> >> > Merlin I'm afraid I don't understand what is "vanilla" LOCK
> >> > TABLE.
> >>
> >> See the LOCK TABLE command.
> >>
> >> http://www.postgresql.org/docs/current/interactive/sql-lock.html
> >>
> >>
> http://www.postgresql.org/docs/current/interactive/explicit-locking.html
> >>
> >> > I can't really use a lock table because each query upsert
> >> > sequentially into 3 tables, doing lots of computing between.
> >>
> >> Now *that* I understand.  :-)  It's not an unusual requirement,
> >> but can be a challenge when using snapshot isolation (where writes
> >> don't block reads and reads don't block anything).  There are two
> >> main approaches -- introduce blocking to serialize some of the
> >> operations, or use the SERIALIZABLE transaction isolation level to
> >> ensure that the behavior of all concurrent transactions is
> >> consistent with the behavior you would see if they were run one at
> >> a time.  The latter approach doesn't introduce any new blocking,
> >> but it can cause transactions to get an ERROR with a SQLSTATE of
> >> 40001 at just about any point, so you need to be prepared to
> >> recognize that and retry those transactions from the beginning (not
> >> just the last statement of the transaction), ignoring any data read
> >> during the failed attempt.
> >>
> >> You may want to read the entire chapter on concurrency control:
> >>
> >> http://www.postgresql.org/docs/current/interactive/mvcc.html
> >>
> >> If you are considering using SERIALIZABLE transactions, you should
> >> probably review the examples in the Wiki, to get an idea of how it
> >> behaves in various cases:
> >>
> >> http://wiki.postgresql.org/wiki/SSI
> >>
> >> > I use parallel query to compute faster (load dividing). I guess
> >> > it would be very slow with about 8 parallel queries with locks.
> >>
> >> Well, if you introduce blocking you reduce your parallelism, but if
> >> you use serializable transactions and there are actually a lot of
> >> conflicts you can see poor performance because of the errors
> >> rolling back transactions and the need to retry them from the
> >> start.  The techniques used to implement serializable transactions
> >> in PostgreSQL are basically a refinement of the Optimistic
> >> Concurrency Control (OCC) techniques, but generally with far fewer
> >> retries needed -- the point being that it optimistically assumes
> >> that there will not be a conflict so that concurrency is better,
> >> but has to cancel things if that optimism proves to be unfounded.
> >>
> >> To make related to changes to multiple tables and maintain coherent
> >> data, you probably will need to do one or the other.
> >>
> >> --
> >> Kevin Grittner
> >> EDB: http://www.enterprisedb.com
> >> The Enterprise PostgreSQL Company
> >
> >
> >
>

Re: [GENERAL] PostgreSQL Portable

2014-09-10 Thread Rémi Cura
Hey,
I'm working in the GIS field and I had the same problems.
The solution I found, which has been working for the past year:
a virtual box on an external drive!
This way you can have an independent OS (Linux for easy
postgres/postgis/whatever gis you want).

I find it very comfortable because my server is separated from the guest os. So
I can take the disk and work on any pc with VirtualBox installed (requires
admin rights), and I have all the GIS tools on the server, so the virtual
machine is very self contained.
It is also easy to back up (but very slow due to the huge iso file).

I use a USB2 okay-ish disk. Guest win XP 64 / win seven 32 ; Host Ubuntu
12.04 32b.
About performance: I do complex queries. Perf is OK for my use case (about the same
as a dedicated XP 32bit).

Using the external disk to hold a tablespace is a __very__ bad idea.
As soon as you do an upgrade, the disk gets disconnected, or anything happens, you
are really screwed.
(I had the issue. Without a backup you can't do much without very strong
postgres skills.)

Cheers,
Rémi-C



2014-09-10 23:50 GMT+02:00 Steve Crawford :

>  On 09/10/2014 02:00 PM, Daniel Begin wrote:
>
>  First, I am a Newbie regarding PostgreSQL …
>
>
>
> I just started to look at PostgreSQL to implement a large GIS DB (1Tb).
> The data must reside in an external disk with eSATA connection and may be
> moved to different locations (and Windows desktops/laptops). I was looking
> to install PostgreSQL and PostGIS extensions on each PC (setting-up the
> proper PGDATA directory to the external disk) until I read about PostgreSQL
> and PgAdmin Portable …
>
>
>
> http://sourceforge.net/projects/pgadminportable/
>
> http://sourceforge.net/projects/postgresqlportable/
>
>
>
> Is that a viable alternative considering the expected size of the DB? Any
> comments or proposal would be appreciated J
>
> Daniel
>
>
> It appears you are looking to take the PostgreSQL data directory from
> machine to machine on an external drive. I fear you will run into some
> potential problems:
>
> 1. Performance (mentioned by others).
>
> 2. OS mismatch. Have you ensured that all client machines are running
> identical setups? The underlying files are not guaranteed portable between
> OS versions and 64/32-bit. In fact they probably won't be.
>
> 3. Backups. What happens when one user screws up the database?
>
> Perhaps you could explain further the genesis of this requirement. The
> message list is littered with questions like this asking how to implement a
> certain solution when, given an understanding of the reason the question is
> being asked, a far better solution exists. This happens even more often
> when the person asking is a "newbie."
>
> Cheers,
> Steve
>
>


Re: [GENERAL] permission denied for schema topology

2014-09-11 Thread Rémi Cura
Hey,
if you are using postgis and postgis_topology,
there are specific backup/restore process.
Cheers,
Rémi-C

2014-09-11 12:49 GMT+02:00 Iain Mott :

> Hello,
>
> The server for my websites was recently changed and upgraded. I have ssh
> access to the server and since the upgrade I am no longer able to use
> "pg_dump" to perform scripted backups. I've written to the site's
> support services, but until now, they've not been able to help (they've
> responded saying they will likely need to refer the case on to a
> developer - I don't  have any guarantees that it will be resolved).
>
> I've googled for this - but didn't come up with any relevant solutions.
> Can someone on this list please make suggestions that I can pass on to
> the technical support?
>
> Here's what happens (the important error messages are in English):
>
>
> [~]# pg_dump mydatabase > dump.sql
> Senha:
> pg_dump: comando SQL falhou
> pg_dump: Mensagem de erro do servidor: ERROR:  permission denied for
> schema topology
> pg_dump: O comando foi: LOCK TABLE topology.topology IN ACCESS SHARE MODE
>
>
> ---
>
> I am able to perform dumps of the databases via phpPdAdmin in the
> "cpanel" of the server, but this is going very inconvenient - hoping to
> use pg_dump
>
> Thanks,
>
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>


Re: [GENERAL] permission denied for schema topology

2014-09-11 Thread Rémi Cura
Yep,
this means you are using postgis.
I don't know if you use postgis topology.
If you have a schema topology containing a topology table, then you also
have postgis_topology installed.

You can check this with this query: "SELECT postgis_full_version();"

Maybe the extensions have been created with another user id than yours,
hence the trouble.
Another problem could come from the fact that postgis and postgis_topology
may have been installed without using postgres CREATE EXTENSION (old
versions of postgis).

Cheers,
Rémi-C

2014-09-11 13:12 GMT+02:00 Iain Mott :

> Thanks Rémi-C for the quick reply. By coincidence my site does involve
> mapping, however I wasn't aware that the server might be using postgis
> and postgis_topology. In psql when I type "\d" i get the following:
>
> public  | comentarios   | tabela | myusername
>  public  | featuredata   | tabela | myusername
>  public  | geography_columns | visão  | myusername
>  public  | geometry_columns  | visão  | myusername
>  public  | raster_columns| visão  | myusername
>  public  | raster_overviews  | visão  | myusername
>  public  | rss   | tabela | myusername
>  public  | spatial_ref_sys   | tabela | myusername
>
> geography_columns, geometry_columns, raster_columns, raster_overviews
> and spatial_ref_sys are all recent changes to my database (I didn't
> create them and I wasn't aware of their existence until the switch to
> the new server). Do these and the errors I'm receiving indicate the use
> of postgis and postgis_topology by the server?
>
> Just googled for doing a dump of a postgis database and encountered this
> command:
>
> pg_dump --no-acl --no-owner $DATABASE > dump.sql
>
> I tried it, but it results in the same error messages
>
> Cheers,
>
>
>
>
>
>
>
>
>
>  Em Qui, 2014-09-11 às 12:53 +0200, Rémi Cura escreveu:
> > Hey,
> >
> > if you are using postgis and postgis_topology,
> > there are specific backup/restore process.
> > Cheers,
> > Rémi-C
> >
> >
> > 2014-09-11 12:49 GMT+02:00 Iain Mott :
> > Hello,
> >
> > The server for my websites was recently changed and upgraded.
> > I have ssh
> > access to the server and since the upgrade I am no longer able
> > to use
> > "pg_dump" to perform scripted backups. I've written to the
> > site's
> > support services, but until now, they've not been able to help
> > (they've
> > responded saying they will likely need to refer the case on to
> > a
> > developer - I don't  have any guarantees that it will be
> > resolved).
> >
> > I've googled for this - but didn't come up with any relevant
> > solutions.
> > Can someone on this list please make suggestions that I can
> > pass on to
> > the technical support?
> >
> > Here's what happens (the important error messages are in
> > English):
> >
> >
> > [~]# pg_dump mydatabase > dump.sql
> > Senha:
> > pg_dump: comando SQL falhou
> > pg_dump: Mensagem de erro do servidor: ERROR:  permission
> > denied for schema topology
> > pg_dump: O comando foi: LOCK TABLE topology.topology IN ACCESS
> > SHARE MODE
> >
> >
> > ---
> >
> > I am able to perform dumps of the databases via phpPdAdmin in
> > the
> > "cpanel" of the server, but this is going very inconvenient -
> > hoping to
> > use pg_dump
> >
> > Thanks,
> >
> >
> >
> > --
> > Sent via pgsql-general mailing list
> > (pgsql-general@postgresql.org)
> > To make changes to your subscription:
> > http://www.postgresql.org/mailpref/pgsql-general
> >
> >
>
>
>


Re: [GENERAL] PostgreSQL Portable

2014-09-12 Thread Rémi Cura
Hey,
I have had many external hard drive crashes (savage unplug, power off, forced pc
restart).
The server on the virtual machine was never hurt, nor was the data.

Cheers,
Rémi-C

2014-09-12 15:34 GMT+02:00 George Neuner :

> Hi Craig,
>
> On Fri, 12 Sep 2014 11:33:55 +0800, Craig Ringer
>  wrote:
>
> >On 09/11/2014 03:16 PM, George Neuner wrote:
> >>
> >> If the driver permits it and you [or your users] can be trusted to
> >> perform a safe unmount via the OS *before* disconnecting the device,
> >> then you can enable write caching for the device using the device
> >> manager.  [Note that the device must be connected for it to be visible
> >> in the device manager.]
> >
> >It shouldn't be living dangerously, actually.
> >
> >While I haven't tested it myself, writeback caching on the external
> >drive should be safe so long as it continues to honour explicit disk
> >flush requests.
> >
> >That's why we have the WAL and do periodic checkpoints. If you yank the
> >drive mid-write you'll lose uncommitted transactions and might have
> >slower startup next time around, but it should otherwise not be overly
> >problematic.
>
> For the most part you're correct, but recall that WAL itself can be
> made asynchronous [see fsync() and synchronous_commit() settings] and
> the periodic OS sync also may be disabled - which doesn't affect WAL
> handling but may(?) affect the background writer.
>
> Even having synchronous WAL the most recent transactions can be lost
> if the log device fails *during* a write.  That's why, if we use
> external devices at all, we tend to use closely coupled devices - disk
> array, wired SAN, etc. - that aren't very likely to be physically
> disconnected.  And uninterruptible power all around 8-)
>
> A portable device can be reasonably safe if treated properly, but it
> never will be quite as safe as an internal device.
>
> George
>
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>


Re: [GENERAL] csv import error

2014-09-23 Thread Rémi Cura
Why don't you use "COPY" ?
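Something along these lines should work (a minimal sketch; the staging table
and the file path are assumptions, and the detour through a staging table is
only there because the file's empty "oid" column cannot be copied straight
into your serial NOT NULL column; this is a server-side path, with a
client-side file psql's \copy does the same):

CREATE TEMP TABLE assorb_staging (
    key integer, daprof double precision, aprof double precision,
    tipo character(1), valore double precision, note character(254),
    oid_in_file text, unit_mis smallint);

COPY assorb_staging
FROM '/path/to/assorb.csv'
WITH (FORMAT csv, DELIMITER ';', HEADER true);

INSERT INTO info_pozzi_hydrocarbon.assorb
    (key, daprof, aprof, tipo, valore, note, unit_mis)
SELECT key, daprof, aprof, tipo, valore, note, unit_mis
FROM assorb_staging;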
Cheers,
Rémi-C

2014-09-23 12:50 GMT+02:00 FarjadFarid(ChkNet) <
farjad.fa...@checknetworks.com>:

> Hi,
>
>
>
> Quotation marks should be around both the name of each and every column
> and their values.
>
>
>
> Also replace column comas as separation character for sql insert
> statement.
>
> What has happened here is that the values from CSV are directly  into sql.
>
>  Hope this helps.
>
>
>
> Best Regards
>
>
>
>
>
> Farjad Farid
>
>
>
> *From:* pgsql-general-ow...@postgresql.org [mailto:
> pgsql-general-ow...@postgresql.org] *On Behalf Of *Eugenio Trumpy
> *Sent:* 23 September 2014 11:26
> *To:* pgsql-general@postgresql.org
> *Subject:* [GENERAL] csv import error
>
>
>
> Hello,
>
> I'm trying to import data using a csv file,
> but I got an error:
>
>
> ERROR:  column "key;daprof;aprof;tipo;valore;note;oid;unit_mis" of relation 
> "assorb" does not exist
>
> LINE 1: INSERT INTO "info_pozzi_hydrocarbon"."assorb" ("key;daprof;a...
>
>^
>
> *In statement:*
>
> INSERT INTO "info_pozzi_hydrocarbon"."assorb"
> ("key;daprof;aprof;tipo;valore;note;oid;unit_mis") VALUES
> ('1001334;19.1;21;A;6;;;11')
>
>
> My sql statement for the table that I would like to populate is:
>
> -- Table: info_pozzi_hydrocarbon.assorb
>
> -- DROP TABLE info_pozzi_hydrocarbon.assorb;
>
> CREATE TABLE info_pozzi_hydrocarbon.assorb
> (
>   key integer,
>   daprof double precision,
>   aprof double precision,
>   tipo character(1),
>   valore double precision,
>   note character(254),
>   oid serial NOT NULL,
>   unit_mis smallint,
>   CONSTRAINT assorb_pk PRIMARY KEY (oid),
>   CONSTRAINT assorb_fk FOREIGN KEY (key)
>   REFERENCES pozzi (key) MATCH SIMPLE
>   ON UPDATE CASCADE ON DELETE CASCADE,
>   CONSTRAINT unit_m_fk FOREIGN KEY (unit_mis)
>   REFERENCES info_cod.unita (unita) MATCH SIMPLE
>   ON UPDATE NO ACTION ON DELETE NO ACTION
> )
> WITH (
>   OIDS=FALSE
> );
>
>
>
> the csv file is like the following (it is only a part):
>
>
> key;daprof;aprof;tipo;valore;note;oid;unit_mis
>
> 1001334;19.1;21;A;6;;;11
>
> 1001334;93.5;94;A;30;;;11
>
> 1001334;94;115;A;20;;;11
>
> 1001334;154.5;255;A;644;;;11
>
> 1001334;273;282;A;4;;;11
>
> 1001334;298;309;A;7;;;11
>
> 1001334;432;1224;P;1850;;;11
>
> 4277001;121;901;A;397;ALLARGAMENTO FORO;;11
>
> 4277001;121;901;A;96;PERFORAZIONE;;11
>
>
> The filed order it is the same and also the decimal separator is set as dot. 
> I set also the delimiter for fields.
> I don't understand what was the problem.
>
> Can somebody help me?
>
> Eugenio
>
>


Re: [GENERAL] DBlink, postgres to DB2

2014-10-22 Thread Rémi Cura
Hey
http://www.postgresql.org/docs/9.3/static/contrib-dblink-connect.html
Seems dblink is for postgres to postgres connections.
Cheers,
Rémi-C

2014-10-22 12:45 GMT+02:00 Postgres India :

> Hi,
> I am trying to connect DB2 from postgres using dblink, is there any
> configuration required at DB2 and postgres server.
>
> If any command string please provide it.
>
>
>
> Thanks
> Manmohan. K
>
>


Re: [GENERAL] DBlink, postgres to DB2

2014-10-22 Thread Rémi Cura
You may want to use foreign data wrapper (fdw)

https://wiki.postgresql.org/wiki/Foreign_data_wrappers

Cheers,
Rémi-C

2014-10-22 12:50 GMT+02:00 Rémi Cura :

> Hey
> http://www.postgresql.org/docs/9.3/static/contrib-dblink-connect.html
> Seems dblink is for postgres to postgres connections.
> Cheers,
> Rémi-C
>
> 2014-10-22 12:45 GMT+02:00 Postgres India :
>
>> Hi,
>> I am trying to connect DB2 from postgres using dblink, is there any
>> configuration required at DB2 and postgres server.
>>
>> If any command string please provide it.
>>
>>
>>
>> Thanks
>> Manmohan. K
>>
>>
>


[GENERAL] update several columns from function returning several values

2014-12-02 Thread Rémi Cura
Hey,
a trivial question I guess,
but I can't make it work.

If I define a function returning several values, I can't use it to update
multiple columns of a table at once.
I __don't__ want to use a CTE or subquery,
and of course I don't want to compute the function several times.

CREATE TABLE test_update_m_values (
gid int,
gid_plus_1 int,
gid_minus_1 int
);

INSERT INTO test_update_m_values VALUES (1,0,0) ;

CREATE OR REPLACE FUNCTION rc_test_update_m_values( gid int,OUT gid_plus_1
int, OUT gid_minus_1 int)
AS $$  -- @brief : test function, can be deleted
BEGIN
SELECT gid+1,gid-1 INTO gid_plus_1, gid_minus_1;
RETURN ;
END;
$$ LANGUAGE 'plpgsql' IMMUTABLE STRICT   ;


UPDATE test_update_m_values SET (gid_plus_1,gid_minus_1) =
(rc_test_update_m_values(gid)); --doesn't work

Does somebody know if this is possible?

CHeers,
Rémi-C


Re: [GENERAL] update several columns from function returning several values

2014-12-03 Thread Rémi Cura
Thanks,
waiting for the 9.5 then =)
Cheers,
Rémi-C

2014-12-02 18:23 GMT+01:00 Tom Lane :

> =?UTF-8?Q?R=C3=A9mi_Cura?=  writes:
> > IF I define a function returning several values , I can't use it to
> update
> > mutliple columns of a table at once.
> > ...
> > UPDATE test_update_m_values SET (gid_plus_1,gid_minus_1) =
> > (rc_test_update_m_values(gid)); --doesn't work
>
> > Somebody now if this is possible?
>
> Not currently.  In 9.5 it'll be possible to do
>
> UPDATE test_update_m_values SET (gid_plus_1,gid_minus_1) =
> (select * from rc_test_update_m_values(gid));
>
> but the syntax you were trying will never work, because it would be
> ambiguous with the case of assigning a composite value to a single
> composite column.
>
> regards, tom lane
>


Re: [GENERAL] Indexing large table of coordinates with GiST

2015-01-15 Thread Rémi Cura
Hey,
You may want to post this on the postGIS list.

I take it that so many rows means either raster or point cloud data.
If it is a point cloud, simply consider using pg_pointcloud.
A 6 billion point cloud is about 600 k rows for one of my data sets.

If it is raster, you may consider using the postgis raster type.
If you really want to keep that many geometries,
you may want to partition your data on a regular grid.
Cheers,
Rémi-C

2015-01-15 15:45 GMT+01:00 Andy Colson :

> On 1/15/2015 6:44 AM, Daniel Begin wrote:
>
>> Hi, I'm trying to create an index on coordinates (geography type) over a
>> large table (4.5 billion records) using GiST...
>>
>> CREATE INDEX nodes_geom_idx ON nodes USING gist (geom);
>>
>> The command ran for 5 days until my computer stops because a power outage!
>> Before restarting the index creation, I am asking the community if there
>> are
>> ways to shorten the time it took the first time :-)
>>
>> Any idea?
>>
>> Daniel
>>
>>
>>
>>
> Set maintenance_work_mem as large as you can.
>
> -Andy
>
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>


Re: [GENERAL] Indexing large table of coordinates with GiST

2015-01-16 Thread Rémi Cura
Please allow me one more guess ^^
Third guess: you are using topology (nodes are indexed by node_id).

  - If this is the case, you could use postgis topology.
  - The gain is that with this topology model, you store *shared linestrings*,
not shared points.


More seriously, from what you say it seems possible to use pg_pointcloud
with your data,
if the following assumption is correct:
*When querying by other attributes, the points you get are roughly in the
same area (at least the area is a significant subset of the total area).*
So to be perfectly clear: if for a node with node_id *N* you can expect
that the node with node_id *N+1* is spatially close to node N, you can
use pg_pointcloud and it will be effective.

Then the solution could be: partition your points spatially (i.e. from
your billions of points, you create a few million groups of points, with a
grid, clustering, whatever).
Then create an index on each group of points' bounding box.
Then create an index (gist) on range(node_id) for each group of points.
... and create indexes for the other attributes: on range(attribute)

Then you can query your data effectively, and the index size will fit into
RAM (about 1 GB for 8 million patches for me).
The query would be:
  - first get the groups of points of potential interest
(WHERE ST_Intersects(group_of_points.bbox, your_polygon) AND
group_of_points.range(node_id) && numrange(123,678) AND other attribute
filtering)
  - second, from the selected groups of points, extract the actual points
and do the fine filtering you need
   (WHERE ST_Intersects(ST_MakePoint(point.X, point.Y, point.Z), your_polygon)
AND node_id BETWEEN 123 AND 678 ...)


If the assumption is correct, it works well (for instance, all the billions of
points I use also have a time stamp (equivalent to your node_id I would
say); I frequently query on time ranges and it is as fast as a spatial query
(that is, milliseconds order of magnitude)).

To give you an order of magnitude of the work involved, it would take me a
couple of hours to put your data into pg_pointcloud (computing time would
be about 12 hours multi-processed, absolutely all inclusive).

Cheers,
Rémi-C



2015-01-16 1:18 GMT+01:00 Nathan Clayton :

>
> On 1/15/2015 12:36 PM, Daniel Begin wrote:
>
>>
>> Thank, there is a lot of potential ways to resolve this problem!
>>
>> For Rob, here is a bit of context concerning my IT environment…
>>
>> Windows 7 64b Desktop, running with an Intel i7 core and 16GB ram. The
>> PostgreSQL 9.3 database is stored on a 3TB external drive (USB-3 connection
>> with write cache enabled and backup battery) and a temp_tablespaces is
>> pointing to a 1TB internal drive.
>>
>> Now, let me answered/questioned given proposals in the order I received
>> them…
>>
>> 1-Andy, I will set maintenance_work_mem as large as I can unless someone
>> points to an important caveat.
>>
>> 2-Vick, partitioning the table could have been very interesting. However,
>> I will have to query the table using both the node ID (which could have
>> provided a nice partition criterion) and/or the node location (find nodes
>> within a polygon). I am not familiar with table partition but I suspect I
>> can’t create a spatial index on a table that have been partitioned (split
>> into multiple tables that inherit from the “master" table). Am I right?
>>
>> 3-Rémi, so many rows does not necessarily mean either raster or points
>> cloud (but it’s worth asking!-).  As I mentioned previously, I must be able
>> to query the table not only using nodes location (coordinates) but also
>> using the few other fields the table contains (but mainly node IDs). So, I
>> don’t think it could work, unless you tell me otherwise?
>>
>> 4-Paul, the nodes distribution is all over the world but mainly over
>> inhabited areas. However, if I had to define a limit of some sort, I would
>> use the dateline.  Concerning spatial queries, I will want to find nodes
>> that are within the boundary of irregular polygons (stored in another
>> table). Is querying on irregular polygons is compatible with geohashing?
>>
>> Regards,
>>
>> Daniel
>>
>>
>>  Provided you have an integer primary key on both your node tables and
> polygon tables, would it make sense to preprocess the overlaps and have a
> many-to-many table with the node-id and polygon-id? Depending on the speed
> in which data is ingested, you could easily build triggers to run after
> inserts/updates to keep the table updated, or you could create a globally
> unique autoincrement field that tracks revisions and update everything
> after a given high-water mark.
>
> Lookups and joins would be using integers and should give you much better
> performance than searching through the polygons.
>
> For the many-to-many table, something like (you can obviously parse it out
> into smaller batches on the insert if you need to so you don't blow up your
> memory usage. If needed you can have two tables partitioned on either the
> node-id or the polygon-id to speed up lookups, as

Re: [GENERAL] Fwd: Ask for a question

2015-01-21 Thread Rémi Cura
More bluntly maybe :

if you can do it in Excel,
you can do it in Postgres.

Cheers,
Rémi-C

2015-01-21 16:37 GMT+01:00 Raymond O'Donnell :

> On 21/01/2015 14:38, Pierre Hsieh wrote:
> >
> >
> > Hi,
> >
> >
> >
> > Would you please tell me whether PostgreSQL can execute the following
> > tasks? If not, please also tell me which one can help me for that. Thanks
>
>
> Not clear what you're asking, but if you just want to find the standard
> deviation of a sample then that's no problem:
>
>
> http://www.postgresql.org/docs/9.4/static/functions-aggregate.html#FUNCTIONS-AGGREGATE-STATISTICS-TABLE
>
> Hope this helps,
>
> Ray.
>
>
> --
> Raymond O'Donnell :: Galway :: Ireland
> r...@iol.ie
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>


Re: [GENERAL] how to calculate standard deviation from a table

2015-01-22 Thread Rémi Cura
Are you sure you don't want a moving window
(stddev on rows 0 to 50, then stddev on rows 1 to 51)
...

If you don't want a moving window, your query would look like:

DROP TABLE IF EXISTS your_data;
CREATE TABLE your_data AS
SELECT s as gid , random() as your_data_value
FROM generate_series(1,1) as  s ;

SELECT min(gid) as min_gid, max(gid) as max_gid, stddev(your_data_value) as
your_stddev
FROM your_data
GROUP BY (gid-1)/50
ORDER BY min_gid ASC


Please note that "min(gid) as min_gid, max(gid) as max_gid" and "ORDER BY
min_gid ASC" are just there to help you understand the result
Cheers,
Rémi-C

2015-01-22 16:49 GMT+01:00 David G Johnston :

> Pierre Hsieh wrote
> > Hi
> >
> > This table just has a column which type is integer. There are one million
> > data in this table. I wanna calculate standard deviation on each 50 data
> > by
> > order. It means SD1 is from data 1 to data 50, SD2 is from data 51 to
> > 100 Is there anyone who can give me some suggestions? Thanks
> >
> > Pierre
>
> Integer division
>
> David J.
>
>
>
> --
> View this message in context:
> http://postgresql.nabble.com/how-to-calculate-standard-deviation-from-a-table-tp5835031p5835042.html
> Sent from the PostgreSQL - general mailing list archive at Nabble.com.
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>


Re: [GENERAL] How do I bump a row to the front of sort efficiently

2015-02-05 Thread Rémi Cura
Hey,
I'm not a guru, here is what I understood.
You are mixing several problems in the same question :
 - 1. why the planner isn't more efficient
 - 2. why the workaround is difficult to use with an ORM.

for 1. you can't do much (as said by others, you don't really need a case
here anyway). I think using a CASE is equivalent for the planner to using
your own custom blackbox function. So no way to improve that.
for 2.: if you can't pass limit and offset in your ORM,
a small workaround is to number your rows following the order you defined,
using the function row_number() over(your order here);
then you can use your ORM to build WHERE conditions equivalent to limit
and offset:

WHERE row_number BETWEEN your_offset AND your_limit
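For instance, something like this (a minimal sketch reusing the "topic" table
and the id = 1000 case from Paul's query below; the 1..30 window plays the
role of OFFSET/LIMIT):

SELECT *
FROM (SELECT t.*,
             row_number() OVER (ORDER BY (id <> 1000), bumped_at DESC) AS rn
      FROM topic t) AS numbered
WHERE rn BETWEEN 1 AND 30;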

Cheers,
Rémi-C

2015-02-04 21:40 GMT+01:00 Paul Jungwirth :

> >> I imagine your original would be at risk of LIMITing out the very row
> you
> >> seek to get at the "top", since you don't have an ORDER BY to tell it
> which
> >> ones to keep during the outer LIMIT.
>
> Here is an old thread about combining ORDER BY with UNION:
>
> http://www.postgresql.org/message-id/16814.1280268...@sss.pgh.pa.us
>
> So I think this query would work:
>
> select * from topic
> where id = 1000
> union all
> (select * from topic
> where id <> 1000
> order by bumped_at desc
> limit 29)
> order by case when id = 1000 then 0 else 1 end, bumped_at desc
> ;
>
> > I need to be able to offset and limit the union hack in a view, which
> > is proving very tricky.
>
> Since this is sort of a "parameterized view" (which Postgres does not
> have) you are probably better off figuring out how to make the UNION
> query work with your ORM. What ORM is it? Maybe someone here can help
> you with that. Or maybe instead of a view you could write a
> set-returning function, e.g. as described here:
>
>
> http://stackoverflow.com/questions/11401749/pass-in-where-parameters-to-postgresql-view
>
> Paul
>
> --
> _
> Pulchritudo splendor veritatis.
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>


[GENERAL] Missing table from in INSERT RETURNING

2015-02-17 Thread Rémi Cura
Hello dear list,
I would appreciate some help on a small matter that has been bothering me
for a long time :


CREATE TABLE test_insert_returning(
gid SERIAL
,some_value int
);
WITH serie AS (
select s, s*10 as computing
from generate_series(1,10) as s
)
INSERT INTO test_insert_returning (some_value)
SELECT computing
FROM serie
RETURNING gid, serie.s

doesn't work.

The only workaround I found was to create a plpgsql function that does an
individual insert,
so that:
WITH serie AS (
select s, s*10 as computing
from generate_series(1,10) as s
)
SELECT serie.s, my_inserting_function(computing)
FROM serie ;

But it is very annoying and potentially bad for performance because many
individual inserts may be fired.

Any solution?
(postgres 9.3)
Cheers,
Rémi-C


Re: [GENERAL] Failure loading materialized view with pg_restore

2015-02-18 Thread Rémi Cura
Hey,
pg_hba is to manage who has *access* to database.
Your problem seems to be who has* SELECT permission* to x table.
Cheers,
Rémi-C

2015-02-18 12:03 GMT+01:00 BladeOfLight16 :

> On Wed, Feb 18, 2015 at 5:48 AM, Brian Sutherland  > wrote:
>
>> # dump and reload
>> pg_dump --username super --format c -f dump.dump orig
>> createdb copied
>>
>
> It might be helpful to dump in the plain SQL format and look at what it's
> doing.
>


Re: [GENERAL] Missing table from in INSERT RETURNING

2015-02-18 Thread Rémi Cura
Hey folks,
thanks for the answers.
As you guessed, it is just a synthetic example
(so of course RETURNING some_value / 10 is possible in this case, but not
in general when the function is much more complicated than "/10").
Likewise, adding a column is just not a serious option.

It corresponds to a real need: you have rows associated with an
*id*.
Now you want to insert part of each row into a table with a serial field (
*gid*).
Upon insertion, the serial field is automatically filled, and you get it
with a RETURNING statement.
The problem is that you have no way to know which value of *gid* is
associated with which *id*.

The other workaround I found is to call nextval() before the insert, to know
beforehand what the *(gid, id)* association will be.

It is suboptimal and ugly, so I would prefer another solution.
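For reference, a minimal sketch of that nextval() workaround, reusing the test
table from my first mail (it relies on the default sequence name created by
the serial column):

WITH serie AS (
    SELECT s, s*10 AS computing,
           nextval('test_insert_returning_gid_seq') AS gid
    FROM generate_series(1,10) AS s
), inserted AS (
    INSERT INTO test_insert_returning (gid, some_value)
    SELECT gid, computing FROM serie
)
SELECT gid, s FROM serie;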

Cheers,
Rémi-C


2015-02-17 21:33 GMT+01:00 John McKown :

> On Tue, Feb 17, 2015 at 2:15 PM, David G Johnston <
> david.g.johns...@gmail.com> wrote:
>
>> On Tue, Feb 17, 2015 at 1:08 PM, John McKown [via PostgreSQL] <[hidden
>> email] > wrote:
>>
>>> I haven't seen any one else reply. I don't know if you've gotten a
>>> solution. But the following seemed to work for me:
>>>
>>>
>> ​mine apparently got bounced...​
>>
>>
>>
>>> WITH serie AS (
>>> select s, s*10 as computing
>>> from generate_series(1,10) as s
>>> )
>>> INSERT INTO test_insert_returning (some_value)
>>> SELECT computing
>>> FROM serie
>>> RETURNING gid, some_value;
>>>
>>
>> ​or, "RETURNING some_value / 10"​
>>
>>
>>> From my reading on the RETURNING phrase, you can only return values from
>>> the table into which you are doing the INSERT. Not any other table or view
>>> which might be referenced.
>>>
>>>
>> ​This is correct; and I am curious on the use case that requires
>> otherwise.​
>>
>
> ​A weird one might be where in data available ("s") in the CTE is in
> English measure (feet, miles, etc) and the OP wants to insert the
> equivalent Metric value ("computing") into the table, but needs to return
> the English value to the caller (why?). He does not want to put the English
> measure into the table itself, just to be able to return it. And not need
> to do a reverse conversion. As I said, just a weird thought. From a
> effervescent fount of weird thoughts - me. Or perhaps what he is storing in
> the table is a one-way hash of a password, and wants to return the
> clear-text password to the caller? Hum, that is almost reasonable. I'll
> need to be more on guard.
>
>
>
>>
>> ​David J.​
>>
>>
>
>
> --
> He's about as useful as a wax frying pan.
>
> 10 to the 12th power microphones = 1 Megaphone
>
> Maranatha! <><
> John McKown
>


[GENERAL] Computing (disjoint) union of range

2013-09-13 Thread Rémi Cura
Hello Dear list,

this is more a plpgsql dev. issue.

I wrote a plpgsql function to compute the union of time ranges that works :

[1,4]U[3,8]U[12,14]U[16,18] ---> [1,8]U[12,14]U[16,18]

It works on multiple rows.

My issue is a design issue :
I want to work on a set of rows and return a set of rows.
I am aware I could take an array of ranges as input/output, but I don't want
to (memory/casting cost).

Currently the function takes a cursor on a table and outputs a setof record.

I would like the function to blend into multiple subqueries smoothly, as in:

WITH (
first query to get range),
(query computing union
),
(query using computed union
) etc.

Currently I have to execute 2 SQL statements :

create cursor on ranges;

WITH (function to compute union)
,
(query...)

The only kind of function taking a set of records as input that I know of is
an aggregate function, but it returns only one row, while the output of the
union can span multiple rows.

Any insight would be greatly appreciated.

Cheers,

Rémi-C


[GENERAL] best practice for || set of rows --> function --> set of rows

2013-09-17 Thread Rémi Cura
Hello Dear List,
this is a re-post with a clearer title and a rewrite.

I have a design issue :
My function works on an ordered set of rows and returns an ordered set of rows
(fewer rows than the input). I want to use it in a standard SQL query (in a
WITH, for example).

What is best practice for input?

I see 3 solutions :
_give a table name as input || no order unless I use a view; doesn't work
with CTEs and the like.
_give an array and use unnest/array_agg || bad performance with big input
_give a refcursor || inconvenient because of transactions, and requires 2
separate queries.

*Is there another way?*
I would like to use it in a single SQL statement, which leaves only the array
option, which will perform badly (I may have hundreds of thousands of rows as
input). Also the function needs all the input rows to work, and can't operate
row by row.

It would be like an aggregate, but returning several rows.
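
For completeness, a hedged sketch of the array route (despite the memory
concern): a custom aggregate folding ordered ranges into a disjoint
int4range[] that can then be unnest()ed in the same statement. The names
range_merge_state, range_union_agg and the table my_ranges(r int4range) are
invented for the example:

CREATE OR REPLACE FUNCTION range_merge_state(acc int4range[], r int4range)
RETURNS int4range[] AS $$
DECLARE
    prev int4range;
BEGIN
    IF array_upper(acc, 1) IS NULL THEN
        RETURN ARRAY[r];
    END IF;
    prev := acc[array_upper(acc, 1)];
    IF prev && r OR prev -|- r THEN
        -- overlapping or adjacent: extend the last accumulated range
        acc[array_upper(acc, 1)] := prev + r;
    ELSE
        -- disjoint: start a new range
        acc := acc || r;
    END IF;
    RETURN acc;
END;
$$ LANGUAGE plpgsql IMMUTABLE;

CREATE AGGREGATE range_union_agg(int4range) (
    SFUNC = range_merge_state,
    STYPE = int4range[],
    INITCOND = '{}'
);

-- usage : ranges must be fed in order of their lower bound
WITH merged AS (
    SELECT range_union_agg(r ORDER BY lower(r)) AS rs
    FROM my_ranges
)
SELECT unnest(rs) AS disjoint_range
FROM merged;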

Of course I read all the doc I could find, but the doc doesn't give best
practices !

Thank you very much for helping, I am in a stalemate now, and can't
progress further.

Below is the original message, giving details over what the function do.

Cheers,

Rémi-C

I wrote a plpgsql function to compute the union of time ranges that works :

[1,4]U[3,8]U[12,14]U[16,18] ---> [1,8]U[12,14]U[16,18]

It works on multiple rows.

My issue is a design issue :
I want to work on a set of rows and return a set of rows.
I am aware I could take an array of ranges as input/output, but I don't want
to (memory/casting cost).

Currently the function takes a cursor on a table and outputs a setof record.

I would like the function to blend into multiple subqueries smoothly, as in:

WITH (
first query to get range),
(query computing union
),
(query using computed union
) etc.

Currently I have to execute 2 SQL statements :

create cursor on ranges;

WITH (function to compute union)
,
(query...)

The only kind of function taking a set of records as input that I know of is
an aggregate function, but it returns only one row, while the output of the
union can span multiple rows.

Any insight would be greatly appreciated.

Cheers,

Rémi-C


Re: [GENERAL] Tree structure

2013-09-23 Thread Rémi Cura
Be careful, you run into a number of limitations with recursive CTEs (I'm
thinking of UPDATE and the like).
You can work around them with plpgsql, but it might be painful.

You forgot a solution :
if you need powerful graph features,
use postgres as the database and a SPARQL-speaking frontend.
It may be a bit of overkill ;-)

Cheers,
Rémi-C


2013/9/23 Kaare Rasmussen 

> Hi Alban
>
>
>  4. Using a recursive common table expression (CTE).
>> http://www.postgresql.org/docs/9.2/static/queries-with.html
>>
>
> Yes, you're right. In fact that's what I'm testing a way to replace, as
> I'm not confident in the performance in all situations. My fault entirely;
> I should have told so from the start.
>
>
>
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/**mailpref/pgsql-general
>


Re: [GENERAL] Howto import regularly some CSV files with variing names?

2013-09-24 Thread Rémi Cura
To be very straightforward :
your bash script will dynamically create the SQL query in a string, then
send it to the database using psql.
You can also use pipes.

For example :

$4 -c "COPY $1 FROM '/tmp/pipe_ply_binaire_vers_ply_ascii_"$1"' WITH
CSV DELIMITER AS ' ';";
where $4 is the psql command used to connect to the db, $1 the number in the
name of the file we are working on, etc.
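
For the original question (files named Export followed by a date), a minimal
bash sketch of this idea could be (the table name app_log, the connection
options and the mount point are assumptions):

#!/bin/bash
# import every Export<date>.csv found in the mounted directory
for f in /mnt/remote/Export*.csv; do
    psql -h dbhost -U dbuser -d mydb -c "\copy app_log FROM '$f' WITH CSV"
done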


Cheers,
Rémi-C


2013/9/24 Steve Crawford 

> On 09/23/2013 05:47 PM, Andreas wrote:
>
>> Am 24.09.2013 02:25, schrieb Adrian Klaver:
>>
>>> On 09/23/2013 05:19 PM, Andreas wrote:
>>>

 I need to import some log-files of an application [...]
 The import would be easy if the files had a constant name but the app
 creates csv files with names like "ExportMMDD".


 So   how would I get the filenames into the sql-script?

>>>
>>> Do man on find and look for -exec.
>>>
>>>
>> I could find the files and exec a shell script but how can I have a SQL
>> script take the found filenames as parameter?
>>
>> The SQL script needs to create a temp table
>> then COPY the file with the filename it got as parameter into the temp
>> table
>> then insert from there into the log-table
>>
>> How would I get the filenames into the SQL script?
>>
>>
>>
> Assuming your main script - the one that mounts the directory and finds
> the file name - is in bash you can easily put a small script into a heredoc
> block with variable substitution:
>
> some script stuff that mounts remote directory and sets variable
> logfilename
> ...
> psql -your -connection -parameters < some preliminary setup statements
> \copy  from $logfilename ...
> some processing statements
> EOS
>
> The disadvantage of this approach is that it is difficult-to-impossible to
> detect and handle statement-level errors. But for short scripts like simple
> imports this may not be an issue or may be easily solved by wrapping things
> in a begin;...commit; block.
>
> Cheers,
> Steve
>
>
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/**mailpref/pgsql-general
>


[GENERAL] Re: [GENERAL] Re: [GENERAL] Help on performance

2013-10-04 Thread Rémi Cura
Hey, a short trick :
to avoid repeating the schema name multiple times (more readable and easier
to reuse),

you can use
SET search_path TO gpstracking_device_tracks, public;

(see manual here :
http://www.postgresql.org/docs/current/static/sql-set.html)
Cheers,

Rémi-C


2013/10/2 Carlos Eduardo Sotelo Pinto 

> Thanks to all
>
> I have fix that refactoring the function
>
> BEGIN
> arr := regexp_split_to_array(_imeis, E'\\s+');
>  RETURN QUERY
> SELECT
> gpstracking_device_tracks.imei,
>  gpstracking_device_tracks.device_id,
> gpstracking_device_tracks.date_time_process,
> gpstracking_device_tracks.latitude,
>  gpstracking_device_tracks.longitude,
> gpstracking_device_tracks.course,
> gpstracking_device_tracks.speed,
>  gpstracking_device_tracks.mileage,
> gpstracking_device_tracks.gps_signal,
> gpstracking_device_tracks.gsm_signal,
>  gpstracking_device_tracks.alarm_status,
> gpstracking_device_tracks.gps_status,
> gpstracking_device_tracks.vehicle_status,
>  gpstracking_device_tracks.alarm_over_speed,
> gpstracking_device_tracks.other,
> gpstracking_device_tracks.address
>  FROM (
> SELECT
> gpstracking_device_tracks.imei,
>  gpstracking_device_tracks.device_id,
> gpstracking_device_tracks.date_time_process,
> gpstracking_device_tracks.latitude,
>  gpstracking_device_tracks.longitude,
> gpstracking_device_tracks.course,
> gpstracking_device_tracks.speed,
>  gpstracking_device_tracks.mileage,
> gpstracking_device_tracks.gps_signal,
> gpstracking_device_tracks.gsm_signal,
>  gpstracking_device_tracks.alarm_status,
> gpstracking_device_tracks.gps_status,
> gpstracking_device_tracks.vehicle_status,
>  gpstracking_device_tracks.alarm_over_speed,
> gpstracking_device_tracks.other,
> gpstracking_device_tracks.address,
>  ROW_NUMBER() OVER(PARTITION BY gpstracking_device_tracks.imei ORDER BY
> gpstracking_device_tracks.date_time_process DESC) as rnumber
> FROM gpstracking_device_tracks
>  WHERE gpstracking_device_tracks.imei = ANY(arr)
> AND gpstracking_device_tracks.date_time_process >= date_trunc('hour',
> now())
>  AND gpstracking_device_tracks.date_time_process <= NOW()
> ) AS gpstracking_device_tracks
> WHERE gpstracking_device_tracks.rnumber = 1;
> END;
>
>
> 2013/10/2 Merlin Moncure 
>
>> On Mon, Sep 30, 2013 at 10:03 AM, Carlos Eduardo Sotelo Pinto
>>  wrote:
>> >
>> > I need a help on postgresql performance
>> >
>> > I have configurate my postgresql files for tunning my server, however
>> it is
>> > slow and cpu resources are highter than 120%
>> >
>> > I have no idea on how to solve this issue, I was trying to search more
>> infor
>> > on google but is not enough, I also have try autovacum sentences and
>> reindex
>> > db, but it continues beeing slow
>> >
>> > My app is a gps listener that insert more than 6000 records per minutes
>> > using a tcp server developed on python twisted, where there is no
>> problems,
>> > the problem is when I try to follow the gps devices on a map on a
>> relatime,
>> > I am doing queries each 6 seconds to my database from my django app, for
>> > request last position using a stored procedure, but the query get slow
>> on
>> > more than 50 devices and cpu start to using more than 120% of its
>> resources
>> >
>> > Django App connect the postgres database directly, and tcp listener
>> server
>> > for teh devices connect database on threaded way using pgbouncer, I
>> have not
>> > using my django web app on pgbouncer caause I dont want to crash gps
>> devices
>> > connection on the pgbouncer
>> >
>> > I hoe you could help on get a better performance
>> >
>> > I am attaching my store procedure, my conf files and my cpu, memory
>> > information
>> >
>> > **Stored procedure**
>> >
>> > CREATE OR REPLACE FUNCTION gps_get_live_location (
>> > _imeis varchar(8)
>> > )
>> > RETURNS TABLE (
>> > imei varchar,
>> > device_id integer,
>> > date_time_process timestamp with time zone,
>> > latitude double precision,
>> > longitude double precision,
>> > course smallint,
>> > speed smallint,
>> > mileage integer,
>> > gps_signal smallint,
>> > gsm_signal smallint,
>> > alarm_status boolean,
>> > gsm_status boolean,
>> > vehicle_status boolean,
>> > alarm_over_speed boolean,
>> > other text,
>> > address varchar
>> > ) AS $func$
>> > DECLARE
>> > arr varchar[];
>> > BEGIN
>> > arr := regexp_split_to_array(_imeis, E'\\s+');
>> > FOR i IN 1..array_length(arr, 1) LOOP
>> > RETURN QUERY
>> > SELECT
>> > gpstracking_device_tracks.imei,
>> > gpstracking_device_tracks.device_id,
>> > gpstracking_device_tracks.date_time_process,
>> > gpstracking_device_tracks.latitude,
>> > gpstracking_device_tracks.longitude,
>> > gpstracking_device_tracks.course,
>> > gpstracking_device_tracks.speed,
>> > gpstracking_device_tracks.mileage,
>> > gpstracking_device_tracks.gps_signal,
>> > gpstracking_device_tracks.gsm_signal,
>> > gpstracking_device_t

Re: [GENERAL] Tree structure

2013-10-10 Thread Rémi Cura
Hey, sorry if my answer is stupid,
but there is an extension for arrays, even if it is limited to int (but the
ints could be indexes of rows).
It's the intarray extension: http://www.postgresql.org/docs/9.3/static/intarray.html
It provides the essential functions, although some are missing (I
re-implemented union of arrays with a disjoint result).
I think this extension uses indexes.

Cheers,
Rémi-C


2013/10/10 Kaare Rasmussen 

> Hi Merlin
>
>  On Thu, Oct 10, 2013 at 1:00 AM, Kaare Rasmussen 
>> wrote:
>>
>>> I'm quite surprised there seem to be no way in core to treat an array as
>>> an
>>> array. Using @> treats it as a set, AFAICT.
>>>
>> can you elaborate on that?
>>
>> merlin
>>
>
> To me, an array is a vector (or a vector of vectors). So I'm looking for
> an operator where
>
> ARRAY[1,4,3] doesn't contain ARRAY[3,1] and
> ARRAY[2,7] isn't contained by ARRAY[1,7,4,2,6] (but ARRAY[1,7,4] is)
>
> IOW order matters to me, but not to the array operators mentioned in
> http://www.postgresql.org/docs/9.3/static/functions-array.html.
> Note that index support is important.
>
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/**mailpref/pgsql-general
>


Re: [GENERAL] Cannot import logs from csv

2013-10-18 Thread Rémi Cura
Hey,
Can you check line 424855 in the file maincluster-20131011.csv ?

You may have a comma in an unquoted field, an empty field, a wrong line
ending, etc.

Cheers,

Rémi -C


2013/10/18 Adrian Klaver 

> On 10/18/2013 08:07 AM, Helen Griffiths wrote:
>
>> Hello.
>>
>> I've got a table set up on server B to store the logs from server A, as
>> outlined in
>> http://www.postgresql.org/docs/8.4/static/runtime-config-logging.html#RUNTIME-CONFIG-LOGGING-CSVLOG
>>
>>
>
>> Every day, I set \encoding SQL_ASCII on server B (server A is SQL_ASCII)
>> and use \copy with the csv flag to upload yesterday's log file to the
>> table.  For months, this has worked flawlessly until 12th October.  The
>> import failed with a message:
>> postgres=# \copy maincluster_log FROM maincluster-20131011.csv CSV
>> ERROR:  extra data after last expected column
>> CONTEXT:  COPY maincluster_log, line 424855: "2013-10-11 15:58:59.463
>> BST,"apachemitre","course_**records",30875,"[local]",**
>> 52581233.789b,3,"idle",20..."
>>
>>
>> The failing line, I will give at the end because it is long.
>>
>> PostgreSQL version on both servers is 8.4, running on CentOS 6.3.
>>
>> Please, does anyone have some insight into why this fails?
>>
>
> This came up before recently in this thread:
>
> http://www.postgresql.org/message-id/CADK3HHJNEWKD9gNyXmjv9ABbn+37rY3Mvp9=1j7msg9YpoBBBw@mail.gmail.com
>
> To cut to the chase, in that case the OP found:
>
> "Ok, I found the offending line. It was not the pgadmin line. There was a
> line with a large binary insert."
>
> Not sure if that helps.
>
>
>> Helen Griffiths
>>
>>
> --
> Adrian Klaver
> adrian.kla...@gmail.com
>
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/**mailpref/pgsql-general
>


Re: [GENERAL] Cannot import logs from csv

2013-10-18 Thread Rémi Cura
You're welcome;

Best wishes for fixes =)

Cheers,

Rémi-C


2013/10/18 Helen Griffiths 

> On Fri, 18 Oct 2013, Adrian Klaver wrote:
>
>  This came up before recently in this thread:
>>
>> http://www.postgresql.org/message-id/CADK3HHJNEWKD9gNyXmjv9ABbn+37rY3Mvp9=1j7msg9YpoBBBw@mail.gmail.com
>>
>> To cut to the chase, in that case the OP found:
>>
>> "Ok, I found the offending line. It was not the pgadmin line. There was a
>> line with a large binary insert."
>>
>> Not sure if that helps.
>>
>
> It might: there are then some lines like:
> 2013-10-11 02:44:26.164 BST,"postgres","course_records",19216,"[local]",
> 525740b3.4b10,124887,"idle in transaction",2013-10-11 01:05:07
> BST,7/16388,0,LOG,0,"fastpath function call: ""loread"" (OID
> 954)"
>
> I suspect the ""loread"" is the offender.
>
> Thanks to all for the help.  I would never have thought to suspect the
> lines after the one given in the error message.
>
> --
> Helen Griffiths
>
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/**mailpref/pgsql-general
>


Re: [GENERAL] Count of records in a row

2013-10-22 Thread Rémi Cura
Hey,
I tried something very similar to compute a generalized union of numeric
ranges (which was inappropriate, anyway).

My conclusion was that it's not possible using window functions alone, as you
need either memory (window functions are not allowed in UPDATE) or
iterations to propagate information (window functions cannot be nested).
There may be a theoretical possibility of success using window functions
and a recursive CTE
(see the end of this mail for a taste of this kind of solution).

But it is immensely easier, and sometimes mandatory, to use instead
a plpgsql function with a cursor (or cursors).

It would be something like this in plpgsql :

cursor on table of letter ordered
accum = 0;
loop on rows of table ordered

if letter = previous letter, new_id = accum
else accum ++ ; new_id = accum

old letter = new_letter
new letter = next letter;

end of loop,
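
Spelled out, a minimal runnable version of that loop could look like this
(assuming the test_grouping table created just below; the function name
number_runs is made up):

CREATE OR REPLACE FUNCTION number_runs()
RETURNS TABLE(id int, letter text, new_id int) AS $$
DECLARE
    rec record;
    prev_letter text := NULL;
    accum int := 0;
BEGIN
    FOR rec IN SELECT tg.id, tg.letter FROM test_grouping tg ORDER BY tg.id LOOP
        IF prev_letter IS DISTINCT FROM rec.letter THEN
            accum := accum + 1;  -- a new run starts here
        END IF;
        prev_letter := rec.letter;
        id := rec.id;
        letter := rec.letter;
        new_id := accum;
        RETURN NEXT;
    END LOOP;
END;
$$ LANGUAGE plpgsql;

-- run lengths, in order :
-- SELECT letter, count(*) FROM number_runs() GROUP BY new_id, letter ORDER BY min(id);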

Cheers,
Rémi-C

A possible lead for solving it with window functions and a recursive CTE :

--defining test env :

drop table if exists test_grouping;
create table test_grouping
(id serial
,letter text
--,previous_letter text
,for_computation int
--,previous_for_computation INT
);
INSERT INTO test_grouping (letter) VALUEs
('A'),
('A'),('A'),('A'),('B'),('C'),('A'),('D'),('A'),('A'),('D'),('D'),('B'),('C'),('C'
);
UPDATE test_grouping set for_computation=id;

SELECT *
FROM test_grouping;

--this query gives the result, but it needs to be iterated using a
recursive CTE (not done here):
--you can do it manually by executing it several times

WITH computation AS (
SELECT id
, letter
, for_computation,
 lag( letter, 1,NULL) over w,
  CASE
WHEN  lag( letter, 1,NULL) over w = letter
THEN
lag( for_computation, 1,NULL) over w
--NULL
ELSE
id
END AS new_id,
(SELECT count(*) over ())
FROM test_grouping AS tg
WINDOW w AS (ORDER BY id ASC ROWS 1 preceding)
ORDER BY tg.id ASC
)
UPDATE test_grouping AS tg SET for_computation = new_id FROM computation AS
c WHERE tg.id=c.id
RETURNING tg.*




2013/10/22 David Johnston 

> Robert James wrote
> > I have a table of event_id, event_time.  Many times, several events
> > happen in a row.  I'd like a query which replaces all of those events
> > with a single record, showing the count.
> >
> > Eg: Take A,A,A,B,C,A,D,A,A,D,D,B,C,C and return: A,3; B,1; C,1; A,1;
> > D,1; A,2; D,2; B,1; C,2
> >
> > How can I do that?
>
> 
>
> Window functions are going to be your friend.
>
> To solve the grouping problem I would assign the first row's value a group
> value of zero (0).  Using the "lag(...)" window function and an
> appropriately defined frame you conditionally add one (1) to the prior
> row's
> group value if the value of lag(1) does not equal the current row's value.
> The result should be a new column where all sequential duplicates share the
> same group number.
>
> Distinct will give you a lookup relation for which letter belongs to which
> group
> Group By + Count on the group will give you counts
>
> Use string_agg(...) to condense the above into single row/column
>
> HTH
>
> David J.
>
>
>
>
> --
> View this message in context:
> http://postgresql.1045698.n5.nabble.com/Count-of-records-in-a-row-tp5775363p5775365.html
> Sent from the PostgreSQL - general mailing list archive at Nabble.com.
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>


Re: [GENERAL] Count of records in a row

2013-10-22 Thread Rémi Cura
Hey,
when using a FOR loop you implicitly use a cursor (I think),
so it amounts to the same thing; use FOR if you like it more.
It should be *very* fast to write !

As I wrote, relational algebra can handle it, but it is not practically
feasible :

If you just execute the query I wrote 3 times, you will have your answer.
It is 3 times because the longest run is A A A A.
That's the problem: the number of executions depends on the maximum run
length.

The problem boils down to this : the answer for one row depends on the
answer for the previous row, the row before that, etc.

You could try ordering by id in a window function, and inside this window
function ordering by new_id and putting NULLs at the end, but such nested
window function calls are not allowed.

Nevertheless if you find something purely relational please keep me posted !

Cheers,

Rémi-C



2013/10/22 Robert James 

> On 10/22/13, Rémi Cura  wrote:
> > But it is immensely easier and sometimes mandatory to use instead
> > a plpgsql function using cursor (or cursors).
> >
> > It would be something like that in plpgsql :
> >
> > cursor on table of letter ordered
> > accum = 0;
> > loop on rows of table ordered
> >
> > if letter = previous letter, new_id = accum
> > else accum ++ ; new_id = accum
> >
> > old letter = new_letter
> > new letter = next letter;
> >
> > end of loop,
>
> Shouldn't it be possible to do that with a FOR loop without a cursor?
>
> It might be that procedural is the way to go.  But I still believe
> that relational algebra can handle this, even without a window
> function.  Something like:
>
> SELECT event e, COUNT(
> SELECT event oe ... WHERE oe.event_time > e.event_time AND NOT EXISTS (
>  SELECT event te WHERE te.event_time > e.event_time AND
> te.event_time < oe.event_time))
>
> .
>


Re: [GENERAL] Count of records in a row

2013-10-22 Thread Rémi Cura
héhé,
nice snipping Merlin !

I guess you are almost there, but the output is still wrong (it should be) (
> Eg: Take A,A,A,B,C,A,D,A,A,D,D,B,C,C and return: A,3; B,1; C,1; A,1;
> D,1; A,2; D,2; B,1; C,2
)

I don't understand enough to make the modifications =)

Cheers,
Rémi-C


2013/10/22 hubert depesz lubaczewski 

> On pon, paź 21, 2013 at 08:38:52 -0400, Robert James wrote:
> > I have a table of event_id, event_time.  Many times, several events
> > happen in a row.  I'd like a query which replaces all of those events
> > with a single record, showing the count.
> >
> > Eg: Take A,A,A,B,C,A,D,A,A,D,D,B,C,C and return: A,3; B,1; C,1; A,1;
> > D,1; A,2; D,2; B,1; C,2
> > How can I do that?
>
> "A" or other letters don't really match your schema description.
> Please provide sample schema (as in: create table ...), sample data, and
> expected output.
>
> Best regards,
>
> depesz
>
> --
> The best thing about modern society is how easy it is to avoid contact
> with it.
>
> http://depesz.com/
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>


Re: [GENERAL] Count of records in a row

2013-10-22 Thread Rémi Cura
Thanks for this good example Merlin !

I didn't know you could keep state variables inside custom aggregates, and this
allows solving the problem!

In my own problem I couldn't use aggregates because
_since they output at most one row, it would have meant a lot of useless
computation (as in this example, I guess (please correct me if that's not the
case): we do N computations of the aggregate, each "using" at most N rows)
_I couldn't cheat with arrays because of the cost of
serialization/deserialization

I'll keep in mind this custom aggregate use and try to learn more about it.
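
For the archive, here is roughly what I mean by keeping state in the aggregate
(a minimal sketch, not Merlin's exact code; run_step/run_id are invented names,
applied to the test_grouping example from earlier in the thread):

-- state = {previous letter, run counter}; bump the counter when the letter changes
CREATE OR REPLACE FUNCTION run_step(state text[], letter text)
RETURNS text[] AS $$
    SELECT CASE
        WHEN state[1] IS NOT DISTINCT FROM letter THEN state
        ELSE ARRAY[letter, (COALESCE(state[2], '0')::int + 1)::text]
    END;
$$ LANGUAGE sql IMMUTABLE;

CREATE AGGREGATE run_id(text) (
    SFUNC = run_step,
    STYPE = text[],
    INITCOND = '{NULL,0}'
);

-- one transition call per row, ordered through the window frame,
-- without keeping an array of all previous rows
SELECT id, letter, (run_id(letter) OVER (ORDER BY id))[2]::int AS grp
FROM test_grouping;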
Cheers,
Rémi-C


Re: [GENERAL] Count of records in a row

2013-10-22 Thread Rémi Cura
Thanks again for the precision !

I still don't understand perfectly. We call the aggregate n times, and each
time we compute the aggregate using (potentially) n rows, thus becoming
(at most) O(n*n).

With a standard loop, I loop n times, and each time I only need the
current row plus the previous row, which I keep in memory, thus O(n).



I posted a lot about my issue, but only about the fraction of the problem I
was blocked by, and I got no conclusive answer.

My problem was to find a good way to have a plpgsql function taking a set of
rows as input and returning a set of rows.
I worked on ranges, and I wanted a polymorphic function (working with any
range type).

Aggregates return only one row at most, arrays are dangerous with big data,
temp tables have to be created/deleted and used in the same session, cursors
aren't well supported by client libraries, views can't be written, and
materialized views weren't available.

Anyway I solved it using cursors; not optimal, but it works !

(
http://stackoverflow.com/questions/18873797/plpgsql-expert-set-of-record-input-and-output-for-a-function/18874283#18874283
 )

Cheers,

Rémi-C


2013/10/22 Merlin Moncure 

> On Tue, Oct 22, 2013 at 9:09 AM, Rémi Cura  wrote:
> >
> > Thanks for this good example Merlin !
> >
> > I didn't know you could use variable inside custom aggregates, and this
> > allow to solve the problem!
> >
> > In my own problem I couldn't use aggregates because
> > _as it output at most one row, it would have mean a lots of useless
> > computation (as in this example I guess, (please correct me if it's not
> the
> > case) :
>
> That is not the case.  With the approach above what you 'pay' vs
> standard loop is basically one pl/pgsql function call per output row.
> (you can do it in straight sql, but when with pl/pgsql to leverage
> cached function parsing).  What you 'get' is a more general function
> because the loop structure is in the query itself as well as the
> output structure.  This cleanly separates action from the data.
> Usually, the mechanics of executing the aggregate are not a huge part
> of query execution time.  Actually, the worst case is when the
> aggregate is trivial but no matter what it's O(n).
>
> I'm not clear what on the issue is with your particular case, since
> you didn't post it :-).   Maybe post some extra detail?
>
> merlin
>


Re: [GENERAL] Count of records in a row

2013-10-22 Thread Rémi Cura
Hmm exactly what I was thinking !

Thank you a lot, I spent many hours thinking about this, and this solution
is very nice.

Cheers,
Rémi-C


2013/10/22 Merlin Moncure 

> On Tue, Oct 22, 2013 at 10:01 AM, Elliot 
> wrote:
> > On 2013-10-21 20:38, Robert James wrote:
> >>
> >> I have a table of event_id, event_time.  Many times, several events
> >> happen in a row.  I'd like a query which replaces all of those events
> >> with a single record, showing the count.
> >>
> >> Eg: Take A,A,A,B,C,A,D,A,A,D,D,B,C,C and return: A,3; B,1; C,1; A,1;
> >> D,1; A,2; D,2; B,1; C,2
> >>
> >> How can I do that?
> >>
> >>
> >
> > It looks like you already found a solution, but here's one with a CTE. I
> > cobbled this together from an older query I had for doing something
> similar,
> > for which I unfortunately lost the original source of this approach.
> Also,
> > this implies that there is something that gives an ordering to these rows
> > (in this case, the field "i").
> >
> > create temp table data (i int, val char);
> >
> > insert into data (val, i)
> > values
> > ('A',1),
> > ('A',2),
> > ('A',3),
> > ('B',4),
> > ('C',5),
> > ('A',6),
> > ('D',7),
> > ('A',8),
> > ('A',9),
> > ('D',10),
> > ('D',11),
> > ('B',12),
> > ('C',13),
> > ('C',14)
> > ;
> >
> > with x
> > as
> > (
> >   select i,
> >  row_number() over () as xxx,
> >  val,
> >  row_number() over (partition by val order by i asc)
> >- row_number() over () as d
> >   from data
> >   order by i
> > )
> > select val,
> >count(*)
> > from x
> > group by d,
> >  val
> > order by min(i)
>
> wow, that's really clever.
>
> merlin
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>


Re: [GENERAL] Count of records in a row

2013-10-22 Thread Rémi Cura
OK,
just out of pure curiosity,
is it always the case or is it due to this particular aggregate?

Cheers,
Rémi-C


2013/10/22 Merlin Moncure 

> On Tue, Oct 22, 2013 at 9:43 AM, Rémi Cura  wrote:
> > Thanks again for the precision !
> >
> > I still don't understand perfectly. We call the aggregate n times, and
> each
> > time we compute the aggregate, using (potentially) n rows, thus becoming
> (at
> > most) O(n*n).
> >
> > With a standard loop, I loop n times, and each times I only need the
> current
> > row plus the previous row which I put in memory, thus O(n).
>
> For posterity, the above is incorrect.  Since the aggregate is ordered
> through the window function, it gets executed exactly once per output
> row.  It behaves exactly like a loop.  You know this because there is
> no array in the aggregate state.
>
> merlin
>


[GENERAL] Error with "return query" ( "return next" working ) with custom type

2013-10-22 Thread Rémi Cura
Hey dear lists,

Here is a self contained example showing strange behavior from a real life
example concerning the use of postgis_topology topogeometry type.


The problem is :
when trying to return setof topogeometry,
the "return query"  gives an error of type where there is none, and the
return next is working fine.

The precise error message is ERROR 42804

"ERROR:  structure of query does not match function result type
DETAIL:  Returned type ttt.fake_topogeometry does not match expected type
integer in column 1.
CONTEXT:  PL/pgSQL function ttt.testtopogeom(ttt.fake_topogeometry) line 9
at RETURN QUERY
"


Is this expected behavior, a postgres bug, or a postgis bug?
What are the possible corrections?


Here is the self contained code stored in the "ttt" schema.


DROP SCHEMA IF EXISTS ttt CASCADE;
CREATE SCHEMA ttt;

DROP TYPE IF EXISTS ttt.fake_topogeometry CASCADE;
CREATE TYPE ttt.fake_topogeometry AS
   (topology_id integer,
layer_id integer,
id integer,
a_type integer);

DROP FUNCTION IF EXISTS ttt.testTopogeom(test_topom ttt.fake_topogeometry);
CREATE FUNCTION ttt.testTopogeom(test_topom ttt.fake_topogeometry)
RETURNS SETOF ttt.fake_topogeometry AS
$BODY$
-- this function is an empty function to test return of multiple topogeom
DECLARE
the_topo ttt.fake_topogeometry;
BEGIN
RETURN NEXT (1,1,1,1)::ttt.fake_topogeometry;
--RETURN NEXT (3,3,3,3)::ttt.fake_topogeometry;

RETURN QUERY  SELECT (1,1,1,1)::ttt.fake_topogeometry as foo;
-- UNION
--SELECT  (3,3,3,3)::ttt.fake_topogeometry as foo
RETURN  ;
END ;
$BODY$
LANGUAGE plpgsql IMMUTABLE;

SELECT *
FROM ttt.testTopogeom((2,2,2,2)::ttt.fake_topogeometry);


[GENERAL] Re: [postgis-users] Error with "return query" ( "return next" working ) with custom type

2013-10-23 Thread Rémi Cura
Hey,
thanks for the answers,

sorry for the cross post, i didn't know if it was postgis or postgres
issue, hence the double post (removed postgis now).

I'm afraid I don't understand perfectly the answer. Are you (both) saying
that it is a normal behavior that a function that should return a custom
type doesn't in fact return this custom type, but a number of columns
composing this custom type?

This seems like at best a strange behavior !

The whole point of using custom type is to provide interface, right?

To be precise, when specifying "return setof fake_topogeometry" I would
expect the function to return fake_topogeometry objects (like the
query `SELECT (5,5,5,5)::fake_topogeometry`), not some columns !

I'm obviously missing something, as
SELECT * FROM testTopogeom(); --returns columns
SELECT testTopogeom(); --returns object

Could you suggest some more documentation (other than
http://www.postgresql.org/docs/9.3/static/plpgsql-control-structures.html)?

Is this difference between Return Next and return query documented?
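
(For the archive: one workaround that keeps the composite value in the query
is to decompose it in the select list, e.g.

RETURN QUERY SELECT ((1,1,1,1)::ttt.fake_topogeometry).*;

RETURN QUERY matches the query's output columns one by one against the columns
of the declared SETOF type, so it needs four integer columns rather than one
composite-typed column; the (...).* syntax expands the composite into its
columns. RETURN NEXT, on the other hand, takes a single value of the declared
type, which is why it works here.)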

Thanks for your help,
Cheers,
Rémi-C


2013/10/23 Steve Grey 

> try:
>
> RETURN QUERY  SELECT 1,1,1,1;
>
> The error message means the cast failed between ttt.fake_topogeometry and
> the topology_id (i.e. first) field of the return type of the function,
> which isn't what you wanted to do.
>
> Pls. don't cross-post between lists.
>
>
> On 23 October 2013 01:21, Rémi Cura  wrote:
>
>>
>>
>> Hey dear lists,
>>
>> Here is a self contained example showing strange behavior from a real
>> life example concerning the use of postgis_topology topogeometry type.
>>
>>
>> The problem is :
>> when trying to return setof topogeometry,
>> the "return query"  gives an error of type where there is none, and the
>> return next is working fine.
>>
>> The precise error message is ERROR 42804
>>
>> "ERROR:  structure of query does not match function result type
>> DETAIL:  Returned type ttt.fake_topogeometry does not match expected type
>> integer in column 1.
>> CONTEXT:  PL/pgSQL function ttt.testtopogeom(ttt.fake_topogeometry) line
>> 9 at RETURN QUERY
>> "
>>
>>
>> Is it ok, postres bug, postgis bug?
>>  What are the possible corrections?
>>
>>
>> Here is the self contained code stored in the "ttt" schema.
>>
>>
>> DROP SCHEMA IF EXISTS ttt CASCADE;
>> CREATE SCHEMA ttt;
>>
>> DROP TYPE IF EXISTS ttt.fake_topogeometry CASCADE;
>> CREATE TYPE ttt.fake_topogeometry AS
>>(topology_id integer,
>> layer_id integer,
>> id integer,
>> a_type integer);
>>
>> DROP FUNCTION IF EXISTS ttt.testTopogeom(test_topom
>> ttt.fake_topogeometry);
>> CREATE FUNCTION ttt.testTopogeom(test_topom ttt.fake_topogeometry)
>>  RETURNS SETOF ttt.fake_topogeometry AS
>> $BODY$
>> -- this function is an empty function to test return of multiple topogeom
>>  DECLARE
>> the_topo ttt.fake_topogeometry;
>> BEGIN
>>  RETURN NEXT (1,1,1,1)::ttt.fake_topogeometry;
>> --RETURN NEXT (3,3,3,3)::ttt.fake_topogeometry;
>>
>> RETURN QUERY  SELECT (1,1,1,1)::ttt.fake_topogeometry as foo;
>> -- UNION
>>  --SELECT  (3,3,3,3)::ttt.fake_topogeometry as foo
>> RETURN  ;
>> END ;
>>  $BODY$
>> LANGUAGE plpgsql IMMUTABLE;
>>
>> SELECT *
>>  FROM ttt.testTopogeom((2,2,2,2)::ttt.fake_topogeometry);
>>
>>
>>
>> ___
>> postgis-users mailing list
>> postgis-us...@lists.osgeo.org
>> http://lists.osgeo.org/cgi-bin/mailman/listinfo/postgis-users
>>
>
>
> ___
> postgis-users mailing list
> postgis-us...@lists.osgeo.org
> http://lists.osgeo.org/cgi-bin/mailman/listinfo/postgis-users
>


Re: [GENERAL] Count of records in a row

2013-10-23 Thread Rémi Cura
Ok thanks for this precision Merlin.
Seems like aggregates are way more powerful than I thought.

Obviously I need a lot more reading about custom aggregates before fully
understanding it.

Elliot's query is pure SQL so obviously very cool !

It could be improved at the margin, and aggregates/functions are certainly
faster on big data.
But if you have no specific needs I would say Elliot's is easier and more
universal.


Cheers & thanks all for this good discussion.

Rémi-C


2013/10/23 Merlin Moncure 

> > 2013/10/22 Merlin Moncure 
> >> > With a standard loop, I loop n times, and each times I only need the
> >> > current
> >> > row plus the previous row which I put in memory, thus O(n).
> >>
> >> For posterity, the above is incorrect.  Since the aggregate is ordered
> >> through the window function, it gets executed exactly once per output
> >> row.  It behaves exactly like a loop.  You know this because there is
> >> no array in the aggregate state.
> >>
> > just out of pure curiosity,
> > is it always the case or is it due to this particular aggregate?
>
> It is always the case.  Generally speaking, aggregates, especially
> user defined aggregates, are run once per input row.   In this case
> the main utility of window functions is to order the aggregate
> execution calls and (especially) allow intermediate output per input
> row, instead of per aggregate grouping.
>
> On Tue, Oct 22, 2013 at 6:01 PM, Robert James 
> wrote:
> > Wow, this is an excellent discussion - and I must admit, a bit beyond
> > my abilities.  Is there a consensus as to the best approach to adopt?
> > Is Elliot's the best?
>
> For this *specific* problem, I would give Elliot's (extremely clever)
> query the nod on the basis that it does not require any supporting
> infrastructure, which is always nice.  That being said, once you start
> getting the mojo of user defined aggregates + window functions it
> starts to become clear that it's a cleaner way of doing many types of
> things that are normally handled by loops.
>
> merlin
>


[GENERAL] Re: [postgis-users] Error with "return query" ( "return next" working ) with custom type

2013-10-23 Thread Rémi Cura
For the archive :

FOR x IN query behaves like RETURN QUERY with regard to the previously
described behavior.

Cheers,
Rémi-C


2013/10/23 Rémi Cura 

> Hey,
> thanks for the answers,
>
> sorry for the cross post, i didn't know if it was postgis or postgres
> issue, hence the double post (removed postgis now).
>
> I'm afraid I don't understand perfectly the answer. Are you (both) saying
> that it is a normal behavior that a function that should return a custom
> type doesn't in fact return this custom type, but a number of columns
> composing this custom type?
>
> This seems like at best a strange behavior !
>
> The whole point of using custom type is to provide interface, right?
>
> To be precise, when specifying "return setof fake_topogeometry" I would
> expect that the function returns a  fake_topogeometry object (like the
> querry `SELECT (5,5,5,5)::fake_topogeometry`), not some columns !
>
> I'm obviously missing something, as
> SELECT * FROM testTopogeom(); --returns columns
> SELECT testTopogeom(); --returns object
>
> Could you suggest me some more documentation (other than
> http://www.postgresql.org/docs/9.3/static/plpgsql-control-structures.html
> )?
>
> Is this difference between Return Next and return query documented?
>
> Thanks for your help,
> Cheers,
> Rémi-C
>
>
> 2013/10/23 Steve Grey 
>
>> try:
>>
>> RETURN QUERY  SELECT 1,1,1,1;
>>
>> The error message means the cast failed between ttt.fake_topogeometry
>> and the topology_id (i.e. first) field of the return type of the
>> function, which isn't what you wanted to do.
>>
>> Pls. don't cross-post between lists.
>>
>>
>> On 23 October 2013 01:21, Rémi Cura  wrote:
>>
>>>
>>>
>>> Hey dear lists,
>>>
>>> Here is a self contained example showing strange behavior from a real
>>> life example concerning the use of postgis_topology topogeometry type.
>>>
>>>
>>> The problem is :
>>> when trying to return setof topogeometry,
>>> the "return query"  gives an error of type where there is none, and the
>>> return next is working fine.
>>>
>>> The precise error message is ERROR 42804
>>>
>>> "ERROR:  structure of query does not match function result type
>>> DETAIL:  Returned type ttt.fake_topogeometry does not match expected
>>> type integer in column 1.
>>> CONTEXT:  PL/pgSQL function ttt.testtopogeom(ttt.fake_topogeometry) line
>>> 9 at RETURN QUERY
>>> "
>>>
>>>
>>> Is it ok, postres bug, postgis bug?
>>>  What are the possible corrections?
>>>
>>>
>>> Here is the self contained code stored in the "ttt" schema.
>>>
>>>
>>> DROP SCHEMA IF EXISTS ttt CASCADE;
>>> CREATE SCHEMA ttt;
>>>
>>> DROP TYPE IF EXISTS ttt.fake_topogeometry CASCADE;
>>> CREATE TYPE ttt.fake_topogeometry AS
>>>(topology_id integer,
>>> layer_id integer,
>>> id integer,
>>> a_type integer);
>>>
>>> DROP FUNCTION IF EXISTS ttt.testTopogeom(test_topom
>>> ttt.fake_topogeometry);
>>> CREATE FUNCTION ttt.testTopogeom(test_topom ttt.fake_topogeometry)
>>>  RETURNS SETOF ttt.fake_topogeometry AS
>>> $BODY$
>>> -- this function is an empty function to test return of multiple topogeom
>>>  DECLARE
>>> the_topo ttt.fake_topogeometry;
>>> BEGIN
>>>  RETURN NEXT (1,1,1,1)::ttt.fake_topogeometry;
>>> --RETURN NEXT (3,3,3,3)::ttt.fake_topogeometry;
>>>
>>> RETURN QUERY  SELECT (1,1,1,1)::ttt.fake_topogeometry as foo;
>>> -- UNION
>>>  --SELECT  (3,3,3,3)::ttt.fake_topogeometry as foo
>>> RETURN  ;
>>> END ;
>>>  $BODY$
>>> LANGUAGE plpgsql IMMUTABLE;
>>>
>>> SELECT *
>>>  FROM ttt.testTopogeom((2,2,2,2)::ttt.fake_topogeometry);
>>>
>>>
>>>
>>> ___
>>> postgis-users mailing list
>>> postgis-us...@lists.osgeo.org
>>> http://lists.osgeo.org/cgi-bin/mailman/listinfo/postgis-users
>>>
>>
>>
>> ___
>> postgis-users mailing list
>> postgis-us...@lists.osgeo.org
>> http://lists.osgeo.org/cgi-bin/mailman/listinfo/postgis-users
>>
>
>


[GENERAL] GIST index : order Hack : getting the order used by CLUSTER .. USING my_index

2013-10-24 Thread Rémi Cura
Hey List,

I would like to be able to get the rows following the order of an index
(*NOT* getting an ORDER BY accelerated by the index, but simply the order
defined by the index).

Something like this :

SELECT my_row
FROM my_table
ORDER BY the_index ASC

where the_index is a GIST index over points.

I know there is a possibility as it is exactly what the command
CLUSTER my_table USING the_index
does.


I read the following page :
http://www.postgresql.org/docs/9.3/static/indexes-ordering.html, but it is
not of great help.
How much of a hack is it?

Cheers,

Rémi-C


Re: [GENERAL] GIST index : order Hack : getting the order used by CLUSTER .. USING my_index

2013-10-24 Thread Rémi Cura
Hello,

I'm interested in the tree structure inherent to GiST indexing.
I was thinking of retrieving it from the index order.
Do you know how I could access it directly?

My use case would be to take advantage of this gist ordering to order 2D
points
s1 : N1 N2 .. Nn
so that for any given t

> =?UTF-8?Q?R=C3=A9mi_Cura?=  writes:
> > I would like to be able to get the rows following the order of an index
> > (*NOT* getting an order by accelerated, but only an order defined by an
> > index).
>
> Since a GiST index hasn't got any specific internal order, I fail to see
> the point of this.
>
> regards, tom lane
>


Re: [GENERAL] GIST index : order Hack : getting the order used by CLUSTER .. USING my_index

2013-10-24 Thread Rémi Cura
Ok,
thank you Tom for this precise answer !

I don't understand how the CLUSTER .. USING index command works then.
It is supposed to rewrite the table on disk following the index order. Does it
do nothing for GiST indexes?

Cheers,
Rémi-C


2013/10/24 Tom Lane 

> =?UTF-8?Q?R=C3=A9mi_Cura?=  writes:
> > I'm interested in the tree structure inherent to the gist indexing.
> > I was thinking to retrieve it from order of index.
>
> How?  A SQL query would have no idea where the index page boundaries were
> in the sequence of retrieved tuples.
>
> > Do you know how I could access it directly?
>
> I don't think there's any way to do that without modifying the GiST code.
> What you really care about here is the contents of the upper index levels,
> which is something that's not exposed at all outside the index AM.
>
> regards, tom lane
>


Re: [GENERAL] GIST index : order Hack : getting the order used by CLUSTER .. USING my_index

2013-10-24 Thread Rémi Cura
Great,
thanks.

Now that you say it, I never saw any improvement when clustering a table with
GiST.
You just saved me a lot of unnecessary queries :-)

Cheers,

Rémi-C


2013/10/24 Tom Lane 

> =?UTF-8?Q?R=C3=A9mi_Cura?=  writes:
> > I don't understand how the CLUSTER .. USING index command work then.
> > It is supposed to rewrite on disk following index order. Does it do
> nothing
> > for GIST index?
>
> Nobody has ever demonstrated that CLUSTER has any value for anything
> except btree indexes.  It seems likely to me that it'd actually be
> counterproductive for indexes like GiST, which depend on data arriving in
> random order for the highest index levels to end up well-distributed.
>
> regards, tom lane
>


Re: [GENERAL] Connection pooling

2013-10-30 Thread Rémi Cura
Are the geoserver and postgres on the same computer?

Cheers,

Rémi-C


2013/10/30 si24 

> I'm not sure if its suppose to be under general so please let me know if I
> need to move it to another topic area please.
>
>
>
> --
> View this message in context:
> http://postgresql.1045698.n5.nabble.com/Connection-pooling-tp5776378p5776382.html
> Sent from the PostgreSQL - general mailing list archive at Nabble.com.
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>


Re: [GENERAL] Connection pooling

2013-10-31 Thread Rémi Cura
Isn't it a client problem?

Shouldn't the client application close the connection when done with data
retrieval, and not the other way around?

Cheers,
Rémi-C


2013/10/31 Jayadevan 

> I have never used pgbouncer myself. But my guess is you have to look at the
> Timeout parameters in the configuration file.
>
>
>
> --
> View this message in context:
> http://postgresql.1045698.n5.nabble.com/Connection-pooling-tp5776378p5776481.html
> Sent from the PostgreSQL - general mailing list archive at Nabble.com.
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>


Re: [GENERAL] Connection pooling

2013-10-31 Thread Rémi Cura
Hey,
I might be completely wrong, but when you say

"get the connections to close if they are not being used",

I'd say that it is bad client design not to close a connection when it is no
longer needed.
The client should retrieve the data, or close the connection once it has been
unused for a certain amount of time.

What you are trying to build is a garbage collector.

Cheers,
Rémi-C


2013/10/31 si24 

> I'm not 100% sure I follow in that part of if its the client cause
> currently
> when I run it on my own computer it does the same thing. Only when I stop
> tomcat and start it again then i get the 3 default connection that postgres
> has set up. our server does the same thing.
>
>
>
>
>
> --
> View this message in context:
> http://postgresql.1045698.n5.nabble.com/Connection-pooling-tp5776378p5776490.html
> Sent from the PostgreSQL - general mailing list archive at Nabble.com.
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>


Re: [GENERAL] psql variable interpolation with subsequent underscore

2013-11-08 Thread Rémi Cura
maybe a stupid answer,
but why not use another language (plpgsql? python?)?
Is it because of a transaction issue?
Sorry for not being more helpful.
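
That said, a psql-only workaround may be possible if I recall \set correctly
(it concatenates multiple values, so the variable boundary becomes explicit):

\set MM 201310
\set new_name my_table_ :MM _raw
alter table my_table rename to :new_name;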
Cheers,
Rémi-C


2013/11/8 Tim Kane 

> Hi all,
>
> I’m having difficulty using variable interpolation within psql, where that
> variable is within a table name…
>
>
> Like so..
>
> =# set MM 201310
> =# select :MM;
>  ?column?
> --
>201309
> (1 row)
>
>
> =# alter table my_table rename to my_table_:MM_raw;
> ERROR:  syntax error at or near ":"
> LINE 1: …my_table rename to my_table_:MM_ra...
>  ^
>
> The problem is that psql tries to interpret ‘MM_raw’ as the variable
> name, but my intention is to only interpret ‘MM’ followed by a literal
> underscore.
>
> I can’t find any other way to encapsulate the variable name in this way…
>  Is there a trick to it?
>
> I suspect I’ll need to work around this by altering the naming convention
> such that the MM is at the ‘end’ of the table name.  Maybe..   Thoughts?
>
>
> Tim
>
>
>
>


[GENERAL] Autodocumenting plpgsql function

2013-11-26 Thread Rémi Cura
Hey all,
does somebody know of a way to auto-document plpgsql functions, in a doxygen
style
(adding tags in comments for instance, or creating doc templates to fill)?

It would really help to write the doc and maintain it.

Thanks,

Rémi-C


Re: [GENERAL] Autodocumenting plpgsql function

2013-11-26 Thread Rémi Cura
Hey thanks for the answer.

I'm thinking of way more than that.
Typically, in your comments you include special tags, like @input; doxygen
then parses them and generates HTML documentation.
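
For example, something like this hypothetical comment header (the tags only
illustrate the style; they are not tied to an existing plpgsql toolchain):

CREATE OR REPLACE FUNCTION add_days(d date, n int)
RETURNS date AS $$
/** @brief  Adds n days to a date.
 *  @param  d  the starting date
 *  @param  n  number of days to add (may be negative)
 *  @return the shifted date
 */
BEGIN
    RETURN d + n;
END;
$$ LANGUAGE plpgsql IMMUTABLE;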

Cheers,

Rémi-C


2013/11/26 Albe Laurenz 

> Rémi Cura wrote:
> > somebody knows of a way to autodocument plpgsql function, in a docxygen
> style
> > (adding tags in comments for instance, or creating doc templates to
> fill).
> >
> > It would really help to write the doc and maintain it.
>
> I am not sure what you need, but I see two ways to
> document a function:
>
> 1) With /** .. */ comments in the beginning.
>Maybe doxygen can be used to parse a database dump.
>
> 2) With COMMENT ON FUNCTION ... IS '...';
>That also keeps the documentation close to where
>the code is, and it shows up in database dumps.
>
> Yours,
> Laurenz Albe
>


Re: [GENERAL] tracking scripts...

2013-11-26 Thread Rémi Cura
Now it's too late,
but maybe you could avoid using a single transaction (and instead use 127k
transactions).

Then at the end of every transaction you could print something in the GUI
(PRINT for pgscript, RAISE for plpgsql) or execute a command to write to a
file (COPY for instance).
It would also appear in the log, but less clearly.
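
As a sketch of the RAISE idea, a row-counting trigger could report progress
(all object names below are made up; a sequence is used as a cheap shared
counter):

CREATE SEQUENCE update_progress_seq;

CREATE OR REPLACE FUNCTION report_progress() RETURNS trigger AS $$
DECLARE
    n bigint;
BEGIN
    n := nextval('update_progress_seq');
    IF n % 1000000 = 0 THEN
        RAISE NOTICE 'processed % rows so far', n;
    END IF;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER big_table_progress
    BEFORE UPDATE ON big_table
    FOR EACH ROW EXECUTE PROCEDURE report_progress();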

Cheers,

Rémi-C


2013/11/26 Joey Quinn 

> I have a fairly large table (4.3 billion rows) that I am running an update
> script on (a bit over 127 thousand individual update queries). I am using
> the gui. It has been running for about 24 hours now. Is there any good way
> to gauge progress (as in, how many of the individual update queries have
> finished)?
>
>
>


Re: [GENERAL] tracking scripts...

2013-11-27 Thread Rémi Cura
First serious answer :
you don't have to use the command line;
you can use the pgadmin gui, load your file with all the commands, and
then hit F6 (or select "run as pgscript").
This will wrap each command in a transaction, and will print messages all
along.
Please test this on a few lines before trying it on everything.

Second :
lol for the secret message.
There is a very famous one like this in french, in private letters by
George Sand, a famous writter.

The text is very high level and nice french, but if you read one line then
skip the next ..,
the message is very dirty !

It is of course also very hard to translate ...

Cheers,
Rémi-C

Cher ami,
Je suis toute émue de vous dire que j'ai
bien compris l'autre jour que vous aviez
toujours une envie folle de me faire
danser. Je garde le souvenir de votre
baiser et je voudrais bien que ce soit
une preuve que je puisse être aimée
par vous. Je suis prête à montrer mon
affection toute désintéressée et sans cal-
cul, et si vous voulez me voir ainsi
vous dévoiler, sans artifice, mon âme
toute nue, daignez me faire visite,
nous causerons et en amis franchement
je vous prouverai que je suis la femme
sincère, capable de vous offrir l'affection
la plus profonde, comme la plus étroite
amitié, en un mot : la meilleure épouse
dont vous puissiez rêver. Puisque votre>
âme est libre, pensez que l'abandon ou je
vis est bien long, bien dur et souvent bien>
insupportable. Mon chagrin est trop
gros. Accourrez bien vite et venez me le
faire oublier. À vous je veux me sou-
mettre entièrement.
Votre poupée


2013/11/26 Merlin Moncure 

> On Tue, Nov 26, 2013 at 3:20 PM, Raymond O'Donnell  wrote:
> > On 26/11/2013 20:30, Merlin Moncure wrote:
> >> There are not many ways to Hand off information outside of the
> >> database while a transaction Is running. one way Is to write a Simple
> >> trigger in plpgsql that 'raise'es A notice every 'n' times trigger
> >> condition fires.  that'S Essentially the only Clean way to do it in
> >> such a way that the information is Returned to the Executing console.
> >> Thanks!
> >
> > Totally unrelated to the thread I noticed that the capitalised
> > letters in the email above spell out this:
> >
> >  THIISASECRET
> >
> > .. which (almost) spells "This is a secret". Was this intentional, or am
> > I just working too hard? :-)
>
> Well, bad spelling on my part.  To get the joke, you have to be A.
> observant, B. be using a gmail account, and C. be a comic book geek
> that grew up in the 80's.
>
> merlin
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>


Re: [GENERAL] tracking scripts...

2013-11-27 Thread Rémi Cura
I'm not an expert,
but I would think that if you can do without one single big transaction, it
would be way, way faster to do so !

The system could then simply skip keeping the log needed to be ready to roll
back a 1-billion-row update !

Of course it would be preferable to use psql to execute the statements one by
one as separate transactions, and to do it with X parallel psql sessions
(splitting the big text file into X parts), yet Joey seemed reluctant to
use the console =)
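
A small bash sketch of that parallel idea (assuming GNU split; file and
database names are placeholders):

# split the statement file into 4 parts without breaking lines, run them in parallel
split -n l/4 filename.sql part_
for f in part_*; do
    psql -d dbname -f "$f" &
done
wait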


Cheers,
Rémi-C


2013/11/27 Albe Laurenz 

> John R Pierce wrote:
> > On 11/26/2013 9:24 AM, Joey Quinn wrote:
> >> When I ran that command (select * from pg_stat_activity"), it returned
> >> the first six lines of the scripts. I'm fairly sure it has gotten a
> >> bit beyond that (been running over 24 hours now, and the size has
> >> increased about 300 GB). Am I missing something for it to tell me what
> >> the last line processed was?
> >
> > that means your GUI lobbed the entire file at postgres in a single
> > PQexec call, so its all being executed as a single statement.
> >
> > psql -f "filename.sql" dbname   would have processed the queries one at
> > a time.
>
> Yes, but that would slow down processing considerably, which would
> not help in this case.
>
> I'd opt for
> psql -1 -f "filename.sql" dbname
> so it all runs in a single transaction.
>
> Yours,
> Laurenz Albe
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>


Re: [GENERAL] tracking scripts...

2013-11-27 Thread Rémi Cura
Sorry,
if you cancel, everything will be rolled back
(that is actually what makes databases so powerful).

Unless it finishes, I don't know of a way to keep the changes.

At least on my computer (I don't know if you can generalize this), it is
way faster to split the work into many transactions, so you would gain time.
Using pgscript will make you lose some time, but it won't be much if each
query is long enough (some seconds at least).
If you intend to do this often, you may want to consider multiple parallel
psql sessions.

Cheers,

Rémi-C



2013/11/27 Joey Quinn 

> Wow, thank-you (sometimes the answer is right there in front of you...
> very new to Postgres, had wondered what the difference was between the run
> query and run as PGS script, but hadn't looked into it yet).
>
> So, here's the critical question(s) right now (for me)...
>
> With the way I launched it, using the "Execute query" button, if I now hit
> the "Cancel query" button, what happens? Have the last two days of updates
> already been committed? Or will they get rolled back? I would love to
> switch to the other method, so that I can gauge progress, but would hate to
> lose two days worth of run time...
>
>  If I do run the same script (a bit over 100k lines) in PGS mode, will
> that affect the speed? If so, how much? 1%? 5%? More?
>
>
>
>
> On Wed, Nov 27, 2013 at 3:20 AM, Rémi Cura  wrote:
>
>> First serious answer :
>> you don't have to use command line,
>> you can use the pgadmin gui, loading your file with all the command, and
>> then hit F6 (or select run as pgscript).
>> This will wrapp each command in a transaction , and will print messages
>> all along.
>> Please test this on a few line before trying on everything.
>>
>> Second :
>> lol for the secret message.
>> There is a very famous one like this in french, in private letters by
>> George Sand, a famous writter.
>>
>> The text is very high level and nice french, but if you read one line
>> then skip the next ..,
>> the message is very dirty !
>>
>> It is of course also very hard to translate ...
>>
>> Cheers,
>> Rémi-C
>>
>> Cher ami,
>> Je suis toute émue de vous dire que j'ai
>> bien compris l'autre jour que vous aviez
>> toujours une envie folle de me faire
>> danser. Je garde le souvenir de votre
>> baiser et je voudrais bien que ce soit
>> une preuve que je puisse être aimée
>> par vous. Je suis prête à montrer mon
>> affection toute désintéressée et sans cal-
>> cul, et si vous voulez me voir ainsi
>> vous dévoiler, sans artifice, mon âme
>> toute nue, daignez me faire visite,
>> nous causerons et en amis franchement
>> je vous prouverai que je suis la femme
>> sincère, capable de vous offrir l'affection
>> la plus profonde, comme la plus étroite
>> amitié, en un mot : la meilleure épouse
>> dont vous puissiez rêver. Puisque votre>
>> âme est libre, pensez que l'abandon ou je
>> vis est bien long, bien dur et souvent bien>
>> insupportable. Mon chagrin est trop
>> gros. Accourrez bien vite et venez me le
>> faire oublier. À vous je veux me sou-
>> mettre entièrement.
>> Votre poupée
>>
>>
>> 2013/11/26 Merlin Moncure 
>>
>>>  On Tue, Nov 26, 2013 at 3:20 PM, Raymond O'Donnell  wrote:
>>> > On 26/11/2013 20:30, Merlin Moncure wrote:
>>> >> There are not many ways to Hand off information outside of the
>>> >> database while a transaction Is running. one way Is to write a Simple
>>> >> trigger in plpgsql that 'raise'es A notice every 'n' times trigger
>>> >> condition fires.  that'S Essentially the only Clean way to do it in
>>> >> such a way that the information is Returned to the Executing console.
>>> >> Thanks!
>>> >
>>> > Totally unrelated to the thread I noticed that the capitalised
>>> > letters in the email above spell out this:
>>> >
>>> >  THIISASECRET
>>> >
>>> > .. which (almost) spells "This is a secret". Was this intentional, or
>>> am
>>> > I just working too hard? :-)
>>>
>>> Well, bad spelling on my part.  To get the joke, you have to be A.
>>> observant, B. be using a gmail account, and C. be a comic book geek
>>> that grew up in the 80's.
>>>
>>> merlin
>>>
>>>
>>> --
>>> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
>>> To make changes to your subscription:
>>> http://www.postgresql.org/mailpref/pgsql-general
>>>
>>
>>
>


Re: [GENERAL] Similarity search for sentences

2013-12-05 Thread Rémi Cura
Maybe a totally bad idea:
explode each sentence into (sentence_number, one_word) rows, one per word
(this makes a big table, you may want to partition it),
then put a classic index on the sentence number and on the word (btree if you
make = comparisons, something more subtle if you do "LIKE 'word'").

Depending on performance, it could be worth it to regroup by word:
(sentence_number[], one_word).
Then you could try an array or hstore index on sentence_number[]?
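
(A minimal sketch of that layout; the table and column names are invented
and the whole thing is untested:)

CREATE TABLE sentence_words (
    sentence_id bigint NOT NULL,
    word        text   NOT NULL
);

-- populate from a hypothetical sentences(id, content) table,
-- splitting on whitespace and lower-casing
INSERT INTO sentence_words (sentence_id, word)
SELECT s.id, lower(w)
FROM sentences s, LATERAL regexp_split_to_table(s.content, E'\\s+') AS w;

CREATE INDEX ON sentence_words (word);
CREATE INDEX ON sentence_words (sentence_id);

-- rank candidate sentences by how many of the query words they contain
SELECT sentence_id, count(*) AS shared_words
FROM sentence_words
WHERE word IN ('some', 'query', 'words')
GROUP BY sentence_id
ORDER BY shared_words DESC
LIMIT 50;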

Cheers,

Rémi-C


2013/12/5 Janek Sendrowski 

> Hi,
>
> I have tables with millions of sentences. Each row contains a sentence. It
> is natural language and every language is possible, but the sentences of
> one table have the same language.
> I have to do a similarity search on them. It has to be very fast,
> because I have to search for a few hundred sentences many times.
> The search shouldn't be context-based. It should just get sentences with
> similar words (maybe stemmed).
>
> I already had a try with gist/gin-index-based trigram search (pg_trgm
> extension), full text search (tsearch2 extension) and a pivot-based indexing
> scheme (Fixed Query Array), but it is all too slow or not suitable.
> Soundex and Metaphone aren't suitable either.
>
> I have been working on this project for a long time, but without any
> success.
> Do any of you have an idea?
>
> I would be very thankful for help.
>
> Janek Sendrowski
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>


Re: [GENERAL] Set returning aggregate?

2013-12-09 Thread Rémi Cura
Hello,
could it then be possible to define a CTE as the result of a
function?
Something like:

with first_cte AS (
    select blabla)
, second_cte AS (
    SELECT * FROM a_function_returning_a_set(parameters)
)
SELECT blabla FROM second_cte;

A CTE is much like what you would like, Bborie:
you can reference previously defined CTEs, and you output a set of rows.
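
(A minimal, self-contained sketch of that pattern; the function name
value_counts and its dummy body are invented just for illustration:)

CREATE OR REPLACE FUNCTION value_counts(low int, high int)
RETURNS TABLE(value int, n bigint) AS $$
    -- dummy body: count how often each value appears in a generated series
    SELECT g.v, count(*)
    FROM generate_series($1, $2) AS g(v)
    GROUP BY g.v
    ORDER BY g.v;
$$ LANGUAGE sql;

WITH counts AS (
    SELECT * FROM value_counts(1, 3)
)
SELECT * FROM counts WHERE n > 0;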

Cheers,
Rémi-C


2013/12/8 Pavel Stehule 

> Hello
>
>
> 2013/12/8 Bborie Park 
>
>> I'm wondering if an aggregate function can return a set of records?
>>
>
> No, a final function cannot return a set. It is disallowed.
>
> Theoretically it could be possible - it is explicitly prohibited. But if
> it were allowed, you could get the same problems as when using an SRF in the
> target list.
>
> postgres=# select generate_series(1,2),generate_series(1,2);
>  generate_series │ generate_series
> ─┼─
>1 │   1
>2 │   2
> (2 rows)
>
> Time: 49.332 ms
> postgres=# select generate_series(1,2),generate_series(1,3);
>  generate_series │ generate_series
> ─┼─
>1 │   1
>2 │   2
>1 │   3
>2 │   1
>1 │   2
>2 │   3
> (6 rows)
>
> Time: 0.445 ms
>
> It would be hard to define the expected behaviour when somebody uses several
> of these aggregates in the same query and they return different numbers of rows.
>
>
> Regards
>
> Pavel
>
>
>
>>
>> Say I have a table with a column of type raster (PostGIS). I want to get
>> the number of times the pixel values 1, 3 and 4 occur in that raster
>> column. I am hoping to build an aggregate function that returns the
>> following...
>>
>> value | count
>> ------+------
>>     1 |    12
>>     2 |    12
>>     3 |    12
>>
>> Is it possible for an aggregate function to return a set? I've written
>> some test cases and it looks like the answer is No but I'd like
>> confirmation.
>>
>> Thanks,
>> Bborie Park
>>
>> PostGIS Steering Committee
>>
>
>


Re: [GENERAL] Grouping, Aggregate, Min, Max

2013-12-13 Thread Rémi Cura
There is a trick to simplify this and avoid custom aggregates:
I think it will give you your answer.

http://postgresql.1045698.n5.nabble.com/Count-of-records-in-a-row-td5775363i20.html
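
(A minimal sketch of that trick - the usual "gaps and islands" approach -
using the column names from the sample data; the table name things is
invented and the query is untested:)

SELECT thing_id,
       category,
       min(period_id) AS period_from,
       max(period_id) AS period_to
FROM (
    SELECT t.*,
           -- consecutive period_ids all get the same grp value
           period_id - row_number() OVER (PARTITION BY thing_id, category
                                          ORDER BY period_id) AS grp
    FROM things t
) s
GROUP BY thing_id, category, grp
ORDER BY thing_id, category, period_from;

Each output row is one uninterrupted span; a string_agg() over those rows
would then give the "1-4, 8-9" style output.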

Cheers,
Rémi-C


2013/12/13 Misa Simic 

> Hi All,
>
> I am not sure how to define with words what I want to accomplish (so can't
> ask google the right question :) )
>
> So will try to explain with sample data and expected result:
>
> Scenario 1)
>
>  id  thing_id  category  period_id
>   1         1         A          1
>   2         1         A          2
>   3         1         A          3
>   4         1         A          4
>   5         1         A          5
>   6         1         A          6
>   7         1         A          7
>   8         1         A          8
>   9         1         A          9
>  10         2         A          1
>  11         2         A          2
>  12         2         A          3
>  13         2         A          4
>
> Expected result:
>
>  thing_id  category  periods
>         1         A  1-9
>         2         A  1-4
>
> (Sounds easy: group by thing_id, category and use min and max for period_id -
> but the further scenarios make it a bit more complicated...)
>
> Scenario 2)
>
>  id  thing_id  category  period_id
>   1         1         A          1
>   2         1         A          2
>   3         1         A          3
>   4         1         A          4
>   5         1         B          5
>   6         1         B          6
>   7         1         B          7
>   8         1         A          8
>   9         1         A          9
>  10         2         A          1
>  11         2         A          2
>  12         2         A          3
>  13         2         A          4
>
> Expected result:
>
>  thing_id  category  periods
>         1         A  1-4, 8-9
>         1         B  5-7
>         2         A  1-4
>
> Scenario 3)
>
>  id  thing_id  category  period_id
>   1         1         A          1
>   2         1         A          2
>   3         1         A          3
>   4         1         A          7
>   5         1         A          8
>   6         1         A          9
>   7         2         A          1
>   8         2         A          2
>   9         2         A          3
>  10         2         A          4
>
> Expected result:
>
>  thing_id  category  periods
>         1         A  1-3, 7-9
>         2         A  1-4
>
>
> So the goal is to group by thing_id and category - but if period_id is
> interrupted (not incremented by 1), to produce the aggregated spans...
>
> To get the desired results we came up with several CTEs (which makes the
> query a bit big, and rather "procedural": one CTE that calculates the diff
> between the current and the previous row, the next CTE uses the previous one
> to define groupings, the next CTE makes the aggregates, etc...)
>
> So I wonder - is there some kind of aggregate window function that produces
> the desired results?
>
>
> Many Thanks,
>
> Misa
>
>


[GENERAL] execution time for plpgsql function and subfunction

2014-01-10 Thread Rémi Cura
Hey List,
kind of a simple question :

I'm using the postgis_topology extension,
and I'm trying to figure out where the slowness comes from when importing data.

It involves plpgsql functions calling other plpgsql functions, inserts,
updates, etc.

I know I can use explain analyze for one query, but I don't know how to
get details of how much time each sub-function called by a main function
takes.

Thus it is very difficult to guess where the bottleneck is.

Thanks, cheers,

Rémi-C


Re: [GENERAL] execution time for plpgsql function and subfunction

2014-01-13 Thread Rémi Cura
Thanks all =)

Cheers,

Rémi-C
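
(For readers of the archive: besides the profiling approach linked below,
PostgreSQL's built-in function statistics give a quick first overview of
where the time goes. A minimal sketch; note that track_functions can only
be changed by a superuser:)

-- collect per-function timing for PL functions
SET track_functions = 'pl';

-- ... run the slow import here ...

-- cumulative time spent in each PL function, most expensive first
SELECT funcname, calls, total_time, self_time
FROM pg_stat_user_functions
ORDER BY self_time DESC;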


2014/1/13 Guillaume Lelarge 

> On Fri, 2014-01-10 at 10:10 +0100, Rémi Cura wrote:
> > Hey List,
> > kind of a simple question :
> >
> > I'm using the postgis_topology extension,
> > and I'm trying to figure where the slowness comes from when importing
> data.
> >
> > It involves plpgsql function calling other plpgsql functions, insert,
> > update, etc etc.
> >
> > I know I can use explain analyze for one query, but I don't know how to
> > get details of how much time each sub-function called by a main function
> > takes.
> >
> > Thus it is very difficult to guess where is the bottleneck.
> >
>
> Shameless plug, but here is how I do it:
>
> http://blog.guillaume.lelarge.info/index.php/post/2012/03/31/Profiling-PL/pgsql-functions
>
>
> --
> Guillaume
> http://blog.guillaume.lelarge.info
> http://www.dalibo.com
>
>


[GENERAL] reading array[text] in C extension function

2014-01-15 Thread Rémi Cura
Hey,
I'm trying to use an array of text as input in a C function in a custom
extension.

the prototype of the sql function is  :
CREATE OR REPLACE FUNCTION pc_subset(  dimensions TEXT[])


it is called like :
pc_subset(  ARRAY['X'::text,'Y'::text])

and the C function trying to read the text array (converting it to cstring
) :

text ** vals;

char ** cstring_array;
 nelems = ARR_DIMS(arrptr)[0];
vals = (text**) ARR_DATA_PTR(arrptr);
cstring_array = (char **) pcalloc(nelems * sizeof(char * ) );
for (i3=0;i3<nelems;i3++) {
cstring_array[i3] = text_to_cstring( vals[i3]);
elog(INFO, "elem %d of dim_array : %s\n",i3,cstring_array[i3]);
}

It crashes postgres because of a segfault.

Any help appreciated =)

Cheers,
Rémi-C

Re: [GENERAL] reading array[text] in C extension function

2014-01-16 Thread Rémi Cura
I'll answer my own question ;-)

Based on the function "btoptions" from the postgres source, which takes a
text[] as its first argument:

ArrayType  *array;
Datum      *dimdatums;
int         ndim;           /* actually the number of array elements */
int         i;

array = DatumGetArrayTypeP(dimensions);
Assert(ARR_ELEMTYPE(array) == TEXTOID);
pcinfo("after assert \n");

/* unpack the text[] into an array of Datums (text is typlen -1, not byval, 'i' aligned) */
deconstruct_array(array, TEXTOID, -1, false, 'i',
                  &dimdatums, NULL, &ndim);

/* construct the array to hold the result */
char ** final_dimension_array = (char **) pcalloc(ndim * sizeof(char *));
for (i = 0; i < ndim; i++)
{
    /* copy each text element out as a null-terminated C string */
    final_dimension_array[i] = TextDatumGetCString(dimdatums[i]);
}
//pcinfo("end of the text retrieval\n");

Cheers,

Rémi-C



2014/1/15 Rémi Cura 

> Hey,
> I'm trying to use an array of text as input in a C function in a custom
> extension.
>
> the prototype of the sql function is  :
> CREATE OR REPLACE FUNCTION pc_subset(  dimensions TEXT[])
>
>
> it is called like :
> pc_subset(  ARRAY['X'::text,'Y'::text])
>
> and the C function trying to read the text array (converting it to cstring
> ) :
>
> text ** vals;
>
> char ** cstring_array;
>  nelems = ARR_DIMS(arrptr)[0];
>  vals = (text**) ARR_DATA_PTR(arrptr);
> cstring_array = (char **) pcalloc(nelems * sizeof(char * ) );
>  for (i3=0;i3<nelems;i3++) {
> cstring_array[i3] = text_to_cstring( vals[i3]);
>  elog(INFO, "elem %d of dim_array : %s\n",i3,cstring_array[i3]);
> }
>
> It crashes postgres because of a segfault.
>
>
> Any help appreciated =)
>
> Cheers,
> Rémi-C
>


[GENERAL] expert : SRF returning double[]

2014-01-16 Thread Rémi Cura
Hey list,

another tricky C function interface problem :

How do I write a set-returning function that returns an array for each row?

It seems like the main macro, SRF_RETURN_NEXT, takes a Datum, so I can't
use PG_RETURN_ARRAYTYPE_P().

Should I encapsulate the array in a composite field (only 1 field: the
double array)?

I looked a lot for an example, without success.

Help greatly appreciated,

thanks,

Cheers,
Rémi-C


Re: [GENERAL] expert : SRF returning double[]

2014-01-16 Thread Rémi Cura
Another self-answer:

Surprisingly,

result = construct_array(...)
SRF_RETURN_NEXT(funcctx, PointerGetDatum(result));

works. But the memory behind the Datum must be allocated properly.

Cheers,

Rémi-C


2014/1/16 Rémi Cura 

> Hey list,
>
> another tricky C function interface problem :
>
> How to write a set returning function, that returns for each row an array?
>
> it seems like the main function SRF_RETURN_NEXT takes Datum and so I can't
> use PG_RETURN_ARRAYTYPE_P().
>
> Shall I encapsulate the array into a composite field (only 1 field : the
> double array)
>
> I looked a lot for an example without success.
>
> Help greatly appreciated,
>
> thanks,
>
> Cheers,
> Rémi-C
>


Re: [GENERAL] Ordering Results by a Supplied Order

2014-02-06 Thread Rémi Cura
Hey,
I don't understand the difference between this ORDINALITY option and
adding a "row_number() over()" in the SELECT.

Thanks,

Cheers,
Remi-C


2014-02-06 Vik Fearing :

>  On 02/06/2014 04:16 AM, Michael Sacket wrote:
>
> Often times I find it necessary to work with table rows in a specific,
> generally user-supplied order.  It could be anything really that requires
> an ordering that can't come from a natural column.  Most of the time this
> involved manipulating a position column from the client application.  In
> any case, I've often found that to be cumbersome, but I think I've come up
> with a solution that some of you may find useful.
>
>
> Up until 9.4, that's a good way to do it.
>
> Starting from 9.4, you can use the WITH ORDINALITY feature.
> http://www.postgresql.org/docs/devel/static/sql-select.html
>
> --
> Vik
>
>


Re: [GENERAL] Hard upgrade (everything)

2014-02-06 Thread Rémi Cura
On my private computer I upgraded postgres to 9.3 first, then upgraded
postgis.
Sadly, according to
http://trac.osgeo.org/postgis/wiki/UsersWikiPostgreSQLPostGIS ,
postgis 1.5 is not compatible with postgres 9.3.
However, PostGIS 2.1 is compatible with your current postgres version.
So, as suggested, you can upgrade postgis first (see hard/soft upgrade), then
upgrade postgres.
By the way, postgis is very easy to compile on Ubuntu (use the package system
to get the dependencies, then simply sudo ./autogen.sh, sudo ./configure,
make, sudo make install).
Cheers,
Remi-C



2014-02-06 alexandros_e :

> I would use normal pg_dump and pg_restore for the DBs and not
> utils/postgis_restore.pl. Also, AFTER you back up all databases and
> everything else, you could try to upgrade Postgis without upgrading
> PostgreSQL by building from source, e.g.
> http://trac.osgeo.org/postgis/wiki/UsersWikiPostGIS21Ubuntu1304src. There
> it says that Postgis 2.1 can be used with PostgreSQL 9.1. This will give you
> access to Postgis 2.1 features without reinstalling everything.
>
> Of course normally I would not upgrade if this is a 1-2 year project,
> unless I 100% need Postgis 2.1 features not present in 1.5.
>
>
>
>
>
> --
> View this message in context:
> http://postgresql.1045698.n5.nabble.com/Hard-upgrade-everything-tp5790801p5790811.html
> Sent from the PostgreSQL - general mailing list archive at Nabble.com.
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>


Re: [GENERAL] Ordering Results by a Supplied Order

2014-02-06 Thread Rémi Cura
Ok, thanks ^^
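
(For the record, a minimal sketch of ordering rows by a user-supplied order
with WITH ORDINALITY; the table items(id, label) is invented and this needs
PostgreSQL 9.4 or later:)

SELECT i.*
FROM unnest(ARRAY[42, 7, 13]) WITH ORDINALITY AS o(id, ord)
JOIN items i ON i.id = o.id
ORDER BY o.ord;

The rows come back in the order 42, 7, 13, regardless of the natural order
of items.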

Cheers,
Rémi-C


2014-02-06 Vik Fearing :

> On 02/06/2014 10:00 AM, Rémi Cura wrote:
> > Hey,
> >  I don"t understand the difference between this ORDINALITY option and
> > adding a "row_number() over()" in the SELECT.
>
> WITH ORDINALITY will give you something to order by.  You should never
> do "row_number() over ()" because that will give you potentially random
> results.
>
> --
> Vik
>
>

