Re: [GENERAL] PostgreSQL-related legal question

2015-03-11 Thread Gavin Flower

On 12/03/15 01:55, Bill Moran wrote:

On Wed, 11 Mar 2015 13:50:02 +0100
Dorian Hoxha  wrote:


I don't see how it could have a negative impact on the PostgreSQL project?
It's not like your job will be to find vulnerabilities and not disclose
them?

I don't think I should discuss the particulars of the situation on
the list. That's why I'm just looking for a lawyer who understands
the situation and can advise me.


On Wed, Mar 11, 2015 at 1:28 PM, Bill Moran 
wrote:


I've been asked to sign a legal document related to a PostgreSQL-
related job opening. I have concerns about the document and that
signing it could have a negative impact on the PostgreSQL project
(in addition to personal concerns).

I'm guessing I'm not the first person to go through this. I'm
hoping someone on this list can refer me to a lawyer who is
familiar with the challenges of NDAs and open source projects.

I'm not asking for pro-bono, I'm willing to pay for services,
but I just figured that I might get better results getting a
referral than by contacting $random_legal_service.

--
Bill Moran


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general



Bill cannot comment, but it might be along the lines of assigning all 
intellectual property rights, or something of that ilk. In that case, it 
might give the company ownership of stuff he may have contributed (or 
intends to contribute) to PostgreSQL in some way - and that could lead to 
legal complications affecting PostgreSQL adversely, which would be 
expensive and an unnecessary distraction.






Re: [GENERAL] how would you speed up this long query?

2015-03-27 Thread Gavin Flower

On 28/03/15 10:10, zach cruise wrote:

select
   sub_query_1.pid,
   sub_query_1.tit,
   sub_query_1.num,
   sub_query_3.cid,
   sub_query_3.id,
   sub_query_3.c,
   sub_query_3.s,
   sub_query_3.z,
   sub_query_3.cy,
   sub_query_3.cd,
   sub_query_3.cr,
   org.id as org__id,
   org.pid as org__pid,
   org.open,
   org.cid as org__cid,
   z0.zcg
from
   (select
   proj.pid,
   proj.tit,
   proj.num
   from
   proj,
   (select
   org.pid
   from
   org
   where
   org.open = 'Y') as sub_1
   where
   proj.pid = sub_1.pid) as sub_query_1,
   (select
   detail.cid,
   detail.id,
   detail.c,
   detail.s,
   detail.z,
   detail.cy,
   detail.cd,
   detail.cr
   from
   detail,
   (select
   org.id
   from
   org
   where
   org.open = 'Y') as sub_3
   where
   detail.id = sub_3.id) as sub_query_3,
   org,
   z0
where
   sub_query_1.pid = org.pid and
   sub_query_3.id = org.id and
   sub_query_3.z = z0.zcg
group by
   z0.zcg,
   sub_query_1.pid,
   sub_query_1.tit,
   sub_query_1.num,
   sub_query_3.cid,
   sub_query_3.id,
   sub_query_3.c,
   sub_query_3.s,
   sub_query_3.z,
   sub_query_3.cy,
   sub_query_3.cd,
   sub_query_3.cr,
   org.id,
   org.pid,
   org.open,
   org.cid



Version of PostgreSQL?

Operating system?

Hardware configuration?

Indexes?

EXPLAIN ANALYZE output?

Anything else that might be relevant?

What have you already done to investigate?
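
Independent of those details, the query can be expressed with explicit 
JOINs, which is usually easier for both humans and the planner to reason 
about. A sketch (untested; it assumes org.pid and org.id each identify a 
single org row - otherwise the three separate scans of org in the 
original are not collapsible), with the aggregate-free GROUP BY replaced 
by SELECT DISTINCT:

```sql
-- Equivalent formulation with explicit JOINs (a sketch, untested):
-- the three scans of "org" collapse into one, and the GROUP BY,
-- which has no aggregates, becomes SELECT DISTINCT.
SELECT DISTINCT
   proj.pid,
   proj.tit,
   proj.num,
   detail.cid,
   detail.id,
   detail.c,
   detail.s,
   detail.z,
   detail.cy,
   detail.cd,
   detail.cr,
   org.id   AS org__id,
   org.pid  AS org__pid,
   org.open,
   org.cid  AS org__cid,
   z0.zcg
FROM
   org
   JOIN proj   ON proj.pid = org.pid
   JOIN detail ON detail.id = org.id
   JOIN z0     ON z0.zcg = detail.z
WHERE
   org.open = 'Y';
```

If EXPLAIN ANALYZE shows the DISTINCT doing real work, that is a hint 
that the joins are multiplying rows.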


Cheers,
Gavin






Re: [GENERAL] Postgresql Development Options

2015-04-05 Thread Gavin Flower

On 06/04/15 08:46, Ray Madigan wrote:
The application will be very small; its main function is as a data 
acquisition tool that reads data from a C API, does minimal 
processing, and feeds it off to a PostgreSQL database for offline 
processing.  I haven't written the PC side application yet, just a 
prototype in Qt.  It just has to be as fast as possible, or I would 
use Java.  I am comfortable with C++, but would use another language 
if it would get the job done.


The ui is used basically to control the device that is generating the 
data, and parsing it into sql to send to the database.  Very simple.


Thank you.

ps, if I shouldn't include your email in my response tell me, I am not 
sure what the conventions are.


On Sun, Apr 5, 2015 at 1:35 PM, Adrian Klaver 
<adrian.kla...@aklaver.com> wrote:


On 04/05/2015 01:21 PM, Ray Madigan wrote:

I have been using PostgreSQL in Java off and on for many years.  I now
have an assignment where I have to build a very straightforward
networked application that needs to be able to insert Array data types
from a Windows environment to a remote Linux PostgreSQL database.

My first attempt was to use Qt.  There is a PostgreSQL driver, and I
found out the hard way that it doesn't support the Array data type.
I need a small UI so the user can set options for the application.
My question is: are there other UI options that I can use to develop
this application.


What languages are you comfortable with?

It seems from the above that the UI is separate from the application;
is that the case?

If so, what is the application written in?


Thanks in advance.



-- 
Adrian Klaver

adrian.kla...@aklaver.com 


The convention is to bottom post (like I am here), and include as much 
of the previous emails as is relevant.


You don't have to be too strict about cutting things out of previous 
emails; it is more to prevent emails getting far too long - in the above 
case there is no real need to omit anything.


In general mention what version of PostgreSQL you are using, though not 
really relevant here!



Cheers,
Gavin




Re: [GENERAL] documenting tables version control

2015-05-06 Thread Gavin Flower

See comments at bottom,

On 06/05/15 20:47, Tim Clarke wrote:
We keep the DDL statements for any tables in our Subversion code 
repository, just like any other programming code. It takes only a moment, 
and then all history of changes is available and comparable: svnserve on 
our Linux server, svn workbench on my Ubuntu laptop, all open source :)


Tim Clarke
On 05/05/15 21:13, Suresh Raja wrote:


Hi All:


I have tables with different versions in the same schema.  Like
T1a, T1b, T1c
T2a, T2b, T2c, T2d

...
etc.


I'm interested in documenting the various versions of the tables, maybe 
in an Excel sheet or maybe in another schema in the database.  I would 
like to store information on table name, column names, column order, 
primary key column, etc.  Also, is there a way I can reverse engineer 
this from the database itself?


Let me know if anybody has any suggestion or format that they have 
used.  I appreciate all your help, which shall be suitably acknowledged.


Thanks,
-Suresh Raja




Please don't top post in these lists.

Not upgraded to git yet???  :-)
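
For the reverse-engineering part of Suresh's question, the 
information_schema catalogs can do most of the work; a sketch (the 
schema name 'public' is an assumption):

```sql
-- List every table's columns, in order, from the catalogs:
SELECT
    table_name,
    ordinal_position,
    column_name,
    data_type
FROM
    information_schema.columns
WHERE
    table_schema = 'public'
ORDER BY
    table_name,
    ordinal_position;
```

information_schema.table_constraints and key_column_usage give the 
primary key columns, and `pg_dump --schema-only` captures the whole DDL 
for version control.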


Cheers,
Gavin




Re: [GENERAL] How to clear buffer

2015-05-13 Thread Gavin Flower

On 13/05/15 20:13, Ravi Krishna wrote:

I am writing benchmark scripts and as part of it would like to clear the cache
programmatically. This is to ensure that when we run select queries the data is
not read from the cache. Does PG provide any easy way to do this other than
the obvious one of restarting the database?

Thanks.



Don't forget the O/S also caches things!
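
On Linux the usual benchmarking reset is to restart PostgreSQL (which 
clears shared_buffers) and then drop the OS page cache. A sketch, shown 
in dry-run form since the real commands need root (the data directory 
path is illustrative):

```shell
#!/bin/sh
# Dry-run sketch of a between-runs cache reset on Linux: each command is
# echoed rather than executed (the real versions need root).
run() { echo "would run: $*"; }

run pg_ctl -D /var/lib/postgresql/data stop   # stop PG: clears shared_buffers
run sync                                      # flush dirty OS pages to disk
run sh -c 'echo 3 > /proc/sys/vm/drop_caches' # drop OS page cache + dentries/inodes
run pg_ctl -D /var/lib/postgresql/data start  # restart with cold caches
```

Remove the `run` wrapper to execute the commands for real.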

-Gavin




Re: [GENERAL] Testing Views

2015-05-20 Thread Gavin Flower

On 20/05/15 19:47, Andy Chambers wrote:

Hello All,

I have a problem for which a view seems like a nice solution. 
Basically we want to see all records in some table that are older than 
5 days and haven't yet gone through further processing.


This particular view is probably simple enough that it doesn't require 
unit tests but I wonder how people test complex views that depend on 
the current date or time. I found a thread on the hackers list [1] 
that talked about stubbing pg_catalog.now() but was wondering if any 
TDD minded developers had invented anything better since then.


One option I thought of was to just not use views that depend on the 
current date or time and instead create a set returning function that 
takes the time as a parameter. Would such a function have similar 
performance characteristics to an equivalent view?


Cheers,
Andy

[1]: 
http://postgresql.nabble.com/overriding-current-timestamp-td5507701.html
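
Andy's set-returning-function idea can be sketched like this (the table 
and column names are hypothetical); since the time arrives as a 
parameter, a test can pass any timestamp it likes:

```sql
-- Hypothetical schema: records(created_at timestamptz, processed boolean, ...)
CREATE FUNCTION pending_older_than(as_of timestamptz)
RETURNS SETOF records AS $$
    SELECT *
    FROM records
    WHERE created_at < as_of - interval '5 days'
      AND NOT processed;
$$ LANGUAGE sql STABLE;

-- Production call:
SELECT * FROM pending_older_than(now());

-- Deterministic test call:
SELECT * FROM pending_older_than(timestamptz '2015-05-01 00:00:00+00');
```

An inlinable SQL (not PL/pgSQL) function like this can be expanded by 
the planner much as a view would be, so performance should be 
comparable; EXPLAIN on both forms is the way to confirm.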

How about using a 'Time Machine'?

Run the database in a VM and change the date/time.  You can take a clone 
of the VM, and recreate it multiple times from a known state.


We used a similar technique to test for Y2K problems, except in those 
days we didn't have virtual machines like we have now!



Cheers,
Gavin




Re: [GENERAL] TRIGGER TRUNCATE -- CASCADE or RESTRICT

2015-06-02 Thread Gavin Flower

On 03/06/15 08:40, Andreas Ulbrich wrote:

On 02.06.2015 22:12, Melvin Davidson wrote:

Your problem is in your design.

If you do it like this:

CREATE TABLE A
(
p_col serial PRIMARY KEY,
acol  integer
);

CREATE TABLE B() INHERITS (A);

INSERT INTO A(acol) VALUES (1);
INSERT INTO B(acol) VALUES (2);

SELECT * FROM A;
SELECT * FROM B;

Then the sequence (p_col) will be UNIQUE across all tables and can be 
referenced.

No need for a key table.

No, someone can do:
INSERT INTO A VALUES (2,3);
TABLE A;
shows:
 p_col | acol
---+--
 1 |1
 2 |2
 2 |3
p_col is not unique!
Curious, I tried to investigate to get a better understanding, and ran 
into a problem...


   $ psql
   psql (9.4.1)
   Type "help" for help.

   gavin=> CREATE TABLE A
   gavin-> (
   gavin(> p_col serial PRIMARY KEY,
   gavin(> acol  integer
   gavin(> );
   CREATE TABLE
   gavin=> CREATE TABLE B() INHERITS (A);
   CREATE TABLE
   gavin=> INSERT INTO A(acol) VALUES (1);
   ERROR:  column "acol" of relation "a" does not exist
   LINE 1: INSERT INTO A(acol) VALUES (1);
   ^
   gavin=> \d+ a
                                        Table "public.a"
    Column |  Type   |                     Modifiers                     | Storage | Stats target | Description
   --------+---------+---------------------------------------------------+---------+--------------+-------------
    p_col  | integer | not null default nextval('a_p_col_seq'::regclass) | plain   |              |
    acol   | integer |                                                   | plain   |              |
   Indexes:
       "a_pkey" PRIMARY KEY, btree (p_col)
   Child tables: b

   gavin=> \d b
   Table "public.b"
 Column | Type   | Modifiers
   +-+---
 p_col  | integer | not null default nextval('a_p_col_seq'::regclass)
 acol   | integer |
   Inherits: a

   gavin=>




[...]
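
For reference, the failure mode under discussion can be reproduced 
directly (a sketch): indexes, including the PRIMARY KEY's, are not 
inherited, so the child table accepts anything.

```sql
-- PRIMARY KEY indexes are not inherited: b gets the column definitions
-- and the default, but no unique constraint at all.
CREATE TABLE a (p_col serial PRIMARY KEY, acol integer);
CREATE TABLE b () INHERITS (a);

INSERT INTO a VALUES (2, 3);  -- explicit p_col, bypassing the sequence
INSERT INTO b VALUES (2, 4);  -- accepted: b has no unique constraint

-- A per-child unique index prevents duplicates within b ...
CREATE UNIQUE INDEX b_p_col_idx ON b (p_col);
-- ... but nothing stops the same p_col appearing in both a and b.
```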


Cheers,
Gavin




Re: [GENERAL] Re: INSERT a real number in a column based on other columns OLD INSERTs

2015-06-25 Thread Gavin Flower

On 25/06/15 16:06, litu16 wrote:

SELECT t.time_index FROM table_ebscb_spa_log04 t WHERE t.fn_name =
NEW.fn_name AND (t.time_type = 'Start' OR time_type = 'Lap') ORDER BY
t.stmtserial DESC LIMIT 1 INTO t_ix;

Please format your SQL, as it makes it easier to read!
You don't have to follow my conventions, though.  :-)

By formatting it, I noticed an apparent mistake (in this case, 
functionally harmless).

You had time_type = 'Lap' rather than t.time_type = 'Lap'.

SELECT
t.time_index
FROM
table_ebscb_spa_log04 t
WHERE
t.fn_name = NEW.fn_name
AND
(
t.time_type = 'Start'
OR
t.time_type = 'Lap'
)
ORDER BY
t.stmtserial DESC
LIMIT 1
INTO t_ix;




Re: [GENERAL] PostgreSQL Developer Best Practices

2015-08-24 Thread Gavin Flower

On 25/08/15 01:15, Ray Cote wrote:
On Sat, Aug 22, 2015 at 11:46 AM, Karsten Hilbert 
<karsten.hilb...@gmx.net> wrote:



[...]


9. Do NOT arbitrarily assign an "id" column to a table as a
primary key when other columns
are perfectly suited as a unique primary key.

... 


  Good example:
CREATE TABLE accounts
( account_id bigint NOT NULL ,


I would not consider the general use of natural primary keys to be 
best practice.

[...]

Neither would I.

A primary key is often a foreign key in other tables.  So if the 
primary key is a natural key and the external world redefines that key, 
for example by changing its type or format, the change ripples 
invasively through multiple tables within the database.  You are also 
at the mercy of external control over what constitutes uniqueness; for 
example, the American Social Security Number is not unique!


Best practice is also to name the primary key 'id': you already know 
which table it is in, so prefixing the table name is redundant.  This 
also lets you clearly identify foreign keys, because a foreign key is 
the referenced table's name with the suffix '_id' appended.  Hence 'id' 
is a primary key, and 'account_id' is a foreign key pointing into the 
account table.


I have had to deal with databases where a child table's primary key was 
the parent table's primary key with extra characters appended, so you 
could have a child table's primary key exceeding 45 characters.  The 
child table only needs to know the primary key of its direct parent, so 
using int, or bigint, would be a far better solution!


Having said the above, there may well be valid reasons to use a natural 
key for the primary key - so it should NOT be an absolute rule to 
disallow it.
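
The convention argued for above can be sketched as DDL (the table and 
column names are illustrative):

```sql
-- Surrogate 'id' as the primary key; the externally assigned account
-- number stays UNIQUE and NOT NULL, but can change type or format
-- without touching any foreign keys.
CREATE TABLE account (
    id         bigserial PRIMARY KEY,
    account_no text NOT NULL UNIQUE    -- externally controlled identifier
);

CREATE TABLE invoice (
    id         bigserial PRIMARY KEY,
    account_id bigint NOT NULL REFERENCES account (id)
);
```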



Cheers,
Gavin




Re: [GENERAL] PostgreSQL Developer Best Practices

2015-08-24 Thread Gavin Flower

On 25/08/15 02:58, John Turner wrote:
[...]
Conversely, if synthetic keys are chosen as Primary, they must be 
accompanied by a legitimate Unique natural key.

Agreed, but only where appropriate.

Cheers,
Gavin




Re: [GENERAL] PostgreSQL Developer Best Practices

2015-08-24 Thread Gavin Flower

On 25/08/15 04:26, Joshua D. Drake wrote:

On 08/24/2015 08:56 AM, Melvin Davidson wrote:

 >The "serial" key is the default primary key amongst every single web
development environment in existence.

Methinks thou dost take too much for granted.

Yes, serial has its purpose, but I sincerely doubt it is "the default
primary key amongst every single web development environment in 
existence"

I am not sure where you get your stats from. Probably you are referring
to "Ruby on Rails". IMHO,


Rails
Anything that uses Hibernate (Java)
Django
Every PHP framework
Pyramid
Anything that uses sql-alchemy

I can go on for miles with this. It is true that a lot of these 
support non-serial keys. It is also true that it is not the default.


JD


I came to the idea of using surrogate primary keys long before I knew 
anything about the software on the above list or anything similar!



Cheers,
Gavin




Re: [GENERAL] PostgreSQL Developer Best Practices

2015-08-24 Thread Gavin Flower

On 25/08/15 14:45, David G. Johnston wrote:
On Mon, Aug 24, 2015 at 10:02 PM, Gavin Flower 
<gavinflo...@archidevsys.co.nz> wrote:


Also the best practice is to make the primary key name 'id' as you
do know the table it is in, so prepending the table name is
redundant - so you can clearly identify foreign keys because the
suffix '_id 'is prepended by the table name of the referenced
table.  Hence 'id' is a primary key, and account_id is a foreign
key pointing into the account table.


​ I would much rather be able to write:

SELECT parent_id, child_id, [...]
FROM parent
JOIN child USING (parent_id)

instead of

SELECT parent.id AS parent_id, child.id AS child_id, [...]

FROM parent
JOIN child ON (parent.id = child.parent_id)

​ Yes, looking at the parent table it is obvious that the id you are 
looking at is the "parent" id.  But as soon as you join two or more 
tables you are guaranteed to have multiple columns with the name "id" 
that you now need to disambiguate.



The column name "table_id" refers to the primary identifier for that 
entity no matter where it appears.  I'd rather have one redundant 
situation than one exception to the rule.


David J.


Hmm...

I consider it good practice to always give an alias for each table used, 
especially for non-trivial SQL statements.


So I think the above would look better (using slightly more realistic 
table names) as:


SELECT
c.id,
s.id,
[...]
FROM
company c
JOIN shop s ON (s.company_id = c.id);

Which I think is a lot clearer (it is obvious that you are joining a 
foreign key to a primary key), and you can add more stuff without it 
suddenly becoming ambiguous.


I once wrote a Sybase stored proc with over 3000 lines of SQL (not 
practical to split it up, unfortunately), individual selects were often 
over half a page.  It interrogated 17 tables from two different 
databases and needed 5 temporary tables.





Cheers,
Gavin

P.S.  The use of '[...]' was started by me way back in the heyday of 
Usenet, at the beginning of the 1990's!  Previously people used 
'[ omitted ]'.






Re: [GENERAL] PostgreSQL Developer Best Practices

2015-08-25 Thread Gavin Flower

On 25/08/15 19:04, Karsten Hilbert wrote:

On Tue, Aug 25, 2015 at 02:02:17PM +1200, Gavin Flower wrote:


On 25/08/15 01:15, Ray Cote wrote:

On Sat, Aug 22, 2015 at 11:46 AM, Karsten Hilbert <karsten.hilb...@gmx.net> wrote:


[...]

9. Do NOT arbitrarily assign an "id" column to a table as a
primary key when other columns
are perfectly suited as a unique primary key.

...

  Good example:
CREATE TABLE accounts
( account_id bigint NOT NULL ,


I would not consider the general use of natural primary keys to be best
practice.

Gavin, Ray,

I certainly didn't write any of the above.

Karsten

Hi Karsten,

It took me a couple of minutes, but I traced "9.  ..." to 
melvin6...@gmail.com, who opened the thread.


Looks like Ray misquoted, in the entry that can be identified (using 
the 'source' option in my mail client) by

   From: Ray Cote 
   Date: Mon, 24 Aug 2015 09:15:27 -0400
   Message-ID:
   

which was

   On Sat, Aug 22, 2015 at 11:46 AM, Karsten Hilbert 
   wrote:

   > > 1. Prefix ALL literals with an Escape
   > >EG:  SELECT E'This is a \'quoted literal \'';
   > > SELECT E'This is an unquoted literal';
   > >
   > >Doing so will prevent the annoying "WARNING:  nonstandard use of
   > escape in a string literal"
   >

   I'd be concerned that what is missing here is the bigger issue of  Best
   Practice #0: Use Bound Variables.
   The only way I've seen invalid literals show up in SQL queries is through
   the dynamic generation of SQL Statements vs. using bound variables.
   Not using bound variables is your doorway to SQL injection exploits.
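
   Ray's point about bound variables can be illustrated at the SQL level
   with PREPARE/EXECUTE (a sketch; the accounts table is hypothetical,
   and driver-level bind parameters achieve the same thing): the
   statement text stays fixed and the value travels separately, so a
   quote in the data can never become syntax.

```sql
-- Hypothetical accounts(name text) table:
PREPARE find_account (text) AS
    SELECT * FROM accounts WHERE name = $1;

-- The embedded quote is just data, not SQL syntax:
EXECUTE find_account('O''Brien');
```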


   9. Do NOT arbitrarily assign an "id" column to a table as a primary key
   > when other columns
   > are perfectly suited as a unique primary key.

   ...

Good example:
   > CREATE TABLE accounts
   > ( account_id bigint NOT NULL ,


   I would not consider the general use of natural primary keys to be best
   practice.
   Let's assume your account_id field is used as a foreign key in a dozen
   other tables.
   1) What happens if someone mis-types the account-id?
 To correct that, you also need to correct the FK field in the other
   dozen tables.
   2) What happens when your company starts a new project (or buys a
   competitor) and all the new account numbers are alpha-numeric?
   3) Your example shows the id as a bigint, but your rule is not limited to
   integers.
   What if your table is country populations and the primary key is country
   name?
   Now, you have quite large foreign keys (and a country changing its name is
   not unheard of).
   (and let's not even get started on case-sensitivity or character encodings).


Cheers,
Gavin




Re: [GENERAL] Why does splitting $PGDATA and xlog yield a performance benefit?

2015-08-25 Thread Gavin Flower

On 26/08/15 05:54, David Kerr wrote:

On Tue, Aug 25, 2015 at 10:16:37AM PDT, Andomar wrote:

However, I know from experience that's not entirely true (although it's not 
always easy to measure all aspects of your I/O bandwidth).

Am I missing something?


Two things I can think of:

Transaction writes are entirely sequential.  If you have disks
assigned for just this purpose, then the heads will always be in the
right spot, and the writes go through more quickly.

A database server process waits until the transaction logs are
written and then returns control to the client. The data writes can
be done in the background while the client goes on to do other
things.  Splitting up data and logs mean that there is less chance
the disk controller will cause data writes to interfere with log
files.

Kind regards,
Andomar


hmm, yeah those are both what I'd lump into "I/O bandwidth".
If your disk subsystem is fast enough, or you're on a RAIDed SAN
or EBS, you'd either overcome that, or not necessarily be able to.



Back when I actually understood the various timings of disc access on a 
mainframe system, back in the 1980's (disc layout and access are way 
more complicated now!), I found that there was a considerable difference 
between mainly sequential and mostly random access - easily greater 
than a factor of 5 (from memory) in terms of throughput.


That difference comes from the time to move heads between tracks, and 
from rotational latency (incurred when not reading sequential blocks on 
the same track).  There are other complications, which I have glossed 
over!



Cheers,
Gavin




Re: [GENERAL] PostgreSQL Developer Best Practices

2015-08-25 Thread Gavin Flower

On 26/08/15 03:40, Melvin Davidson wrote:
[...]

IOW: If we have an account table, then the account_id or account_no
 would be the primary key. There is no need to have a separate 
serial id as the primary key.

[...]

Account numbers are externally generated, and may potentially change.  
Management might suddenly decide that they want to start using the year 
the account started as the first 4 digits, or that the branch code 
should be reflected in it, or something else.  The database should be 
protected from these arbitrary changes.  Hence the account_no is not a 
good candidate for a primary key.



Cheers,
Gavin




Re: [GENERAL] PostgreSQL Developer Best Practices

2015-08-25 Thread Gavin Flower

On 26/08/15 08:56, Adrian Klaver wrote:

On 08/25/2015 08:40 AM, Melvin Davidson wrote:

Adrian,

Stop being so technical. When we/I speak of natural keys, we are talking
about the column
that would NATURALly lend itself as the primary key.


Pretty sure this is a technical list:)


Don't let inconvenient facts get in the way of a good argument!  :-)

[...]
Pretty sure parts are not unique to an exact vehicle, unless you are 
talking a totally handmade one. They are not even unique to make and 
model. As an example, I used to work on Class B Isuzu trucks. These 
models (FTR) were also built for Chevrolet as the Forward models. So 
right off the bat there were two part numbers for each part, one that 
started with 9 if you got it from Chevrolet and one with 11 from 
Isuzu, if memory serves. Then Isuzu decided to reorganize their part 
numbers, so that introduced another number, all pointing to the exact 
same part. Then there were those parts available from the parts 
houses (NAPA, etc.).


Then there was the greenhouse I worked for where we supplied UPC coded 
tags for our customers. In the beginning, it was simple, the item 
portion of the UPC was unique and with the company prefix served as a 
'natural' key for the tags. Then the chain stores we worked with must 
have all gone to the same seminar on how to be Walmart and decided 
they did not want unique numbers, but UPCs tied to price groups that 
covered a variety of plants. Luckily, I was too stupid to 

Natural Stupidity??? :-)

(Sorry, couldn't resist!)

know surrogate keys were bad, and had a sequence attached to the tag 
table. This then became the tag id and made life a lot easier during 
the transition. It still remains there, because people are people and 
'natural' tends to be artificial and transient.

Extremely good examples; I'll bear them in mind - they make me even more 
keen on surrogate primary keys.  I'm always very wary when people tell 
me some numbering scheme will NEVER change!!!


[...]


Cheers,
Gavin




Re: [GENERAL] PostgreSQL Developer Best Practices

2015-08-25 Thread Gavin Flower

On 26/08/15 02:17, Adrian Klaver wrote:
[...]


2) One of the older unique natural keys (genus, species) is not so 
unique. I am a fisheries biologist by training, and in my time the 
'unique' identifier for various fishes has changed. Now that 
ichthyologists have discovered DNA testing, it can be expected there 
will be even more changes. This is even more apparent when you go back 
in history. As an example:


https://en.wikipedia.org/wiki/Rainbow_trout

Rainbow trout

Current

Oncorhynchus mykiss

Past

Salmo mykiss Walbaum, 1792
Parasalmo mykiss (Walbaum, 1792)

[...]


Salmo gilberti Jordan, 1894
Salmo nelsoni Evermann, 1908

So you probably need a date stamp, so that a mapping table can record 
things relating to the correct name for a given period, while still 
referring to the same surrogate key used to reference other tables.
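
Such a mapping table can be sketched as (the names are illustrative):

```sql
-- The surrogate key stays stable while names come and go over time.
CREATE TABLE species (
    id bigserial PRIMARY KEY
);

CREATE TABLE species_name (
    id          bigserial PRIMARY KEY,
    species_id  bigint NOT NULL REFERENCES species (id),
    binomial    text   NOT NULL,   -- e.g. 'Oncorhynchus mykiss'
    valid_from  date   NOT NULL,
    valid_to    date               -- NULL = current name
);
```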


Maybe even worse, is when a species is suddenly found to be 2 or more 
distinct species!


Something similar could happen with account numbers: 2 companies with 
similar names might be assigned to the same account number, and lots of 
transactions recorded before the mistake is discovered. Though obviously 
a surrogate key would not give you complete protection from a lot of 
work sorting the mess out, but it would probably help!


I read a post a year or two back: a guy in Europe had at least 4 
different variations on his name, depending on the country he was in and 
the local language and cultural norms.


When I worked at a freezing works in Auckland in the 1970's, I heard 
that the payroll allowed for over 52 different names per employee (per 
year?).  Though I was never told the maximum number of name changes ever 
used.  Essentially, management might fire someone, but the union would 
complain, and they would be rehired under a different name - so I was 
told!  So the correct holiday pay & PAYE tax deductions would still 
relate to the same individual no matter how many name changes they had.



Cheers,
Gavin




Re: [GENERAL] PostgreSQL Developer Best Practices

2015-08-25 Thread Gavin Flower
NB: the attribution colours seem to be mixed up a bit here, but this is 
all dialogue between me & Adrian.



On 26/08/15 09:48, Adrian Klaver wrote:

On 08/25/2015 02:23 PM, Gavin Flower wrote:

On 26/08/15 08:56, Adrian Klaver wrote:

[...]

have all gone to the same seminar on how to be Walmart and decided
they did not want unique numbers, but UPCs tied to price groups that
covered a variety of plants. Luckily, I was too stupid to

Natural Stupidity??? :-)


Oh yeah and a long history too, but that needs at least a pitcher of 
beer to recount.

Well if you're ever in Auckland, I'll shout you a beer!
(We might even put you up for a night or two.)




(Sorry, couldn't resist!)


know surrogate keys where bad and had a sequence attached to the tag
table. This then became the tag id and made life a lot easier during
the transition. It still remains there, because people are people and
'natural' tends to be artificial and transient.

Extremely good examples, I'll bear them in mind - makes me even more
keen on surrogate primary keys.  I'm always very wary when people tell
me some numbering scheme will NEVER change!!!


To add a recent one: my partner Nancy signed up for Medicare last year 
to avoid the premium penalty. This year in July she signed up for 
Social Security. It turns out, for reasons I do not understand, that 
CMS (https://www.cms.gov/) changes the Medicare account number at that 
point. The reason we even cared is that the billing system thinks she 
has two accounts and is double billing. Time on the phone with someone 
at CMS was not enlightening. We were told to trust the system and 
eventually it would work itself out. Still waiting:(


STOP IT!!!  You're making me even more cynical and paranoid!  :-)





Re: [GENERAL] PostgreSQL Developer Best Practices

2015-08-25 Thread Gavin Flower

On 26/08/15 04:33, Marc Munro wrote:

On Sat, 2015-08-22 at 15:15 +, Melvin Davidson wrote:


I've been searching for a "PostgreSQL Developer Best Practices" with not
much luck,
so I've started my own. At the risk of stirring up a storm of controversy,
I would appreciate additional suggestions and feedback.


You might add: Create all relation names as plurals.

Or, if your site uses predominantly singular names, make that the
standard.  Consistency within the site is more important than any
dogmatic belief about whether the singular or plural form is better.  If
you don't put it in the standard, someone will eventually create tables
with names that don't gel with everything else.

__
Marc





Actually I would suggest standardising on singular names, and not JUST 
because that is the standard I prefer!  :-)


But (also) because:

1. Singular words tend to be shorter

2. Plurals are more ambiguous wrt spelling

3. There are other good reasons, that I've forgotten for now :-(
   (but I remember having them!!!)


Cheers,
Gavin




Re: [GENERAL] PostgreSQL Developer Best Practices

2015-08-25 Thread Gavin Flower

On 26/08/15 11:34, Adrian Klaver wrote:
[...]


Agreed, but it happens. When Lowes took over a local hardware
chain(Eagles) here in Washington state they moved very quickly on
changing the account numbers. The company I worked for who supplied
Eagles and then Lowes sat on a check for $22,000 that was sent to us
in error because the account numbers got switched. We called them when
we got the check, but it still took them six months to own up to it.


DOH!

Next time a screwball outfit sends you a check for $22k erroneously just
go deposit it :-)


Well, that is what I wanted to do; the owner overruled me:(. Something 
about Lowes having more lawyers than we did. The strange part was we 
called them and told them what had happened, and supplied the relevant 
information that explained the mix-up. You would have thought that us 
calling to return a check that was not meant for us would have 
raised a flag!

[...]

Many years ago a department store credited our account with a refund for 
about $150, which obviously was not meant for us - we had never bought 
that item!


They replied, essentially saying we were due the refund.

We sent a letter yet again, explaining the problem, and saying we were 
not entitled to it.


They then sent us a check for the amount, which we cashed, feeling we 
had done our best to help them & could not be bothered wasting 
more time trying to sort things out for them!


I think any judge would laugh them out of court!




--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] PostgreSQL Developer Best Practices

2015-08-26 Thread Gavin Flower

On 27/08/15 00:03, Vincent de Phily wrote:

On Wednesday 26 August 2015 10:27:04 Gavin Flower wrote:

Actually I would suggest standardising on singular names, not JUST
because that this the standard I prefer!  :-)

But (also) because:

  1. Singular words tend to be shorter

  2. plurals are more ambiguous wrt spelling

  3. there are other good reasons, that I've forgotten for now :-(
 (but I remember having them!!!)

Oh, so you name your tables 'fish' and 'crow' instead of 'school' and
'murder' ? Such wasted opportunities :p


???


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] PostgreSQL Developer Best Practices

2015-08-26 Thread Gavin Flower

On 26/08/15 12:17, Melvin Davidson wrote:
[...]
So for the sake of argument, a natural key is something that in itself 
is unique and the possibility of a duplicate does not exist.
Before ANYONE continues to insist that a serial id column is good, 
consider the case where the number of tuples will exceed a bigint.

Don't say it cannot happen, because it can.

If you create tuples in your table at the rate of a million records per 
second every hour of every day of every year, it will take over 250,000 
years to exceed the value of a bigint!
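
For anyone who wants to check the arithmetic, a quick back-of-the-envelope 
query (purely illustrative) reproduces the figure:

```sql
-- Max bigint divided by inserts per year at 1,000,000 rows per second:
SELECT 9223372036854775807 / (1000000::numeric * 60 * 60 * 24 * 365)
       AS years_to_exhaust_bigint;
-- roughly 292,000 years, comfortably "over 250,000"
```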


[...]



--
*Melvin Davidson*
I reserve the right to fantasize.  Whether or not you
wish to share my fantasy is entirely up to you.
I will refrain from making snarky comments about your sig - tempting 
though it might be!  :-)



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] PostgreSQL Developer Best Practices

2015-08-26 Thread Gavin Flower

On 27/08/15 06:59, Raymond O'Donnell wrote:

On 26/08/2015 19:54, Gavin Flower wrote:

On 27/08/15 00:03, Vincent de Phily wrote:

On Wednesday 26 August 2015 10:27:04 Gavin Flower wrote:

Actually I would suggest standardising on singular names, not JUST
because that this the standard I prefer!  :-)

But (also) because:

   1. Singular words tend to be shorter

   2. plurals are more ambiguous wrt spelling

   3. there are other good reasons, that I've forgotten for now :-(
  (but I remember having them!!!)

Oh, so you name your tables 'fish' and 'crow' instead of 'school' and
'murder' ? Such wasted opportunities :p


???

A school of fish and a murder of crows... wonderfully evocative
collective nouns.

Ray.



Hmm...

Reminds me of a story I was told, where a programmer used the names of 
his favourite footballers as labels in an assembler program!  :-)



-Gavin



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Public facing PostgreSQL hosting ?

2015-08-31 Thread Gavin Flower

On 31/08/15 19:03, essam Ganadily wrote:

hi
i do develop on PostgreSQL from home and from work. i need public 
facing PostgreSQL , something i can use Pgadmin from anywhere.
performance and scalability is not important because i will be running 
like few operations per day.


any idea where can i find that ?
thankx
I don't think it is a good idea to let people have direct access (in 
the sense of allowing them to submit SQL, or even just values) to a 
database, apart from carefully controlled groups.  Usually it is best to 
have some sort of middleware in between the public & the database.



Cheers,
Gavin


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] table dependencies

2015-09-07 Thread Gavin Flower

On 07/09/15 19:44, Raymond O'Donnell wrote:

On 06/09/2015 22:59, FarjadFarid(ChkNet) wrote:

No worries.

I found a way.


Would you share it, for the archives?

Ray.



I think I can do it relatively simply, in a reasonably general fashion.

If it is of real interest, let me know, and I'll see if I can find time 
to try and implement it.



Cheers,
Gavin


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] ID column naming convention

2015-10-13 Thread Gavin Flower

On 14/10/15 06:36, droberts wrote:

Hi, is there a problem with naming IDs differently when used as a FK vs as a table's own ID?
For example


mydimtable ()
  ID
  name
  description


myfacttable ()
   my_dim_id   # FK to ID above
   total_sales


I 'think' if I don't enforce foreign key constraints, then this practice
prevents tools from being able to generate ERD diagrams right?



--
View this message in context: 
http://postgresql.nabble.com/ID-column-naming-convention-tp5869844.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.


My practice is to name the PRIMARY KEY as id, and foreign keys with the 
referenced table name plus the suffix '_id'.


Leaving the table name off the primary key name, and just using id, 
makes it more obvious that it is a primary key (plus it seems redundant 
to prefix the primary key name with its own table name!).


CREATE TABLE house
(
id  int PRIMARY KEY,
address text
);

CREATE TABLE room
(
id   int PRIMARY KEY,
house_id int REFERENCES house(id),
name text
);


There are exceptions like:

CREATE TABLE human
(
id        int PRIMARY KEY,
mother_id int REFERENCES human (id),
father_id int REFERENCES human (id),
name      text
);
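
To show how the convention reads in practice, here are a couple of 
illustrative joins against the tables above (example queries only, not 
from the original message):

```sql
-- A child table's foreign key (room.house_id) joins to the parent's bare id:
SELECT h.address, r.name
FROM house h
JOIN room r ON r.house_id = h.id;

-- The self-referencing human table reads just as clearly with two aliases:
SELECT child.name, mum.name AS mother, dad.name AS father
FROM human child
LEFT JOIN human mum ON child.mother_id = mum.id
LEFT JOIN human dad ON child.father_id = dad.id;
```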

Cheers,
Gavin


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] *** QUESTION *** After successful 'BEGIN;' command -- why PGSQL_TRANSACTION_ACTIVE and not PGSQL_TRANSACTION_INTRANS?

2015-10-14 Thread Gavin Flower

On 12/10/15 22:52, Steve Petrie, P.Eng. wrote:
[...]


BTW -- this PostgreSQL newbie (bye bye mysql) is getting a nice warm 
fuzzy feeling, about PostgreSQL and its amazingly helpful community :)

[...]

I can attempt to remedy your 'nice warm fuzzy feeling'!  :-)

More seriously:
(1) why did you consider PostgreSQL?
(2) what made you change?
(3) for you, in your situation: how do the ease of use & functionality 
of PostgreSQL compare to those of MySQL?



Probably best to start a new thread in pgsql-advocacy for your answers - 
I'm sure many people would be interested in what you have to say!




Cheers,
Gavin




--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] ID column naming convention

2015-10-15 Thread Gavin Flower

On 16/10/15 13:09, Jim Nasby wrote:

On 10/13/15 2:34 PM, Gavin Flower wrote:



My practice is to name the PRIMARY KEY as id, and foreign keys with the
referenced table name plus the suffix '_id'.

Leaving the table name off the primary key name, and just using id,
makes it more obvious that it is a primary key (plus it seems redundant
to prefix the primary key name with its own table name!).


There's two things that are ugly about that though:

Joins become MUCH easier to screw up. When you have 5 different fields 
that are all called 'id' it's trivial to mix them up. It's much harder 
to accidentally do something like 'blah.person_id = foo.invoice_id'.


The other issue is common to all "bare word" names (id, name, 
description, etc): it becomes completely impossible to find all 
occurrences of something in code. If you grep your entire codebase for 
'person_id', you know you'll find exactly what you want. Grepping for 
'id' OTOH would be useless.

It would seem very dodgy to use a join based on the apparently very 
different semantics implied by 'blah.person_id = foo.invoice_id'!!! :-)


Because 2 fields in different tables have the same name, it does not 
necessarily mean they have the same semantics. For example 2 tables 
could have a field named 'start_date', but the one in a table called 
'employment' would have different semantics to the one in 'project'.


Since 'id' is only used to indicate a PRIMARY KEY, there is less 
confusion in joins, and it is clear when something is a foreign key 
rather than a PRIMARY KEY.  For example, if two tables both refer to the 
same human, you can join using a.human_id = b.human_id - and it is 
clearer when you are joining a child to a parent table, for example 
line_item.stock_id = stock.id.


Adopting your convention would result in picking up not only foreign 
key references, but also the primary keys - which may, or may not, be 
helpful!


It would be very rare to have a join such as project.id = task.id, it is 
usually a mistake to join tables on their primary key - so using just 
'id' as the PRIMARY KEY name is a bonus.


I once devised a stored procedure in SyBase with over 3,000 lines of SQL 
(I would have broken it up in smaller units, but it was not practicable 
in that development environment).  It had 7 temporary tables, 5 used 
'id' as the PRIMARY KEY - and 2 used the name of the PRIMARY KEY of an 
existing table ('tcs_id' & 'perorg_seq'), because that made more sense, 
as they had the same semantic meaning. I did not design the 2 
databases I queried, but I suspect sometimes I might decide it best to 
use something other than just 'id' - but it would be very rare (I won't 
say never!) that I'd use the table name as a prefix for the primary key.


Searching on a bare word names can be useful when the fields have 
similar, related semantics.  In a real database, I'd be very unlikely to 
use 'name' for a field, though using 'description' might be valid.  
Though in general, I would agree that using several words in a name is 
normally preferable. Also it would also be better to define appropriate 
DOMAINs rather than just using bare types like 'text' & 'int' - to 
better document the semantics and make it easier to change things in a 
more controlled way.


If one were grepping for occurrences of the use of the PRIMARY KEY of 
the table human, one would look for 'human_id'; one would only grep for 
'id' to find the use of PRIMARY KEYs in general.


No naming convention is perfect in all situations, and I'll adapt mine 
as appropriate.  In my experience, my convention (well to be honest, I 
adopted it from others - so I can't claim to have originated it!) seems 
to be better in general.


Essentially it is a guideline; I won't insist that you have your 
computers confiscated if you use a different convention!







--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] ID column naming convention

2015-10-17 Thread Gavin Flower

On 18/10/15 00:13, Karsten Hilbert wrote:

On Fri, Oct 16, 2015 at 04:59:12PM -0500, Jim Nasby wrote:


BTW, I found Karsten's idea of using 'pk' for the surrogate key, and
fk_table_name interesting. It helps avoid ambiguity from externally
generated ID values.

That's the point :-)

Here's a real live schema using (mostly) the above approach:

http://www.gnumed.de/~ncq/gnumed/schema/gnumed_v20/

in case anyone is interested in taking a look.

Karsten

Hmm...

Apparently (according to your naming convention) several tables (such as 
'clin.substance_intake') have 2 PRIMARY KEYs!!!


I guess you must have a large wall to display the schema on!

Looks like a lot of fun.

Can you tell us more about the database (ignoring nit-picking!)?


Cheers,
Gavin


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Recursive Arrays 101

2015-10-26 Thread Gavin Flower
I would suggest that you use int or bigint for primary keys, and have 
mapping tables to convert the scientific term to the surrogate primary key.


If the mapping table has additional attributes, like date of change & 
reason, then you can also print a history of changes.


Then the relationships between tables will be more isolated from changes 
in scientific nomenclature!  Plus if the same animals known by different 
scientific names at different times, you can have several mappings to 
the same animal.  Also if an organism is moved from one phylum to 
another, you can find the organism via either new or old references.
I've heard of cases where one species is suddenly found to be 2 or more 
distinct species!
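
A minimal sketch of the mapping-table idea (all table and column names 
here are illustrative, not from the thread):

```sql
-- The surrogate key is what the rest of the schema references;
-- scientific names live in a separate mapping table carrying history.
CREATE TABLE organism
(
    id int PRIMARY KEY
);

CREATE TABLE scientific_name_map
(
    id              int  PRIMARY KEY,
    organism_id     int  NOT NULL REFERENCES organism (id),
    scientific_name text NOT NULL,
    valid_from      date NOT NULL,
    change_reason   text
);

-- Old and new names both resolve to the same organism:
SELECT organism_id
FROM scientific_name_map
WHERE scientific_name = 'Cyanocitta stelleri';
```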



Cheers,
Gavin


On 26/10/15 18:19, David Blomstrom wrote:
LOL - I don't think there are any natural keys here. Traditional 
scientific names are amazingly flaky. I guess I shouldn't call them 
flaky; it's just that no one has ever figured out a way do deal with 
all the complexities of classification. The new LSID's might be more 
stable - but which LSID does one choose? But it's amazing how many 
"aliases" are attached to many taxonomic names; utterly bewildering.


On Sun, Oct 25, 2015 at 10:09 PM, Adrian Klaver 
mailto:adrian.kla...@aklaver.com>> wrote:


On 10/25/2015 09:10 PM, David Blomstrom wrote:

It's also interesting that some entities (e.g. EOL) are now using
something called Life Science ID's (or something like that) in
lieu of
traditional scientific names. It sounds like a cool idea, but
some of
the LSID's seem awfully big and complex to me. I haven't
figured out
exactly what the codes mean.


Aah, the natural key vs surrogate key conversation rears its head.



Then again, when I navigate to the Encyclopedia of Life's
aardvark page
@ http://www.eol.org/pages/327830/overview the code is actually
amazingly short.



-- 
Adrian Klaver

adrian.kla...@aklaver.com 




--
David Blomstrom
Writer & Web Designer (Mac, M$ & Linux)
www.geobop.org 




--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Recursive Arrays 101

2015-10-26 Thread Gavin Flower

Hi David,

Please don't top post!

On 27/10/15 09:42, David Blomstrom wrote:
I've created my first table in postgreSQL. I'd like to ask 1) if you 
see any errors, 2) do you have any suggestions for improving it, and 
3) can you give me the code I need to paste into the shell (or 
whatever you call the command-line tool) to recreate it?


This is what the table's schema looks like in MySQL...

N - int(6) [Primary Key]
Taxon - varchar(50) [Unique Key]
Parent - varchar(50) [Index Key]
ParentID - tinyint(1) [Index Key]
Slug - varchar(50) [Index Key]
NameCommon - varchar(50)
Plural - varchar(50)
Extinct - tinyint(1)
Rank - tinyint(2)
Key - tinyint(1)

The table type is MyIsam, collation is latin1_general_ci

Slug, NameCommon and Plural are NULL.

All of my tables have a default first column named N or ID, which is 
simply a numerical key that begins with 1. It's always designated the 
primary key.


All the other columns in this table can be divided into two 
categories, text (varchar) and numerical (tinyint).


The values in the columns Taxon and Slug serve as URL's, so they can 
have no spaces, apostrophes, accents, etc. (Taxon handles scientific 
names, Slug common names, if any.) So a row focusing on the Steller's 
jay would have values like these:


NameCommon - Steller’s jay
Plural - Steller’s jays
Taxon - Cyanocitta-stelleri
Slug - stellers-jay
Parent - Cyanocitta

The column ParentID - which I want to use for hierarchical 
relationships - has values ranging from 1 for Mammalia (the first row) 
to 5 for the species level. The column Extinct has the value 1 (not 
extinct) or 2, 3 or 4 for various categories of extinct taxons.


The column Rank has the value 25 for the first row (class Mammalia), 
35 for each order (e.g. Carnivora), 45 for each family, 55 for each 
genus and 65 for each species. The value for Key is 1 (for every row), 
designating it a tetrapod. The bird, reptile and amphibian tables have 
the same key value, while fish, invertebrates and plants have their 
own unique keys.


I have Unique keys on N and Taxon, Index keys (not unique) on Parent, 
ParentID and Slug.


My PostgreSQL table is in a database named GeoZoo. When I go into 
pgAdmin3 > SQLPane, it looks like this:


CREATE TABLE public.gz_life_mammals
(
  "N" integer NOT NULL,
  "Taxon" character varying(50) NOT NULL,
  "Parent" character varying(50) NOT NULL,
  "ParentID" smallint NOT NULL,
  "Slug" character varying(50),
  "NameCommon" character varying(50),
  "Plural" character varying(50),
  "Extinct" smallint NOT NULL,
  "Rank" smallint NOT NULL,
  "Key" smallint NOT NULL,
  CONSTRAINT "Primary Key" PRIMARY KEY ("N"),
  CONSTRAINT "Unique Key" UNIQUE ("Taxon")
[I haven't added any non-unique keys yet.]
)
WITH (
  OIDS=FALSE
);
ALTER TABLE public.gz_life_mammals
  OWNER TO postgres;

I should also mention that Taxon is the column I use to UNION or JOIN 
this table with other tables.


P.S. If I decide to organize things by taxonomic levels (e.g. kingdom, 
class, etc.), then it should be easy to rename the table, delete a few 
columns, and refill it with data associated with a particular class.

[...]

Would suggest using lower case column names without embedded spaces, if 
possible!


NEVER assign tables to the postgres user, application tables should be 
owned by a user!


Note that PRIMARY KEY gives you both NOT NULL & uniqueness.  So you 
don't need a separate, named PRIMARY KEY constraint!


'id' would be better than 'N' for the primary key name. ==> 'id int  
PRIMARY KEY'


Using 'text' rather than 'character varying(50)' would probably be better.

Since you are making a single column unique, suggest 'taxon   text 
UNIQUE NOT NULL'


You don't need to specify 'OIDS=FALSE', as that is now the default.



Cheers,
Gavin




--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Recursive Arrays 101

2015-10-26 Thread Gavin Flower

On 27/10/15 10:17, David Blomstrom wrote:
What does "top post" mean? And what do you mean by "embedded spaces"? 
Are you referring to the underscores in the TABLE name?


On Mon, Oct 26, 2015 at 2:12 PM, Gavin Flower 
mailto:gavinflo...@archidevsys.co.nz>> 
wrote:


Hi David,

Please don't top post!


[...]

Top posting is when you put your reply at the top of the post, as you 
did just now in response to my post.


Here I am bottom posting, which is the norm for postgres mailing lists.

Ignore my comment about embedded spaces, I misread what you had written; 
underscores are fine.



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Recursive Arrays 101

2015-10-26 Thread Gavin Flower

On 27/10/15 10:26, David Blomstrom wrote:

Here's what it looks like now:

CREATE TABLE public.gz_life_mammals
(
  id integer NOT NULL,
  taxon text NOT NULL,
  parent text NOT NULL,
  slug text,
  namecommon text,
  plural text,
  extinct smallint NOT NULL,
  rank smallint NOT NULL,
  key smallint NOT NULL,
  CONSTRAINT "Primary Key" PRIMARY KEY (id),
  CONSTRAINT "Unique Key" UNIQUE (taxon)
)
WITH (
  OIDS=FALSE
);
ALTER TABLE public.gz_life_mammals
  OWNER TO postgres;

* * * * *

I don't even have a clue what OIDS=FALSE means; I haven't read up on 
it yet. It's just there by default. I haven't figured out how to 
change the NULL value for any columns, other than toggle back and 
forth between NULL and NOT NULL.


To assign a user, would I just ask it to associate a table with my 
username? Can I do that with pgAdmin3?


Thanks.

Hi David,

Constructing SQL in an editor and executing the SQL script using psql is 
often a lot easier than using pgadmin3, and gives you far more control!  
I use both, but more often use psql.


As the postgres user, using psql, you can create a user & database 
like:


CREATE ROLE gavin
LOGIN
CREATEDB;

CREATE DATABASE gavin
OWNER gavin;


Obviously, you can create a database with a different name for the same 
user.  Having the database name match the user name simply means that if 
you call up psql from a terminal as that user, you don't need to 
explicitly tell it which database to use.


I created an SQL script create_table.sql (usually better to have a more 
descriptive name!) in an editor:


CREATE TABLE public.gz_life_mammals
(
  id            int PRIMARY KEY,
  taxon         text UNIQUE NOT NULL,
  parent        text NOT NULL,
  slug          text,
  name_common   text,
  plural        text,
  extinct       smallint NOT NULL,
  rank          smallint NOT NULL,
  key           smallint NOT NULL
);

Here is a session where I create the table (I opened the terminal in 
the same directory as the SQL script; you can also simply cd to the 
relevant directory before executing psql):

$ psql
psql (9.4.4)
Type "help" for help.

gavin=> \i create_table.sql
CREATE TABLE
gavin=> \q
$

You might be able to do all the above using pgadmin3...


Cheers,
Gavin



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Importing CSV File

2015-10-26 Thread Gavin Flower

On 27/10/15 10:45, David Blomstrom wrote:
I tried to import a CSV file into a PostgreSQL table using pgAdmin 
III. I got an error message: "extra data after last column."



All my spreadsheets have an "end of data" column that has /r/n in each 
cell. When I import a CSV file into a MySQL table, everything beyond 
/r/n is ignored. Is there some way to tell PostgreSQL to stop at /r/n?


Thanks.
You can import the CSV into a spreadsheet (such as LibreOffice Calc), 
manually delete the extraneous columns, and re-export it.




--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Recursive Arrays 101

2015-10-26 Thread Gavin Flower

On 27/10/15 09:29, David Blomstrom wrote:
[...]
Regarding my "usage patterns," I use spreadsheets (Apple's Numbers 
program) to organize data. I then save it as a CSV file and import it 
into a database table. It would be very hard to break with that 
tradition, because I don't know of any other way to organize my data.


On the other hand, I have a column (Rank) that identifies different 
taxonomic levels (kingdom, class, etc.). So I can easily sort a table 
into specific taxonomic levels and save one level at a time for a 
database table.


There is one problem, though. I can easily put all the vertebrate 
orders and even families into a table. But genera might be harder, and 
species probably won't work; there are simply too many. My spreadsheet 
program is almost overwhelmed by fish species alone.

[...]

The maximum number of rows in a LibreOffice (Calc) spreadsheet is 
1,048,576


LibreOffice is free and available for Apple Macs (as well as for Linux & 
Microsoft O/S's), see

http://www.libreoffice.org/download
I suggest you download the 5.0 version


Cheers,
Gavin



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Recursive Arrays 101

2015-10-26 Thread Gavin Flower

On 27/10/15 11:18, David Blomstrom wrote:

I pasted this into the shell...

CREATE ROLE david
LOGIN
CREATEDB;

CREATE DATABASE GeoZoo2
OWNER david;

and I did it again, replacing LOGIN with my password, but when I 
refresh pgAdmin III, there are no new databases.


And when I paste this in, it doesn't create a table...

CREATE TABLE public.gz_life_mammals2
(
  id            int PRIMARY KEY,
  taxon         text UNIQUE NOT NULL,
  parent        text NOT NULL,
  slug          text,
  name_common   text,
  plural        text,
  extinct       smallint NOT NULL,
  rank          smallint NOT NULL,
  key           smallint NOT NULL
);
You can list all the databases with '\l' and the tables with '\dt' in 
psql, for example:


$ psql
psql (9.4.4)
Type "help" for help.

gavin=> \l
                                       List of databases
        Name        |   Owner    | Encoding |   Collate   |    Ctype    |   Access privileges
--------------------+------------+----------+-------------+-------------+-----------------------
 gavin              | gavin      | UTF8     | en_GB.UTF-8 | en_GB.UTF-8 |
 greeter_quickstart | jboss_dev  | UTF8     | en_GB.UTF-8 | en_GB.UTF-8 |
 jboss_dev_db       | jboss_dev  | UTF8     | en_GB.UTF-8 | en_GB.UTF-8 |
 jboss_sys_db       | jboss_sys  | UTF8     | en_GB.UTF-8 | en_GB.UTF-8 |
 jboss_test_db      | jboss_test | UTF8     | en_GB.UTF-8 | en_GB.UTF-8 |
 mydb               | mydb_admin | UTF8     | en_GB.UTF-8 | en_GB.UTF-8 |
 pgsp2ed            | gavin      | UTF8     | en_GB.UTF-8 | en_GB.UTF-8 |
 postgres           | postgres   | UTF8     | en_GB.UTF-8 | en_GB.UTF-8 |
 postgresql_book    | gavin      | UTF8     | en_GB.UTF-8 | en_GB.UTF-8 |
 template0          | postgres   | UTF8     | en_GB.UTF-8 | en_GB.UTF-8 | =c/postgres          +
                    |            |          |             |             | postgres=CTc/postgres
 template1          | postgres   | UTF8     | en_GB.UTF-8 | en_GB.UTF-8 | =c/postgres          +
                    |            |          |             |             | postgres=CTc/postgres
(11 rows)

gavin=> \c pgsp2ed
You are now connected to database "pgsp2ed" as user "gavin".
pgsp2ed=> \dt
                   List of relations
 Schema |           Name           | Type  |  Owner
--------+--------------------------+-------+----------
 public | accounts                 | table | gavin
 public | application_settings_new | table | gavin
 public | application_settings_old | table | gavin
 public | fiverow                  | table | gavin
 public | fiverows                 | table | gavin
 public | fruit                    | table | gavin
 public | fruit_in_stock           | table | gavin
 public | fruit_offer              | table | gavin
 public | modified_table           | table | gavin
 public | modify_test              | table | gavin
 public | my_data                  | table | gavin
 public | name                     | table | gavin
 public | names                    | table | gavin
 public | namex                    | table | gavin
 public | notify_test              | table | gavin
 public | original_database        | table | gavin
 public | original_user            | table | gavin
 public | salary                   | table | fred
 public | salary_change_log        | table | fred
 public | test                     | table | postgres
 public | test1                    | table | gavin
 public | test2                    | table | gavin
 public | test3                    | table | gavin
 public | tmp_setting              | table | gavin
 public | word                     | table | gavin
(25 rows)

pgsp2ed=> \q
$



Cheers,
Gavin


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Recursive Arrays 101

2015-10-26 Thread Gavin Flower

On 27/10/15 11:18, David Blomstrom wrote:
[...]

CREATE DATABASE GeoZoo2

[...]

Would strongly advise NOT using capital letters in names of databases in pg!



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Recursive Arrays 101

2015-10-26 Thread Gavin Flower

On 27/10/15 11:57, David Blomstrom wrote:
When I type in /l, it just says "database Postgres," even though I can 
see TWO databases in pgAdmin III. When I type in /dt, it says Username 
[postgres].


On Mon, Oct 26, 2015 at 3:40 PM, Gavin Flower 
mailto:gavinflo...@archidevsys.co.nz>> 
wrote:


On 27/10/15 11:18, David Blomstrom wrote:
[...]

CREATE DATABASE GeoZoo2

[...]

Would strongly advise NOT using capital letters in names of
databases in pg!




--
David Blomstrom
Writer & Web Designer (Mac, M$ & Linux)
www.geobop.org <http://www.geobop.org>


What do '\l' and '\du' actually give you (give us the actual output)?


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Recursive Arrays 101

2015-10-26 Thread Gavin Flower

On 27/10/15 11:18, David Blomstrom wrote:

I pasted this into the shell...

CREATE ROLE david
LOGIN
CREATEDB;

CREATE DATABASE GeoZoo2
OWNER david;

and I did it again, replacing LOGIN with my password, but when I 
refresh pgAdmin III, there are no new databases.


And when I paste this in, it doesn't create a table...

CREATE TABLE public.gz_life_mammals2
(
  id            int PRIMARY KEY,
  taxon         text UNIQUE NOT NULL,
  parent        text NOT NULL,
  slug          text,
  name_common   text,
  plural        text,
  extinct       smallint NOT NULL,
  rank          smallint NOT NULL,
  key           smallint NOT NULL
);

Try typing this into psql, assuming this is a postgres user session:
\c geozoo2 david
(format is '\c DATABASE USERID').

If there is an error, please copy & paste the exact error message returned!



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Recursive Arrays 101

2015-10-26 Thread Gavin Flower

On 27/10/15 12:15, David Blomstrom wrote:


Server [localhost]: \c geozoo2 david

Database [postgres]:


*Shell2*

Username [postgres]: \c geozoo2 david

psql: warning: extra command-line argument "david" ignored

psql: warning: extra command-line argument "l" ignored

psql: invalid port number: "dt"


Press  to continue...



In my psql session I get...

gavin=> \c mydb mydb_admin
You are now connected to database "mydb" as user "mydb_admin".
mydb=>


If you have multiple psql sessions, suggest you delete all except 1 or 2.

By 'type of shell' is meant: are you using a bash shell in your terminal, or 
csh, or something else? Bash stands for Bourne Again SHell; it processes 
commands like 'psql' that you type into the shell.


Please copy in the email addresses of the others helping you & the mailing 
list!



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Recursive Arrays 101

2015-10-26 Thread Gavin Flower

On 27/10/15 12:27, David Blomstrom wrote:

LOL - This is precisely why I prefer GUI's. ;)

I have two PostgreSQL icons on my taskbar - a blue elephant (pgAdmin 
III) and a little monitor (PSQL). When I click on PSQL, it always 
opens at least two windows or instances. When I type things in and hit 
enter, it spawns more windows. I've counted at least as many as a dozen.



[...]

Please DO NOT top post!!!

I use both GUI's & terminals.

The 'don't worry your tiny little mind' philosophy of Apple is an 
anathema to me - it tries to hide too much of the useful stuff from 
users.  I use the 'Mate' Desktop Environment (http://mate-desktop.org) 
on Fedora Linux - I have 35 virtual desktops, each with 2 highly 
configured panels that auto hide, and both my terminals and directory 
windows allow multiple tabs (just like web browsers). Apple does not 
permit such extensive customisation.


To get a proper terminal in an Apple Mac, have a look at:

https://en.wikipedia.org/wiki/Terminal_%28OS_X%29
http://www.macworld.co.uk/feature/mac-software/get-more-out-of-os-x-terminal-3608274

Note that it uses the bash shell, see:
https://www.gnu.org/software/bash


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Recursive Arrays 101

2015-10-26 Thread Gavin Flower

On 27/10/15 13:29, John R Pierce wrote:

On 10/26/2015 5:20 PM, David G. Johnston wrote:

What exactly are you showing us here?


he's demonstrating a lack of reading comprehension.   I'm done. thread 
on ignore.





I think it's proof that Apple products rot your brain!





Re: [GENERAL] Recursive Arrays 101

2015-11-05 Thread Gavin Flower

On 06/11/15 04:33, Rob Sargent wrote:

On 11/05/2015 04:56 AM, Achilleas Mantzios wrote:

On 04/11/2015 17:53, Rob Sargent wrote:

On 11/04/2015 03:03 AM, Achilleas Mantzios wrote:
Sorry for being kind of late to the party (I was in 2015.PgConf.EU 
!!), and not having read
most of the replies, what we have been successfully doing for this 
problem for our app

is do it this way :
parents int[] -- where parents stores the path from the node to the 
root of the tree

and then have those indexes :
btree (first(parents))
btree (level(parents)) -- length
btree (last(parents))
gin (parents gin__int_ops) -- the most important

This has been described as the "genealogical tree" approach, and works 
very well, IMHO much better

than nested sets.

Is there a more complete description of this approach available?  By 
the title one might assume it could be applied to populations as 
opposed to phylogeny (the OP's use case).  Does it deal with 
consanguinity?  Does it perform well going "up" the tree (which is 
of course branched at every level)?


From here https://en.wikipedia.org/wiki/Phylogenetic_tree I assume 
that phylogenetic trees are normal
trees, and I see no reason why they could not be modeled with the 
genealogical approach described. The earliest paper

I based my work on was :
https://www.google.com/url?sa=t&rct=j&q=&esrc=s&source=web&cd=2&ved=0CCUQFjABahUKEwiR6auUlvnIAhXGvhQKHVyDA-s&url=https%3A%2F%2Fdownload.samba.org%2Fpub%2Funpacked%2Fldb%2Fldb_sqlite3%2Ftrees.ps&usg=AFQjCNEktJsibP435MBki5cdGmO_CzKmwg&sig2=I9yC_tpyeWrEueDJTXbyAA&bvm=bv.106674449,d.d24&cad=rja

Finding the root is O(1). Going "up" the tree or finding common 
ancestry is reduced to the problem
of finding overlap/intersections/contains/contained between 
postgresql arrays.


The indexes, functions and operators provided by contrib/intarray 
were a basic element for the success of this

approach.
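The path-array scheme described above can be sketched outside SQL. A minimal Python model (assuming the array runs root-first — the posts do not fix the ordering), showing why root lookup is O(1) and why common ancestry and descendant queries reduce to the array intersection/containment operations that intarray's GIN index accelerates:

```python
# Each node id maps to its ancestor path, root first (an assumed ordering;
# the thread only says "parents stores the path from the node to the root").
tree = {
    1: [],        # 1 is the root
    2: [1],       # 2 is a child of 1
    3: [1],
    4: [1, 2],    # 4 is a child of 2
    5: [1, 2],
    6: [1, 3],
}

def root_of(node):
    # O(1): the first entry of the path, or the node itself at the root
    path = tree[node]
    return path[0] if path else node

def depth(node):
    # Path length plays the role of level(parents) in the thread
    return len(tree[node])

def common_ancestors(a, b):
    # "Common ancestry" reduces to array intersection, as the post says
    return set(tree[a]) & set(tree[b])

def descendants(node):
    # Containment test; a GIN index on the array serves this in PostgreSQL
    return {n for n, path in tree.items() if node in path}

print(root_of(4))              # -> 1
print(common_ancestors(4, 6))  # -> {1}
print(descendants(2))          # -> {4, 5}
```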

Going "up" a genealogy to me means getting two parents, four 
grandparents, 8 great grandparents etc.  On a good day, at least when 
there are no loops.  This isn't, to my understanding, how phylogeny 
works (but my genetics degree was thirty years ago) so perhaps I'm 
still confused by the titles used.  And certainly not to say that your 
approach isn't what the OP really needs!



You're actually going 'DOWN' the tree, in terms of how trees are used in 
computer science & graph theory!


See http://www.mathcove.net/petersen/lessons/get-lesson?les=32


Cheers,
Gavin




Re: [GENERAL] maximum "target list" (maximum columns)

2016-06-16 Thread Gavin Flower

On 16/06/16 16:39, Shaun Cutts wrote:

The opinion of database developers is that using more than 1664 columns is bad 
design, so that the current maximum number of columns is not onerous.

When I simply built applications on top of databases, I was of the same opinion.

However, now my job is to perform some sort of analysis on data from elsewhere. 
Over the course of a couple of weeks I need to do something with data that often is
structured into tables of many thousands of columns. Figuring out the 
appropriate form is part of manipulating the data — something for which I’d
like to use Postgres. My applications, such as they are, handle metadata not 
data.

Operations such as crosstab or json_populate_record can easily create rows with 
many more than 1664 fields. I also use data such as census data — with many 
different survey fields all applying to geographic areas. For a given 
application only some are relevant, but ideally I’d create a materialized view 
for a given project with relevant data. In the best case, the “library” version 
of the dataset would have just a few very wide tables. Storing in json is 
possible but inconvenient, slower and means that often type info must be stored
separately.

In short, IMHO the database shouldn’t force me to structure my data before I 
understand it, and being able to query it is how I come to understand it. At 
some scale, practicality dictates that one needs to take “special measures” to 
handle large volumes of data — large in width as well as length. But this 
boundary should be on the order of millions of columns, not thousands.

Is there a reason besides “its bad design” to disallow tables with many columns?




I've not had to deal with extremely fat database tables.

However, back when I was a COBOL programmer: one system had a very big 
record type, more than 10 times bigger than anything else I'd ever dealt 
with.  It was an extreme pain to debug problems with such a large record 
(try poring over hex dumps on lineflow!) - so I expect it would be 
painful for a large database table.


Could you spread the huge number of columns over a number of database 
tables with some sort of logical grouping?



Cheers,
Gavin





Re: [GENERAL] OT hardware recommend

2016-06-17 Thread Gavin Flower

On 18/06/16 08:36, Andy Colson wrote:

Hi all.

I have access to quite a few laptop HD's (10 to 15 of them at least), 
and thought that might make a neat test box that might have some good 
IO speed.


Needs to be cheap though, so linux with software raid, rack mount 
preferred but not required.


Anyone have any experience with anything like that?  $2K might be 
possible, painful, but possible.


Suggestions?

Thanks for your time,

-Andy


It would be a good idea to say what country you are in, and what city 
(or locality).


What I know about Auckland in New Zealand may not be relevant to you...  :-)


Cheers,
Gavin





Re: [GENERAL] OT hardware recommend

2016-06-17 Thread Gavin Flower

On 18/06/16 14:23, Scott Marlowe wrote:
On Fri, Jun 17, 2016 at 2:36 PM, Andy Colson wrote:


Hi all.

I have access to quite a few laptop HD's (10 to 15 of them at
least), and thought that might make a neat test box that might
have some good IO speed.

Needs to be cheap though, so linux with software raid, rack mount
preferred but not required.

Anyone have any experience with anything like that?  $2K might be
possible, painful, but possible.

Suggestions?


Sell them all and buy a couple of 800G SSDs? :)


You can get 4TB SSD's now - but somewhat pricey!!!

(besides, you'd still need backups)





Re: [GENERAL] Uber migrated from Postgres to MySQL

2016-07-28 Thread Gavin Flower

On 28/07/16 17:52, Jason Dusek wrote:
With regards to write amplification, it makes me think about 
OIDs. Used to be, every row had an OID and that OID persisted across 
row versions.


https://www.postgresql.org/docs/9.5/static/runtime-config-compatible.html#GUC-DEFAULT-WITH-OIDS

Would reintroducing such a feature address some of Uber's concerns 
about multiple indexes? It could, and would do so without the implicit 
requirement of a foreign key; but it would also require a fast OID to 
CTID mapping.


On Tue, 26 Jul 2016 at 10:40 Guyren Howe wrote:


Honestly, I've never heard of anyone doing that. But it sounds
like they had good reasons.

https://eng.uber.com/mysql-migration/

Thoughts?



Would it be best to increase OIDs to 64 bits?

Possibly a choice of 32/64 to be decided when the DB is created???


Cheers,
Gavin





Re: [GENERAL] C++ port of Postgres

2016-08-16 Thread Gavin Flower

On 16/08/16 18:24, dandl wrote:


Just wondering what the end goal is for this project... Is it to just 
maintain an up to date Postgres fork that will compile with a C++ 
compiler? Is it to get a conversation going for a direction for 
Postgres itself to move?  The former I don't see gaining much traction 
or doing all that much for the state of the project. The latter 
possibly could if the community gets on board.


I would certainly hope the latter. Having done some work on extension 
functions and an extension language for Postgres, the current 
situation can be quite limiting.


·Parts of my code could only be written in C++, so I finished up with 
a mixed build, which is not ideal.


·My other issue was dealing with the Datum macros. Type-safe inline 
C++ functions would help reduce dumb errors.


Not compelling reasons perhaps, but just a vote for a move in that 
direction, some time.


Regards

David M Bennett FACS

Andl - A New Database Language - andl.org

I note that gcc itself is now written in C++, and so is squid (web 
proxy)...


In both cases, part of the motivation to change from C was to appeal to 
new developers - from what I remember of the discussions.



Cheers,
Gavin





Re: [GENERAL] IDE for function/stored proc development.

2016-09-08 Thread Gavin Flower

On 08/09/16 19:42, Martijn Tonies (Upscene Productions) wrote:

Hi,

For what it's worth, Database Workbench with PostgreSQL support was 
released Yesterday.


http://www.upscene.com/database_workbench/whatsnew


With regards,

Martijn Tonies
Upscene Productions

[...]

Hmm... exe's don't work natively on Linux...




Re: [GENERAL] Chante domain type - Postgres 9.2

2016-09-25 Thread Gavin Flower

On 26/09/16 17:58, Patrick B wrote:

Hi guys,

I've got this domain:

CREATE DOMAIN public.a_city
  AS character varying(80)
  COLLATE pg_catalog."default";


And I need to increase the type from character varying(80) to 
character varying(255).


How can I do that? didn't find info about it. I'm using Postgres 9.2

Thanks!
Patrick


Why not simply use the 'text' data type?

To change the data type on a column you can use:
ALTER [ COLUMN ] /column_name/ [ SET DATA ] TYPE /data_type/ [ COLLATE 
/collation/ ] [ USING /expression/ ]


see:
https://www.postgresql.org/docs/9.2/static/sql-altertable.html


Note that 9.5 is the latest version of pg, with 9.6 being released very soon!


Cheers,
Gavin





Re: [GENERAL] initdb createuser commands

2016-10-29 Thread Gavin Flower

On 30/10/16 11:25, John R Pierce wrote:

On 10/29/2016 3:02 PM, Samuel Williams wrote:

FYI,https://wiki.archlinux.org/index.php/PostgreSQL mentions initdb,
createuser, createdb and several others. I think my suggestion is
still relevant and something that would improve the system for new
users


and it would break it for the existing community that has up to 20 
years of experience with it, as well as breaking virtually all 
existing documentation, books, howto articles, etc, including the one 
you linked.



I think it would be good to have all the postgresql utilities prefixed 
by pg_ to keep them together, but have aliases so that existing scripts & 
documentation (especially 3rd party) would remain relevant.


However, I wouldn't consider it the highest priority.


Cheers,
Gavin





Re: [GENERAL] PostgreDB stores table name is lower case

2016-11-25 Thread Gavin Flower

On 24/11/16 20:52,  wrote:

Hello,
I'm a Postgres fan.
Now, I have a problem: the table name is stored in lower case,
but I want to change it into upper case. Is there a simple
method, such as modifying a parameter?

Thank you!




Why?

I can't see any practical benefit!


Cheers,
Gavin





Re: [GENERAL] About the MONEY type

2016-12-18 Thread Gavin Flower

On 18/12/16 12:25, Bruce Momjian wrote:

On Wed, Nov 30, 2016 at 01:35:12PM -0800, John R Pierce wrote:

note, btw, TIMESTAMP WITH TIME ZONE doesn't actually store the timezone...
rather, it converts it to an internal representation of GMT, and then converts
it back to display time at the client's current (or specified) time zone.

Right, TIMESTAMP WITH TIME ZONE converts the timestamp value to the
local time zone on output.  Imagine a monetary type that converted the
money amount to local currency on output --- that would be cool.


Hmm...

You would need to know the appropriate conversion rate: the two obvious 
dates/times, on entry and now, may neither be the one wanted.


Also, often the buy/sell conversion rates are not the same!

I am sure there are also other issues.

I don't think automatic conversion is as easy as you make it out to be.
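A minimal sketch of that objection: which "converted" value you get depends on the date whose rate you use and on the buy/sell side. All rates and dates below are invented for illustration, not real market data:

```python
# Why "convert money on output" is underspecified: the result depends
# on which date's rate applies, and on whether you take the buy or the
# sell side. Rates below are invented, illustrative numbers only.
from datetime import date

rates = {  # (day, side) -> USD per NZD (hypothetical)
    (date(2016, 1, 4), "buy"):  0.660,
    (date(2016, 1, 4), "sell"): 0.680,
    (date(2016, 12, 18), "buy"):  0.695,
    (date(2016, 12, 18), "sell"): 0.715,
}

def convert(amount_nzd, on, side):
    return round(amount_nzd * rates[(on, side)], 2)

entry_day = date(2016, 1, 4)    # when the row was inserted
today = date(2016, 12, 18)      # when it is displayed

# Same stored amount, four defensible "converted" values:
print(convert(100, entry_day, "buy"))   # 66.0
print(convert(100, entry_day, "sell"))  # 68.0
print(convert(100, today, "buy"))       # 69.5
print(convert(100, today, "sell"))      # 71.5
```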



Cheers,

Gavin





Re: [GENERAL] About the MONEY type

2016-12-18 Thread Gavin Flower

On 19/12/16 14:17, Rob Sargent wrote:

On Dec 18, 2016, at 5:23 PM, Gavin Flower  wrote:

On 18/12/16 12:25, Bruce Momjian wrote:

On Wed, Nov 30, 2016 at 01:35:12PM -0800, John R Pierce wrote:

note, btw, TIMESTAMP WITH TIME ZONE doesn't actually store the timezone...
rather, it converts it to an internal representation of GMT, and then converts
it back to display time at the client's current (or specified) time zone.

Right, TIMESTAMP WITH TIME ZONE converts the timestamp value to the
local time zone on output.  Imagine a monetary type that converted the
money amount to local currency on output --- that would be cool.


Hmm...

You would need to know the appropriate conversion rate: the two obvious dates/times, 
on entry and now, may neither be the one wanted.

Also, often the buy/sell conversion rates are not the same!

I am sure there are also other issues.

I don't think automatic conversion is as easy as you make it out to be.



Cheers,

Gavin


Bets on how serious Mr. Pierce was are currently trending in Vegas.


:-)






Re: [GENERAL] Performance PLV8 vs PLPGSQL

2016-12-29 Thread Gavin Flower

On 29/12/16 09:12, Francisco Olarte wrote:

On Wed, Dec 28, 2016 at 5:53 PM, Jan de Visser  wrote:

...but the term "impedance mismatch"
is at least 25 years old;

Much older, I was told it in class at least 32 years ago.


as far as I know it was coined

_Borrowed_ from electrical engineering / communication techs.

It is used to highlight how signals 'bounce' at the points of a
transmission path where impedances do not match. It extrapolates the
fact that if you have a battery with an internal resistance R the way
to extract the maximum energy on a load is for it to match the
impedance, be R too. Higher load impedance and the fraction of energy
in the load goes up, the total down. Lower load impedance and the
fraction in the load goes down, the total up. In either case absolute
power in the load goes down. Match the impedance and the energy in the
load is the maximum ( and equal to the internal loss in the battery ).

[...]

From my distant memory of studying AC stuff at University many moons ago...

You want the two impedances to be complex conjugates of each other (which 
means the MAGNITUDES will be equal) - so the reactances should be equal 
& opposite, and the resistances should match.
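Francisco's battery example can be checked numerically. For a purely resistive source (EMF V, internal resistance R) the complex-conjugate condition reduces to R_load = R, and at that point the load power equals the loss inside the source:

```python
# Numerical check of maximum power transfer: with a purely resistive
# source (internal resistance R), load power peaks when R_load == R.

def load_power(v, r_internal, r_load):
    i = v / (r_internal + r_load)   # Ohm's law for the series loop
    return i * i * r_load           # power dissipated in the load

V, R = 12.0, 50.0
candidates = [10, 25, 50, 100, 200]
best = max(candidates, key=lambda rl: load_power(V, R, rl))
print(best)  # 50 -- the matched load wins

# At the matched point, load power equals the internal loss in the source
p_load = load_power(V, R, R)
p_internal = (V / (2 * R)) ** 2 * R
print(abs(p_load - p_internal) < 1e-12)  # True
```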



Cheers,
Gavin





Re: [GENERAL] What is the purpose of PostGIS on PostgreSQL?

2017-01-24 Thread Gavin Flower

On 25/01/17 14:12, Andy Colson wrote:

On 01/23/2017 11:08 AM, Kased, Razy (Chemistry and Geosciences) wrote:
I recently came across this question: "What is the purpose of PostGIS 
on PostgreSQL?"  and 
wanted to know what this mailing list had to respond with.



​Thanks,




PostGIS is to shapefile
as PostgreSQL is to csv

-Andy



What is 'shapefile'?

I don't recall ever coming across that term!


Cheers,
Gavin





Re: [GENERAL] What is the purpose of PostGIS on PostgreSQL?

2017-01-25 Thread Gavin Flower

On 25/01/17 20:14, Johann Spies wrote:



On 25 January 2017 at 08:32, Gavin Flower 
<gavinflo...@archidevsys.co.nz> wrote:





What is 'shapefile'?

I don't recall ever coming across that term!


 See https://en.wikipedia.org/wiki/Shapefile

Regards
Johann

--
Because experiencing your loyal love is better than life itself,
my lips will praise you.  (Psalm 63:3)


Thanks!

This might prove very useful for a project of mine that is almost 
nothing to do with GIS!!!



Cheers,
Gavin





Re: [GENERAL] Recurring and non recurring events.

2015-12-26 Thread Gavin Flower

Please don't top post - see comment at the bottom of this email.

On 27/12/15 01:03, Kevin Waterson wrote:

Thanks, as I am new to postgres, I was unaware of this function.
To go with this, I guess I will need a table with which to store 
intervals, start and end dates?


eg
CREATE TABLE events(
    id serial primary key,
    start_timestamp timestamp,
    end_timestamp timestamp,
    recurrence interval  -- hypothetical name; the original sketch left this column unnamed
);

with dateRange as
  (
  SELECT min(start_timestamp) as first_date, max(start_timestamp) as 
last_date

  FROM events
  )
select
generate_series(first_date, last_date, '1 
hour'::interval)::timestamp as date_hour

from dateRange;


or something??

Kind regards
Kevin
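Kevin's generate_series() sketch above can be mirrored in plain Python to sanity-check the hourly expansion between the earliest and latest event starts (timestamps below are invented):

```python
# Python mirror of generate_series(first_date, last_date, '1 hour'):
# expand the span between the earliest and latest event starts into
# hourly timestamps.
from datetime import datetime, timedelta

event_starts = [
    datetime(2015, 12, 26, 9, 0),
    datetime(2015, 12, 26, 12, 30),
    datetime(2015, 12, 26, 11, 0),
]

def hourly_series(first, last):
    t = first
    while t <= last:
        yield t
        t += timedelta(hours=1)

series = list(hourly_series(min(event_starts), max(event_starts)))
print(len(series))  # 4 steps: 09:00, 10:00, 11:00, 12:00
print(series[0], series[-1])
```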


On Sat, Dec 26, 2015 at 7:22 PM, Pavel Stehule 
<pavel.steh...@gmail.com> wrote:


Hi

2015-12-26 8:28 GMT+01:00 Kevin Waterson <kevin.water...@gmail.com>:

I wish to set up a table of recurring, and non-recurring events.
I have been looking at

http://justatheory.com/computers/databases/postgresql/recurring_events.html
which looks nice (complex but nice) and wonder if there was a
better option for this in more recent pgsql versions.



[...]

In this list, the convention is to post replies at the end (with some 
rare exceptions), or interspersed when appropriate, and to omit parts no 
longer relevant.


The motivation for bottom posting like this is that people get to see 
the context before the reply, AND emails don't end up getting longer & 
longer as people reply at the beginning, forgetting to trim the now 
irrelevant stuff at the end.



Cheers,
Gavin





Re: [GENERAL] RAM of Postgres Server

2016-01-06 Thread Gavin Flower

On 07/01/16 18:39, Sachin Srivastava wrote:

Dear Team,

Please suggest how much RAM and how many cores should be specified for a new 
Postgres database server, if we will use Postgres 9.3 and above.


Suppose my Postgres database size will be about 300 to 500 GB 
in future.


Is there any document regarding this server configuration that you can suggest?

Regards,
SS


What O/S, Linux or something else?

What type of queries?

Expected number of queries per second?

Size of commonly accessed tables, indexes, ... ?


The experts will probably need AT LEAST the above!


Cheers,
Gavin




Re: [GENERAL] Any way to selectively color query output in psql?

2016-01-07 Thread Gavin Flower

On 08/01/16 10:25, David G. Johnston wrote:

Basically I want to write this:

psql -c "SELECT E'\e[1;33m Some Text Here \e[0m';"

And have just the text "Some Text Here" colored while everything else 
is default white.


The \e constructs are the ANSI color escapes which work when using echo -e

I am using bash 4.2.25(1) on Ubuntu 14.04

Is this possible?

Thanks!

David J.
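Outside psql, the same ANSI escapes can be emitted portably with printf rather than echo -e (a sketch; whether the colour renders depends on the terminal and pager, not on PostgreSQL):

```shell
# Emit "Some Text Here" in bold yellow, then reset the attributes.
# \033 is the ESC byte; [1;33m selects bold yellow, [0m resets.
printf '\033[1;33m Some Text Here \033[0m\n'
```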


But what is the version of Postgres?  :-)





Re: [GENERAL] Code of Conduct: Is it time?

2016-01-09 Thread Gavin Flower

On 10/01/16 20:37, Regina Obe wrote:

Josh informed me you guys are thinking about a CoC.  Let me start off by
saying that I don't think you need one and in fact having one may be
dangerous.  I fear for your safety.

I think Roxanne mentioned some good points in an earlier thread that you
should itemize what you expect to achieve with a CoC.

Quoted from her note:
"You implied in your first post that you would attract more contributors
with a CoC."

Let me say -- I do not think you will attract more contributors.  You may
in fact attract parasites.

The FreeBSD thread that is often mentioned as a reason why you need a CoC is
this one from Randi Harper:
http://blog.randi.io/2015/12/31/the-developer-formerly-known-as-freebsdgirl/

Randi has just blocked me on twitter after I complimented her on her nice
shallow evidence - http://imgur.com/a/UVKfZ   Perhaps questioning her
harassment claims identifies me as a troll.

I have spoken to one of her victims Roberto Rosario.  A great loving man
from what I can tell, and she has truly tried to destroy his credibility
and harassed him beyond end.  I feel so horrible that a woman claiming to
watch out for minorities and women in tech is disingenuous and will
probably laugh pushing me down, because I'm not poor and have educated
parents.

So please, whatever you do, if you really feel you need a CoC, do not
choose this one or anything that looks like it:

http://contributor-covenant.org/

I'm seeing social bullies going at every project demanding they adopt this
garbage.

Who knows what their intention is, I can only imagine.


Thanks Regina,
PostGIS PSC member


I was at the 2015 Australian Linux Conference (held in Auckland, NZ), 
when Sarah Sharp harangued Linus Torvalds for over 20 minutes. Linus 
remained calm and polite throughout, yet most people would have been 
obviously annoyed within the first 5 minutes.  As background, see: 
http://www.networkworld.com/article/2988850/opensource-subnet/linux-kernel-dev-sarah-sharp-quits-citing-brutal-communications-style.html


I think some people, unintentionally, set themselves up as a victim.

So I would agree that a CoC is likely only to lead to arguments. Take 
something innocuous like 'do not offend people' - sounds good, now 
politely explain why someone's deeply held beliefs contradict reality!



Cheers,
Gavin







Re: [GENERAL] Code of Conduct: Is it time?

2016-01-10 Thread Gavin Flower

On 10/01/16 22:55, John R Pierce wrote:

On 1/9/2016 11:57 PM, Gavin Flower wrote:
I was at the 2015 Australian Linux Conference (held in Auckland, NZ), 
when Sarah Sharp harangued Linus Torvalds for over 20 minutes. Linus 
remained calm and polite throughout, yet most people would have been 
obviously annoyed within the first 5 minutes. 


(total outsider here, looking in)

some people are just toxic.   psychic vampires.  They can suck all the 
energy out of something while contributing little or nothing.


OTOH, she seems to have done some seriously good work, hard stuff like 
pioneering the linux framework for USB 3.0.


The more I read, the more I'm at least somewhat on her side, Linus 
does not need to be as much of an a**hole as he comes off as. For sure 
dealing with an environment like that you need to be really thick 
skinned.At times when I read about Linus and the whole kernel 
environment I think he's a vampire, but he's taking the power he's 
sucking up and building something, so maybe that's excusable... does he 
really need to be /that/ big of an ahole?  I dunno.



entirely on the other hand, I note that FreeBSD development has a 
whole lot less drama, and at least in my opinion, the kernel is a 
whole lot more stable.   Hmmm.




I sometimes look at the kernel mailing list:
 https://lkml.org
Linus is normally very mild tempered, rarely do I see him lash out, but 
I've only seen that against people who are competent, but doing/saying 
something Linus strongly disagrees with.  Most times he disagrees in an 
almost boringly mild way.


I would be quite chuffed if Linus was rude to me - as that would mean 
that I'd met a fairly high standard.  If I sent in a really stupid 
patch, it would simply be ignored.  Though I must say, I've not, and am 
almost certainly never going to, send in a kernel patch!


In the early days, Linus would quite readily admit to doing something 
stupid & suggest that he should wear a brown paper bag in shame!  He has 
a wonderful sense of humour, especially apparent in the early days of 
Linux - but his kernel release comments now appear far too professional!


Linus has said that one time he was too polite, and a developer 
persisted, wasting a lot of effort, before Linus could get through to 
him.  So Linus is now a lot more direct.


Sarah is an extremely brilliant and very productive kernel programmer; 
she outclasses me many times over in all programming metrics of any value - 
it is a grave pity that she takes comments as personal attacks.


I have immense respect for Linus, and I understand where he is coming from.

I unexpectedly had about a ten minute one-to-one conversation with him 
at the 2015 conference.  He is aware that he is far from perfect.  I 
would be very happy if I were at least 1% of what he is, in terms of 
effective ability and contributions.


I suspect that Linux is more capable and growing a lot faster than any 
of the BSD's!!!  Though the BSD's may be more stable.



Cheers,
Gavin




Re: [GENERAL] Code of Conduct: Is it time?

2016-01-10 Thread Gavin Flower

On 11/01/16 07:44, Regina Obe wrote:
[...]

This may come as a big shock to many of you, but as a contributor
I don't care if you are racist, sexist, transphobic or whatever as long as
you

1) Are helpful when I ask a question
2) Stick to the topic
3) Don't get into petty etiquette like "Please stop top posting",
and if you really need to, a polite "we prefer top posting" would do

4) Are sensitive to people on other operating systems other than your
preferred.
I once went out of my way to help someone with Mac.  They were so Mac 
centric they did not realize that they were not giving us the right 
information to help them, but this was not obvious until later in the 
thread.  I made some comment about Linux - next moment they were 
accusing everyone of not helping them properly because they were using a 
Mac, as though we had been deliberately discriminating against them!


So yes, I am sensitive to the O/S people are using, I will now avoid 
helping people who don't use Linux.  As I may not understand their needs 
properly, and I don't want to be accused of picking on them because they 
not using the "RIGHT OPERATING SYSTEM"!  But I have neither the time nor 
the expertise to even help everyone who uses Linux, even if they DO use 
the "ONE TRUE LINUX DISTRIBUTION" (being very careful not to mention the 
distribution I'm using - not wanting to start a flame war!!!).


I've twice been effective in supporting people with programs written in 
BASIC, where the version of BASIC was unfamiliar to me and I could not 
test my suggested change because they used a Microsoft O/S and I did not 
have access to any Microsoft boxen for testing purposes (at the time).  
In a recent project, I even ran a Microsoft O/S in a VM on my Linux box 
to test something for a project I was leading.  So I don't have an 
overriding religious type objections to helping people with other 
operating systems!


One thing that really pushes my buttons is when I ask for help as a windows
user and some person makes a snide remark about why don't I switch to
Linux - problem solved
Or because I'm on windows, I don't care about performance.

Here is an example thread I recall from a while back on PostGIS list.

https://lists.osgeo.org/pipermail/postgis-users/2008-June/020331.html

In PostGIS group people are very good at calling out other people when
they think they've said something mean-spirited
and I think people are greatful for being called out because the nasty
person had no idea
their joke was mean.

My other concern about CoCs is I fear someone is going to come and demand
we change Master/Slave  to Leader/Follower, because Master is a male term
and Slave is insensitive to grand-children of slaves.


Comrades, we are all equal! So to set one program above another is an 
anathema!  :-)



Thanks,
Regina






Cheers,
Gavin




Re: [GENERAL] Code of Conduct: Is it time?

2016-01-10 Thread Gavin Flower

On 10/01/16 21:31, Regina Obe wrote:

I was at the 2015 Australian Linux Conference (held in Auckland, NZ), when

Sarah Sharp harangued Linus Torvalds for over 20 minutes. Linus remained
calm and polite throughout, yet most people would have been obviously
annoyed within the first 5 minutes.  As background see:
http://www.networkworld.com/article/2988850/opensource-subnet/linux-kernel-d
ev-sarah-sharp-quits-citing-brutal-communications-style.html


I think some people, unintentionally, set themselves up as a victim.
So I would agree that a CoC is likely only to lead to arguments. Take

something innocuous like 'do not offend people' - sounds good, now politely
explain why someone's deeply held beliefs contradict reality!



Cheers,
Gavin


Sarah is my most favorite person in the world.  I made critical comments on
her blog once when she went crazy on Linus which she deleted. I must be a
troll. I see now she's into doing stats on the people she deleted comments
of.
I initially heard about Sarah when I read an item about her being the 
first to implement USB 3 support, and it was for Linux!  So I started 
off having tremendous respect for her.  While I have programmed at the 
assembly level for 3 different types of processors many years ago, I am 
certain she is considerably more competent than I ever was technically.




http://sarah.thesharps.us/2016/01/07/metrics-of-haters/

I had a look at this.

While there obviously were some comments that I and most others would 
utterly condemn, there was no breakdown of the other comments, so there is 
no way of knowing whether she considered all criticism of her to be hateful.




Maybe we should suggest she should use PostgreSQL for that and demonstrate
our fancy stat functions.

She won Red Hat Woman of the Year Award -
https://www.redhat.com/en/about/women-in-open-source

Sarah Sharp
2015 Community Award winner

Am I the only one concerned about some of the women role models we have in
FOSS?

Am I, as a mere male, entitled to have an opinion on this?  :-)

It is very sad, that some people would answer the above question with a 
resounding no!




Thanks,
Regina











Re: [GENERAL] Code of Conduct: Is it time?

2016-01-11 Thread Gavin Flower

On 11/01/16 19:13, Oleg Bartunov wrote:
[...]


Some people don't understand all these issues with she/he; for 
example, we in Russia are not really concerned about this.



[...]

I started using 'Gender Appropriate' language long before this PC 
nonsense started up. Back in those days the word 'he' in instructions 
included the female gender, which I thought was stupid. Back then, and 
also these days, I see no point in mentioning gender unless it is relevant.


So I use: one, they, their, and them.  Which avoids the gender specific 
problem, and also suggests (as is usually the case) that one or more 
people are involved.


The problem with he/she is also that it is not totally politically 
correct either, what about people who are a bit of both, and/or can't 
decide?  Not to mention people with multiple personalities, not always 
of the same gender (I spent a few years conversing with people in the 
usenet group alt.sexual.abuse.recovery - long story, but I got into it 
when I did a project on network traffic).  I also did some research when 
I read an article that said about 10% of children born on an island 
started life looking like girls, but changed into males at the time of 
puberty, apparently about 0.5% (depending on precise definitions) of 
children world wide are born not definitely of any particular gender.


Cheers,
Gavin





Re: Things to notice (was Re: [GENERAL] Code of Conduct: Is it time?, broken thread I hope)

2016-01-11 Thread Gavin Flower

On 11/01/16 15:00, Andrew Sullivan wrote:

Someone (never mind who, this isn't intended to be a blame-game
message) wrote:


Am I, as a mere male […]  :-)

It was me.

The phrase "Mere Male" was the title of a column I read in the NZ Women's 
Weekly that my mother bought when I was a teenager.


>>> As an aside: the use of '[...]' is something I introduced into 
usenet about 1991, previously people used '[ omitted ]'  - when I was at 
the Victoria University of Wellington (NZ) <<<


The rest of the sentence you omitted was inspired by a woman 
complaining that when she turned up to one feminist meeting, her baby 
was removed when some other women found it was male.



Even with the smiley, _this_ is the sort of thing that causes
discussions to wander into hopeless weeds from which projects cannot
emerge.  I _know_ it is tempting to make this kind of remark.  But
it's not cool, it doesn't help, and it is exactly the sort of thing
that makes some people think CoCs are needed in the first place.
Your reply is exactly why a CoC is dangerous.  Almost anything people 
say can be interpreted by someone as either offensive and/or inappropriate!




Suppose you were an uncertain young woman from a culture where men
have legal authority over you.  Suppose the only interaction with
programming peers you get is online.  (Yes, I know of at least one
such case personally.)  This sort of sarcastic remark, smiley or no,
causes you a new uncertainty.

It was not intended to be sarcastic.

Note that even between England and the USA there is a culture gap. For 
example: British comedians found lots of Americans could not understand 
sarcasm, hence the habit of saying 'Not!' after a positive statement and 
a short pause.




Just be sensitive to the fact that the Internet is bigger than your
world, however big it is, and things will be better.
My wife is Chinese, I lived in Sierra Leone for a couple of years, 
Ireland for about 4 years.  I was born in England, live in New Zealand, 
have visited several other countries including Australia & the USA.  I 
have also considered aspects of culture (both human & alien) relating to 
living on other planets, not all orbiting our star.  So my world view 
might be bigger than yours!


Before I started using the Internet & email I had read that electronic 
communication does not have a non-verbal component.  I've been using the 
Internet for 25 years - I found within a year that there are considerable 
non-verbal aspects to communication.  However, when you see someone 
face-to-face, you can tell their mood.  So there are some things I might 
say to someone's face, that I would not put in an email as I don't know 
their state of mind when they come to read it - that is quite apart from 
wondering what the various spy agencies will make of my communication.




I am not a big
believer in written-down rules: I think mostly they're a fetishizing
of constitutional arrangements like those of the US and Canada (which
mostly don't work for those who are not already enfranchised).  But we
can do something about that by thinking about that possibility much
more than we can do something about it by writing down rules.
Try defining a car that includes everything that you consider a car, and 
excludes everything that doesn't.  If you do the exercise properly, you 
will find it impossible, no matter how much nor how carefully you 
write!  Now most people would agree what a car is (For the Americans use 
'automobile'), yet trying to define it rigorously is simply not feasible.


Still, the exercise of writing down rules may help to notice things
one wouldn't say to a friend.  And I hope we're all friends here.
I had a boss who was a Maori who was (& is still) a great friend, of 
whom I have considerable respect.  There are things I said to him that 
are definitely not PC, that he took in the intended spirit, that would 
be inappropriate to say in public.  I was very careful not to be in that 
mode too often, as it would be somewhat wearing.  A couple of years 
later he was quite happy to hire me for another project.


It is the perceived intention of what one says that is important, not 
what one actually says!  For another example, you can be very rude 
simply by being inappropriately polite.


I've often called my best friend a bastard - but due to context, he took 
as a compliment.




Best regards,

A







Re: [GENERAL] Code of Conduct: Is it time? (WIP CoC)

2016-01-11 Thread Gavin Flower

On 12/01/16 11:21, Karsten Hilbert wrote:

On Mon, Jan 11, 2016 at 02:00:22PM -0800, Joshua D. Drake wrote:


3. A safe, respectful, productive and collaborative environment is free
of comments related to gender, sexual orientation, disability, physical
appearance, body size or race.

... free of _off-topic_ comments related to ...

Since I am using PostgreSQL for storing Electronic Medical
Record data I fear I will need to be able to discuss schema
layout related to gender, sexual orientation, disability,
physical appearance, body size, and race.

Karsten Hilbert
And what about people who want to construct a database to help survivors 
of sexual abuse, and/or doing research into sexual abuse?






Re: [GENERAL] Let's Do the CoC Right

2016-02-08 Thread Gavin Flower

On 24/01/16 13:48, Regina Obe wrote:

This is mostly in response to David's recent comments.  I should say David,
you are really beginning to make me feel unsafe.
By unsafe I mean my mental safety of being able to speak truthfully without
fear of being kicked out of a community I love.

I do not think we need a Coc and if we do, it's only to protect me from
people like this Kurtis guy:

https://twitter.com/siloraptor/status/690637345972981760

So if we have a Coc, I want all people who are on Core and Coc committee  to
be exempt from it.  Because if I can't trust them I can't trust anybody in
the PostgreSQL group.

So here are some comments to your comments David:



3. If I understand correctly, the impetus for adopting a CoC (which,

believe me, I laud in no uncertain terms) was this post by Randi Harper
about her experience reporting abuse to the FreeBSD community:

http://blog.randi.io/2015/12/31/the-developer-formerly-known-as-freebsdgirl/


Ideally, by adopting a CoC and an enforcement policy, we can try to

prevent bad experiences for people reporting abuse. However, in this
example, the abuse, which came from a

FreeBSD committer and was aimed at another, took place on Twitter, not in

a FreeBSD forum. However, the rules of the FreeBSD community at that time
did not cover abuse outside

sanctioned community forums. As a result, the FreeBSd core:

I brought that up by the way, and it is what broke the camel's back about simply
ignoring this nonsense and going about my business doing PostGIS and tech
writing.
I personally went and talked to all the people that supposedly harassed
Randi and guess what?
They happened to be very nice people, that seemed emotionally traumatized by
her unjust assaults and her hiding behind (I'M A WOMAN YOU CAN'T TOUCH ME -
trump card).
  In search of the truth, I found new friends.

I don't want to even go into detail about the torture in community and
outside she put this poor guy thru what she put him thru I would expect her
to pay a million dollars in law-suits.

https://twitter.com/siloraptor/status/689969604102328320

As for her, she blocked me because I said after studying the evidence I
found her accusations baseless.



Look, I'm not an authority on this stuff, either. But I understand that

rules, such as those in a Code of Conduct, must be explicit and as
unambiguous as language will allow.

Those who claim to be authorities are the most narrow-minded, self-absorbed,
culturally sheltered people I have ever met.  They can only think of
unambiguity in their own minds.

Chris Travers has demonstrated that, though he's white, he's been exposed to
so many cultures that he has a sense of how each feels. His experiences make
him an authority.
George Winkless has faced abuse and bullying.  He knows what it is when he
sees it.  Forget he's white.  His experiences make him an authority.
Josh Drake has to put up with 2 women every day being the only guy in his
immediate family.  Ironically he probably has a better perspective on the
"How women feel?" story than I do.

Now as for me true I'm a mixed race (Half-black, Half-white, woman, and I'm
a dual citizen - Half-Nigerian/ Half American and married to a Chinese man).
I should be the master authority, but guess what, I don't consider myself
one.

If I'm in an all black group I'm asked -- "You're part white, what do you
think white people think about this?" - I say, if you have a group of white
people, they'll all disagree with each other

If I'm in an all white group I'm asked -- "You're black what do black people
think?" - Well Nigerians think very differently than non-Nigerians, and I'm
not usually in an all black group that is not my extended Nigerian family.

If I'm in my husband's family meeting -- "Hmm Regina likes this food, I
wonder if that means all non-Chinese will like this" - curiously enough they
all pass me there - Red-bean porridge dessert, and I remain puzzled why
Chinese hand out desserts that their people don't seem to care for.

if I'm in an all-male group, I'm asked, "You're a woman, do you feel FOSS is
a rape culture.  Has someone tried to rape you in conferences? Do you feel
unsafe"   And I'll

a) Point them to Josh Drake, cause he's had more experience dealing with
women than I have
b) Also point out that I've lived under the shadow of my older brothers,
following them around, had boy hand-me down toys, had a mother who was
"Daddy's favorite girl".
So essentially I'm a Tom-boy that feels extremely uncomfortable in all
female groups.  They look like me, but they are foreign creatures to me.  I
feel I understand the "male" psyche better if there is such a thing.

Finally I've suffered a lot of bullying in youth (and I mean real unsafe
kind like running from the bus when being chased by a gang of Italian boys
in an all-Italian neighborhood, ready-to-lynch-you kind of bullying) and I bet
most geeks have, so we are all very experienced on the subject and I would
hope wouldn't wish it on anyone else.  We don

Re: [GENERAL] Uninstalled working db by mistake

2016-03-24 Thread Gavin Flower

On 25/03/16 07:03, Howard News wrote:



On 24/03/2016 17:52, David Wilson wrote:
Per the heading printed by dpkg --list, this means the package is in 
the removed state, but its config files are still present. "apt-get 
install postgresql-9.0" should be all that's required. David


David,

I owe you a beer. Nay - several drinks of your choice.

Howard.




David,

May I suggest you choose something like:
Tequila Ley .925 which is a mere $3,500,000

see:
http://www.therichest.com/luxury/most-expensive/the-10-most-expensive-alcoholic-drinks-on-earth/?view=all



Cheers,
Gavin

P.S.
Smiley's omitted due to severe budget constraints...




Re: [GENERAL] $foo $bar is BAD

2016-04-17 Thread Gavin Flower

On 16/04/16 14:00, Peter Devoy wrote:

Although people commonly use $foo $bar in examples, it is actually a misuse of 
a VERY rude acronym.
The next time you need to make an example, please try being a little more 
original (or meaningful) with your variable names.

In light of recent CoC decisions, I would like to propose the
opposite.  I think more expletives would
diversify the language of the documentation and lower the barriers to
contribution by individuals more
dispositioned to use colourful language due to their cultural and/or
socioeconomic background. O:-)



Peter,

You're just a misbegotten son of a Third Hand Camel Dealer!  :-)


Cheers,
Gavin




Re: [GENERAL] disable ipv6?

2016-04-21 Thread Gavin Flower

On 20/04/16 13:35, Jonathan Vanasco wrote:

I'm running postgresql on ubuntu.  the 9.4 branch from postgresql.org

I think the only way to disable ipv6 is to edit postgresql.conf and explicitly 
state localhost in ipv4 as follows

 - listen_addresses = 'localhost'
 + listen_addresses = '127.0.0.1'

can anyone confirm?





Curious, why do you want to disable IPv6 for local access?


Cheers,
Gavin
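A quick way to see why 'localhost' can bring IPv6 into the picture is to look at what the name resolves to. This is an illustrative Python sketch (the port 5432 and the resolver's answers are host-dependent assumptions, not anything PostgreSQL-specific):

```python
import socket

# 'localhost' may resolve to both an IPv4 and an IPv6 address; when it does,
# listen_addresses = 'localhost' makes the server bind both, while the
# literal '127.0.0.1' pins it to IPv4 only.
infos = socket.getaddrinfo("localhost", 5432, proto=socket.IPPROTO_TCP)
addresses = sorted({info[4][0] for info in infos})
print(addresses)  # e.g. ['127.0.0.1', '::1'] on a dual-stack host
```

On the server itself, `ss -ltn` (or `netstat -ltn`) will show which of those addresses the postmaster actually bound.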




Re: [GENERAL] Fast way to delete big table?

2016-05-15 Thread Gavin Flower

On 16/05/16 17:09, Haiming Zhang wrote:


Hi All,

I have a big table (200G with 728 million records); the table slows 
down lots of things. It's time to clean the data up. The requirement 
is that when I delete, I must summarise a few columns to a new table for 
backup purposes. So that means I need to go through each row and add the 
columns' values to the summary table (under the corresponding category).


The table had indexes (multicolumn indexes) before; I removed the 
indexes because I read a forum post saying that indexes *heavily* reduce the 
speed of deleting. That's true if I only delete from the table, but my 
case is I first SELECT this row, then add few values to the summary 
table and then DELETE the row according to its multiple primary key. 
This seems to be a bad idea now as it takes very long time to DELETE 
the row (as delete needs to find the row first).


Here are the two version of the delete functions, please help to point 
out how can I speed it up.


1.

CREATE OR REPLACE FUNCTION summary_delete_table()

RETURNS integer AS

$BODY$

DECLARE

rec RECORD;

subrec  RECORD;

BEGIN

FOR rec IN SELECT * FROM tableA limit 100 LOOP

BEGIN

UPDATE summaryTable set count1 = count1 + rec.count1, count2 =... 
where category match;


delete from tableA where tableA.primaryKeyA=rec.primaryKeyA and 
tableA.primaryKeyB=rec.primaryKeyB;


END;

END LOOP;

return 1;

END;

$BODY$

LANGUAGE plpgsql VOLATILE

COST 100;

And then I have a .bat script to loop the above function million times.

2.

CREATE OR REPLACE FUNCTION summary_delete_table()

RETURNS integer AS

$BODY$

DECLARE

rec RECORD;

td_cursor    CURSOR FOR SELECT * FROM tableA limit 100;

BEGIN

FOR rec IN td_cursor LOOP

BEGIN

UPDATE summaryTable set count1 = count1 + rec.count1, count2 =... 
where category match;


delete from tableA WHERE CURRENT OF td_cursor;

END;

END LOOP;

return 1;

END;

$BODY$

LANGUAGE plpgsql VOLATILE

COST 100;

Method 2 is bit faster but not much, the delete speed is 2478 rows/s 
for method 2 and 2008 rows/s for method 1.


Any suggestions are welcomed.

BTW, I guess if reindex, it may take few days to finish.

Also, I tried deleting 100, 1000, and 2000 rows at a time. The 
results showed that 1000 at a time is faster than either 100 or 2000.


Thanks and Regards,

Haiming


[...]

In one transaction:

1. populate the summary table
2. drop indexes on tableA
3. delete selected entries in tableA
4. recreate indexes for tableA

If deleting all entries, then simply truncate it!

N.B. I have NOT checked the fine print in the documentation, nor tested 
this - so be warned!  :-)



Cheers,
Gavin
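The row-at-a-time UPDATE inside the loop can also be collapsed into one aggregation per batch. The grouping step itself is plain arithmetic; here is a minimal Python sketch of it (the rows, categories and two-count layout are made up for illustration):

```python
from collections import defaultdict

# Rows about to be deleted: (category, count1, count2)
doomed = [("a", 1, 10), ("b", 2, 20), ("a", 3, 30), ("b", 4, 40)]

# Aggregate once per category instead of issuing one UPDATE per row;
# the result is what a GROUP BY category over the deleted rows gives.
deltas = defaultdict(lambda: [0, 0])
for category, c1, c2 in doomed:
    deltas[category][0] += c1
    deltas[category][1] += c2

print(dict(deltas))  # {'a': [4, 40], 'b': [6, 60]}
```

In SQL the same shape can often be expressed in a single statement, e.g. a `WITH gone AS (DELETE ... RETURNING ...)` writable CTE feeding an aggregate, which avoids the per-row round trips entirely.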






Re: [GENERAL] Members in the Middle East?

2016-05-24 Thread Gavin Flower

On 25/05/16 02:18, Umair Shahid wrote:

Hi,

Do we have folks in this group based out of the Middle East, 
preferably in UAE? We currently don't have a user group in the GCC 
region and I would like to help start one up.


Thanks!

- Umair


All the best!

What does 'GCC' stand for?  My first thought was 'gcc': the 'GNU 
Compiler Collection', but obviously it is NOT that!  :-)



Cheers,
Gavin





Re: [GENERAL] How can I get first day date of the previous month ?

2014-06-20 Thread Gavin Flower

On 21/06/14 03:12, Steve Crawford wrote:

On 06/20/2014 12:11 AM, Arup Rakshit wrote:
Thanks for your answer. So to get the date of the first day of the 
month 6 months back from now, it will be:


yelloday_development=# select date_trunc('month', now()) - interval 
'5 month' as first_month;

        first_month
----------------------------
 2014-01-01 00:00:00+05:30
(1 row)

Is it correct? I am new to the PostgreSQL DB :-) Awesome DB it is...

Welcome. And yes, it is awesome. Being new to the DB and mailing list, 
please note that the convention on all PostgreSQL mailing lists is to 
post your reply at the bottom and not to top-post.


The solution you gave will work but I'll offer a word of caution - 
date and time manipulation can get tricky and even the way it is 
handled in PostgreSQL has occasionally been tweaked between versions 
(a good reason to always read the release notes).


The three things that seem to cause the most confusion are time-zones, 
daylight saving time and irregular intervals. So if you assume that 
one day is 24 hours you can encounter trouble at DST changes. And 
PostgreSQL, like any system that manipulates time, needs to make 
certain assumptions about what an interval means (what is one month 
before March 31) which can lead to this:


steve=> select '2014-03-31'::date - '1 month'::interval + '1 
month'::interval;

      ?column?
---------------------
 2014-03-28 00:00:00

when you might have expected this:

steve=> select '2014-03-31'::date - ('1 month'::interval + '1 
month'::interval);

      ?column?
---------------------
 2014-01-31 00:00:00

Have fun but read the docs, experiment and test - especially with 
dates and times.


Cheers,
Steve
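Steve's clamping example can be reproduced outside the database. The `add_months` helper below is my own sketch, not a PostgreSQL or stdlib function; it clamps the day of month to the target month's length, which is roughly what '1 month'::interval arithmetic does, and shows why subtracting and re-adding a month is not a round trip:

```python
import calendar
from datetime import date

def add_months(d, months):
    # Shift by whole months, clamping the day to the length of the
    # target month -- roughly what '1 month'::interval arithmetic does.
    y, m = divmod(d.month - 1 + months, 12)
    y += d.year
    m += 1
    return date(y, m, min(d.day, calendar.monthrange(y, m)[1]))

d = date(2014, 3, 31)
back = add_months(d, -1)             # 2014-02-28: day clamped to February
there_and_back = add_months(back, 1)
print(back, there_and_back)          # 2014-02-28 2014-03-28, not 2014-03-31
```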


Some SQL I wrote to explore this.

I think my solution does not have the above problems, but may have 
others! :-)



Cheers,
Gavin


DROP TABLE IF EXISTS datex;


CREATE TABLE datex
(
id  int PRIMARY KEY,
a_date  dateNOT NULL
);


INSERT INTO datex
(
id,
a_date
)
VALUES
(101, '2014-01-01'),
(102, '2014-01-02'),
(128, '2014-01-28'),
(129, '2014-01-29'),
(130, '2014-01-30'),
(131, '2014-01-31');


SELECT
(date_part('year', d.a_date)::text
|| '-'
|| (date_part('month', d.a_date) + 1)::text
|| '-1')::date
FROM
datex d;



Re: [GENERAL] Random-looking primary keys in the range 100000..999999

2014-07-04 Thread Gavin Flower

On 05/07/14 01:24, Kynn Jones wrote:
I'm looking for a way to implement pseudorandom primary keys in the 
range 100000..999999.


The randomization scheme does not need to be cryptographically 
strong.  As long as it is not easy to figure out in a few minutes it's 
good enough.


My starting point for this is the following earlier message to this list:

http://www.postgresql.org/message-id/49f96730.4000...@postnewspapers.com.au

The answer given to it here

http://www.postgresql.org/message-id/448163db-cac5-4e99-8c4c-57cbc6f6af78@mm

...is really cool, but I don't see how to modify it for the case where 
the domain of the permutation has a cardinality that is not a power of 
2, as it is in my case (cardinality = 900000).


---

(In the crypto world there are "format preserving encryption" 
techniques that probably could do what I want to do, but their focus 
on cryptographic strength makes learning and implementing them tough 
going, plus, the performance will probably be poor, since high 
workloads are an asset for such crypto applications.  Since 
cryptographic strength is not something I need, I'm trying to find 
non-crypt-grade alternatives.)


Thanks in advance!

kynn


Hi Kynn,

How about  (note that 'payload' could be any set of valid columns):

   -- using a crude Linear Congruential Generator
   -- not very random, but does NOT create duplicates


   DROP TABLE IF EXISTS rtab;
   DROP SEQUENCE IF EXISTS rseq;

   CREATE SEQUENCE rseq;

   CREATE TABLE rtab
   (
id int PRIMARY KEY default(100000 + (nextval('rseq') * 543537 +
   997) % 900000),
payload int NOT NULL
   );

   INSERT INTO rtab (payload) VALUES (generate_series(1, 10));

   TABLE rtab;

Sample output:

   id   | payload
   +-
 644534 |   1
 288071 |   2
 831608 |   3
 475145 |   4
 118682 |   5
 662219 |   6
 305756 |   7
 849293 |   8
 492830 |   9
 136367 | 10
 679904 | 11
 323441 | 12
 866978 | 13
 510515 | 14
 154052 | 15
 697589 | 16
 341126 | 17
 884663 | 18
 528200 | 19
 171737 | 20



Cheers,
Gavin
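The DEFAULT expression is easy to check outside the database. A small sketch assuming the generator constants multiplier 543537, increment 997, modulus 900000 and offset 100000, which reproduce the sample ids shown (the cycle-length caveat in the comments is my own observation, following from gcd(543537, 900000) = 9):

```python
import math

A, C, M, OFFSET = 543537, 997, 900000, 100000

def lcg_id(n):
    # Mirrors the column default: OFFSET + (nextval('rseq') * A + C) % M
    return OFFSET + (n * A + C) % M

# The first ids match the sample output in the post.
print([lcg_id(n) for n in range(1, 4)])  # [644534, 288071, 831608]

# Caveat: because gcd(543537, 900000) = 9, the sequence repeats after
# 900000 / 9 = 100000 values, so "does NOT create duplicates" only
# holds for the first 100000 inserts.
print(math.gcd(A, M), len({lcg_id(n) for n in range(1, M + 1)}))
```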


Re: [GENERAL] Random-looking primary keys in the range 100000..999999

2014-07-04 Thread Gavin Flower

On 05/07/14 15:48, Gavin Flower wrote:

On 05/07/14 01:24, Kynn Jones wrote:
I'm looking for a way to implement pseudorandom primary keys in the 
range 100000..999999.


The randomization scheme does not need to be cryptographically 
strong.  As long as it is not easy to figure out in a few minutes 
it's good enough.


My starting point for this is the following earlier message to this list:

http://www.postgresql.org/message-id/49f96730.4000...@postnewspapers.com.au

The answer given to it here

http://www.postgresql.org/message-id/448163db-cac5-4e99-8c4c-57cbc6f6af78@mm

...is really cool, but I don't see how to modify it for the case 
where the domain of the permutation has a cardinality that is not a 
power of 2, as it is in my case (cardinality = 900000).


---

(In the crypto world there are "format preserving encryption" 
techniques that probably could do what I want to do, but their focus 
on cryptographic strength makes learning and implementing them tough 
going, plus, the performance will probably be poor, since high 
workloads are an asset for such crypto applications.  Since 
cryptographic strength is not something I need, I'm trying to find 
non-crypt-grade alternatives.)


Thanks in advance!

kynn


Hi Kynn,

How about  (note that 'payload' could be any set of valid columns):

-- using a crude Linear Congruential Generator
-- not very random, but does NOT create duplicates


DROP TABLE IF EXISTS rtab;
DROP SEQUENCE IF EXISTS rseq;

CREATE SEQUENCE rseq;

CREATE TABLE rtab
(
id int PRIMARY KEY default(100000 + (nextval('rseq') * 543537
+ 997) % 900000),
payload int NOT NULL
);

INSERT INTO rtab (payload) VALUES (generate_series(1, 10));

TABLE rtab;

Sample output:

id   | payload
+-
 644534 |   1
 288071 |   2
 831608 |   3
 475145 |   4
 118682 |   5
 662219 |   6
 305756 |   7
 849293 |   8
 492830 |   9
 136367 |  10
 679904 |  11
 323441 |  12
 866978 |  13
 510515 |  14
 154052 |  15
 697589 |  16
 341126 |  17
 884663 |  18
 528200 |  19
 171737 |  20



Cheers,
Gavin

Hmm...

for a 10 times larger range
id   int PRIMARY KEY default(1000000 + (nextval('rseq') * 
543537 + 997) % 9000000),

also works!



Re: [GENERAL] Random-looking primary keys in the range 100000..999999

2014-07-08 Thread Gavin Flower

Please don't top post!

See below for my comments.

On 09/07/14 07:04, Kynn Jones wrote:
Thanks to Gavin and Martijn for their suggestions. They're both simple 
good-ol' LCGs, and both avoid the need to check for collisions.


I ultimately went with a multiplicative LCG, like Martijn's, mostly 
because I understand better how it avoids collisions, so it was easier 
for me to tweak it in various ways.


In particular, I changed the prime number from 899981 to the very 
lucky prime 900001.  This happens to work *perfectly*, because the 
range of such a generator is p-1, not p.  (BTW, Martijn's choice of 
the "random" 2345 for the multiplier was a somewhat lucky one, since 
such generators are not full for arbitrary multipliers; for example, 
the one with modulus 899981 is not full for a multiplier of 3456, say.)


I also followed Martijn's pointer regarding the 3-argument form of 
python's pow function, and implemented a 3-argument pow for PL/PgSQL. 
 I include all the code below, including a snippet borrowed from 
Gavin's post, and modified here and there.  (I'm not very experienced 
with PL/PgSQL, so please feel free to point out ways in which my 
PL/PgSQL code can be improved.)


First the functions:

CREATE OR REPLACE FUNCTION pow_mod(bigx bigint, n bigint, m 
bigint) returns bigint AS $$

DECLARE
x  bigint;
xx bigint;
BEGIN
  IF n = 0 THEN RETURN 1; END IF;

  x := bigx % m;
  xx := (x * x) % m;

  IF n % 2 = 0 THEN
RETURN pow_mod(xx, n/2, m);
  ELSE
RETURN (x * pow_mod(xx, (n-1)/2, m)) % m;
  END IF;

END;
$$ LANGUAGE plpgsql strict immutable;


-- "mcg" = "multiplicative congruential generator"
CREATE OR REPLACE FUNCTION mcg_91(i bigint) returns int AS $$
BEGIN
  -- CHECK (0 < i AND i < 900001)
  RETURN 99999 + pow_mod(, i, 900001);
END;
$$ LANGUAGE plpgsql strict immutable;


And here's a small demo:

DROP TABLE IF EXISTS rtab;
DROP SEQUENCE IF EXISTS rseq;

CREATE SEQUENCE rseq;

CREATE TABLE rtab
(
id   int PRIMARY KEY DEFAULT mcg_91(nextval('rseq')),
payload  int NOT NULL
);

\timing on \\ INSERT INTO rtab (payload) VALUES 
(generate_series(1, 900000)); \timing off

-- Timing is on.
-- INSERT 0 900000
-- Time: 201450.781 ms
-- Timing is off.

SELECT * FROM rtab WHERE 449990 < payload AND payload < 450011;
--id   | payload
-- +-
--  539815 |  449991
--  901731 |  449992
--  878336 |  449993
--  564275 |  449994
--  863664 |  449995
--  720159 |  449996
--  987833 |  449997
--  999471 |  449998
--  77 |  449999
--  99 |  450000
--  921739 |  450001
--  722684 |  450002
--  596638 |  450003
--  121592 |  450004
--  687895 |  450005
--  477734 |  450006
--  585988 |  450007
--  942869 |  450008
--  175776 |  450009
--  377207 |  450010
-- (20 rows)

kj
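The pow_mod above is ordinary square-and-multiply modular exponentiation, so a direct Python transcription can be cross-checked against Python's built-in three-argument pow (the test constants are arbitrary):

```python
def pow_mod(bigx, n, m):
    # Square-and-multiply, a direct transcription of the PL/pgSQL above.
    if n == 0:
        return 1
    x = bigx % m
    xx = (x * x) % m
    if n % 2 == 0:
        return pow_mod(xx, n // 2, m)
    return (x * pow_mod(xx, (n - 1) // 2, m)) % m

# Cross-check against the built-in three-argument pow().
for base, exp, mod in [(2345, 12345, 899981), (7, 0, 13), (123456, 999, 900001)]:
    assert pow_mod(base, exp, mod) == pow(base, exp, mod)
print("pow_mod agrees with pow()")
```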



On Sat, Jul 5, 2014 at 4:35 AM, Martijn van Oosterhout 
<klep...@svana.org> wrote:


On Fri, Jul 04, 2014 at 09:24:31AM -0400, Kynn Jones wrote:
> I'm looking for a way to implement pseudorandom primary keys in
the range
> 100000..999999.
>
> The randomization scheme does not need to be cryptographically
strong.  As
> long as it is not easy to figure out in a few minutes it's good
enough.

Well, a trick that produces a not too easy to guess sequence is:

X(n) = p^n mod q

where q is prime. Pick the largest prime that will fit, in this case
899981 (I believe) and some random p, say 2345.

Then 100000 + (2345^n) mod 899981

should be a sequence fitting your purpose. Unfortunatly, the pow()
function in Postgres can't be used here (too slow and it overflows),
but python has a helpful function:

In [113]: len( set( pow(2345, n, 899981) for n in range(899981)  ) )
Out[113]: 899980

You could probably write an equivalent function in Postgres if
necessary.

Hope this helps,
--
Martijn van Oosterhout   <klep...@svana.org>   http://svana.org/kleptog/
> He who writes carelessly confesses thereby at the very outset
that he does
> not attach much importance to his own thoughts.
   -- Arthur Schopenhauer


Re: [GENERAL] About limit on cube dimensions

2014-08-31 Thread Gavin Flower

On 01/09/14 09:05, Shida Sato wrote:

Hi

Why is there a limit on the number of cube dimensions?
It is a bit strange because ARRAY, which is similar to cube, has no such 
limit.

Does it relate to Rtree?
Can I use 1 dimensional cube without R-tree?

---
sato
Have you calculated how much disc space you would need to store a cube 
with 1 dimensions???


Hint, an 8 TB disc would be woefully inadequate, unless it was very 
sparsely populated.



Cheers,
Gavin
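For a rough sense of Gavin's point: contrib/cube stores (in the general case) two float8 corner coordinates per dimension, so raw size grows linearly with dimension count. Back-of-the-envelope arithmetic, assuming that 16-bytes-per-dimension layout and ignoring all per-row overhead:

```python
BYTES_PER_DIM = 2 * 8  # two float8 corner coordinates per dimension (assumed layout)

def cube_bytes(dimensions):
    # Raw coordinate storage for a single cube value, overhead ignored.
    return dimensions * BYTES_PER_DIM

EIGHT_TB = 8 * 1024**4
# Total dimension-slots that fit in 8 TB of raw coordinate storage.
print(EIGHT_TB // BYTES_PER_DIM)  # 549755813888, i.e. about 5.5e11
```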




Re: [GENERAL] ALTER TEXT field to VARCHAR(1024)

2014-09-19 Thread Gavin Flower

On 19/09/14 19:32, Marius Grama wrote:

Hello,


I am using Postgres 9.2 and I'd like to perform the following ALTER 
statement on a database table with about 30M entries :


ALTER TABLE images ALTER COLUMN mask_descriptors TYPE VARCHAR(1024);

The mask_descriptors field is currently having the type TEXT.
I want to perform the ALTER due to the fact that it seems that copying 
the contents of the table to a BI SQL Server is done row by row 
(instead of batch) when handling CLOBs.


From the Postgres documentation I got the following :

http://www.postgresql.org/docs/8.3/static/datatype-character.html

Tip: There are no performance differences between these three
types, apart from increased storage size when using the
blank-padded type, and a few extra cycles to check the length when
storing into a length-constrained column. While character has
performance advantages in some other database systems, it has no
such advantages in PostgreSQL. In most situations text or
character varying should be used instead.



Can anybody explain to me what happens in the background when the alter 
statement is executed? I've tried it out on a small copy of the table 
(70K) and the operation completed in 0.2 seconds.
Will the table be completely locked during the execution of the ALTER 
statement?




I am curious as to why you want to change text to VARCHAR(1024), 
especially as I think that the middleware should be controlling how long 
a string saved in the database can be, rather than end-user client code 
(for several reasons, including security concerns).  However, I do not 
know your use cases, nor your overall situation - so my concerns may not 
apply to you.
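For what it's worth, if the aim were only to cap the length, a CHECK constraint on the existing TEXT column would be one hedged alternative to changing the type - it still takes a brief ACCESS EXCLUSIVE lock and scans the table to validate existing rows, but avoids rewriting the column. (It would not help if the BI driver keys off the declared column type rather than the data; the constraint name below is made up.)

```sql
-- Hypothetical alternative: keep mask_descriptors as TEXT and enforce
-- the length with a CHECK constraint instead of changing the type.
ALTER TABLE images
    ADD CONSTRAINT mask_descriptors_len
    CHECK (length(mask_descriptors) <= 1024);
```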



Cheers,
Gavin




Re: [GENERAL] advice sought - general approaches to optimizing queries around "event streams"

2014-09-26 Thread Gavin Flower

On 27/09/14 09:02, Jonathan Vanasco wrote:

I have a growing database with millions of rows that track resources against an 
event stream.

I have a few handfuls of queries that interact with this stream in a variety of 
ways, and I have managed to drop things down from 70s to 3.5s on full scans and 
offer .05s partial scans.

No matter how I restructure queries, I can't seem to get around a few 
bottlenecks, and I wanted to know if there were any tips/tricks from the 
community on how to approach them.

a simple form of my database would be:

--  1k of
create table stream (
    id int not null primary key
);

-- 1MM of
create table resource (
    id int not null primary key,
    col_a bool,
    col_b bool,
    col_c text
);

-- 10MM of
create table streamevent (
    id int not null,
    event_timestamp timestamp not null,
    stream_id int not null references stream(id)
);

-- 10MM of
create table resource_2_stream_event (
    resource_id int not null references resource(id),
    streamevent_id int not null references streamevent(id)
);

Everything is running off of indexes; there are no seq scans.

I've managed to optimize my queries by avoiding joins against tables, and 
turning the stream interaction into a subquery or CTE.
Better performance has come from limiting the number of "stream events" 
(which are only the timestamp and resource_id off a joined table).

The bottlenecks I've encountered have primarily been:

1.  When interacting with a stream, the ordering of event_timestamp and 
deduplicating of resources becomes an issue.
I've figured out a novel way to work with the most recent events, but 
distant events are troublesome

using no limit, the query takes 3500 ms
using a limit of 1, the query takes 320ms
using a limit of 1000, the query takes 20ms

there is a dedicated index on event_timestamp (desc), and it is 
being used
according to the planner... finding all the records is fine; 
merging-into and sorting the aggregate to handle the deduplication of records 
in a stream seems to be the issue (either with DISTINCT or max+group_by)


2.  I can't figure out an effective way to search for a term against an 
entire stream (using a tsquery/gin based search)

I thought about limiting the query by finding matching resources first, 
then locking it to an event stream, but:
- scanning the entire table for a term takes about 10 seconds 
on an initial hit.  Subsequent queries for the same terms end up using the 
cache, and complete within 20ms.

I get better search performance by calculating the event stream, then 
searching it for matching documents, but I still have the performance issues 
related to limiting the window of events

I didn't include example queries, because I'm more concerned with the general 
approaches and ideas behind dealing with large data sets than I am with raw SQL 
right now.

I'm hoping someone can enlighten me into looking at new ways to solve these 
problems.  I think I've learned more about postgres/sql in the past 48 hours 
than I have in the past 15 years, and I'm pretty sure that the improvements I 
need will come from new ways of querying data, rather than optimizing the 
current queries.

Minor point: when specifying PRIMARY KEY, you don't need to also put NOT 
NULL (this should make no change to performance).


I notice that the 'id' of 'streamevent' is not marked as a PRIMARY KEY, 
so it will not have an index associated with it - hence referencing it 
as a foreign key might be slower than expected.
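A minimal sketch of that fix, assuming 'id' is in fact unique (the unnamed indexes on the link table are an extra suggestion, since PostgreSQL does not index foreign-key columns automatically):

```sql
-- Give streamevent.id a primary key (and therefore an index),
-- and index the join columns of the link table.
ALTER TABLE streamevent ADD PRIMARY KEY (id);
CREATE INDEX ON resource_2_stream_event (streamevent_id);
CREATE INDEX ON resource_2_stream_event (resource_id);
```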



Cheers,
Gavin




Re: [GENERAL] password in recovery.conf [SOLVED]

2014-09-26 Thread Gavin Flower

On 27/09/14 11:56, John R Pierce wrote:

On 9/26/2014 4:40 PM, John R Pierce wrote:
I'd consider using `mkpasswd -l 15 -s 0` just to avoid any such 
problems.   15 random alphanumerics is already plenty complex, 
62^15th possible combinations, without needing to mix in special 
characters.


$ mkpasswd -l 15 -s 0
eec1kj7ZsthlYmh


btw, that's 768,909,700,000,000,000,000,000,000 possible passwords. 768 
septillion, using the American 'short scale' naming convention.  If 
you could brute force try 1/second, it would merely take 
24,365,800,000,000 centuries (24 trillion).



So do you think a password like *Nxw7TnC2^}%(}tEz* is strong enough?  :-)

I developed a Java program that generates 20 passwords (each of 16 
characters) at a time, I've attached it for anyone who might be 
interested.  I have put it under the GPL version 3, but I might consider 
releasing under other licences.



Cheers,
Gavin
package gcf.misc;

/**
 * Copyright © 2012 Gavin C. Flower
 * 
 * author: gavin.flo...@archidevsys.co.nz
 * 
 * This program is free software: you can redistribute it and/or modify it under
 * the terms of the GNU General Public License as published by the Free Software
 * Foundation, either version 3 of the License, or (at your option) any later
 * version.
 * 
 * This program is distributed in the hope that it will be useful, but WITHOUT
 * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
 * FOR A PARTICULAR PURPOSE. See the GNU General Public License for more
 * details.
 * 
 * For full details of the license see .
 */

import java.security.SecureRandom;

public class AppPasswordGenerator
{
private final static int PASSWORD_LENGTH = 16;

private final static int MAX_INDEX = PASSWORD_LENGTH - 1;

/*
 * We avoid ambiguous characters, so you won't get 'I1|l', 'B8', 'S5', or
 * 'O0' being produced
 */

private static String DIGITS = "23456789";

private static String SPECIAL = "!@#$%^&*()_+{}[]<>.:";

private static String UPPER = "ACDEFGHJKLMNPQRTVWXY";

private static String LOWER = "abcdefghijklmnopqrstuvwxyz";

private static String FULL = DIGITS + SPECIAL + UPPER + LOWER;

private final StringBuilder SB = new StringBuilder(PASSWORD_LENGTH);

SecureRandom secureRandom = new SecureRandom();

AppPasswordGenerator()
{
/*
 * This is way more complicated than it needs to be for the current
 * application, but it was fun coding it!
 * 
 * The use of sin() & exp() introduce a semirandom delay in obtaining
 * the current time in nano seconds as well as returning values to act
 * as additional randomising factors.
 */
long nanoA = System.nanoTime();
double sinVal = Math.sin(nanoA);
long nanoB = System.nanoTime();
double expVal = Math.exp(sinVal);
long nanoC = System.nanoTime();
int shift = (int) nanoB & 0x3F;
long rotation = Long.rotateRight(nanoC, shift);
long rawBits = Double.doubleToRawLongBits(expVal);
long seed = rotation ^ rawBits;
secureRandom.setSeed(seed);

// System.out.printf("nanoA: %016X\n", nanoA);
// System.out.printf("   sinVal: %16.13f\n", sinVal);
// System.out.printf("nanoB: %016X\n", nanoB);
// System.out.printf("   expVal: %16.13f\n", expVal);
// System.out.printf("nanoC: %016X\n", nanoC);
// System.out.printf("shift: %16d\n", shift);
// System.out.printf("  rawBits: %016X\n", rawBits);
// System.out.printf(" rotation: %016X\n", rotation);
// System.out.printf(" seed: %016X\n", seed);
// System.out.printf("FULL.length(): %16d\n", FULL.length());
}

public static void main(String[] args)
{
AppPasswordGenerator appPasswordGenerator = new AppPasswordGenerator();
appPasswordGenerator.go();
}

private void go()
{
assert PASSWORD_LENGTH > 5; // Actually, later code assume 16...

for (int i = 0; i < 20; i++)
{
printAPassword();
}
}

private void printAPassword()
{
addChar(DIGITS);
addChar(DIGITS);
addChar(SPECIAL);
addChar(UPPER);
addChar(LOWER);

for (int ii = SB.length(); ii < PASSWORD_LENGTH; ii++)
{
addChar(FULL);
}

// Randomise password characters
for (int index_a = 0; index_a < PASSWORD_LENGTH; index_a++)
{
char ca = SB.charAt(index_a);
int index_b = secureRandom.nextInt(PASSWORD_LENGTH);
char cb = SB.charAt(index_b);
SB.setCharAt(index_b, ca);
SB.setCharAt(index_a, cb);
}

// Ensure the last character is not a digit
while (Character.isDigit(SB.charAt(MAX_INDEX)))
{
int index = secureRandom.nextInt(MAX_INDEX);
char ca = SB.charAt(MAX_INDEX);
char cb = SB.charAt(index);
SB.setCharAt(MAX_INDEX, cb);
SB.setCharAt(index, ca);
}

System.out.println(SB);
SB.setLength(0);
}

private void addChar(String source)
{
// Append one character chosen at random from the given alphabet
SB.append(source.charAt(secureRandom.nextInt(source.length())));
}
}

Re: [GENERAL] Postgres as key/value store

2014-09-27 Thread Gavin Flower

On 28/09/14 12:48, snacktime wrote:
I'm looking for some feedback on the design I'm using for a basic 
key/value storage using postgres.


Just some quick background.  This design is for large scale games that 
can get up to 10K writes per second or more.  The storage will be 
behind a distributed memory cache that is built on top of Akka, and 
has a write behind caching mechanism to cut down on the number of 
writes when you have many updates in a short time period of the same 
key, which is common for a lot of multiplayer type games.


I have been using Couchbase, but this is an open source project, and 
Couchbase is basically a commercial product for all intents and 
purposes, which is problematic.  I will still support Couchbase, but I 
don't want to have to tell people that if you really want to scale, 
Couchbase is the only option.


The schema is that a key is a string, and the value is a string or 
binary.  I am actually storing protocol buffer messages, but the 
library gives me the ability to serialize to native protobuf or to 
json.  Json is useful at times especially for debugging.


This is my current schema:

CREATE TABLE entities
(
  id character varying(128) NOT NULL,
  value bytea,
  datatype smallint,
  CONSTRAINT entities_pkey PRIMARY KEY (id)
);

CREATE OR REPLACE RULE entities_merge AS
ON INSERT TO entities
WHERE (EXISTS (SELECT 1
               FROM   entities entities_1
               WHERE  entities_1.id::text = new.id::text))
DO INSTEAD
   UPDATE entities
   SET    value = new.value, datatype = new.datatype
   WHERE  entities.id::text = new.id::text;

Additional functionality I want is to do basic fuzzy searches by key.  
Currently I'm using a left anchored LIKE query.  This works well 
because keys are left prefixed with a scope, a delimiter, and then the 
actual key for the data.  These fuzzy searches would never be used in 
game logic, they would be admin only queries for doing things like 
 obtaining a list of players.  So they should be infrequent.


The scope of the query ability will not expand in the future.  I 
support multiple backends for the key/value storage so I'm working 
with the lowest common denominator.  Plus I have a different approach 
for data that you need to do complex queries on (regular tables and an 
ORM).


 Chris

Note:
I suspect that what I suggest below will probably NOT improve 
performance, and may not necessarily be appropriate for your use case.  
However, they may facilitate a wider range of queries, and might be 
easier to understand.


Note the comment about using 'PRIMARY KEY' in 
http://www.postgresql.org/docs/9.2/static/sql-createtable.html


   [...]
   The primary key constraint specifies that a column or columns of a
   table can contain only unique (non-duplicate), nonnull values.
   Technically, PRIMARY KEY is merely a combination of UNIQUE and NOT
   NULL, but identifying a set of columns as primary key also provides
   metadata about the design of the schema, as a primary key implies
   that other tables can rely on this set of columns as a unique
   identifier for rows.
   [...]


My first thought was to simplify the table create, though I think the 
length check on the id is best done in the software updating the database:


   CREATE TABLE entities
   (
id text PRIMARY KEY,
value  bytea,
datatype   smallint,
CONSTRAINT id_too_long CHECK (length(id) <= 128)
   );

Then I noticed that your id is actually a compound key, and probably 
would be better modelled as:


   CREATE TABLE entities
   (
      scope     text,
      key       text,
      value     bytea,
      datatype  smallint,
      CONSTRAINT entities_pkey PRIMARY KEY (scope, key)
   );

I suspect that making 'datatype' an 'int' would improve performance, but 
only by a negligible amount!
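As a side benefit, the compound-key layout turns the left-anchored LIKE searches mentioned earlier into an equality on scope plus a prefix match on key. A sketch with made-up values, assuming a C collation or a text_pattern_ops index so the prefix match can actually use the index:

```sql
-- Admin-style "fuzzy" lookup: all keys in one scope with a given prefix.
SELECT key, value
FROM   entities
WHERE  scope = 'players'       -- hypothetical scope value
AND    key LIKE 'smith%';      -- hypothetical key prefix
```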



Cheers,
Gavin



Re: [GENERAL] table versioning approach (not auditing)

2014-09-28 Thread Gavin Flower

On 29/09/14 15:00, Abelard Hoffman wrote:
Hi. I need to maintain a record of all changes to certain tables so 
assist in viewing history and reverting changes when necessary 
(customer service makes an incorrect edit, etc.).


I have studied these two audit trigger examples:
https://wiki.postgresql.org/wiki/Audit_trigger
https://wiki.postgresql.org/wiki/Audit_trigger_91plus

I've also read about two other approaches to versioning:
1. maintain all versions in one table, with a flag to indicate which 
is the current version
2. have a separate versions table for each real table, and insert into 
the associated version table whenever an update or insert is done.


My current implementation is based on the wiki trigger examples, using 
a single table, and a json column to record the row changes (rather 
than hstore). What I like about that, in particular, is I can have a 
"global," chronological view of all versioned changes very easily.


But there are two types of queries I need to run.
1. Find all changes made by a specific user
2. Find all changes related to a specific record

#1 is simple to do. The versioning table has a user_id column of who 
made the change, so I can query on that.


#2 is more difficult. I may want to fetch all changes to a group of 
tables that are all related by foreign keys (e.g., find all changes to 
"user" record 849, along with any changes to their "articles," 
"photos," etc.). All of the data is in the json column, of course, but 
it seems like a pain to try and build a query on the json column that 
can fetch all those relationships (and if I mess it up, I probably 
won't generate any errors, since the json is so free-form).


So my question is, do you think using the json approach is wrong for 
this case? Does it seem better to have separate versioning tables 
associated with each real table? Or another approach?


Thanks


I implemented a 2 table approach over 15 years ago for an insurance 
application.  I used both an /effective_date/ and an /as_at_date/; no 
triggers were involved.  I think a 2-table approach gives you more 
flexibility.


The /effective_date/ allowed changes to be made to the table in advance 
of when they were to become effective.


The /as_at_date/ allowed quotes to be made, valid for a period starting 
at the as_at_date.


End users did not query the database directly, all queries were precoded 
in a 4GL called Progress backed by an Oracle database.  The same could 
be done with a WildFly Java Enterprise AppSever (or some other 
middleware) and a PostgreSQL backend.


Different use case, but the concept is probably adaptable to your situation.

You may want a change table that has a change_number appearing in each 
type of table affected by a change.  This would help with query type #2.
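A rough sketch of that change-table idea (all names here are hypothetical): one row per logical change, with the change_number carried by every versioned table.

```sql
CREATE TABLE change (
    change_number bigserial   PRIMARY KEY,
    changed_by    int         NOT NULL,  -- user who made the change
    changed_at    timestamptz NOT NULL DEFAULT now()
);

CREATE TABLE article_version (
    article_id    int    NOT NULL,
    change_number bigint NOT NULL REFERENCES change,
    title         text,
    body          text,
    PRIMARY KEY (article_id, change_number)
);
```

Query type #2 then reduces to joining the various version tables on change_number (or on the record's own id).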


I would be quite happy to contract to work out the appropriate schema 
and develop some SQL scripts to query & update the database, if you were 
interested.  My approach would be to create a minimal database with 
sample data to validate the schema design and SQL scripts.


Using a flag to indicate the current record seems inflexible, as some 
changes may not take effect until some time in the future, and you can't 
query the database to see what the situation was at a particular point 
in the past.  For example: somebody complains about something that 
happened last Saturday near noon - how would you query the database to 
see what it was like then?



Cheers,
Gavin



Re: [GENERAL] table versioning approach (not auditing)

2014-10-06 Thread Gavin Flower

On 07/10/14 10:47, Jim Nasby wrote:

On 10/2/14, 9:27 AM, Adam Brusselback wrote:
i've also tried to implement a database versioning using JSON to log 
changes in tables. Here it is: 
https://github.com/fxku/audit 
 

I've got two versioning tables, one storing information about all 
transactions that happened and one where i put the JSON logs of row 
changes of each table. I'm only logging old values and not complete 
rows.


Then I got a function that recreates a database state at a given time 
into a separate schema - either to VIEWs, MVIEWs or TABLES. This 
database state could then be indexed in order to work with it. You 
can also reset the production state to the recreated past state.


Unfortunately I've got no time to further work on it at the moment + 
I have not done tests with many changes in the database so I can't 
say if the recreation process scales well.  One downside I've realised 
is that using the json_agg function has limits when I've got binary 
data: it gets too long.  So I'm really looking forward to using JSONB.


There are more plans in my mind. By having a Transaction_Log table it 
should be possible to revert only certain transactions. I'm also 
thinking of parallel versioning, e.g. different users are all working 
with their version of the database and commit their changes to the 
production state. As I've got a unique history ID for each table and 
each row, I should be able to map the affected records.
Sorry I'm coming late to this thread. I agree that getting interested 
people together would be a good idea. Is there another mailing list we 
can do that with?


Versioning is also something I've interested in, and have put a lot of 
thought into (if not much actual code :( ). I'll also make some 
general comments, if I may...



I think timestamps should be *heavily avoided* in versioning, because 
they are frequently the wrong way to solve a problem. There are many 
use cases where you're trying to answer "What values were in place 
when X happened", and the simplest, most fool-proof way to answer that 
is that when you create a record for X, part of that record is a 
"history ID" that shows you the exact data used. For example, if 
you're creating an invoicing system that has versioning of customer 
addresses you would not try and join an invoice with it's address 
using a timestamp; you would put an actual address_history_id in the 
invoice table.


I thought I saw a reference to versioning sets of information. This is 
perhaps the trickiest part. You first have to think about the 
non-versioned sets (ie: a customer may have many phone numbers) before 
you think about versioning the set. In this example, you want the 
history of the *set* of phone numbers, not of each individual number. 
Design it with full duplication of data first, don't think about 
normalizing until you have the full set versioning design.


I understand the generic appeal of using something like JSON, but in 
reality I don't see it working terribly well. It's likely to be on the 
slow side, and it'll also be difficult to query from. Instead, I think 
it makes more sense to create actual history tables that derive their 
definition from the base table. I've got code that extracts 
information (column_name, data type, nullability) from a table (or 
even a table definition), and it's not that complex. With the work 
that's been done on capturing DDL changes it shouldn't be too hard to 
handle that automatically.



Yeah, my design was quite extensive and ensured all relevant information 
was associated with the 'history id' (still need timestamps to find the 
appropriate value), the powers that be watered it down somewhat (but 
that was outside my control). Performance was not too critical, probably 
less than 10 transactions per second at peak times.  JSON, had yet to be 
invented, but we would not have used it anyhow.


Even if timestamps are used extensively, you'd have to be careful 
joining on them. You may have information valid at T1 and changing at 
T3, but the transaction has T2, where T1 < T2 < T3 - the appropriate set 
of data would be associated with T1; you would not get anywhere trying 
to find data with a timestamp of T2 (unless you were very lucky!).


Actually things like phone numbers are tricky.  Sometimes you may want 
to use the current phone number, and not the one extant at that time (as 
you want to phone the contact now), or you may still want the old phone 
number (was the call to a specific number at date/time legitimate & who 
do we charge the cost of the call to).





Re: [GENERAL] table versioning approach (not auditing)

2014-10-07 Thread Gavin Flower

On 08/10/14 13:29, Jim Nasby wrote:

On 10/6/14, 6:10 PM, Gavin Flower wrote:
Even if timestamps are used extensively, you'd have to be careful 
joining on them. You may have information valid at T1 and changing at 
T3, but the transaction has T2, where T1 < T2 < T3 - the appropriate 
set of data would be associated with T1; you would not get anywhere 
trying to find data with a timestamp of T2 (unless you were very 
lucky!).


Yeah, this is why I think timestamps need to be shunned in favor of 
explicit pointers. Anyone that thinks timestamps are good enough 
hasn't thought the problem through completely. :)


I also think there's potential value to storing full transaction 
information (presumably in a separate table): txid_current(), 
txid_current_snapshot(), now(), current_user, maybe some other stuff 
(client IP address?). That way you can tell exactly what created a 
history record. With appropriate shenanigans you can theoretically 
determine exactly what other history data would be visible at that 
time without using pointers (but man would that be ugly!)


Actually things like phone numbers are tricky.  Sometimes you may 
want to use the current phone number, and not the one extant at that 
time (as you want to phone the contact now), or you may still want 
the old phone number (was the call to a specific number at date/time 
legitimate & who do we charge the cost of the call to).


Yeah, I'm pretty convinced at this point that history/versioning 
should be built on top of a schema that always contains the current 
information, if for no other reason than so you always have a PK that 
points to what's current in addition to your history PKs.
One of the motivations for having an effective_date was being able to 
put changes into the database ahead of time.


Finding the current value uses the same logic as finding the value at any 
other date/time - so you don't need a special schema to distinguish the 
current state from anything else.  For example:


   DROP TABLE IF EXISTS stock;

   CREATE TABLE stock
   (
id text,
effective_date timestamptz,
price numeric
   );

   INSERT INTO stock
   (
   id,
   effective_date,
   price
   )
   VALUES
('y88', '2014-10-01', 12.0),
('x42', '2014-10-01', 12.1),
('x42', '2014-10-08', 12.2),
('x42', '2014-10-10', 12.3),
('x42', '2014-10-16', 12.4),
('z42', '2014-10-19', 12.5),
('z49', '2014-10-01', 12.6),
('z49', '2014-10-30', 12.7),
('z77', '2014-10-01', 12.8);

   CREATE UNIQUE INDEX primary_key ON stock (id ASC, effective_date DESC);

   SELECT
s.price
   FROM
stock s
   WHERE
s.id = 'x42'
AND s.effective_date <= '2014-10-11'
   ORDER BY
s.effective_date DESC
   LIMIT 1;
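The same as-at lookup can be done for every id at once with DISTINCT ON, which may be handier when more than one stock is wanted:

```sql
-- Latest price in force on the given date, one row per id.
SELECT DISTINCT ON (s.id)
       s.id, s.price
FROM   stock s
WHERE  s.effective_date <= '2014-10-11'
ORDER  BY s.id, s.effective_date DESC;
```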



Cheers,
Gavin


Re: [GENERAL] Avoiding deadlocks when performing bulk update and delete operations

2014-11-23 Thread Gavin Flower

On 24/11/14 16:51, Sanjaya Vithanagama wrote:

Hi All,

We have a single table which does not have any foreign key references.

id_A (bigint)
id_B (bigint)
val_1 (varchar)
val_2 (varchar)

The primary key of the table is a composite of id_A and id_B.

Reads and writes of this table are highly concurrent and the table has 
millions of rows. We have several stored procedures which do mass 
updates and deletes. Those stored procedures are being called 
concurrently mainly by triggers and application code.


The operations usually look like the following where it could match 
thousands of records to update or delete:


DELETE FROM table_name t
USING (
   SELECT id_A, id_B
   FROM   table_name
   WHERE  id_A = ANY(array_of_id_A)
   AND    id_B = ANY(array_of_id_B)
   ORDER  BY id_A, id_B
   FOR    UPDATE
   ) del
WHERE  t.id_A = del.id_A
AND    t.id_B = del.id_B;


UPDATE table_name t
SET    val_1 = 'some value'
     , val_2 = 'some value'
FROM (
   SELECT id_A, id_B
   FROM   table_name
   WHERE  id_A = ANY(array_of_id_A)
   AND    id_B = ANY(array_of_id_B)
   ORDER  BY id_A, id_B
   FOR    UPDATE
   ) upd
WHERE  t.id_A = upd.id_A
AND    t.id_B = upd.id_B;

We are experiencing deadlocks and all our attempts to perform 
operations with locks (row level using SELECT FOR UPDATE as used in 
the above queries and table level locks) do not seem to solve these 
deadlock issues. (Note that we cannot in any way use access exclusive 
locking on this table because of the performance impact)


Is there another way that we could try to solve these deadlock 
situations? The reference manual says — "The best defense against 
deadlocks is generally to avoid them by being certain that all 
applications using a database acquire locks on multiple objects in a 
consistent order."


Is there a guaranteed way to do bulk update/delete operations in a 
particular order so that we can ensure deadlocks won't occur? Or are 
there any other tricks to avoid deadlocks in this situation?


Thank you in advance,
Sanjaya
Unless there is some sort of implied locking, or other complications like 
different lock types, always acquiring locks in the same order 
should work - as far as I can tell.


For purely locking problems, and assuming that all the relevant tables 
are locked:


If process p1 acquires locks in the order A, B, & C,
and process p2 acquires locks in the order A, C, & D,
then as soon as one process grabs A, the other process cannot grab 
A nor the later locks - so deadlock is avoided.


Similarly:
p1 A, B, C, & D
p2 B & C
However, if p1 grabs A and then p2 grabs B, p1 will have to wait for p2 
to finish before p1 continues - but still, neither is deadlocked.  Even 
if there is a p3 which locks B, at worst 2 processes will wait until the 
lucky first process releases its locks.


You may have problems if there is some resource that is in contention, 
where 2 processes require the resource and grab it in several parts at 
different times, and they both grab some, and then there is insufficient 
to completely satisfy either - this is guesswork, I'm not sure what 
resources (if any) would be a problem here.
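If consistent ordering alone doesn't eliminate the deadlocks, one blunt but reliable fallback is to serialise the bulk procedures on an advisory lock, trading some concurrency for safety (the key 42 below is an arbitrary, application-chosen constant):

```sql
BEGIN;
-- Only one bulk writer at a time; the lock is released automatically
-- at COMMIT or ROLLBACK.
SELECT pg_advisory_xact_lock(42);
-- ... bulk DELETE / UPDATE as above ...
COMMIT;
```

With only one bulk writer running at a time, the lock order inside the statements no longer matters.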



Cheers,
Gavin





Re: [GENERAL] Stored procedure workflow question

2014-12-10 Thread Gavin Flower

On 11/12/14 13:53, Israel Brewster wrote:
Currently, when I need to create/edit a stored procedure in 
Postgresql, my workflow goes like the following:


- Create/edit the desired function in my "DB Commands" text file
- Copy and paste function into my development database
- Test
- repeat above until it works as desired
- Copy and paste function into my production DB.

To edit an existing function, the workflow is basically the same as 
above, but I first have to find the function in my file.


This whole workflow just feels kludgy to me. Is there a better way? Or 
is that essentially the recommended procedure? Thanks.

---
Israel Brewster
Systems Analyst II
Ravn Alaska
5245 Airport Industrial Rd
Fairbanks, AK 99709
(907) 450-7293
---





I create an SQL file using a text editor, and then execute it in psql 
using the '\i' command from the appropriate directory:


gavin=> \i bus.sql

In your case I would test it in one environment and copy it to another.

You could use git to track versions of the file and the nature of changes.

Though, I am sure there are sophisticated ways of doing this!


Cheers,
Gavin





Re: [GENERAL] Error: "Out of memory while reading tuples." in pushing table from SAS to PostgreSQL on Mac

2014-12-14 Thread Gavin Flower

On 15/12/14 04:44, Andy Colson wrote:

On 12/13/2014 10:03 PM, wetter wetterana wrote:

Hi,

I'm passing rows from SAS to PostgreSQL (I assign a libname and use a 
PROC APPEND).  This works fine with smaller tables (~below 1 million 
rows).  However, as tables get larger I receive the following error 
messages:



"ERROR: CLI describe error: Out of memory while reading tuples.; No 
query has been executed with that handle"

and
"GLOBAL SYSDBMSG POSTGRES: Out of memory while reading tuples.; No 
query has been executed with that handle

GLOBAL SYSDBRC HY000"

I've tried to change memory settings on the PostgreSQL server, but 
can't solve the problem.  As far as I could understand—I'm new to 
PostgreSQL ;)—it seems that PostgreSQL wants to somehow read 
information on the whole table before processing it and this behavior 
could eventually be switched off, but I might be wrong here.


FYI:
- I run SAS 9.4 on a windows machine.
- I run PostgreSQL server on a MAC: PostgreSQL 9.3.5 on 
x86_64-apple-darwin12.5.0, compiled by Apple LLVM version 5.1, 64-bit

   I've Pgadmin 1.18.1 installed.

Any help would be much appreciated!!

THANKS!


PS: For several reasons, I cannot use the bulkload feature in SAS for 
this job.





That error is coming from SAS, not PG.  SAS must have pretty bad 
documentation because when I google "GLOBAL SYSDBRC HY000" there is 
pretty much nothing.  (Also, I've never used, or even heard of SAS.  
At first I thought you meant serial attached scsi)


Does SAS support a cursor of some kind?

-Andy


SAS: Statistical Analysis System (this is the original name I remember, 
I think they have renamed it)


SAS: Special Air Service (elite British fighting force, deliberately 
misnamed to confuse the Germans in World War II)


Funny, just now I thought of the second definition first, but obviously 
the first applies.


I actually used SAS many years ago, long before I had written any SQL - 
let alone directly used any database!



Cheers,
Gavin
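On the PostgreSQL side, Andy's cursor suggestion would look something like the following sketch (table and cursor names are illustrative) - the client fetches in batches instead of materialising the whole result set at once:

```sql
BEGIN;
DECLARE big_read CURSOR FOR
    SELECT * FROM big_table;         -- hypothetical table
FETCH FORWARD 10000 FROM big_read;   -- repeat until no rows come back
CLOSE big_read;
COMMIT;
```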




Re: [GENERAL] Stability of JSON textual representation

2015-02-08 Thread Gavin Flower

On 03/02/15 00:06, David Evans wrote:
I've noticed that when representing lists as JSON, Postgres 9.4 
sometimes outputs spaces after commas, and other times does not.


# SELECT array_to_json(array[1, 2, 3]), json_build_array(1, 2, 3);

 array_to_json | json_build_array
---+--
 [1,2,3]   | [1, 2, 3]


This matters to me because I'm intending to hash the JSON text and 
use the hash for change detection.


This difference in behaviour doesn't seem to be documented anywhere. 
Does anyone know whether it is likely to be stable?


Many thanks,

Dave

 Would using jsonb be more consistent?


Cheers,
Gavin
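A quick way to check the jsonb suggestion: casting both outputs through jsonb should yield one canonical textual form, so the hash would no longer depend on which function produced the value (worth verifying on your own 9.4 install before relying on it):

```sql
SELECT array_to_json(array[1, 2, 3])::jsonb::text AS a,
       json_build_array(1, 2, 3)::jsonb::text     AS b;
-- both columns are expected to render as: [1, 2, 3]
```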




Re: [GENERAL] Hardware requirements for a PostGIS server

2015-02-10 Thread Gavin Flower

On 11/02/15 13:52, Mathieu Basille wrote:

Dear PostgreSQL users,

I am posting here a question that I initially asked on the PostGIS 
list [1], where I was advised to try here too (I will keep both lists 
updated about the developments on this issue).


I am currently planning to set up a PostgreSQL + PostGIS instance for 
my lab. Turns out I believe this would be useful for the whole center, 
so that I'm now considering setting up the server for everyone—if 
interest is shared of course. At the moment, I am however struggling 
with what would be required in terms of hardware, and of course, the 
cost will depend on that—at the end of the day, it's really a matter 
of money well spent. I have then a series of questions/remarks, and I 
would welcome any feedback from people with existing experience on 
setting up a multi-user PostGIS server. I'm insisting on the PostGIS 
aspect, since the most heavy requests will be GIS requests 
(intersections, spatial queries, etc.). However, people with similar 
PostgreSQL setup may have very relevant comments about their own 
configuration.


* My own experience about servers is rather limited: I used PostGIS 
quite a bit, but only on a desktop, with only 2 users. The desktop was 
quite good (quad-core Xeon, 12 Go RAM, 500 GB hd), running Debian, and 
we never had any performance issue (although some queries were rather 
long, but still acceptable).


* The use case I'm envisioning would be (at least in the foreseeable 
future):
- About 10 faculty users (which means potentially a little bit more 
students using it); I would have hard time considering more than 4 
concurrent users;
- Data would primarily involve a lot (hundreds/thousands) of high 
resolution (spatial and temporal) raster and vector maps, possibly 
over large areas (Florida / USA / continental), as well as potentially 
millions of GPS records (animals individually monitored);
- Queries will primarily involve retrieving points/maps over given 
areas/time, as well as intersecting points over environmental layers 
[from what I understand, a lot of I/O, with many intermediary tables 
involved]; other use cases will involve working with steps, i.e. the 
straight line segment connecting two successive locations, and 
intersecting them with environmental layers;


* I couldn't find comprehensive or detailed guidelines on-line about 
hardware, but from what I could see, it seems that memory wouldn't be 
the main issue, but the number of cores would be (one core per 
database connection if I'm not mistaken). At the same time, we want to 
make sure that the experience is smooth for everyone... I was advised 
on the PostGIS list to give a look at pgpool (however, UNIX only).


* Is there a difference in terms of possibilities, performance and 
usability between a Linux-based and a MS-based server (from the user 
perspective)? My center is unfortunately MS-centered, and existing 
equipment runs with MS systems... It would thus be easier for them to 
set up a MS-based server. Does it change anything for the user? (I 
insist on the user perspective, since I and others will not admin the 
system, but only use it)


* Does anyone have worked with a server running the DB engine, while 
the DB itself was stored on another box/server? That would likely be 
the case here since we already have a dedicated box for file storage. 
Along these lines, does the system of the file storage box matter 
(Linux vs. MS)?


* We may also use the server as a workstation to streamline PostGIS 
processing with further R analyses/modeling (or even use R from within 
the database using PL/R). Again, does anyone have experience doing it? 
Is a single workstation the recommended way to work with such 
workflow? Or would it be better (but more costly) to have one server 
dedicated to PostGIS and another one, with different specs, dedicated 
to analyses (R)?


I realize my questions and comments may be confusing, likely because 
of my lack of experience with these issues. I really 
welcome any feedback of people working with PostgreSQL servers (+ 
PostGIS ideally!) in a small unit, or any similar setting that could 
be informative!


In advance, thank you very much!

Sincerely,
Mathieu Basille.


[1] Start of the thread here: 
http://lists.osgeo.org/pipermail/postgis-users/2015-February/040120.html


When I looked at tuning options for PostgreSQL, I found that there were 
limitations mentioned for Microsoft O/S's.


I get the general impression from my reading from multiple sources over 
the years, that if you are serious about performance on a server, then 
you should prefer Linux to Microsoft.  Note that most servers run Linux, 
and that over 95% of the top 500 supercomputers run Linux - which is 
rather telling about the perception of Microsoft's performance.



Cheers,
Gavin



Re: [GENERAL] Hardware requirements for a PostGIS server

2015-02-11 Thread Gavin Flower

On 12/02/15 12:38, Mathieu Basille wrote:
Thanks to everyone who contributed to this thread, either on the 
PostGIS [1] or the PostgreSQL [2] mailing lists. I will try to 
summarize everything in this message, which I will actually post on 
both lists to give an update to everyone. I hope it can be useful for 
other people interested. Please feel free to add more advice and other 
experiences, this is always useful!

[...]

* Memory
Examples go from 8 to >32 GB RAM.

Because RAM is relatively cheap, sometimes money spent on lots of RAM is 
better than buying fast disks - especially if the vast majority of SQL 
executed is read only and most of the database likely to be accessed can 
reside in RAM along with relevant indexes and working memory.


Good SSDs are generally reliable for database use, unless you have an 
enormous amount of UPDATE/CREATE/DELETE activity going on - even then, you 
might find that appropriate SSDs will still be okay. Note that SSD 
offerings are constantly changing, and tend to be improving in many 
areas such as reliability, performance, and cost per GB (obviously, be 
wary of marketing speak!).  Though you should still have regular backups.





Platform


Linux is the platform of choice:
* Easier administration (install/configuration/upgrade), which is also 
true for addons/dependencies (starting with PostGIS, but also GEOS, 
GDAL, PL/R);

* Better performance [4];
* More tuning options (limited with MS systems);

There is still the possibility of running a VirtualBox VM on an MS server.

Performance of a database is usually (always?) better on an O/S running 
on bare metal.


[...]
* Integration with R: a dedicated R server brings more flexibility / 
extensions (e.g. Shiny) / performance (more cores and memory available 
for PostGIS) except if data transfer is the bottleneck. Use Pl/R for 
small functions (also if it fits naturally into PostgreSQL workflow) / 
otherwise in R with PostgreSQL connector.


You might want to look at SageMath (think Mathematica & MatLab), as it 
incorporates R and provides much more functionality in some areas:

http://sagemath.org
http://www.sagemath.org/doc/reference/interfaces/sage/interfaces/r.html

[...]


[1] Start of the thread here:
http://lists.osgeo.org/pipermail/postgis-users/2015-February/040120.html




All the best!

You have certainly been very thorough in your homework, and I'm sure 
there are many people here who would love to hear how things turn out.



Cheers,
Gavin




Re: [GENERAL] Hardware requirements for a PostGIS server

2015-02-11 Thread Gavin Flower

On 12/02/15 12:38, Mathieu Basille wrote:
[...]

[1] Start of the thread here:
http://lists.osgeo.org/pipermail/postgis-users/2015-February/040120.html

[...]


http://lists.osgeo.org/pipermail/postgis-users/2015-February/040134.html
[...]
* About usage being mostly read: this will be true for most "pure GIS" 
tasks (mostly intersecting), but I find (from experience) that we 
usually end up with a lot of intermediary tables for our analyses (new 
tables for the most part, not new columns).

[...]

For greater performance of intermediary tables: if these tables can 
be easily recreated, then you might want to make use of PostgreSQL's 
unlogged tables:


   http://www.postgresql.org/docs/9.4/static/sql-createtable.html
   [...]
   CREATE [ [ GLOBAL | LOCAL ] { TEMPORARY | TEMP } | UNLOGGED ] TABLE
   [ IF NOT EXISTS ] table_name ( [
   [...]
   UNLOGGED

If specified, the table is created as an unlogged table. Data
   written to unlogged tables is not written to the write-ahead log
   (see Chapter 29), which makes them considerably faster than ordinary
   tables. However, they are not crash-safe: an unlogged table is
   automatically truncated after a crash or unclean shutdown. The
   contents of an unlogged table are also not replicated to standby
   servers. Any indexes created on an unlogged table are automatically
   unlogged as well.
   [...]
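
A minimal sketch of the pattern (the points table and bounding box are 
invented for illustration; ST_Within/ST_MakeEnvelope are PostGIS functions):

```sql
-- Intermediary result that can be recreated from source data at any time,
-- so skipping the write-ahead log is an acceptable trade-off for speed.
CREATE UNLOGGED TABLE points_in_area AS
SELECT p.id, p.geom
FROM   points AS p
WHERE  ST_Within(p.geom, ST_MakeEnvelope(-88, 24, -79, 31, 4326));

-- After a crash or unclean shutdown the table is truncated; simply re-run
-- the CREATE ... AS above to repopulate it.
```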


Cheers,
Gavin




Re: [GENERAL] Postgres architecture for multiple instances

2015-02-22 Thread Gavin Flower

On 23/02/15 15:53, Samuel Smith wrote:

On 02/21/2015 05:25 PM, David Steele wrote:

On 2/21/15 6:08 PM, Adrian Klaver wrote:


Currently the built in replication solutions work at the cluster level,
not at the database level. There are third party tools, Slony and
Bucardo come to mind, that can work at a more focused level.


Again, it depends in what kind of access your customers have to the
database.  Logical backups can be done on a per-database basis (and are
very practical for small datasets), but physical backups (which are more
efficient) must backup the entire cluster (except for the third-party
solutions noted above that have their own challenges).

You said you don't want to give a lot of information, but the optimal
solution depends on how your customers connect to their database:

1) Entirely through an application or API
2) Directly to the db via psql, PgAdmin, JDBC, etc.

Without that information it's hard to give really solid advice.



Sorry, to be more specific our application is on top of a java 
framework. Similar to Ruby On Rails. Customers don't interface with 
the database at all, only through the website which provides our 
services.


--Sam



In that case I'd go for one instance of pg.

With either:

1. different DBs, or
2. the same DB, with fields identifying which customer is allowed access,

depending on which factors are important in your situation.

The second option might be best, if there is a lot in common between 
customers (tables of data that are not customer specific, and/or common 
pg configuration options).
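
A rough sketch of the second option (shared schema, one instance; the 
table and column names are invented for illustration):

```sql
-- One shared database; every customer-specific row carries its owner.
CREATE TABLE customer (
    id   serial PRIMARY KEY,
    name text   NOT NULL
);

CREATE TABLE invoice (
    id          serial PRIMARY KEY,
    customer_id integer NOT NULL REFERENCES customer (id),
    total       numeric(12,2) NOT NULL
);

-- The application then scopes every query to the requesting customer:
SELECT i.id, i.total
FROM   invoice AS i
WHERE  i.customer_id = 42;  -- id taken from the authenticated session
```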




Cheers,
Gavin






Re: [GENERAL] Postgres architecture for multiple instances

2015-02-22 Thread Gavin Flower

On 23/02/15 17:21, Samuel Smith wrote:

On 02/22/2015 01:53 PM, Scott Marlowe wrote:
> I'd run a debian based distro (Ubuntu or Debian work well) and use the
> pg_* commands to create the clusters the same way. Gives you the
> maximum separation for clients.
>
> pg_createcluster
>
> Usage: /usr/bin/pg_createcluster [options]  
> [-- ]
>
> Options:
>-u   cluster owner and superuser (default: 'postgres')
>-g   group for data files (default: primary group of owner)
>-d   data directory (default:
> /var/lib/postgresql//)
>-s   socket directory (default: /var/run/postgresql for 
clusters

>  owned by 'postgres', /tmp for other clusters)
>-l   path to desired log file (default:
> /var/log/postgresql/postgresql--.log)
>--locale 
>  set cluster locale (default: inherit from environment)
>--lc-collate/ctype/messages/monetary/numeric/time 
>  like --locale, but only set for a particular category
>-e  Default encoding (default: derived from locale)
>-p  port number (default: next free port starting from 
5432)

>--start   start the cluster after creating it
>--start-conf auto|manual|disabled
>  Set automatic startup behaviour in start.conf 
(default: 'auto')

>--createclusterconf=file alternative createcluster.conf to use
>--environment=file alternative environment file to use
> other options to pass to initdb
>
> Just use -u and -d to put it where you want, -l to setup logging and
> -p to set the port you want.
>

I am stuck with Red Hat as the OS so I'll only have initdb. But this is 
a good point. I assume there is nothing wrong with having multiple 
postgres instances (clusters) all running under a single postgres user 
on different ports on the same machine?
But then what is the best way to handle connecting to an individual 
cluster  (local only for admin purposes) with psql? Doesn't look like 
I could go by the cluster name alone, but would have to manage 
(memorize) connections by port number?


Thanks,
Sam



pg_archivecleanup
pg_basebackup
pg_config
pg_controldata
pg_ctl
pg_dump
pg_dumpall
pg_isready
pg_receivexlog
pg_recvlogical
pg_resetxlog
pg_restore
pg_standby
pg_test_fsync
pg_test_timing
pg_upgrade
pg_xlogdump

Those are the pg_* commands I found on my Fedora 21 box - if it's of any help!
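
As an aside, clusters on different ports need not be memorized: libpq's 
connection service file lets psql address them by name. A sketch (the 
cluster names and ports are invented):

```ini
# ~/.pg_service.conf
[maindb]
host=localhost
port=5432
dbname=postgres

[reporting]
host=localhost
port=5433
dbname=postgres
```

After which `psql service=reporting` (or `PGSERVICE=reporting psql`) 
connects to the second cluster without remembering its port.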


Cheers,
Gavin




Re: [GENERAL] SQL solution for my JDBC timezone issue

2015-02-24 Thread Gavin Flower

On 25/02/15 04:29, Adrian Klaver wrote:

On 02/24/2015 06:25 AM, George Woodring wrote:

  -- In your original post you mentioned that access to the databases is
through a Web server.

  -- Is there just one Web server with one time zone?

We have 2 web servers that are clustered together.  They are both set to
Eastern since that is the timezone they are located in.




So:

JDBC Web servers (US/East) <---> 90 databases (5 different timezones)

Therefore everything to the end user is passed through the Web servers?

Is there a reason why the databases have different timezones?

Seems to me less complicated to have all the databases share the UTC 
timezone. Then you only have one offset, US/East <--> UTC.



iGLASS Networks
www.iglass.net 





Yes, I think it is sanest to have servers use GMT*, as then it is simple 
enough to convert results to whatever local time zone the client is in.


I live in New Zealand, and we are offset from GMT by about 12 hours 
(depending on seasonal adjustments).  I have no problem getting query 
results in my local time zone automatically - regardless of whether I 
query from the database box or another, using psql!



Cheers,
Gavin


*UTC or whatever it is currently fashionable to call it!  :-)
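
The store-in-UTC, render-locally pattern is easy to demonstrate outside 
the database too; a small Python sketch (Pacific/Auckland standing in for 
the New Zealand case):

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo  # Python 3.9+

# Timestamps are stored and compared in UTC...
stored = datetime(2015, 2, 24, 21, 30, tzinfo=timezone.utc)

# ...and converted to the client's zone only for display.
nz = stored.astimezone(ZoneInfo("Pacific/Auckland"))
us_east = stored.astimezone(ZoneInfo("America/New_York"))

print(nz.isoformat())       # 2015-02-25T10:30:00+13:00 (NZDT in February)
print(us_east.isoformat())  # 2015-02-24T16:30:00-05:00

# All three represent the same instant.
assert stored == nz == us_east
```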




Re: [GENERAL] [postgis-users] Query with LIMIT but as random result set?

2013-01-11 Thread Gavin Flower

On 12/01/13 06:45, Bosco Rama wrote:

On 01/11/13 09:31, Gavin Flower wrote:

-- theta in radians
-- for radius = 100

INSERT INTO ranpoint
  (id, theta, r)
VALUES
  (generate_series(1, 10), pi() * random(), 100 * random());

Shouldn't the value for theta be:
  2 * pi() * random()

Bosco.



Very definitely! :-)

Me bad, as the saying goes...
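
For the record, the corrected insert might read (same hypothetical 
ranpoint table as the earlier post):

```sql
-- theta uniform over [0, 2*pi), r uniform over [0, 100)
INSERT INTO ranpoint (id, theta, r)
SELECT g,
       2 * pi() * random(),
       100 * random()
FROM   generate_series(1, 10) AS g;
```

(Uniform r clusters points toward the centre; for points uniform over the 
disc's area, `100 * sqrt(random())` would be needed instead.)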


Cheers,
Gavin


Re: [GENERAL] >

2013-01-11 Thread Gavin Flower

On 12/01/13 10:44, Gavan Schneider wrote:

On Saturday, January 12, 2013 at 04:49, Gavin Flower wrote:


On 12/01/13 06:45, Bosco Rama wrote:

Shouldn't the value for theta be:
2 * pi() * random()

Bosco.



Very definitely! :-)


One could also ask if the value for theta shouldn't be:
tau() * random()

<http://tauday.com/> :-)

Regards
Gavan




Well Gavan,

I must bow down before your greater wisdom, as I am forced to agree with 
you!


Especially as your name sorts before mine, yet our names are almost 
exactly the same.  :-)



Cheers,
Gavin

P.S. Is tau() a standard part of pg core - if not, when will it be?



Re: [GENERAL] Linux Distribution Preferences?

2013-01-13 Thread Gavin Flower

On 14/01/13 07:27, Shaun Thomas wrote:

Hey guys,

I'm not sure the last time I saw this discussion, but I was somewhat curious: 
what would be your ideal Linux distribution for a nice solid PostgreSQL 
installation? We've kinda bounced back and forth between RHEL, CentOS, and 
Ubuntu LTS, so I was wondering what everyone else thought.

--
Shaun Thomas
OptionsHouse | 141 W. Jackson Blvd | Suite 500 | Chicago IL, 60604
312-676-8870
stho...@optionshouse.com



__

See http://www.peak6.com/email_disclaimer/ for terms and conditions related to 
this email


I would tend to use Fedora for development, but would consider CentOS (or 
RHEL, if we had the budget) for production - I avoid Ubuntu like the plague.



Cheers,
Gavin


Re: [GENERAL] Linux Distribution Preferences?

2013-01-13 Thread Gavin Flower

On 14/01/13 13:07, Chris Ernst wrote:

On 01/13/2013 03:44 PM, Gavin Flower wrote:

I would tend use Fedora for development, but would consider CentOS (or
RHEL, if we had the budget) for production - I avoid Ubuntu like the 
plague.


I happen to be doing my own research on this matter.  I tend to lean 
more toward RHEL or CentOS for production servers just because there 
seem to be more people using it in that capacity and it seems to be 
easier to get solid support or advice for those. But I prefer Ubuntu 
for my laptop mainly because of the size of the community, available 
PPAs, ease of administration, etc...


Ultimately, it seems to come down to what you are most 
familiar/comfortable managing.  I don't see much practical difference 
between the distributions other than the versions of various software 
that they ship with by default.  But that is usually rather easy to 
change according to your needs anyway.


I've seen the opinion of "avoid Ubuntu like the plague" expressed many 
times, but it is never followed up with any solid reasoning. Can you 
(or anyone else) give specific details on exactly why you believe 
Ubuntu should be avoided?


- Chris




4 reasons:

1. One place where I worked Ubuntu was standard; I tried it and found
   that it lacked at least a couple of desktop features in GNOME 2 that
   I found very useful in Fedora. Fortunately, I was allowed to
   revert to Fedora. Prior to that, I was using Fedora mainly by
   default.

2. Twice I came across features that I liked which Ubuntu seemed to imply
   were its own work; later I found the projects had been initiated and
   sponsored largely by Red Hat. Especially telling, as Red Hat is in the
   top ten contributors to the kernel, while the contribution of Ubuntu
   is not significant.

3. Ubuntu distributions are now starting to be filled with crapware and
   anti-privacy features.

4. Ubuntu seems very good at collecting fanbois.

If I were to change from Fedora, I would probably go back to Debian.



Cheers,
Gavin


Re: [GENERAL] Linux Distribution Preferences?

2013-01-13 Thread Gavin Flower
Please don't top post, add your comments at the end as per the norm for 
this group.


On 14/01/13 12:06, SUNDAY A. OLUTAYO wrote:

Ubuntu did the marketing for linux and many more. Some people are just haters. 
Can you tell us about upstart?

Sent from my LG Mobile

Gavin Flower  wrote:

On 14/01/13 13:07, Chris Ernst wrote:

On 01/13/2013 03:44 PM, Gavin Flower wrote:

I would tend use Fedora for development, but would consider CentOS (or
RHEL, if we had the budget) for production - I avoid Ubuntu like the
plague.

I happen to be doing my own research on this matter.  I tend to lean
more toward RHEL or CentOS for production servers just because there
seem to be more people using it in that capacity and it seem to be
easier to get solid support or advice for those. But I prefer Ubuntu
for my laptop mainly because of the size of the community, available
PPAs, ease of administration, etc...

Ultimately, it seem to come down to what you are most
familiar/comfortable managing.  I don't see much practical difference
between the distributions other than the versions of various software
that they ship with by default.  But that is usually rather easy to
change according to your needs anyway.

I've seen the opinion of "avoid Ubuntu like the plague" expressed many
times, but it is never followed up with any solid reasoning. Can you
(or anyone else) give specific details on exactly why you believe
Ubuntu should be avoided?

 - Chris




4 reasons:

  1. One place where I worked Ubuntu was standard, I tried it and found
 that it lacked at least a couple of desktop features in GNOME 2 that
 I found very useful into Fedora. Fortunately, I was allowed to
 revert back to Fedora. Prior to that, I was using Fedora mainly by
 default.

  2. Twice I came across features that I liked and Ubuntu seemed to imply
 they had done them, later I found the projects been initiated and
 sponsored largely by Red Hat. Especially as Red Hat is in the top
 ten contributors to the kernel, and the contribution of Ubuntu is
 not significant.

  3. Ubuntu distributions are now starting to be filled with crapware and
 ant-privacy features features.

  4. Ubuntu seems very good at collecting fanbois.

If I were to change from Fedora, I would probably go back to Debian.



Cheers,
Gavin

 I don't know much about 'upstart'  - Fedora uses systemd:
http://www.freedesktop.org/wiki/Software/systemd


Cheers,
Gavin




