date:20110503

Re: [GENERAL] pervasiveness of surrogate (also called synthetic) keys

2011-05-03 Thread Karsten Hilbert

On Tue, May 03, 2011 at 10:52:23AM +0800, Craig Ringer wrote:

> ... and that's before we get into the horror of "what is someone's
> name". Which name? Which spelling? Do they even have a single canonical
> name?

- people have, at least over time, several compound names
- they have, at any one time, one active compound name
- additional spellings can be tracked as additional names
  of that individual

> Is their canonical name - if any - expressable in the character
> set used by the service? Is it even covered by Unicode?!?

- I haven't seen evidence to the contrary.
- But then, I haven't had a need to store a Klingon name.
- Yes, it's been difficult to come up with something sensible
  to store Spock's first name in the GNUmed database.

> Does it make
> any sense to split their name up into the traditional
> english-speaking-recent-western "family" and "given" name parts?

- any compound names I have come across work like this:

- group name
- individual name
- nicknames (pseudonyms, warrior names, actor names, ...)

The day-to-day usage of each part varies, though.

> Is there a single consistent way to do so for their name even if it does? etc.

Even in Japan, where the group is a lot more than the
individual, can you clearly split into group name and
individual name.

Karsten
-- 
GPG key ID E4071346 @ gpg-keyserver.de
E167 67FD A291 2BEA 73BD  4537 78B9 A9F9 E407 1346

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] Bidirectional replication

2011-05-03 Thread Simon Riggs

On Tue, May 3, 2011 at 7:31 AM, Sim Zacks  wrote:

> I have heard good things about Bucardo, though I haven't tried it myself
> yet. I was warned that it would be risky to have 2 masters that have the
> same tables modified in both because of issues such as delayed sync, race
> conditions and other such goodies that may corrupt the meaning of the data.

Just to be clear and fair to Bucardo, I would add a few points.

All multi-master replication solutions that use an optimistic
mechanism require "conflict resolution" cases and code. This is the
same with SQLServer and Oracle etc.. Referring to a well known problem
as a race condition seems to introduce doubt and fear into a situation
that is well understood. Bucardo does offer hooks for conflict
resolution to allow you to program around the issues.

So if I felt that multi-master replication was the right way to go for
a solution, Bucardo is a good choice.

Just to add other info: if multi-master replication uses pessimistic
coherence, then the coherence mechanism can also be a source of
contention and/or cause the need for alternative kinds of conflict
resolution.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

[GENERAL] Can we Flush the Postgres Shared Memory ?

2011-05-03 Thread raghu ram

Respected Committers,

It may be a silly question, still out of curiosity I want to know, is there
any possible way to flush the Postgres Shared Memory without restarting the
cluster.

In Oracle, we can flush the SGA, can we get the same feature here..

Thanks in Advance.

Regards
Raghu Ram
EnterpriseDB Corporation

Re: [GENERAL] pervasiveness of surrogate (also called synthetic) keys

2011-05-03 Thread Rick Genter


On May 2, 2011, at 10:52 PM, Craig Ringer wrote:

> SSN? What if they don't live in the US or aren't a citizen?

Non-citizens can have SSNs (they have to if they work in the US).
--
Rick Genter
rick.gen...@gmail.com


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

[GENERAL] Re: Seg Fault in backend after beginning to use xpath (PG 9.0, FreeBSD 8.1)

2011-05-03 Thread Ivan Voras


On 03/05/2011 07:12, alan bryan wrote:

Our developers started to use some xpath features and upon deployment
we now have an issue where PostgreSQL is seg faulting periodically.
Any ideas on what to look at next would be much appreciated.

FreeBSD 8.1
PostgreSQL 9.0.3 (also tried upgrading to 9.0.4)  built from ports
Libxml2 2.7.6 (also tried upgrading to 2.7.8)   built from ports

pgsql logs show:
May  1 17:51:13 192.168.20.100 postgres[11862]: [94-1] LOG:  server
process (PID 62112) was terminated by signal 11: Segmentation fault

syslog shows:
May  2 20:29:16 db3 kernel: pid 49956 (postgres), uid 70: exited on
signal 11 (core dumped)
May  2 21:06:37 db3 kernel: pid 39086 (postgres), uid 70: exited on
signal 10 (core dumped)

Checking out postgres.core and we see:

(gdb) bt
#0  0x0008f5f19afd in pthread_mutex_lock () from /lib/libthr.so.3
#1  0x000800d22965 in xmlRMutexLock () from /usr/local/lib/libxml2.so.5


This is unusual. There isn't any need to use pthreads here. As far as I 
can see, the normal build of libxml2 doesn't import it explicitly:



ldd /usr/local/lib/libxml2.so

/usr/local/lib/libxml2.so:
libz.so.5 => /lib/libz.so.5 (0x800889000)
libiconv.so.3 => /usr/local/lib/libiconv.so.3 (0x800e5)
libm.so.5 => /lib/libm.so.5 (0x80104b000)
libc.so.7 => /lib/libc.so.7 (0x800647000)

Judging by the mix of SIGBUS and SIGSEGV, I'd say it is likely this is 
causing you problems.


To make sure, you may want to rebuild libxml2 with WITHOUT_THREADS 
defined. You may also need to rebuild postgresql afterwards.




--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] Bidirectional replication

2011-05-03 Thread tushar nehete

Thanks you all,
I started with Bucardo. I installed activeperl 5.12 on my Linux(RHEL5.5)
server.
Can you please suggest some link which describe the installation steps in
details.


Thanks,
Tushar

On Tue, May 3, 2011 at 2:49 PM, Simon Riggs  wrote:

> On Tue, May 3, 2011 at 7:31 AM, Sim Zacks  wrote:
>
> > I have heard good things about Bucardo, though I haven't tried it myself
> > yet. I was warned that it would be risky to have 2 masters that have the
> > same tables modified in both because of issues such as delayed sync, race
> > conditions and other such goodies that may corrupt the meaning of the
> data.
>
>
> Just to be clear and fair to Bucardo, I would add a few points.
>
> All multi-master replication solutions that use an optimistic
> mechanism require "conflict resolution" cases and code. This is the
> same with SQLServer and Oracle etc.. Referring to a well known problem
> as a race condition seems to introduce doubt and fear into a situation
> that is well understood. Bucardo does offer hooks for conflict
> resolution to allow you to program around the issues.
>
> So if I felt that multi-master replication was the right way to go for
> a solution, Bucardo is a good choice.
>
> Just to add other info: if multi-master replication uses pessimistic
> coherence, then the coherence mechanism can also be a source of
> contention and/or cause the need for alternative kinds of conflict
> resolution.
>
> --
>  Simon Riggs   http://www.2ndQuadrant.com/
>  PostgreSQL Development, 24x7 Support, Training & Services
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>

Re: [GENERAL] [ADMIN] Can we Flush the Postgres Shared Memory ?

2011-05-03 Thread Simon Riggs

On Tue, May 3, 2011 at 11:54 AM, raghu ram  wrote:

> It may be a silly question, still out of curiosity I want to know, is there
> any possible way to flush the Postgres Shared Memory without restarting the
> cluster.
> In Oracle, we can flush the SGA, can we get the same feature here..
> Thanks in Advance.


The CHECKPOINT command will do this for you.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] Bidirectional replication

2011-05-03 Thread Raghavendra

Best to start with..

http://bucardo.org/wiki/Bucardo/Installation

Best Regards,
Raghavendra
EnterpriseDB Corporation
Blog: http://raghavt.blogspot.com/



On Tue, May 3, 2011 at 5:34 PM, tushar nehete  wrote:

> Thanks you all,
> I started with Bucardo. I installed activeperl 5.12 on my Linux(RHEL5.5)
> server.
> Can you please suggest some link which describe the installation steps in
> details.
>
>
> Thanks,
> Tushar
>
> On Tue, May 3, 2011 at 2:49 PM, Simon Riggs  wrote:
>
>> On Tue, May 3, 2011 at 7:31 AM, Sim Zacks  wrote:
>>
>> > I have heard good things about Bucardo, though I haven't tried it myself
>> > yet. I was warned that it would be risky to have 2 masters that have the
>> > same tables modified in both because of issues such as delayed sync,
>> race
>> > conditions and other such goodies that may corrupt the meaning of the
>> data.
>>
>>
>> Just to be clear and fair to Bucardo, I would add a few points.
>>
>> All multi-master replication solutions that use an optimistic
>> mechanism require "conflict resolution" cases and code. This is the
>> same with SQLServer and Oracle etc.. Referring to a well known problem
>> as a race condition seems to introduce doubt and fear into a situation
>> that is well understood. Bucardo does offer hooks for conflict
>> resolution to allow you to program around the issues.
>>
>> So if I felt that multi-master replication was the right way to go for
>> a solution, Bucardo is a good choice.
>>
>> Just to add other info: if multi-master replication uses pessimistic
>> coherence, then the coherence mechanism can also be a source of
>> contention and/or cause the need for alternative kinds of conflict
>> resolution.
>>
>> --
>>  Simon Riggs   http://www.2ndQuadrant.com/
>>  PostgreSQL Development, 24x7 Support, Training & Services
>>
>> --
>> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
>> To make changes to your subscription:
>> http://www.postgresql.org/mailpref/pgsql-general
>>
>
>

Re: [GENERAL] Bidirectional replication

2011-05-03 Thread Raghavendra

One more point, Please take into consideration the points mentioned by Simon
Riggs in your testing.

Best Regards,
Raghavendra
EnterpriseDB Corporation
Blog: http://raghavt.blogspot.com/



On Tue, May 3, 2011 at 5:41 PM, Raghavendra <
raghavendra@enterprisedb.com> wrote:

> Best to start with..
>
> http://bucardo.org/wiki/Bucardo/Installation
>
> Best Regards,
> Raghavendra
> EnterpriseDB Corporation
> Blog: http://raghavt.blogspot.com/
>
>
>
> On Tue, May 3, 2011 at 5:34 PM, tushar nehete  wrote:
>
>> Thanks you all,
>> I started with Bucardo. I installed activeperl 5.12 on my Linux(RHEL5.5)
>> server.
>> Can you please suggest some link which describe the installation steps in
>> details.
>>
>>
>> Thanks,
>> Tushar
>>
>> On Tue, May 3, 2011 at 2:49 PM, Simon Riggs wrote:
>>
>>> On Tue, May 3, 2011 at 7:31 AM, Sim Zacks  wrote:
>>>
>>> > I have heard good things about Bucardo, though I haven't tried it
>>> myself
>>> > yet. I was warned that it would be risky to have 2 masters that have
>>> the
>>> > same tables modified in both because of issues such as delayed sync,
>>> race
>>> > conditions and other such goodies that may corrupt the meaning of the
>>> data.
>>>
>>>
>>> Just to be clear and fair to Bucardo, I would add a few points.
>>>
>>> All multi-master replication solutions that use an optimistic
>>> mechanism require "conflict resolution" cases and code. This is the
>>> same with SQLServer and Oracle etc.. Referring to a well known problem
>>> as a race condition seems to introduce doubt and fear into a situation
>>> that is well understood. Bucardo does offer hooks for conflict
>>> resolution to allow you to program around the issues.
>>>
>>> So if I felt that multi-master replication was the right way to go for
>>> a solution, Bucardo is a good choice.
>>>
>>> Just to add other info: if multi-master replication uses pessimistic
>>> coherence, then the coherence mechanism can also be a source of
>>> contention and/or cause the need for alternative kinds of conflict
>>> resolution.
>>>
>>> --
>>>  Simon Riggs   http://www.2ndQuadrant.com/
>>>  PostgreSQL Development, 24x7 Support, Training & Services
>>>
>>> --
>>> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
>>> To make changes to your subscription:
>>> http://www.postgresql.org/mailpref/pgsql-general
>>>
>>
>>
>

Re: [GENERAL] [ADMIN] Can we Flush the Postgres Shared Memory ?

2011-05-03 Thread Raghavendra

On Tue, May 3, 2011 at 5:37 PM, Simon Riggs  wrote:

> On Tue, May 3, 2011 at 11:54 AM, raghu ram 
> wrote:
>
> > It may be a silly question, still out of curiosity I want to know, is
> there
> > any possible way to flush the Postgres Shared Memory without restarting
> the
> > cluster.
> > In Oracle, we can flush the SGA, can we get the same feature here..
> > Thanks in Advance.
>
>
> The CHECKPOINT command will do this for you.
>

This command will empty the PSM...

Best Regards,
Raghavendra
EnterpriseDB Corporation
Blog: http://raghavt.blogspot.com/

Re: [GENERAL] [ADMIN] Can we Flush the Postgres Shared Memory ?

2011-05-03 Thread raghu ram

On Tue, May 3, 2011 at 6:01 PM, Raghavendra <
raghavendra@enterprisedb.com> wrote:

>
> On Tue, May 3, 2011 at 5:37 PM, Simon Riggs  wrote:
>
>> On Tue, May 3, 2011 at 11:54 AM, raghu ram 
>> wrote:
>>
>> > It may be a silly question, still out of curiosity I want to know, is
>> there
>> > any possible way to flush the Postgres Shared Memory without restarting
>> the
>> > cluster.
>> > In Oracle, we can flush the SGA, can we get the same feature here..
>> > Thanks in Advance.
>>
>>
>> The CHECKPOINT command will do this for you.
>>
>
>
>
According to PostgreSQL documentation, whenever you execute "CHECKPOINT" in
the database,it will flush the modified data files presented in the Shared
Buffers retuned to the Disk.

   http://www.postgresql.org/docs/9.0/static/sql-checkpoint.html

Is this clears the entire shared memory cache and same time,if i execute
fresh SQL statement, Data will be retuned from disk??



--Raghu Ram

Re: [GENERAL] pervasiveness of surrogate (also called synthetic) keys

2011-05-03 Thread Merlin Moncure

On Mon, May 2, 2011 at 11:53 PM, Craig Ringer
 wrote:
> On 03/05/11 11:07, Greg Smith wrote:
>
>> That doesn't mean you can't use
>> them as a sort of foreign key indexing the data; it just means you can't
>> make them the sole unique identifier for a particular entity, where that
>> entity is a person, company, or part.
>
> Classic case: a database here has several tables indexed by MAC address.
> It's used for asset reporting and software inventory.
>
> Problem: VMs generate random MAC addresses by default. They're not
> guaranteed to be globally unique. Collisions have happened and will
> probably happen again. In this case, it wasn't a big deal, but it just
> goes to show that even the "obviously" globally unique isn't necessarily so.

It's precisely pathological cases like this where uniqueness
constraints are important and should be used.  By the way, we aren't
debating the use of natural case but whether to define uniqueness
constraints.  My main gripe with surrogates is that their use often
leads directly to lazy schema design where uniqueness constraints are
not defined which leads to data problems exactly like the case you
described above.

In a purely surrogate table with no unique on the mac, suppose you
have two records with the same value for the address, and there are no
other interesting fields on the table or elsewhere in the database:
*) who/what made the decision to place a second record on the table?
*) is that decision verifiable?
*) Is that decision repeatable?
*) are there other routes of data entry into the database that bypass
that decision?  will there ever be?
*) what happens when the code that represents that decision has or
develops a bug?
*) why would you not want information supporting that decision in the database?
*) how do you know the tables keying to your mac table are pointing to
the correct record?
*) what are the consequences for ad hoc queries that join directly
against the mac?  DISTINCT?

If your data modeler that made the the assumptions that a MAC is
unique (a mistake obviously) at least the other tables are protected
from violations of that assumption because the database would reject
them with an error, which is a Good Thing.  Without a uniqueness
constraint you now have ambiguous data which is a Very Bad Thing.
Without proper unique constraints, a generated key is effectively
saying "well, I cant' figure this out right now...I'll deal with it
later".  That unmanaged complexity is now in the application and all
the queries that touch the database...you've created your own bug
factory.  With a uniqueness constraint, you have a rigorous definition
of what your record represents, and other entities in the database can
now rely on that definition.

Natural key designs are good for a lot of reasons, but #1 on the least
is that they force you to deal with problems in your data model up
front because they force you to define unqiueness.  If the MAC turns
out not to be unique and the problem is not in fact coming from the
input data or the application, yes, you do have to correct the model
but at least the data inside the database is clean, and can be
unambiguously mapped to the new model.  I'll take schema changes over
bad data.

Correcting the model means you have to figure out whatever information
is used to distinguish identical MACs #1 and #2 is stored in the
database because now your data and the corresponding decisions are
verifiable, repeatable, unambiguous, etc.  What extra field you have
to add to your 'mac' table to make it unique would depend on certain
things, but it's certainly a solvable problem, and when solved would
give you a more robust database.

merlin

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] [ADMIN] Can we Flush the Postgres Shared Memory ?

2011-05-03 Thread Merlin Moncure

On Tue, May 3, 2011 at 8:30 AM, raghu ram  wrote:
> On Tue, May 3, 2011 at 6:01 PM, Raghavendra
>  wrote:
>>
>> On Tue, May 3, 2011 at 5:37 PM, Simon Riggs  wrote:
>>>
>>> On Tue, May 3, 2011 at 11:54 AM, raghu ram 
>>> wrote:
>>>
>>> > It may be a silly question, still out of curiosity I want to know, is
>>> > there
>>> > any possible way to flush the Postgres Shared Memory without restarting
>>> > the
>>> > cluster.
>>> > In Oracle, we can flush the SGA, can we get the same feature here..
>>> > Thanks in Advance.
>>>
>>>
>>> The CHECKPOINT command will do this for you.
>>
>>
>
> According to PostgreSQL documentation, whenever you execute "CHECKPOINT" in
> the database,it will flush the modified data files presented in the Shared
> Buffers retuned to the Disk.
>            http://www.postgresql.org/docs/9.0/static/sql-checkpoint.html
> Is this clears the entire shared memory cache and same time,if i execute
> fresh SQL statement, Data will be retuned from disk??

no it will not, or at least there is no guarantee it will be.  the
only way to reset the buffers in that sense is to restart the database
(and even then they might not be read from disk, because they could
sit in the o/s cache).  to force a read from the drive you'd have to
reboot the server, or at least shut it down and use a lot of memory
for some other purpose.

merlin

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] [ADMIN] Can we Flush the Postgres Shared Memory ?

2011-05-03 Thread tv

> On Tue, May 3, 2011 at 6:01 PM, Raghavendra <
> raghavendra@enterprisedb.com> wrote:
>
>>
>> On Tue, May 3, 2011 at 5:37 PM, Simon Riggs 
>> wrote:
>>
>>> On Tue, May 3, 2011 at 11:54 AM, raghu ram 
>>> wrote:
>>>
>>> > It may be a silly question, still out of curiosity I want to know, is
>>> there
>>> > any possible way to flush the Postgres Shared Memory without
>>> restarting
>>> the
>>> > cluster.
>>> > In Oracle, we can flush the SGA, can we get the same feature here..
>>> > Thanks in Advance.
>>>
>>>
>>> The CHECKPOINT command will do this for you.
>>>
>>
>>
>>
> According to PostgreSQL documentation, whenever you execute "CHECKPOINT"
> in
> the database,it will flush the modified data files presented in the Shared
> Buffers retuned to the Disk.
>
>http://www.postgresql.org/docs/9.0/static/sql-checkpoint.html
>
> Is this clears the entire shared memory cache and same time,if i execute
> fresh SQL statement, Data will be retuned from disk??

No. Checkpoint means all dirty buffers are written to the datafiles, it
does not mean emptying the shared buffers. Checkpoints happen regularly so
this would have an unwanted impact on performance.

And besides that, there's a page cache maintained by the OS (not sure if
you're running Linux or Windows). So even when the block does not exist in
the shared buffers, it may be in the page cache (thus not read from the
drive).

Dropping the page cache is quite simple (http://linux-mm.org/Drop_Caches),
emptying the shared buffers is not that simple - I guess the easiest way
is to restart the db.

What are you trying to achieve? Why do you need this?

Tomas

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] [ADMIN] Can we Flush the Postgres Shared Memory ?

2011-05-03 Thread Cédric Villemain

2011/5/3 Merlin Moncure :
> On Tue, May 3, 2011 at 8:30 AM, raghu ram  wrote:
>> On Tue, May 3, 2011 at 6:01 PM, Raghavendra
>>  wrote:
>>>
>>> On Tue, May 3, 2011 at 5:37 PM, Simon Riggs  wrote:

 On Tue, May 3, 2011 at 11:54 AM, raghu ram 
 wrote:

 > It may be a silly question, still out of curiosity I want to know, is
 > there
 > any possible way to flush the Postgres Shared Memory without restarting
 > the
 > cluster.
 > In Oracle, we can flush the SGA, can we get the same feature here..
 > Thanks in Advance.


 The CHECKPOINT command will do this for you.
>>>
>>>
>>
>> According to PostgreSQL documentation, whenever you execute "CHECKPOINT" in
>> the database,it will flush the modified data files presented in the Shared
>> Buffers retuned to the Disk.
>>            http://www.postgresql.org/docs/9.0/static/sql-checkpoint.html
>> Is this clears the entire shared memory cache and same time,if i execute
>> fresh SQL statement, Data will be retuned from disk??
>
> no it will not, or at least there is no guarantee it will be.  the
> only way to reset the buffers in that sense is to restart the database
> (and even then they might not be read from disk, because they could
> sit in the o/s cache).  to force a read from the drive you'd have to
> reboot the server, or at least shut it down and use a lot of memory
> for some other purpose.

with linux, you can : "echo 3 > /proc/sys/vm/drop_caches" for the OS cache

>
> merlin
>
> --
> Sent via pgsql-admin mailing list (pgsql-ad...@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-admin
>



-- 
Cédric Villemain               2ndQuadrant
http://2ndQuadrant.fr/     PostgreSQL : Expertise, Formation et Support

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] [ADMIN] Can we Flush the Postgres Shared Memory ?

2011-05-03 Thread Merlin Moncure

On Tue, May 3, 2011 at 8:55 AM, Cédric Villemain
 wrote:
> 2011/5/3 Merlin Moncure :
>>
>> no it will not, or at least there is no guarantee it will be.  the
>> only way to reset the buffers in that sense is to restart the database
>> (and even then they might not be read from disk, because they could
>> sit in the o/s cache).  to force a read from the drive you'd have to
>> reboot the server, or at least shut it down and use a lot of memory
>> for some other purpose.
>
> with linux, you can : "echo 3 > /proc/sys/vm/drop_caches" for the OS cache
>

yeah -- good point.  aside: does that also drop cache on the drive/raid card?

merlin

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] [ADMIN] Can we Flush the Postgres Shared Memory ?

2011-05-03 Thread Cédric Villemain

2011/5/3 Merlin Moncure :
> On Tue, May 3, 2011 at 8:55 AM, Cédric Villemain
>  wrote:
>> 2011/5/3 Merlin Moncure :
>>>
>>> no it will not, or at least there is no guarantee it will be.  the
>>> only way to reset the buffers in that sense is to restart the database
>>> (and even then they might not be read from disk, because they could
>>> sit in the o/s cache).  to force a read from the drive you'd have to
>>> reboot the server, or at least shut it down and use a lot of memory
>>> for some other purpose.
>>
>> with linux, you can : "echo 3 > /proc/sys/vm/drop_caches" for the OS cache
>>
>
> yeah -- good point.  aside: does that also drop cache on the drive/raid card?

no -- good point too ! (damn! how SAN users will do...maybe EMC or
other are good enough to provide some control panel for that ? )

and as I read on the link provided by Tomas, it is better to issue a
'sync' before trying to drop cache (I do that sometime, but postgresql
flush its write before shutdown, so I expected the dirty pages in OS
cache not to be relative to postgresql files.)


-- 
Cédric Villemain               2ndQuadrant
http://2ndQuadrant.fr/     PostgreSQL : Expertise, Formation et Support

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] Bidirectional replication

2011-05-03 Thread Merlin Moncure

On Tue, May 3, 2011 at 4:19 AM, Simon Riggs  wrote:
> On Tue, May 3, 2011 at 7:31 AM, Sim Zacks  wrote:
>
>> I have heard good things about Bucardo, though I haven't tried it myself
>> yet. I was warned that it would be risky to have 2 masters that have the
>> same tables modified in both because of issues such as delayed sync, race
>> conditions and other such goodies that may corrupt the meaning of the data.
>
>
> Just to be clear and fair to Bucardo, I would add a few points.
>
> All multi-master replication solutions that use an optimistic
> mechanism require "conflict resolution" cases and code. This is the
> same with SQLServer and Oracle etc.. Referring to a well known problem
> as a race condition seems to introduce doubt and fear into a situation
> that is well understood. Bucardo does offer hooks for conflict
> resolution to allow you to program around the issues.
>
> So if I felt that multi-master replication was the right way to go for
> a solution, Bucardo is a good choice.
>
> Just to add other info: if multi-master replication uses pessimistic
> coherence, then the coherence mechanism can also be a source of
> contention and/or cause the need for alternative kinds of conflict
> resolution.

Yeah.  One nasty property that async multi master solutions share is
that they change the definition of what 'COMMIT' means -- the database
can't guarantee the transaction is valid because not all the
supporting facts are necessarily known.  Even after libpq gives you
the green light that transaction could fail an arbitrary length of
time later, and you can't rely in the assumption it's valid until
you've done some synchronizing with the other 'masters'.  Maybe you
don't need to rely on that assumption so a 'fix it later, or possibly
never' methodology works well.  Those cases unfortunately fairly rare
in the real world.

Multi master replication, at least those implementations that don't
hold locks and release the transaction until you've got a guarantee
it's valid and will stay valid, are fundamentally incompatible with
SQL.  I know some people do some cool, usable things with that stuff,
but the whole concept seems awfully awkward to me.  I suppose I'm a
crotchety, cane shaking fundamentalist, but the old school approach of
dividing work logically and developing communication protocols is
often the best approach to take.

merlin

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] [ADMIN] Can we Flush the Postgres Shared Memory ?

2011-05-03 Thread Simon Riggs

On Tue, May 3, 2011 at 3:16 PM, Mark Johnson
 wrote:

> The contents of this email may not be copied or forwarded in part or in
> whole without the express written consent of the author.

Pleased to meet you Mark.

If you post here, the above disclaimer is not effective. Right now
your words are being copied across the internet...

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] [ADMIN] Can we Flush the Postgres Shared Memory ?

2011-05-03 Thread Mark Johnson

Is there a particular one of Oracle's memory clearning features you want to use 
in PostgreSQL? In Oracle you cannot flush the entire SGA without a restart, but 
you can flush three parts of the SGA using three separate commands.
1. In Oracle you can flush the redo buffer by issuing a COMMIT or by rotating 
the logs. You can force a log switch in PostgreSQL using "select 
pg_switch_xlog();".
2. In Oracle when you flush the shared pool this does three things: (a) removes 
sql and pl/sql statements from the shared library cache, (b) flushes the 
dictionary cache of object info and security data, and (c) flushes the query 
result cache (11g only). I am relatively new to PostgreSQL and have not seen an 
equivalent in PostgreSQL to these things. Based on other replies it does not 
seem possible to flush the catalog cache in PostgreSQL.
3. In Oracle when you request a flush of the buffer cache it signals a 
checkpoint to ensure all dirty buffers are written out AND later it will remove 
the dirty buffers from memory. This can take anywhere from a few seconds on 
very small systems to several minutes on VLDB systems, per my observations. The 
Oracle checkpoint is fast, and the SQL prompt comes back very fast, but the 
removal of dirty buffers from memory runs in the background with a low priority 
over a long period of time. If you are planning to use "alter system flush 
buffer_cache" to clear memory in between tests you actually have no way to know 
when memory is clear except to wait a long time and then assume all is well 
(yes, this is also true with ASM and direct i/o to raw devices). In PostgreSQL, 
you can checkpoint manually to signal bgwriter to flush dirty pages to the 
operating system's cache and from there you will see a lazy write to disk 
(e.g., watch pdflush on linux), so immediately re-running a query will still 
get some caching benefits eventhough the checkpoint is complete. There are 
operating system commands that you could use for that ("cat /proc/meminfo" to 
see what's there, "sync" to write dirty pages to disk, then "echo 3 > 
/proc/sys/vm/drop_caches" to remove the now clean pages, and then "cat 
/proc/meminfo" one more time). And, if you are using SAN consider array based 
caching as well.
Sincerely,
Mark R. Johnson
Owner, Remington Database Solutions, LLC
Author, Oracle Database 10g: From Nuts to Soup

The contents of this email may not be copied or forwarded in part or in whole 
without the express written consent of the author.

-Original Message-
From: Raghavendra [mailto:raghavendra@enterprisedb.com]
Sent: Tuesday, May 3, 2011 08:31 AM
To: 'Simon Riggs'
Cc: 'raghu ram', 'pgsql-admin', 'pgsql-general', 'pgsql novice'
Subject: Re: [ADMIN] Can we Flush the Postgres Shared Memory ?

On Tue, May 3, 2011 at 5:37 PM, Simon Riggs  wrote:
On Tue, May 3, 2011 at 11:54 AM, raghu ram  wrote:

> It may be a silly question, still out of curiosity I want to know, is there
> any possible way to flush the Postgres Shared Memory without restarting the
> cluster.
> In Oracle, we can flush the SGA, can we get the same feature here..
> Thanks in Advance.

The CHECKPOINT command will do this for you.

This command will empty the PSM...

Best Regards,
Raghavendra
EnterpriseDB Corporation
Blog: http://raghavt.blogspot.com/

Re: [GENERAL] [ADMIN] Can we Flush the Postgres Shared Memory ?

2011-05-03 Thread Karsten Hilbert

On Tue, May 03, 2011 at 03:33:34PM +0100, Simon Riggs wrote:

> > The contents of this email may not be copied or forwarded in part or in
> > whole without the express written consent of the author.
> 
> Pleased to meet you Mark.
> 
> If you post here, the above disclaimer is not effective. Right now
> your words are being copied across the internet...

By typing / selecting a public list address "written consent
of the author" can be assumed to exist implicitely ;-)

Karsten
-- 
GPG key ID E4071346 @ gpg-keyserver.de
E167 67FD A291 2BEA 73BD  4537 78B9 A9F9 E407 1346

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] [ADMIN] Can we Flush the Postgres Shared Memory ?

2011-05-03 Thread Simon Riggs

On Tue, May 3, 2011 at 2:30 PM, raghu ram  wrote:

>>> The CHECKPOINT command will do this for you.
>>
>>
>
> According to PostgreSQL documentation, whenever you execute "CHECKPOINT" in
> the database,it will flush the modified data files presented in the Shared
> Buffers retuned to the Disk.
>            http://www.postgresql.org/docs/9.0/static/sql-checkpoint.html
> Is this clears the entire shared memory cache and same time,if i execute
> fresh SQL statement, Data will be retuned from disk??

No, but then you'd need to flush OS buffers and all disk caches as
well to make that effective.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] [ADMIN] Can we Flush the Postgres Shared Memory ?

2011-05-03 Thread Tom Lane

Karsten Hilbert  writes:
> On Tue, May 03, 2011 at 03:33:34PM +0100, Simon Riggs wrote:
>> If you post here, the above disclaimer is not effective. Right now
>> your words are being copied across the internet...

> By typing / selecting a public list address "written consent
> of the author" can be assumed to exist implicitely ;-)

Nonetheless, corporate lawyers who insist on such disclaimers on all
email are idiots, and make their company's employees look like idiots
as well.  Every disclaimer on obviously-public mail hastens the day
when such disclaimers will have no legal force whatsoever (if indeed
there's any left to them now).

regards, tom lane

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] [ADMIN] Can we Flush the Postgres Shared Memory ?

2011-05-03 Thread Raymond O'Donnell


On 03/05/2011 16:08, Tom Lane wrote:

Karsten Hilbert  writes:

On Tue, May 03, 2011 at 03:33:34PM +0100, Simon Riggs wrote:

If you post here, the above disclaimer is not effective. Right now
your words are being copied across the internet...



By typing / selecting a public list address "written consent
of the author" can be assumed to exist implicitely ;-)


Nonetheless, corporate lawyers who insist on such disclaimers on all
email are idiots, and make their company's employees look like idiots
as well.  Every disclaimer on obviously-public mail hastens the day
when such disclaimers will have no legal force whatsoever (if indeed
there's any left to them now).


I don't want to start a flame war, but did they every have any legal 
force in the first place?


Ray.


--
Raymond O'Donnell :: Galway :: Ireland
r...@iol.ie

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] [ADMIN] Can we Flush the Postgres Shared Memory ?

2011-05-03 Thread Peter Geoghegan

On 3 May 2011 16:49, Raymond O'Donnell  wrote:
> I don't want to start a flame war, but did they every have any legal force
> in the first place?

No.

-- 
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] [ADMIN] Can we Flush the Postgres Shared Memory ?

2011-05-03 Thread Mark Johnson

Yes, understood and agreed. My mail server adds it automatically. I can 
manually remove it prior to sending to the mail list.
-Mark
-Original Message-
From: Simon Riggs [mailto:si...@2ndquadrant.com]
Sent: Tuesday, May 3, 2011 10:33 AM
To: 'Mark Johnson'
Cc: 'pgsql-admin', 'pgsql-general'
Subject: Re: [ADMIN] Can we Flush the Postgres Shared Memory ?

On Tue, May 3, 2011 at 3:16 PM, Mark Johnson wrote: > The contents of this 
email may not be copied or forwarded in part or in > whole without the express 
written consent of the author. Pleased to meet you Mark. If you post here, the 
above disclaimer is not effective. Right now your words are being copied across 
the internet... -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL 
Development, 24x7 Support, Training & Services -- Sent via pgsql-admin mailing 
list (pgsql-ad...@postgresql.org) To make changes to your subscription: 
http://www.postgresql.org/mailpref/pgsql-admin

Re: [GENERAL] Switching Database Engines

2011-05-03 Thread Carlos Mennens

On Sat, Apr 30, 2011 at 4:29 AM, Greg Smith  wrote:
> I wouldn't fight with this too much though.  Unless you have some really
> customized stuff in your wiki, there really is nothing wrong with the idea
> of dumping everything into XML, creating a blank PostgreSQL-backed MediaWiki
> install, then restoring into that.  That's what I always do in order to get
> a plain text backup of my server, and to migrate a wiki from one server to
> another.  There are all kinds of issues you could have left here before this
> works, trying to do a database-level export/reload--encoding, foreign key
> problems, who knows what else.  The database-agnostic export/import into XML
> avoids all of those.

Greg,

I'm with you and think that just doing an XML dump of the Wiki itself
is the best way to go. My question is when I do the XML dump as
follows:

/var/www/html/int/main/wiki/maintenance
[root@ideweb1 maintenance]# php dumpBackup.php --full > mw_wiki_2011_05_03.xml
PHP Warning:  PHP Startup: mcrypt: Unable to initialize module
Module compiled with module API=20050922, debug=0, thread-safety=0
PHPcompiled with module API=20060613, debug=0, thread-safety=0
[...]

So now I have a backup file of the Wiki on my old server running
MySQL. I have created the database and installed MediaWiki on the new
server using PostgreSQL as the backend. My question now is what are
your recommended steps in order to get the XML data imported on
MediaWiki using PostgreSQL? I know I also have to move the users since
the XML script / backup doesn't do anything in regards to the users.

Thanks again so much!

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] converting databases form SQL_ASCII to UTF8

2011-05-03 Thread Jasen Betts

On 2011-04-22, Geoffrey Myers  wrote:
> Vick Khera wrote:
>> On Fri, Apr 22, 2011 at 11:00 AM, Geoffrey Myers 
>> mailto:li...@serioustechnology.com>> wrote:
>> 
>> Here's our problem.  We planned on moving databases a few at a time.
>> Problem is, there is a process that pushes data from one database to
>> another.  If this process attempts to push data from a SQL_ASCII
>> database to a new UTF8 database and it has one of these characters
>> mentioned above, the process fails.
>> 
>> 
>> The database's enforcement of the encoding should be the last layer that 
>> does so.  Your applications should be enforcing strict utf-8 encoding 
>> from start to finish.  Once this is done, and the old data already in 
>> the DB is properly encoded as utf-8, then there should be no problems 
>> switching on the utf-8 encoding in postgres to get that final layer of 
>> verification.
>
> Totally agree.  Still, the question remains, why not leave it as SQL_ASCII?

perhaps you want sorted output in some locale other than 'C'?
or maybe want to take a substring in the database...

utf8 in SQL-ASCII is just a string of octets

utf8 in a utf8 database is a string of unicode characters.

-- 
⚂⚃ 100% natural


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] Database recovery.

2011-05-03 Thread Jasen Betts

On 2011-03-24, Waqar Azeem  wrote:
> --0015174766a0ffbf86049f35206e
> Content-Type: text/plain; charset=ISO-8859-1
>
> My XP is crashed and now I have to take a full backup of my postgresql 8.4
>
> I am used to get backup of ldf/mdf files in case of SQLServer
>
> Please let me know the right way of doing this for postgresql 8.4.

You need the data directory, and the service user "postgres" needs to
own it's contents.   consider using runas and xcopy.

-- 
⚂⚃ 100% natural


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] Bidirectional replication

2011-05-03 Thread John R Pierce


On 05/03/11 5:04 AM, tushar nehete wrote:
I started with Bucardo. I installed activeperl 5.12 on my 
Linux(RHEL5.5) server.


why ActivePerl, which is usually used by MS Windows users, rather than 
the Perl built into RHEL 5.5 (btw, 5.6 is out now, you really should run 
'yum update').




--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] converting databases form SQL_ASCII to UTF8

2011-05-03 Thread Geoffrey Myers


Jasen Betts wrote:

On 2011-04-22, Geoffrey Myers  wrote:

Vick Khera wrote:
On Fri, Apr 22, 2011 at 11:00 AM, Geoffrey Myers 
mailto:li...@serioustechnology.com>> wrote:


Here's our problem.  We planned on moving databases a few at a time.
Problem is, there is a process that pushes data from one database to
another.  If this process attempts to push data from a SQL_ASCII
database to a new UTF8 database and it has one of these characters
mentioned above, the process fails.


The database's enforcement of the encoding should be the last layer that 
does so.  Your applications should be enforcing strict utf-8 encoding 
from start to finish.  Once this is done, and the old data already in 
the DB is properly encoded as utf-8, then there should be no problems 
switching on the utf-8 encoding in postgres to get that final layer of 
verification.

Totally agree.  Still, the question remains, why not leave it as SQL_ASCII?


perhaps you want sorted output in some locale other than 'C'?
or maybe want to take a substring in the database...

utf8 in SQL-ASCII is just a string of octets

utf8 in a utf8 database is a string of unicode characters.



We finally have a solution in place. A bug in my code was making the 
problem bigger then it really is.  Gotta love those bugs.



--
Until later, Geoffrey

"I predict future happiness for America if they can prevent
the government from wasting the labors of the people under
the pretense of taking care of them."
- Thomas Jefferson

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] [ADMIN] Can we Flush the Postgres Shared Memory ?

2011-05-03 Thread Joshua D. Drake


On 5/3/2011 7:33 AM, Simon Riggs wrote:


Pleased to meet you Mark.

If you post here, the above disclaimer is not effective. Right now
your words are being copied across the internet...



I believe our community needs to move past posting replies like this. It 
isn't even relevant to the context of his question and makes us look 
like a bunch of ideological buffoons.


JD

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] pervasiveness of surrogate (also called synthetic) keys

2011-05-03 Thread Jeff Davis

On Mon, 2011-05-02 at 20:06 -0600, Rob Sargent wrote:
> Jeff Davis wrote:
> > In particular, I think you are falsely assuming that a natural key must
> > be generated from an outside source (or some source outside of your
> > control), and is therefore not reliably unique.
> >
> > You can generate your own keys...

...

> My wife works (at the sql level) with shall we say "records about 
> people".  Real records, real people.  Somewhere around 2 million unique 
> individuals, several million source records.  They don't all have ssn, 
> they don't all have a drivers license.  They don't all have an address, 
> many have several addresses (especially over time) and separate people 
> have at one time or another lived at the same address.  You would be 
> surprise how many "bob smith"s where born on the same day.  But then 
> they weren't all born in a hospital etc etc etc.  A person may present 
> on any of a birth record, a death record, a hospital record, a drivers 
> license, a medical registry, a marriage record and so on.  There simply 
> is no natural key for a human.  We won't even worry about the 
> non-uniqueness of ssn. And please don't get her started on twins. :) 
> 
> 
> I can only imagine that other equally complex entities are just as 
> slippery when it comes time to pinpoint the natural key.

I think you missed my point. You don't have to rely on natural keys that
come from somewhere else; you can make up your own, truly unique
identifier.

Regards,
Jeff Davis


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] pervasiveness of surrogate (also called synthetic) keys

2011-05-03 Thread Rob Sargent




On 05/03/2011 12:51 PM, Jeff Davis wrote:

On Mon, 2011-05-02 at 20:06 -0600, Rob Sargent wrote:

Jeff Davis wrote:

In particular, I think you are falsely assuming that a natural key must
be generated from an outside source (or some source outside of your
control), and is therefore not reliably unique.

You can generate your own keys...


...


My wife works (at the sql level) with shall we say "records about
people".  Real records, real people.  Somewhere around 2 million unique
individuals, several million source records.  They don't all have ssn,
they don't all have a drivers license.  They don't all have an address,
many have several addresses (especially over time) and separate people
have at one time or another lived at the same address.  You would be
surprise how many "bob smith"s where born on the same day.  But then
they weren't all born in a hospital etc etc etc.  A person may present
on any of a birth record, a death record, a hospital record, a drivers
license, a medical registry, a marriage record and so on.  There simply
is no natural key for a human.  We won't even worry about the
non-uniqueness of ssn. And please don't get her started on twins. :)


I can only imagine that other equally complex entities are just as
slippery when it comes time to pinpoint the natural key.


I think you missed my point. You don't have to rely on natural keys that
come from somewhere else; you can make up your own, truly unique
identifier.

Regards,
Jeff Davis

Sorry, but I'm confused, but that's common.  Isn't a "natural key" to be 
compose solely from the attributes of the entity?  As in a subset of the 
columns of the table in a third-normalish world. Isn't tacking on 
another column with a concocted id joining the "pervassiveness"?


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] pervasiveness of surrogate (also called synthetic) keys

2011-05-03 Thread Jeff Davis

On Tue, 2011-05-03 at 13:35 -0600, Rob Sargent wrote:
> Sorry, but I'm confused, but that's common.  Isn't a "natural key" to be 
> compose solely from the attributes of the entity?  As in a subset of the 
> columns of the table in a third-normalish world. Isn't tacking on 
> another column with a concocted id joining the "pervassiveness"?

Not in my opinion. Before cars existed, there was no driver's license
number. The DMV (as it's called in California, anyway) created it, and
it's now a key that they can trust to be unique. It's also an attribute
of the entity now, because it's printed on the cards you hand to people.

The thing that I think is a mistake is to use generated IDs like an
internal implementation detail (i.e. hide them like pointers); then at
the same time mix them into the data model.

Regards,
Jeff Davis

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] postgres segfaulting on pg_restore

2011-05-03 Thread Tom Lane

Chris Curvey  writes:
> and, FWIW, here's another trace, which is NEARLY the same as the first one I
> posted, with the difference being a slightly different line number at #3.  I
> will be quiet now and leave the brain trust to ponder.  Let me know if there
> is anything else I can get for you guys.

> #0  0x006ce317 in GetMemoryChunkSpace (pointer=0x292a0e0) at
> mcxt.c:264
> #1  0x006d3d56 in writetup_index (state=0x2281670, tapenum= optimized out>, stup=) at tuplesort.c:2924

It occurred to me that a simple explanation for a core dump there would
be if something had scribbled past the end of the preceding palloc
chunk.  That would tend to clobber the "context" link of the palloc
chunk after it, which would send GetMemoryChunkSpace off into
never-never land following a trashed pointer.

That doesn't get us very much closer to a solution, but it does suggest
that you might learn something if you try this with an assert-enabled
build (configure --enable-cassert).  Also, are there any contrib modules
or third-party add-on modules or home-brew C functions that you're using?

regards, tom lane

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] pervasiveness of surrogate (also called synthetic) keys

2011-05-03 Thread Jeff Davis

On Mon, 2011-05-02 at 23:07 -0400, Greg Smith wrote:
> I see this whole area as being similar to SQL injection.  The same way 
> that you just can't trust data input by the user to ever be secure, you 
> can't trust inputs to your database will ever be unique in the way you 
> expect them to be. 

So, don't trust them to be unique then. Make up your own unique
identifier, and use that.

Regards,
Jeff Davis


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

[GENERAL] Question on Wal time lines

2011-05-03 Thread dabicho

For restoring a database from wal files, if I omit a target on the
recovery.conf file, can I make it so the database continues the time line
instead of starting one?
Or is there a tool to pick the most recent time line from a bunch of wal
files?

thankyou.

Re: [GENERAL] Question on Wal time lines

2011-05-03 Thread John R Pierce


On 05/03/11 3:07 PM, dabicho wrote:


For restoring a database from wal files, if I omit a target on the 
recovery.conf file, can I make it so the database continues the time 
line instead of starting one?
Or is there a tool to pick the most recent time line from a bunch of 
wal files?


thankyou.



you need to playback all the wal files from when you started the base 
backup.  of course, you need that base backup, too.






--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] Question on Wal time lines

2011-05-03 Thread dabicho

El may 3, 2011 5:59 p.m., "John R Pierce"  escribió:
>
> On 05/03/11 3:07 PM, dabicho wrote:
>>
>>
>> For restoring a database from wal files, if I omit a target on the
recovery.conf file, can I make it so the database continues the time line
instead of starting one?
>> Or is there a tool to pick the most recent time line from a bunch of wal
files?
>>
>> thankyou.
>>
>
> you need to playback all the wal files from when you started the base
backup.  of course, you need that base backup, too.
>
I did that.
I restored the database, put the recovery file in place along with previous
wal files and the last wall files, and after start up there was a new time
line.
Am I missing something?
This is postgres 9.0
>
>
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] pervasiveness of surrogate (also called synthetic) keys

2011-05-03 Thread Rob Sargent




On 05/03/2011 03:08 PM, Jeff Davis wrote:

On Tue, 2011-05-03 at 13:35 -0600, Rob Sargent wrote:

Sorry, but I'm confused, but that's common.  Isn't a "natural key" to be
compose solely from the attributes of the entity?  As in a subset of the
columns of the table in a third-normalish world. Isn't tacking on
another column with a concocted id joining the "pervassiveness"?


Not in my opinion. Before cars existed, there was no driver's license
number. The DMV (as it's called in California, anyway) created it, and
it's now a key that they can trust to be unique. It's also an attribute
of the entity now, because it's printed on the cards you hand to people.

The thing that I think is a mistake is to use generated IDs like an
internal implementation detail (i.e. hide them like pointers); then at
the same time mix them into the data model.

Regards,
Jeff Davis



Well yes it does all depend on how you model things after all. I think a 
drivers license is and attribute of driver not person. So before cars, 
one still had a hard time coming up with a natural key on person.  Of 
course California's DMV only cares about Californian licenced drivers, 
so they get to generate and assign license number as an arbitary key for 
drivers 'cause under that we're back to person.


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] pervasiveness of surrogate (also called synthetic) keys

2011-05-03 Thread Greg Smith


Merlin Moncure wrote:

If your data modeler that made the the assumptions that a MAC is
unique (a mistake obviously) at least the other tables are protected
from violations of that assumption because the database would reject
them with an error, which is a Good Thing.  Without a uniqueness
constraint you now have ambiguous data which is a Very Bad Thing.
  


With a uniqueness constraint in this situation, the unexpected data--row 
with a non unique MAC--will be rejected and possibly lost when the 
insertion happens.  You say that's a good thing, plenty of people will 
say that's the worst possible thing that can happen.  When dealing with 
external data, it's often impossible to know everything you're going to 
see later at design time.  Approaching that problem with the idea that 
you're going to lose any data that doesn't fit into the original model 
is not what everyone finds reasonable behavior.


I don't think it's possible to decide in a generic way which of these is 
the better approach:  to reject unexpected data and force the problem 
back at the application immediately (commit failure), or to accept with 
with because you're using a surrogate key and discover the problems down 
the line.  Both are valid approaches with a very different type of risk 
associated with them.  I think it's fair to say that real-world data is 
not always well known enough at design time to follow the idea you're 
suggesting though, and that does factor into why there is such a 
preference for surrogate keys in the industry.


--
Greg Smith   2ndQuadrant USg...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us
"PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] pervasiveness of surrogate (also called synthetic) keys

2011-05-03 Thread Greg Smith


Jeff Davis wrote:

On Mon, 2011-05-02 at 23:07 -0400, Greg Smith wrote:
  
I see this whole area as being similar to SQL injection.  The same way 
that you just can't trust data input by the user to ever be secure, you 
can't trust inputs to your database will ever be unique in the way you 
expect them to be. 



So, don't trust them to be unique then. Make up your own unique
identifier, and use that.
  


If you're making up your own unique identifier, that's closer to a 
surrogate key as far as I'm concerned, even though it doesn't fit the 
strict definition of that term (it doesn't have the subtle idea that 
"surrogate" implies "meaningless").  Now, there is some value to doing 
that well, instead of just using the typical incrementing integer 
"pointer" approach, as you've called it.  But if it's not derived from 
external data you're storing anyway, it's not a true natural key either.


--
Greg Smith   2ndQuadrant USg...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us
"PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] pervasiveness of surrogate (also called synthetic) keys

2011-05-03 Thread David Johnston

On May 3, 2011, at 22:03, Greg Smith  wrote:

> Merlin Moncure wrote:
>> If your data modeler that made the the assumptions that a MAC is
>> unique (a mistake obviously) at least the other tables are protected
>> from violations of that assumption because the database would reject
>> them with an error, which is a Good Thing.  Without a uniqueness
>> constraint you now have ambiguous data which is a Very Bad Thing.
>>  
> 
> With a uniqueness constraint in this situation, the unexpected data--row with 
> a non unique MAC--will be rejected and possibly lost when the insertion 
> happens.  You say that's a good thing, plenty of people will say that's the 
> worst possible thing that can happen.  When dealing with external data, it's 
> often impossible to know everything you're going to see later at design time. 
>  Approaching that problem with the idea that you're going to lose any data 
> that doesn't fit into the original model is not what everyone finds 
> reasonable behavior.
> 
> I don't think it's possible to decide in a generic way which of these is the 
> better approach:  to reject unexpected data and force the problem back at the 
> application immediately (commit failure), or to accept with with because 
> you're using a surrogate key and discover the problems down the line.  Both 
> are valid approaches with a very different type of risk associated with them. 
>  I think it's fair to say that real-world data is not always well known 
> enough at design time to follow the idea you're suggesting though, and that 
> does factor into why there is such a preference for surrogate keys in the 
> industry.
> 
> -- 
> Greg Smith   2ndQuadrant USg...@2ndquadrant.com   Baltimore, MD
> PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us
> "PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books
> 
> 
> -- 
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general

If you implicitly expect MAC to be unique but do not enforce it then you are 
likely to write queries that rely upon that uniqueness.  If you do enforce it 
then the constraint failure occurs anyway.

A scalar sub-query that links via the MAC will fail when the duplicate data is 
encountered, and normal queries will return too-many records.  A detail report 
may be obvious but if you are summarizing the data the specific offending 
record is going to require some effort to find.

I guess if you are the compromising type you can loosely enforce the uniqueness 
by running a check query periodically to see if supposedly unique values have 
been duplicated.

I agree there is no right answer - the designer needs to make trade-offs - but 
I'd rather reject new data and leave the system in a status-quo stable state 
instead of introducing invalid data and putting the system into a state where 
it requires effort to get it functioning again.  If you accept the invalid data 
the likely scenario, if something breaks, is someone finds the offending record 
and removes it until the application and database can be fixed properly - which 
is where we are at with validation.  The common exception is where identifiers 
are reused over time and you remove the old record in order to keep/allow the 
newer record to remain.

On a tangential course I've started considering is a setup whereby you 
basically have two identifiers for a record.  One is end-user facing and 
updatable whereas the other is static and used in intra-table relations.  You 
can create a new record with the same user-facing id as an existing Id but the 
existing Id will be replaced with its system id.  This is useful when users 
will be using the Id often and it can be reasonably assumed to be unique over a 
moderate period of time (say a year).  Invoice numbers, customer numbers are 
two common examples.  The lookup Id itself may require additional fields in 
order to qualify as a primary (natural) key but the static key wants to be a 
single field.  Often simply putting a date with the original id (and parent 
identifiers) is sufficient due to the infrequency of updates.  The downside is, 
with string-based parent identifiers the pk value can be quite long.  I 
currently have PKs of 40-50 length but during my new design my first pass on a 
couple of tables indicated >100 characters limit.

Is there any rules-of-thumb on the performance of a PK as a function of key 
length?  I like using varchar based identifiers since I tend to query tables 
directly and writing where clauses is much easier if you can avoid the joins.  
I'm likely better off creating views and querying those but am still curious on 
any basic thoughts on having a 100+ length primary key.

David J.

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

[GENERAL] "full_page_writes" makes no difference?

2011-05-03 Thread Tian Luo


Hi guys,

No matter I turn on or turn off the "full_page_writes", I always
observe 8192-byte writes of log data for simple write operations
(write/update).

But according to the document, when this is off, it could speed up
operations but may cause problems during recovery. So, I guess this is
because it writes less when the option is turned off. However, this
contradicts my observations 

If I am not missing anything, I find that the writes of log data go
through function "XLogWrite" in source file
"backend/access/transam/xlog.c".

In this file, log data are written with the following code:

from = XLogCtl->pages + startidx * (Size) XLOG_BLCKSZ;
nbytes = npages * (Size) XLOG_BLCKSZ;
if (write(openLogFile, from, nbytes) != nbytes)
{
 ...
}

So, "nbytes" should always be multiples of XLOG_BLCKSZ, which in the
default case, is 8192.

My question is, if it always writes full pages no matter
"full_page_writes" is on or off, what is the difference?

Thanks!

Regards,
- Tian


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

[GENERAL] Needs Suggestion

2011-05-03 Thread SUBHAM ROY

My output of explain (analyze,buffers) is something like this:

shared hit=3796624 read=46038

So what is meant by *read* here? Does it indicates number of disk reads?

Does *shared hit* takes into account only the hit in the pg_buffercache of
the postgres or it also takes into account the *linux buffers* that postgres
uses?

-- 
Thank You,
Subham Roy,
CSE IIT Bombay.

Re: [GENERAL] Needs Suggestion

2011-05-03 Thread Andreas Kretschmer

SUBHAM ROY  wrote:

> My output of explain (analyze,buffers) is something like this:
> 
> shared hit=3796624 read=46038
> 
> So what is meant by read here? Does it indicates number of disk reads?
> 
> Does shared hit takes into account only the hit in the pg_buffercache of the
> postgres or it also takes into account the linux buffers that postgres uses?

shared hit: read from shared mem
shared read: read from disk into shared mem


Andreas
-- 
Really, I'm not out to destroy Microsoft. That will just be a completely
unintentional side effect.  (Linus Torvalds)
"If I was god, I would recompile penguin with --enable-fly."   (unknown)
Kaufbach, Saxony, Germany, Europe.  N 51.05082°, E 13.56889°

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

[GENERAL] xlog min recovery request A/C is past current point A/B

2011-05-03 Thread Henry C.

Greets,

I've just activated another replication slave and noticed the following in the
logs:

WARNING:  xlog min recovery request 38E/E372ED60 is past current point
38E/D970

It seems to be happily restoring log files from the archive, but the warning
message above concerns me.

Googling only yields source code -
http://doxygen.postgresql.org/xlog_8c-source.html, which has an ominous
comment:

/* ...the passed
 *  'lsn' value could be bogus, i.e., past the end of available WAL, if
 *  the caller got it from a corrupted heap page.  Accepting such a
 *  value as the min recovery point would prevent us from coming up at
 *  all.  Instead, we just log a warning and continue with recovery.
 *  (See also the comments about corrupt LSNs in XLogFlush.)
 */


Is it something to be concerned about (and should I restart initial rsync &
replication), or can I safely ignore it?

Thanks


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] Needs Suggestion

2011-05-03 Thread Guillaume Lelarge

On 05/04/2011 08:39 AM, Andreas Kretschmer wrote:
> SUBHAM ROY  wrote:
> 
>> My output of explain (analyze,buffers) is something like this:
>>
>> shared hit=3796624 read=46038
>>
>> So what is meant by read here? Does it indicates number of disk reads?
>>
>> Does shared hit takes into account only the hit in the pg_buffercache of the
>> postgres or it also takes into account the linux buffers that postgres uses?
> 
> shared hit: read from shared mem
> shared read: read from disk into shared mem
> 

To be more precise:

shared hit

  number of blocks directly read from PostgreSQL shared memory

read

  number of blocks PostgreSQL asked the OS to give him (then the OS may
  read the blocks from its own shared memory or may ask the disk
  subsystem to give him (and then the disk subsystem may read it from
  its own shared memory (SAN cache for example) or read it from the
  disk))


To put it more simply:

shared hit:
  number of blocks directly read from PostgreSQL shared memory

read:
  number of blocks read outside of PostgreSQL shared memory (meaning it
  may need disk read)


-- 
Guillaume
 http://www.postgresql.fr
 http://dalibo.com

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

49 matches

Mail list logo