[GENERAL] JSON vs Text + Regexp Index Searching

2014-02-25 Thread Eliot Gable
I am working on a project for which I require some assistance. Any input
would be appreciated.

We have a table with millions of records and dozens of columns. On some
systems, it takes up >10GB of disk space, and it runs on a single disk
which is rather slow (7200 RPM). We typically don't do any complicated
queries on it, so performance has not posed a problem previously. The table
is actually defined as 3 tables using inheritance. One table shows only the
last 7 days of data, and then a process moves the 8th day into the history
table for archiving. For performance reasons, we typically only query the
table with 7 days of data, which generally does not exceed 1 million rows.
In fact, it is typically under 50k rows for most systems. However, a few
systems may contain a few million records, at most. There are indices on
each of the fields we query, and we also put limits on the number of
records returned. The table is practically an append-only table.

We are looking at adding some additional columns to the table, and planning
for some future features. One of those features will allow cross
referencing the records in this table with a list of IDs. If the record's
list of IDs contains the one we are looking for, we want to include the
record in our query. I advocated creating a separate mapping table which
maps the ID of these records to the other ID we are searching for and
performing a JOIN on the two tables with appropriate foreign key
relationships and indices. However, I was asked to instead put the list
into a single column on each row to reduce implementation complexity.

Assuming the list of IDs is in a column on each row as TEXT in the format
of a JSON array, what is the best way to index the column so I can quickly
find the rows with the given ID?


Limitations:

We run version 9.0.x on these systems right now. We plan on updating later
this year, but for now we don't have the JSON type. Once we do, we plan to
make use of it. The column value will also be passed to the UI, which is
JavaScript-based, so a JSON format would be ideal.

We cannot modify the disks we run on to make them faster. It is cost
prohibitive in our particular scenario.

The systems in question are real-time systems, and are carefully optimized.
Any excessively lengthy queries which have high disk IO or CPU usage will
adversely affect the entire system.


My Thoughts:

Is it possible to use a REGEXP INDEX in this case? For example: column ~
'^\[67[,\]]|,67,|,67\]$'
Will such an index perform well? How would I create it?

How badly would a simple LIKE operation perform? Is there any way to
optimize it?

Would using a JSON type in the future allow us to do any better type of
index on the column?

If we wanted to put an index on a JSON data type column whose value is a
simple array of IDs, what would the index look like, and how would we
construct the query to make use of it?


Thanks in advance for any input / insight on this.


Re: [GENERAL] JSON vs Text + Regexp Index Searching

2014-02-25 Thread Eliot Gable
On Tue, Feb 25, 2014 at 11:17 AM, David Johnston  wrote:

> David Johnston wrote
> >
> > Eliot Gable-4 wrote
> >> I advocated creating a separate mapping table which
> >> maps the ID of these records to the other ID we are searching for and
> >> performing a JOIN on the two tables with appropriate foreign key
> >> relationships and indices. However, I was ask to instead put the list
> >> into
> >> a single column on each row to reduce implementation complexity.
> >>
> >> Assuming the list of IDs is in a column on each row as TEXT in the
> format
> >> of a JSON array, what is the best way to index the column so I can
> >> quickly
> >> find the rows with the given ID?
> > I recommend benchmarking two implementations:
> >
> > 1) id_xref integer[] --on the same table, use "search_id = ANY(id_xref)"
> > as the WHERE condition
> >
> > [...]
>
> #1 can be greatly expanded in usefulness by making use of the "intarray"
> contrib/extension; as Merlin mentioned up-thread.
>
> David J.
>
>

Thank you both for the responses. I will benchmark the options you
suggested.
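
For the archives, here is how I read option #1 with the intarray index
added (a sketch; the table and column names are made up, and since we are
on 9.0 the contrib module would be loaded from its SQL script rather than
via CREATE EXTENSION):

-- 9.1+ syntax; on 9.0, run the intarray contrib SQL script instead
CREATE EXTENSION intarray;

ALTER TABLE records ADD COLUMN id_xref integer[];

-- GIN index using the intarray operator class
CREATE INDEX records_id_xref_gin ON records USING gin (id_xref gin__int_ops);

-- This form can use the GIN index:
SELECT * FROM records WHERE id_xref @> ARRAY[67];

-- This form works without intarray, but will not use the index:
SELECT * FROM records WHERE 67 = ANY (id_xref);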


Re: [GENERAL] JSON vs Text + Regexp Index Searching

2014-02-25 Thread Eliot Gable
On Tue, Feb 25, 2014 at 3:13 PM, Eliot Gable  wrote:

> On Tue, Feb 25, 2014 at 11:17 AM, David Johnston  wrote:
>
>> David Johnston wrote
>> >
>> > Eliot Gable-4 wrote
>> >> I advocated creating a separate mapping table which
>> >> maps the ID of these records to the other ID we are searching for and
>> >> performing a JOIN on the two tables with appropriate foreign key
>> >> relationships and indices. However, I was ask to instead put the list
>> >> into
>> >> a single column on each row to reduce implementation complexity.
>> >>
>> >> Assuming the list of IDs is in a column on each row as TEXT in the
>> format
>> >> of a JSON array, what is the best way to index the column so I can
>> >> quickly
>> >> find the rows with the given ID?
>> > I recommend benchmarking two implementations:
>> >
>> > 1) id_xref integer[] --on the same table, use "search_id = ANY(id_xref)"
>> > as the WHERE condition
>> >
>> > [...]
>>
>> #1 can be greatly expanded in usefulness by making use of the "intarray"
>> contrib/extension; as Merlin mentioned up-thread.
>>
>> David J.
>>
>>
>
> Thank you both for the responses. I will benchmark the options you
> suggested.
>
>

Is there some way to make intarray work with int8 or some int8 variation of
intarray? The data type of the IDs is BIGINT.


[GENERAL] Table Vacuum Taking a Long Time

2014-04-02 Thread Eliot Gable
I have a table which is about 12 GB in size. It has had a vacuum full
analyze run on it, and then immediately after, I run vacuum analyze and it
takes about 90 seconds to complete. If I continue to run vacuum analyze on
it, it continues to take about 90 seconds each time. This system has a
single 7200 RPM drive in it, so it's not a very fast drive. I was under the
impression that if nothing had been done to the table since it was last
vacuumed, that it would return immediately. Further, this is an append-only
table, so why should it need to be vacuumed at all? We ran into cases where
after writing to it long enough, the PGSQL autovacuum process would kick in
and force a vacuum saying something about preventing wrap around. I don't
understand why it would do this if it is append-only and we are using
64-bit sequences as IDs without OIDs turned on. What would be wrapping
around without a vacuum? We tried to mitigate this by manually running
vacuum programmatically, but then we end up using all the disk IO just
running vacuum all the time, because it is constantly running through the
entire table even though very little (if anything) has been done to it
since the last vacuum.

Is this described behavior expected? If so, why?

If it is not expected, what should I be looking for which might explain why
it is taking so long?
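
For reference, this is roughly how I have been checking the table's vacuum
state between runs (a sketch; 'big_table' is a placeholder name):

-- How many dead tuples does the statistics collector think are there, and
-- when was the table last vacuumed?
SELECT relname, n_live_tup, n_dead_tup, last_vacuum, last_autovacuum
FROM pg_stat_user_tables
WHERE relname = 'big_table';

-- Per-pass detail of what vacuum actually scans and removes:
VACUUM VERBOSE big_table;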

Thanks.

-- 
Eliot Gable

"We do not inherit the Earth from our ancestors: we borrow it from our
children." ~David Brower

"I decided the words were too conservative for me. We're not borrowing from
our children, we're stealing from them--and it's not even considered to be
a crime." ~David Brower

"Esse oportet ut vivas, non vivere ut edas." (Thou shouldst eat to live;
not live to eat.) ~Marcus Tullius Cicero


[GENERAL] Disable autovacuum on specific tables

2012-10-24 Thread Eliot Gable
In general, autovacuum seems to work well on most of the tables I deal
with. However, in a couple of specific cases, it seems to fail miserably. I
would like to switch to manual vacuuming on those tables and disable
auto-vacuuming for those tables alone. Is this possible? I searched the
docs and the mailing list archives, but it seems any time someone asks,
their question is side-stepped with something along the lines of "you are
probably doing something wrong; let's look at your situation." It usually
turns out they are, or they simply don't understand what vacuuming is all
about in the first place, so the question is never answered. So, before we
go further, if someone could just answer, for the sake of future searches,
whether it is even possible, that would be great.

Here are the details about my specific reasoning, however, since I know you
are going to want to know:

I have a daemon which listens to events from a 3rd party system and keeps a
materialized view of the state of that 3rd party system in a small set of
tables. In certain use cases, this 3rd party system could generate as many
as 15,000 events per second which affect the materialized state of the
records in the tables.

The daemon must process these events and determine whether the event
inserts, updates, or deletes a record from the materialized view. It could
potentially do 50 - 100 inserts or deletes per second, or as many as 500
updates per second.

The materialized tables, if vacuumed properly, stay around 20 - 30 MB in
size. However, when usage spikes on the 3rd party system, there are so many
transactions going against these tables that the autovacuum process
constantly cancels itself so it doesn't block the pending transactions. A
usage spike can persist for a long time and become so intense that each
table grows to 300 - 500 MB.

The database resides on a network appliance which has only a single 7200
RPM disk. When the table size grows to several hundred MB, the system
becomes I/O bound due to the disk seeking around everywhere inside those
large tables. Manually vacuuming the tables resolves the I/O bound state
and the system behaves "normally" again.

While the state updates generally need to be real-time, we do have some
windows of opportunity in which a vacuum can occur. The thread which writes
to the database maintains a queue of events to process and it can vacuum
the tables in between event processing. If we delay event processing every
30 - 60 seconds for the 100ms - 2s needed to vacuum analyze the tables, it
does not cause any particular problem for the system. And if we can
guarantee that the vacuum analyze will succeed (because nothing is writing
to those tables at the time the vacuum analyze runs), then it is worth the
small delay in processing the events to ensure the table is not growing out
of proportion. Assuming the tables have not grown to over 100 MB each,
there is enough performance available at currently conceivable loads to
ensure that any delay of 10s or less in processing events will quickly be
caught up.

All of this makes the autovacuum process unnecessary for these tables. So,
is there some way I can disable automatic vacuuming on just these tables?

I know I can disable auto vacuum on all tables and switch to manually
vacuuming all tables, but that then requires that all developers working on
this system remember to add any new tables to the list of tables to vacuum,
and it requires additional code to decide when it is appropriate to vacuum
those tables. Autovacuum already does a great job on the rest of the
tables, so I do not see why I should disable it everywhere.
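
For what it's worth, the per-table switch I am hoping exists would look
something like the following (a sketch with a made-up table name; I believe
these storage parameters were added in 8.4, but confirmation is exactly
what I am asking for):

-- Turn autovacuum off for just this table...
ALTER TABLE live_state_table SET (autovacuum_enabled = false);

-- ...or keep it on but make it far less eager:
ALTER TABLE live_state_table SET (autovacuum_vacuum_threshold = 10000);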

Thanks in advance.


-- 
Eliot Gable

"We do not inherit the Earth from our ancestors: we borrow it from our
children." ~David Brower

"I decided the words were too conservative for me. We're not borrowing from
our children, we're stealing from them--and it's not even considered to be
a crime." ~David Brower

"Esse oportet ut vivas, non vivere ut edas." (Thou shouldst eat to live;
not live to eat.) ~Marcus Tullius Cicero


[GENERAL] Out of Shared Memory: max_locks_per_transaction

2012-11-09 Thread Eliot Gable
I have a PGSQL 9.0.1 database which is on the back-end of an app I was
stress testing last night. This morning, when I try to run psql, I get:

psql: FATAL:  out of shared memory
HINT:  You might need to increase max_locks_per_transaction.

I believe something in the app is failing to release resources -- maybe a
3-way or 4-way deadlock between writing to tables inside the triggers in
PGSQL or a deadlock between multiple processes talking to the app and the
database or something leaking in the app itself which is causing locks to
not be freed.

How do I track down what is going on if I cannot even run psql to get into
the DB to run troubleshooting queries?

Thanks in advance for any suggestions.


-- 
Eliot Gable

"We do not inherit the Earth from our ancestors: we borrow it from our
children." ~David Brower

"I decided the words were too conservative for me. We're not borrowing from
our children, we're stealing from them--and it's not even considered to be
a crime." ~David Brower

"Esse oportet ut vivas, non vivere ut edas." (Thou shouldst eat to live;
not live to eat.) ~Marcus Tullius Cicero


Re: [GENERAL] Out of Shared Memory: max_locks_per_transaction

2012-11-09 Thread Eliot Gable
> most likely possibility you have a transaction being left open and
> accumulating locks.   of course, you have to rule out the fact that
> you simply have to increase max_locks_per_transaction: if you have a
> lot of tables, it might be reasonable to have to extend this on a
> stock config.
>
>
We allow 500 connections to the DB with 64 locks per transaction = 32,000
locks. During the stress testing, I had roughly 40 simultaneous operations
going through the test application. The test application uses a separate
set of threads for accessing the database along with a shared connection
pool and a FIFO queue attached to each connection. Queries are mostly
insert, update, and delete, so they are batched into transactions in blocks
of 100 - 1000 queries per transaction. At the start of the transaction, a
stored procedure is called which acquires locks on 8 tables in a specific
order to prevent triggers on the associated tables from deadlocking with
each other and with other things accessing the database. In total, there
might be 15 tables (at most) touched by the batch of queries.

Another process comes along and processes records which are being inserted
into the database. It pulls up to 10 records from a table, processes them,
and moves those records into a "processed" table. The processing of the
records is rather complex. To facilitate the processing, 6 temporary tables
are created during the processing of each record, and then dropped after
that record is completed. 8 additional tables are accessed in some way
during the processing of each record. Each call to the processing stored
procedure is run in its own transaction and handles only those 10 records
at a time. This is done to keep the length of the transaction short so it
does not block other activity in the database.


> one thing that can cause this unfortunately is advisory locks eating
> up exactly the amount of shared memory you have.  that's another thing
> to rule out.
>

How would I rule this out?


>
> > How do I track down what is going on if I cannot even run psql to get
> into
> > the DB to run troubleshooting queries?
>
> it's a nasty problem.  if shared memory is exhausted and stuck i think
> the only practical thing to do is to restart the database or nuke
> attached clients. maybe try restarting the test, but keep an open
> session *with an open transaction* that has previously queried both
> pg_locks and pg_stat_activity.  it's just a hunch, but perhaps this
> might allow you to query said views and gather some details.
>
>
That is an interesting suggestion. I will definitely give that a try.

Is the pg_locks table the table I would query to see what is eating up
those 32,000 locks? Is there some other table or query I could run which
might provide additional information about those 32,000 locks and who /
what is using them?
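
In the meantime, this is the sort of summary I plan to pull from pg_locks
once I can catch it in the act (a sketch):

-- Which backends hold the most lock-table entries, and of what kind?
SELECT pid, locktype, mode, count(*) AS entries
FROM pg_locks
GROUP BY pid, locktype, mode
ORDER BY entries DESC;

-- Rule advisory locks in or out directly:
SELECT count(*) FROM pg_locks WHERE locktype = 'advisory';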

Thanks for the info and your suggestions!


Re: [GENERAL] Out of Shared Memory: max_locks_per_transaction

2012-11-09 Thread Eliot Gable
> Another process comes along and processes records which are being inserted
> into the database. It pulls up to 10 records from a table, processes them,
> and moves those records into a "processed" table. The processing of the
> records is rather complex. To facilitate the processing, 6 temporary tables
> are created during the processing of each record, and then dropped after
> that record is completed. 8 additional tables are accessed in some way
> during the processing of each record. Each call to the processing stored
> procedure is run in its own transaction and handles only those 10 records
> at a time. This is done to keep the length of the transaction short so it
> does not block other activity in the database.
>
>
>> one thing that can cause this unfortunately is advisory locks eating
>> up exactly the amount of shared memory you have.  that's another thing
>> to rule out.
>>
>
> How would I rule this out?
>
>

It really was filling the locks table.

Using your suggestion, I managed to catch it in the process of the bad
behavior, before it exhausted all lock entries. After some sleuthing
through the resulting pg_locks output and my other code, I was able to
isolate and resolve the issue. Basically, there was a call going on which
tried to materialize a stats table based on thousands of records instead of
10 at a time. It was supposed to just be materializing the base rows in
that table, all zeroed out, not based on any of the records. However, it
does so using the same function which actually crunches numbers for the
records, and it was coded to try all records from start of day until the
function was run!

Thanks for the assist.

-Eliot


[GENERAL] cannot assign non-composite value to a row variable

2010-06-04 Thread Eliot Gable
In order to avoid using a 'FOR ... LOOP array_append(); END LOOP;' method of
building an array (which is not at all efficient), I rewrote some of my code
to do things more effectively. One of the steps involves building two arrays
that are input to another stored procedure, but I am getting an error with
this step. Specifically, I have something like this:

create type complex1 as ( ... ); -- one integer member and about 16 text
members
create type complex2 as ( ... ); -- few integer members, about 10 text
members, and about 6 different enum members

CREATE OR REPLACE FUNCTION blah ...
...
DECLARE
  myvariable complex1[];
  mydatasource complex1;
  myrowsource complex2[];
...
BEGIN
...
  -- The first way I tried to do it:
  myvariable := array(
SELECT mydatasource FROM unnest(myrowsource)
  );
  -- The second way I tried to do it:
  myvariable := array(
SELECT (mydatasource)::complex1 FROM unnest(myrowsource)
  );
  -- The third way I tried to do it:
  myvariable := array(
SELECT (mydatasource.member1, mydatasource.member2, ...)::complex1 FROM
unnest(myrowsource)
  );
...
END ...

Each of these gives the same error message:

CONTEXT: ERROR
CODE: 42804
MESSAGE: cannot assign non-composite value to a row variable

This is pl/pgsql in 8.4.1. Does anybody have any insight on how I can get
around this issue? I'm not sure exactly what circumstances are involved in
this SELECT that is causing this error. I don't understand what is being
considered the row variable or what is being considered the non-composite
value. I get the error when the 'myrowsource' variable has no rows, as well
as when it has 2 rows.

Basically, all I want is to have myvariable be an array that has one 'row'
for each row in 'unnest(myrowsource)' with the value of each row being equal
to the 'mydatasource' contents. Maybe there is a better way to achieve that
which someone can point out?

Thanks for any assistance anyone can provide.

-- 
Eliot Gable

"We do not inherit the Earth from our ancestors: we borrow it from our
children." ~David Brower

"I decided the words were too conservative for me. We're not borrowing from
our children, we're stealing from them--and it's not even considered to be a
crime." ~David Brower

"Esse oportet ut vivas, non vivere ut edas." (Thou shouldst eat to live; not
live to eat.) ~Marcus Tullius Cicero


Re: [GENERAL] cannot assign non-composite value to a row variable

2010-06-04 Thread Eliot Gable
Thanks for the note on the bugfix in the update. I will try it. However,
there is no naming conflict. The idea was this:

The select query should return one result row for each row in the FROM
clause since there is no WHERE clause. Each result row should be the
contents of the complex1 data type contained by mydatasource. That set of
resulting rows should be converted to an array and assigned to myvariable.

On Fri, Jun 4, 2010 at 11:23 AM, Tom Lane  wrote:

> Alban Hertroys  writes:
> > I'm pretty sure you have a naming conflict.
>
> Yeah.  Specifically, the given example looks like it would try to assign
> a null to the target variable, since it'd be taking the null value of a
> different variable instead of a value from the intended source.
>
> I believe the bizarre error message is coming from a plpgsql bug that we
> fixed in 8.4.3, which basically was that assigning a null to a composite
> variable would fail in some cases.  If you weren't shooting yourself in
> the foot with naming conflicts, you might not trip over that case ...
> but an update to 8.4.recent wouldn't be a bad idea anyway.
>
>regards, tom lane
>



-- 
Eliot Gable

"We do not inherit the Earth from our ancestors: we borrow it from our
children." ~David Brower

"I decided the words were too conservative for me. We're not borrowing from
our children, we're stealing from them--and it's not even considered to be a
crime." ~David Brower

"Esse oportet ut vivas, non vivere ut edas." (Thou shouldst eat to live; not
live to eat.) ~Marcus Tullius Cicero


Re: [GENERAL] cannot assign non-composite value to a row variable

2010-06-04 Thread Eliot Gable
This is the code I posted:

create type complex1 as ( ... ); -- one integer member and about 16 text
members
create type complex2 as ( ... ); -- few integer members, about 10 text
members, and about 6 different enum members

CREATE OR REPLACE FUNCTION blah ...
...
DECLARE
  myvariable complex1[];
  mydatasource complex1;
  myrowsource complex2[];
...
BEGIN
...
  -- The first way I tried to do it:
  myvariable := array(
SELECT mydatasource FROM unnest(myrowsource)
  );
  -- The second way I tried to do it:
  myvariable := array(
SELECT (mydatasource)::complex1 FROM unnest(myrowsource)
  );
  -- The third way I tried to do it:
  myvariable := array(
SELECT (mydatasource.member1, mydatasource.member2, ...)::complex1 FROM
unnest(myrowsource)
  );
...
END ...

I think you are thinking there is a naming conflict because you are seeing
this:

SELECT mydatasource FROM unnest(myrowsource)

And also seeing this:

DECLARE
  myvariable complex1[];
  mydatasource complex1;
  myrowsource complex2[];

And also think that there is a column called mydatasource in
unnest(myrowsource). But there is no such column. If you are thinking there
is a naming conflict for another reason, please explain, because I'm not
seeing it.

What I am doing here is rather strange, and maybe there is a better way to
do it. I'm just not aware of it. I'm relatively new to PostgreSQL. As I
said, what I'm expecting my code to do is literally:

1. unnest the myrowsource array into rows
2. use the current contents of the previously declared mydatasource
variable, which is of type 'complex1' to
3. generate as many rows using those contents as there are rows in the
unnesting of myrowsource array
4. construct an array based on the resulting set of rows and store it in
myvariable
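
In other words, something with the effect of this sketch (untested; it
sidesteps unnest() entirely, since only the element count of myrowsource
matters):

  myvariable := array(
    SELECT mydatasource
    FROM generate_series(1, coalesce(array_length(myrowsource, 1), 0))
  );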


On Fri, Jun 4, 2010 at 1:40 PM, Tom Lane  wrote:

> Eliot Gable writes:
> > Thanks for the note on the bugfix in the update. I will try it. However,
> > there is no naming conflict.
>
> There was most certainly a naming conflict in the sample code you
> posted.  I realize that that probably was not the code you were
> actually trying to use, but we can only go on what you show us.
>
>regards, tom lane
>



-- 
Eliot Gable

"We do not inherit the Earth from our ancestors: we borrow it from our
children." ~David Brower

"I decided the words were too conservative for me. We're not borrowing from
our children, we're stealing from them--and it's not even considered to be a
crime." ~David Brower

"Esse oportet ut vivas, non vivere ut edas." (Thou shouldst eat to live; not
live to eat.) ~Marcus Tullius Cicero


Re: [GENERAL] cannot assign non-composite value to a row variable

2010-06-04 Thread Eliot Gable
Updating did solve the problem. Thanks.

On Fri, Jun 4, 2010 at 11:23 AM, Tom Lane  wrote:

> Alban Hertroys  writes:
> > I'm pretty sure you have a naming conflict.
>
> Yeah.  Specifically, the given example looks like it would try to assign
> a null to the target variable, since it'd be taking the null value of a
> different variable instead of a value from the intended source.
>
> I believe the bizarre error message is coming from a plpgsql bug that we
> fixed in 8.4.3, which basically was that assigning a null to a composite
> variable would fail in some cases.  If you weren't shooting yourself in
> the foot with naming conflicts, you might not trip over that case ...
> but an update to 8.4.recent wouldn't be a bad idea anyway.
>
>regards, tom lane
>



-- 
Eliot Gable

"We do not inherit the Earth from our ancestors: we borrow it from our
children." ~David Brower

"I decided the words were too conservative for me. We're not borrowing from
our children, we're stealing from them--and it's not even considered to be a
crime." ~David Brower

"Esse oportet ut vivas, non vivere ut edas." (Thou shouldst eat to live; not
live to eat.) ~Marcus Tullius Cicero


[GENERAL] Random Weighted Result Ordering

2010-06-06 Thread Eliot Gable
I have a set of results that I am selecting from a set of tables which I
want to return in a random weighted order for each priority group returned.
Each row has a priority column and a weight column. I sort by the priority
column with 1 being highest priority. Then, for each distinct priority, I
want to do a weighted random ordering of all rows that have that same
priority. I select the set of rows and pass it to a custom-built function
that does the ordering. I have tested both the prioritize and the random
weighted ordering functions and they do exactly what I want them to do for
ordering the data that I send them.

The problem comes from the fact that I tried to make them generalized. They
take an array of a small complex type which holds just an arbitrary ID, the
priority, and the weight. The output is the same information but the rows
are in the correct order. I thought I could take that output and just INNER
JOIN it back to my original data rows from which I obtained the ID,
priority, and weight values. However, when I do that, the ordering no longer
seems to be preserved, even though there is no ORDER BY clause on the query
that joins the data back to the original rows. Basically, I thought that if
the original data was:

50, 1, 5, 'data1'
55, 1, 4, 'data2'
34, 2, 0, 'data3'
90, 2, 1, 'data4'
95, 2, 1, 'data5'

And the input to the functions was:

50, 1, 5
55, 1, 4
34, 2, 0
90, 2, 1
95, 2, 1

And the prioritized and weighted order came back:

50, 1, 5
55, 1, 4
95, 2, 1
90, 2, 1
34, 2, 0

Then, if I INNER JOINED them like:

(
  50, 1, 5
  55, 1, 4
  95, 2, 1
  90, 2, 1
  34, 2, 0
) AS randomized INNER JOIN (
  50, 1, 5, 'data1'
  55, 1, 4, 'data2'
  34, 2, 0, 'data3'
  90, 2, 1, 'data4'
95, 2, 1, 'data5'
) AS data ON (
  randomized.id = data.id
)

Then the rows would come back as:

50, 1, 5, 'data1'
55, 1, 4, 'data2'
95, 2, 1, 'data5'
90, 2, 1, 'data4'
34, 2, 0, 'data3'

Unfortunately, that does not seem to be happening. Before I spend a ton of
time digging into this issue, I thought I would pose the questions here:

Does anyone know for certain why this would not work? Or, does anyone know
why this should work?

I assumed that the order of the joins would preserve the ordering of the
first set of data. However, I am worried about how the planner might
re-arrange the joins on me, and I am wondering whether the order is
guaranteed to be preserved like this in the first place... Does anyone know
for sure about these assumptions?

Thanks in advance for any assistance.


-- 
Eliot Gable

"We do not inherit the Earth from our ancestors: we borrow it from our
children." ~David Brower

"I decided the words were too conservative for me. We're not borrowing from
our children, we're stealing from them--and it's not even considered to be a
crime." ~David Brower

"Esse oportet ut vivas, non vivere ut edas." (Thou shouldst eat to live; not
live to eat.) ~Marcus Tullius Cicero


Re: [GENERAL] Random Weighted Result Ordering

2010-06-07 Thread Eliot Gable
Great suggestion. Thanks. Don't know why I didn't think of that. I do almost
exactly the same thing further down in my stored procedure.
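
Something along these lines, presumably (a sketch; the function and table
names are made up):

-- randomize_weighted_order() is the hypothetical ordering function; it
-- returns (id, sort_order) pairs describing the final order.
SELECT d.*
FROM randomize_weighted_order() AS r(id bigint, sort_order int)
JOIN my_data AS d ON d.id = r.id
ORDER BY r.sort_order;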

On Mon, Jun 7, 2010 at 4:34 AM, Dimitri Fontaine wrote:

> Eliot Gable writes:
>
> > I have a set of results that I am selecting from a set of tables which I
> want to return in a random weighted order for each priority group returned.
> Each row has a
> > priority column and a weight column. I sort by the priority column with 1
> being highest priority. Then, for each distinct priority, I want to do a
> weighted random
> > ordering of all rows that have that same priority. I select the set of
> rows and pass it to a custom-built function that does the ordering. I have
> tested both the
> > prioritize and the random weighted ordering functions and they do exactly
> what I want them to do for ordering the data that I send them.
> >
> > The problem comes from the fact that I tried to make them generalized.
> They take an array of a small complex type which holds just an arbitrary ID,
> the priority,
> > and the weight. The output is the same information but the rows are in
> > the correct order.
>
> I'd try having the function return just numbers in the right order, then
> use that in the ORDER BY. To have those numbers, you'd still need to
> join with the result of the function, tho.
>
> Hope this helps you already, I don't have time to go deeper in the
> subject!
>
> Regards,
> --
> dim
>



-- 
Eliot Gable

"We do not inherit the Earth from our ancestors: we borrow it from our
children." ~David Brower

"I decided the words were too conservative for me. We're not borrowing from
our children, we're stealing from them--and it's not even considered to be a
crime." ~David Brower

"Esse oportet ut vivas, non vivere ut edas." (Thou shouldst eat to live; not
live to eat.) ~Marcus Tullius Cicero


[GENERAL] Mammoth Replicator

2010-08-23 Thread Eliot Gable
Has anyone on this list (other than Command Prompt people) had any
experience with Mammoth Replicator? If so, what are your thoughts /
opinions of the system? Are there any drawbacks you can think of for
using it versus something else?

I like the fact that it is integrated into PostgreSQL, and I love the
way it operates in terms of commands for setting up tables to be
replicated and starting / stopping replication. It just seems like a
very simple system to use. However, with PostgreSQL 9.0 on the
horizon, I'm wondering if I should really be deploying version 8.3 on
new systems, and I'm worried about how long the code is going to be
supported. It looks like activity on the project has dropped off a
cliff in the past few months (at least from the perspective of
activity on the replicator-general mailing list).

I would appreciate any comments or thoughts anyone might have about the system.

Thanks.

-- 
Eliot Gable

"We do not inherit the Earth from our ancestors: we borrow it from our
children." ~David Brower

"I decided the words were too conservative for me. We're not borrowing
from our children, we're stealing from them--and it's not even
considered to be a crime." ~David Brower

"Esse oportet ut vivas, non vivere ut edas." (Thou shouldst eat to
live; not live to eat.) ~Marcus Tullius Cicero



[GENERAL] Performing FETCH ALL from a SCROLL CURSOR failing to return results

2010-03-25 Thread Eliot Gable
This issue is on PostgreSQL 8.4.1 under CentOS 5.3 x86_64.

I have a scroll cursor defined like so:

source_host SCROLL CURSOR IS
SELECT ... ;

Inside my PL/PGSQL stored procedure, I am opening the cursor like so:

OPEN source_host;
FETCH source_host INTO src_host;
result.source_host_refcurs := source_host;
...
blah blah blah

RETURN result;

Then, I execute the stored procedure like this:

SELECT * FROM MyStoredProc(blah);
FETCH ALL FROM source_host;

The stored procedure returns a complex data type (result) with a refcursor
set up as source_host. When I use the "FETCH ALL" syntax, I get no results.
However, if I use any of these, I get the one and only record that is
returned:

FETCH FIRST FROM source_host;
FETCH LAST FROM source_host;
FETCH ABSOLUTE 1 FROM source_host;

Any of these fail:

FETCH NEXT
FETCH PRIOR
FETCH RELATIVE x where x is any number
FETCH x where x is any number
FETCH ALL
FETCH FORWARD
FETCH FORWARD x where x is any number
FETCH FORWARD ALL
FETCH BACKWARD
FETCH BACKWARD x where x is any number
FETCH BACKWARD ALL

Now, if I comment out the 'FETCH source_host INTO src_host' line inside the
stored procedure, then ALL of these work:

FETCH FIRST
FETCH LAST
FETCH ABSOLUTE x
FETCH RELATIVE x
FETCH NEXT
FETCH ALL
FETCH FORWARD
FETCH FORWARD x
FETCH FORWARD ALL
FETCH x

I have attempted to perform a MOVE FIRST after doing the 'FETCH
source_host INTO src_host' line, as well as MOVE LAST, MOVE ABSOLUTE 1, etc.
No attempt at doing a MOVE allows the FETCH ALL and the like to work. Only
FETCH FIRST, FETCH LAST, and FETCH ABSOLUTE seem to work after I have
touched the cursor inside the stored procedure. In fact, I can remove the
'FETCH source_host INTO src_host' line and replace it with a MOVE statement
and it results in the same problem.

I absolutely need to have FETCH ALL working. I don't care about anything
else other than FETCH ALL. I actually have about 10 cursors that are
returned like this from the stored procedure, and most of them have several
dozen records that need to be retrieved. I execute a single transaction
where I run the stored procedure and fetch all results all at once. This was
working just fine a couple of days ago. Not sure what broke.

If anyone has any ideas on what might be going wrong here, I would really
appreciate some assistance.

Thanks in advance.


-- 
Eliot Gable

"We do not inherit the Earth from our ancestors: we borrow it from our
children." ~David Brower

"I decided the words were too conservative for me. We're not borrowing from
our children, we're stealing from them--and it's not even considered to be a
crime." ~David Brower

"Esse oportet ut vivas, non vivere ut edas." (Thou shouldst eat to live; not
live to eat.) ~Marcus Tullius Cicero


[GENERAL] Foreign Tables

2011-11-16 Thread Eliot Gable
I am working on a reporting project where I need to generate a report based
on data from several different network appliances. Each appliance runs a
PostgreSQL database which houses all of the information for the appliance.
Further, there are dozens of triggers in the database which fire when
various tables are touched in various different ways (insert, update,
delete). These triggers currently are used to build materialized views of
much of the data that I want to summarize in the reports.

I have read as much as I can find on 9.1's foreign table support, and it
looks almost ideal for bridging the gap between all the databases and
collecting all the data into a single report. However, I am unclear on a
few points...

1a) Can the foreign tables be written to? For example, I have server1 with
table foo and server2 which does 'create foreign table bar' where bar
references server1.foo. Can server2 write to bar and have it show in
server1.foo?

1b) If it does show in server1.foo, I assume it would also fire any
triggers on server1.foo; correct?

2) Given the example in question #1, can I put a trigger on server2.bar and
have it actually fire when server1.foo has an insert, update, or delete
operation on it?

Thanks in advance for any answers.


-- 
Eliot Gable

"We do not inherit the Earth from our ancestors: we borrow it from our
children." ~David Brower

"I decided the words were too conservative for me. We're not borrowing from
our children, we're stealing from them--and it's not even considered to be
a crime." ~David Brower

"Esse oportet ut vivas, non vivere ut edas." (Thou shouldst eat to live;
not live to eat.) ~Marcus Tullius Cicero


Re: [GENERAL] Foreign Tables

2011-11-18 Thread Eliot Gable
Thank you for your response...


> Foreign tables in 9.1 are read-only, so you can't write to them.  Making
> foreign tables writable is a TODO item, but ISTM it's difficult to
> implement it for even 9.2.  So the answer to your question 1a) is "No".
>
> BTW, I'm interested in your use case very much because I'm working on
> enhancement of foreign tables for 9.2.  I would appreciate it if you tell
> me some details of your reporting system.  Foreign tables may suit your
> reporting system.
>
> a) Where are materialized views, triggers and source tables?  I guess all
> of them are on appliances, not on PostgreSQL server for reporting.
> b) Do you need to update data on appliances during making a report?  If you
> do, how do you do it without foreign tables? (from reporting application,
> or using dblink or something?)
>
>
Each appliance is a self-contained unit, including for reporting purposes.
However, there is a way to 'link' the appliances together to make them
aware of each other. Basically, a table stores the hostname and IP of other
systems. In the interface, someone would go to the 'Reports' tab and they
should be able to see the stats for the local appliance, stats for each
appliance that this one has been made aware of, and combined stats for all
appliances this one has been made aware of. Basically, there are objects
which are shared across the appliances which could be under the same
logical category. For example, users. Each appliance has its own set of
registered users and groups, but you might have the same group name across
multiple devices with different users in the group on different devices.
So, a combined view would show the stats for that group across all
appliances and it would list all users across all appliances that are in
that group.

Basically, my idea to do the reports was to put a trigger on the table
which stores the hostname and IP and then build the foreign server
entities, foreign tables, and triggers on the foreign tables. Ideally,
triggers on the foreign tables would fire when something is written to that
foreign table. The triggers could then materialize a local copy of the
statistics from those foreign tables into a local table. I would have a
'combined' table which is updated when the triggers on the foreign tables
fire. And triggers on the local tables would also update the 'combined'
stats table. Then, to generate my three types of reports, I would pull from
the local tables to display stats for the local system, from each of the
foreign tables to display stats for each device this one knows about, and
from the 'combined' table which has been getting updated every time the
foreign tables are written to and which would hold a composite of the
information from the local and all the foreign tables.

I would not need special guarantees that triggers on the local node's
foreign tables complete before the transaction on the foreign system is
committed. It would be permissible if the foreign system
committed a transaction which updates the table, fires a local trigger, and
then the local trigger fails for some reason. I have no need for that to
cause a rollback of the transaction on the foreign system. Such an
occurrence should be rare, and I would adjust for it
by re-materializing the 'combined' view from time to time (maybe nightly).
The reports do not need to be 100% accurate, just 99% accurate. As long as
they are 'eventually' accurate, that is all I care about.

The reports are never really 'generated.' They are real-time. So, the
tables would store the exact information which would be displayed as the
report. A C trigger is used to communicate changes made to the report
tables in real time to a daemon which talks over web sockets to client
browsers. The client browsers are then updated in real time by push events
from the server as they occur in the report tables. So, there is no actual
reporting server in all of this. Clients only ever connect to their local
node, but they should be able to see the stats of the local node and any
remote systems the local node knows about.

So, how much of this is possible to do now with foreign tables, and how
much of it would I have to wait on? If I can do even some of it right now
with foreign tables, it would be useful.


-- 
Eliot Gable

"We do not inherit the Earth from our ancestors: we borrow it from our
children." ~David Brower

"I decided the words were too conservative for me. We're not borrowing from
our children, we're stealing from them--and it's not even considered to be
a crime." ~David Brower

"Esse oportet ut vivas, non vivere ut edas." (Thou shouldst eat to live;
not live to eat.) ~Marcus Tullius Cicero


[GENERAL] LOCK DATABASE

2011-12-15 Thread Eliot Gable
Is this bogus, or is it an upcoming feature?

http://wiki.postgresql.org/wiki/Lock_database


-- 
Eliot Gable

"We do not inherit the Earth from our ancestors: we borrow it from our
children." ~David Brower

"I decided the words were too conservative for me. We're not borrowing from
our children, we're stealing from them--and it's not even considered to be
a crime." ~David Brower

"Esse oportet ut vivas, non vivere ut edas." (Thou shouldst eat to live;
not live to eat.) ~Marcus Tullius Cicero


[GENERAL] What is the life of a postgres back end process?

2012-02-13 Thread Eliot Gable
Are postgres back end processes connection specific? In other words, can we
assume / trust that they will be terminated and cleaned up when we close a
connection and that they will not live on and be reused by other
connections?

What is a good way to test that this is the case, in a way that accounts
for differences in load?
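
The best probe I have come up with so far is this (a sketch):

-- Each session reports the PID of its own server process; logging this at
-- connect time and comparing across open/close cycles should show whether
-- a process ever outlives its connection.
SELECT pg_backend_pid();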

-- 
Eliot Gable

"We do not inherit the Earth from our ancestors: we borrow it from our
children." ~David Brower

"I decided the words were too conservative for me. We're not borrowing from
our children, we're stealing from them--and it's not even considered to be
a crime." ~David Brower

"Esse oportet ut vivas, non vivere ut edas." (Thou shouldst eat to live;
not live to eat.) ~Marcus Tullius Cicero


[GENERAL] Leaky Perl / DBIx / Postgres 9.0.1 Trio

2012-04-04 Thread Eliot Gable
I have a Perl daemon handling some events on a system which inserts or
updates rows in Postgres, and something is causing memory usage to grow. I
see the RSS memory size allocated to the Postgres connection continually
going up (slowly) and the RSS size allocated to the Perl daemon also
continually going up (slowly). The Perl daemon is using DBIx to make insert
and update queries to the Postgres DB.

There is only one connection open between the Perl daemon and the Postgres
DB. That connection stays open until the daemon exits or is killed. When
the daemon is killed, the Postgres back end is terminated and all of its
memory is released.

Any suggestions on how to go about troubleshooting this would be useful.
Most specifically, I would like to know how to find out what the Postgres
process for that connection is doing with all that memory. Is there a way I
can see what the memory is allocated for? Would running GDB on the Postgres
process help? Is there any logging I can turn on which might help shed some
light on what's using the memory? Anything in particular anyone knows about
which I should look for which might cause both Perl and Postgres memory
usage to climb continually?

Thanks in advance for any suggestions.


-- 
Eliot Gable

"We do not inherit the Earth from our ancestors: we borrow it from our
children." ~David Brower

"I decided the words were too conservative for me. We're not borrowing from
our children, we're stealing from them--and it's not even considered to be
a crime." ~David Brower

"Esse oportet ut vivas, non vivere ut edas." (Thou shouldst eat to live;
not live to eat.) ~Marcus Tullius Cicero


Re: [GENERAL] Leaky Perl / DBIx / Postgres 9.0.1 Trio

2012-04-04 Thread Eliot Gable
On Wed, Apr 4, 2012 at 2:29 PM, Tom Molesworth wrote:

>
> I believe DBIx is more of a namespace than a specific module - do you mean
> DBIx::Class or DBI perhaps? If the former, the IRC channel #dbix-class on
> irc.perl.org tends to be very helpful in tracing problems like this,
> particularly if you can provide a reproducible test case.
>
>
It is using DBIx::Class. I will check on the IRC channel, too; thanks.



> On the Perl side, Devel::Size and possible Test::MemoryGrowth may be of
> some use in tracing where the memory is going. Does the memory usage hit a
> ceiling at any point, or does it carry on until all virtual memory is
> exhausted? You could try running the postgres and perl daemons with lower
> ulimit -m / ulimit -v settings if the increase in memory is too slow to
> test this in a reasonable timeframe.
>
>
The memory usage eventually exhausts all virtual memory on the system.


> Other than that, maybe try looking for statement handles that aren't
> closed or long-running transactions?
>
>
I checked for long running transactions, but did not find any. Is there a
way in Postgres to check for unclosed statement handles? If such a method
exists, it would be far easier than looking through the DBIx::Class code
for unclosed statement handles.
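
The closest thing I have found myself is the per-session
pg_prepared_statements view (a sketch; it only shows statements held by the
session you query it from):

SELECT name, statement, prepare_time
FROM pg_prepared_statements;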

Thanks for your suggestions.

-- 
Eliot Gable


[GENERAL] Preventing an 'after' trigger from causing rollback on error

2012-04-13 Thread Eliot Gable
Is there any way I can stop a trigger which fires after a row is inserted
into a table from causing a rollback of the entire transaction if something
goes wrong?

I have rows being written to a table, and I have triggers which calculate
and update statistics automatically in stats tables based on the rows
inserted. However, if something goes wrong in that trigger, right now it
causes all the inserts to roll back, as well as the stats calculations. I
would rather keep all the inserts, but just fail the stats calculations. I
can always rematerialize the entire stats table with another procedure if
all the rows are present.
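
The only workaround I have come up with so far is wrapping the stats work
in its own BEGIN/EXCEPTION block inside the trigger function, which, as I
understand it, runs as a subtransaction, so a failure there would not have
to take the inserts down with it (a sketch; the names are made up):

CREATE OR REPLACE FUNCTION update_stats_trigger_func() RETURNS trigger AS
$$
BEGIN
  BEGIN
    -- hypothetical stats calculation
    UPDATE stats SET n = n + 1 WHERE day = current_date;
  EXCEPTION WHEN OTHERS THEN
    -- swallow the error so the triggering INSERT still commits
    RAISE WARNING 'stats update failed: %', SQLERRM;
  END;
  RETURN NULL; -- the return value of an AFTER ROW trigger is ignored
END;
$$
LANGUAGE plpgsql;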

Thanks in advance for any assistance.


-- 
Eliot Gable


[GENERAL] LOCK TABLE is not allowed in a non-volatile function

2012-04-17 Thread Eliot Gable
I have a table which has a trigger on it. It is basically a log of user
activity. The trigger is created like this:

CREATE TRIGGER user_log_user_activity_call_in_trig AFTER INSERT ON
bbx_cdr.user_log FOR EACH ROW WHEN (
NEW.user_log_action = 'ringing'
) EXECUTE PROCEDURE user_log_user_activity_call_in_trigger_func();

It is roughly structured like this:

CREATE OR REPLACE FUNCTION user_log_user_activity_call_in_trigger_func()
RETURNS TRIGGER AS
$$
BEGIN
BEGIN
LOCK TABLE live_user_activity IN SHARE ROW EXCLUSIVE MODE;
LOCK TABLE user_activity_archive IN SHARE ROW EXCLUSIVE MODE;

... -- Do some stuff

PERFORM rotate_live_user_activity_table();

... -- Do some stuff

EXCEPTION WHEN OTHERS THEN
RAISE WARNING 'An exception occurred in
user_log_activity_call_in_trigger_func() code %: %', SQLSTATE, SQLERRM;
END;
RETURN NEW;
END;
$$
LANGUAGE plpgsql VOLATILE;

Which calls this function:

CREATE OR REPLACE FUNCTION rotate_live_user_activity_table() RETURNS
BOOLEAN AS
$$
BEGIN
BEGIN
LOCK TABLE live_user_activity IN SHARE ROW EXCLUSIVE MODE;
LOCK TABLE user_activity_archive IN SHARE ROW EXCLUSIVE MODE;

... -- Do some stuff, including move records to an archive table, if needed

/* If we don't have records or we already moved the records, then
materialize the table */
PERFORM materialize_live_user_activity();

EXCEPTION WHEN OTHERS THEN
RAISE WARNING 'An error occurred while trying to rotate the live user
activity records; code %: %', SQLSTATE, SQLERRM;
RETURN FALSE;
END;

RETURN TRUE;
END;
$$
LANGUAGE plpgsql VOLATILE;


Which calls this:

CREATE OR REPLACE FUNCTION materialize_live_user_activity() RETURNS BOOLEAN
AS
$$
DECLARE
tmp RECORD;
BEGIN
BEGIN
LOCK TABLE live_user_activity IN SHARE ROW EXCLUSIVE MODE;

TRUNCATE TABLE live_user_activity;
INSERT INTO live_user_activity
SELECT  nextval('user_activity_id_seq'),
date_trunc('day', CURRENT_TIMESTAMP)::DATE,
i.*,
NULL::TIMESTAMP WITH TIME ZONE,
FALSE
FROM  summarize_individuals(date_trunc('day',
CURRENT_TIMESTAMP)::TIMESTAMP, CURRENT_TIMESTAMP) AS i;

EXCEPTION WHEN OTHERS THEN
RAISE WARNING 'Failed to materialize the live_user_activity table; code %:
%', SQLSTATE, SQLERRM;
RETURN FALSE;
END;

RETURN TRUE;
END;
$$
LANGUAGE plpgsql VOLATILE;

When the trigger fires, I get this in my postgres.log file:

2012-04-17 16:57:15 EDT|test_db|169.254.5.138(56783)||[unknown]|30474
 WARNING:  Failed to materialize the live_user_activity table; code 0A000:
LOCK TABLE is not allowed in a non-volatile function
2012-04-17 16:57:15 EDT|test_db|169.254.5.138(56783)||[unknown]|30474
 CONTEXT:  SQL statement "SELECT materialize_live_user_activity()"
PL/pgSQL function "rotate_live_user_activity_table" line 22 at
PERFORM
SQL statement "SELECT rotate_live_user_activity_table()"
PL/pgSQL function "user_log_user_activity_call_in_trigger_func"
line 22 at PERFORM
SQL statement ""
PL/pgSQL function "live_stats_channel_trigger_func" line 262 at SQL
statement

The "live_stats_channel_trigger_func" is also a VOLATILE trigger function
structured the same way as above with a lot more lock table statements in
there.

The "summarize_individuals" function there is also VOLATILE and it calls
"summarize_user_log" which is also VOLATILE.

I cannot find a single non-volatile function in the call path; so I am
baffled on where this error message is coming from. I would be thankful for
any ideas anyone might have on where this error message might be coming
from or how to locate where it is coming from.

Thanks.


-- 
Eliot Gable

"We do not inherit the Earth from our ancestors: we borrow it from our
children." ~David Brower

"I decided the words were too conservative for me. We're not borrowing from
our children, we're stealing from them--and it's not even considered to be
a crime." ~David Brower

"Esse oportet ut vivas, non vivere ut edas." (Thou shouldst eat to live;
not live to eat.) ~Marcus Tullius Cicero


Re: [GENERAL] LOCK TABLE is not allowed in a non-volatile function

2012-04-18 Thread Eliot Gable
No, I have lots of calls to current_timestamp inside volatile functions
which lock tables without complaints. I am beginning to think I hit some
sort of bug. This is PostgreSQL 9.0.1.


On Tue, Apr 17, 2012 at 5:55 PM, Michael Nolan  wrote:

>
>
> On Tue, Apr 17, 2012 at 5:20 PM, Eliot Gable  wrote:
>
>>
>>
>>
>> I cannot find a single non-volatile function in the call path; so I am
>> baffled on where this error message is coming from. I would be thankful for
>> any ideas anyone might have on where this error message might be coming
>> from or how to locate where it is coming from.
>>
>>
>> According to the documentation, the current_timestamp family of functions
> is stable, could that be the cause?  Better yet, should it?
> --
> Mike Nolan
>



-- 
Eliot Gable

"We do not inherit the Earth from our ancestors: we borrow it from our
children." ~David Brower

"I decided the words were too conservative for me. We're not borrowing from
our children, we're stealing from them--and it's not even considered to be
a crime." ~David Brower

"Esse oportet ut vivas, non vivere ut edas." (Thou shouldst eat to live;
not live to eat.) ~Marcus Tullius Cicero


Re: [GENERAL] LOCK TABLE is not allowed in a non-volatile function

2012-04-18 Thread Eliot Gable
On Tue, Apr 17, 2012 at 7:11 PM, Tom Lane  wrote:

> Eliot Gable  writes:
> > When the trigger fires, I get this in my postgres.log file:
> > 2012-04-17 16:57:15 EDT|test_db|169.254.5.138(56783)||[unknown]|30474
> >  WARNING:  Failed to materialize the live_user_activity table; code
> 0A000:
> > LOCK TABLE is not allowed in a non-volatile function
>
> > I cannot find a single non-volatile function in the call path; so I am
> > baffled on where this error message is coming from.
>
> I'm confused too, but I'm not going to guess at details of an incomplete
> example; the problem could well be related to code you didn't show us.
> Please provide a self-contained script that triggers this message.
> Also, what PG version are we talking about?
>
>regards, tom lane
>

Thanks, Tom. I will try to put together a complete example that reproduces
the issue and post it. For your reference, the code I cut out consisted only
of inserts, updates, selects, and if/then/else statements; it contained no
other stored procedure or function calls.


Re: [GENERAL] LOCK TABLE is not allowed in a non-volatile function

2012-04-18 Thread Eliot Gable
On Wed, Apr 18, 2012 at 10:18 AM, Eliot Gable <
egable+pgsql-gene...@gmail.com> wrote:

> Thanks, Tom. I will try to put together a complete example that
> reproduces the issue and post it.

Tom,

While attempting to reproduce this issue with a sanitized set of tables,
functions, and triggers, I located the problem. There was in fact another
function call inside my summarize_individuals() function, and that other
function was marked STABLE while trying to grab a SHARE lock on a table it
reads. That function will probably never be called on its own, and since
PostgreSQL grabs an appropriate lock on the table anyway, I was able to fix
the error by simply removing the LOCK statement. Still, it seems to me there
should be some way to grab read-only locks on a set of tables at the top of a
function marked STABLE, purely to enforce the order in which tables are
locked, regardless of the order in which they are queried.
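
Sketched with hypothetical names, the offending shape and the fix looked
roughly like this:

-- Before (hypothetical names): an explicit lock inside a STABLE function.
CREATE TABLE user_log (user_id integer, activity text);

CREATE FUNCTION count_user_rows(p_user integer) RETURNS bigint AS $$
BEGIN
    LOCK TABLE user_log IN SHARE MODE;  -- raises 0A000 under STABLE
    RETURN (SELECT count(*) FROM user_log WHERE user_id = p_user);
END;
$$ LANGUAGE plpgsql STABLE;

-- After: drop the explicit lock. The SELECT takes an ACCESS SHARE lock on
-- user_log by itself, which is all a plain read needs.
CREATE OR REPLACE FUNCTION count_user_rows(p_user integer) RETURNS bigint AS $$
BEGIN
    RETURN (SELECT count(*) FROM user_log WHERE user_id = p_user);
END;
$$ LANGUAGE plpgsql STABLE;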

If VOLATILE Function A grabs an EXCLUSIVE lock on Table A while STABLE
Function B grabs a SHARE lock on Table B, and then Function A tries to grab
an EXCLUSIVE lock on Table B while Function B tries to grab a SHARE lock on
Table A, we have a deadlock. Function B cannot get its SHARE lock on Table A
while Function A holds the EXCLUSIVE lock, and Function A cannot get its
EXCLUSIVE lock on Table B while Function B holds the SHARE lock. But if
Function B, despite being STABLE, could grab its SHARE locks up front, in the
same order in which Function A takes its locks, the deadlock would be averted.

In my particular case it will not be an issue, because the STABLE function is
called only from VOLATILE functions that already hold either a SHARE or a
SHARE ROW EXCLUSIVE lock on the table in question, and that lock order is
enforced across all functions.

-Eliot


Re: [GENERAL] LOCK TABLE is not allowed in a non-volatile function

2012-04-18 Thread Eliot Gable
On Wed, Apr 18, 2012 at 1:01 PM, Tom Lane  wrote:

> Eliot Gable  writes:
> > While attempting to reproduce this issue with a sanitized set of tables,
> > functions, and triggers, I located the problem. There was in fact another
> > function call inside my summarize_individuals() function, and that other
> > function was marked STABLE while trying to grab a SHARE lock on a table it
> > reads. That function will probably never be called on its own, and since
> > PostgreSQL grabs an appropriate lock on the table anyway, I was able to
> > fix the error by simply removing the LOCK statement. Still, it seems to
> > me there should be some way to grab read-only locks on a set of tables at
> > the top of a function marked STABLE, purely to enforce the order in which
> > tables are locked, regardless of the order in which they are queried.
>
> Taking a lock is a side-effect, and stable functions are expected not
> to have side-effects.  So I don't agree that this is a bug.
>
> However, there still might be an issue, because the CONTEXT trace that
> you showed certainly seemed to point where you thought it did.  So I am
> wondering if there is a bug in the error-location-reporting stuff, or
> if that was an artifact of having stripped out too much information.
>
>regards, tom lane
>

After re-reading the LOCK modes and realizing that ACCESS SHARE is not the
same as SHARE, I believe you are correct; the only remaining issue is that
the CONTEXT trace failed to show that the error occurred three function calls
deeper than reported. It appears the error was reported in the first function
where EXCEPTION handling was set up. It should have said the error was in
user_log_slice(), inside summarize_user_log(), inside summarize_individuals(),
inside materialize_live_user_activity(), and so on. Going from the inner-most
function to the outer-most, the first function with exception handling was
materialize_live_user_activity().
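
A runnable sketch of that structure (hypothetical names and bodies, with the
intermediate calls collapsed into one):

-- The failing statement lives in a STABLE function deep in the call path...
CREATE TABLE demo_log (id integer);

CREATE FUNCTION user_log_slice() RETURNS void AS $$
BEGIN
    LOCK TABLE demo_log IN SHARE MODE;  -- the statement that raises 0A000
END;
$$ LANGUAGE plpgsql STABLE;

-- ...but the logged CONTEXT names the outermost function with an EXCEPTION
-- block, because the RAISE WARNING there is what actually gets logged.
CREATE FUNCTION materialize_live_user_activity() RETURNS void AS $$
BEGIN
    PERFORM user_log_slice();  -- in the real code, via two intermediate calls
EXCEPTION WHEN OTHERS THEN
    RAISE WARNING 'Failed to materialize the live_user_activity table; code %: %',
        SQLSTATE, SQLERRM;
END;
$$ LANGUAGE plpgsql VOLATILE;

SELECT materialize_live_user_activity();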


Re: [GENERAL] LOCK TABLE is not allowed in a non-volatile function

2012-04-18 Thread Eliot Gable
On Wed, Apr 18, 2012 at 3:47 PM, Tom Lane  wrote:

> Eliot Gable  writes:
> > On Wed, Apr 18, 2012 at 1:01 PM, Tom Lane  wrote:
> >> However, there still might be an issue, because the CONTEXT trace that
> >> you showed certainly seemed to point where you thought it did.
>
> > After re-reading the LOCK modes and realizing that ACCESS SHARE is not the
> > same as SHARE, I believe you are correct; the only remaining issue is that
> > the CONTEXT trace failed to show that the error occurred three function
> > calls deeper than reported. It appears the error was reported in the first
> > function where EXCEPTION handling was set up.
>
> Oh!  Yes, that's to be expected, because so far as Postgres is concerned
> it's logging the location of the RAISE WARNING command.  You've only
> told it to print the SQLERRM string, and nothing else:
>
> RAISE WARNING 'An error occurred while trying to rotate the live user
> activity records; code %: %', SQLSTATE, SQLERRM;
>
> As of 9.2 there is a way to get the context string for the original
> error (GET STACKED DIAGNOSTICS) which you could then include in the
> RAISE message.  That hasn't made it to any released versions
> unfortunately.
>
>regards, tom lane
>

Thanks, Tom. I will keep that in mind for when we update our Postgres build
on our systems.
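
For anyone who finds this thread later: on 9.2 or newer the handler can be
extended along these lines. A sketch, assuming the rotate function from the
trace above exists:

-- Sketch for 9.2+: capture the original error's context with GET STACKED
-- DIAGNOSTICS and include it in the warning.
DO $$
DECLARE
    err_context text;
BEGIN
    PERFORM rotate_live_user_activity_table();  -- hypothetical guarded call
EXCEPTION WHEN OTHERS THEN
    GET STACKED DIAGNOSTICS err_context = PG_EXCEPTION_CONTEXT;
    RAISE WARNING 'An error occurred while trying to rotate the live user activity records; code %: %; context: %',
        SQLSTATE, SQLERRM, err_context;
END;
$$;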


[GENERAL] pg_advisory_lock() and row deadlocks

2012-04-20 Thread Eliot Gable
Is it possible to prevent row deadlocks by using pg_advisory_lock()? For
example:

Transaction 1 grabs pg_advisory_lock(1)
Transaction 1 runs a statement that updates multiple rows on Table A
Transaction 1 releases pg_advisory_lock(1)
Transaction 1 continues processing other stuff
Transaction 1 grabs pg_advisory_lock(2)
Transaction 1 runs a statement that updates multiple rows on Table B
Transaction 1 releases pg_advisory_lock(2)
Transaction 1 continues processing other stuff

At the same time...

Transaction 2 grabs pg_advisory_lock(2)
Transaction 2 runs a statement that deletes multiple rows on Table B
Transaction 2 releases pg_advisory_lock(2)
Transaction 2 continues processing other stuff
Transaction 2 grabs pg_advisory_lock(1)
Transaction 2 runs a statement that deletes multiple rows on Table A
Transaction 2 releases pg_advisory_lock(1)
Transaction 2 continues processing other stuff

If these two transactions run simultaneously, is there any way that they
can have a row deadlock given the way the pg_advisory_lock() calls are made?

My underlying problem is trying to break row deadlocks due to cascading
deletes on foreign keys in one transaction colliding with updates to rows
in another transaction.

If I use pg_advisory_lock(), can I lock and unlock a table multiple times in
both transactions without ever needing to worry about them deadlocking on
rows? Taking SELECT locks on rows is not an option: they last until the end
of the transaction, I cannot control the order in which the two transactions
grab locks on the tables involved, and each transaction may affect the same
rows as the other in one or more of the same tables.




Re: [GENERAL] pg_advisory_lock() and row deadlocks

2012-04-20 Thread Eliot Gable
On Fri, Apr 20, 2012 at 11:46 AM, Chris Angelico  wrote:

>
> You have a Dining Philosophers Problem. Why can you not control the
> order in which they acquire their locks? That's one of the simplest
> solutions - for instance, all update locks are to be acquired in
> alphabetical order of table name, then in primary key order within the
> table. Yes, select locks last until the end of the transaction, but
> are you really sure you can safely release the locks earlier? By
> releasing those advisory locks, you're allowing the transactions to
> deadlock, I think. Attempting a manual interleave of these:
>
> Transaction 1 grabs pg_advisory_lock(1)
> Transaction 2 grabs pg_advisory_lock(2)
> Transaction 1 runs a statement that updates multiple rows on Table A
> Transaction 2 runs a statement that deletes multiple rows on Table B
> Transaction 1 releases pg_advisory_lock(1)
> Transaction 2 releases pg_advisory_lock(2)
> Transaction 1 continues processing other stuff
> Transaction 2 continues processing other stuff
>
> At this point, Transaction 1 retains the locks on rows of Table A, and
> Transaction 2 retains locks on B.
>
> Transaction 1 grabs pg_advisory_lock(2)
> Transaction 2 grabs pg_advisory_lock(1)
> Transaction 1 runs a statement that updates multiple rows on Table B
> -- Lock --
> Transaction 2 runs a statement that deletes multiple rows on Table A
> -- Deadlock --
>
> Your advisory locks aren't actually doing anything for you here.
>
> ChrisA
>

How do you control the order in which cascading deletes occur across tables
and the order in which they fire the triggers which do the locking?

Within a single function, or even a couple of functions, I can control the
order. But these are called from triggers on the tables during cascading
delete or update operations; how do I control that? Some functions only need
to lock certain tables, while others need a large set of tables locked. All
of the functions and triggers lock tables in alphabetical order, and I have
pushed the locking out to the outermost level based on what the sub-functions
call. However, transaction 1 might call function1() first and then
function2(), while transaction 2 calls function2() first and then
function1(), and those functions might grab locks on Table A and Table B
independently; then, when transaction 1 or 2 calls function3(), which needs
to work with both tables, they deadlock. Function1() or function2() might be
called in a transaction that never calls function3(), so it does not make
sense for function1() and function2() to lock all the tables that function3()
locks.
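
To make the shape of the problem concrete, a sketch with hypothetical names:

-- Every function locks alphabetically within itself:
--   function1(): LOCK TABLE table_a IN SHARE ROW EXCLUSIVE MODE;
--   function2(): LOCK TABLE table_b IN SHARE ROW EXCLUSIVE MODE;
--   function3(): LOCK TABLE table_a ...; then LOCK TABLE table_b ...;
--
-- Transaction 1: SELECT function1();  -- now holds table_a
-- Transaction 2: SELECT function2();  -- now holds table_b
-- Transaction 1: SELECT function3();  -- re-locks table_a, blocks on table_b
-- Transaction 2: SELECT function3();  -- blocks on table_a => deadlock
--
-- Per-function alphabetical ordering is not enough, because the triggers
-- decide which function runs first in each transaction.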


Re: [GENERAL] LOCK TABLE is not allowed in a non-volatile function

2012-04-20 Thread Eliot Gable
On Wed, Apr 18, 2012 at 3:47 PM, Tom Lane  wrote:

> As of 9.2 there is a way to get the context string for the original
> error (GET STACKED DIAGNOSTICS) which you could then include in the
> RAISE message.  That hasn't made it to any released versions
> unfortunately.

Is there a way to print out the context in 9.0.1 when handling an exception
and raising a message? Right now, all I get is my error code and a
description of the error, with no indication of which statement generated the
error in the first place. Am I missing something, or is this simply not
possible in 9.0.1?