Re: [GENERAL] Implementing "thick"/"fat" databases

2011-07-26 Thread Sim Zacks

On 07/25/2011 06:24 PM, Chris Travers wrote:


On Sun, Jul 24, 2011 at 11:53 PM, Sim Zacks  wrote:


The goal is to make our system client agnostic. Most of our GUI is written
in wxpython, we also have some web functions and even a barcode terminal
function, written in C#. We would like to use an application server, so that
all the code is run on the server but we don't want to be web-based. I don't
want to have the same business logic written into various clients as that
causes maintenance headaches.

Ok.  So we are talking about a multi-application database, and you are
trying to provide some degree of consistency in business logic across
the software applications (and hence uses of your data).  So far, so
good.  Agreed that anything that might need to be in common belongs there.


The way that postgresql works is that each session is its own process. That
means that from the server perspective (at least in most ways), my plpython
function is not connected to any other call on the database. My goal is to
allow any client to request functionality and have it execute the same way
every time.

So I assume that means using PL/Mono to make your barcode stuff
work in the db too?
No need for PL/Mono or any other client specific language. The GUI 
should be dumb, so all I really need to design is the interface 
and the input/output methods. When you push a button, it should call the 
appropriate function. The functions on the barcode terminal (which is a 
Windows Mobile platform) can also be run on the client application 
written in wxpython. Keeping as much as possible off of the client 
allows me to share the work and be sure that both clients do the exact 
same thing when the user pushes the button.


There is always functionality that needs to be on the client; however, in 
my experience with multi-client applications, if it is business logic 
you will end up duplicating it and then needing to maintain two copies of 
the same thing.





Data logic is built in by use of constraints and triggers and some
functions, business logic is built only into functions.

I guess I am using "data logic" more broadly than you.  Any select,
update, or insert statement against a relation (except, of course,
simple function calls) is part of that data logic, and there are
supporting dependencies too, such as security and permissions.

IOW, I think a lot of 'business logic' is data logic.  I separate this
from what you actually use the data for.

IOW, "generate a report in the following format" is data logic (perhaps
even "and email it to the head of product ordering"), but "use this
report to decide what inventory to order" is application logic (you
probably want people making these decisions).

Similarly "Generate me a list of invoices to be paid in the following
format": is data logic, but selecting those invoices to be paid,
entering the data, determining what bank account you want to pay them
from etc. is application logic.  Recording the payments is again data
logic.  Application logic might be better thought of as workflow
logic.
So much of human interaction is definition of terms. Now I understand 
what you mean by data logic.


To me data logic is strictly whether the data will be considered corrupt 
if the database allows an action to occur. For example, not enforcing 
relationships: if you erase the parent, the child has no meaning, and 
therefore you have data in the system which is suspect.
As my db is designed with the business rules in mind, some of the 
business logic is included in the data logic; for example, 3 tables must 
be updated at the same time in order for the transaction to be 
considered valid. This is taken care of either through a function or 
through triggers.
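
As a rough sketch of what I mean (the table and column names here are
made up for illustration, not our actual schema), a single plpgsql
function can wrap all three updates so they succeed or fail together:

CREATE OR REPLACE FUNCTION receive_stock(in_item_id int, in_qty int)
RETURNS void AS $$
BEGIN
    -- all three statements run inside the single transaction of the call
    UPDATE stock SET quantity = quantity + in_qty
        WHERE item_id = in_item_id;
    INSERT INTO stock_movement (item_id, qty, moved_at)
        VALUES (in_item_id, in_qty, now());
    UPDATE item SET last_received = now()
        WHERE item_id = in_item_id;
END;
$$ LANGUAGE plpgsql;

If any statement fails, the whole call rolls back, so the database never
holds a partial update.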


An example of business logic: when my stock is updated to 0, I want 
the record deleted.
This is done through a rule that, on an update that changes the value 
to 0, deletes instead.
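
A minimal sketch of such a rule (again with made-up names):

CREATE RULE stock_zero_delete AS
    ON UPDATE TO stock
    WHERE NEW.quantity = 0
    DO INSTEAD
        DELETE FROM stock WHERE item_id = OLD.item_id;

Any client issuing a plain UPDATE then gets the delete-at-zero behaviour
without having to know the rule exists.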



In other words, our postgresql server is a hybrid database/application
server.

Right.  One thing we are seeing in LedgerSMB as we go through this is
that the database does indeed become an application server.  The areas
which are taking the new approach are mostly SQL and HTML templates
(in TemplateToolkit), while the Perl code is shrinking pretty fast.

I guess what I am wondering is if you are using it as an application
server, is there a benefit to trying to put everything in the
database?  Or just those parts which necessarily follow from the
database independent of workflow, etc or are likely to be common
across diverse applications hitting your server?

If so, what is it?
The benefit of putting everything into the database is having one 
platform to maintain. Also, anything that can talk to the database can 
call the functions, so we have a defined interface for all our 
functions. If I wanted to use JBOSS, for example, how would my wxpython 
or my C# app talk to it?
IOW, I did not find a middleware that was more appropriate for a 
multi-client/multi-platform environment than the database server.

Re: [GENERAL] pgsql error

2011-07-26 Thread Mcleod, John
Thank you for the reply.

At command line, I ran...
"psql  --version"
and received..
"psql  (PostgreSQL) 7.5devel"

The database is sitting on a Windows 2003 Server box.
A mapping application, written in PHP, runs with Apache 2.05

I know in the past, the project manager would restart the database by just 
closing the .bat window, then restart it by double-clicking the postgis.bat file 
on the desktop.
I'm not sure if this was the beginning of the problem.  I've since learned to 
shut down the database with "Ctrl C".

This batch file has the following...

cd c:\
cd ms4w/apps/pgsql75win/data/
del postmaster.pid

@ECHO OFF
set PATH=%PATH%;\ms4w\apps\pgsql75win\lib;\ms4w\apps\pgsql75win\bin;\ms4w\apps\pgsql75win\share\contrib

cd c:\
cd ms4w/apps/pgsql75win/
cmd /c "postmaster -D \ms4w\apps\pgsql75win\\data"

I hope this will give you some clues.


John



-Original Message-
From: Tom Lane [mailto:t...@sss.pgh.pa.us] 
Sent: Monday, July 25, 2011 11:20 PM
To: Merlin Moncure
Cc: Mcleod, John; pgsql-general@postgresql.org
Subject: Re: [GENERAL] pgsql error 

Merlin Moncure  writes:
> On Mon, Jul 25, 2011 at 3:05 PM, Mcleod, John  wrote:
>> I'm receiving the following error
>> CONTEXT: writing block 614 of relation 394198/412175
>> WARNING: could not write block 614 of 394198/412175
>> DETAIL: Multiple failures --- write error may be permanent.
>> ERROR: xlog flush request 0/34D53680 is not satisfied --- flushed 
>> only to
>> 0/34CD1EB0

> This is a fairly low level error that is telling you the WAL could not 
> be written out.  Out of drive space?  Data corruption?

Yeah, this looks like the detritus of some previous failure.  There are 
basically two possibilities:

1. The problem page's LSN field has gotten trashed so that it appears to be 
past the end of WAL.

2. The page actually did get updated by a WAL entry with that LSN, and then 
there was a crash for some reason, and the database tried to recover by 
replaying WAL, and it hit some problem that caused it to stop recovering before 
what had really been the end of WAL.  So now it thinks the end of WAL is 
0/34CD1EB0, but there are page(s) out there with LSNs past that, and when it 
finds one you start getting complaints like this.

I doubt theory #1, though, because there are nearby fields in a page header 
that evidently weren't trashed or else the page would have been recognized as 
being corrupt.  Also the reported LSN is not very far past end of WAL, which 
would be unlikely in the event of random corruption.
So I'm betting on #2.

Unfortunately this tells us little about either the cause of the original 
crash, or the reason why recovery didn't work properly.  We'd need a lot more 
information before speculating about that, for starters the exact Postgres 
version and the platform it's running on.

regards, tom lane

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] pgsql error

2011-07-26 Thread Merlin Moncure
On Tue, Jul 26, 2011 at 7:47 AM, Mcleod, John  wrote:
> Thank you for the reply.
>
> At command line, I ran...
> "psql  --version"
> and received..
> "psql  (PostgreSQL) 7.5devel"
>
> The database is sitting on a Windows 2003 Server box.
> A mapping application, wrote in PHP, runs with Apache 2.05

ok -- first you reported the version # of the client, not the back
end.  let's hope they are different.  you are running an early
development build of postgres 8.0 -- I know this because about 3/4 of
the way through the development cycle the version # changed from 7.5
to 8.0.  This is essentially a completely unpatched alpha build
riddled with known bugs. Prevailing wisdom is that the earliest good
build to use on windows is 8.2 or maybe 8.3.

Right now 100% of your energy should be devoted to double checking the
version # against the backend (postgres --version) and, if it is also
7.5devel, capturing a good backup if you don't have one already and
upgrading ASAP.

merlin

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] 100 times faster than mysql

2011-07-26 Thread Merlin Moncure
http://codesynthesis.com/~boris/blog/2011/07/26/odb-1-5-0-released/

merlin

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] 100 times faster than mysql

2011-07-26 Thread Allan Kamau
On Tue, Jul 26, 2011 at 4:41 PM, Merlin Moncure  wrote:
> http://codesynthesis.com/~boris/blog/2011/07/26/odb-1-5-0-released/
>
> merlin
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>


"The surprising part is that PostgreSQL 9.0.4 is more than 100 times
faster on this test than MySQL 5.1.49 with the InnoDB backend (186s
for MySQL, 48s for SQLite, and 12s for PostgreSQL). Postgre developers
seem to be doing something right."


If the speed is to be measured purely (and simply) on these numbers,
186/12 yields 15.5 (or maybe 16 if you round it up, or 15 if you use
integer division). Maybe "about 15~16 times faster" would be more in
line with the numbers provided.


Allan.

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] pgsql error

2011-07-26 Thread Tom Lane
"Mcleod, John"  writes:
> Thank you for the reply.
> At command line, I ran...
> "psql  --version"
> and received..
> "psql  (PostgreSQL) 7.5devel"

Egad.  That is an early development snapshot of what eventually got
called the 8.0 release.  You should try "select version();" in psql
to verify that the server itself is the same version, but I'm
betting it is.  What you've got there is a development snapshot from
perhaps mid-2004, and even if it were a supported release, we dropped
support for it a couple years ago.

> The database is sitting on a Windows 2003 Server box.

Even worse.  8.0 was the first release that had native Windows support
(so that probably explains why your predecessor tried to use it
pre-release).  The number and extent of the bugs in that is, well,
remarkable.

Given this information, what's remarkable is not that your DB got
corrupted but that it survived this long without that.  I think your
best bet is (1) pg_dump as much data as you can, (2) reinstall a
reasonably recent, supported PG version, (3) reload the dump.

> I know in the past, the project manager would restart the database by just 
> closing the .bat window, then restart by double-clicking the postgis.bat file 
> on the desktop.
> I'm not sure if this was the beginning of the problem.

Sure didn't help any ... in principle the DB ought to withstand that,
but it's not a clean shutdown; and in the case of early Windows versions
in particular I'm not sure we understood how to do fsyncs correctly on
that platform.

I'm not a Windows person myself, but I believe the recent EDB packagings
of Postgres offer a much cleaner user interface than that.

regards, tom lane

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Implementing "thick"/"fat" databases

2011-07-26 Thread Chris Travers
On Tue, Jul 26, 2011 at 1:04 AM, Sim Zacks  wrote:

> No need for PL/Mono or any other client specific language. The GUI should be
> dumb, so all I really need to design is the interface and the input/output
> methods. When you push a button, it should call the appropriate
> function. The functions on the barcode terminal (which is a Windows Mobile
> platform) can also be run on the client application written in wxpython.
> Keeping as much as possible off of the client allows me to share the work
> and be sure that both clients do the exact same thing when the user pushes
> the button.
>
> There is always functionality that needs to be on the client; however, in my
> experience with multi-client applications, if it is business logic you will
> end up duplicating it and then needing to maintain two copies of the same
> thing.

I am not sure it applies to all business logic.  For example, suppose
the product team and the sales team both need to be able to be
assigned tickets relating to customer feedback.  They may have very
different rules and processes for dealing with that.  There is
commonality (how the data is stored, retrieved, presented to the
application), but if you want to enforce the different rules, you are
going to have to put the differences in business logic somewhere.




>
> So much of human interaction is definition of terms. Now I understand what
> you mean by data logic.
>
> To me data logic is strictly whether the data will be considered corrupt if
> the database allows an action to occur. For example, not enforcing
> relationships: if you erase the parent, the child has no meaning, and
> therefore you have data in the system which is suspect.
> As my db is designed with the business rules in mind, some of the business
> logic is included in the data logic; for example, 3 tables must be updated
> at the same time in order for the transaction to be considered valid. This
> is taken care of either through a function or through triggers.
>
> An example of business logic: when my stock is updated to 0, I want the
> record deleted.
> This is done through a rule that, on an update that changes the value to 0,
> deletes instead.

Ok, that's a pretty simple example of logic that belongs in the database.


>
> The benefit of putting everything into the database is having one platform
> to maintain. Also, anything that can talk to the database can call the
> functions, so we have a defined interface for all our functions. If I wanted
> to use JBOSS, for example, how would my wxpython or my C# app talk to it?

That's not really what I am getting at.  The question could be better phrased:
Why put workflow-specific logic in the database (your interactive
query idea is an example of that)?

Why not use the db procedures to create a well-formed API which
enforces business logic consistently across applications, but allows
the applications to manage their own workflow?  Isn't this what you'd
want to use middleware for if you were going this route?

> IOW, I did not find a middleware that was more appropriate for a
> multi-client/multi-platform environment than the database server.
> Another big advantage of using the database for business logic is that the
> plpython functions can be called from other functions. For example, with the
> python prompt function I posted earlier, I can have a plpgsql function say
> if python_prompt('Are you sure?') then
>     process
> end if

So if an arbitrary model function calls workflow functions, does that
mean you have already broken out of the MVC approach?  I mean, does
the MVC approach allow the model to control the controller?

I guess the approach I take is to put as much of the model in the
database as possible, and put the controllers on the client.  Deciding
when to prompt the user for input seems to me it should be a
controller function.


> Any client that has the ability to listen to a port will be able to use this
> functionality.
> In C# for example, I would use System.Net.Sockets and the TcpListener class
> In Python I would use the socketserver module

Any application also has the ability to call database functions as an
API.



> What we do today (the reason I am playing with this interactivity) is have
> the database return an error code, which the client understands to be a
> question. It then prompts the user and the query is sent again with the
> answer as a parameter. It works but it is ugly (IMO).

That's what we do with LedgerSMB and the user creation workflow.  I
think it has the advantage of a cleaner API.

For example, our function definition for the save_user function is:

CREATE OR REPLACE FUNCTION admin__save_user(
in_id int,
in_entity_id INT,
in_username text,
in_password TEXT,
in_import BOOL
) returns int AS $$

There are some nice things about this.  It allows the client
application to specify at the outset whether the prospective user is
to be imported into the application (i.e. created as a user of the
application without 

[GENERAL] mac installer on Lion

2011-07-26 Thread Travis Choma



Wondering if there is an ETA on a dmg installer for the Mac that does not fail 
on Lion?

Best,
-Travis





Re: [GENERAL] pgsql error

2011-07-26 Thread Mcleod, John
When I go to "C:\ms4w\apps\pgsql75win\bin" in the command line, and run 
"postgres --version", I get the following...

"postgres (PostgreSQL) 7.5devel"

John


-Original Message-
From: Merlin Moncure [mailto:mmonc...@gmail.com] 
Sent: Tuesday, July 26, 2011 9:40 AM
To: Mcleod, John
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] pgsql error

On Tue, Jul 26, 2011 at 7:47 AM, Mcleod, John  wrote:
> Thank you for the reply.
>
> At command line, I ran...
> "psql  --version"
> and received..
> "psql  (PostgreSQL) 7.5devel"
>
> The database is sitting on a Windows 2003 Server box.
> A mapping application, wrote in PHP, runs with Apache 2.05

ok -- first you reported the version # of the client, not the back end.  let's 
hope they are different.  you are running an early development build of 
postgres 8.0 -- I know this because about 3/4 of the way through the development 
cycle the version # changed from 7.5 to 8.0.  This is essentially a completely 
unpatched alpha build riddled with known bugs. Prevailing wisdom is that the 
earliest good build to use on windows is 8.2 or maybe 8.3.

Right now 100% of your energy should be devoted to double checking the version # 
against the backend (postgres --version) and, if it is also
7.5devel, capturing a good backup if you don't have one already and upgrading 
ASAP.

merlin

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Implementing "thick"/"fat" databases

2011-07-26 Thread Merlin Moncure
On Mon, Jul 25, 2011 at 5:21 PM, Chris Travers  wrote:
> On Mon, Jul 25, 2011 at 12:33 PM, Merlin Moncure  wrote:
>
>> exactly. procedural middlewares written in languages like java tend to
>> be bug factories:
>> *) over-(mis-)use of threads
>> *) performance wins moving logic outside the database to scale it are
>> balanced out by the extra traffic
>
> Well, typically you have other performance issues other than the
> traffic.  A well written SQL query can do more, and more efficiently,
> than most application programmers realize.  The tendency to move stuff
> out of the database IMHO often occurs at the same time the SQL code
> itself ends up being less efficient.  A lot of things (aggregates for
> example) use CPU cycles on the db server that could probably be
> saved by moving things out of the db, but only at the cost of memory
> usage.
>
> I have seen CRAZY stuff coming out of middleware and I tend to suspect
> that the "move it out of the db" crowd sometimes tend to be stuck
> thinking in their languages and thus unable to take advantage of good,
> well written SQL.
>
> I have seen Perl code used instead of HAVING clauses.  I have seen
> loops through query results, firing other queries.  OK, to be
> fair, I would call that program something way below average in
> quality, but still.
>
>> *) database concurrency issues
>> *) OO representation of data is brittle and inflexible
>
> And the DB ends up being designed around the ORM...  Which sucks...
>
>> *) extremely (and IMNSHO unnecessarily) verbose
>> *) framework all over the place
>> *) libraries all over the place
>
> generally agreed.
>
>>
>> On the plus side though, languages like java have huge pools of
>> available talent and excellent tools.  These factors are *critical*
>> for many IT companies.  plpgsql may be the bee's knees (it is) but
>> DIY tools and methodologies tend to translate directly to a high cost
>> of labor, and application developers used to rich tool environments
>> tend to really dislike code practices that pl/pgsql requires like
>> debugging by logging and remembering where stuff is and what it does.
>
> I dunno.  It depends on how you use plpgsql.
>
> The approach we have taken in the LedgerSMB project is to use stored
> procs basically as what has been called "named queries."  We try to
> put as much as possible into single SQL statements as possible, and we
> extensively use PLPGSQL's syntactic sugar to make things more
> manageable (a very large number of our stored procs, probably a clear
> majority, could be written in SQL).
>
> I think that if you come up with all these excellent tools, the
> quality of code will go down and you will end up with something that
> approximates most of the Middleware problems today.  IMHO, the best
> use for PLPGSQL is that of straight SQL queries with just a little
> procedural logic around it.  There are exceptions due to the fact that
> utility statements are unparameterized.  But for the most part,
> that is what I have found to work best.
>
>>
>> plpgsql is a niche language that tends to attract the same really
>> smart people who write code in a more functional style.  When done
>> well, you can do a lot with a very small amount of code.  I would love
>> to see stronger validation features (with a higher emphasis on
>> warnings) and better tools/editors to help bring plpgsql to the
>> mainstream.
>
> What kinds of tools/editors do you think we need?  A heck of a lot of us
> really like VIM or EMACS (just don't ask which is best ;-) ).

Of course I pretty much agree on all points but you are preaching to
the choir.  You and I both code in a spartan, console heavy manner.
Of course we prefer plpgsql because over the course of years of
database application programming we've realized the advantages of
first class queries and having code running tightly coupled with the
data.  Unfortunately when I talk about these concepts to the majority
of programmers I've worked with over the years they look at me as if I
just got off my spacecraft from Neptune.  I think I've been cursed to
have discovered the secret to fast, efficient programming while
continually being constrained from developing that way.   Then again,
most programmers probably feel like that :-).
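
(For anyone following along: the HAVING case Chris mentions is the classic
one.  A hedged sketch with made-up table and column names -- one aggregate
query instead of a client-side loop:

SELECT customer_id, sum(amount) AS total
FROM invoice
GROUP BY customer_id
HAVING sum(amount) > 1000;

The filter on the aggregate runs in the database in a single pass, instead
of fetching every group and testing it in Perl.)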

merlin

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] variant column type

2011-07-26 Thread salah jubeh

 Hello,

suppose the following scenario

the car speed is 240 
the car has an airbag

Here the first value is integer and the second value is boolean. Consider that 
I have this table structure:

feature (feature id, feature name)
car (car id, ...)
car_feature (car id, feature id, value)

The value attribute might have different domains. How can I model this using 
postgres and an ANSI compliant design?

Regards 

Re: [GENERAL] variant column type

2011-07-26 Thread John R Pierce

On 07/26/11 10:02 AM, salah jubeh wrote:
and using ANSI compliant design 


American National Standards Institute?   they have an ANSI standard on 
database schema design or something?




--
john r pierce            N 37, W 122
santa cruz ca mid-left coast


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] variant column type

2011-07-26 Thread salah jubeh

 
Hello John,

I mean ANSI SQL-92 compliant, if I am not mistaken. One solution to this problem 
is to use something like hstore, but it has some disadvantages in my 
application, so I want another opinion. 

Regards 





From: John R Pierce 
To: pgsql-general@postgresql.org
Sent: Tue, July 26, 2011 7:10:47 PM
Subject: Re: [GENERAL] variant column type

On 07/26/11 10:02 AM, salah jubeh wrote:
> and using ANSI compliant design 

American National Standards Institute?   they have an ANSI standard on database 
schema design or something?



--
john r pierce            N 37, W 122
santa cruz ca mid-left coast


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] variant column type

2011-07-26 Thread Ben Chobot
On Jul 26, 2011, at 10:02 AM, salah jubeh wrote:

> 
>  Hello,
> 
> suppose the following scenario
> 
> the car speed is 240 
> the car has an airbag
> 
> Here the first value is integer and the second value is boolean. Consider 
> that I have this table structure
> 
> feature (feature id, feature name)
> car (car id, ...)
> car_feature (car id, feature id, value). the value attribute might have 
> different domains. How can I model this using postgres and using ANSI 
> compliant design ?
> 

You haven't been very clear but it sounds like maybe you're asking how to get 
both EAV "flexibility" and type safety? There isn't any good way to do that but 
you might consider something ugly, like:

car (id, ...)
car_boolean_features (car_id, ...)
car_int_features (car_id, ...)

A better thing to consider, of course, is how you might avoid the need for 
anything like EAV in the first place.

Re: [GENERAL] variant column type

2011-07-26 Thread John R Pierce


in general, attribute-value sorts of lists are very difficult to use for 
relational operations and result in clumsy inefficient queries, as well 
as poor data integrity.


whenever possible, common attributes should be stored properly as table 
fields.   reserve EAV for highly sparse freeform information that could 
not have been anticipated at design time.   for your example, all cars 
have a speed, and do/don't have an airbag, so these should be normal 
fields in a table.



--
john r pierce            N 37, W 122
santa cruz ca mid-left coast


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] heavy load-high cpu utilization

2011-07-26 Thread Filippos
Dear all

first of all congratulations on your great work here, since from time to time
i've found many answers to my problems. unfortunately for this specific
problem i didn't find much relevant information, so i would ask for your
guidance dealing with the following situation:

we have a dedicated server (8.4.4, redhat) with 24 cpus and 36 GB of RAM. i
would say that the traffic in the server is huge and the cpu utilization is
pretty high too (avg ~ 75% except during the nights when it is much lower).
i am trying to tune the server a little bit to handle this problem. the
incoming data in the database are about 30-40 GB/day.

at first the checkpoint_segments were set to 50, the checkpoint_timeout at
15 min and the checkpoint_completion_target was 0.5.

i noticed that the utilization of the server was higher when it was close to
making a checkpoint and, since the parameter full_page_writes is ON, i
changed the parameters mentioned above to the following (after reading a lot
of stuff online):
checkpoint_segments -> 250
checkpoint_timeout -> 40min
checkpoint_completion_target -> 0.8

but the cpu utilization is not significantly lower. another parameter i will
certainly change is wal_buffers, which is now set at 64KB and which i plan to
make 16MB. can this parameter cause a significant percentage of the
problem?

are there any suggestions on what i can do to tune the server better? i can
provide any information you find relevant for the configuration of the
server, the OS, the storage etc.

thank you in advance 


--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/heavy-load-high-cpu-itilization-tp4635696p4635696.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] variant column type

2011-07-26 Thread David Johnston
 

the car speed is 240
the car has an airbag

Here the first value is integer and the second value is boolean. Consider
that I have this table structure

feature (feature id, feature name)
car (car id, ...)
car_feature (car id, feature id, value). the value attribute might have
different domains. How can I model this using postgres and using ANSI
compliant design ?

Regards

Given "feature" and "car-feature" tables the presence of absence of an entry
in "car-feature" will accomplish your desire for true/false - i.e., "the car
has an airbag".  By abstracting just a little every "feature" can be boiled
down to a label/ID and then the added to "feature" and associated via
"car-feature".

In your example you could create a feature called "Top Speed - 240kph"

If every car is going to have a particular "feature" and only the "value"
matters you could considering adding a "car-properties" table:

car_property (car id, top_speed, etc.) and populate the top_speed column
with whatever value is applicable or leave it NULL if unknown or N/A.  The
relationship between "car" and "car_property" would be one-to-one (1-to-1)
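
A minimal sketch of that extra table (names assumed, purely illustrative):

CREATE TABLE car_property (
    car_id    integer PRIMARY KEY REFERENCES car (car_id),
    top_speed integer  -- km/h; NULL = unknown
);

Making car_id the primary key is what enforces the one-to-one relationship.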

Dave J.



Re: [GENERAL] variant column type

2011-07-26 Thread Thomas Kellerer

salah jubeh, 26.07.2011 19:02:


Hello,

suppose the following scenario

the car speed is 240
the car has an airbag

Here the first value is integer and the second value is boolean. Consider that 
I have this table structure

feature (feature id, feature name)
car (car id, ...)
car_feature (car id, feature id, value). the value attribute might have 
different domains. How can I model this using postgres and using ANSI compliant 
design ?

Regards


Have a look at the hstore contrib module.

It allows you to store key/value pairs (lots of them) in a single column.

create table car
(
  car_id integer,
  features hstore
);

insert into car (car_id, features)
values
(1, 'speed => 240, airbag => true');

insert into car (car_id, features)
values
(2, 'speed => 140, airbag => false');

insert into car (car_id, features)
values
(3, 'speed => 140, flux_capacitor => true');

-- show the airbag attribute for all cars
-- will return null for those that don't have that attribute
select car_id, (features -> 'airbag') as airbag_flag
from car;

-- return all rows that have an attribute named flux_capacitor with the value 
true
select *
from car
where features @> ('flux_capacitor => true')

Note that the main drawbacks of this solution are that you don't have any datatypes for the 
attributes and you can't create a foreign key constraint to a "feature" table. 
But it's probably the most flexible way to deal with such a requirement in Postgres.






--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] variant column type

2011-07-26 Thread Chris Travers
On Tue, Jul 26, 2011 at 11:06 AM, David Johnston  wrote:

> Given “feature” and “car-feature” tables, the presence or absence of an entry
> in “car-feature” will accomplish your desire for true/false - i.e., “the car
> has an airbag”.  By abstracting just a little, every “feature” can be boiled
> down to a label/ID and then added to “feature” and associated via
> “car-feature”.
>
> In your example you could create a feature called “Top Speed – 240kph”
>
> If every car is going to have a particular “feature” and only the “value”
> matters, you could consider adding a “car-properties” table:
>
> car_property (car id, top_speed, etc…) and populate the top_speed column
> with whatever value is applicable or leave it NULL if unknown or N/A.  The
> relationship between “car” and “car_property” would be one-to-one (1-to-1)
>
I don't like this approach for a couple of reasons.

1)  Storing non-applicable and unknowns as interchangeable in a
database schema introduces semantic ambiguity issues that are best
avoided if possible.

2)  While wide tables win in terms of supporting more complex
constraints, they lose in terms of storage, etc.

I would personally create three tables:

1)  car (id, top_speed, ec)
2)  features (id, feature_name, etc)
3)  car_has_feature (car_id, feature_id)

This has the benefit of allowing you to track additional information
about features.  For example, you could track that seatbelts are
required in Washington State on all cars manufactured after a certain
date, or that federal law requires certain airbag features.  The
array functions in PostgreSQL are powerful enough to handle queries
of features by car pretty well.
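
For instance, a sketch of pulling each car's feature list in one query
(assuming the three tables above; names are illustrative):

SELECT c.id, array_agg(f.feature_name) AS features
FROM car c
JOIN car_has_feature chf ON chf.car_id = c.id
JOIN features f ON f.id = chf.feature_id
GROUP BY c.id;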

Now, there are a few cases, however, where key-value mapping is both
necessary and works, and where variant column types are needed (for
example, storing application settings, or argument lists for the
functions that menu items call).  In those cases you have to have
somewhere that knows what the type is supposed to be and checks it.
That's really not a trivial problem, because keeping to a single
point of truth is very difficult with such relatively unstructured
data, which is why in applications where I have to do this, we require
that the table be updated through stored procedures which do this
checking.

Best Wishes,
Chris Travers

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] variant column type

2011-07-26 Thread David Johnston
-Original Message-
From: Chris Travers [mailto:chris.trav...@gmail.com] 
Sent: Tuesday, July 26, 2011 2:32 PM
To: David Johnston
Cc: salah jubeh; pgsql
Subject: Re: [GENERAL] variant column type
> In your example you could create a feature called “Top Speed – 240kph”
>
> If every car is going to have a particular “feature” and only the “value”
> matters, you could consider adding a “car-properties” table:
>
> car_property (car id, top_speed, etc…) and populate the top_speed 
> column with whatever value is applicable or leave it NULL if unknown 
> or N/A.  The relationship between “car” and “car_property” would be 
> one-to-one (1-to-1)
>
I don't like this approach for a couple of reasons.

1)  Storing non-applicable and unknowns as interchangeable in a database
schema introduces semantic ambiguity issues that are best avoided if
possible.

2)  While wide tables win in terms of supporting more complex constraints,
they lose in terms of storage, etc.

--

Agreed.  But I was suggesting using the "wide-table" in addition to the
"car;feature;car_has_feature" tables.  Basically limit the extended table to
those properties that are truly (or at least almost truly) global.  There
should only be a few fields.  The fact that the car has a top-speed can be
assumed to be global and thus calling it a "feature" is possibly abstracting
things too much.  If you need to display it in a "feature list" you can
readily write a VIEW that will pull out that integer value from the extended
table, convert it into a meaningful "name/description", and present it as a
list of "Fixed Features".

My main concern with the whole "feature" table is you end up going down the
path of everything being a "feature" - the VIN, Make, Model, Year - where in
most sane cases you'd be better off having dedicated fields for those, since
every car has one.  And so, while I say use a "wide-table" to capture some
of these additional values, you can just add the "top-speed" field to the
main car table.  In this specific example there is no semantic ambiguity
since we know that a car has a top-speed and so a NULL must represent an
UNKNOWN value.  If the NULL could represent "Not Applicable" I would
probably leave it to the "feature" table.

David J.



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] select all rows where any column is NULL

2011-07-26 Thread Nathan Boley
Does anyone have a concise way of doing $SUBJECT?

Best,
Nathan

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] select all rows where any column is NULL

2011-07-26 Thread David Johnston
Copy and paste the column names from the "CREATE TABLE" statement...

You can try using the information schema and building a dynamic SQL query
inside a function...

If you let people know WHAT you are trying to accomplish you may find you
get alternative suggestions that you never considered.  

SQL itself is designed for a variable/unknown numbers of rows but a
fixed/known number of columns.  The only way around that is to use a "table
of column names" to identify the columns and then dynamically create a query
with those "fixed/known" columns and "EXECUTE" it within a PL/PGSQL
function.
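
A hedged sketch of that idea (the helper name and the choice to return the
generated WHERE clause as text are mine, purely illustrative):

CREATE OR REPLACE FUNCTION null_where_clause(p_table text) RETURNS text AS $$
    -- build "col1 IS NULL OR col2 IS NULL OR ..." from the catalog
    -- (a real version would also filter on table_schema)
    SELECT array_to_string(
               array_agg(quote_ident(column_name::text) || ' IS NULL'),
               ' OR ')
    FROM information_schema.columns
    WHERE table_name = p_table;
$$ LANGUAGE sql;

-- inside a PL/PGSQL function you would then:
-- EXECUTE 'SELECT * FROM ' || quote_ident(p_table)
--      || ' WHERE ' || null_where_clause(p_table);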

David J.

-Original Message-
From: pgsql-general-ow...@postgresql.org
[mailto:pgsql-general-ow...@postgresql.org] On Behalf Of Nathan Boley
Sent: Tuesday, July 26, 2011 3:09 PM
To: PostgreSQL general
Subject: [GENERAL] select all rows where any column is NULL

Does anyone have a concise way of doing $SUBJECT?

Best,
Nathan

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make
changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] variant column type

2011-07-26 Thread Radosław Smogura

On Tue, 26 Jul 2011 10:45:27 -0700, John R Pierce wrote:

in general, attribute-value sorts of lists are very difficult to use
for relational operations and result in clumsy inefficient queries, as
well as poor data integrity.

whenever possible, common attributes should be stored properly as
table fields.   reserve EAV for highly sparse freeform information
that could not have been anticipated at design time.   for your
example, all cars have a speed, and do/don't have an airbag, so these
should be normal fields in a table.


--
john r pierce            N 37, W 122
santa cruz ca mid-left coast

Everything above is true.

A database table is like a C struct, with no inheritance. If you have common 
attributes for some class, but not all cars have the same class, you may 
create an "extending" table with those attributes as columns, and then join 
it with car.

Currently I work on a project designed as car 1..* features. It's 
painful. Many feature ids are hard-coded, and there is no contract 
programming (no support from the compiler, etc.). I use O-R libraries, and 
I can't even write car.speed!


Regards,
Radek

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] select all rows where any column is NULL

2011-07-26 Thread Merlin Moncure
On Tue, Jul 26, 2011 at 2:08 PM, Nathan Boley  wrote:
> Does anyone have a concise way of doing $SUBJECT?

 select * from foo where (row((foo).*) = row((foo).*)) is null;
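
To see why this works, a throwaway example (table foo here is invented for
illustration):

create table foo (a int, b text);
insert into foo values (1, 'x'), (2, NULL), (NULL, 'y');
select * from foo where (row((foo).*) = row((foo).*)) is null;
-- returns the second and third rows: any NULL field makes the
-- row-to-itself comparison evaluate to NULL rather than true,
-- and IS NULL catches that.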

merlin

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] 100 times faster than mysql

2011-07-26 Thread Radosław Smogura

On Tue, 26 Jul 2011 17:02:12 +0300, Allan Kamau wrote:
On Tue, Jul 26, 2011 at 4:41 PM, Merlin Moncure  wrote:

http://codesynthesis.com/~boris/blog/2011/07/26/odb-1-5-0-released/

merlin

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general




"The surprising part is that PostgreSQL 9.0.4 is more than 100 times
faster on this test than MySQL 5.1.49 with the InnoDB backend (186s
for MySQL, 48s for SQLite, and 12s for PostgreSQL). Postgre 
developers

seem to be doing something right."


If the speed is to be measured purely (and simply) on these numbers,
186/12 yields 15.5 (or maybe 16 if you round it up, or 15 if you use
integer division). Maybe "about 15~16 times faster" would be more in
line with the numbers provided.


Allan.
Congratulations, but those numbers are a little bit unrealistic. 50 
microseconds per query - maybe they use ODB caching?


Regards,
Radek

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] heavy load-high cpu utilization

2011-07-26 Thread Greg Smith

On 07/26/2011 01:47 PM, Filippos wrote:

we have a dedicated server (8.4.4, redhat) with 24 cpus and 36 GB of RAM. i
would say that the traffic in the server is huge and the cpu utilization is
pretty high too (avg ~ 75% except during the nights when it is much lower).
i am trying to tune the server a little bit to handle this problem. the
incoming data in the database are about 30-40 GB/day.


Well, the first question is where the CPU usage is coming from.  There 
are two basic schools of thought here:


1) Use real-time monitoring utilities like "top -c" and see what is 
gobbling time up.  It's possible to miss what's happening, but if you're 
at 75% for a large chunk of the day that doesn't seem likely.


2) Set log_min_duration_statement and the other logging parameters; 
analyze the resulting log files to see where the CPU time is going.
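
For example, something like this in postgresql.conf (the threshold is only 
a placeholder; pick one that fits your workload):

log_min_duration_statement = 200   # log statements slower than 200 ms
log_checkpoints = on               # confirm how often checkpoints happen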


You seem to be focused on the background writer and its checkpoint 
process right now.  That cannot be the source for high CPU usage; at 
most it could fully use one of your 24 cores.  You should fix 
wal_buffers to a reasonable value regardless, but your problem is not in 
that area.


Importing 30-40 GB/day is extremely difficult to do in PostgreSQL.  My 
guess is that most of the server time is spent running the data import 
process itself--even COPY, the most efficient way to get data in, is 
very CPU intensive.  The second layer of problems here that can increase 
CPU usage come from autovacuum taking up a lot of resources to run, 
which it will do all the time given this volume of activity.  And 
there's always the possibility that the queries you're running against 
the data are just taking a long time to execute.


Another layer of problems in this scenario you'll hit eventually is that 
you'll need to have your tables partitioned in order to prune old data 
out efficiently.  Presumably you can't keep up with that rate for very 
long before you have to start dropping older data, and that's really 
hard to do efficiently unless you've used partitions.


P.S. You should upgrade to PostgreSQL 8.4.8 as soon as possible.  There 
is a bug in autovacuum that's been resolved as of 8.4.6 that you are 
very likely to run into:  
http://www.postgresql.org/docs/8.4/static/release-8-4-6.html


P.P.S. The pgsql-performance list would be a more appropriate place to 
have this discussion.  Some of the people who provide good input over 
there on topics like this don't read pgsql-general, too many messages on 
this list for them.



are there any suggestions on what i can do to tune the server better? i can
provide any information you find relevant for the configuration of the
server, the OS, the storage etc.


There's a chapter on each of these in my PostgreSQL performance book, 
and I'm not aware of any other resource that takes on all of these 
topics usefully.  If you're trying to keep up with this volume of data, 
buying a copy of that should repay itself in time savings--where you can 
look something up rather than trying to figure it out from 
scratch--about once every week.


--
Greg Smith   2ndQuadrant US   g...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Suggested enhancement to pg_restore

2011-07-26 Thread Michael Nolan
I suggest adding the following parameter to pg_restore:

--rename-table=<new_table_name>

When used in conjunction with the --data-only, --schema and -t options (all
three of which would be necessary),
it would allow restoring a table (without indexes) to a different table name
(which would need to already exist
and match the structure of the table which is being restored, of course.)

This would give PostgreSQL users the ability to reload a table from a dump
file to a separate table name in the
same database and schema.

In other words, this command:

pg_restore --data-only --schema=abc -t xyz --rename-table=xyz_copy

would restore a copy of table xyz into the existing (and presumably empty)
table xyz_copy, leaving table xyz untouched.
--
Mike Nolan
no...@tssi.com


Re: [GENERAL] 100 times faster than mysql

2011-07-26 Thread Greg Smith

On 07/26/2011 10:02 AM, Allan Kamau wrote:

If the speed is to be measured purely (and simply) on these numbers,
186/12 yields 15.5 (or maybe 16 if you round it up, or 15 if you use
integer division). Maybe "about 15~16 times faster" would be more in
line with the numbers provided.


I guess he did the math on MySQL, too.  Could be worse; could have run 
into http://bugs.mysql.com/bug.php?id=33704 which, as you can see, is 
totally not a bug.


--
Greg Smith   2ndQuadrant US   g...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Suggested enhancement to pg_restore

2011-07-26 Thread Chris Travers
On Tue, Jul 26, 2011 at 3:48 PM, Michael Nolan  wrote:
> I suggest adding the following parameter to pg_restore:
>
> --rename-table=<new_table_name>
>
> When used in conjunction with the --data-only, --schema and -t options (all
> three of which would be necessary),
> it would allow restoring a table (without indexes) to a different table name
> (which would need to already exist
> and match the structure of the table which is being restored, of course.)

Does pg_restore allow you to specify a set of tables the same way
pg_dump does, i.e. by -t table1 -t table2?

If so how would this feature play along?

Best Wishes,
Chris Travers

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] 100 times faster than mysql

2011-07-26 Thread Chris Travers
On Tue, Jul 26, 2011 at 3:53 PM, Greg Smith  wrote:

> I guess he did the math on MySQL, too.  Could be worse; could have ran into
> http://bugs.mysql.com/bug.php?id=33704 which, as you can see, is totally not
> a bug.
>

Or transactions deadlocking against themselves.

Best Wishes,
Chris Travers

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general