Re: [GENERAL] Implementing "thick"/"fat" databases
On 07/25/2011 06:24 PM, Chris Travers wrote:
On Sun, Jul 24, 2011 at 11:53 PM, Sim Zacks wrote:

The goal is to make our system client agnostic. Most of our GUI is written in wxpython; we also have some web functions and even a barcode terminal function, written in C#. We would like to use an application server, so that all the code is run on the server, but we don't want to be web-based. I don't want to have the same business logic written into various clients, as that causes maintenance headaches.

Ok. So we are talking about a multi-application database, and you are trying to provide some degree of consistency in business logic across the software applications (and hence uses of your data). So far, so good.

Agreed, to put anything that might need to be in common. The way that postgresql works is that each session is its own process. That means that from the server perspective (at least in most ways), my plpython function is not connected to any other call on the database. My goal is to allow any client to request functionality and have it execute the same way every time.

So I assume that means using Pl/Mono to make your barcode stuff work in the db too?

No need for PL/Mono or any other client-specific language. The GUI should be dumb, so all I really need to design is the interface and the input/output methods. When you push a button, it should call the appropriate function. The functions on the barcode terminal (which is a Windows Mobile platform) can also be run on the client application written in wxpython. Keeping as much as possible off of the client allows me to share the work and be sure that both clients do the exact same thing when the user pushes the button.

There is always functionality that needs to be on the client; however, in my experience with multi-client applications, if it is business logic you will end up duplicating it and then needing to maintain 2 copies of the same thing.
Data logic is built in by use of constraints and triggers and some functions; business logic is built only into functions.

I guess I am using "data logic" more broadly than you. Any select, update, or insert statement against a relation (except, of course, simple function calls) is part of that data logic, and there are supporting dependencies too, such as security and permissions. IOW, I think a lot of 'business logic' is data logic. I separate this from what you actually use the data for. IOW, "generate a report with the following format" is data logic (even, perhaps, "and email it to the head of product ordering"), but "use this report to decide what inventory to order" is application logic (you probably want people making these decisions). Similarly, "generate me a list of invoices to be paid in the following format" is data logic, but selecting those invoices to be paid, entering the data, determining what bank account you want to pay them from, etc. is application logic. Recording the payments is again data logic. Application logic might be better thought of as workflow logic.

So much of human interaction is definition of terms. Now I understand what you mean by data logic. To me, data logic is strictly whether the data will be considered corrupt if the database allows an action to occur. For example, not enforcing relationships: if you erase the parent, the child has no meaning, and therefore you have data in the system which is suspect. As my db is designed with the business rules in mind, some of the business logic is included in the data logic; for example, 3 tables must be updated at the same time in order for the transaction to be considered valid. This is taken care of either through a function or triggers. An example of business logic is: when my stock is updated to 0, I want the record deleted. This is done through a rule, so that an update that changes the value to 0 deletes instead. In other words, our postgresql server is a hybrid database/application server.
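Sim's delete-on-zero-stock rule could be sketched roughly like this (table and column names here are invented for illustration, not taken from his actual schema):

```sql
-- Hypothetical stock table; names are assumptions, not the poster's schema.
CREATE TABLE stock (
    item_id  integer PRIMARY KEY,
    quantity integer NOT NULL
);

-- Rewrite any UPDATE that would set quantity to 0 into a DELETE,
-- so zero-quantity rows never linger in the table.
CREATE RULE stock_delete_on_zero AS
    ON UPDATE TO stock
    WHERE NEW.quantity = 0
    DO INSTEAD
        DELETE FROM stock WHERE item_id = OLD.item_id;
```

The same effect could also be achieved with a BEFORE UPDATE trigger, which many people find easier to reason about than rules.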
Right. One thing we are seeing in LedgerSMB as we go through this is that the database does indeed become an application server. The areas which are taking the new approach are mostly SQL and HTML templates (in TemplateToolkit), while the Perl code is shrinking pretty fast. I guess what I am wondering is: if you are using it as an application server, is there a benefit to trying to put everything in the database? Or just those parts which necessarily follow from the database independent of workflow, etc., or are likely to be common across diverse applications hitting your server? If so, what is it?

The benefit of putting everything into the database is having one platform to maintain. Also, anything that can talk to the database can call the functions, so we have a defined interface for all our functions. If I wanted to use JBOSS, for example, how would my wxpython or my C# app talk to it? IOW, I did not find a middleware that was more appropriate for a multi-client/multi-platform environment than the database server.
Re: [GENERAL] pgsql error
Thank you for the reply. At the command line, I ran "psql --version" and received "psql (PostgreSQL) 7.5devel".

The database is sitting on a Windows 2003 Server box. A mapping application, written in PHP, runs with Apache 2.05.

I know in the past, the project manager would restart the database by just closing the .bat window, then restart by double-clicking the postgis.bat file on the desktop. I'm not sure if this was the beginning of the problem. I've learned to shut down the database with "Ctrl C". This batch file has the following:

cd c:\
cd ms4w/apps/pgsql75win/data/
del postmaster.pid
@ECHO OFF
set PATH=%PATH%;\ms4w\apps\pgsql75win\lib;\ms4w\apps\pgsql75win\bin;\ms4w\apps\pgsql75win\share\contrib
cd c:\
cd ms4w/apps/pgsql75win/
cmd /c "postmaster -D \ms4w\apps\pgsql75win\\data"

I hope this will give you some clues.

John

-Original Message-
From: Tom Lane [mailto:t...@sss.pgh.pa.us]
Sent: Monday, July 25, 2011 11:20 PM
To: Merlin Moncure
Cc: Mcleod, John; pgsql-general@postgresql.org
Subject: Re: [GENERAL] pgsql error

Merlin Moncure writes:
> On Mon, Jul 25, 2011 at 3:05 PM, Mcleod, John wrote:
>> I'm receiving the following error
>> CONTEXT: writing block 614 of relation 394198/412175
>> WARNING: could not write block 614 of 394198/412175
>> DETAIL: Multiple failures --- write error may be permanent.
>> ERROR: xlog flush request 0/34D53680 is not satisfied --- flushed only to
>> 0/34CD1EB0

> This is a fairly low level error that is telling you the WAL could not
> be written out. Out of drive space? Data corruption?

Yeah, this looks like the detritus of some previous failure. There are basically two possibilities:

1. The problem page's LSN field has gotten trashed so that it appears to be past the end of WAL.

2.
The page actually did get updated by a WAL entry with that LSN, and then there was a crash for some reason, and the database tried to recover by replaying WAL, and it hit some problem that caused it to stop recovering before what had really been the end of WAL. So now it thinks the end of WAL is 0/34CD1EB0, but there are page(s) out there with LSNs past that, and when it finds one you start getting complaints like this. I doubt theory #1, though, because there are nearby fields in a page header that evidently weren't trashed or else the page would have been recognized as being corrupt. Also the reported LSN is not very far past end of WAL, which would be unlikely in the event of random corruption. So I'm betting on #2. Unfortunately this tells us little about either the cause of the original crash, or the reason why recovery didn't work properly. We'd need a lot more information before speculating about that, for starters the exact Postgres version and the platform it's running on. regards, tom lane -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] pgsql error
On Tue, Jul 26, 2011 at 7:47 AM, Mcleod, John wrote:
> Thank you for the reply.
>
> At command line, I ran...
> "psql --version"
> and received..
> "psql (PostgreSQL) 7.5devel"
>
> The database is sitting on a Windows 2003 Server box.
> A mapping application, wrote in PHP, runs with Apache 2.05

ok -- first you reported the version # of the client, not the back end. let's hope they are different. you are running an early development build of postgres 8.0 -- I know this because about 3/4 of the way through the development cycle the version # changed from 7.5 to 8.0. This is essentially a completely unpatched, riddled-with-known-bugs alpha build. Prevailing wisdom is that the earliest good build to use on windows is 8.2 or maybe 8.3. Right now 100% of your energy should be devoted to double-checking the version # against the backend (postgres --version) and, if it is also 7.5devel, capturing a good backup if you don't have one already and upgrading ASAP.

merlin
[GENERAL] 100 times faster than mysql
http://codesynthesis.com/~boris/blog/2011/07/26/odb-1-5-0-released/

merlin
Re: [GENERAL] 100 times faster than mysql
On Tue, Jul 26, 2011 at 4:41 PM, Merlin Moncure wrote:
> http://codesynthesis.com/~boris/blog/2011/07/26/odb-1-5-0-released/
>
> merlin

"The surprising part is that PostgreSQL 9.0.4 is more than 100 times faster on this test than MySQL 5.1.49 with the InnoDB backend (186s for MySQL, 48s for SQLite, and 12s for PostgreSQL). Postgre developers seem to be doing something right."

If the speed is to be measured purely (and simply) on these numbers, 186/12 yields 15.5 (or maybe 16 if you round it up, or 15 if you use integer division). Maybe "about 15~16 times faster" would be more in line with the numbers provided.

Allan.
Re: [GENERAL] pgsql error
"Mcleod, John" writes: > Thank you for the reply. > At command line, I ran... > "psql --version" > and received.. > "psql (PostgreSQL) 7.5devel" Egad. That is an early development snapshot of what eventually got called the 8.0 release. You should try "select version();" in psql to verify that the server itself is the same version, but I'm betting it is. What you've got there is a development snapshot from perhaps mid-2004, and even if it were a supported release, we dropped support for it a couple years ago. > The database is sitting on a Windows 2003 Server box. Even worse. 8.0 was the first release that had native Windows support (so that probably explains why your predecessor tried to use it pre-release). The number and extent of the bugs in that is, well, remarkable. Given this information, what's remarkable is not that your DB got corrupted but that it survived this long without that. I think your best bet is (1) pg_dump as much data as you can, (2) reinstall a reasonably recent, supported PG version, (3) reload the dump. > I know in the past, the project manager would restart the database by just > closing the .bat window, then restart by double-clicking the postgis.bat file > on the desktop. > I'm not sure if this was the beginning of the problem. Sure didn't help any ... in principle the DB ought to withstand that, but it's not a clean shutdown; and in the case of early Windows versions in particular I'm not sure we understood how to do fsyncs correctly on that platform. I'm not a Windows person myself, but I believe the recent EDB packagings of Postgres offer a much cleaner user interface than that. regards, tom lane -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Implementing "thick"/"fat" databases
On Tue, Jul 26, 2011 at 1:04 AM, Sim Zacks wrote:
> No need for PL/Mono or any other client specific language. The GUI should be
> dumb, so all I really need to program design is the interface and input
> output methods. When you push a button, it should call the appropriate
> function. The functions on the barcode terminal (which is a Windows Mobile
> platform) can also be run on the client application written in wxpython.
> Keeping as much as possible off of the client allows me to share the work
> and be sure that both clients do the exact same thing when the user pushes
> the button.
>
> There is always functionality that needs to be on the client, however in my
> experience with multi-client applications, if it is business logic you will
> end up duplicating it and then needing to maintain 2 copies of the same
> thing.

I am not sure it applies to all business logic. For example, suppose the product team and the sales team both need to be able to be assigned tickets relating to customer feedback. They may have very different rules and processes for dealing with that. There is commonality (how the data is stored, retrieved, presented to the application), but if you want to enforce the different rules, you are going to have to put the differences in business logic somewhere.

> So much of human interaction is definition of terms. Now I understand what
> you mean by data logic.
>
> To me data logic is strictly whether the data will be considered corrupt if
> the database allows an action to occur. For example, not enforcing
> relationships. If you erase the parent the child has no meaning and
> therefore you have data in the system which is suspect.
> As my db is designed with the business rules in mind, some of the business
> logic is included in the data logic, for example, 3 tables must be updated
> at the same time in order for the transaction to be considered valid. This
> is taken care of either through a function or triggers.
> An example of Business Logic is when my stock is updated to 0, I want the
> record deleted.
> This is done through a rule that on update that changes the value to 0
> delete instead.

Ok, that's a pretty simple example of logic that belongs in the database.

> The benefits of putting everything into the database is having one platform
> to maintain. Also, anything that can talk to the database can call the
> functions, so we have a defined interface for all our functions. If I wanted
> to use JBOSS, for example, how would my wxpython or my c# app talk to it?

That's not really what I am getting at. The question could be better phrased: Why put workflow-specific logic in the database (your interactive query idea is an example of that)? Why not use the db procedures to create a well-formed API which enforces business logic consistently across applications, but allows the applications to manage their own workflow? Isn't this what you'd want to use middleware for if you were going this route?

> IOW, I did not find a middleware that was more appropriate for a
> multi-client/multi-platform environment than the database server.
> Another big advantage of using the database for business logic is that the
> plpython functions can be called from other functions. For example, in the
> python prompt function I posted earlier, I can have a plpgsql function say:
>
> if python_prompt('Are you sure?') then
>     process
> end if

So if an arbitrary model function calls workflow functions, does that mean you have already broken out of the MVC approach? I mean, does the MVC approach allow the model to control the controller? I guess the approach I take is to put as much of the model in the database as possible, and put the controllers on the client. Deciding when to prompt the user for input seems to me it should be a controller function.

> Any client that has the ability to listen to a port will be able to use this
> functionality.
> In C# for example, I would use System.Net.Sockets and the TcpListener class
> In Python I would use the socketserver module

Any application also has the ability to call database functions as an API as well.

> What we do today (the reason I am playing with this interactivity) is have
> the database return an error code, which the client understands to be a
> question. It then prompts the user and the query is sent again with the
> answer as a parameter. It works but it is ugly (IMO).

That's what we do with LedgerSMB and the user creation workflow. I think it has the advantage of a cleaner API. For example, our function definition for the save_user function is:

CREATE OR REPLACE FUNCTION admin__save_user(
    in_id int,
    in_entity_id INT,
    in_username text,
    in_password TEXT,
    in_import BOOL
) returns int AS $$

There are some nice things about this. It allows the client application to specify at the outset whether the prospective user is to be imported into the application (i.e. created as a user of the application without
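The "error code as question" round trip described above might be sketched roughly like this in plpgsql (function, table, and column names are invented for illustration; this is not the actual code from either project):

```sql
-- Hypothetical sketch: on the first call the function raises an error the
-- client treats as a question; the client then re-issues the call with the
-- user's answer passed as a parameter.
CREATE OR REPLACE FUNCTION post_transaction(
    in_id        int,
    in_confirmed bool DEFAULT false
) RETURNS void AS $$
BEGIN
    IF NOT in_confirmed THEN
        -- First call: signal a question back to the client.
        RAISE EXCEPTION 'QUESTION: Are you sure?' USING ERRCODE = 'P0001';
    END IF;
    -- Second call (in_confirmed = true): do the actual work.
    UPDATE transactions SET posted = true WHERE id = in_id;
END;
$$ LANGUAGE plpgsql;
```

The client-side convention (recognizing the errcode and re-calling with the answer) is exactly the part the posters describe as ugly: the question/answer protocol lives half in the database and half in every client.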
[GENERAL] mac installer on Lion
Wondering if there is an ETA on a dmg installer for the Mac that does not fail on Lion?

Best,
-Travis
Re: [GENERAL] pgsql error
When I go to "C:\ms4w\apps\pgsql75win\bin" in the command line, and run "postgres --version", I get the following... "postgres (PostgreSQL) 7.5devel" John -Original Message- From: Merlin Moncure [mailto:mmonc...@gmail.com] Sent: Tuesday, July 26, 2011 9:40 AM To: Mcleod, John Cc: pgsql-general@postgresql.org Subject: Re: [GENERAL] pgsql error On Tue, Jul 26, 2011 at 7:47 AM, Mcleod, John wrote: > Thank you for the reply. > > At command line, I ran... > "psql --version" > and received.. > "psql (PostgreSQL) 7.5devel" > > The database is sitting on a Windows 2003 Server box. > A mapping application, wrote in PHP, runs with Apache 2.05 ok -- first you reported the version # of the client, not the back end. let's hope they are different. you are running an early development build of postgres 8.0 -- I know this because about 3/4 the way through the development cycle the version # changed from 7.5 to 8.0. This is essentially a completely unpactched, riddled with known bugs, alpha build. Prevailing wisdom is that the earliest good build to use on windows is 8.2 or maybe 8.3. Right now 100% of your energy should be devoted to double checking the version# against the backend (postgres -- version) and, if it is also 7.5 devel, capturing a good backup if you don't have one already and upgrading ASAP. merlin -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Implementing "thick"/"fat" databases
On Mon, Jul 25, 2011 at 5:21 PM, Chris Travers wrote:
> On Mon, Jul 25, 2011 at 12:33 PM, Merlin Moncure wrote:
>
>> exactly. procedural middlewares written in languages like java tend to
>> be bug factories:
>> *) over-(mis-)use of threads
>> *) performance wins moving logic outside the database to scale it are
>> balanced out by the extra traffic
>
> Well, typically you have other performance issues other than the
> traffic. A well written SQL query can do more, and more efficiently,
> than most application programmers realize. The tendency to move stuff
> out of the database IMHO often occurs at the same time the SQL code
> itself ends up being less efficient. A lot of things (aggregates for
> example) use up CPU cycles on the db server that could probably be
> saved by moving things out of the db, but only at the cost of memory
> usage.
>
> I have seen CRAZY stuff coming out of middleware and I tend to suspect
> that the "move it out of the db" crowd sometimes tends to be stuck
> thinking in their languages and thus unable to take advantage of good,
> well written SQL.
>
> I have seen Perl code be used instead of HAVING clauses. I have seen
> loops through query results, firing other queries. Ok, to be
> fair, I would call that program something way below average in
> quality, but still.
>
>> *) database concurrency issues
>> *) OO representation of data is brittle and inflexible
>
> And the DB ends up being designed around the ORM... Which sucks...
>
>> *) extremely (and IMNSHO unnecessarily) verbose
>> *) framework all over the place
>> *) libraries all over the place
>
> generally agreed.
>
>> On the plus side though, languages like java have huge pools of
>> available talent and excellent tools. These factors are *critical*
>> for many IT companies.
>> plpgsql may be the bee's knees (it is), but
>> DIY tools and methodologies tend to translate directly to a high cost
>> of labor, and application developers used to rich tool environments
>> tend to really dislike code practices that pl/pgsql requires, like
>> debugging by logging and remembering where stuff is and what it does.

> I dunno. It depends on how you use plpgsql.
>
> The approach we have taken in the LedgerSMB project is to use stored
> procs basically as what has been called "named queries." We try to
> put as much as possible into single SQL statements, and we
> extensively use PLPGSQL's syntactic sugar to make things more
> manageable (a very large number of our stored procs, probably a clear
> majority, could be written in SQL).
>
> I think that if you come up with all these excellent tools, the
> quality of code will go down and you will end up with something that
> approximates most of the Middleware problems today. IMHO, the best
> use for PLPGSQL is that of straight SQL queries with just a little
> procedural logic around it. There are exceptions due to the fact that
> utility statements are unparameterized. But for the most part,
> that is what I have found to work best.
>
>> plpgsql is a niche language that tends to attract the same really
>> smart people who write code in a more functional style. When done
>> well, you can do a lot with a very small amount of code. I would love
>> to see stronger validation features (with a higher emphasis on
>> warnings) and better tools/editors to help bring plpgsql to the
>> mainstream.
>
> What kinds of tools/editors do you think we need? A heck of a lot of us
> really like VIM or EMACS (just don't ask which is best ;-) ).

Of course I pretty much agree on all points, but you are preaching to the choir. You and I both code in a spartan, console-heavy manner.
Of course we prefer plpgsql because over the course of years of database application programming we've realized the advantages of first-class queries and having code running tightly coupled with the data. Unfortunately, when I talk about these concepts to the majority of programmers I've worked with over the years, they look at me as if I just got off my spacecraft from Neptune. I think I've been cursed due to having discovered the secret to fast, efficient programming while continually being constrained from developing that way. Then again, most programmers probably feel like that :-).

merlin
[GENERAL] variant column type
Hello,

Suppose the following scenario:

the car speed is 240
the car has an airbag

Here the first value is integer and the second value is boolean. Consider that I have this table structure:

feature (feature id, feature name)
car (car id, )
car_feature (car id, feature id, value)

The value attribute might have different domains. How can I model this using postgres and an ANSI-compliant design?

Regards
Re: [GENERAL] variant column type
On 07/26/11 10:02 AM, salah jubeh wrote:
> and using ANSI compliant design

American National Standards Institute? they have an ANSI standard on database schema design or something?

-- 
john r pierce                            N 37, W 122
santa cruz ca                         mid-left coast
Re: [GENERAL] variant column type
Hello John,

I mean ANSI SQL 92 compliant, if I am not mistaken. One solution to this problem is to use something like hstore, but it has some disadvantages in my application, so I want another opinion.

Regards

From: John R Pierce
To: pgsql-general@postgresql.org
Sent: Tue, July 26, 2011 7:10:47 PM
Subject: Re: [GENERAL] variant column type

On 07/26/11 10:02 AM, salah jubeh wrote:
> and using ANSI compliant design

American National Standards Institute? they have an ANSI standard on database schema design or something?
Re: [GENERAL] variant column type
On Jul 26, 2011, at 10:02 AM, salah jubeh wrote:
> Hello,
>
> suppose the following scenario
>
> the car speed is 240
> the car has an airbag
>
> Here the first value is integer and the second value is boolean. Consider
> that I have this table structure
>
> feature (feature id feature name)
> car (car id, )
> car_feature (car id, feature id, value). the value attribute might have
> different domains. How can I model this using postgres and using ANSI
> compliant design ?

You haven't been very clear, but it sounds like maybe you're asking how to get both EAV "flexibility" and type safety? There isn't any good way to do that, but you might consider something ugly, like:

car (id, ...)
car_boolean_features (car_id, )
car_int_features (car_id, ...)

A better thing to consider, of course, is how you might avoid the need for anything like EAV in the first place.
Re: [GENERAL] variant column type
In general, attribute-value sorts of lists are very difficult to use for relational operations and result in clumsy, inefficient queries, as well as poor data integrity. Whenever possible, common attributes should be stored properly as table fields. Reserve EAV for highly sparse, freeform information that could not have been anticipated at design time.

For your example, all cars have a speed, and do/don't have an airbag, so these should be normal fields in a table.

-- 
john r pierce                            N 37, W 122
santa cruz ca                         mid-left coast
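Following that advice, the original example boils down to ordinary typed columns rather than an EAV table (a minimal sketch; the column names are assumed for illustration):

```sql
CREATE TABLE car (
    car_id     serial  PRIMARY KEY,
    top_speed  integer NOT NULL,               -- every car has a speed
    has_airbag boolean NOT NULL DEFAULT false  -- every car does or doesn't
);
```

With real columns, each attribute gets a proper type, NOT NULL and CHECK constraints apply, and queries stay simple joins instead of repeated self-joins on an attribute table.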
[GENERAL] heavy load-high cpu itilization
Dear all,

First of all, congratulations on your great work here, since from time to time I've found many answers to my problems. Unfortunately, for this specific problem I didn't find much relevant information, so I would ask for your guidance dealing with the following situation:

We have a dedicated server (8.4.4, redhat) with 24 cpus and 36 GB of RAM. I would say that the traffic on the server is huge, and the cpu utilization is pretty high too (avg ~ 75%, except during the nights when it is much lower). I am trying to tune the server a little bit to handle this problem. The incoming data in the database is about 30-40 GB/day.

At first the checkpoint_segments were set to 50, the checkpoint_timeout at 15 min, and the checkpoint_completion_target was 0.5. I noticed that the utilization of the server was higher when it was close to making a checkpoint, and since the parameter full_page_writes is ON, I changed the parameters mentioned above to (I did that after reading a lot of stuff online):

checkpoint_segments -> 250
checkpoint_timeout -> 40min
checkpoint_completion_target -> 0.8

but the cpu utilization is not significantly lower. Another parameter I will certainly change is wal_buffers, which is now set at 64KB and I plan to make it 16MB. Can this parameter cause a significant percentage of the problem? Are there any suggestions what I can do to tune the server better? I can provide any information you find relevant for the configuration of the server, the OS, the storage, etc.

Thank you in advance

--
View this message in context: http://postgresql.1045698.n5.nabble.com/heavy-load-high-cpu-itilization-tp4635696p4635696.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.
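For reference, the settings described above would look like this in postgresql.conf (8.4-era parameter names; wal_buffers is shown at the planned new value, and note that checkpoint_completion_target is a fraction of checkpoint_timeout, not a number of seconds):

```
checkpoint_segments = 250            # WAL segments between checkpoints
checkpoint_timeout = 40min           # maximum time between checkpoints
checkpoint_completion_target = 0.8   # fraction of the interval, not seconds
wal_buffers = 16MB                   # planned change, up from 64kB
```

Spreading checkpoint writes over 80% of a 40-minute interval smooths I/O spikes; it does not by itself reduce CPU load, which is consistent with what the poster observed.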
Re: [GENERAL] variant column type
> the car speed is 240
> the car has an airbag
>
> Here the first value is integer and the second value is boolean. Consider
> that I have this table structure
>
> feature (feature id feature name)
> car (car id, )
> car_feature (car id, feature id, value). the value attribute might have
> different domains. How can I model this using postgres and using ANSI
> compliant design ?
>
> Regards

Given "feature" and "car-feature" tables, the presence or absence of an entry in "car-feature" will accomplish your desire for true/false - i.e., "the car has an airbag". By abstracting just a little, every "feature" can be boiled down to a label/ID and then added to "feature" and associated via "car-feature".

In your example you could create a feature called "Top Speed - 240kph".

If every car is going to have a particular "feature" and only the "value" matters, you could consider adding a "car-properties" table:

car_property (car id, top_speed, etc.)

and populate the top_speed column with whatever value is applicable, or leave it NULL if unknown or N/A. The relationship between "car" and "car_property" would be one-to-one (1-to-1).

Dave J.
Re: [GENERAL] variant column type
salah jubeh, 26.07.2011 19:02:
> Hello,
>
> suppose the following scenario
>
> the car speed is 240
> the car has an airbag
>
> Here the first value is integer and the second value is boolean. Consider
> that I have this table structure
>
> feature (feature id feature name)
> car (car id, )
> car_feature (car id, feature id, value). the value attribute might have
> different domains. How can I model this using postgres and using ANSI
> compliant design ?
>
> Regards

Have a look at the hstore contrib module. It allows you to store key/value pairs (lots of them) in a single column.

create table car
(
   car_id integer,
   features hstore
);

insert into car (car_id, features) values (1, 'speed => 240, airbag => true');
insert into car (car_id, features) values (2, 'speed => 140, airbag => false');
insert into car (car_id, features) values (3, 'speed => 140, flux_capacitor => true');

-- show the airbag attribute for all cars
-- will return null for those that don't have that attribute
select car_id, (features -> 'airbag') as airbag_flag
from car;

-- return all rows that have an attribute named flux_capacitor with the value true
select *
from car
where features @> ('flux_capacitor => true');

Note that the only drawback of this solution is that you don't have any datatypes for the attributes and you can't create a foreign key constraint to a "feature" table. But it's probably the most flexible way to deal with such a requirement in Postgres.
Re: [GENERAL] variant column type
On Tue, Jul 26, 2011 at 11:06 AM, David Johnston wrote:
> Given “feature” and “car-feature” tables the presence of absence of an entry
> in “car-feature” will accomplish your desire for true/false - i.e., “the car
> has an airbag”. By abstracting just a little every “feature” can be boiled
> down to a label/ID and then the added to “feature” and associated via
> “car-feature”.
>
> In your example you could create a feature called “Top Speed – 240kph”
>
> If every car is going to have a particular “feature” and only the “value”
> matters you could considering adding a “car-properties” table:
>
> car_property (car id, top_speed, etc…) and populate the top_speed column
> with whatever value is applicable or leave it NULL if unknown or N/A. The
> relationship between “car” and “car_property” would be one-to-one (1-to-1)

I don't like this approach for a couple of reasons.

1) Storing non-applicable and unknowns as interchangeable in a database schema introduces semantic ambiguity issues that are best avoided if possible.

2) While wide tables win in terms of supporting more complex constraints, they lose in terms of storage, etc.

I would personally create three tables:

1) car (id, top_speed, etc)
2) features (id, feature_name, etc)
3) car_has_feature (car_id, feature_id)

This has the benefit of allowing you to track additional information about features. For example, you could track that seatbelts are required in Washington State on all cars manufactured after a certain date, or that federal law requires certain airbag features. The array functions in PostgreSQL are powerful enough to handle queries of features by car pretty well.

Now, there are a few cases, however, where key-value mapping is both necessary and works, and where variant column types are needed (for example, storing application settings, or argument lists for the functions that menu items call).
In those cases you have to have somewhere that knows what the type is supposed to be and checks it. That's not a trivial problem, because keeping to a single-point-of-truth approach is very difficult with such relatively unstructured data, which is why in applications where I have to do this, we require that the table be updated through stored procedures which do the checking. Best Wishes, Chris Travers
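Chris suggests routing updates through stored procedures that do the type checking; one possible shape for such a checked key/value store (all names here are my own illustration, using a trigger as the enforcement point rather than a call-through procedure) is a definition table that records each key's expected type, plus a trigger that rejects values that don't cast cleanly:

```sql
create table setting_def (
    key      text primary key,
    datatype text not null check (datatype in ('int', 'bool', 'text'))
);

create table setting (
    key   text primary key references setting_def(key),
    value text not null
);

create or replace function check_setting_type() returns trigger as $$
declare
    dt text;
begin
    select datatype into dt from setting_def where key = new.key;
    -- the cast raises an error if the text doesn't fit the declared type
    if dt = 'int' then
        perform new.value::integer;
    elsif dt = 'bool' then
        perform new.value::boolean;
    end if;
    return new;
end;
$$ language plpgsql;

create trigger setting_type_check
    before insert or update on setting
    for each row execute procedure check_setting_type();
```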
Re: [GENERAL] variant column type
-Original Message- From: Chris Travers [mailto:chris.trav...@gmail.com] Sent: Tuesday, July 26, 2011 2:32 PM To: David Johnston Cc: salah jubeh; pgsql Subject: Re: [GENERAL] variant column type
> In your example you could create a feature called Top Speed 240kph
>
> If every car is going to have a particular feature and only the value
> matters, you could consider adding a car-properties table:
>
> car_property (car id, top_speed, etc) and populate the top_speed
> column with whatever value is applicable, or leave it NULL if unknown
> or N/A. The relationship between car and car_property would be
> one-to-one (1-to-1)

I don't like this approach for a couple of reasons. 1) Storing non-applicable and unknown values as interchangeable in a database schema introduces semantic ambiguity issues that are best avoided if possible. 2) While wide tables win in terms of supporting more complex constraints, they lose in terms of storage, etc.

-- Agreed. But I was suggesting using the "wide-table" in addition to the "car;feature;car_has_feature" tables. Basically, limit the extended table to those properties that are truly (or at least almost truly) global. There should only be a few fields. The fact that a car has a top speed can be assumed to be global, and thus calling it a "feature" is possibly abstracting things too much. If you need to display it in a "feature list" you can readily write a VIEW that will pull that integer value out of the extended table, convert it into a meaningful "name/description", and present it as a list of "Fixed Features". My main concern with the whole "feature" table is that you end up going down the path of everything being a "feature" - the VIN, Make, Model, Year - where in most sane cases you'd be better off having columns for those, since every car has one. And so, while I say use a "wide-table" to capture some of these additional values, you could just add the "top-speed" field to the main car table. 
In this specific example there is no semantic ambiguity, since we know that a car has a top speed and so a NULL must represent an UNKNOWN value. If the NULL could represent "Not Applicable" I would probably leave it to the "feature" table. David J.
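The VIEW David mentions could be sketched like this (names are illustrative; this assumes a car (car_id, top_speed) table alongside the features / car_has_feature pair discussed earlier in the thread):

```sql
-- present the fixed, typed top_speed column as if it were a feature,
-- together with the genuinely optional rows from the feature tables
create view car_feature_list as
select car_id,
       'Top Speed - ' || top_speed || ' kph' as feature_description
from car
where top_speed is not null
union all
select chf.car_id, f.feature_name
from car_has_feature chf
join features f on f.id = chf.feature_id;
```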
[GENERAL] select all rows where any column is NULL
Does anyone have a concise way of doing $SUBJECT? Best, Nathan
Re: [GENERAL] select all rows where any column is NULL
Copy and paste the column names from the "CREATE TABLE" statement... You can try using the information schema and building a dynamic SQL query inside a function... If you let people know WHAT you are trying to accomplish you may find you get alternative suggestions that you never considered. SQL itself is designed for a variable/unknown number of rows but a fixed/known number of columns. The only way around that is to use a "table of column names" to identify the columns and then dynamically create a query with those "fixed/known" columns and "EXECUTE" it within a PL/pgSQL function. David J.
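A sketch of the dynamic approach David describes (the function name, the public-schema assumption, and the details are mine): build an OR'ed list of "col IS NULL" tests from information_schema and EXECUTE it.

```sql
create or replace function rows_with_null(p_table text)
returns setof record as $$
declare
    cond text;
begin
    -- one "col IS NULL" test per column, OR'ed together
    -- (array_to_string/array_agg keeps this 8.4-compatible;
    --  on 9.0+ string_agg would be tidier)
    select array_to_string(array_agg(quote_ident(column_name) || ' is null'), ' or ')
      into cond
      from information_schema.columns
     where table_name = p_table
       and table_schema = 'public';   -- assumes the table lives in public

    return query execute
        'select * from ' || quote_ident(p_table) || ' where ' || cond;
end;
$$ language plpgsql;
```

Because the function returns SETOF record, the caller has to supply the column list, e.g. select * from rows_with_null('foo') as t(a int, b text);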
Re: [GENERAL] variant column type
On Tue, 26 Jul 2011 10:45:27 -0700, John R Pierce wrote: in general, attribute-value sorts of lists are very difficult to use for relational operations and result in clumsy, inefficient queries, as well as poor data integrity. Whenever possible, common attributes should be stored properly as table fields. Reserve EAV for highly sparse freeform information that could not have been anticipated at design time. For your example, all cars have a speed, and do/don't have an airbag, so these should be normal fields in a table. -- john r pierce N 37, W 122 santa cruz ca mid-left coast

Everything above is true. A database table is like a C struct: no inheritance. If you have common attributes for some class, but not all cars have the same class, you can create an "extending" table with those attributes as columns and then join it with car. I currently work on a project designed as car 1..* features. It's painful. Many feature IDs are hard-coded, and there is no contract programming (no support from the compiler, etc.). I use O-R libraries, and I can't even write car.speed! Regards, Radek
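The "extending" table Radek mentions might look like this (illustrative names): a one-to-one side table holding the attributes shared by one class of cars, joined back to the main table when needed.

```sql
create table car (
    id    serial primary key,
    make  text,
    model text
);

-- only one class of cars (say, sports cars) has these attributes
create table sports_car_ext (
    car_id        integer primary key references car(id),
    track_package boolean,
    zero_to_100_s numeric
);

-- cars outside the class simply get NULLs for the extended columns
select c.*, s.track_package, s.zero_to_100_s
from car c
left join sports_car_ext s on s.car_id = c.id;
```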
Re: [GENERAL] select all rows where any column is NULL
On Tue, Jul 26, 2011 at 2:08 PM, Nathan Boley wrote: > Does anyone have a concise way of doing $SUBJECT? select * from foo where (row((foo).*) = row((foo).*)) is null; merlin
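A quick demonstration of why Merlin's trick works (the table and data are mine): SQL row-wise equality yields NULL, not true, as soon as any element of the row is NULL, and the outer IS NULL then selects exactly those rows.

```sql
create table foo (a integer, b text);
insert into foo values (1, 'x'), (2, null), (null, 'y'), (3, 'z');

-- comparing a row to itself is TRUE when every column is non-null,
-- and NULL as soon as any column is NULL
select * from foo where (row((foo).*) = row((foo).*)) is null;
-- returns the (2, null) and (null, 'y') rows
```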
Re: [GENERAL] 100 times faster than mysql
On Tue, 26 Jul 2011 17:02:12 +0300, Allan Kamau wrote: On Tue, Jul 26, 2011 at 4:41 PM, Merlin Moncure wrote: http://codesynthesis.com/~boris/blog/2011/07/26/odb-1-5-0-released/ merlin "The surprising part is that PostgreSQL 9.0.4 is more than 100 times faster on this test than MySQL 5.1.49 with the InnoDB backend (186s for MySQL, 48s for SQLite, and 12s for PostgreSQL). Postgre developers seem to be doing something right." If the speed is to be measured purely (and simply) on these numbers, 186/12 yields 15.5 (or maybe 16 if you round it up, or 15 if you use integer division). Maybe "about 15-16 times faster" would be more in line with the numbers provided. Allan. Congratulations, but those numbers are a little bit unrealistic. 50 microseconds per query - maybe they use ODB caching? Regards, Radek
Re: [GENERAL] heavy load-high cpu itilization
On 07/26/2011 01:47 PM, Filippos wrote: we have a dedicated server (8.4.4, redhat) with 24 cpus and 36 GB of RAM. i would say that the traffic in the server is huge and the cpu utilization is pretty high too (avg ~ 75% except during the nights when it is much lower). i am trying to tune the server a little bit to handle this problem. the incoming data in the database are about 30-40 GB /day. Well, the first question is where the CPU usage is coming from. There are two basic schools of thought here: 1) Use real-time monitoring utilities like "top -c" and see what is gobbling time up. It's possible to miss what's happening, but if you're at 75% a large chunk of the day that doesn't seem likely. 2) Set log_min_duration_statement and the other logging parameters; analyze the resulting log files to see where the CPU time is going. You seem to be focused on the background writer and its checkpoint process right now. That cannot be the source of high CPU usage; at most it could fully use one of your 24 cores. You should fix wal_buffers to a reasonable value regardless, but your problem is not in that area. Importing 30-40 GB/day is extremely difficult to do in PostgreSQL. My guess is that most of the server time is spent running the data import process itself--even COPY, the most efficient way to get data in, is very CPU intensive. The second layer of problems here that can increase CPU usage comes from autovacuum taking up a lot of resources to run, which it will do all the time given this volume of activity. And there's always the possibility that the queries you're running against the data are just taking a long time to execute. Another layer of problems in this scenario you'll hit eventually is that you'll need to have your tables partitioned in order to prune old data out efficiently. Presumably you can't keep up with that rate for very long before you have to start dropping older data, and that's really hard to do efficiently unless you've used partitions. P.S. 
You should upgrade to PostgreSQL 8.4.8 as soon as possible. There is a bug in autovacuum that's been resolved as of 8.4.6 that you are very likely to run into: http://www.postgresql.org/docs/8.4/static/release-8-4-6.html P.P.S. The pgsql-performance list would be a more appropriate place to have this discussion. Some of the people who provide good input over there on topics like this don't read pgsql-general; there are too many messages on this list for them. are there any suggestions what i can do to tune better the server? i can provide any information you find relevant for the configuration of the server, the OS, the storage etc There's a chapter on each of these in my PostgreSQL performance book, and I'm not aware of any other resource that takes on all of these topics usefully. If you're trying to keep up with this volume of data, buying a copy should repay itself in time savings--where you can look something up rather than trying to figure it out from scratch--about once every week. -- Greg Smith 2ndQuadrant USg...@2ndquadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us
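The partition-and-prune approach Greg recommends could be sketched like this for the 8.4-era inheritance scheme (table names and the daily granularity are illustrative): dropping a child table removes a day's data as a cheap metadata operation instead of a huge DELETE followed by VACUUM.

```sql
create table measurements (
    logged_at timestamptz not null,
    payload   text
);

-- one child table per day; the CHECK constraint lets
-- constraint_exclusion skip irrelevant partitions at query time
create table measurements_2011_07_26 (
    check (logged_at >= '2011-07-26' and logged_at < '2011-07-27')
) inherits (measurements);

-- pruning old data: drop the whole child, no row-by-row DELETE
drop table measurements_2011_07_26;
```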
[GENERAL] Suggested enhancement to pg_restore
I suggest adding the following parameter to pg_restore: --rename-table= When used in conjunction with the --data-only, --schema and -t options (all three of which would be necessary), it would allow restoring a table (without indexes) to a different table name (which would need to already exist and match the structure of the table which is being restored, of course.) This would give PostgreSQL users the ability to reload a table from a dump file to a separate table name in the same database and schema. In other words, this command: pg_restore --data-only --schema=abc -t xyz --rename-table=xyz_copy would restore a copy of table xyz into the existing (and presumably empty) table xyz_copy, leaving table xyz untouched. -- Mike Nolan no...@tssi.com
Re: [GENERAL] 100 times faster than mysql
On 07/26/2011 10:02 AM, Allan Kamau wrote: If the speed is to be measured purely (and simply) on these numbers, 186/12 yields 15.5 (or maybe 16 if you round it up, or 15 if you use integer division). Maybe "about 15-16 times faster" would be more in line with the numbers provided. I guess he did the math on MySQL, too. Could be worse; could have run into http://bugs.mysql.com/bug.php?id=33704 which, as you can see, is totally not a bug. -- Greg Smith 2ndQuadrant USg...@2ndquadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us
Re: [GENERAL] Suggested enhancement to pg_restore
On Tue, Jul 26, 2011 at 3:48 PM, Michael Nolan wrote: > I suggest adding the following parameter to pg_restore: > > --rename-table= > > When used in conjunction with the --data-only, --schema and -t options (all > three of which would be necessary), > it would allow restoring a table (without indexes) to a different table name > (which would need to already exist > and match the structure of the table which is being restored, of course.) Does pg_restore allow you to specify a set of tables the same way pg_dump does, i.e. by -t table1 -t table2? If so, how would this feature play along? Best Wishes, Chris Travers
Re: [GENERAL] 100 times faster than mysql
On Tue, Jul 26, 2011 at 3:53 PM, Greg Smith wrote: > I guess he did the math on MySQL, too. Could be worse; could have run into > http://bugs.mysql.com/bug.php?id=33704 which, as you can see, is totally not > a bug. > Or transactions deadlocking against themselves. Best Wishes, Chris Travers