If you're not Scott Daniels, beware that this conversation has gone horribly off topic and, unless you have an interest in PostreSQL, you may not want to bother reading on...
On Oct 25, 2007, at 9:46 PM, Scott David Daniels wrote: > Erik Jones wrote: >> >> On Oct 25, 2007, at 7:28 AM, Scott David Daniels wrote: >>> Diez B. Roggisch wrote: >>>> Abandoned wrote: >>>>> Hi.. >>>>> I use the threading module for the fast operation. But .... >>> [in each thread] >>>>> def save(a,b,c): >>>>> cursor.execute("INSERT INTO ... >>>>> conn.commit() >>>>> cursor.execute(...) >>>>> How can i insert data to postgresql the same moment ?... >>>> >>>> DB modules aren't necessarily thread-safe. Most of the times, a >>>> connection (and ... cursor) can't be shared between threads. >>>> So open a connection for each thread. >>> >>> Note that your DB server will have to "serialize" your inserts, so >>> ... a single thread through a single connection to the DB is the way >>> to go. Of course it (the DB server) may be clever enough to behave >>> "as if" they are serialized, but most of your work parallelizing at >>> your end simply creates new work at the DB server end. >> >> Fortunately, in his case, that's not necessarily true.... If he >> goes the recommended route with a separate connection for each >> thread, >> then Postgres will not serialize multiple inserts coming from >> separate >> connections unless there is something like and ALTER TABLE or REINDEX >> concurrently happening on the table. >> The whole serialized inserts thing is strictly something popularized >> by MySQL and is by no means necessary or standard (as with a lot of >> MySQL). > > But he commits after every insert, which _does_ force serialization > (if > only to provide safe transaction boundaries). I understand you can > get > clever at how to do it, _but_ preserving ACID properties is exactly > what > I mean by "serialize," First, bad idea to work with your own definition of a very domain specific and standardized term. Especially when Postgres's Multi- Version Concurrency Control mechanisms are designed specifically for the purpose of preserve ACID compliance without forcing serialized transactions on the user. Second, unless he specifically sets his transaction level to serializable, he will be working in read-committed mode. What this specifically means is that two (or more) transactions writing to the same table will not block any of the others. Let's say the user has two concurrent inserts to run on the same table that, for whatever reason, take a while to run (for example, they insert the results of some horribly complex or inefficient select), if either is run in serializable mode then which ever one starts a fraction of a second sooner will run until completion before the second is even allowed to begin. In (the default) read-committed mode they will both begin executing as soon as they are called and will write their data regardless of conflicts. At commit time (which may be sometime later for transactions with multiple statements are used) is when conflicts are resolved. So, if between the two example transactions there does turn out to be a conflict betwen their results, whichever commits second will roll back and, since the data written by the second transaction will not be marked as committed, it will never be visible to any other transactions and the space will remain available for future transactions. Here's the relevant portion of the Postgres docs on all of this: http://www.postgresql.org/docs/8.2/interactive/mvcc.html > and while I like to bash MySQL as well as the > next person, I most certainly am not under the evil sway of the vile > MySQL cabal. Good to hear ;) > > The server will have to be able to abort each transaction > _independently_ of the others, and so must serialize any index > updates that share a page by, for example, landing in the same node > of a B-Tree. There is nothing inherent in B-Trees that prevents identical datum from being written in them. If there was the only they'd be good for would be unique indexes. Even if you do use a unique index, as noted above, constraints and conflicts are only enforced at commit time. Erik Jones Software Developer | Emma® [EMAIL PROTECTED] 800.595.4401 or 615.292.5888 615.292.0777 (fax) Emma helps organizations everywhere communicate & market in style. Visit us online at http://www.myemma.com -- http://mail.python.org/mailman/listinfo/python-list