Anyone know what is up with this? I have two queries here which return
the same results: one uses a left outer join to get some data from a
table which may not match a constraint, and one uses a union to get
the data from each constraint and put them together. The second one
isn't nearly as
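As a sketch of the two shapes being compared (table and column names
here are made up for illustration, not taken from the original query):

-- Variant 1: a single left outer join; rows with no match come back with NULLs
SELECT o.id, o.created, d.delivered
  FROM orders o
  LEFT OUTER JOIN deliveries d ON d.order_id = o.id;

-- Variant 2: the same result assembled as a union of matched and unmatched rows
SELECT o.id, o.created, d.delivered
  FROM orders o
  JOIN deliveries d ON d.order_id = o.id
UNION ALL
SELECT o.id, o.created, NULL
  FROM orders o
 WHERE NOT EXISTS (SELECT 1 FROM deliveries d WHERE d.order_id = o.id);

The planner usually handles the single outer join in one pass, while the
union form scans the driving table twice, which is one common reason the
two plans differ in cost.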
I'm getting a SAN together to consolidate my disk space usage for my
servers. It's iSCSI-based and I'll be PXE-booting my servers from it.
The idea is to keep spares on hand for one system (the SAN) and not have
to worry about spares for each specific storage system on each server.
This also makes
Frank Wiles wrote:
> On Thu, 4 Jan 2007 15:00:05 -0300
> "Charles A. Landemaine" <[EMAIL PROTECTED]> wrote:
>
>> I'm building an e-mail service that has two requirements: It should
>> index messages on the fly to have lightning-fast search results, and it
>> should be able to handle large amounts of s
Joshua D. Drake wrote:
> I agree. I have many people who want to purchase a SAN because someone
> told them that is what they need... Yet they can spend 20% of the cost
> on two external arrays and get incredible performance...
>
> We are seeing great numbers from the following config:
>
> (2) HP
Marcin Mank wrote:
>> So the question is: why, on a relatively simple proc, am I getting a query
>> performance delta between 3549ms and 7ms?
>
> What version of PG is it?
>
> I had such problems in a pseudo-realtime app I use here with Postgres, and
> they went away when I moved to 8.1 (from 7.4).
Jim C. Nasby wrote:
>
> It can cause a race if another process could be performing those same
> inserts or updates at the same time.
There are inserts and updates running all of the time, but never on the
same data. I'm not sure how I can get around this since the queries are
coming from my radius
Jim,
Thanks for the help. I went and looked at that example and I don't see
how it's different from the "INSERT into radutmp_tab" I'm already doing.
Both raise an exception; the only difference is that I'm not doing
anything with it. Perhaps you are talking about the "IF (NOT FOUND)" I
put afte
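If the example in question is the loop-and-retry upsert from the
PostgreSQL documentation, the crucial difference is the retry: the
unique-key error is caught and the function goes back and tries the
UPDATE again. A sketch adapted to the radutmp_tab name from this message
(the id/data columns are placeholders, not the real schema):

CREATE OR REPLACE FUNCTION merge_radutmp(_key INTEGER, _data TEXT) RETURNS VOID AS '
BEGIN
    LOOP
        -- first try the UPDATE
        UPDATE radutmp_tab SET data = _data WHERE id = _key;
        IF FOUND THEN
            RETURN;
        END IF;
        -- not there, so try to INSERT; a concurrent insert of the same
        -- key raises a unique-key error, which we catch and then loop
        -- back to retry the UPDATE
        BEGIN
            INSERT INTO radutmp_tab (id, data) VALUES (_key, _data);
            RETURN;
        EXCEPTION WHEN unique_violation THEN
            -- do nothing; loop around and try the UPDATE again
        END;
    END LOOP;
END;
' LANGUAGE plpgsql;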
List,
I posted a little about this a while back to the general list, but never
really got anywhere with it, so I'll try again, this time with a little
more detail, and hopefully someone can send me in the right direction.
Here is the problem: I have a procedure that is called 100k times a day.
Mo
Jim C. Nasby wrote:
> On Wed, Dec 14, 2005 at 01:56:10AM -0500, Charles Sprickman wrote:
> You'll note that I'm being somewhat driven by my OS of choice, FreeBSD.
>
>>Unlike Solaris or other commercial offerings, there is no nice volume
>>management available. While I'd love to keep managing a
John A Meinel wrote:
> Surely this isn't what you have. You have *no* loop here, and you have
> stuff like:
> AND
> (bayes_token_tmp) NOT IN (SELECT token FROM bayes_token);
>
> I'm guessing this isn't your last version of the function.
>
> As far as putting the CREATE TEMP TABLE inside th
Matthew Schumacher wrote:
> Tom Lane wrote:
>
>
>>I don't really see why you think that this path is going to lead to
>>better performance than where you were before. Manipulation of the
>>temp table is never going to be free, and IN (sub-select) is always
>&g
Tom Lane wrote:
> I don't really see why you think that this path is going to lead to
> better performance than where you were before. Manipulation of the
> temp table is never going to be free, and IN (sub-select) is always
> inherently not fast, and NOT IN (sub-select) is always inherently
> aw
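One common way around the NOT IN (sub-select) cost on those releases is
to phrase it as an anti-join. A sketch using the table names from the
thread (assuming token is the key column and is NOT NULL):

-- the slow shape
SELECT token
  FROM bayes_token_tmp
 WHERE token NOT IN (SELECT token FROM bayes_token);

-- an anti-join that the planner can turn into a much cheaper plan
SELECT t.token
  FROM bayes_token_tmp t
  LEFT JOIN bayes_token b ON b.token = t.token
 WHERE b.token IS NULL;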
John A Meinel wrote:
> Matthew Schumacher wrote:
>
> I recommend that you drop and re-create the temp table. There is no
> reason to have it around, considering you delete and re-add everything.
> That means you never have to vacuum it, since it always only contains
> the late
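A sketch of that suggestion (the single token column is assumed; the
real temp table may carry more):

-- instead of DELETE FROM bayes_token_tmp, which leaves dead rows behind,
-- throw the table away and make a fresh one per batch
DROP TABLE bayes_token_tmp;
CREATE TEMP TABLE bayes_token_tmp (token BYTEA);
-- or declare it ON COMMIT DROP and skip the explicit DROP entirely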
Okay,
Here is the status of the SA updates and a question:
Michael got SA changed to pass an array of tokens to the proc, so right
there we gained a ton of performance due to connections and transactions
being grouped into one per email instead of one per token.
Now I am working on making the pro
PFC wrote:
>
>
>> select put_tokens2(1, '{"\\246\\323\\061\\332\\277"}', 1, 1, 1);
>
>
> Try adding more backslashes until it works (seems that you need
> or something).
> Doesn't DBI convert the language's types to Postgres quoted forms on its
> own?
>
You're right, I am find
Tom Lane wrote:
>
> Revised insertion procedure:
>
>
> CREATE or replace FUNCTION put_tokens (_id INTEGER,
> _tokens BYTEA[],
> _spam_count INTEGER,
> _ham_count INTEGER,
> _at
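The body of that revised procedure is cut off above. Purely as a sketch
of the array-based idea (not necessarily Tom's actual code, and with the
bayes_token column names assumed), it would walk the array roughly like
this:

CREATE OR REPLACE FUNCTION put_tokens (_id INTEGER,
                                       _tokens BYTEA[],
                                       _spam_count INTEGER,
                                       _ham_count INTEGER,
                                       _atime INTEGER)
RETURNS VOID AS '
DECLARE
    _token BYTEA;
    i INTEGER;
BEGIN
    FOR i IN array_lower(_tokens, 1) .. array_upper(_tokens, 1) LOOP
        _token := _tokens[i];
        UPDATE bayes_token
           SET spam_count = spam_count + _spam_count,
               ham_count  = ham_count  + _ham_count,
               atime      = _atime
         WHERE id = _id AND token = _token;
        IF NOT FOUND THEN
            INSERT INTO bayes_token (id, token, spam_count, ham_count, atime)
            VALUES (_id, _token, _spam_count, _ham_count, _atime);
        END IF;
    END LOOP;
    RETURN;
END;
' LANGUAGE plpgsql;

The win over the per-token version is that one call (one parse, one
transaction) covers every token in a message.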
Tom Lane wrote:
> Michael Parker <[EMAIL PROTECTED]> writes:
>
>>sub bytea_esc {
>> my ($str) = @_;
>> my $buf = "";
>> foreach my $char (split(//,$str)) {
>>if (ord($char) == 0) { $buf .= "\\\\000"; }
>>elsif (ord($char) == 39) { $buf .= "\\\\047"; }
>>elsif (ord($char) == 92) { $b
Ok, here is the current plan.
Change the SpamAssassin API to pass a hash of tokens into the storage
module, pass the tokens to the proc as an array, start a transaction,
load the tokens into a temp table using COPY, select the distinct new
tokens into the token table, update the token t
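A rough sketch of that sequence (the _-prefixed values stand in for the
per-message parameters, and the column names are assumed):

BEGIN;

CREATE TEMP TABLE bayes_token_tmp (token BYTEA) ON COMMIT DROP;
COPY bayes_token_tmp FROM STDIN;   -- one row per token in the message

-- add tokens we have never seen before
INSERT INTO bayes_token (id, token, spam_count, ham_count, atime)
SELECT DISTINCT _id, t.token, 0, 0, _atime
  FROM bayes_token_tmp t
  LEFT JOIN bayes_token b ON b.id = _id AND b.token = t.token
 WHERE b.token IS NULL;

-- then bump the counters for every token in the message
UPDATE bayes_token
   SET spam_count = spam_count + _spam_count,
       ham_count  = ham_count  + _ham_count,
       atime      = _atime
  FROM bayes_token_tmp t
 WHERE bayes_token.id = _id AND bayes_token.token = t.token;

COMMIT;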
Tom Lane wrote:
> I looked into this a bit. It seems that the problem when you wrap the
> entire insertion series into one transaction is associated with the fact
> that the test does so many successive updates of the single row in
> bayes_vars. (VACUUM VERBOSE at the end of the test shows it cl
Karim Nassar wrote:
>
> [EMAIL PROTECTED]:~/k-bayesBenchmark$ time ./test.pl
> <-- snip db creation stuff -->
> 17:18:44 -- START
> 17:19:37 -- AFTER TEMP LOAD : loaded 120596 records
> 17:19:46 -- AFTER bayes_token INSERT : inserted 49359 new records into
> bayes_token
> 17:19:50 -- AFTER bayes_
Ok, here is where I'm at, I reduced the proc down to this:
CREATE FUNCTION update_token (_id INTEGER,
_token BYTEA,
_spam_count INTEGER,
_ham_count INTEGER,
_atime INTEGER)
RETU
Andrew McMillan wrote:
>
> For the data in question (i.e. bayes scoring) it would seem that not
> much would be lost if you did have to restore your data from a day old
> backup, so perhaps fsync=false is OK for this particular application.
>
> Regards,
> And
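For reference, the setting being discussed lives in postgresql.conf; the
trade-off is exactly the one described above (transactions committed just
before a crash can be lost, but the bayes data can be rebuilt):

# postgresql.conf
fsync = off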
Ok,
Here is something new: when I take my data.sql file and add a BEGIN and
COMMIT at the top and bottom, the benchmark is a LOT slower. Why?
My understanding is that it should be much faster because fsync isn't
called until the COMMIT instead of on every SQL command.
I must be missing something here
Andrew McMillan wrote:
> On Thu, 2005-07-28 at 16:13 -0800, Matthew Schumacher wrote:
>
>>Ok, I finally got some test data together so that others can test
>>without installing SA.
>>
>>The schema and test dataset is over at
>>http://www.aptalaska.net/~matt.
Gavin Sherry wrote:
>
> I had a look at your data -- thanks.
>
> I have a question though: put_token() is invoked 120596 times in your
> benchmark... for 616 messages. That's nearly 200 queries (not even
> counting the 1-8 (??) inside the function itself) per message. Something
> doesn't seem ri
Karim Nassar wrote:
> On Wed, 2005-07-27 at 14:35 -0800, Matthew Schumacher wrote:
>
>
>>I put the rest of the schema up at
>>http://www.aptalaska.net/~matt.s/bayes/bayes_pg.sql in case someone
>>needs to see it too.
>
>
> Do you have sample data too?
Josh Berkus wrote:
> Matt,
>
> Well, it might be because we don't have a built-in GREATEST or LEAST prior to
> 8.1. However, it's pretty darned easy to construct one.
I was more talking about min() and max(), but yeah, I think you knew where
I was going with it...
>
> Well, there's the genera
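For what it's worth, a pre-8.1 stand-in really is a one-liner (the name
greatest_int and the integer-only signature are arbitrary choices here):

-- two-argument GREATEST substitute for servers older than 8.1
-- (NULL handling differs slightly from the 8.1 built-in)
CREATE OR REPLACE FUNCTION greatest_int(INTEGER, INTEGER) RETURNS INTEGER AS
  'SELECT CASE WHEN $1 > $2 THEN $1 ELSE $2 END'
LANGUAGE SQL IMMUTABLE;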
Josh Berkus wrote:
> Matt,
>
>
>>After playing with various indexes and what not I simply am unable to
>>make this procedure perform any better. Perhaps someone on the list can
>>spot the bottleneck and reveal why this procedure isn't performing that
>>well or ways to make it better.
>
>
> Wel
I'm not sure how much this has been discussed on the list, but I wasn't
able to find anything relevant in the archives.
The new SpamAssassin release is due out pretty soon. They are currently
testing 3.1.0pre4. One of the things I hope to get out of this release is
the Bayes word stats moved to a real RDBMS.