scan may not be as
efficient, since four times the data was returned in only a little more time
using the index scan.
Thank you,
James
-----Original Message-----
From: Tomas Vondra [mailto:t...@fuzzy.cz]
Sent: Monday, July 14, 2014 12:29 PM
To: Magers, James
Cc: Tomas Vondra; Thomas Kellerer; pgsql
, but anyway.
>
> Could you first apply a kind of grid to your observations? What I mean is
> to round your coords to, say, 1/2 arcsec on each axe and group the results.
> I think you will have most observations grouped this way and then use your
> regular algorithm to combine the results.
>
> Best regards, Vitalii Tymchyshyn
>
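A minimal SQL sketch of the grid approach Vitalii describes, with assumed table
and column names (observations(ra_deg, dec_deg), coordinates in degrees):
  -- Snap each coordinate to the nearest 1/2 arcsecond (7200 half-arcseconds
  -- per degree) and group, so near-duplicate observations collapse into one
  -- bucket before the regular combining algorithm runs.
  SELECT round(ra_deg  * 7200) / 7200 AS ra_grid,
         round(dec_deg * 7200) / 7200 AS dec_grid,
         count(*)                     AS n_obs
  FROM   observations
  GROUP  BY 1, 2;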
--
-
Craig A. James
Chief Technology Officer
eMolecules, Inc.
-
Hi Jiri,
I’m really interested in those [clustering] algorithms and want to study them. But
> I would need somebody to point me directly at a specific algorithm to look
> at. The main problem with choosing the right one (which even my university
> teacher couldn’t get past) is that you don’t know the numb
On Fri, Oct 10, 2014 at 5:10 AM, Tomas Vondra wrote:
> > I've gone looking for papers on this topic but from what I read this
> > isn't so. To get any noticeable improvement you need to read 10-50% of
> > the table and that's effectively the same as reading the entire table
> > -- and it still ha
On Fri, Oct 10, 2014 at 9:53 AM, Tomas Vondra wrote:
>
> On 10.10.2014 16:21, Craig James wrote:
> > Our index is for chemical structures. Chemicals are indexed on
> > chemical fragments
> > <http://emolecules.com/info/molecular-informatics>. A search
> >
We're buying a new server in the near future to replace an aging system.
I'd appreciate advice on the best SSD devices and RAID controller cards
available today.
The database is about 750 GB. This is a "warehouse" server. We load
supplier catalogs throughout a typical work week, then on the weeken
.@postgresql.org] On Behalf Of Andreas Joseph
> Krogh
> Sent: Wednesday, July 01, 2015 6:56 PM
> To: pgsql-performance@postgresql.org
> Subject: Re: [PERFORM] New server: SSD/RAID recommendations?
>
>
>
> On Thursday, 2 July 2015 at 01:06:57, Craig James wrote <
On Wed, Jul 1, 2015 at 4:56 PM, Andreas Joseph Krogh
wrote:
> On Thursday, 2 July 2015 at 01:06:57, Craig James <
> cja...@emolecules.com> wrote:
>
> We're buying a new server in the near future to replace an aging system.
> I'd appreciate advice on the best
On Wed, Jul 1, 2015 at 4:57 PM, Scott Marlowe
wrote:
> On Wed, Jul 1, 2015 at 5:06 PM, Craig James wrote:
> > We're buying a new server in the near future to replace an aging system.
> I'd
> > appreciate advice on the best SSD devices and RAID controll
ll 100% of all eight CPUs.
I have no idea why, and never investigated further. The convenience of
having the functions in SQL wasn't that important.
Craig
>
> Graeme.
>
On Tue, Jul 7, 2015 at 10:31 PM, Joshua D. Drake
wrote:
>
> On 07/07/2015 08:05 PM, Craig James wrote:
>
>>
>>
>> No ideas, but I ran into the same thing. I have a set of C/C++ functions
>> that put some chemistry calculations into Postgres as extensions (things
On Wed, Jul 8, 2015 at 10:52 AM, Joshua D. Drake
wrote:
>
> On 07/08/2015 10:48 AM, Craig James wrote:
>
> I admit that I haven't read this whole thread but:
>>
>> Using Apache Fast-CGI, you are going to fork a process for each
>> instance of the
On Wed, Jul 8, 2015 at 1:27 PM, Andres Freund wrote:
> On 2015-07-08 13:46:53 -0500, Merlin Moncure wrote:
> > On Wed, Jul 8, 2015 at 12:48 PM, Craig James
> wrote:
> > > On Tue, Jul 7, 2015 at 10:31 PM, Joshua D. Drake >
> > >> Using Apache Fast-CGI, you
B> scaling will suffer badly on FP functions.
That is half as many 256-bit float units; for scalar math and for
128-bit vector math each core gets half of the float unit.
Only for 256-bit vector math do the schedulers have to compete for
float unit access.
-JimC
--
James Cloos
The canonical advice here is to avoid more connections than you have CPUs,
and to use a connection pooler such as pgpool or pgbouncer to achieve that under heavy load.
We are considering using the Apache mod_perl "fast-CGI" system and perl's
Apache::DBI module, which caches persistent connections in order to improve
per
On Sat, Jul 25, 2015 at 8:04 AM, Tom Lane wrote:
> Craig James writes:
> > ... This would result in a thousand
> > or so Postgres connections on a machine with 32 CPUs.
>
> > So the question is: do idle connections impact performance?
>
> Yes. Those connec
At some point in the next year we're going to reconsider our hosting
environment, currently consisting of several medium-sized servers (2x4
CPUs, 48GB RAM, 12-disk RAID system with 8 in RAID 10 and 2 in RAID 1 for
WAL). We use barman to keep a hot standby and an archive.
The last time we dug into
eing
compared to a non-negligible number of rows in (a).
I can probably make this data available as a pg_dump file. Let me know if
you think that's necessary, and where I should upload it.
Regards,
James
[1]
postgres=# explain (analyze,buffers) select b.* from b join a on b.
On Sun, Feb 28, 2016 at 2:06 AM, David Rowley
wrote:
> On 27 February 2016 at 11:07, James Parks wrote:
> >
> > CREATE TABLE a (id bigint primary key, nonce bigint);
> > CREATE TABLE b (id bigint primary key, a_id bigint not null);
> > CREATE INDEX a_idx ON
I'm working with a third-party plugin that does chemistry. It's very fast.
However, I'm trying to do a sampling query, such as the first 1% of the
database, and I just can't get the planner to create a good plan. Here is
the full query (the |>| operator does a subgraph match of a molecular
substru
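For what it's worth, one hedged sketch of forcing the "first 1%" shape, with
assumed table and column names (molecules(id, smiles)), the |>| operator quoted
from above, and an arbitrary example substructure:
  -- Restrict the substructure search to roughly the first 1% of rows by id,
  -- so the id range (not the expensive operator) drives the plan.
  SELECT id, smiles
  FROM   molecules
  WHERE  id <= (SELECT min(id) + (max(id) - min(id)) / 100 FROM molecules)
    AND  smiles |>| 'c1ccccc1';

  -- PostgreSQL 9.5+ alternative: block-level sampling of about 1% of the table.
  -- SELECT id, smiles FROM molecules TABLESAMPLE SYSTEM (1) WHERE smiles |>| 'c1ccccc1';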
On Wed, Jun 22, 2016 at 11:36 AM, Jeff Janes wrote:
> On Wed, Jun 22, 2016 at 9:03 AM, Craig James
> wrote:
> > I'm working with a third-party plugin that does chemistry.
>
>
> Out of personal/professional curiosity, which one are you using, if
> that can be
On Thu, Jun 23, 2016 at 8:47 AM, Jeff Janes wrote:
> On Wed, Jun 22, 2016 at 9:36 PM, Craig James
> wrote:
> > On Wed, Jun 22, 2016 at 11:36 AM, Jeff Janes
> wrote:
> ...
> But, JChem's cartridge is apparently not using a GiST index, which is
> what my first gues
version to see if it speeds up things.
>
> However, is there a way to keep the query time constant as the database size
> grows? Should I use partitioning or partial indexes?
>
> Best Regards,
> Tommi Kaksonen
>
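Not an answer from the thread, just a sketch of the partial-index idea Tommi
asks about, with assumed names (measurements(device_id, ts)):
  -- A partial index covering only recent rows stays small as the table grows;
  -- the WHERE clause must be a constant, so the index is recreated (or a new
  -- one added) periodically as the cutoff moves forward.
  CREATE INDEX measurements_recent_idx
      ON measurements (device_id, ts)
      WHERE ts >= '2016-06-01';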
close to 6,000 records/second. The schema has been
> simplified since, and the last test reached just over 20,000 records/second
> with 300k tables.
>
> I'm not looking for alternatives yet but input to my test. Takers?
>
> I can't promise immediate feedback but will do my best to respond with
> results.
>
> TIA,
> -Greg
>
This is a weird problem. A "limit 5" query runs quickly as expected, but a
"limit 1" query never finishes -- it just blasts along at 100% CPU until I
give up. And this is a join between two small tables (262K rows and 109K
rows). Both tables were recently analyzed.
This is Postgres 9.3.5 (yes, we'l
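A diagnostic sketch (table and column names assumed, since the query itself is
cut off above): comparing the two plans usually shows the planner switching,
for LIMIT 1, to walking an index in output order and filtering, which degrades
badly when the first match is far away.
  EXPLAIN (ANALYZE, BUFFERS)
  SELECT t1.* FROM t1 JOIN t2 ON t2.t1_id = t1.id WHERE t2.flag LIMIT 5;

  EXPLAIN (ANALYZE, BUFFERS)
  SELECT t1.* FROM t1 JOIN t2 ON t2.t1_id = t1.id WHERE t2.flag LIMIT 1;
  -- interrupt this one if it runs away; plain EXPLAIN (without ANALYZE)
  -- still shows which plan shape the planner chose.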
iently query table A? Or, in other words, how do I
avoid incurring the cost of transaction isolation for queries against B on
a case-by-case basis?
Anything is on the table for implementation:
- moving tables to a different database / cluster / completely different
DBMS system
- designing an extension to tune either sets of queries
- partitioning tables
- etc
... although the simpler the better. If you were in this position, what
would you do?
Regards,
James
We are indexing about 5 million small documents using tsearch2/GIST. Each "document"
contains 2 to 50 words. This is a "write once, read many" situation. Write performance
is unimportant, and the database contents are static. (We build it offline.)
We're having problems with inconsistent pe
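A sketch of the kind of schema being described, with assumed names (tsearch2-era
syntax; the exact setup is not shown in the excerpt):
  CREATE TABLE docs (
      id       integer PRIMARY KEY,
      body     text,
      keywords tsvector          -- 2 to 50 words per document
  );
  CREATE INDEX docs_keywords_idx ON docs USING gist (keywords);

  -- typical read-only lookup against the static data:
  SELECT id FROM docs WHERE keywords @@ to_tsquery('word1 & word2');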
Oleg wrote:
Did you consider *decreasing* SIGLENINT ? Size of index will diminish
and performance could be increased. I use in current project SIGLENINT=15
The default value for SIGLENINT actually didn't work at all. It was only by
increasing it that I got any performance at all. An examinat
We are facing a critical situation because of the performance of the
database. Even a basic query like select count(*) from
bigger_table is taking about 4 minutes to return.
Several other replies have mentioned that COUNT() requires a full table scan, but this point can't be emphasized
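One hedged workaround when an approximate count is acceptable: the planner's
own row estimate (maintained by ANALYZE/VACUUM) can be read without scanning
the table at all.
  SELECT reltuples::bigint AS approx_rows
  FROM   pg_class
  WHERE  relname = 'bigger_table';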
Kevin Grittner wrote:
In addition to what Mark pointed out, there is the possibility that a
query
is running which is scanning a large table or otherwise bringing in a
large number of pages from disk. That would first use up all available
unused cache space, and then may start replacing some of
Scott Marlowe wrote:
What's needed is a way for the application developer to explicitly say,
"This object is frequently used, and I want it kept in memory."
There's an interesting conversation happening on the linux kernel
hackers mailing list right about now that applies:
http://www.gossamer
Jim C. Nasby wrote:
> Stefan Weiss wrote:
> ... IMO it would be useful to have a way to tell
> PG that some tables were needed frequently, and should be cached if
> possible. This would allow application developers to consider joins with
> these tables as "cheap", even when querying on columns
I have a function, call it "myfunc()", that is REALLY expensive computationally. Think
of it like, "If you call this function, it's going to telephone the Microsoft Help line and
wait in their support queue to get the answer." Ok, it's not that bad, but it's so bad that
the optimizer should A
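For readers hitting this today: PostgreSQL 8.3 and later let you declare a cost
for a function, which is the facility being asked for here (cost value and
signature assumed):
  -- Tell the planner that each call is very expensive (cost is in units of
  -- cpu_operator_cost), so it evaluates cheaper quals first and calls
  -- myfunc() as rarely as possible.
  ALTER FUNCTION myfunc(text, text) COST 100000;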
I am mystified by the behavior of "alarm" in conjunction with
Postgres/perl/DBD. Here is roughly what I'm doing:
eval {
    local $SIG{ALRM} = sub { die("Timeout"); };
    $time = gettimeofday;
    alarm 20;
    $sth = $dbh->prepare("a query that may take a long time...");
    $sth->execute();
    alarm 0;    # cancel the alarm once the query has finished
};
Thanks for the info on alarm and timeouts, this was a big help. One further
comment:
Michael Fuhr wrote:
eval {
    local $SIG{ALRM} = sub { die("Timeout"); };
    $time = gettimeofday;
    alarm 20;
    $sth = $dbh->prepare("a query that may take a long time...");
    $sth->execute();
    alarm 0;    # cancel the alarm once the query has finished
};
In a recent thread, several people pointed out that UPDATE = DELETE+INSERT.
This got me to wondering.
I have a table that, roughly, looks like this:
create table doc (
id integer primary key,
document text,
keywords tsvector
);
where "keywords" has a GIST index. Th
When I set statement_timeout in the config file, it just didn't
do anything - it never timed out (PG 8.0.3).
... but did you reload the server after you [changed statement_timeout]?
Mystery solved. I have two servers; I was reconfiguring one and restarting the
other. Duh.
Thanks,
Craig
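Related note, hedged: on recent releases statement_timeout can also be set per
session or per role, which avoids depending on the server-wide setting and a
reload at all (role name assumed):
  SET statement_timeout = '20s';                         -- this session only
  ALTER ROLE batch_user SET statement_timeout = '5min';  -- default for one role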
This article on ZDNet claims that hyperthreading can *hurt* performance, due to
contention in the L1/L2 cache by a second process:
http://news.zdnet.co.uk/0,39020330,39237341,00.htm
Has anyone tested this on Postgres yet? (And based on a recent somewhat
caustic thread about performance on t
So say I need 10,000 tables, but I can create tablespaces. Wouldn't that solve
the performance problem caused by Linux's (or ext2/3's) problems with large
directories?
For example, if each user creates (say) 10 tables, and I have 1000 users, I
could create 100 tablespaces, and assign groups o
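A sketch of the scheme being proposed, with made-up names: 100 tablespaces,
each backed by its own directory, with each user's ~10 tables assigned to one
of them, so roughly 100 tables (not 10,000) land in any single directory.
  CREATE TABLESPACE user_group_01 LOCATION '/data/pg/ts01';

  CREATE TABLE user_0001_docs (
      id  integer PRIMARY KEY,
      doc text
  ) TABLESPACE user_group_01;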
I asked a while back if there were any plans to allow developers to override the
optimizer's plan and force certain plans, and received a fairly resounding
"No". The general feeling I get is that a lot of work has gone into the
optimizer, and by God we're going to use it!
I think this is just
Tom,
I see this over and over. Tweak the parameters to "force" a certain
plan, because there's no formal way for a developer to say, "I know
the best plan."
I think you've misunderstood those conversations entirely. The point
is not to force the planner into a certain plan, it is to explore
Christopher Kings-Lynne wrote:
select * from my_table where row_num >= 5 and row_num < 10
and myfunc(foo, bar);
You just create an index on myfunc(foo, bar)
Thanks, but myfunc() takes parameters (shown here as "foo, bar"), one of which
is not a column, it's external and c
Christopher Kings-Lynne wrote:
select * from my_table where row_num >= 5 and row_num <
10
and myfunc(foo, bar);
You just create an index on myfunc(foo, bar)
only if myfunc(foo, bar) is immutable...
And if it's not then the best any database can do is to index scan
ro
Christopher Kings-Lynne wrote:
I don't necessarily disagree with your assertion that we need planner
hints, but unless you or someone else is willing to submit a patch with
the feature it's unlikely to ever be implemented...
Now that's an answer I understand and appreciate. Open-source develo
Mark Kirkwood wrote:
I hear what you are saying, but to use this fine example - I don't know
what the best plan is - these experiments part of an investigation to
find *if* there is a better plan, and if so, why Postgres is not finding
it.
There isn't a database in the world that is as smart
Tom Lane wrote:
This discussion has been had before (many times) ... see the -hackers
archives for detailed arguments. The one that carries the most weight
in my mind is that planner hints embedded in applications will not adapt
to changing circumstances --- the plan that was best when you desig
Christopher Kings-Lynne wrote:
Can you paste explain analyze and your effective_cache_size, etc. settings.
...
This seems like a case where PostgreSQL's current optimiser should
easily know what to do if your config settings are correct and you've
been running ANALYZE, so I'd like to see your
Christopher Kings-Lynne wrote:
So your main example bad query is possibly just a case of lack of
analyze stats and wrong postgresql.conf config? And that's what causes
you to shut down your database? Don't you want your problem FIXED?
I'm trying to help by raising a question that I think is
Yeah it would - an implementation I have seen that I like is where the
developer can supply the *entire* execution plan with a query. This is
complex enough to make casual use unlikely :-), but provides the ability
to try out other plans, and also fix that vital query that must run
today.
Kevin Brown wrote:
Hints are dangerous, and I consider them a last resort.
If you consider them a last resort, then why do you consider them to
be a better alternative than a workaround such as turning off
enable_seqscan, when all the other tradeoffs are considered?
If I understand enable_seq
Jaime Casanova wrote:
The context is this - in a busy OLTP system, sometimes a query comes
through that, for whatever reason (foolishness on my part as a
developer, unexpected use by a user, imperfection of the optimizer,
etc.), takes a really long time to run, usually because it table-scans
one
Rod Taylor wrote:
Just a minor clarification here: INSERT does not create dead rows, only
UPDATE and DELETE do. Thus, if you only insert rows, you do not need to
vacuum (although you probably need to analyze).
Is there no real-time garbage collection at all in Postgres? And if so, is
this bec
Alex Turner wrote:
The problem I see with software raid is the issue of a battery backed
unit: If the computer loses power, then the 'cache' which is held in
system memory, goes away, and fubars your RAID.
I'm not sure I see the difference. If data are cached, they're not written
whether it
Ron wrote:
We are not going to get valuable contributions nor help people become
more valuable to the community by "flaming them into submission".
Let's support getting definitive evidence. No matter who brings it to
the table ;-)
Thanks, Ron, for a voice of respect and reason. Since I fir
I'm using pg_dump/pg_restore to quickly copy databases between servers. But my
server keeps crashing when I run pg_restore:
glibc detected *** double free or corruption (!prev): 0x0a00b1a0
Postgres: 8.1.4
Linux: 2.6.12-1.1381_FC3
glibc: 2.3.6-0.fc3.1
Server: Dell
CPU: Xeon 2.80GHz
Me
Tom Lane wrote:
"Craig A. James" <[EMAIL PROTECTED]> writes:
I'm using pg_dump/pg_restore to quickly copy databases between servers. But my
server keeps crashing when I run pg_restore:
glibc detected *** double free or corruption (!prev): 0x0a00b1a0
What can I do
Well, once again I'm hosed because there's no way to tell the optimizer the
cost for a user-defined function. I know this issue has already been raised
(by me!) several times, but I have to remind everyone about this. I frequently
must rewrite my SQL to work around this problem.
Here is the
Adam Rich wrote:
Craig,
What version of postgres are you using? I just tested this on PG 8.1.2
and was unable to reproduce these results. I wrote a simple function
that returns the same text passed to it, after sleeping for 1 second.
I use it in a where clause, like your example below, and rega
Tom Lane wrote:
"Craig A. James" <[EMAIL PROTECTED]> writes:
CREATE OR REPLACE FUNCTION cansmiles(text) RETURNS text
AS '/usr/local/pgsql/lib/libchem.so', 'cansmiles'
LANGUAGE 'C' STRICT IMMUTABLE;
Umm ... this is a single-argument func
Guy Rouillier wrote:
I've got back access to my test system. I ran another test run with the
same input data set. This time I put pg_xlog on a different RAID volume
(the unused one that I suspect is a software RAID), and I turned
fsync=off in postgresql.conf. I left the rest of the configura
Guy,
The application is fairly straightforward, but as you say, what is
working okay with BigDBMS isn't working as well under PG. I'm going to
try other configuration suggestions made by others before I attempt
logic changes. The core logic is unchangeable; millions of rows of data
in a sin
Craig A. James wrote:
The "idiom" to replace count() was
"select col from tbl order by col desc limit 1". It worked miracles for
my app.
Sorry, I meant to write, "the idiom to replace MAX()", not count()... MAX()
was the function that was killing me,
Andreas Tille wrote:
My web application was running fine for years without any problem
and the performance was satisfactory. Some months ago I added a
table containing 450 data rows ...
About two weeks ago the application became *drastically* slower
and I urgently have to bring back the old
Ray,
I'd like to have a toolbox prepared for when performance goes south.
I'm clueless. Would someone mind providing some detail about how to
measure these four items Craig listed:
I hope I didn't give the impression that these were the only thing to look at
... those four items just popped
Stephan Szabo wrote:
I tried posting to the bugs list, and they said this is a better question for here.
I have two queries. One runs in about 2 seconds. The other takes upwards
of 2 minutes. I have a temp table that is created with 2 columns. This
table is joined with the larger database of call detai
Bill Moran wrote:
I'm curious, what problem does the disclaimer cause?
I wrote the following TOS for my personal system:
https://www.potentialtech.com/cms/node/9
Excerpt of the relevant part:
"If you send me email, you are granting me the unrestricted right to use
the contents of that email howe
Rodrigo Madera wrote:
I would just like to note here that this is an example of inefficient
strategy.
We could all agree (up to a certain economical point) that Alex saved
the most expensive one thousand dollars of his life.
I don't know the financial status nor the size of your organization
Carlos,
Now, yet another thing that you (Craig) seem to be missing: you're
simply putting the expense of all this time under the expenses column
in exchange for solving the particular problem...
More like I was trying to keep my response short ;-). I think we're all in
agreement on pretty mu
Dan Harris wrote:
I've found that it would be helpful to be able to tell how busy my
dedicated PG server is ...
I have seen some other nice back-end things exposed through PG functions
( e.g. database size on disk ) and wondered if there was anything
applicable to this.
I'd write a simple p
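A couple of the built-ins being alluded to, as a sketch (exact functions and
columns vary by version):
  SELECT pg_size_pretty(pg_database_size(current_database()));  -- on-disk size
  SELECT count(*) FROM pg_stat_activity;                        -- connected backends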
Michael Stone wrote:
On Thu, Mar 22, 2007 at 01:30:35PM +0200, [EMAIL PROTECTED] wrote:
approximated count?
why? who would need it? where you can use it?
Do a google query. Look at the top of the page, where it says "results N
to M of about O". For user interfaces (which is where a lot o
Brian Hurt wrote:
One of our biggest single problems is this very thing. It's not a
Postgres problem specifically, but more embedded in the idea of a
relational database: There are no "job status" or "rough estimate of
results" or "give me part of the answer" features that are critical to
man
Tino Wildenhain wrote:
Craig A. James schrieb:
...
In our case (for a variety of reasons, but this one is critical), we
actually can't use Postgres indexing at all -- we wrote an entirely
separate indexing system for our data...
...There is no need to store or
maintain this information
Tino Wildenhain wrote:
You guys can correct me if I'm wrong, but the key feature that's
missing from Postgres's flexible indexing is the ability to maintain
state across queries. Something like this:
select a, b, my_index_state() from foo where ...
offset 100 limit 10 using my_index(prev_
Steve Atkins wrote:
As long as you're ordering by some row in the table then you can do that in
straight SQL.
select a, b, ts from foo where (stuff) and foo > X order by foo limit 10
Then, record the last value of foo you read, and plug it in as X the next
time around.
We've been over this be
Tom Lane wrote:
"Craig A. James" <[EMAIL PROTECTED]> writes:
Steve Atkins wrote:
As long as you're ordering by some row in the table then you can do that in
straight SQL.
select a, b, ts from foo where (stuff) and foo > X order by foo limit 10
Then, record the last va
Kevin Grittner wrote:
Management has simply given a mandate that the software be independent
of OS and database vendor, and to use Java to help with the OS independence.
... we write all of our queries in ANSI SQL in our own query tool, parse it,
and generate Java classes to run it.
A better so
I had a 'scratch' database for testing, which I deleted, and then the disk went
out. No problem, no precious data. But now I can't drop the tablespace, or
the user who had that as the default tablespace.
I thought about removing the tablespace from pg_tablespace, but it seems wrong
to be monkey
Spiegelberg, Greg wrote:
We have a query which generates a small set of rows (~1,000) which are
to be used in a DELETE on the same table. The problem we have is that
we need to join on 5 different columns and it takes far too long.
You may have encountered the same problem I did: You *must*
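Hedged sketch of one common way such a delete is written (names assumed; not
necessarily what the truncated advice above goes on to say): let the ~1,000-row
result set drive the delete via DELETE ... USING on the five join columns.
  DELETE FROM big_table b
  USING  doomed_rows d
  WHERE  b.c1 = d.c1 AND b.c2 = d.c2 AND b.c3 = d.c3
    AND  b.c4 = d.c4 AND b.c5 = d.c5;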
Merlin Moncure wrote:
Using surrogate keys is dangerous and can lead to very bad design
habits that are unfortunately so prevalent in the software industry
they are virtually taught in schools. ... While there is
nothing wrong with them in principle (you are exchanging one key for
another as a
Merlin Moncure wrote:
In the context of this debate, I see this argument all the time, with
the implied suffix: 'If only we used integer keys we would not have
had this problem...'. Either the customer identifies parts with a
part number or they don't...and if they do identify parts with a
numbe
Merlin Moncure wrote:
Since the part number is NOT the primary key, duplicate part numbers
are not a problem. If you had instead used the part number as the
primary key, you'd be dead in the water.
You are redefining the primary key to be (part_number,
obsoletion_date). Now, if you had not a
Bill Moran wrote:
In response to Heikki Linnakangas <[EMAIL PROTECTED]>:
Can anyone else confirm this? I don't know if this is a windows-only
issue, but I don't know of a way to check fragmentation in unix.
I can confirm that it's only a Windows problem. No UNIX filesystem
that I'm aware of s
Greg Smith wrote:
If you're going to the trouble of building a tool for offering
configuration advice, it can be widly more effective if you look inside
the database after it's got data in it, and preferably after it's been
running under load for a while, and make your recommendations based on
Looking for replication solutions, I find:
Slony-I
Seems good, single master only, master is a single point of failure,
no good failover system for electing a new master or having a failed
master rejoin the cluster. Slave databases are mostly for safety or
for parallelizing queries for perfo
Alessandro Baretta <[EMAIL PROTECTED]> writes:
I am aware that what I am dreaming of is already available through
cursors, but in a web application, cursors are bad boys, and should be
avoided. What I would like to be able to do is to plan a query and run
the plan to retrieve a limited number of
Alessandro Baretta wrote:
I think you're trying to do something at the wrong layer of your
architecture. This task normally goes in your middleware layer, not
your database layer.
I am developing my applications in Objective Caml, and I have written
the middleware layer myself. I could easil
Tom Lane wrote:
Well, we're trying to split an index page that's gotten full into two
index pages, preferably with approximately equal numbers of items in
each new page (this isn't a hard requirement though). ... If that's
correct, what you really want is to divide the values so that the unions
First, this isn't really the right place to ask -- this forum is about
performance, not SQL syntax.
Second, this isn't a question anyone can answer in a reasonable length of time.
What you're asking for usually is taught in a class on relational database
theory, which is typically a semester
Ron <[EMAIL PROTECTED]> writes:
We have two problems here.
The first is that the page splitting code for these indexes currently
has O(N^2) performance.
The second is that whatever solution we do use for this functionality,
we still need good performance during searches that use the index.
No
Mike Biamonte wrote:
Does anyone have any experience with extremely large data sets?
I mean hundreds of millions of rows.
The queries I need to run on my 200 million transactions are relatively
simple:
select month, count(distinct(cardnum)), count(*), sum(amount) from
transactions group by
Nate Byrnes wrote:
I must claim some ignorance, I come from the application world... but,
from a data integrity perspective, it makes a whole lot of sense to
store video, images, documents, whatever in the database rather than on
the file system external to it. Personally, I would use LOB's, bu
Jeremy Haile wrote:
We are a small company looking to put together the most cost effective
solution for our production database environment. Currently in
production Postgres 8.1 is running on this machine:
Dell 2850
2 x 3.0 Ghz Xeon 800Mhz FSB 2MB Cache
4 GB DDR2 400 Mhz
2 x 73 GB 10K SCSI RAID
Markus Schaber wrote:
Ron wrote:
...and of course if you know enough about the data to be sorted so as to
constrain it appropriately, one should use a non comparison based O(N)
sorting algorithm rather than any of the general comparison based
O(NlgN) methods.
Sounds interesting, could you give
"Virag Saksena" <[EMAIL PROTECTED]> writes:
The individual queries run in 50-300 ms. However the optimizer is
choosing a nested loop to join them rather than a Hash join...
I have what appears to be the identical problem.
This is a straightforward query that should be fairly quick, but takes a
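A standard diagnostic, as a sketch: temporarily forbid nested loops in one
session and re-run the query under EXPLAIN ANALYZE; if the hash join really is
faster, the problem is a costing/statistics issue rather than a missing plan.
  SET enable_nestloop = off;
  -- re-run the slow query here under EXPLAIN ANALYZE and compare timings
  RESET enable_nestloop;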
Bruce Momjian wrote:
Dell often says part X is included, but part X is not the exact same as
part X sold by the original manufacturer. To hit a specific price
point, Dell is willing to strip thing out of commodity hardware, and
often does so even when performance suffers. For many people, this
Joshua D. Drake wrote:
I find this strains credibility, that this major manufacturer of PC's
would do something deceptive that hurts performance, when it would be
easily detected and widely reported. Can anyone cite a specific
> instance where this has happened? Such as, "I bought Dell model X