I'm running pgsql on an m1.large EC2 instance with 7.5gb available memory.
The free command shows 7gb of free+cached. My understanding from the docs is that
I should dedicate 1.75gb to shared_buffers (25%) and set effective_cache_size
to 7gb.
Is this correct? I'm running 64-bit Ubuntu 10.10, e.g
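To make the arithmetic in the question concrete, here is a minimal sketch that reproduces the poster's numbers. It assumes the 25% rule of thumb is applied to the 7gb that free reports rather than to the full 7.5gb, and that effective_cache_size is set to that same 7gb. The function name and the choice to report values in MB are illustrative, and the sketch is not a tuning recommendation.

```python
def suggest_memory_settings(free_plus_cached_gb, shared_fraction=0.25):
    """Reproduce the sizing arithmetic from the question (hypothetical helper).

    free_plus_cached_gb: the free+cached figure reported by `free`, in GB.
    shared_fraction: share given to shared_buffers (25% in the question).
    """
    total_mb = free_plus_cached_gb * 1024
    shared_buffers_mb = int(total_mb * shared_fraction)
    effective_cache_mb = int(total_mb)
    # Values are formatted with an MB unit suffix, as postgresql.conf accepts.
    return {
        "shared_buffers": f"{shared_buffers_mb}MB",
        "effective_cache_size": f"{effective_cache_mb}MB",
    }


if __name__ == "__main__":
    for name, value in suggest_memory_settings(7).items():
        print(f"{name} = {value}")
```

With 7 as input this gives 1792MB (1.75gb) for shared_buffers and 7168MB (7gb) for effective_cache_size, matching the figures in the question.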
Folks,
I'm trying to optimize the following query that performs KL Divergence [1]. As
you can see the distance function operates on vectors of 150 floats.
The query takes 12 minutes to run on an idle (apart from pgsql) EC2 m1 large
instance with 2 million documents in the docs table. The CPU i
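For readers unfamiliar with the distance being computed, the following is a minimal sketch of KL divergence between two discrete probability vectors, D(P||Q) = sum of p_i * log(p_i / q_i). It is not the poster's query or distance function, which are not shown in this excerpt. The function name and the eps floor used to avoid division by zero are assumptions made for illustration.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """KL divergence D(P || Q) for two equal-length probability vectors.

    eps is a hypothetical floor on q_i so that a zero in q does not
    cause a division by zero; terms with p_i == 0 contribute nothing.
    """
    if len(p) != len(q):
        raise ValueError("vectors must have the same length")
    total = 0.0
    for pi, qi in zip(p, q):
        if pi > 0.0:
            total += pi * math.log(pi / max(qi, eps))
    return total


if __name__ == "__main__":
    # Two 150-element vectors, the size mentioned in the message.
    uniform = [1.0 / 150] * 150
    skewed = [0.5] + [0.5 / 149] * 149
    print(kl_divergence(uniform, uniform))
    print(kl_divergence(skewed, uniform))
```

Each call does work proportional to the vector length, so evaluating it against every row means 150 multiply-and-log steps per document.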
I have a stored proc that potentially inserts hundreds of thousands,
potentially millions, of rows (below).
This stored proc is part of the sequence of creating an ad campaign and
links an ad to documents it should be displayed with.
A few of these stored procs can run concurrently as users
be faster than calling it from a select, once for each array?
Sent from my comfortable recliner
On 30/04/2011, at 18:28, Kevin Grittner wrote:
> Joel Reymont wrote:
>
>> We have 2 million documents now and linking an ad to all of them
>> takes 5 minutes on my top-of-the-li
I'm calculating distance between probability vectors, e.g. topics that
a document belongs to and the topics of an ad.
The distance function is already a C function. Topics are float8[150].
Distance is calculated against all documents in the database so it's
a table scan.
On Apr 30, 2011, at 7:24 PM, Kevin Grittner wrote:
> If this is where most of the time is, the next thing is to run it
> with EXPLAIN ANALYZE, and post the output.
I was absolutely wrong about the calculation taking < 1s, it actually takes
about 30s for 2 million rows.
Still, the difference be
On Apr 30, 2011, at 7:36 PM, Kevin Grittner wrote:
> It may even be amenable to knnGiST indexing (a new feature coming in
> 9.1), which would let you do your select with an ORDER BY on the
> distance.
I don't think I can wait for 9.1, need to go live in a month, with PostgreSQL
or without.
> P
On Apr 30, 2011, at 11:11 PM, Jeff Janes wrote:
> But what exactly are you inserting? The queries you reported below
> are not the same as the ones you originally described.
I posted the wrong query initially. The only difference is in the table that
holds the probability array.
I'm inserting
What are the best practices for setting up PG 9.x on Amazon EC2 to get the best
performance?
Thanks in advance, Joel
--
- for hire: mac osx device driver ninja, kernel extensions and usb drivers
-
On May 3, 2011, at 8:41 PM, Alan Hodgson wrote:
> I am also interested in tips for this. EBS seems to suck pretty bad.
Alan, can you elaborate? Are you using PG on top of EBS?