Re: [GENERAL] PostgreSQL reads each 8k block - no larger blocks are used - even on sequential scans

2009-10-03 Thread Gerhard Wiesinger

On Fri, 2 Oct 2009, Greg Smith wrote:


On Fri, 2 Oct 2009, Gerhard Wiesinger wrote:

Larger blocksizes also reduce IOPS (I/Os per second), which might be a 
critical threshold on storage systems (e.g. Fibre Channel systems).


True to some extent, but don't forget that IOPS is always relative to a block 
size in the first place.  If you're getting 200 IOPS with 8K blocks, 
increasing your block size to 128K will not result in your getting 200 IOPS 
at that larger size; the IOPS number at the larger block size is going to 
drop too.  And you'll pay the penalty for that IOPS number dropping every 
time you're accessing something that would have only been an 8K bit of I/O 
before.




Yes, there will be some (very small) drop in IOPS when the blocksize is 
larger, but today's disks have plenty of throughput when IOPS*128k is 
compared to e.g. 100MB/s. I've done some Excel calculations which support 
this.


The trade-off is very application dependent.  The position you're advocating, 
preferring larger blocks, only makes sense if your workload consists mainly 
of larger scans.  Someone who is pulling scattered records from throughout a 
larger table will suffer with that same change, because they'll be reading a 
minimum of 128K even if all they really needed was a few bytes.  That 
penalty ripples all the way from the disk I/O upwards through the buffer 
cache.




I wouldn't read 128k blocks all the time. I would do the following:
When e.g. B0, B127, B256 are to be read, I would issue 8k random block 
I/O.


When B1, B2, B3, B4, B5, B7, B8, B9, B10 are needed I would make 2 
requests with the largest possible blocksize:

1.) B1-B5: 5*8k=40k
2.) B7-B10: 4*8k=32k

In this case, since B5 and B7 are only one block apart, we could also 
discuss whether we should read B1-B10=10*8k=80k in one read request and 
simply discard B6.


That would reduce the IOPS by a factor of 4-5 in that scenario and 
therefore throughput would go up.
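The merging described above can be sketched as a small routine that coalesces sorted block numbers into contiguous runs. This is only an illustrative sketch (the 8k block size and 1MB cap come from the thread; the function and its structure are invented for the example, not PostgreSQL code):

```python
BLOCK_SIZE = 8192  # assumed 8k database block size

def coalesce(blocks, max_request=1024 * 1024):
    """Merge sorted block numbers into (start, n_blocks) runs,
    keeping each run at or below max_request bytes."""
    runs = []
    for b in sorted(blocks):
        if runs:
            start, length = runs[-1]
            # extend the current run if b is directly adjacent and
            # the enlarged request would stay within the cap
            if b == start + length and (length + 1) * BLOCK_SIZE <= max_request:
                runs[-1] = (start, length + 1)
                continue
        runs.append((b, 1))
    return runs

# Gerhard's example: B1-B5 and B7-B10 become two requests (40k and 32k)
print(coalesce([1, 2, 3, 4, 5, 7, 8, 9, 10]))  # [(1, 5), (7, 4)]
```

A run is split whenever the next block is not adjacent, or when extending it would exceed the 1MB cap, so a long fully sequential range still turns into a bounded number of large requests instead of one unbounded read.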


It's easy to generate a synthetic benchmark workload that models some 
real-world applications and see performance plunge with a larger block size. 
There certainly are others where a larger block would work better. Testing 
either way is complicated by the way RAID devices usually have their own 
stripe sizes to consider on top of the database block size.




Yes, there are block device readahead buffers and also RAID stripe 
caches. But neither seems to work well in the tested BITMAP HEAP SCAN 
scenario, nor in practical PostgreSQL performance measurement 
scenarios.


But the modelled pgiosim workload isn't a synthetic benchmark; it matches 
a real-world BITMAP HEAP SCAN scenario in PostgreSQL, where some blocks 
are directly consecutive, at least logically in the filesystem (and with 
some probability also physically on disk), yet each is currently read as 
a separate 8k request even when 2 or more blocks could be fetched in one 
request.


BTW: I would also cap such requests at some upper limit on blocksize 
(e.g. 1MB).


Ciao,
Gerhard

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] PostgreSQL reads each 8k block - no larger blocks are used - even on sequential scans

2009-10-03 Thread Gerhard Wiesinger

On Fri, 2 Oct 2009, Simon Riggs wrote:



On Sun, 2009-09-27 at 18:05 +0200, Gerhard Wiesinger wrote:


So I saw that even on sequential reads (and also on bitmap heap scan access)
PostgreSQL uses only 8k blocks. I think that's a major I/O bottleneck.

A commercial database vendor solved the problem by reading multiple
contiguous 8k blocks in one request, up to a maximum threshold. Output per 5
seconds on an equivalent "sequential scan":


Is systemtap counting actual I/Os or just requests to access 8192-byte
blocks already in the OS cache? Postgres doesn't read more than one block
at a time into its buffer pool, so those numbers of requests look about right.



As far as I know these are VFS reads. So some reads might be from cache, 
but since I did the following, all requests should be from disk:

echo 3 > /proc/sys/vm/drop_caches;service postgresql restart
do benchmark

The same applies to all benchmarks, because I don't want to measure the 
cache performance of the OS or of the DB.


Therefore all requests (except blocks read twice or more, which shouldn't 
happen and would be fine as cache hits anyway) are from disk and not from 
the cache.



There is belief here that multi-block I/O was introduced prior to OS
doing this as a standard mechanism. Linux expands its read ahead window
in response to sequential scans and so this seems like something we
don't want to do in the database.


I even played with large values for the block device readahead of /dev/md*, 
/dev/sd* and /dev/dm-*, as well as the stripe_cache_size of /dev/md*, but 
without any performance improvement in the benchmark scenarios.


=> None of the readaheads/caches seem to help, at least in the BITMAP 
HEAP SCAN scenarios, even on a nearly-latest Linux kernel.


But I think such block issues (reading in the largest blocks possible) have 
to be optimized at the application level (in our case the DB level) because

1.) We can't assume that the OS and even the storage behave well in such 
scenarios.
2.) We can't assume that the OS/storage is intelligent enough to merge 
requests when 2 randomly requested blocks happen to be sequential, and 
that the number of IOPS is therefore reduced.
3.) I think such logic would be very easy to integrate, and it has even 
been done with some patches.




It's possible this is wrong. Is the table being scanned fairly sizable
and was it allocated contiguously? i.e. was it a large table loaded via
COPY?

I also wonder if more L2 cache effects exist.



What do you mean by "table being scanned fairly sizable"? I don't get 
it.


The table was filled with a lot of inserts, but all at one point in time.

Ciao,
Gerhard



[GENERAL] How useful is the money datatype?

2009-10-03 Thread Thom Brown
I've noticed that while you can perform various calculations on a column of
type money, you can't use it or cast it as any other numeric type directly.
 Furthermore, it appears that since the locale being applied to the type is
cluster-wide, you would need an entirely different cluster if say you had 2
web applications that were intended to store monetary amounts from different
locations.
Is there an advantage to a money data type over a NUMERIC(10,2), or over just
representing it in the lowest denomination of the currency with an integer?

I've found that I unwittingly compiled PostgreSQL on my web server without
specifying locale, and now the money type is represented in dollars.  In
order to change that, it would require a recompilation of PostgreSQL  (and
I'm surprised that there is no option to set locale at the database-level in
the same way as collation has for 8.4).

Having a look around the archives, there seem to be some fairly old
discussions of possibly removing this data type, so is it fair to assume
it's probably not beneficial to use it?

Thanks

Thom


Re: [GENERAL] How useful is the money datatype?

2009-10-03 Thread Peter Geoghegan
Hi Thom,

Here's how I represent currency values:

CREATE DOMAIN currency
  AS numeric(10,2);


I understand money has been deprecated. It has one obvious flaw that I
can think of: it cannot represent different currencies in different
tuples, as you could with a numeric column plus a currency_id field.

Regards,
Peter Geoghegan



Re: [GENERAL] How useful is the money datatype?

2009-10-03 Thread Grzegorz Jaśkiewicz
Depending on the country etc., keep currencies as numeric(10,4), or you can
compromise on numeric(10,3); otherwise you might run into problems with
rounding.


Re: [GENERAL] How useful is the money datatype?

2009-10-03 Thread Thom Brown
2009/10/3 Peter Geoghegan 

>
> Here's how I represent currency values:
>
> CREATE DOMAIN currency
>  AS numeric(10,2);
>
>
>
See, I can understand why someone might take the extra step to create a
domain for storing monetary units.  The fact that money is in the
documentation but contains no notes explaining that it's only there for
backward compatibility led me to wonder if it still had relevance, which I
gather it doesn't really.  I'll avoid using it, as it only appears to
introduce unnecessary limitations with very few advantages.

Thom


Re: [GENERAL] How useful is the money datatype?

2009-10-03 Thread Raymond O'Donnell
On 03/10/2009 11:53, Grzegorz Jaśkiewicz wrote:
> depending on the countries, etc - keep currencies in 10.4 , or you can
> compromise to 10.3 , otherwise you might run into problems with
> rounding, etc.

I thought the idea of NUMERIC was that the value was exact, avoiding
rounding problems that you might get with floating-point types?

Ray.

--
Raymond O'Donnell, Director of Music, Galway Cathedral, Ireland
r...@iol.ie
Galway Cathedral Recitals: http://www.galwaycathedral.org/recitals
--



Re: [GENERAL] How useful is the money datatype?

2009-10-03 Thread Peter Eisentraut
On Sat, 2009-10-03 at 11:33 +0100, Thom Brown wrote:
> I've found that I unwittingly compiled PostgreSQL on my web server
> without specifying locale,

PostgreSQL isn't "compiled" with a locale or without one.

>  and now the money type is represented in dollars.  In order to change
> that, it would require a recompilation of PostgreSQL  (and I'm
> surprised that there is no option to set locale at the database-level
> in the same way as collation has for 8.4).

lc_monetary





Re: [GENERAL] How useful is the money datatype?

2009-10-03 Thread Raymond O'Donnell
On 03/10/2009 11:33, Thom Brown wrote:
> I've found that I unwittingly compiled PostgreSQL on my web server
> without specifying locale, and now the money type is represented in

You specify the locale at the initdb stage, not when compiling.

Ray.


--
Raymond O'Donnell, Director of Music, Galway Cathedral, Ireland
r...@iol.ie
Galway Cathedral Recitals: http://www.galwaycathedral.org/recitals
--



Re: [GENERAL] How useful is the money datatype?

2009-10-03 Thread Thom Brown
2009/10/3 Raymond O'Donnell 

>
> You specify the locale at the initdb stage, not when compiling.
>
> Ray.
>
>
>
Yes, you're right.  Got my wires crossed there.  However, it still means
locale-per-cluster, which is disappointing.  Ideally we'd have collation and
locale per table or even per column.
Thom


Re: [GENERAL] How useful is the money datatype?

2009-10-03 Thread Filip Rembiałkowski
I understand it's kind of a survey, so to answer the question from my point
of view:

The "money" data type is not useful at all.





-- 
Filip Rembiałkowski
JID,mailto:filip.rembialkow...@gmail.com
http://filip.rembialkowski.net/


Re: [GENERAL] How useful is the money datatype?

2009-10-03 Thread Peter Geoghegan
2009/10/3 Grzegorz Jaśkiewicz :
> depending on the countries, etc - keep currencies in 10.4 , or you can
> compromise to 10.3 , otherwise you might run into problems with rounding,
> etc.

I myself don't find it useful to store currency values that include
fractions of a cent. I'm sure that there are legitimate reasons for
requiring greater precision, but none of those reasons happen to apply
to me. I dare say that they don't apply to most people who want to
store monetary values in a database.

Regards,
Peter Geoghegan



Re: [GENERAL] Procedure for feature requests?

2009-10-03 Thread Sam Mason
On Fri, Oct 02, 2009 at 09:48:14PM +, Tim Landscheidt wrote:
> Sam Mason  wrote:
> > 8.4 has a generate_series(timestamp,timestamp,interval) which would seem
> > to be a bit more flexible than you want.
> 
> Yes, I know :-). But as "generate_series(A, B, C)" can also
> be written as "A + generate_series(0, (C - B) / C) * C" (or
> something "flexible" like that :-)), a

For things as complicated as timestamps I'm not sure if this is such a
trivial transform.  If you can figure out the limit then it seems easy,
though I'm not sure how you'd do that.

> "generate_series(DATE, DATE)" would inter alia get rid of
> the need to cast the result from TIMESTAMP to DATE and to
> explicitly specify "'1 day'". Just a small, trivial
> enhancement for a popular use case :-).

Interesting, I tend to aim for maximum expressiveness not ease of
expressiveness.  It would be somewhat easy to add the above if you want
though:

  CREATE FUNCTION generate_series(date,date)
  RETURNS SETOF date
  IMMUTABLE LANGUAGE sql AS $$
SELECT generate_series($1::timestamp,$2::timestamp,interval '1 day')::date;
  $$;

or I suppose you could use the integer series generation:

  SELECT $1 + generate_series(0,$2 - $1);

Hum, now I'll have to see which is "better".

That second version seems to be slightly quicker (20 to 30%, for ranges
from a year up to a century respectively) so you may prefer it, but the
difference is going to be in the noise for any query I've ever used
generate_series for.

-- 
  Sam  http://samason.me.uk/



Re: [GENERAL] How useful is the money datatype?

2009-10-03 Thread Sam Mason
On Sat, Oct 03, 2009 at 12:20:57PM +0100, Raymond O'Donnell wrote:
> I thought the idea of NUMERIC was that the value was exact, avoiding
> rounding problems that you might get with other floating-point types?

Nope, sorry it's still a computer and thus can't represent anything
with infinite precision (just numeric fractions in PG's case, let alone
irrational numbers). For example:

  select (numeric '1'/3) * 3;

Gives me back 0..

What NUMERIC datatypes do allow, however, is to specify the precision
used in calculations and storage (i.e. 10 digits, four of them being
fractional digits, as above).  Thus you've got a chance of putting a
bound on the total error that can accumulate during a calculation.

For example, you can choose to store a few more digits in your accounting
tables so that aggregations come out with the "right" number at the
end---i.e. 10 orders of something cost the same as one order of 10 items.
Or you can set the precision to be coarser, so that the rounded values
match everything else.
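The point above can be reproduced outside the database with Python's decimal module (an analogy only, not how PG's numeric is implemented): at any fixed precision, 1/3 is truncated, so (1/3)*3 never gets back to exactly 1.

```python
from decimal import Decimal, getcontext

getcontext().prec = 30  # a fixed number of significant digits, like NUMERIC

third = Decimal(1) / Decimal(3)   # 0.333...3, cut off after 30 digits
result = third * Decimal(3)       # 0.999...9, not exactly 1

print(result == Decimal(1))  # False: the truncation error never disappears
```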

-- 
  Sam  http://samason.me.uk/



Re: [GENERAL] How useful is the money datatype?

2009-10-03 Thread Merlin Moncure
On Sat, Oct 3, 2009 at 11:40 AM, Sam Mason  wrote:
> On Sat, Oct 03, 2009 at 12:20:57PM +0100, Raymond O'Donnell wrote:
>> I thought the idea of NUMERIC was that the value was exact, avoiding
>> rounding problems that you might get with other floating-point types?
>
> Nope, sorry it's still a computer and thus can't represent anything
> with infinite precision (just numeric fractions in PG's case, let alone
> irrational numbers). For example:
>
>  select (numeric '1'/3) * 3;
>

I don't quite agree with your statement (I agree with your point, just
not the way you worded it).  I could make a type, 'rational', that stores
a numerator and a denominator, and do calculations like the above with
zero loss.  So it depends on how you define 'represent'.

Computers can do pretty much any type of bounded calculation given
enough time and memory.
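For illustration, the 'rational' type described here exists in Python's standard library as fractions.Fraction, which stores an exact numerator/denominator pair (PostgreSQL itself has no such built-in type):

```python
from fractions import Fraction

third = Fraction(1, 3)   # stored exactly as numerator 1, denominator 3

print(third * 3 == 1)    # True: (1/3) * 3 loses nothing
```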

merlin



Re: [GENERAL] How useful is the money datatype?

2009-10-03 Thread Greg Stark
2009/10/3 Grzegorz Jaśkiewicz :
> depending on the countries, etc - keep currencies in 10.4 , or you can
> compromise to 10.3 , otherwise you might run into problems with rounding,
> etc.

Keeping more digits of precision than the application actually can use
is more likely to *cause* problems with rounding than solve them.

For example, if you calculate interest on a balance (using floating
point arithmetic), round it to $10.001, and store that in the balance,
your application will tell the user and your accounting department that
they have $10 in their account. But if you do this ten times they'll
mysteriously have an extra cent that the accounting department will not
be able to account for.

To avoid problems like this you must store precisely as many digits as
the application requires. No more and no less. Intermediate
calculations can be done with more precision or floating point
arithmetic but you have to round or truncate before reporting the
results and then store precisely the value you reported.
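The rule of storing exactly what was reported can be illustrated with Python's decimal module (the rate and balances are invented for the example; this is a sketch of the bookkeeping discipline, not accounting code):

```python
from decimal import Decimal, ROUND_HALF_EVEN

CENT = Decimal("0.01")    # the precision the application reports
RATE = Decimal("0.0001")  # made-up per-period interest rate

def add_interest(balance):
    """Compute interest at full precision, then round to a cent
    before storing, so the stored value matches what was reported."""
    interest = (balance * RATE).quantize(CENT, rounding=ROUND_HALF_EVEN)
    return balance + interest

exact = Decimal("10.00")   # keeps sub-cent fractions (the problem case)
stored = Decimal("10.00")  # rounds before storing
for _ in range(10):
    exact += exact * RATE
    stored = add_interest(stored)

print(stored)                 # 10.00, matches every statement the user saw
print(exact.quantize(CENT))   # 10.01, the "mystery cent" appears
```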

-- 
greg



Re: [GENERAL] Procedure for feature requests?

2009-10-03 Thread Martin Gainty

is there a way to create a cast with assignment, e.g.?

CREATE CAST ((date,date) AS int4) WITH FUNCTION generate_series(date,date) AS ASSIGNMENT;

http://www.postgresql.org/docs/8.4/static/sql-createcast.html
Martin Gainty 





[GENERAL] Re: PostgreSQL reads each 8k block - no larger blocks are used - even on sequential scans

2009-10-03 Thread Greg Stark
On Sun, Sep 27, 2009 at 11:18 AM, Sam Mason  wrote:
> On Sun, Sep 27, 2009 at 06:05:51PM +0200, Gerhard Wiesinger wrote:
>> A google research has shown that Gregory Stark already worked on that issue
>> (see references below) but as far as I saw only on bitmap heap scans.
>
> Greg Stark's patches are about giving the IO subsystem enough
> information about where the random accesses will be ending up next.
> This is important, but almost completely independent from the case
> where you know you're doing sequential IO, which is what you seem to be
> talking about.

FWIW, I did write code to use FADV_SEQUENTIAL and FADV_RANDOM but
couldn't demonstrate any performance improvement. Basically, Postgres
was already capable of saturating any RAID controller I could test with
a normal sequential scan using 8k block sizes and no special read-ahead
advice.


-- 
greg



Re: [GENERAL] How useful is the money datatype?

2009-10-03 Thread Sam Mason
On Sat, Oct 03, 2009 at 11:49:50AM -0400, Merlin Moncure wrote:
> On Sat, Oct 3, 2009 at 11:40 AM, Sam Mason  wrote:
> > it's still a computer and thus can't represent anything
> > with infinite precision (just numeric fractions in PG's case, let alone
> > irrational numbers).
> 
> I don't quite agree with your statement (I agree with your point, just
> not the way you worded it).

Maybe I didn't emphasize "numeric" enough; the current implementation
of numeric datatypes in PG does not allow fractions to be represented
accurately.  Is that any better?

> I could make a type, 'rational', define
> the numerator, denominator, and do calculations like the above with
> zero loss.

Yes, if you defined a datatype like this then it would be able to
express a strictly larger subset of all numbers.

> So it depends how you define 'represent'.
> Computers can do pretty much any type of bounded calculation given
> enough time and memory.

Which is why I said "with infinite precision".  Assuming infinite time
or space doesn't seem to help with any real-world problem; it's the
details of the assumptions made, and the use case(s) optimized for, that
tend to be interesting.

-- 
  Sam  http://samason.me.uk/



Re: [GENERAL] Procedure for feature requests?

2009-10-03 Thread Sam Mason
 [ please don't top-post it's difficult to follow for those not directly
   involved in the discussion ]

On Sat, Oct 03, 2009 at 12:14:19PM -0400, Martin Gainty wrote:
> is there a way to create a cast with assignment e.g.?

Hum, I'm unsure how this would help.  Maybe more explanation would help?

> CREATE CAST ((date,date) AS int4) WITH FUNCTION generate_series(date,date) AS 
> ASSIGNMENT;

generate_series returns a SETOF values.  It also has *much* more
complicated semantics than most people would consider useful in a
datatype conversion function.  For example, why would casting from a
pair of dates end up as a set of rows each containing a single date
value?


I have a hard time understanding the real purpose of casts, so maybe I'm
missing something.  My problem is that I don't understand the point
of trying to provide a "standard" way of converting between arbitrary
datatypes; it seems much easier to just provide a standard set of
domain-specific functions that are explicitly used by the user.  The SQL
standard specifies that they need to exist, so PG has to support them,
but their purpose still confuses me!

-- 
  Sam  http://samason.me.uk/



[GENERAL] Embarassing GROUP question

2009-10-03 Thread Corey Tisdale

Hi all,

I'm a recent MySQL convert, and I am having some difficulty with the  
syntax for grouping a table by a foreign key and returning only the  
newest entry that matches. In MySQL, you can do something like


event_log

id
event_type
event_date
meaningful_data


SELECT
meaningful_data,
event_type,
event_date
FROM
event_log
GROUP BY
event_type
ORDER BY
event_date DESC



And this would return the most recent event and meaningful data  
for each event type. When I try this in postgres, I get errors about  
either grouping by id or using id in an aggregate function. I am  
afraid that if I use it in an aggregate function, it will disrupt the  
sort order I am attempting, and if I group by it, I no longer get the  
most recent data for each event type, but every event. How would I  
accomplish this in postgresql? Are there some aggregate functions that  
help with this, or should I be trying a different method altogether?  
Grouping feels pretty different, and I suspect closer to the standard,  
in postgres. Unfortunately, my mind works (for the short term) in mysql.  
Will someone please give me some pointers?


Thanks!
Corey



Re: [GENERAL] Procedure for feature requests?

2009-10-03 Thread Tom Lane
Sam Mason  writes:
> I have a large problem understanding the real purpose casts so maybe I'm
> missing something.  My problem is that I don't understand the purpose
> of trying to provide a "standard" way of converting between arbitrary
> datatypes,

I think the reason CREATE CAST exists is exactly that the cast mechanism
*isn't* intended to provide conversions between any arbitrary pair of
datatypes.  It's only intended to provide conversions in those cases
where the conversion semantics are obvious to some degree or other.
Since that's somewhat in the eye of the beholder, we allow the user
to adjust edge cases by creating/removing casts --- but there's no
expectation that when you define a new datatype, you'll provide casts
to or from unrelated types.

regards, tom lane



Re: [GENERAL] Embarassing GROUP question

2009-10-03 Thread Tom Lane
Corey Tisdale  writes:
> SELECT
>   meaningful_data,
>   event_type,
>   event_date
> FROM
>   event_log
> GROUP BY
>   event_type
> ORDER BY
>   event_date DESC

Is event_type a primary key, or at least a candidate key, for this
table?  (I would guess not based on the name.)

If it is, then the above is actually well-defined, because there is
only one possible input row for each group.  The GROUP BY is actually
kinda pointless in that case.

If it is not, then the above is *not* well-defined --- there are
multiple possible meaningful_data and event_date values for each
event_type value, and you have absolutely no idea which ones you
will get.  This is not allowed per SQL standard, and MySQL has
done you no service by failing to detect the ambiguity.

What you might be after is something like Postgres' DISTINCT ON
feature, which allows you to resolve the ambiguity by specifying
a sort order for the rows within each group (and then taking the
first row in each group).  See the "weather reports" example in
our SELECT reference page.
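The DISTINCT ON semantics described here, sorting within each group and then keeping the first row, can be modelled in Python (the sample rows are invented; this mimics the semantics for illustration, not the executor):

```python
from itertools import groupby
from operator import itemgetter

# Sample rows: (event_type, event_date, meaningful_data)
events = [
    ("login",  "2009-10-01", "a"),
    ("logout", "2009-10-02", "c"),
    ("login",  "2009-10-03", "b"),
]

# Equivalent in spirit to:
#   SELECT DISTINCT ON (event_type) event_type, event_date, meaningful_data
#   FROM event_log ORDER BY event_type, event_date DESC;
rows = sorted(events, key=itemgetter(1), reverse=True)  # event_date DESC
rows = sorted(rows, key=itemgetter(0))                  # event_type ASC (stable)
latest = [next(grp) for _, grp in groupby(rows, key=itemgetter(0))]

print(latest)  # the newest row for each event_type
```

The two stable sorts order rows by event_type, then by event_date descending within each type, so the first row of each group is the newest one.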

I have never really played around with this aspect of MySQL ...
but looking at this example, and presuming that you find that
it actually does something useful, I wonder whether they interpret
the combination of GROUP BY and ambiguous-per-spec ORDER BY
in some fashion similar to DISTINCT ON.

regards, tom lane



Re: [GENERAL] Procedure for feature requests?

2009-10-03 Thread Sam Mason
On Sat, Oct 03, 2009 at 12:48:57PM -0400, Tom Lane wrote:
> I think the reason CREATE CAST exists is exactly that the cast mechanism
> *isn't* intended to provide conversions between any arbitrary pair of
> datatypes.  It's only intended to provide conversions in those cases
> where the conversion semantics are obvious to some degree or other.

Yup, but the decision to officially bless some code as being a cast
rather than "just" a function seems very arbitrary; I think this is why
I don't understand its purpose.

> Since that's somewhat in the eye of the beholder, we allow the user
> to adjust edge cases by creating/removing casts --- but there's no
> expectation that when you define a new datatype, you'll provide casts
> to or from unrelated types.

I know there's no expectation to create any casts.  I think what I'm
confused about is why anyone would ever bother creating any in the first
place.  I have a feeling I may have used the functionality once, but
I can't think why or for what now.  Having a function seems just as
expressive to me, which is why I think I'm missing the point.

-- 
  Sam  http://samason.me.uk/



Re: [GENERAL] Embarassing GROUP question

2009-10-03 Thread Sam Mason
On Sat, Oct 03, 2009 at 01:05:49PM -0400, Tom Lane wrote:
> What you might be after is something like Postgres' DISTINCT ON
> feature

Yup, looks that way to me as well.

> I have never really played around with this aspect of MySQL ...

Me neither.

> but looking at this example, and presuming that you find that
> it actually does something useful, I wonder whether they interpret
> the combination of GROUP BY and ambiguous-per-spec ORDER BY
> in some fashion similar to DISTINCT ON.

Yup, does look that way doesn't it.  It's still a weird pair of
semantics to conflate.

Hum, if they were assuming that you'd always have to implement GROUP BY
by doing a sort step first then I can see why they'd end up with this.
But if you want to do *anything* else (i.e. hash aggregate in PG) then
you want to keep the semantics of GROUP BY and ORDER BY separate as the
spec and indeed PG does.

-- 
  Sam  http://samason.me.uk/



Re: [GENERAL] Procedure for feature requests?

2009-10-03 Thread Tom Lane
Sam Mason  writes:
> On Sat, Oct 03, 2009 at 12:48:57PM -0400, Tom Lane wrote:
>> I think the reason CREATE CAST exists is exactly that the cast mechanism
>> *isn't* intended to provide conversions between any arbitrary pair of
>> datatypes.  It's only intended to provide conversions in those cases
>> where the conversion semantics are obvious to some degree or other.

> Yup, but the decision to officially bless some code as being a cast
> rather than "just" a function seems very arbitrary, I think this is why
> I don't understand its purpose.

It's useful when the conversion semantics are sufficiently natural that
you want the conversion to be applied implicitly.  I agree that the
explicit CAST syntax hasn't got very much to recommend it over a
function call.  That part you can blame on the SQL committee ;-) ...
the historical PostQUEL syntax for this was exactly a function call,
and you can still write it that way if you choose.
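
For what it's worth, the three spellings are interchangeable; a quick
sketch (using a built-in conversion, so no CREATE CAST is needed):

```sql
SELECT CAST(3 AS double precision);  -- SQL-standard CAST syntax
SELECT 3::double precision;          -- historical PostgreSQL shorthand
SELECT float8(3);                    -- PostQUEL-style function call
```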

regards, tom lane



Re: [GENERAL] Embarassing GROUP question

2009-10-03 Thread Tom Lane
Sam Mason  writes:
> On Sat, Oct 03, 2009 at 01:05:49PM -0400, Tom Lane wrote:
>> but looking at this example, and presuming that you find that
>> it actually does something useful, I wonder whether they interpret
>> the combination of GROUP BY and ambiguous-per-spec ORDER BY
>> in some fashion similar to DISTINCT ON.

> Yup, does look that way doesn't it.  It's still a weird pair of
> semantics to conflate.

I poked around in the MySQL 5.1 manual to see if this is true.
I think it isn't --- it says very clearly here:
http://dev.mysql.com/doc/refman/5.1/en/group-by-hidden-columns.html
that you simply get an arbitrary choice among the possible values
when you reference an ambiguous column.  It's possible that Corey's
query actually does give him the answers he wants, but apparently
it would be an implementation artifact that they're not promising
to maintain.

> Hum, if they were assuming that you'd always have to implement GROUP BY
> by doing a sort step first then I can see why they'd end up with this.

It's worse than that --- they actually are promising that GROUP BY
orders the results!  In
http://dev.mysql.com/doc/refman/5.1/en/select.html
I find

    If you use GROUP BY, output rows are sorted according to the
    GROUP BY columns as if you had an ORDER BY for the same
    columns. To avoid the overhead of sorting that GROUP BY
    produces, add ORDER BY NULL:

    SELECT a, COUNT(b) FROM test_table GROUP BY a ORDER BY NULL;

    MySQL extends the GROUP BY clause so that you can also specify
    ASC and DESC after columns named in the clause:

    SELECT a, COUNT(b) FROM test_table GROUP BY a DESC;

The first of these examples implies that they allow ORDER BY to override
the default GROUP BY sorting, which would mean that the ORDER BY sort
has to happen after the GROUP BY operation, unlike the approach we take
for DISTINCT ON.  So that means the ORDER BY *isn't* going to affect
which row gets chosen out of each event_type group.

What I am currently betting is that Corey's query does not really do
what he thinks it does in MySQL.  It probably is selecting a random
representative row in each group and then sorting on the basis of the
event_dates in those rows.
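
For completeness, a hedged guess at how the query Corey wants could be
written reliably in PG (column names inferred from the thread):

```sql
-- Latest row per event_type, deterministically: DISTINCT ON keeps the
-- first row of each group under the ORDER BY, i.e. the max event_date
SELECT DISTINCT ON (event_type) *
FROM events
ORDER BY event_type, event_date DESC;
```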

regards, tom lane



Re: [GENERAL] Embarassing GROUP question

2009-10-03 Thread Corey Tisdale
We're coming from mysql 4, and changing the sort order changes the
values of all columns as you would expect, given that you would expect
a sort statement to affect grouping.  This certainly isn't the only
time I've used this syntax. I've been a mysql user for ten years, and
the outcome has been consistent across hundreds of tables, millions
of rows, and thousands of queries. If you ever have to use or modify a
mysql db, just keep this in mind in case it saves you some time.


That being said, we've discovered a few instances where docs were
wrong, and found numerous bugs with bitshifting, blob objects, cache
usage, and io buffering. We even started working on our own storage
engine until we came to our senses and switched RDBMSes.


5.1 has chased more than a few folks off, and rather than upgrade to
it, we started porting to postgres. I didn't mean for my comparison to
appear as a knock against postgres, merely to explain why I was having
such a problem with such a simple issue. Thanks again for the help.


Corey Tisdale



Re: [GENERAL] Embarassing GROUP question

2009-10-03 Thread Martin Gainty

Most database administrators don't allow jpg/png/gif images into BLOB columns,
simply because such images are already run-length encoded and it's MUCH easier
to store a link to the picture, e.g.
http://www.mywebsite.com/PictureOfFido.jpg

Oracle, on the other hand, can store multi-GB images in blobs; then again,
you're paying for that 'luxury'.

Also keep in mind that Postgres is under the BSD license, so you're getting
what you pay for.



what types of caching issues are you experencing?
Martin Gainty 


Re: [GENERAL] Embarassing GROUP question

2009-10-03 Thread Tom Lane
Corey Tisdale  writes:
> We're coming from mysql 4, and changing the sort order changes the  
> values of all columns as you would expect, given that you would expect  
> a sort statement to affect grouping.  This certainly isn't the only  
> time I've used this syntax. I've been mysql user for ten years, and  
> the outcome has been consistant across hundreds of tables and millions  
> of rows and thousands of queries. If you ever have to use or modify a  
> mysql db, just keep this in mind in case it saves you some time.

Okay, I got sufficiently interested to drag out the nearest copy of
mysql and try it ...

mysql> create table t (f1 int, f2 int, f3 int);
Query OK, 0 rows affected (0.00 sec)

mysql> insert into t values(1,11,111), (1,22,222), (1,44,444), (1,33,333);
Query OK, 4 rows affected (0.00 sec)
Records: 4  Duplicates: 0  Warnings: 0

mysql> insert into t values(2,55,555), (2,22,222), (2,44,444), (2,33,333);
Query OK, 4 rows affected (0.00 sec)
Records: 4  Duplicates: 0  Warnings: 0

mysql> insert into t values(3,55,555), (3,22,222), (3,44,444), (3,77,777);
Query OK, 4 rows affected (0.00 sec)
Records: 4  Duplicates: 0  Warnings: 0

mysql> select * from t group by f1 order by f2;
+--+--+--+
| f1   | f2   | f3   |
+--+--+--+
|    1 |   11 |  111 |
|    2 |   55 |  555 |
|    3 |   55 |  555 |
+--+--+--+
3 rows in set (0.00 sec)

mysql> select * from t group by f1 order by f2 desc;
+--+--+--+
| f1   | f2   | f3   |
+--+--+--+
|    2 |   55 |  555 |
|    3 |   55 |  555 |
|    1 |   11 |  111 |
+--+--+--+
3 rows in set (0.00 sec)

Looks to me like we're arbitrarily getting the physically-first row in
each f1 group.  It's certainly not looking for the minimum or maximum f2.

The above is with 5.1.37, but I find essentially the same wording in
the 3.x/4.x manual as in the 5.1 manual.

Now it's certainly possible that in particular circumstances you might
happen to get the right results --- for example, a scan that was using
an index might happen to deliver the rows in the right order.  But I
don't see any evidence that mysql is reliably producing groupwise
minimums or maximums with this syntax.  The long discussions in the
comments here:
http://dev.mysql.com/doc/refman/4.1/en/example-maximum-column-group-row.html
don't suggest that anyone else believes it works, either.
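
For reference, the portable way to get the row carrying the maximum f2
per f1 group, which is the technique that manual page describes, is a
join against the aggregate; using the table t above:

```sql
-- One row per f1 group: the row whose f2 equals the group maximum
-- (this returns several rows for a group if the maximum is tied)
SELECT t.*
FROM t
JOIN (SELECT f1, max(f2) AS max_f2
      FROM t GROUP BY f1) g
  ON g.f1 = t.f1 AND g.max_f2 = t.f2;
```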

regards, tom lane



Re: [GENERAL] Embarassing GROUP question

2009-10-03 Thread Corey Tisdale
You may have nailed it. Everything would have been indexed in the order
it was grouped by, so perhaps the order in which things are indexed
and accessed is the kicker, or perhaps we've been consistently lucky.


We also weren't adding image data to blobs; we were bit-mapping
faceted data into blobs and shifting to allow people to shop by artist or
color or subject matter across millions of posters. Normalized tables
just weren't cutting it, and bit shifting up to 32 bits was crazy fast.
After we rolled it out in production, we found mysql converts blobs to
32 bit unsigned ints before shifting. Postgres appears not to do this
at all, or our arbitrarily large test data did not trigger it on
postgres.


After the last few days, it is becoming apparent how much of a joke  
mysql has been. Thanks again for such quick direction!


Corey Tisdale





Re: [GENERAL] Procedure for feature requests?

2009-10-03 Thread Sam Mason
On Sat, Oct 03, 2009 at 04:14:21PM -0400, Tom Lane wrote:
> Sam Mason  writes:
> > the decision to officially bless some code as being a cast
> > rather than "just" a function seems very arbitrary
> 
> It's useful when the conversion semantics are sufficiently natural that
> you want the conversion to be applied implicitly.

Thanks!  After a big think I've ended up concluding that the implicit casts
between the various numeric and date types are a good thing.  They can
cause some confusion and semantic strangeness, but the increase in code
verbosity that would result without them normally outweighs these costs.

In higher assurance code this balance may tip back the other way, but
databases have more focus on having a sane set of defaults rather than
forcing you to make all the decisions up front.
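
For anyone following along, a hedged sketch of the "blessing" step
(type, function, and conversion semantics all invented for illustration):

```sql
-- A hypothetical wrapper type and a conversion function
CREATE TYPE kelvin AS (degrees float8);

CREATE FUNCTION kelvin_to_celsius(kelvin) RETURNS float8
  AS 'SELECT ($1).degrees - 273.15' LANGUAGE sql IMMUTABLE;

-- Without the cast, callers write kelvin_to_celsius(k) everywhere;
-- with AS IMPLICIT, a kelvin value converts wherever float8 is expected
CREATE CAST (kelvin AS float8)
  WITH FUNCTION kelvin_to_celsius(kelvin) AS IMPLICIT;
```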

> I agree that the
> explicit CAST syntax hasn't got very much to recommend it over a
> function call.  That part you can blame on the SQL committee ;-) ...

What more would you want them to do?  With casts, that is; the SQL
committee does quite enough already I think!

> the historical PostQUEL syntax for this was exactly a function call,
> and you can still write it that way if you choose.

I have a feeling I'll probably carry on doing that then.  I'm not sure
if I'm ever going to write enough almost overlapping bits of code that
casts would become useful.

-- 
  Sam  http://samason.me.uk/



Re: [GENERAL] Embarassing GROUP question

2009-10-03 Thread Sam Mason
On Sat, Oct 03, 2009 at 06:12:20PM -0500, Corey Tisdale wrote:
> We also weren't adding image data to blobs, we were bit mapping  
> faceted data to blob and shifting to allow people to shop by artist or  
> color or subject matter across millions of posters. Normalized tables  
> just weren't cutting it, and bit shifting up to 32 bit was crazy fast.  

Just out of interest; have you tried PG's support of fancier index
types?  HStore or intarray would appear to help with what you're doing.
Not quite sure what you're actually doing so my guess could be a long
way off!
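
In case it helps, a rough sketch of what the intarray route might look
like for the poster-faceting case (all names invented, and I'm guessing
at the schema):

```sql
-- Assumes the intarray contrib module is installed
CREATE TABLE posters (id serial PRIMARY KEY, facet_ids int[]);

-- GiST index using intarray's operator class
CREATE INDEX posters_facets_idx
  ON posters USING gist (facet_ids gist__int_ops);

-- All posters tagged with both facet 3 and facet 7 ("contains")
SELECT id FROM posters WHERE facet_ids @> ARRAY[3,7];
```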

-- 
  Sam  http://samason.me.uk/
