Hi,
~
I am trying to get dups from some data from files which md5sums I
previously calculated
~
Here is my mere mortal SQL
~
SELECT md5, COUNT(md5) AS md5cnt
FROM jdk1_6_0_07_txtfls_md5
WHERE (md5cnt > 1)
GROUP BY md5
ORDER BY md5cnt DESC;
~
and this is what I get:
~
jpk=# SELECT md5, COUNT(md5
On Thu, Aug 28, 2008 at 7:50 PM, Adrian Klaver <[EMAIL PROTECTED]> wrote:
> Define easily.
~
OK, let me try to outline the approach I would go for:
~
I think "COPY FROM CSV" should have three options, namely:
~
1) the way we have used it in which you create the table first
~
2) another way in w
On Sat, Aug 30, 2008 at 08:23:25AM -0400, Albretch Mueller wrote:
> OK, let me try to outline the approach I would go for:
> ~
> I think "COPY FROM CSV" should have three options, namely:
I think you're confusing postgresql with a spreadsheet program. A
database is designed to take care of your
Also I know there is a DISTINCT keyword, but I also need to know how
many times the particular data in the column is repeated if it is,
that is why I need to go:
~
SELECT md5, COUNT(md5) AS md5cnt
FROM jdk1_6_0_07_txtfls_md5
WHERE (md5cnt > 1)
GROUP BY md5
ORDER BY md5cnt DESC;
~
Thanks
lbr
> I think you're confusing postgresql with a spreadsheet program.
~
I wonder what makes you think so
~
> There are client programs which will do this for you, perhaps you want one of
> those?
~
Well, then obviously there is the need for it and you were not
successful enough at convincing these de
Albretch Mueller wrote:
Hi,
~
I am trying to get dups from some data from files which md5sums I
previously calculated
~
Here is my mere mortal SQL
~
SELECT md5, COUNT(md5) AS md5cnt
FROM jdk1_6_0_07_txtfls_md5
WHERE (md5cnt > 1)
GROUP BY md5
ORDER BY md5cnt DESC;
I think you are looking for
On Aug 30, 2008, at 6:26 AM, Albretch Mueller wrote:
Well, then obviously there is the need for it and you were not
successful enough at convincing these developers that they were
"confusing postgresql with a spreadsheet program"
The behavior you are looking for is typical of a spreadsheet, b
On Saturday 30 August 2008 5:23:25 am Albretch Mueller wrote:
> On Thu, Aug 28, 2008 at 7:50 PM, Adrian Klaver <[EMAIL PROTECTED]> wrote:
> > Define easily.
>
> ~
> OK, let me try to outline the approach I would go for:
> ~
> I think "COPY FROM CSV" should have three options, namely:
> ~
> 1) th
On Aug 30, 2008, at 9:19 AM, Christophe wrote:
On Aug 30, 2008, at 6:26 AM, Albretch Mueller wrote:
Well, then obviously there is the need for it and you were not
successful enough at convincing these developers that they were
"confusing postgresql with a spreadsheet program"
The behavior y
thank you Stefan your SQL worked, but still; I am just asking and my
programming bias will certainly show, but aren't you effectively
"calling" count on the table three times if you go:
~
SELECT md5, COUNT(md5)
FROM jdk1_6_0_07_txtfls_md5
GROUP BY md5
HAVING COUNT(md5) > 1
ORDER BY COUNT(md5) DESC;
"Albretch Mueller" <[EMAIL PROTECTED]> writes:
> thank you Stefan your SQL worked, but still; I am just asking and my
> programming bias will certainly show, but aren't you effectively
> "calling" count on the table three times if you go:
The system is smart enough to only do the count() once.
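~
For readers following along, here is a minimal sketch of the HAVING form of the query. WHERE filters rows before they are grouped, so an aggregate such as COUNT() is not available there; HAVING filters the groups afterwards. The sketch runs against an in-memory SQLite database through Python's sqlite3 module rather than PostgreSQL, and the table name txtfls_md5, its path column, and the sample rows are made up for illustration.

```python
import sqlite3


def find_duplicates(rows):
    """Return (md5, count) pairs for md5 values that occur more than once."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical stand-in for the poster's jdk1_6_0_07_txtfls_md5 table.
    conn.execute("CREATE TABLE txtfls_md5 (path TEXT, md5 TEXT)")
    conn.executemany("INSERT INTO txtfls_md5 VALUES (?, ?)", rows)
    result = conn.execute(
        """
        SELECT md5, COUNT(md5) AS md5cnt
        FROM txtfls_md5
        GROUP BY md5
        HAVING COUNT(md5) > 1          -- filter groups, not rows
        ORDER BY COUNT(md5) DESC, md5  -- md5 added only to break ties
        """
    ).fetchall()
    conn.close()
    return result


# Made-up sample data: "aaa" appears three times, "bbb" twice, "ccc" once.
sample = [
    ("a.txt", "aaa"),
    ("b.txt", "bbb"),
    ("c.txt", "aaa"),
    ("d.txt", "ccc"),
    ("e.txt", "bbb"),
    ("f.txt", "aaa"),
]

print(find_duplicates(sample))  # [('aaa', 3), ('bbb', 2)]
```

The extra md5 sort key is not in the original query; it only makes the order deterministic when two hashes have the same count.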
> spreadsheet programs (generally; I'm sure there are exceptions) don't have
> the notion of a schema; each cell can hold its own particular type.
~
Oh, now I see what Martin meant!
~
> that's not a traditional part of a database engine.
~
well, yeah! I would totally agree with you, but since I
> The system is smart enough to only do the count() once.
~
But not smart enough to make a variable you declare point to that
internal variable so that things are clearer/ easier ;-)
~
Thanks
lbrtchx
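~
If the goal is to name the count once and then refer to it by that name, one option is to compute it in a subquery and filter in the outer query, where the alias is an ordinary column. This is only a sketch of that idea, not something proposed in the thread; it again uses an in-memory SQLite database via Python's sqlite3 module, and the table name, the "counted" alias, and the sample rows are hypothetical.

```python
import sqlite3


def duplicates_via_subquery(rows):
    """Same result as the HAVING query, with md5cnt usable in WHERE."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical stand-in table; the column names are assumptions.
    conn.execute("CREATE TABLE txtfls_md5 (path TEXT, md5 TEXT)")
    conn.executemany("INSERT INTO txtfls_md5 VALUES (?, ?)", rows)
    result = conn.execute(
        """
        SELECT md5, md5cnt
        FROM (
            SELECT md5, COUNT(md5) AS md5cnt
            FROM txtfls_md5
            GROUP BY md5
        ) AS counted
        WHERE md5cnt > 1             -- md5cnt is a plain column out here
        ORDER BY md5cnt DESC, md5
        """
    ).fetchall()
    conn.close()
    return result


rows = [("x.txt", "h1"), ("y.txt", "h2"), ("z.txt", "h1")]
print(duplicates_via_subquery(rows))  # [('h1', 2)]
```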
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your
On Saturday 30 August 2008 9:42:19 am Adrian Klaver wrote:
> On Saturday 30 August 2008 5:23:25 am Albretch Mueller wrote:
> > On Thu, Aug 28, 2008 at 7:50 PM, Adrian Klaver <[EMAIL PROTECTED]> wrote:
> > > Define easily.
> >
> > ~
> > OK, let me try to outline the approach I would go for:
> > ~
On Sat, Aug 30, 2008 at 01:36:25PM -0400, Albretch Mueller wrote:
> > The system is smart enough to only do the count() once.
> ~
> But not smart enough to make a variable you declare point to that
> internal variable so that things are clearer/ easier ;-)
The SQL standard has pretty clear rules
On Aug 30, 2008, at 10:33 AM, Albretch Mueller wrote:
well, yeah! I would totally agree with you, but since I doubt very
much "COPY FROM CSV" is part of the SQL standard to begin with, why
not spice it up a little more?
I'd guess that coming up with a general algorithm to guess the type
fr
On Thu, Aug 28, 2008 at 7:45 PM, Matthew Dennis <[EMAIL PROTECTED]> wrote:
> Another question though. Since I could potentially start transaction, drop
> indexes/checks, replace function, create indexes/checks, commit transaction
> could I deal with the case of the constant folding into the cach
> ... are times local or UTC
~
this is a rather semantic, not a syntactic issue that some code could
NOT decide based on the data it reads
~
> Should we assume integer or float?
~
is a dot anywhere in the data you read in for that particular column? ...
~
> Varchar or text?
~
Is the length of th
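~
To make the kind of guessing discussed here concrete, below is a rough sketch of a client-side type guesser, written in Python with only the standard library. It is an illustration under stated assumptions, not a proposal for COPY itself: the function names classify and guess_column_types are hypothetical, the first CSV row is assumed to be a header, and the rule is simply "integer if every value parses as an integer, numeric if every value parses as a number, otherwise text". As noted above, semantic questions such as local time versus UTC cannot be decided this way.

```python
import csv
import io

# Widening order: a column can only move toward a more general type.
RANK = {"integer": 0, "numeric": 1, "text": 2}


def classify(value):
    """Guess the narrowest type name that can hold a single CSV field."""
    try:
        int(value)
        return "integer"
    except ValueError:
        pass
    try:
        float(value)
        return "numeric"
    except ValueError:
        return "text"


def guess_column_types(csv_text):
    """Return {column name: guessed type}, assuming the first row is a header."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    # Start from the narrowest guess; a column with no data rows stays "integer".
    guesses = {name: "integer" for name in header}
    for row in reader:
        for name, value in zip(header, row):
            kind = classify(value)
            if RANK[kind] > RANK[guesses[name]]:
                guesses[name] = kind
    return guesses


data = "id,price,name\n1,2.5,foo\n2,3,bar\n"
print(guess_column_types(data))
# {'id': 'integer', 'price': 'numeric', 'name': 'text'}
```

Even this small version shows why the guess is fragile: one stray value late in the file changes a column's type, and Python's own parsers accept strings such as "nan" as numbers, so the result depends on the reading program as much as on the data.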
You have made clear to me why my attempt for an RFE for COPY FROM CSV
has found some technical resistance/disagreement, but I still think my
idea even if not so popular for concrete and cultural reasons makes at
least sense to some people
It's a perfectly reasonable problem to want to solve; th