from:"Dan Bolser"

Re: Select query problem

2006-07-27 Thread Dan Bolser


Barry wrote:

Nenad Bosanac schrieb:


Hi I have one problem that i can`t resolve.



still need advice or is it solved?




IF!!! you need IF!! :)

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]

Query Complexity (big 'O')

2005-06-21 Thread Dan Bolser


Hello, 

I am interested in the theoretical time / space complexity of SQL queries
on indexed / non-indexed data. 

I think I read somewhere that a JOIN on an indexed column is something
like O[mn*log(mn)] (m rows joined to n).

I assume without an index it is just O[m*n]

Specifically I want to know the complexity of a query that does a
'cross tabulation'

SELECT
  X, 
  SUM(if(Y=1,Z,0)) AS s1,
  SUM(if(Y=2,Z,0)) AS s2,
  SUM(if(Y=3,Z,0)) AS s3,
  ...
FROM 
  T1
GROUP BY 
  X;

Assuming both X and Y are indexed, how does the complexity grow with
increasing 's' (more if clauses). More basic, what is the complexity of
the group by statement?

Can anyone point me to a good online guide to complexity of SQL?

Thanks very much for any suggestions :)

Dan.


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]

SQL 'clustering' query?

2005-07-14 Thread Dan Bolser


Hi, I remember reading about an SQL query type which did something like
select all 'aircraft hangers' which contained exactly (or at least) some
given set of aircraft.

Or did it select the list of pilots qualified to fly all the airplanes in
the hanger... 

Anyway, I forget the syntax (and the fancy name for this kind of query).

What I want to do is the following, given this data...


Table: ATTRIBUTE_LIST;

ID  ATTRIBUTE
W   A
W   B
W   C
X   A
X   B
X   C
Y   A
Y   B
Y   C
Y   D
Z   E

--> SQL MAGIC --> 


Table: CLUSTERS

G_IDID
1   W
1   X
2   Y
3   Z


That is, to group together all ID's with the same 'set' of ATTRIBUTES.

Currently I am doing this using 'GROUP_CONCAT', but my attribute list just
went above the limit for the GROUP_CONCAT column...

+-+--+--+
| Level   | Code | Message  |
+-+--+--+
| Warning | 1260 | 7 line(s) were cut by GROUP_CONCAT() |
+-+--+--+

My query looks roughly like this...


SET @i:=0, @x:='', @row:='';
#
DROPTABLE CLUSTERS;
CREATE  TABLE CLUSTERS
  (PRIMARY KEY (ID), INDEX (G_ID))
#
SELECT
  ID, G_ID
  #
FROM
(
  SELECT
ID,
#
@x:= ATTR_LIST   AS HIDDEN1,
#
IF(@row = @x, @i, @i:[EMAIL PROTECTED]) AS G_ID,
#
   @row:= @x AS HIDDEN2
#
  FROM 
  (
SELECT 
  ID,
  GROUP_CONCAT(ATTRIBUTE) AS ATTR_LIST,
FROM
  ATTRIBUTE_LIST
GROUP BY
  ID
#
  ) AS vt1
  #
  ORDER BY  -- This is very important for
ATTR_LIST   -- the overall query.
  #
) AS vt2;

(And thats the highly simplified version!)


I can't shake the feeling that this 'string based' approach (while quite
speedy) is inherently messy, and that a proper 'set based' approach
should exist, and shouldn't have the limitation in the number of
attributes that the above method has.

In general I would really like to (somehow) develop a suite of easy to use
'SQL CLUSTER' commands, as the data mining community needs that kind of
thing in nice general (set based) abundance :)

Anyway, thanks for any feedback on any of the above,

Dan.


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]

Re: SQL 'clustering' query?

2005-07-15 Thread Dan Bolser

On Thu, 14 Jul 2005, Peter Brawley wrote:

>Dan,
>
>Hi, I remember reading about an SQL query type which did something like
>select all 'aircraft hangers' which contained exactly (or at least) some
>given set of aircraft.
>
>Or did it select the list of pilots qualified to fly all the airplanes in
>the hanger... 
>
>Anyway, I forget the syntax (and the fancy name for this kind of query).
>
>
>I think the concept you're after is relational division. The aeroplane 
>hangar was one of Celko's examples. Another from him is at 
>http://www.artfulsoftware.com/queries.php#28, other examples at
>http://www.artfulsoftware.com/queries.php#22, 
>http://www.artfulsoftware.com/queries.php#33,

Cheers, I will try to rephrase my query below as 'RELATIONAL DIVISION' :)

In the short term I solved my problem with 

group_concat_max_len system variable - For some reason I thought it would
already be at the maximum lenght. I doubled it up and saw my warnings
dissapear 

Warning: 1260

Thanks very much for the above links,

Dan.


>
>PB
>
>
>Dan Bolser wrote:
>
>>Hi, I remember reading about an SQL query type which did something like
>>select all 'aircraft hangers' which contained exactly (or at least) some
>>given set of aircraft.
>>
>>Or did it select the list of pilots qualified to fly all the airplanes in
>>the hanger... 
>>
>>Anyway, I forget the syntax (and the fancy name for this kind of query).
>>
>>What I want to do is the following, given this data...
>>
>>
>>Table: ATTRIBUTE_LIST;
>>
>>IDATTRIBUTE
>>W A
>>W B
>>W C
>>X   A
>>X B
>>X C
>>Y A
>>Y B
>>Y C
>>Y D
>>Z E
>>
>>--> SQL MAGIC --> 
>>
>>
>>Table: CLUSTERS
>>
>>G_ID  ID
>>1 W
>>1 X
>>2 Y
>>3 Z
>>
>>
>>That is, to group together all ID's with the same 'set' of ATTRIBUTES.
>>
>>Currently I am doing this using 'GROUP_CONCAT', but my attribute list just
>>went above the limit for the GROUP_CONCAT column...
>>
>>+-+--+--+
>>| Level   | Code | Message  |
>>+-+--+--+
>>| Warning | 1260 | 7 line(s) were cut by GROUP_CONCAT() |
>>+-+--+--+
>>
>>My query looks roughly like this...
>>
>>
>>SET @i:=0, @x:='', @row:='';
>>#
>>DROPTABLE CLUSTERS;
>>CREATE  TABLE CLUSTERS
>>  (PRIMARY KEY (ID), INDEX (G_ID))
>>#
>>SELECT
>>  ID, G_ID
>>  #
>>FROM
>>(
>>  SELECT
>>ID,
>>#
>>@x:= ATTR_LIST   AS HIDDEN1,
>>#
>>IF(@row = @x, @i, @i:[EMAIL PROTECTED]) AS G_ID,
>>#
>>   @row:= @x AS HIDDEN2
>>#
>>  FROM 
>>  (
>>SELECT 
>>  ID,
>>  GROUP_CONCAT(ATTRIBUTE) AS ATTR_LIST,
>>FROM
>>  ATTRIBUTE_LIST
>>GROUP BY
>>  ID
>>#
>>  ) AS vt1
>>  #
>>  ORDER BY  -- This is very important for
>>ATTR_LIST   -- the overall query.
>>  #
>>) AS vt2;
>>
>>(And thats the highly simplified version!)
>>
>>
>>I can't shake the feeling that this 'string based' approach (while quite
>>speedy) is inherently messy, and that a proper 'set based' approach
>>should exist, and shouldn't have the limitation in the number of
>>attributes that the above method has.
>>
>>In general I would really like to (somehow) develop a suite of easy to use
>>'SQL CLUSTER' commands, as the data mining community needs that kind of
>>thing in nice general (set based) abundance :)
>>
>>Anyway, thanks for any feedback on any of the above,
>>
>>Dan.
>>
>>
>>  
>>
>
>
>


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]

Make a report like this...

2005-07-20 Thread Dan Bolser


+--+---+++-+
| G_ID | TOTAL | G2 | NR | ASSEMBLY_LISTING|
+--+---+++-+
| 1132 |34 |  1 |  1 | 1bf3-1,1bgj-1,1bgn-1,1bkw-1,1cc4-1, |
|  |   ||| 1cc6-1,1cj2-1,1cj3-1,1cj4-1,1d7l-1, |
|  |   ||| 1dob-1,1doc-1,1dod-1,1doe-1,1ius-1, |
|  |   ||| 1k0i-1,1k0j-1,1k0l-1,1pbb-1,1pbc-1, |
|  |   ||| 1pbd-1,1pbe-1,1pbf-1,1pdh-1,1phh-1, |
|  |   ||| 1pxa-1,1pxb-1,1pxc-1,2phh-1 |
+--+---+++-+
| 1793 |10 |  1 |  1 | 1jdx-1,2jdw-1,2jdx-1,3jdw-1,4jdw-1, |
|  |   ||| 5jdw-1,6jdw-1,7jdw-1,8jdw-1,9jdw-1  |
+--+---+++-+


Hello, I made the above report by hand after using a GROUP_CONCAT
query. Is there any way to make this kind of format automatically in
MySQL?

It seems it would be a nice feature.

Dan.


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]

Re: Make a report like this...

2005-07-20 Thread Dan Bolser

On Wed, 20 Jul 2005, Peter Brawley wrote:

>Dan
>
>/>Hello, I made the above report by hand after using a GROUP_CONCAT
> >query. Is there any way to make this kind of format automatically in
> >MySQL?/
>
>How about HTML output (mysql ... -H)?

Yeah, I found that works, for example also with 

'GROUP_CONCAT( ... SEPARATOR = "")'

I was just looking for a text based report if possible. It seems it should
be do-able in perl or similar, but I don't know if it has.

Cheers,


>
>PB
>
>--
>
>Dan Bolser wrote:
>
>>+--+---+++-+
>>| G_ID | TOTAL | G2 | NR | ASSEMBLY_LISTING|
>>+--+---+++-+
>>| 1132 |34 |  1 |  1 | 1bf3-1,1bgj-1,1bgn-1,1bkw-1,1cc4-1, |
>>|  |   ||| 1cc6-1,1cj2-1,1cj3-1,1cj4-1,1d7l-1, |
>>|  |   ||| 1dob-1,1doc-1,1dod-1,1doe-1,1ius-1, |
>>|  |   ||| 1k0i-1,1k0j-1,1k0l-1,1pbb-1,1pbc-1, |
>>|  |   ||| 1pbd-1,1pbe-1,1pbf-1,1pdh-1,1phh-1, |
>>|  |   ||| 1pxa-1,1pxb-1,1pxc-1,2phh-1 |
>>+--+---+++-+
>>| 1793 |10 |  1 |  1 | 1jdx-1,2jdw-1,2jdx-1,3jdw-1,4jdw-1, |
>>|  |   ||| 5jdw-1,6jdw-1,7jdw-1,8jdw-1,9jdw-1  |
>>+--+---+++-+
>>
>>
>>Hello, I made the above report by hand after using a GROUP_CONCAT
>>query. Is there any way to make this kind of format automatically in
>>MySQL?
>>
>>It seems it would be a nice feature.
>>
>>Dan.
>>
>>
>>  
>>
>


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]

Truly random 'groupie'

2005-08-25 Thread Dan Bolser



Hi, I think I saw this question being asked before, but I can't remember 
if their was a simple answer.


If I have a table of two columns (PK and FK) with a one to many 
relationship, i.e.


PK  FK
1   a
2   a
3   a
4   b
5   b
6   b
7   c
8   c
9   d


Can I easily select a truly random value of PK for each distinct value 
of FK?


I use the words 'truly random' as I think the term 'pseudo random' 
applies to the following syntax...


  SELECT PK,FK FROM above_table GROUP BY FK;


Of course I can do...

  SELECT MIN(PK),FK FROM above_table GROUP BY FK;

  SELECT MAX(PK),FK FROM above_table GROUP BY FK;


But what I would really love to do is something like...

  SELECT RANDOM(PK),FK FROM above_table GROUP BY FK;


Any general (and simple) SQL solution to this problem?

Any hacker out there who can put together a UDF in 10 minutes flat?

The problem is slightly more complex than stated, because PK is actually 
a multi-part primary key. In my fevered imagination I picture syntax 
that looks like this...


  SELECT RANDOM(ROW(PK_P1,PK_P2)),FK FROM above_table GROUP BY FK;


but that could be a garbage suggestion.

Any pointers to previous answers to similar questions would be great.

All the best,
Dan.



--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]

Re: SSH tunnel for Mysql

2005-11-22 Thread Dan Bolser


Jerry Swanson wrote:

Both Linux computers.



This is what I do...

$ more ssh_to_mysql_on_beta
#!/bin/sh

ssh \
 -t \
 -g \
 -L 3306:beta.domain.ac.uk:3306 \
 [EMAIL PROTECTED] \
 "./bin/sleep.plx mysql"

That last line is a trick to prevent the terminal from 'timeing out'. 
The sleep.plx application is just a little perl script which writes 
"sleeping $ARGV[0]\n" every 5 minuites.


Running the above on my local machine (linux) allows me to connect to 
mysql on 'local' port 3306, and have that magically redirected via ssh 
to beta port 3306.


Dan.

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]

Central UDF project at mysql.com?

2005-03-30 Thread Dan Bolser


Hi, 

I searched for previous discussion on this topic, but didn't find any.

I would like to see a centralized MySQL hosted UDF archive and development
project. The only existing 'archives' seem to be somewhat poorly
maintained (sorry), and suffer for their duplicated efforts and being
loosely distributed throughout the web.

The best I can find are here (ranked according to Google)...

http://empyrean.lib.ndsu.nodak.edu/~nem/mysql/udf/
http://www.oreillynet.com/pub/wlg/2292
http://www.thecodeproject.com/Purgatory/mygroupconcat.asp
http://mysql-udf.sourceforge.net/
http://www.linuxjournal.com/article/6841

I think a centralized project would do wonders for the UDF community,
allowing UDF's to be discussed, suggested and developed under one roof. A
first step should be to create a [EMAIL PROTECTED] mailing list. Without
such a central list the UDF community can't communicate effectively. Who
better than MySQL to organize the MySQL UDF community?

A simple "not officially supported" statement is all that is needed. Good
UDF's could become part of MySQL proper, and a UDF 'bundle' would be a
great development. MySQL programmers could help build UDF's, and the
community could vote on 'wanted' functions.

You could probably guess where all this is going, and that is towards my
own UDF request (where to ask?), but I will leave that for later.

Any comments? Any postings that I have missed? Any reason that their is no
udf mailing list? I think that their are tons of UDF's waiting to happen,
given the right conditions.

All the best,
Dan.



-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]

RE: The best way to transfer data to another server

2005-03-30 Thread Dan Bolser

On Wed, 30 Mar 2005, Denis Gerasimov wrote:

>
>> >
>> > Hello list,
>> >
>> > I have two MySQL 4.1 servers, one local and one remote. I need to
>> transfer
>> > database from one server to another. What actually is the best way of
>> > handling this task?
>> >
>> > Are there any standard MySQL tools available for doing that (I mean
>> MySQL
>> > Administrator/Query Browser etc.)
>> >
>> It's not clear from your question whether you want to transfer the data
>> once
>> only or if you want to transfer it repeatedly, perhaps daily or weekly. If
>> you mean the latter, you may want to look into MySQL Replication, which
>> enables all the changes made to the data on one server to be automatically
>> made to the other server as well.
>
>Well, I need this only time to time, just for transferring data when I need.
>
>
>I assume that that is a bad idea to copy db files from one server to
>another. mysqldump is not an exciting option too...
>
>But there is a simple solution when using MySQL Administrator: make backup
>of desired data to SQL file, then connect to another server and restore it!

The perl package 'mysqlhotcopy' looks good. I didn't try it yet.


>
>> 
>> I haven't used MySQL Replication but I know it exists and is documented in
>> the MySQL manual.
>> 
>> Rhino
>> 
>> 
>> 
>> --
>> No virus found in this outgoing message.
>> Checked by AVG Anti-Virus.
>> Version: 7.0.308 / Virus Database: 266.8.3 - Release Date: 25/03/2005
>> 
>> 
>> --
>> MySQL General Mailing List
>> For list archives: http://lists.mysql.com/mysql
>> To unsubscribe:
>> http://lists.mysql.com/[EMAIL PROTECTED]
>
>
>


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]

Re: Central UDF project at mysql.com?

2005-03-30 Thread Dan Bolser

On Wed, 30 Mar 2005, Mark Papadakis wrote:

>That is a really good idea - though I am not sure there is much of
>those out there to justify the cause, nor many developers actually
>using UDFs.

Cheers :)

I think with the right infra more people would start using and developing
UDF's, especially if the work was a part of 'MySQL' proper and not just
different groups of individuals. I know it wouldn't make any *real*
difference, but it would make a psychological difference. Also a central
project would be a way to boost visibility for different UDF projects, for
example good (ongoing) work could be showcased in the MySQL news letter.


>Still, having them all in one place could be nothing but a good thing.

Yeah, I totally agree :) Especially if resources like the MySQL bugtracker
and mailing lists could be shared. 

Is this the right forum for requesting such things?

All the best,
Dan. 


>
>MarkP
>
>On Wed, 30 Mar 2005 14:39:11 +0100 (BST), Dan Bolser
><[EMAIL PROTECTED]> wrote:
>> 
>> Hi,
>> 
>> I searched for previous discussion on this topic, but didn't find any.
>> 
>> I would like to see a centralized MySQL hosted UDF archive and development
>> project. The only existing 'archives' seem to be somewhat poorly
>> maintained (sorry), and suffer for their duplicated efforts and being
>> loosely distributed throughout the web.
>> 
>> The best I can find are here (ranked according to Google)...
>> 
>> http://empyrean.lib.ndsu.nodak.edu/~nem/mysql/udf/
>> http://www.oreillynet.com/pub/wlg/2292
>> http://www.thecodeproject.com/Purgatory/mygroupconcat.asp
>> http://mysql-udf.sourceforge.net/
>> http://www.linuxjournal.com/article/6841
>> 
>> I think a centralized project would do wonders for the UDF community,
>> allowing UDF's to be discussed, suggested and developed under one roof. A
>> first step should be to create a [EMAIL PROTECTED] mailing list. Without
>> such a central list the UDF community can't communicate effectively. Who
>> better than MySQL to organize the MySQL UDF community?
>> 
>> A simple "not officially supported" statement is all that is needed. Good
>> UDF's could become part of MySQL proper, and a UDF 'bundle' would be a
>> great development. MySQL programmers could help build UDF's, and the
>> community could vote on 'wanted' functions.
>> 
>> You could probably guess where all this is going, and that is towards my
>> own UDF request (where to ask?), but I will leave that for later.
>> 
>> Any comments? Any postings that I have missed? Any reason that their is no
>> udf mailing list? I think that their are tons of UDF's waiting to happen,
>> given the right conditions.
>> 
>> All the best,
>> Dan.
>> 
>> --
>> MySQL General Mailing List
>> For list archives: http://lists.mysql.com/mysql
>> To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
>> 
>> 
>
>
>


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]

Re: if statement help

2005-03-30 Thread Dan Bolser


I think I remember a match_at(":") or pat_index(":") UDF which would
return the position of the first ":" for you, but I can't find it if it
does exist. 


On Wed, 30 Mar 2005, Christopher Vaughan wrote:

>I have data in a table listed as 
>44:22:22
>333:33:33
>It stands for hhh:mm:ss
>I want to break each part of the data into different parts based on the ':' to 
>separate them.  Then I want to take that data and sum it.  I wrote an if 
>statement to parse through this table but I can't get it to work.  I am not 
>sure If my syntax is
>wrong because I can't find anything to check against it.  
>
>Here is the syntax:
>
>IF 
>(SELECT job_walltime
>FROM time
>WHERE CHAR_LENGTH( job_walltime ) >=9)
>THEN
>(SELECT sum( left( job_walltime,  '3'  )  ) hours, sum(mid(  `job_walltime` , 
>4, 2  )) , sum( right( job_walltime,  '2'  )  ) seconds
>FROM  `time`)
>ELSE
>(SELECT sum( left( job_walltime,  '3'  )  ) hours, sum(mid(  `job_walltime` , 
>3, 2  ) ), sum( right( job_walltime,  '2'  )  ) seconds
>FROM  `time`)
>END 
>
>I know this isn't the only way to do this but this but this is the first 
>suggestion that comes to mind.  Any input would be great.
>
>Further info MySQL 3.23.58  running on RedHat 9
>
>-Chris Vaughan
>
>www.clusters.umaine.edu
>
>
>
>
>
>


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]

UDF request?

2005-03-31 Thread Dan Bolser


Hi, 

I have a request for a UDF that I would find really useful. I am
duplicating this question on the community list as well as the MySQL list,
because I am not sure where best to make this kind of request (see the
previous post http://lists.mysql.com/community/97).

I think the following syntax would be really cool...

SELECT 
  a.*,
  b.*
FROM
  table_one a 
RANDOM JOIN   -- <-- :)
  table_two b
USING 
  (ID)
;

Lets say table_one and table_two have a primary key called ID in common
(although of course that isn't necessary).

The idea is that the RANDOM JOIN would first calculate the INNER JOIN,
then scramble the 'links' (or rows) between the two tables. The result
would be the same number of rows overall, the same number of distinct
a.ID's and b.ID's, but those a.ID's and b.ID's would be randomly
associated with the marginals given by the correct inner join.

Hopefully that makes sense.

I think this function would be really useful for statistical analysis of
scientific data within MySQL (using randomized versions of the
associations within the tables).

Not sure if the above syntax is optimal, because I would like to be able
to do things like this...

table_one
DeptPerson  Gender
A   1   M
A   2   F
A   3   M
B   4   F
B   5   M
B   6   F

table_two
DeptPerson  Spending
A   1   10
A   2   20
A   3   30
B   4   40
B   5   50
B   6   60

SELECT 
  Dept,
  Gender,
  AVG(Spending)
FROM
  table_one a
INNER JOIN
  table_two b
USING 
  (Dept,Person)
GROUP BY
  Dept, Gender
;


With the above query (which I hope is sensible) I would like to keep the
departments fixed, and randomize the genders of the people in the
departments (keeping the number of each sex in each department the same).

So we could INNER JOIN using Dept and then RANDOM JOIN (as described
above) using Person - all in one 'JOIN'. All else being the same this
should randomize the Gender, but keep the marginals.

I guess this is overly complex given the problem, and it actually raises
more questions instantly (about statistical analysis), but the function is
basic (I hope), and like all SQL it is the simplicity that allows you to
build complex and robust statements.

Does the random join make any sense? Is this UDF material?

Any feedback is welcome,

Dan.


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]

RE: UDF request?

2005-04-08 Thread Dan Bolser

On Thu, 7 Apr 2005, Sean Nolan wrote:

>I think you'll find you can do what you want with a cross join. A cross join 
>will join every row from the first table with every row from the second 
>table. It will not randomly do this, so you'd have to be creative in 
>"randomly" selecting rows in the WHERE clause. Here's how you could do this 
>with your data, pardon the highly original and very scientific 
>randomization, but it is just an example :-)

Perhaps I don't understand your "randomization" (because I really don't
understand it ;), but I don't think a CROSS join does the trick, because I
want to randomly pick the same rows from either side of the join that I
would have gotten with an INNER JOIN using the 1:1 primary key between the
two tables. This is analagous to sampling without replacement. If I do a
cross join and then just restrict the number of rows returned (hey! I just
worked out what you are doing below) I don't necessarily get the same
'marginals' (or to randomly pick the same rows from either side of the
join) . This is analagous to sampling with replacement.

Do you see what I mean?




>
>SELECT
>  Dept,
>  Gender,
>  AVG(Spending)
>FROM
>  table_one a
>CROSS JOIN
>  table_two b
>WHERE (a.Person * b.Person) % 4 = 3
>GROUP BY
>  Dept, Gender;
>
>Sean
>
>- Original Message --
>
>Hi,
>
>I have a request for a UDF that I would find really useful. I am
>duplicating this question on the community list as well as the MySQL list,
>because I am not sure where best to make this kind of request (see the
>previous post http://lists.mysql.com/community/97).
>
>I think the following syntax would be really cool...
>
>SELECT
>  a.*,
>  b.*
>FROM
>  table_one a
>RANDOM JOIN   -- <-- :)
>  table_two b
>USING
>  (ID)
>;
>
>Lets say table_one and table_two have a primary key called ID in common
>(although of course that isn't necessary).
>
>The idea is that the RANDOM JOIN would first calculate the INNER JOIN,
>then scramble the 'links' (or rows) between the two tables. The result
>would be the same number of rows overall, the same number of distinct
>a.ID's and b.ID's, but those a.ID's and b.ID's would be randomly
>associated with the marginals given by the correct inner join.
>
>Hopefully that makes sense.
>
>I think this function would be really useful for statistical analysis of
>scientific data within MySQL (using randomized versions of the
>associations within the tables).
>
>Not sure if the above syntax is optimal, because I would like to be able
>to do things like this...
>
>table_one
>Dept   Person  Gender
>A  1   M
>A  2   F
>A  3   M
>B  4   F
>B  5   M
>B  6   F
>
>table_two
>DeptPerson  Spending
>A   1   10
>A   2   20
>A   3   30
>B   4   40
>B   5   50
>B   6   60
>
>SELECT
>  Dept,
>  Gender,
>  AVG(Spending)
>FROM
>  table_one a
>INNER JOIN
>  table_two b
>USING
>  (Dept,Person)
>GROUP BY
>  Dept, Gender
>;
>
>
>With the above query (which I hope is sensible) I would like to keep the
>departments fixed, and randomize the genders of the people in the
>departments (keeping the number of each sex in each department the same).
>
>So we could INNER JOIN using Dept and then RANDOM JOIN (as described
>above) using Person - all in one 'JOIN'. All else being the same this
>should randomize the Gender, but keep the marginals.
>
>I guess this is overly complex given the problem, and it actually raises
>more questions instantly (about statistical analysis), but the function is
>basic (I hope), and like all SQL it is the simplicity that allows you to
>build complex and robust statements.
>
>Does the random join make any sense? Is this UDF material?
>
>Any feedback is welcome,
>
>Dan.
>
>Thread
>UDF request? - Dan Bolser, April 1 2005 12:10am
>
>_
>Dont just search. Find. Check out the new MSN Search! 
>http://search.msn.click-url.com/go/onm00200636ave/direct/01/
>
>
>


--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]

UDF Request "AGGLOM()"

2005-04-08 Thread Dan Bolser


Who can I prod about setting up a UDF repo at MySQL. I think 'they' should
do this ;)

http://lists.mysql.com/community/97

Anyway I am posting this request to 'community' because I still don't know
the appropriate place to post UDF related stuff.

This is anoter (potentially crazy) idea for a UDF that I would find very
usefull in my research...

AGGLOM - Simple agglomerative clustering for MySQL ...

The UDF would work on any NUMBER column, and return the
number of 'clusters' using agglomerative clustering
with a certain threshold as an input.

Agglomerative clustering merges any two numbers that
are within the 'threshold', and replaces those numbers
with the average of the two. The clustering proceedes
smallest 'gap' first, and stops when no two numbers are
within the threshold.

The result would be the number (or perhaps the values) of the
remaining clusters.

Syntax (suggested) 

AGGLOM(THRESH,expr (returning a number))

For example

Table1

C1 C2
A 1
A 2
A 3
A 4
A 5
A 6
A 7
B 10
B 11
B 12
B 56
B 57
B 58
B 99
B 101


SELECT C1, AGGLOM(C2,1) AS C3 FROM Table1 GROUP BY C1;

C1 C3
A 4
B 6


SELECT C1, AGGLOM(C2,2) AS C3 FROM Table1 GROUP BY C1;

C1 C3
A 3
B 3


SELECT C1, AGGLOM(C2,3) AS C3 Table1 GROUP BY C1;

C1 C3
A 2
B 3


SELECT C1, AGGLOM(C2,4) AS C3 Table1 GROUP BY C1;

C1 C3
A 1
B 3


SELECT C1, AGGLOM(C2,50) AS C3 Table1 GROUP BY C1;

C1 C3
A 1
B 1



Remember, merge numbers with the smallest difference
first, and replace each pair with the average of the
two. Recalculate the differences for the new number,
and repeat until no distance is smaller than the threshold.

This is a usefull clustering 'hack' to see if a distribution 
is bi-modal or multi modal for example. It is very quick to
calculate using a hash table, and could be a great
function to add.

Is this idea as crazy as I think it might be?




-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]

Re: Does such a JOIN exist that can create a pivot table?

2005-04-08 Thread Dan Bolser


I think what you are talking about could be called a 'crosstabulation' or
a crosstab. 

Their are some tutorials about making cross-tabs using perl. I have used
them a lot, and they are really great.

I tend to stack up lots of IF statemens...

Table1

month   person  sex sales
1   a   m   10  
1   b   f   20
1   c   m   30
2   a   m   40
2   b   f   50
2   c   m   60


select 
  month,
  sum(if(sex='m',sales,0)) as male_sales,
  sum(if(sex='f',sales,0)) as female_sales,
  sum(sales) as total
from 
  Table1
group by
  month;

This would give (I think)...

month   male_sales  female_salestotal
1   40  20  60
2   100 50  150

You could easily add a...

count(distinct(if(sex='m',person,NULL))) as total_men,
count(distinct(if(sex='f',person,NULL))) as total_women,

to create average sales for men and women, or anything else you want.

Does that look right?


On Fri, 8 Apr 2005, Jacob, Raymond A Jr wrote:

>Question: I frequently would like to summarize the results of my query in 
>heiarchical layout also
>known as a Pivot table.
>Here is an example of what I would like output. NULL will be printed as a 
>space 
>when output.
>
>sum of broken| source of  | qty  |reseller of |qty  |customer with 
>|qty by
>  bolts  |broken bolts|regional  |broken bolts|reseller |broken bolts  
> |customer
>-
> 100 |  NULL  |NULL  |NULL| NULL|NULL  
> |NULL
>(100/NULL)   |  US| 75   |NULL| NULL|NULL  
>|NULL
>(100/NULL)   | (US/NULL)  |(75/NULL) |ACME| 35  |NULL  
>|NULL
>(100/NULL)   | (US/NULL)  |(75/NULL) |(ACME/NULL) |(35/NULL)|Barney Rubble |20
>(100/NULL)   |  US| 75   |NULL| NULL|NULL  
>|NULL
>(100/NULL)   | (US/NULL)  |(75/NULL) |ACME| 35  |NULL  
>|NULL
>(100/NULL)   | (US/NULL)  |(75/NULL) |(ACME/NULL) |(35/NULL)|Betty Rubble  |5
>(100/NULL)   |  US| 75   |NULL| NULL|NULL  
>|NULL
>(100/NULL)   | (US/NULL)  |(75/NULL) |ABLE| 25  |NULL  
>|NULL
>(100/NULL)   | (US/NULL)  |(75/NULL) |(ABLE/NULL) |(25/NULL)|Sponge Bobble |20
>(100/NULL)   |  MEXICO| 15   |NULL| NULL|NULL  
>|NULL
>(100/NULL)   | (MEX/NULL) |(15/NULL) |TIPPY   | 12  |NULL  
>|NULL
>(100/NULL)   | (MEX/NULL) |(15/NULL) |(ACME/NULL) |(12/NULL)|Juan Pepe |7
>
>
>100  *  *  *   *   * *
>*   US  75 *   *   * *
>*   *   * ACME 35  * *
>*   *   *  *   *   Barney Ruble 20
>
>Where * represents NULL or a Primary Key.
>
>How does one build a pivot table?
>from tables such as:
>
>factory_parts table
> ::{
>part no, 
>plant, 
>qty_manufactured
>plant name
>}
>reseller_parts table
> ::{
>part no
>plant 
>qty received
>cost
>reseller name
>reseller id
>
>}
>customer_parts table
> ::{
>reseller id
>part no
>plant
>qty sold
>qty recvd
>customer id
>customer name
>}
>
>Ooops now the light bulb comes on 
>I would do:
>   select factory_parts.plant name, 
>  reseller_parts.reseller_name
>  customer_parts.customer_name,
>  customer_parts.qty_recvd
>   from factory_parts,reseller_parts,customer_parts
>   where customer_parts.part_no == 'broken_bolt' AND
> ( customer_parts.part_no == reseller_parts.part.no AND
>   customer_parts.part_no == factory_parts.part.no )
>
>Now the question becomes how does one construct the aggregate columns
>representing the sum of bolts produced by the company,made at the plant,
>shipped to the reseller and sold to the customer,
>then join those aggregate columns? Any suggestions?
>
>Thank you,
>Raymond
>
>
>
>


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]

Re: Thank you-regarding: Does such a JOIN exist that can create a pivot table?

2005-04-08 Thread Dan Bolser

On Fri, 8 Apr 2005, Jacob, Raymond A Jr wrote:

>Does such a JOIN exist that can create a pivot table?
>   Thank you:
>   182361 by: Dan Bolser
>   182362 by: Peter Brawley
>
>Now I must go into my cave and meditate on these queries:
>Ommm,Ommm,...Ommm :-)

Great! Its a really interesting idea to come up with syntax for a single
'pivot table' query. It is certainly a usefull tool.

Any suggested syntax?

(I now see pivot table == crosstabulation query, cheers Pete!).

Dan.

>
>raymond
>
>

-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]

RE: UDF request?

2005-04-08 Thread Dan Bolser

On Fri, 8 Apr 2005 [EMAIL PROTECTED] wrote:

>Dan Bolser <[EMAIL PROTECTED]> wrote on 04/08/2005 12:41:35 PM:
>
>> On Thu, 7 Apr 2005, Sean Nolan wrote:
>> 
>> >I think you'll find you can do what you want with a cross join. A cross 
>join 
>> >will join every row from the first table with every row from the second 
>
>> >table. It will not randomly do this, so you'd have to be creative in 
>> >"randomly" selecting rows in the WHERE clause. Here's how you could do 
>this 
>> >with your data, pardon the highly original and very scientific 
>> >randomization, but it is just an example :-)
>> 
>> Perhaps I don't understand your "randomization" (because I really don't
>> understand it ;), but I don't think a CROSS join does the trick, because 
>I
>> want to randomly pick the same rows from either side of the join that I
>> would have gotten with an INNER JOIN using the 1:1 primary key between 
>the
>> two tables. This is analagous to sampling without replacement. If I do a
>> cross join and then just restrict the number of rows returned (hey! I 
>just
>> worked out what you are doing below) I don't necessarily get the same
>> 'marginals' (or to randomly pick the same rows from either side of the
>> join) . This is analagous to sampling with replacement.
>> 
>> Do you see what I mean?
>> 
>> 
>> 
>> 
>> >
>> >SELECT
>> >  Dept,
>> >  Gender,
>> >  AVG(Spending)
>> >FROM
>> >  table_one a
>> >CROSS JOIN
>> >  table_two b
>> >WHERE (a.Person * b.Person) % 4 = 3
>> >GROUP BY
>> >  Dept, Gender;
>> >
>> >Sean
>> >
>> >- Original Message --
>> >
>> >Hi,
>> >
>> >I have a request for a UDF that I would find really useful. I am
>> >duplicating this question on the community list as well as the MySQL 
>list,
>> >because I am not sure where best to make this kind of request (see the
>> >previous post http://lists.mysql.com/community/97).
>> >
>> >I think the following syntax would be really cool...
>> >
>> >SELECT
>> >  a.*,
>> >  b.*
>> >FROM
>> >  table_one a
>> >RANDOM JOIN   -- <-- :)
>> >  table_two b
>> >USING
>> >  (ID)
>> >;
>> >
>> >Lets say table_one and table_two have a primary key called ID in common
>> >(although of course that isn't necessary).
>> >
>> >The idea is that the RANDOM JOIN would first calculate the INNER JOIN,
>> >then scramble the 'links' (or rows) between the two tables. The result
>> >would be the same number of rows overall, the same number of distinct
>> >a.ID's and b.ID's, but those a.ID's and b.ID's would be randomly
>> >associated with the marginals given by the correct inner join.
>> >
>> >Hopefully that makes sense.
>> >
>> >I think this function would be really useful for statistical analysis 
>of
>> >scientific data within MySQL (using randomized versions of the
>> >associations within the tables).
>> >
>> >Not sure if the above syntax is optimal, because I would like to be 
>able
>> >to do things like this...
>> >
>> >table_one
>> >Dept   Person   Gender
>> >A   1   M
>> >A   2   F
>> >A   3   M
>> >B   4   F
>> >B   5   M
>> >B   6   F
>> >
>> >table_two
>> >DeptPerson  Spending
>> >A   1   10
>> >A   2   20
>> >A   3   30
>> >B   4   40
>> >B   5   50
>> >B   6   60
>> >
>> >SELECT
>> >  Dept,
>> >  Gender,
>> >  AVG(Spending)
>> >FROM
>> >  table_one a
>> >INNER JOIN
>> >  table_two b
>> >USING
>> >  (Dept,Person)
>> >GROUP BY
>> >  Dept, Gender
>> >;
>> >
>> >
>> >With the above query (which I hope is sensible) I would like to keep 
>the
>> >departments fixed, and randomize the genders of the people in the
>> >departments (keeping the number of each sex in each department the 
>same).
>> >
>> >So we could INNER JOIN using Dept and then RANDOM JOIN (as described
>> >above) using Person - all in one 'JOIN'. All else being the same this
>> >should randomize the Ge

Find valid numeric values in a string field?

2005-04-10 Thread Dan Bolser


I have a column like this "my_col varchar(20) null".

The values in the column can be text or numbers. How can I select only
those rows where the value in this column is a valid number?

I need something like IS_DECIMAL(), but I can't find that function.

The following SQL fails to do the job (probably because of
optimization)...

SELECT
  MIXED_COLUMN,
  MIXED_COLUMN + 0
FROM
  TABLE
WHERE
  MIXED_COLUMN =
  MIXED_COLUMN + 0
;

Hey, lets make an IS_DECIMAL UDF! ;)



p.s. why aren't the addresses of these mailing lists anywhere to be found
on the MySQL Lists pages?

Dan.


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]

Re: UDF Request "AGGLOM()"

2005-04-11 Thread Dan Bolser

On Mon, 11 Apr 2005, Arjen Lentz wrote:

>Hi Dan,
>
>On Sat, 2005-04-09 at 02:59, Dan Bolser wrote:
>> Who can I prod about setting up a UDF repo at MySQL. I think 'they' should
>> do this ;)
>
>Yep it's an existing idea, a very good one, and it's on the todo.
>Putting such an infrastructure into place will take some time though.

I can imagine it isn't trivial to set up.

>Would a special forum for this purpose perhaps be a good intermediate
>solution? Routines could be posted there, it's searchable...

Forum = list?

I think it would be a good start, if only to discuss things like this :)

All the best,
Dan.


>
>
>Regards,
>Arjen.
>


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]

Display 1st row of every group?

2005-04-11 Thread Dan Bolser


I read with great interest this 

http://www.artfulsoftware.com/queries.php#4

Display 1st row of every group

SELECT id
FROM tbl
GROUP BY id
HAVING count(*) = 1;

I want to use this syntax with an 'order by' like this...

  SELECT *
FROM tbl
GROUP BY id
  HAVING count(*) = 1
ORDER BY bleah;

Will this syntax return the row within the "id group" with the smallest
value of the bleah column? (is it guaranteed to do so?)

Cheers,
Dan.


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]

Re: Display 1st row of every group?

2005-04-11 Thread Dan Bolser


I guess that entry is either wrong or misleading.

I can get what I want like this

SELECT * 
  FROM tbl 
INNER JOIN (  SELECT id, min(bleah) as bleah
FROM tbl
GROUP BY id
)
USING (id,bleah);

Which will work so long as bleah has a unique minimum value per id group.

On Mon, 11 Apr 2005, Dan Bolser wrote:

>
>I read with great interest this 
>
>http://www.artfulsoftware.com/queries.php#4
>
>Display 1st row of every group
>
>SELECT id
>FROM tbl
>GROUP BY id
>HAVING count(*) = 1;
>
>I want to use this syntax with an 'order by' like this...
>
>  SELECT *
>FROM tbl
>GROUP BY id
>  HAVING count(*) = 1
>ORDER BY bleah;
>
>Will this syntax return the row within the "id group" with the smallest
>value of the bleah column? (is it guaranteed to do so?)
>
>Cheers,
>Dan.
>
>
>


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]

Re: Display 1st row of every group?

2005-04-11 Thread Dan Bolser

On Mon, 11 Apr 2005, Rhino wrote:

>
>- Original Message - 
>From: "Dan Bolser" <[EMAIL PROTECTED]>
>To: 
>Sent: Monday, April 11, 2005 5:58 AM
>Subject: Display 1st row of every group?
>
>
>>
>> I read with great interest this
>>
>> http://www.artfulsoftware.com/queries.php#4
>>
>> Display 1st row of every group
>>
>> SELECT id
>> FROM tbl
>> GROUP BY id
>> HAVING count(*) = 1;
>>
>Despite what the heading says for that query, it is NOT going to return the
>first row of every group. In fact, it is going to return only groups that
>contain one row. For example, given this data:
>
>ID
>--
>5
>5
>5
>6
>7
>7
>8
>8
>8
>9
>
>That query will return return this:
>
>id
>--
>6
>9
>
>In other words, it is returning only groups where there is a single ID with
>that value. Note also that it is returning ONLY an ID, not the ID plus the
>rest of the row. Their description of what the query does is just plain
>wrong.

Delicious second opinion!

Thanks for the reply - if you see my second post you see I fixed my
problem.

Cheers,
Dan.


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]

Re: Display 1st row of every group?

2005-04-11 Thread Dan Bolser

On Mon, 11 Apr 2005, Peter Brawley wrote:

>/>I guess that entry is either wrong or misleading./
>
>Ordering by another column which isn't mutually dependent with the 
>grouping column will have unpredictable results. Is that what you mean 
>by the example being "wrong or misleading"?

No, I mean the example here

http://www.artfulsoftware.com/queries.php#4

Its misleading in so far as it mislead me. Strange how we can take on
faith the behaviour of something that is so clearly not doing what is
expected simply given the context in which it is presented!

-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]

SCRAMBLE(A,B) (was UDF:Request).

2005-04-11 Thread Dan Bolser


Requirement: 

Given two columns of a table (Column1 and Column2) of length x, return two
columns (Column1_Scram and Column2_Scram) such that the distributions of
values in Column1 and Column2 are preserved in Column1_Scram and
Column2_Scram, but the pairs of values are randomized.


Solution suggested by Shawn Green: 

Create a table with two columns, and populate this table with random pairs
of primary keys picked from the original table. Additionally, allow no
duplicate primary keys within either column. Select x rows from this
table, and link both primary keys (the primary key pair) back to the
original table to get the appropriate number of randomized pairs of
Column1 and Column2.

He suggests doing the above like this (more or less):

OriginalTable
PK  A   B
1   a   c
2   a   d
3   b   e
...

CREATE TEMPORARY TABLE IntermediateTable (
  PK1  INT NOT NULL,
  ACHAR(1) NOT NULL,
  PK2  INT NOT NULL,
  BCHAR(1) NOT NULL,
  #
  UNIQUE INDEX (PK1,A),
  UNIQUE INDEX (PK2,B)
);

INSERT IGNORE INTO IntermediateTable
SELECT 
  x.PK, x.A,
  y.PK, y.B
FROM 
  OriginalTable x,
  OriginalTable y
ORDER BY
  RAND();

SELECT 
  x.A,
  y.B
FROM
  IntermediateTable 
INNER JOIN 
  OriginalTable x ON (PK1 = x.PK) INNER JOIN
  OriginalTable y ON (PK2 = y.PK)
LIMIT 
  the_length_of_OriginalTable;


The problem with this solution: 

Its too slow on reasonable sized tables! I never get past the second step
with my data after 10 minutes!

I have 52699 rows in my 'OriginalTable' leading to ~2.7 billion checks
when inserting into the IntermediateTable... or rather 5.4 billion, as I
guess it has to check both rows for the UNIQUE key constraint on every
attempted insert. 

Ideally I would like to be able to do several thousand randomizations over
my data, and at 10 mins a pop that would take all week. (assuming the
query was about to finish when I killed it after 10 mins.)

Is their a faster way to do this randomization in SQL? Am I doing
something really dumb that was never intended by Shawn?

I can easily get the data I need with a quick step into perl, but it would
be really neat if I could do all this in MySQL.

I can imagine a general way to create 'random' joins (over scrambled
data) would have some interesting applications.

Dan.


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]

Re: Display 1st row of every group?

2005-04-11 Thread Dan Bolser

On Mon, 11 Apr 2005, Peter Brawley wrote:

>Thanks to Rhino & Dan for the corrections, HAVING in that query makes it 
>return only id values for which there is one row, and the header is 
>wrong since "first row of every group" doesn't mean anything without 
>explicit ordering.

What do you mean?


>
>PB
>
>-
>
>Dan Bolser wrote:
>
>>On Mon, 11 Apr 2005, Rhino wrote:
>>
>>  
>>
>>>- Original Message - 
>>>From: "Dan Bolser" <[EMAIL PROTECTED]>
>>>To: 
>>>Sent: Monday, April 11, 2005 5:58 AM
>>>Subject: Display 1st row of every group?
>>>
>>>
>>>
>>>
>>>>I read with great interest this
>>>>
>>>>http://www.artfulsoftware.com/queries.php#4
>>>>
>>>>Display 1st row of every group
>>>>
>>>>SELECT id
>>>>FROM tbl
>>>>GROUP BY id
>>>>HAVING count(*) = 1;
>>>>
>>>>  
>>>>
>>>Despite what the heading says for that query, it is NOT going to return the
>>>first row of every group. In fact, it is going to return only groups that
>>>contain one row. For example, given this data:
>>>
>>>ID
>>>--
>>>5
>>>5
>>>5
>>>6
>>>7
>>>7
>>>8
>>>8
>>>8
>>>9
>>>
>>>That query will return return this:
>>>
>>>id
>>>--
>>>6
>>>9
>>>
>>>In other words, it is returning only groups where there is a single ID with
>>>that value. Note also that it is returning ONLY an ID, not the ID plus the
>>>rest of the row. Their description of what the query does is just plain
>>>wrong.
>>>
>>>
>>
>>Delicious second opinion!
>>
>>Thanks for the reply - if you see my second post you see I fixed my
>>problem.
>>
>>Cheers,
>>Dan.
>>
>>
>>  
>>
>


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]

Re: Find valid numeric values in a string field?

2005-04-11 Thread Dan Bolser

On Mon, 11 Apr 2005, Eamon Daly wrote:

>You were on the right track. Casting a string to a number
>results in a 0 if MySQL can't make a proper conversion,
>which is slightly counter-intuitive. This might suit your
>needs:

I see. Cheers.

Shame their is no 'IS_VALID_INT()' or 'IS_VALID_FLOAT()'.

:)

Dan.




>
>mysql> select * from mixed_num;
>++
>| my_col |
>++
>| a  |
>| 0  |
>| 1  |
>| abc123 |
>| 123abc |
>| 1.2|
>| -1 |
>++
>7 rows in set (0.12 sec)
>
>mysql> select my_col, my_col + 0 from mixed_num where my_col = '0' OR my_col 
>+ 0 != 0;
>+++
>| my_col | my_col + 0 |
>+++
>| 0  |  0 |
>| 1  |  1 |
>| 123abc |123 |
>| 1.2|1.2 |
>| -1 | -1 |
>+++
>5 rows in set (0.00 sec)
>
>Note that '123abc' is changed to '123', which may make a big
>difference to you.
>
>A less magical way to get at these numbers is using REGEXP:
>
>Unsigned decimals:
>
>mysql> select my_col from mixed_num where my_col REGEXP '^[0-9.]+$';
>++
>| my_col |
>++
>| 0  |
>| 1  |
>| 1.2|
>++
>3 rows in set (0.00 sec)
>
>Unsigned integers:
>
>mysql> select my_col from mixed_num where my_col REGEXP '^[0-9]+$';
>++
>| my_col |
>++
>| 0  |
>| 1  |
>++
>2 rows in set (0.00 sec)
>
>Signed integers:
>
>mysql> select my_col from mixed_num where my_col REGEXP '^-?[0-9]+$';
>++
>| my_col |
>+----+
>| 0  |
>| 1  |
>| -1 |
>++
>3 rows in set (2.78 sec)
>
>...et cetera. REGEXP is explained here:
>
>http://dev.mysql.com/doc/mysql/en/regexp.html
>
>
>Eamon Daly
>
>
>
>- Original Message - 
>From: "Dan Bolser" <[EMAIL PROTECTED]>
>To: 
>Sent: Sunday, April 10, 2005 10:10 AM
>Subject: Find valid numeric values in a string field?
>
>
>>
>> I have a column like this "my_col varchar(20) null".
>>
>> The values in the column can be text or numbers. How can I select only
>> those rows where the value in this column is a valid number?
>>
>> I need something like IS_DECIMAL(), but I can't find that function.
>>
>> The following SQL fails to do the job (probably because of
>> optimization)...
>>
>> SELECT
>>  MIXED_COLUMN,
>>  MIXED_COLUMN + 0
>> FROM
>>  TABLE
>> WHERE
>>  MIXED_COLUMN =
>>  MIXED_COLUMN + 0
>> ;
>>
>> Hey, lets make an IS_DECIMAL UDF! ;)
>>
>> 
>>
>> p.s. why aren't the addresses of these mailing lists anywhere to be found
>> on the MySQL Lists pages?
>>
>> Dan.
>>
>>
>> -- 
>> MySQL General Mailing List
>> For list archives: http://lists.mysql.com/mysql
>> To unsubscribe: 
>> http://lists.mysql.com/[EMAIL PROTECTED]
>> 
>


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]

Re: SCRAMBLE(A,B) (was UDF:Request).

2005-04-11 Thread Dan Bolser

On Mon, 11 Apr 2005, Dan Bolser wrote:

>
>Requirement: 
>
>Given two columns of a table (Column1 and Column2) of length x, return two
>columns (Column1_Scram and Column2_Scram) such that the distributions of
>values in Column1 and Column2 are preserved in Column1_Scram and
>Column2_Scram, but the pairs of values are randomized.
>
>
>Solution suggested by Shawn Green: 
>
>Create a table with two columns, and populate this table with random pairs
>of primary keys picked from the original table. Additionally, allow no
>duplicate primary keys within either column. Select x rows from this
>table, and link both primary keys (the primary key pair) back to the
>original table to get the appropriate number of randomized pairs of
>Column1 and Column2.
>
>He suggests doing the above like this (more or less):
>
>OriginalTable
>PK A   B
>1  a   c
>2  a   d
>3  b   e
>...
>
>CREATE TEMPORARY TABLE IntermediateTable (
>  PK1  INT NOT NULL,
>  ACHAR(1) NOT NULL,
>  PK2  INT NOT NULL,
>  BCHAR(1) NOT NULL,
>  #
>  UNIQUE INDEX (PK1,A),
>  UNIQUE INDEX (PK2,B)
>);
>
>INSERT IGNORE INTO IntermediateTable
>SELECT 
>  x.PK, x.A,
>  y.PK, y.B
>FROM 
>  OriginalTable x,
>  OriginalTable y
>ORDER BY
>  RAND();
>
>SELECT 
>  x.A,
>  y.B
>FROM
>  IntermediateTable 
>INNER JOIN 
>  OriginalTable x ON (PK1 = x.PK) INNER JOIN
>  OriginalTable y ON (PK2 = y.PK)
>LIMIT 
>  the_length_of_OriginalTable;
>
>
>The problem with this solution: 
>
>Its too slow on reasonable sized tables! 


Their is also a problem with the way RAND() works...

SELECT
  x.PK, x.A,
  y.PK, y.B
FROM
  OriginalTable x,
  OriginalTable y
ORDER BY
  RAND()
LIMIT 
  1;

This takes soo long to pick a random row. Some cleaver 'LIMIT'
optimization could pick a results set almost instantly, instead of taking
in excess of half an hour with ~50,000 rows.





-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]

Re: Group By - Is there a way to set which rows values are used for the fields not in the Group By clause?

2005-04-12 Thread Dan Bolser


I think the suggestion posted here...

http://lists.mysql.com/mysql/182424

should get you going in the right direction.

You really need to know what you are doing to know if it is giving you the
correct answer or not.

It would be cool if their was something like a GROUP_ROW(cols, expr) to do
what you want explicity.

SELECT * FROM chi;
+--++--+
| child_id | id | type |
+--++--+
|1 |  1 |0 |
|2 |  1 |0 |
|3 |  1 |1 |
|4 |  1 |2 |
|5 |  2 |2 |
+--++--+

-- Totally fake and probably silly SQL...
SELECT id, group_row(child_id,type,child_id=min(child_id)) 
FROM chi2 GROUP BY id;

+--++--+
| child_id | id | type |
+--++--+
|1 |  1 |0 |
|5 |  2 |2 |
+--++--+

or GROUP_ROW(cols,sortby=)

-- More sillyness
SELECT id, group_row(child_id,type,sortby=rand()) 
FROM chi2 GROUP BY id;

Just an idea.

Dan.

On Mon, 11 Apr 2005, Rich Carr wrote:

>Is there a way to set which rows values are used by the GROUP BY clause
>for the fields that are not in the GROUP BY clause?
> In this following select statement the group by and order work but the
>value of the lastviewtime field is not the value of the most recent
>datetime row.  Is there any way to modify the select statement so that
>the returned lastviewtime field will have the most recent datetime?
>
> select app, itemid, ownerid, visitorid, vusername, lastviewtime,
>sum(viewcount) AS totalcount, itemname from viewvisitor where ownerid =
>2 GROUP BY concat( app, itemid ) ORDER BY totalcount
> 
> 
>create table viewvisitor  
>(
>  app   char(15),
>  itemidINT UNSIGNED NOT NULL,   
>  ownerid   INT UNSIGNED NOT NULL,   
>  ousername varchar(25),
>  visitorid INT UNSIGNED NOT NULL,   
>  vusername varchar(25),
>  vfullname varchar(70),
>  lastviewtime  DATETIME NOT NULL,
>  viewcount INT, 
>  itemname  VARCHAR(40),   
>
>
>  PRIMARY KEY master(app, visitorid, itemid),
>  INDEX (ownerid),
>  INDEX (lastviewtime), 
>  INDEX (viewcount)
>);
> 
>Thanks!!!
> 
>Rich
>
>
>   
>-
>Do you Yahoo!?
> Yahoo! Small Business - Try our new resources site! 


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]

Re: Group By - Is there a way to set which rows values are used for the fields not in the Group By clause?

2005-04-12 Thread Dan Bolser

On Tue, 12 Apr 2005, Rich Carr wrote:

>Hi Dan,
> Thanks very much!  First, I can't figure out how to reply to this so
>that it shows up in the MySQL list.  How does one do it?

erm... if you hit 'reply all' or answer yes to 'reply to all' it should
send mail to [EMAIL PROTECTED]

Ahhh...I see what you mean now... you want to reply to *that* message in
the archive... Don't know.


> Second, when I tried to suggest I get the following error message, I'm
>using MySQL 4.0.15

ahhh... you need >= 4.1 to support subqueries...

I cc:'ed this back to the mailing list if anyone else can help.

Let me reformat your query to check

 SELECT * 
  FROM viewvisitor 
INNER JOIN  ( select app, itemid, sum(viewcount) AS totalcount, 
FROM viewvisitor 
GROUP BY concat( app,itemid ) 
) 
 USING( app, itemid, visitorid, itemid, itemname,
vusername,totalcount, lastviewtime);


ERROR 1064: You have an error in your SQL syntax.  Check the manual that
corresponds to your MySQL server version for the right syntax to use near
'select ...'

Yeah.. it don't like the sub-query. 

Not sure what your query is actually doing, but it is academic unless you
upgrade your server.

Try this recipy...

create temporary table bleah 
select 
  PK, 
  MIN(value_that_you_want_to_min_of_in_the_results_row) 
as the_original_column name
from meTable;

select a.* from meTable a 
inner join bleah
using(PK,the_original_column_name);

Their should be a unique minimum for the_original_column.

When I said before that 'you really need to know what you are doing' -
what I mean is, I am very bad at explaining. Sorry if the above looks like
garble (but I can't do better).




>Thanks!
>
>Dan Bolser <[EMAIL PROTECTED]> wrote:
>
>I think the suggestion posted here...
>
>http://lists.mysql.com/mysql/182424
>
>should get you going in the right direction.
>
>You really need to know what you are doing to know if it is giving you the
>correct answer or not.
>
>It would be cool if their was something like a GROUP_ROW(cols, expr) to do
>what you want explicity.
>
>SELECT * FROM chi;
>+--++--+
>| child_id | id | type |
>+--++--+
>| 1 | 1 | 0 |
>| 2 | 1 | 0 |
>| 3 | 1 | 1 |
>| 4 | 1 | 2 |
>| 5 | 2 | 2 |
>+--++--+
>
>-- Totally fake and probably silly SQL...
>SELECT id, group_row(child_id,type,child_id=min(child_id)) 
>FROM chi2 GROUP BY id;
>
>+--++--+
>| child_id | id | type |
>+--++--+
>| 1 | 1 | 0 |
>| 5 | 2 | 2 |
>+--++--+
>
>or GROUP_ROW(cols,sortby=)
>
>-- More sillyness
>SELECT id, group_row(child_id,type,sortby=rand()) 
>FROM chi2 GROUP BY id;
>
>Just an idea.
>
>Dan.
>
>On Mon, 11 Apr 2005, Rich Carr wrote:
>
>>Is there a way to set which rows values are used by the GROUP BY clause
>>for the fields that are not in the GROUP BY clause?
>> In this following select statement the group by and order work but the
>>value of the lastviewtime field is not the value of the most recent
>>datetime row. Is there any way to modify the select statement so that
>>the returned lastviewtime field will have the most recent datetime?
>>
>> select app, itemid, ownerid, visitorid, vusername, lastviewtime,
>>sum(viewcount) AS totalcount, itemname from viewvisitor where ownerid =
>>2 GROUP BY concat( app, itemid ) ORDER BY totalcount
>> 
>> 
>>create table viewvisitor 
>>(
>> app char(15), 
>> itemid INT UNSIGNED NOT NULL, 
>> ownerid INT UNSIGNED NOT NULL, 
>> ousername varchar(25), 
>> visitorid INT UNSIGNED NOT NULL, 
>> vusername varchar(25), 
>> vfullname varchar(70), 
>> lastviewtime DATETIME NOT NULL, 
>> viewcount INT, 
>> itemname VARCHAR(40), 
>>
>>
>> PRIMARY KEY master(app, visitorid, itemid),
>> INDEX (ownerid),
>> INDEX (lastviewtime), 
>> INDEX (viewcount)
>>);
>> 
>>Thanks!!!
>> 
>>Rich
>>
>>
>> 
>>-
>>Do you Yahoo!?
>> Yahoo! Small Business - Try our new resources site! 
>
>
>   
>-
>Do you Yahoo!?
> Yahoo! Small Business - Try our new resources site! 


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]

remove newline hack?

2005-04-15 Thread Dan Bolser


How to remove newlines from a column? Data is given me in binary format.

If I dump should I set some new record terminator, parse out newlines and
reload?

Any beter hack?

Dan.


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]

Size of BLOB types?

2005-04-16 Thread Dan Bolser


The manual dosn't specify the maximum number of characters in the

TINYTEXT, TEXT, MEDIUMTEXT, and LONGTEXT data types.

http://dev.mysql.com/doc/mysql/en/blob.html

Are these valid synonyms, TINYBLOB, MEDIUMBLOB and LONGBLOB?

I have a field with just under 1000 characters, am I OK with a TINYTEXT?

Cheers, 
Dan.


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]

Re: Size of BLOB types?

2005-04-16 Thread Dan Bolser

On Sat, 16 Apr 2005, Hassan Schroeder wrote:

>Dan Bolser wrote:
>> The manual dosn't specify the maximum number of characters in the
>> 
>> TINYTEXT, TEXT, MEDIUMTEXT, and LONGTEXT data types.
>> 
>> http://dev.mysql.com/doc/mysql/en/blob.html
>> 
>> Are these valid synonyms, TINYBLOB, MEDIUMBLOB and LONGBLOB?
>
>The very beginning of the cited page is:
>
>
>  A BLOB is a binary large object that can hold a variable amount of
>  data. The four BLOB types, TINYBLOB, BLOB, MEDIUMBLOB, and LONGBLOB,
>  differ only in the maximum length of the values they can hold.
>
>  See Section 11.5, “Column Type Storage Requirements”.
>
>  The four TEXT types, TINYTEXT, TEXT, MEDIUMTEXT, and LONGTEXT,
>  correspond to the four BLOB types and have the same maximum lengths
>  and storage requirements.
>
>
>And if you follow that link to Section 11.5, you'll find the size
>of all of the above, and the answer to your question:
>
>> I have a field with just under 1000 characters, am I OK with a TINYTEXT?
>
>:: which is "nope" :-)
>
>FWIW!

FWIW?

Naturally I followed that link, but was unable to understand (or piece
together) the information there in.

Storage Requirements for String Types: 
(or maximum length of BLOB types for dummies)

TINYBLOB   (or TINYTEXT)   = 1+(2^ 8) =257 characters
BLOB   (or TEXT)   = 2+(2^16) =  65538 characters
MEDIUMBLOB (or MEDIUMTEXT) = 3+(2^24) =   16777219 characters
LONGBLOB   (or LONGTEXT)   = 4+(2^32) = 4294967300 characters

Does that answer my question?

I think it does, but I am not sure. Seems strange not to have this
information at the very begining of the cited page.



--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]

Re: Size of BLOB types?

2005-04-16 Thread Dan Bolser

On Sat, 16 Apr 2005, Robert Dunlop wrote:

>Your question was would a TINYBLOB hold 1,000 characters.
>TINYBLOB (TINYTEXT) = 257 characters.  1,000 > 257 = No.
>Your next option, BLOB (TEXT) = 64K.  1,000 < 64K = Yes.

Yeah, I got that part, more generally my question is what are the maximum
sizes (in characters) of each of the BLOB / TEXT types, and why isn't that
information on the manual page for BLOB / TEXT types.

Another question comes to mind, which is why is TINYTEXT so useless?

A VARHCAR(255) is nearly as good! I would have thought any datatype with
BLOB in its name would have the self respect to hold at least 65536
characters!

Thanks for the pointers so far,
Dan.



>
>- Original Message - 
>From: "Dan Bolser" <[EMAIL PROTECTED]>
>To: "Hassan Schroeder" <[EMAIL PROTECTED]>
>Cc: 
>Sent: Saturday, April 16, 2005 1:49 PM
>Subject: Re: Size of BLOB types?
>
>
>On Sat, 16 Apr 2005, Hassan Schroeder wrote:
>
>>Dan Bolser wrote:
>>> The manual dosn't specify the maximum number of characters in the
>>>
>>> TINYTEXT, TEXT, MEDIUMTEXT, and LONGTEXT data types.
>>>
>>> http://dev.mysql.com/doc/mysql/en/blob.html
>>>
>>> Are these valid synonyms, TINYBLOB, MEDIUMBLOB and LONGBLOB?
>>
>>The very beginning of the cited page is:
>>
>>
>>  A BLOB is a binary large object that can hold a variable amount of
>>  data. The four BLOB types, TINYBLOB, BLOB, MEDIUMBLOB, and LONGBLOB,
>>  differ only in the maximum length of the values they can hold.
>>
>>  See Section 11.5, â?oColumn Type Storage Requirementsâ?.
>>
>>  The four TEXT types, TINYTEXT, TEXT, MEDIUMTEXT, and LONGTEXT,
>>  correspond to the four BLOB types and have the same maximum lengths
>>  and storage requirements.
>>
>>
>>And if you follow that link to Section 11.5, you'll find the size
>>of all of the above, and the answer to your question:
>>
>>> I have a field with just under 1000 characters, am I OK with a TINYTEXT?
>>
>>:: which is "nope" :-)
>>
>>FWIW!
>
>FWIW?
>
>Naturally I followed that link, but was unable to understand (or piece
>together) the information there in.
>
>Storage Requirements for String Types:
>(or maximum length of BLOB types for dummies)
>
>TINYBLOB   (or TINYTEXT)   = 1+(2^ 8) =257 characters
>BLOB   (or TEXT)   = 2+(2^16) =  65538 characters
>MEDIUMBLOB (or MEDIUMTEXT) = 3+(2^24) =   16777219 characters
>LONGBLOB   (or LONGTEXT)   = 4+(2^32) = 4294967300 characters
>
>Does that answer my question?
>
>I think it does, but I am not sure. Seems strange not to have this
>information at the very begining of the cited page.
>
>
>
>


--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]

Re: Size of BLOB types?

2005-04-16 Thread Dan Bolser


Beautiful!

Cheers, (one and all),
Dan.

On Sat, 16 Apr 2005, Dan Nelson wrote:

>In the last episode (Apr 16), Dan Bolser said:
>> The manual dosn't specify the maximum number of characters in the
>> 
>> TINYTEXT, TEXT, MEDIUMTEXT, and LONGTEXT data types.
>> 
>> http://dev.mysql.com/doc/mysql/en/blob.html
>
>At the top of that page:
>
>  The four TEXT types, TINYTEXT, TEXT, MEDIUMTEXT, and LONGTEXT,
>  correspond to the four BLOB types and have the same maximum lengths
>  and storage requirements.
>
>There should probably be a link to
>http://dev.mysql.com/doc/mysql/en/string-type-overview.html , which
>gives you the exact limits.
> 
>> Are these valid synonyms, TINYBLOB, MEDIUMBLOB and LONGBLOB?
>> 
>> I have a field with just under 1000 characters, am I OK with a TINYTEXT?
>
>No, since TINYTEXT is limited to 256 bytes.
>
>


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]

COUNT(DISTINCT R1,R2) within an IF statement?

2005-04-17 Thread Dan Bolser


Hi, I have a query which looks like this...

SELECT 
  BLEAH,
  COUNT(DISTINCT R1,R2)
FROM
  T1
WHERE
  FK = 1
GROUP BY 
  BLEAH
;

Lets say that over 10 rows where FK = 1 it counts 5 distinct R1-R2 pairs
in a single 'BLEAH' group BLEAH = 'Y'.

Now I want to search the table for all FK's with the same number of R1-R2
pairs in the BLEAH group, and I do it like this...

 
SELECT 
  FK,
  COUNT(DISTINCT IF(BLEAH='Y',CONCAT(R1,"-",R2),NULL)) AS BLING
GROUP BY
  FK
HAVING
  BLING = 5;


The problem is that CONCAT. I don't like the look of it. It makes me think
that the optimizer dosn't stand a chance, which I am not sure if it does
anyway (even though I have a "index x (R1,R2)".

I would like to be able to say something like...

 ... IF(BLEAH='Y',R1,R2,NULL) ...

But of course that messes up the IF syntax. Or I would like to say...

 ... IF(BLEAH='Y',ROW(R1,R2),NULL) ...

But DISTINCT balls "ERROR 1241 (21000): Operand should contain 1
column(s)", which seems a bit strange, as in the first query we are
passing it two columns.

Same error occurs with this syntax...

... IF(BLEAH='Y',(R1,R2),NULL) ...

Am I stuck doing my CONCAT? 

I like to keep the syntax general, as for a particular PK, BLEAH could
have several values, in which case I stack up my thingies...

SELECT
  FK,
  COUNT(DISTINCT IF(BLEAH='Y',CONCAT(R1,"-",R2),NULL)) AS BLING,
  COUNT(DISTINCT IF(BLEAH='N',CONCAT(R1,"-",R2),NULL)) AS BLANG,
GROUP BY
  FK
HAVING
  BLING = 5 AND
  BLANG = 5;

For example.

Would the "COUNT(DISTINCT" above get optimized if I was only looking at
the values in one (indexed) column (R1 for example)?

Cheers,
Dan.



-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]

Re: Relative Numeric Values

2005-04-17 Thread Dan Bolser

On Sat, 16 Apr 2005, David Blomstrom wrote:

>I think my question is more oriented towards PHP, but
>I'd like to ask it on this list, as I suspect the
>solution may involve MySQL.
>
>I'm about to start developing an enormous database
>focusing on the animal kingdom and want to find a key
>system more user friendly than the traditional
>scientific name.

http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=taxonomy

I use the 'tax_id' that they use. You can get a dump of the DB here

ftp://ftp.ncbi.nih.gov/pub/taxonomy/

see ftp://ftp.ncbi.nih.gov/pub/taxonomy/taxdump_readme.txt



>
>So imagine instead a page with the following in the
>head section:
>
>$AnimalID = 'canlup';
>
>This page displays information on the wolf, based on
>the first three letters of its genus and species name,
>Canis lupus.
>
>Now imagine a page with this value:
>
>$AnimalID = 'bal';
>
>This page displays information on the whale family
>Balaenidae. But what about the whale family
>Balaenopteridae, which begins with the same three
>letters?
>
>I could solve this problem by adding a numerical key
>to my database and displaying the following:
>
>$AnimalID = 'bal23';
>$AnimalID = 'bal24';
>
>The problem with this is that it makes it much harder
>to work with my data. When tweaking a page or writing
>a script, I can easily remember that bal = Balaenidae,
>but I can't possibly remember which numeral is
>associated with each mammal family. Also, what happens
>if I add or subtract rows from my database table, and
>the above values suddenly change to bal27 and bal28?
>
>So here's what I think I'd like to do:
>
>$AnimalID = 'canlup1';
>$AnimalID = 'bal1';
>$AnimalID = 'bal2';
>
>The page with canlup1 will display the FIRST (and
>only) instance of canlup in the database - the wolf.
>
>The page with bal1 will display the first instance of
>bal, which will always be Balaenidae, whether the
>absolute value is bal27 or bal2884. A page with bal2
>will always display the next mammal family that begins
>with bal, Balaenopteridae.
>
>So I THINK all I need to do is create a variable that
>reflects a particular value's ordinal position in a
>database...
>abc1
>abc2
>abc3, etc.
>
>Plus, I'll have to join two or three fields together
>to form a key; e.g. animals.species + animals.numerals
>
>Does anyone know how I can do this? Thanks.
>
>
>   
>__ 
>Do you Yahoo!? 
>Plan great trips with Yahoo! Travel: Now over 17,000 guides!
>http://travel.yahoo.com/p-travelguide
>
>


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]

Re: Relative Numeric Values

2005-04-17 Thread Dan Bolser

On Sat, 16 Apr 2005, David Blomstrom wrote:

>
>--- Kim Briggs <[EMAIL PROTECTED]> wrote:
>> David,
>> 
>> In reading through miscellaneous database design
>> text on the web, I
>> read just the other day that you should not try to
>> include meaningful
>> data in your key values.  I assume there will be
>> some kind of "lookup"
>> tables for species, phylum, whatever.  Trying to
>> make your key field
>> "smart" seems like way too much overhead and
>> complexity.  I'm
>> wondering why, if the database is enormous, are you
>> being so short and
>> cryptic with the "user-friendly" values?
>
>Primarily because I want to make it easier to work
>with. If I create a new page that focuses on the king
>salmon, I'd rather type in $MyID = 'onc'; than $MyID =
>'Oncorhynchus'. Or if I create an array, I'd rather
>list rhi, hip, equ than Rhinocerotidae,
>Hippopotamidae, Equidae.
>
>In fact, I'll have to discard big chunks of the
>animals database I received on a CD, as it's way to
>big (several MB) and includes living things I won't
>cover (bacteria, viruses, etc.).
>
>I'm also trying to decide on my URL structure. I could
>follow tradition and map out the lion like this:
>
>mammals.geobop.org/carnivora/felidae/panthera/leo/
>
>...but I'm thinking of shortening the URL's:
>
>mammals.geobop.org/car/fel/pan/leo/
>
>I'm just trying to come up with something that's more
>user friendly.

That is a really good idea. This data is horrible and messy. If you could
come up with a 'biologist friendly' system it would be really nice to work
with. 

However, I think the message is - make the database sound, and build any
'cleanup' (userfriendlyness) ontop of a robust data model.


>
>Thanks.
>
>
>__
>Do You Yahoo!?
>Tired of spam?  Yahoo! Mail has the best spam protection around 
>http://mail.yahoo.com 
>
>


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]

amPiguous!

2005-05-06 Thread Dan Bolser


Why are columns included in the join between two tables ambigious?

It seems that they should *not* be ambigious!

Like this

select pk from a inner join b using (pk);

ERROR 1052 (23000): Column 'pk' in field list is ambiguous!!!

Is this a bug, or is it like this for a reason? It drives me nuts, because
it is not ambigious (as far as I can tell). Am I a fool?



-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]

Re: amPiguous!

2005-05-06 Thread Dan Bolser

On Fri, 6 May 2005, Rhino wrote:

>Actually, the error message is misleading. There is nothing that I would
>call ambiguous in your query: you have a syntax error. The join should be
>written:
>
>select pk from a inner join b on a.col1 = b.col2
>
>Of course, you need to replace 'a.col1' and 'b.col2' with real column names
>from tables a and b respectively.

I would have said...

select pk from a inner join b on a.pk = b.pk;

(probably pk was a bad choice for an example column name).

Using the ON syntax instead of the USING syntax makes my problem look even
more silly than it already is, i.e. just say

select a.pk from a inner join b on a.pk = b.pk;

Thing is I use the USING syntax all the time and name equivelent columns
in different tables the same to help me do this (I read somewhere that
this is good practice).

Is it still me, or should the USING syntax 'disambiguate' columns in the
select statement?


>
>Rhino
>
>- Original Message - 
>From: "Dan Bolser" <[EMAIL PROTECTED]>
>To: 
>Sent: Friday, May 06, 2005 6:50 PM
>Subject: amPiguous!
>
>
>>
>> Why are columns included in the join between two tables ambigious?
>>
>> It seems that they should *not* be ambigious!
>>
>> Like this
>>
>> select pk from a inner join b using (pk);
>>
>> ERROR 1052 (23000): Column 'pk' in field list is ambiguous!!!
>>
>> Is this a bug, or is it like this for a reason? It drives me nuts, because
>> it is not ambigious (as far as I can tell). Am I a fool?
>>
>>
>>
>> -- 
>> MySQL General Mailing List
>> For list archives: http://lists.mysql.com/mysql
>> To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
>>
>>
>> -- 
>> No virus found in this incoming message.
>> Checked by AVG Anti-Virus.
>> Version: 7.0.308 / Virus Database: 266.11.5 - Release Date: 04/05/2005
>>
>>
>
>
>
>


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]

Re: amPiguous!

2005-05-06 Thread Dan Bolser

On Sat, 7 May 2005, Simon Garner wrote:

>On 7/05/2005 11:00 a.m., Rhino wrote:
>> Actually, the error message is misleading. There is nothing that I would
>> call ambiguous in your query: you have a syntax error. The join should be
>> written:
>> 
>> select pk from a inner join b on a.col1 = b.col2
>> 
>> Of course, you need to replace 'a.col1' and 'b.col2' with real column names
>> from tables a and b respectively.
>> 
>> Rhino
>
>Not true, his join syntax is correct.
>
>   select pk from a inner join b using (pk)
>
>is the same as saying
>
>   select pk from a inner join b on a.pk = b.pk
>
>However, MySQL is complaining about the "select pk" part because it 
>doesn't know whether to select a.pk or b.pk. I think what Dan is arguing 
>is that MySQL should know from the join that a.pk and b.pk are the same, 
>so it doesn't matter which one it uses.

Exactly! Afterall a.pk = b.pk! However, I can imagine how this could
become somewhat tricky with the ON syntax.


>
>-Simon
>
>
>
>> 
>> - Original Message - 
>> From: "Dan Bolser" <[EMAIL PROTECTED]>
>> To: 
>> Sent: Friday, May 06, 2005 6:50 PM
>> Subject: amPiguous!
>> 
>> 
>> 
>>>Why are columns included in the join between two tables ambigious?
>>>
>>>It seems that they should *not* be ambigious!
>>>
>>>Like this
>>>
>>>select pk from a inner join b using (pk);
>>>
>>>ERROR 1052 (23000): Column 'pk' in field list is ambiguous!!!
>>>
>>>Is this a bug, or is it like this for a reason? It drives me nuts, because
>>>it is not ambigious (as far as I can tell). Am I a fool?
>>>
>
>


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]

Re: amPiguous!

2005-05-06 Thread Dan Bolser

On Fri, 6 May 2005, Eric Bergen wrote:

>He's right in saying that mysql is capable of knowing. My thoughts are 
>that it's not worth the speed loss, extra code, and potential guess work 
>by mysql just so you don't have to type a table name.

I see what you mean. I didn't think about additional query processing
burden / potential bugs as a result of this 'fix'.



>
>Eric Jensen wrote:
>
>>The way he is joining tables is fine.  You can specify how to link the
>>using ON or you can just say to use a commonly named field with USING. 
>>The Problem is with the SELECT pk.  That is ambiguous.  From what table
>>would you like the pk field?  It can be table1.pk or table2.pk.
>>
>>Eric Jensen
>>
>>Rhino wrote:
>>
>>  
>>
>>>Actually, the error message is misleading. There is nothing that I would
>>>call ambiguous in your query: you have a syntax error. The join should be
>>>written:
>>>
>>>   select pk from a inner join b on a.col1 = b.col2
>>>
>>>Of course, you need to replace 'a.col1' and 'b.col2' with real column names
>>>
>>>
>>>from tables a and b respectively.
>>  
>>
>>>Rhino
>>>
>>>- Original Message - 
>>>From: "Dan Bolser" <[EMAIL PROTECTED]>
>>>To: 
>>>Sent: Friday, May 06, 2005 6:50 PM
>>>Subject: amPiguous!
>>>
>>>
>>> 
>>>
>>>
>>>
>>>>Why are columns included in the join between two tables ambigious?
>>>>
>>>>It seems that they should *not* be ambigious!
>>>>
>>>>Like this
>>>>
>>>>select pk from a inner join b using (pk);
>>>>
>>>>ERROR 1052 (23000): Column 'pk' in field list is ambiguous!!!
>>>>
>>>>Is this a bug, or is it like this for a reason? It drives me nuts, because
>>>>it is not ambigious (as far as I can tell). Am I a fool?
>>>>
>>>>
>>>>
>>>>-- 
>>>>MySQL General Mailing List
>>>>For list archives: http://lists.mysql.com/mysql
>>>>To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
>>>>
>>>>
>>>>-- 
>>>>No virus found in this incoming message.
>>>>Checked by AVG Anti-Virus.
>>>>Version: 7.0.308 / Virus Database: 266.11.5 - Release Date: 04/05/2005
>>>>
>>>>
>>>>   
>>>>
>>>>  
>>>>
>>>
>>> 
>>>
>>>
>>>
>>
>>
>>  
>>
>
>
>


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]

SELECT DISTINCT ROW(A,B) ?

2005-05-07 Thread Dan Bolser


SELECT DISTINCT ROW(A,B) from a;

While I am on a roll (of sorts) can I ask if the above syntax should be
valid or not?

I have a feeling it should be valid (but isn't - it causes an error).

Is this the correct behaviour?




-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]

Re: amPiguous!

2005-05-07 Thread Dan Bolser

On Sat, 7 May 2005, Jochem van Dieten wrote:

>On 5/7/05, Dan Bolser wrote:
>> 
>> Why are columns included in the join between two tables ambigious?
>
>Because MySQL does not follow the SQL standard (ISO/IEC 9075-2:2003).
>
>
>> select pk from a inner join b using (pk);
>> 
>> ERROR 1052 (23000): Column 'pk' in field list is ambiguous!!!
>> 
>> Is this a bug, or is it like this for a reason? It drives me nuts, because
>> it is not ambigious (as far as I can tell). Am I a fool?
>
>It is not ambiguous according to the SQL standard. If this behaviour
>of MySQL is documented it is an omission in the MySQL implementation
>that you get the error. If this behaviour is not documented, it is a
>bug.

Perhaps the omission is documented? Should I try to log this as a bug?

I found that the following *does not* cause problems...

select * from a inner join b using (pk) group by pk;

I haven't explored the above further, but it shows that mysql goes at
least some way toward implementing the standard (if I understand what you
wrote corectly).

Out of interest, what would happen in the following case...

select pk from a inner join b on a.pk = b.pk+1; 

Would that be 'correctly' ambigious according to the sql specification?

Cheers,
Dan.

>
>Jochem
>
>--
>MySQL General Mailing List
>For list archives: http://lists.mysql.com/mysql
>To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
>

-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]

Re: SELECT DISTINCT ROW(A,B) ?

2005-05-07 Thread Dan Bolser

On Sat, 7 May 2005, Roger Baklund wrote:

>Dan Bolser wrote:
>> SELECT DISTINCT ROW(A,B) from a;
>> 
>> While I am on a roll (of sorts) can I ask if the above syntax should be
>> valid or not?
>
>If you mean the exact syntax  above, I think not... it looks like ROW() 
>is a function taking two parameters in this case... what does the 
>function return? Is it an aggregate function?

Good question. I don't really know what the function is. You can see it in
action here...

http://dev.mysql.com/doc/mysql/en/row-subqueries.html

"The expressions (1,2) and ROW(1,2) are sometimes called row
constructors. The two are equivalent. They are legal in other contexts,
too."

>If the ROW() function was a synonym for the CONCAT() function, or worked 
>in a similar way, it might work, but what is the point?

I know it is easy to fix given the example, but I want to nest the whole
thing in an IF statement

COUNT(DISTINCT IF(C=1,ROW(A,B),NULL)) AS x FROM a;

Which is equivelent to ...

COUNT(DISTINCT A,B) AS x FROM a WHERE C=1;

But the former is more powerfull (if it worked). The following works (as
suggested)...

COUNT(DISTINCT IF(C=1,CONCAT(A,B),NULL)) AS x FROM a;

But it isn't optimal because the DISTINCT can't make use of an index over
A and B.

I guess I should just stop complaining and make an AB column and put an
index over it, but it just 'feels' like ROW should work here.

Cheers

>
>> I have a feeling it should be valid (but isn't - it causes an error).
>> 
>> Is this the correct behaviour?
>
>Try this:
>
>SELECT DISTINCTROW A,B from a;
>
>DISTINCTROW and DISTINCT are synonyms:
>
>http://dev.mysql.com/doc/mysql/en/select.html#id2828077 >
>
>

-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]

Re: Efficiently finding a random record

2005-05-13 Thread Dan Bolser

On Fri, 13 May 2005, Eric Bergen wrote:

>Even better is if you have an integer primary key (think auto_increment) 
>and use in()
>
>So if you want 10 random rows in your app generate 20 or so random 
>numbers and do something like
>select col1, from t where x in (1, 5, 3, 2...) limit 10
>
>check num rows and if you don't get enough generate more random numbers 
>and try again.
>
>in() is blazing fast even with thousands of numbers so don't be afraid 
>to kick a few extra in.

I heard about a 255 'in' limit. When you say 'thousands of numbers' do you
mean in the IN or in the column?


>
>-Eric
>
>
>Philip Hallstrom wrote:
>
>>> I have a db of about 300,000 records and when I try to find one 
>>> random record like this:
>>>
>>> select * from table order by rand() limit 1;
>>>
>>> it can take several minutes. My Sherlock Holmes instincts tell me 
>>> that what I'm doing is somehow inefficient. What is the primary 
>>> culprit here?
>>
>>
>> The culprit is that mysql has to assign a random value (via rand()) to 
>> each of the 300,000 rows, then order all 300,000 rows by that random 
>> value and return the first one.
>>
>> So even though you're just retrieving one value, you're processing 
>> 300,000 rows.
>>
>> You'd be better off doing something like this in your application..
>>
>> row_count = get row count;
>> random_row = get random number from 0 to row_count - 1;
>> result = db query LIMIT 1 OFFSET random_row
>>
>> or something like that...
>>
>> -philip
>>
>
>
>


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]

Re: amPiguous!

2005-05-15 Thread Dan Bolser

On Sat, 7 May 2005, Jochem van Dieten wrote:

>On 5/7/05, Dan Bolser wrote:
>> On Sat, 7 May 2005, Jochem van Dieten wrote:
>>>On 5/7/05, Dan Bolser wrote:
>>>
>>>> select pk from a inner join b using (pk);
>>>> 
>>>> ERROR 1052 (23000): Column 'pk' in field list is ambiguous!!!
>>>> 
>>>> Is this a bug, or is it like this for a reason? It drives me nuts, because
>>>> it is not ambigious (as far as I can tell). Am I a fool?
>>>
>>> It is not ambiguous according to the SQL standard. If this behaviour
>>> of MySQL is documented it is an omission in the MySQL implementation
>>> that you get the error. If this behaviour is not documented, it is a
>>> bug.
>> 
>> Perhaps the omission is documented? Should I try to log this as a bug?
>
>Always get it into the system. Even if it is considered not a bug but
>a feature I think it warrants a documentation update.
>But the problem with fixing this is that it is not backward compatible
>and will break for everybody who qualifies his field names.

Here is the bug report that I finally made...

http://bugs.mysql.com/10646

Thanks for the help!

Dan.



>
>
>> Out of interest, what would happen in the following case...
>> 
>> select pk from a inner join b on a.pk = b.pk+1; 
>> 
>> Would that be 'correctly' ambigious according to the sql specification?
>
>That would indeed be ambiguous according to the SQL standard.
>
>Jochem
>
>--
>MySQL General Mailing List
>For list archives: http://lists.mysql.com/mysql
>To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
>


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]

Re: Single vs Multiple primary keys

2005-05-15 Thread Dan Bolser

On Sun, 15 May 2005, Martijn Tonies wrote:

>Bob,
>
>> I have a table, see below, that contains a single primary key
>(SubTestCaseKey ) and a number of foreign keys
>>
>> * plantriggers_ID_FK ,
>> * testcase_root_ID_FK
>> * testcasesuffix_name_FK
>>
>> What I want to ensure is that there are no duplicate records when
>considering the three foreign keys above. Would it be appropriate to remove
>the single primary and replace with three multiple primary keys? Is there a
>performance impact when doing this. this seems overly complex and wonder if
>I should be breaking the table up to simplify? Any suggestions would be
>appreciated.
>>
>
>There's no such thing as 3 primary keys.


You must mean a multipart primary key with three parts :) 

or "multiple-column indexes"

That is what I would do (use a multiple-column index (primary key)  - its
kinda based on opinion, but I think you should let the real data be the
primary key where appropriate, and avoid artificial 'auto_increment'
unless they are specifically useful or necessary in your situation.

I.e build the database around the data, not the other way round :)

That is just my design preference though.

Not sure about performance problems, but you get two 'indexes' for free
with one multipart primary key with three parts (so the order of the
parts is significant (depending on your application)). 

...


http://dev.mysql.com/doc/mysql/en/multiple-column-indexes.html




>
>Why do you need "subtestcasekey"? If the foreign key columns should be
>unique, why not
>make those 3 columns the primary key?
>
>With regards,
>
>Martijn Tonies
>Database Workbench - tool for InterBase, Firebird, MySQL, Oracle & MS SQL
>Server
>Upscene Productions
>http://www.upscene.com
>
>> Bob
>>
>> CREATE TABLE testplans (
>>   SubTestCaseKey INTEGER NOT NULL AUTO_INCREMENT,
>>   plantriggers_ID_FK INTEGER UNSIGNED NOT NULL,
>>   testcase_root_ID_FK INTEGER NOT NULL,
>>   testcasesuffix_name_FK VARCHAR(20) NULL,
>>   FSAGA ENUM('FSA1','FSA2','GA') NULL DEFAULT 'GA',
>>   Priority ENUM('P1','P2','P3','Do Not Run') NULL DEFAULT 'Do Not Run',
>>   DateMod TIMESTAMP NULL,
>>   tester_list_Name_FK VARCHAR(50) NULL,
>>   PRIMARY KEY(SubTestCaseKey),
>>   INDEX testplans_FKIndex1(tester_list_Name_FK),
>>   INDEX testplans_FKIndex2(testcasesuffix_name_FK),
>>   INDEX testplans_FKIndex3(testcase_root_ID_FK),
>>   INDEX testplans_FKIndex4(plantriggers_ID_FK),
>>   FOREIGN KEY(tester_list_Name_FK)
>> REFERENCES tester_list(Name)
>>   ON DELETE RESTRICT
>>   ON UPDATE CASCADE,
>>   FOREIGN KEY(testcasesuffix_name_FK)
>> REFERENCES testcasesuffix(name)
>>   ON DELETE RESTRICT
>>   ON UPDATE CASCADE,
>>   FOREIGN KEY(testcase_root_ID_FK)
>> REFERENCES testcase_root(ID)
>>   ON DELETE RESTRICT
>>   ON UPDATE CASCADE,
>>   FOREIGN KEY(plantriggers_ID_FK)
>> REFERENCES plantriggers(ID)
>>   ON DELETE RESTRICT
>>   ON UPDATE CASCADE
>> )
>> TYPE=InnoDB;
>
>
>


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]

Re: MySQL 4.1.12 has been released

2005-05-16 Thread Dan Bolser

On Sun, 15 May 2005, Matt Wagner wrote:

>Hi,
>
>A new version of MySQL Community Edition 4.1.12 Open Source database
>management system has been released. It is now available in source and
>binary form for a number of platforms from our download pages at
>http://dev.mysql.com/downloads/ and mirror sites.


How long is it before the Dec OSF downloads are updated usually?




-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]

Re: SCRAMBLE(A,B) (was UDF:Request).

2005-05-24 Thread Dan Bolser

On Mon, 11 Apr 2005 [EMAIL PROTECTED] wrote:

>Dan Bolser <[EMAIL PROTECTED]> wrote on 04/11/2005 11:50:31 AM:
>
>> On Mon, 11 Apr 2005, Dan Bolser wrote:
>> 
>> >
>> >Requirement: 
>> >
>> >Given two columns of a table (Column1 and Column2) of length x, return 
>two
>> >columns (Column1_Scram and Column2_Scram) such that the distributions 
>of
>> >values in Column1 and Column2 are preserved in Column1_Scram and
>> >Column2_Scram, but the pairs of values are randomized.
>> >
>> >
>> >Solution suggested by Shawn Green: 
>> >
>> >Create a table with two columns, and populate this table with random 
>pairs
>> >of primary keys picked from the original table. Additionally, allow no
>> >duplicate primary keys within either column. Select x rows from this
>> >table, and link both primary keys (the primary key pair) back to the
>> >original table to get the appropriate number of randomized pairs of
>> >Column1 and Column2.
>> >
>> >He suggests doing the above like this (more or less):
>> >
>> >OriginalTable
>> >PK   A   B
>> >1   a   c
>> >2   a   d
>> >3   b   e
>> >...
>> >
>> >CREATE TEMPORARY TABLE IntermediateTable (
>> >  PK1  INT NOT NULL,
>> >  ACHAR(1) NOT NULL,
>> >  PK2  INT NOT NULL,
>> >  BCHAR(1) NOT NULL,
>> >  #
>> >  UNIQUE INDEX (PK1,A),
>> >  UNIQUE INDEX (PK2,B)
>> >);
>> >
>> >INSERT IGNORE INTO IntermediateTable
>> >SELECT 
>> >  x.PK, x.A,
>> >  y.PK, y.B
>> >FROM 
>> >  OriginalTable x,
>> >  OriginalTable y
>> >ORDER BY
>> >  RAND();
>> >
>> >SELECT 
>> >  x.A,
>> >  y.B
>> >FROM
>> >  IntermediateTable 
>> >INNER JOIN 
>> >  OriginalTable x ON (PK1 = x.PK) INNER JOIN
>> >  OriginalTable y ON (PK2 = y.PK)
>> >LIMIT 
>> >  the_length_of_OriginalTable;
>> >
>> >
>> >The problem with this solution: 
>> >
>> >Its too slow on reasonable sized tables! 
>> 
>> 
>> Their is also a problem with the way RAND() works...
>> 
>> SELECT
>>   x.PK, x.A,
>>   y.PK, y.B
>> FROM
>>   OriginalTable x,
>>   OriginalTable y
>> ORDER BY
>>   RAND()
>> LIMIT 
>>   1;
>> 
>> This takes soo long to pick a random row. Some cleaver 'LIMIT'
>> optimization could pick a results set almost instantly, instead of 
>taking
>> in excess of half an hour with ~50,000 rows.
>> 
>> 
>> 
>
>Let's try this. I will assume, because you used the PK hack, you have 
>duplicate values in at least one of your sets. Let's cure the Rand() speed 
>issue by adding a column to Original Table to hold a random number and 
>eliminate the lookup problem. Since integer math is much faster than 
>floating point math, we will set up this field as an integer field and 
>fill it appropriately
>
>ALTER TABLE OriginalTable ADD COLUMN RandomKey INT UNSIGNED;
>
>UPDATE OriginalTable SET RandomKey = RAND()*200;
>
>Let's also modify IntermediateTable like this:
>
>DROP TABLE IntermediateTable;
>
>CREATE TABLE FirstColumn
>id INT auto_increment
>, a char(1)
>, PRIMARY KEY (id) 
>);
>
>CREATE TABLE SecondColumn
>id INT auto_increment
>, b char(1)
>, PRIMARY KEY (id)
>);
>
>And populate the new tables:
>INSERT FirstColumn (a)
>SELECT a
>FROM OriginalTable
>ORDER BY PK1;
>
>INSERT SecondColumn (b)
>SELECT b
>FROM OriginalTable
>ORDER BY RandomKey;
>
>Then get your randomized (A,B) pairs this way:
>
>SELECT x.A, y.B
>FROM FirstColumn x
>INNER JOIN SecondColumn y
>on x.id = y.id;
>
>This should be MUCH faster than 30 mins (I would guess on the order of 2 
>or 3 at most). FirstColumn gets filled with data in original order, 
>SecondColumn gets filled with data in random order (thanks to the random 
>value). By creating new tables to cache those values we create two new 
>contiguous auto_increment runs (this way you can analyze subsets of your 
>original data and not need to worry about mismatching on the final INNER 
>JOIN).
>
>On the next pass, Re-run the UPDATE to assign new RAND() values to your 
>data. Do not empty or refill FirstColum. Execute a "TRUNCATE TABLE 
>SecondColumn;" then refill it (INSERT SecondColumn...) and repeat the 
>final query. 
>
>HTH!!

Query to select...

2005-05-24 Thread Dan Bolser


Hello,

I have data like this

PK  GRP_COL
1   A
2   A
3   A
4   B
5   B
6   B
7   C
8   C
9   C


And I want to write a query to select data like this...

PK  FK  GRP_COL
1   1   A
2   1   A
3   1   A
4   4   B
5   4   B
6   4   B
7   7   C
8   7   C
9   7   C


Where FK is a random (or otherwise) member of PK from within the
appropriate group given by GRP_COL. FK recreates the grouping from
GRP_COL, but in terms of PK. I want to do this because GRP_COL is
difficult to handle and I want to re-represent the grouping in terms of PK
(this allows me to link data into the grouping more easily).

Is there a simple way to do this?



-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]

Re: Query to select...

2005-05-24 Thread Dan Bolser

On Tue, 24 May 2005 [EMAIL PROTECTED] wrote:

>Hi,
>try for axample,
>
>select PK, ord(GRP_COL),COL from your_table;


When I said my data was 'like' the below I should have been more specific.
Here is some of my data (with reference to the simplified data (and
question details) below);

INPUT:

<PK-> <--GRP_COL>
+--+-+-++
| PDB  | ASS | LIST| COUNTS |
+--+-+-++
| 104l |   1 | 104lA   | 1  |
| 104l |   2 | 104lA   | 1  |
| 104m |   1 | 104m-   | 1  |
| 105m |   1 | 104m-   | 1  |
| 117e |   1 | 117eA   | 2  |
| 11ba |   1 | 11baA   | 2  |
| 11bg |   1 | 11baA   | 2  |
| 12e8 |   1 | 12e8H,12e8L | 2,2|
| 12e8 |   2 | 12e8H,12e8L | 2,2|
...


OUTPUT (wanted):

<FK> <PK->
+-+-+--+-+
| REP_PDB | REP_ASS | PDB  | ASS |
+-+-+--+-+
| 104l|   1 | 104l |   1 |
| 104l|   1 | 104l |   2 |
| 104m|   1 | 104m |   1 |
| 104m|   1 | 105m |   1 |
| 117e|   1 | 117e |   1 |
| 11ba|   1 | 11ba |   1 | 
| 11ba|   1 | 11bg |   1 | 
| 12e8|   1 | 12e8 |   1 | 
| 12e8|   1 | 12e8 |   2 | 
...


Cheers,



>
>
>Mathias
>
>Selon Dan Bolser <[EMAIL PROTECTED]>:
>
>>
>> Hello,
>>
>> I have data like this
>>
>> PK   GRP_COL
>> 1A
>> 2A
>> 3A
>> 4B
>> 5B
>> 6B
>> 7C
>> 8C
>> 9C
>>
>>
>> And I want to write a query to select data like this...
>>
>> PK   FK  GRP_COL
>> 11   A
>> 21   A
>> 31   A
>> 44   B
>> 54   B
>> 64   B
>> 77   C
>> 87   C
>> 97   C
>>
>>
>> Where FK is a random (or otherwise) member of PK from within the
>> appropriate group given by GRP_COL. FK recreates the grouping from
>> GRP_COL, but in terms of PK. I want to do this because GRP_COL is
>> difficult to handle and I want to re-represent the grouping in terms of PK
>> (this allows me to link data into the grouping more easily).
>>
>> Is there a simple way to do this?
>>
>>
>>
>> --
>> MySQL General Mailing List
>> For list archives: http://lists.mysql.com/mysql
>> To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
>>
>>
>
>
>
>


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]

Re: Query to select...

2005-05-24 Thread Dan Bolser




>I agree, especially with the additional information the OP provided about 
>his REAL table structure. A separate groups table makes better sense. 
>
>Let this be an object lesson to others looking for assistance: If you want 
>timely and useful assistance, provide real and complete information 
>whenever possible. SHOW CREATE TABLE gives much better information than 
>DESCRIBE or SHOW COLUMNS. If you do chose to obfuscate your real table and 
>data layout then be prepared to translate whatever advice you receive. 

Eeep! I often try to simplify my problem to the bare bones before asking a
question, as it is often quite tricky to work out what you "really want to
do (tm)" and put it in its simplest form - also I often find that doing
this gives me the answer, and I can just delete my email before I ever
send it!


>Sorry it took so long to get to the bottom of this design issue.

Now to work out what I "really want to do" ;) - I will look over the
answers and see if I am any closer...

Thanks all for the help.


>Shawn Green
>Database Administrator
>Unimin Corporation - Spruce Pine
>
>
>
>


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]

Re: Query to select...

2005-05-24 Thread Dan Bolser

On Tue, 24 May 2005 [EMAIL PROTECTED] wrote:

>Selon Dan Bolser <[EMAIL PROTECTED]>:
>
>>
>> Hello,
>>
>> I have data like this
>>
>> PK   GRP_COL
>> 1A
>> 2A
>> 3A
>> 4B
>> 5B
>> 6B
>> 7C
>> 8C
>> 9C
>>
>>
>> And I want to write a query to select data like this...
>>
>> PK   FK  GRP_COL
>> 11   A
>> 21   A
>> 31   A
>> 44   B
>> 54   B
>> 64   B
>> 77   C
>> 87   C
>> 97   C
>>
>>
>> Where FK is a random (or otherwise) member of PK from within the
>> appropriate group given by GRP_COL. FK recreates the grouping from
>> GRP_COL, but in terms of PK. I want to do this because GRP_COL is
>> difficult to handle and I want to re-represent the grouping in terms of PK
>> (this allows me to link data into the grouping more easily).
>>
>> Is there a simple way to do this?



Sorry about the column names above (something in my head). Here is my
favorite answer...


SET @i=0, @row='';

SELECT 
  *, -- Data table
  IF(@row=GRP_COL, @i, @i:[EMAIL PROTECTED]) AS FK,
  @row:=GRP_COL  AS DROP_ME_LATER
FROM 
  data 
ORDER BY 
  GRP_COL-- Essential for the logic used
;

http://dev.mysql.com/doc/mysql/en/variables.html (John Belamaric)

Having the FK column taken from the PK column was clearly not necessary
(thanks all again for pointers).

Somehow in the distant memory of my brain this is the answer I was looking
for (and finally found). I like this answer because I hate that half of
SQL which ALTERS tables and I have a neurotic fear of UPDATES accross
JOINS that infected my nightmares as a youth!

Strange I know, but its late and time for me to sleep /(xOx)/

Pleasant dreams!

Dan.


>>
>>
>>
>> --
>> MySQL General Mailing List
>> For list archives: http://lists.mysql.com/mysql
>> To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
>>
>>
>
>
>
>


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]

Re: Resetting Auto-increment

2005-05-24 Thread Dan Bolser

On Tue, 24 May 2005 [EMAIL PROTECTED] wrote:

>Is there a better way to reset the auto_increment in a table, basically
>there are several million rows in the database and the field that is
>auto_increment is very large now and I don't want to exceed the limit of
>the field description so I want to renumber all the rows starting at 1
>again.
>
> 
>
>What I found online was this:
>
> 
>
>For those that are looking to "reset" the auto_increment, say on a list
>that has had a few deletions and you want to renumber everything, you can
>do the following.
>
>DROP the field you are auto_incrementing.
>ALTER the table to ADD the field again with the same attributes.
>
>You will notice that all existing rows are renumbered and the next
>auto_increment number will be equal to the row count plus 1.
>
> 
>
> 
>
>So is there an easier / better way to do this? Also would the way this is
>done be different if the table is empty?? I have an empty table that when
>I add a record, starts at some number based on how many have been entered
>and deleted.


In the case that the table is empty the information here

http://dev.mysql.com/doc/mysql/en/set-option.html


should let you do what you want.


I don't know the best way to 're-index' existing data. 


>
> 
>
>Chris Hood 
>
>Investigator Verizon Global Security Operations Center 
>
>Email:  
>[EMAIL PROTECTED] 
>
>Desk: 972.399.5900
>
>Verizon Proprietary
>
>
>
>NOTICE - This message and any attached files may contain information that
>is confidential and/or subject of legal privilege intended only for the
>use by the intended recipient.  If you are not the intended recipient or
>the person responsible for delivering the message to the intended
>recipient, be advised that you have received this message in error and
>that any dissemination, copying or use of this message or attachment is
>strictly forbidden, as is the disclosure of the information therein.  If
>you have received this message in error please notify the sender
>immediately and delete the message.
>
> 
>
>


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]

Re: Cumulative Totals

2005-05-25 Thread Dan Bolser

On Wed, 25 May 2005, Russell Horn wrote:

>I have a pretty simple table with a list of payments, not much more
>than:
>
>paymentID | amount | paymentDate
>1| 123| 2005-01-10
>2| 77 | 2005-01-13
>3| 45 | 2005-02-16
>4| 13 | 2005-02-17
>
>
>I can get totals per month using a query like:
>
>SELECT SUM(amount) , DATE_FORMAT( `paymentDate` , '%Y-%m' )   FROM
>`payments`   GROUP BY DATE_FORMAT( payments . date , '%Y-%m' ) 
>
>That would give me:
>
>amount | paymentDate
>200| 2005-01
>58 | 2005-02
>
>Is there any way to get a running cumulative total directly from mysql?
>Something like:
>
>amount | paymentDate
>200| 2005-01
>258| 2005-02
>

http://dev.mysql.com/doc/mysql/en/variables.html


:D




>Thanks,
>
>Russell.
>
>
>


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]

56 matches

Mail list logo