Re: Efficiency of MEDIUMTEXT columns?

ryc Wed, 18 Apr 2001 15:04:25 -0700
> I have some VARCHAR columns (name, e-mail address and subject) that are
> used in every query, though. Should I change them to CHAR(255) so that I
> can do this? (On a related note, does anyone know whether the RFC for
> e-mail messages limit the subject length/name length/e-mail address length
> to a certain number of characters? What is this number?)

I would suggest to create another seperate table for user info only and just
keep a userid field in the message table that corresponds with the user info
row that holds the name and email. This way if someone posts 100 messages,
you only store
their email address once. The subject field is a bit more tricky. You COULD
change it to Char(255) but that would waste a lot of space seeing that most
subjects are a lot shorter. You can move the subject into the table that
holds the varchar bodies. This wont hurt anything because that table already
has varaible length rows from the body column, adding the subject wont hurt.

> Also, is it a good idea to make a field called size and use that instead
> of calling length(message) (message is MEDIUMTEXT) when I don't need to
> retrieve message, or does MySQL have length(message) cached?

Yes definitly. Although it is tempting to calculate the length/count/other
stats on the fly using database functions, it is not scalable. For any stat
type field like length(in your case size) ask yourself how often it changes.
If it doesnt change often or at all, it is much better to calculate it once
and store 'cache' the result in the database. Do the same for number of
replies (ie dont calculate them on the fly). Otherwise you are going to find
yourself with 500k rows and the counting replies isnt so fast anymore.

> So it would appear that retrieving a message (the message that I used in
> this test is 300k, about the largest that is in the archive) from the
> database is faster than loading it from a disk file. However, when I have
> to split() the message into a list (one entry per line), it becomes
> slower.  Any idea how I can split it more efficiently? (My code has to
> process each line individually in order to hilight text from quoted
> replies; e.g. look at the different colors in
> http://qd.anifics.com/read.asp?num=32)

I can think of two solutions. Instead of splitting it into lines based on a
newline, you can do a global replace using \n as an anchor. Something like:
search for "\n>" and replace with "\n$colorcolor>". I am not sure if this
would be much faster than doing a split you can try it out. But again
thinking scalability you might want to do this conversion once, and store
the results in the database. So you dont have to worry about parsing it into
lines once you retrieve it from the db, you can just spit it out because it
already has the color info in it.

ryan


---------------------------------------------------------------------
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/           (the list archive)

To request this thread, e-mail <[EMAIL PROTECTED]>
To unsubscribe, e-mail <[EMAIL PROTECTED]>
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php
Re: Efficiency of MEDIUMTEXT columns?

Reply via email to