On 10/24/2013 04:40 PM, Boris Pavlovic wrote:
Hi,


1) If you have 30 million instances, that means you have 300 million
instance_system_metadata records. All of these records will be downloaded
every 6 seconds (in periodic tasks) to the compute nodes, which means that
OpenStack at scale doesn't work out of the box.

If you have 3 million instances the situation is the same (OpenStack
doesn't work), but the migration will be done in about 1 minute.

So it is at most 1 minute of downtime (and that in a case that isn't really
realistic anyway).


This change is actually very important because VARCHAR works much, much
faster than BLOB (TEXT) records.

This is actually not always true. It depends on:

* What RDBMS you are using
* What storage engine within MySQL, if using MySQL
* What the data access/modification patterns on the field are

The last bullet is very important. For fields that:

* Are not used in predicates or aggregates (i.e. not used in JOIN conditions, WHERE clauses, or GROUP BY/HAVING clauses)
* Are rarely read or updated
* Are typically longer than 200-300 bytes of data

It's typically more efficient (both from a fill factor perspective and from a log structure perspective) to leave the field as TEXT rather than using VARCHAR. The reason is twofold:

1) Using TEXT saves space in the main data pages of the storage engine, since a pointer to the externally stored data is kept in the data page instead of the data itself. That means more records of the table fit in a single fixed-size data page, which means fewer disk and memory reads to find a record, which means faster seeks and scans.

2) When modifying the value of the TEXT field, there is no chance that a clustered index-organized table layout (like, say, InnoDB uses) would need to perform a rebalancing action due to the new size of the TEXT data causing the record to not fit on its containing data page -- something that would happen a lot more if VARCHAR was used.
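
For what it's worth, a quick way to eyeball the page-density side of this on
MySQL is to compare SHOW TABLE STATUS output before and after such a change.
A minimal sketch, assuming SQLAlchemy 1.4+ and made-up connection details:

    import sqlalchemy as sa

    engine = sa.create_engine('mysql+pymysql://user:pass@localhost/nova')
    with engine.connect() as conn:
        status = conn.execute(
            sa.text("SHOW TABLE STATUS LIKE 'instance_system_metadata'")
        ).mappings().one()
        # Avg_row_length and Data_length show how densely rows pack into
        # the table's data pages.
        print(status['Avg_row_length'], status['Data_length'])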

If the data is:

1) Often used in predicates or aggregates
2) Often updated, with the size of the updated value staying within a similar range
3) ALWAYS within a small, defined range of bytes (say... 32-64 bytes)

Then it's often advantageous to use VARCHAR (or CHAR) over TEXT.

But it's wrong to say that it is ALWAYS faster to use VARCHAR vs. BLOB/TEXT.
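
To make that concrete, here's a rough SQLAlchemy sketch of the two cases (the
table and column names are made up for illustration, not Nova's actual
schema):

    from sqlalchemy import Column, Integer, String, Text
    from sqlalchemy.ext.declarative import declarative_base

    Base = declarative_base()


    class Example(Base):
        __tablename__ = 'example'

        id = Column(Integer, primary_key=True)

        # Long, rarely read, never filtered on: leaving this as TEXT keeps
        # the bulk of the data off the main data pages.
        raw_config = Column(Text)

        # Short, bounded, and used in WHERE clauses: VARCHAR (plus an
        # index) is the better fit here.
        hostname = Column(String(64), index=True)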

Best,
-jay

So this is an important change and shouldn't be -2'd.


Best regards,
Boris Pavlovic





On Fri, Oct 25, 2013 at 12:30 AM, Michael Still <mi...@stillhq.com> wrote:

    Hi.

    Because I am a grumpy old man I have just -2'ed
    https://review.openstack.org/#/c/39685/ and I wanted to explain my
    rationale. Mostly I am hoping for a consensus to form -- if I am wrong
    then I'll happily remove my vote from this patch.

    This patch does the reasonably sensible thing of converting two
    columns from being text to varchar, which reduces their expense to the
    database. Given the data stored is already of limited length, it
    doesn't impact our functionality at all either.
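
    A migration of that shape looks roughly like the sketch below, in the
    sqlalchemy-migrate style Nova uses; the table and column names are
    illustrative rather than copied from the patch:

        from sqlalchemy import MetaData, String, Table


        def upgrade(migrate_engine):
            meta = MetaData()
            meta.bind = migrate_engine
            sys_meta = Table('instance_system_metadata', meta, autoload=True)
            # sqlalchemy-migrate adds alter() to columns; on MySQL this is
            # an ALTER TABLE that rewrites every row, which is where the
            # downtime on a large database comes from.
            sys_meta.c.value.alter(type=String(255))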

    However, when I run it with medium sized (30 million instances)
    databases, the change does cause a 10 minute downtime. I don't
    personally think the change is worth such a large outage, but perhaps
    everyone else disagrees.

    Discuss.

    Thanks,
    Michael

    PS: I could see a more complicated approach where we did these changes
    "in flight" by adding columns, using a periodic task to copy data to
    the new columns, and then dropping the old. That's a lot more
    complicated to implement though.
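
    Roughly, the periodic-task half of that approach could look like the
    sketch below -- assuming a hypothetical new value_v2 column has already
    been added next to the existing value column (this is illustrative only,
    not an actual Nova implementation, and is written against a modern
    SQLAlchemy):

        import sqlalchemy as sa


        def backfill_batch(engine, batch_size=1000):
            """Copy one batch of rows from the old column to the new one.

            Call from a periodic task until it returns 0; a later migration
            can then drop the old column.
            """
            meta = sa.MetaData()
            sys_meta = sa.Table('instance_system_metadata', meta,
                                autoload_with=engine)
            with engine.begin() as conn:
                rows = conn.execute(
                    sa.select(sys_meta.c.id, sys_meta.c.value)
                    .where(sys_meta.c.value_v2.is_(None))
                    .limit(batch_size)
                ).fetchall()
                for row in rows:
                    conn.execute(
                        sys_meta.update()
                        .where(sys_meta.c.id == row.id)
                        .values(value_v2=row.value)
                    )
            return len(rows)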

    --
    Rackspace Australia




_______________________________________________
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


