Re: Flexable Collating (feedback please)

Gabriel Genellina Wed, 18 Oct 2006 19:07:15 -0700

At Wednesday 18/10/2006 21:36, Ron Adam wrote:

>>          if self.flag & CAPS_FIRST:
>>              s = s.swapcase()
>
> This is just coincidental; it relies on (lowercase)<(uppercase) on the
> locale collating sequence, and I don't see why it should be always so.

The LC_COLLATE structure (in the python.exe C code I think) controls the order
of upper and lower case during collating.  I don't know if there is any way to
examine it unfortunately.

LC_COLLATE is just a #define'd constant. I don't know how to examine the collating definition, either.

If there was a way to change the LC_COLLATE structure, I wouldn't need to resort to tricks like s.swapcase(). But without that info, I don't know of another way.
Maybe changing the CAPS_FIRST to REVERSE_CAPS_ORDER would do?


At least it's a more accurate name.

There is an indirect way: test locale.strcoll("A","a") and see how they get sorted. Then define options CAPS_FIRST, LOWER_FIRST accordingly. But maybe it's too much trouble...
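
A minimal sketch of that indirect test, written in Python 3 syntax. The helper name caps_sort_first is made up for illustration and is not part of the module under discussion; locale.strcoll and locale.setlocale are the only library calls used.

```python
import locale

def caps_sort_first():
    """Return True if "A" collates before "a" under the current LC_COLLATE."""
    # strcoll returns a negative number when its first argument sorts first.
    return locale.strcoll("A", "a") < 0

# The portable "C" locale compares by code point, so "A" (65) precedes "a" (97).
locale.setlocale(locale.LC_COLLATE, "C")
print(caps_sort_first())  # True under the "C" locale
```

A module could run such a probe once after the locale is set and only apply s.swapcase() when the probed order differs from the one the caller asked for.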

> You should try to make this part a bit more generic. If you are
> concerned about locales, do not use "comma" explicitly. In other
> countries 10*100=1.000 - and 1,234 is a fraction between 1 and 2.

See the most recent version of this I posted.  It is a bit more generic.

       news://news.cox.net:119/[EMAIL PROTECTED]

Maybe a 'comma_is_decimal' option?

I'd prefer to use the 'decimal_point' and 'thousands_sep' from the locale information. That would be more coherent with the locale usage throughout your module.
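
As a rough illustration of that suggestion, the sketch below reads both separators from a localeconv()-style dictionary. The function name parse_localized_number and the sample dictionary are assumptions made for this example; only locale.localeconv() and its 'decimal_point' and 'thousands_sep' keys come from the standard library.

```python
import locale

def parse_localized_number(text, conv=None):
    """Convert text such as '1.234,5' to a float using locale separators."""
    # Default to the conventions of the current locale (LC_NUMERIC).
    if conv is None:
        conv = locale.localeconv()
    thousands = conv["thousands_sep"]
    if thousands:
        # Drop grouping characters before converting.
        text = text.replace(thousands, "")
    return float(text.replace(conv["decimal_point"], "."))

# Explicit conventions, so the result does not depend on the machine's locale.
continental = {"decimal_point": ",", "thousands_sep": "."}
print(parse_localized_number("1.000", continental))  # 1000.0
print(parse_localized_number("1,234", continental))  # 1.234
```

The standard library's locale.atof() performs a similar conversion for the active locale; the explicit dictionary is used here only to keep the example reproducible.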

Options are cheap so it's no problem to add them as long as they make sense. ;-)
These options are what I refer to as mid-level options.  The programmer does
still need to know something about the data they are collating. They may still
need to do some preprocessing even with this, but maybe not as much.

In a higher level collation routine, I think you would just need to specify a
named sort type, such as 'dictionary', 'directory', 'inventory' and it would set
the options accordingly. The problem with that approach is the higher level
definitions may be different depending on locale or even the field it is used in.

Sure. But your module is a good starting point for building a more high-level procedure.

>>      The NUMERICAL option orders leading and trailing digits as numerals.
>>
>>          >>> t = ['a5', 'a40', '4abc', '20abc', 'a10.2', '13.5b', 'b2']
>>          >>> collated(t, NUMERICAL)
>>          ['4abc', '13.5b', '20abc', 'a5', 'a10.2', 'a40', 'b2']
>
>  From the name "NUMERICAL" I would expect this sorting: b2, 4abc, a5,
> a10.2, 13.5b, 20abc, a40 (that is, sorting as numbers only).
> Maybe GROUP_NUMBERS... but I don't like that too much either...
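
To make the two orderings concrete, here is a small sketch in Python 3 syntax. It is not the module's implementation: grouped_key and number_only_key are hypothetical helpers that reproduce the docstring result and the "numbers only" ordering for this one sample list.

```python
import re

_NUMBER = re.compile(r"(\d+(?:\.\d+)?)")

def grouped_key(s):
    """Split s into text and number runs; numbers compare by value."""
    parts = []
    for i, piece in enumerate(_NUMBER.split(s)):
        if not piece:
            continue
        # Odd indexes hold the captured numbers; tag them 0 so they sort before text.
        parts.append((0, float(piece)) if i % 2 else (1, piece))
    return parts

def number_only_key(s):
    """Use just the first number found in s (assumes s contains one)."""
    return float(_NUMBER.search(s).group())

t = ['a5', 'a40', '4abc', '20abc', 'a10.2', '13.5b', 'b2']
print(sorted(t, key=grouped_key))
# ['4abc', '13.5b', '20abc', 'a5', 'a10.2', 'a40', 'b2']
print(sorted(t, key=number_only_key))
# ['b2', '4abc', 'a5', 'a10.2', '13.5b', '20abc', 'a40']
```

The (0, number) and (1, text) tags keep floats and strings from ever being compared directly, which Python 3 would reject.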

How about 'VALUE_ORDERING'?

The term I've seen before is natural ordering, but that is more general
and can include dates, Roman numerals, as well as other types.

Sometimes that's the hard part, finding a name which is concise, descriptive, and accurately reflects what the code does. A good name should make obvious what it is used for (whether these are option names, class names, or method names...) but in this case it may be difficult to find a good one. So users will have to read the documentation (a good thing, anyway!)



--
Gabriel Genellina

Softlab SRL


        
        
                