The code page named ESMWIN is the correct one to use for ANSI Spanish
applications. All the others are invalid for real work, or are useful only
for reading old buggy indexes that are incorrectly ordered.
ESMWIN is based on:
Windows locale: Spanish (Modern Sort)
Locale ID: 0xC0A
Location designator: Modern_Spanish
Code page: 1252
Code page name: windows-1252
You can read the following tables and see how incorrect it is to use ISO-????.
Certainly, ESMWIN needs to change its code page name to windows-1252.
ESWIN is an ANSI copy of its OEM version, ES850, compatible with the old
collation used in CL53, but it is incorrectly ordered and unusable in Spain
for this reason.
I don't know Linux well enough, but the following is the Windows information:
1) First of all, these files are not code pages; they are collations.
2) Theoretically, DBF files were created to contain OEM code page data. But
in practice, most Windows applications save ANSI code page data in these
files without problems or conflicts.
3) There are different code pages (OEM, ANSI, UTF8 (Unicode), ...), and each
of them has its own collation.
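To illustrate point 1 (a collation is not a code page), here is a minimal Python sketch. It is an assumption-laden toy, not the Windows or Harbour collation tables: the two alphabets and the helper names `_base` and `sort_key` are invented for illustration. The same words, storable in the same code page, come out in a different order under a modern and a traditional Spanish sort, because the traditional sort treats "ch" and "ll" as letters of their own.

```python
import unicodedata

# Simplified alphabets, for illustration only. Real collation tables
# (Windows Modern_Spanish / Traditional_Spanish, Harbour ESMWIN) are richer.
MODERN = list("abcdefghijklmnñopqrstuvwxyz")
TRADITIONAL = (list("abc") + ["ch"] + list("defghijkl") + ["ll"]
               + list("mnñopqrstuvwxyz"))

def _base(ch):
    """Strip accents (á -> a, ü -> u) but keep ñ as its own letter."""
    if ch == "ñ":
        return ch
    return unicodedata.normalize("NFD", ch)[0]

def sort_key(word, alphabet):
    """Return the list of letter ranks of `word` under `alphabet`."""
    text = "".join(_base(c) for c in word.lower())
    key, i = [], 0
    while i < len(text):
        pair = text[i:i + 2]
        # Prefer a two-letter unit (ch, ll) when the alphabet defines one.
        if len(pair) == 2 and pair in alphabet:
            key.append(alphabet.index(pair))
            i += 2
        elif text[i] in alphabet:
            key.append(alphabet.index(text[i]))
            i += 1
        else:
            i += 1  # ignore characters outside the alphabet
    return key

words = ["mano", "llama", "luz", "dama", "chico", "cuna", "ñandú", "nube"]
modern = sorted(words, key=lambda w: sort_key(w, MODERN))
traditional = sorted(words, key=lambda w: sort_key(w, TRADITIONAL))
```

In this sketch `modern` puts "chico" before "cuna" and "llama" before "luz", while `traditional` reverses both pairs; "ñandú" follows "nube" in both.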
Single-byte code pages are definitions of the characters mapped to each of the 256 bit patterns possible in a byte. Code pages
define bit patterns for uppercase and lowercase characters, digits, symbols, and special characters such as !, @, #, or %. Each
European language, such as German or Spanish, has its own single-byte code page. Although the bit patterns used to represent the
Latin alphabet characters A through Z are the same for all the code pages, the bit patterns used to represent accented
characters such as 'é' and 'á' vary from one code page to the next. If data is exchanged between computers running different
code pages, all character data must be converted from the code page of the sending computer to the code page of the receiving
computer. If the source data has extended characters that are not defined in the code page of the receiving computer, data is
lost. When a database serves clients from many different countries/regions, it is difficult to pick a code page for the database
that contains all the extended characters required by all the client computers. Also, there is a lot of processing time spent
doing the constant conversions from one code page to another.
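A minimal Python sketch of the paragraph above, assuming Python's built-in `cp850` and `cp1252` codecs as stand-ins for the OEM and ANSI code pages discussed here; the helper name `convert` is made up for illustration:

```python
# Illustration only: Python's built-in "cp850" (OEM) and "cp1252" (ANSI)
# codecs stand in for the code pages discussed above.

def convert(data: bytes, src: str, dst: str) -> bytes:
    """Re-encode bytes from code page `src` to code page `dst`.

    Characters missing from `dst` are replaced with b"?" (data loss).
    """
    return data.decode(src).encode(dst, errors="replace")

# Plain Latin letters share one bit pattern in both code pages...
same_a = "A".encode("cp850") == "A".encode("cp1252")

# ...but accented letters do not.
e_oem = "é".encode("cp850")     # byte 0x82
e_ansi = "é".encode("cp1252")   # byte 0xE9

# Converting between code pages keeps the character, not the byte.
round_trip = convert(e_oem, "cp850", "cp1252")

# The euro sign exists in cp1252 (0x80) but not in cp850, so it is lost.
lost = convert("€".encode("cp1252"), "cp1252", "cp850")
```

Here `round_trip` equals the cp1252 byte for 'é', while `lost` is just `b"?"`: the receiving code page has no slot for the character.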
Code page  Description
1258       Vietnamese
1257       Baltic
1256       Arabic
1255       Hebrew
1254       Turkish
1253       Greek
1252       Latin1 (ANSI)
1251       Cyrillic
1250       Central European
950        Chinese (Traditional)
949        Korean
936        Chinese (Simplified)
932        Japanese
874        Thai
850        Multilingual (MS-DOS Latin1)
437        MS-DOS U.S. English
CodePage identifier and name    BrDisp BrSave MNDisp MNSave 1-Byte ReadOnly
37     IBM037                   False  False  False  False  True   True
437    IBM437                   False  False  False  False  True   True
500    IBM500                   False  False  False  False  True   True
708    ASMO-708                 True   True   False  False  True   True
720    DOS-720                  True   True   False  False  True   True
737    ibm737                   False  False  False  False  True   True
775    ibm775                   False  False  False  False  True   True
850    ibm850                   False  False  False  False  True   True
852    ibm852                   True   True   False  False  True   True
855    IBM855                   False  False  False  False  True   True
857    ibm857                   False  False  False  False  True   True
858    IBM00858                 False  False  False  False  True   True
860    IBM860                   False  False  False  False  True   True
861    ibm861                   False  False  False  False  True   True
862    DOS-862                  True   True   False  False  True   True
863    IBM863                   False  False  False  False  True   True
864    IBM864                   False  False  False  False  True   True
865    IBM865                   False  False  False  False  True   True
866    cp866                    True   True   False  False  True   True
869    ibm869                   False  False  False  False  True   True
870    IBM870                   False  False  False  False  True   True
874    windows-874              True   True   True   True   True   True
875    cp875                    False  False  False  False  True   True
932    shift_jis                True   True   True   True   False  True
936    gb2312                   True   True   True   True   False  True
949    ks_c_5601-1987           True   True   True   True   False  True
950    big5                     True   True   True   True   False  True
1026   IBM1026                  False  False  False  False  True   True
1047   IBM01047                 False  False  False  False  True   True
1140   IBM01140                 False  False  False  False  True   True
1141   IBM01141                 False  False  False  False  True   True
1142   IBM01142                 False  False  False  False  True   True
1143   IBM01143                 False  False  False  False  True   True
1144   IBM01144                 False  False  False  False  True   True
1145   IBM01145                 False  False  False  False  True   True
1146   IBM01146                 False  False  False  False  True   True
1147   IBM01147                 False  False  False  False  True   True
1148   IBM01148                 False  False  False  False  True   True
1149   IBM01149                 False  False  False  False  True   True
1200   utf-16                   False  True   False  False  False  True
1201   unicodeFFFE              False  False  False  False  False  True
1250   windows-1250             True   True   True   True   True   True
1251   windows-1251             True   True   True   True   True   True
1252   Windows-1252             True   True   True   True   True   True
1253   windows-1253             True   True   True   True   True   True
1254   windows-1254             True   True   True   True   True   True
1255   windows-1255             True   True   True   True   True   True
1256   windows-1256             True   True   True   True   True   True
1257   windows-1257             True   True   True   True   True   True
1258   windows-1258             True   True   True   True   True   True
1361   Johab                    False  False  False  False  False  True
10000  macintosh                False  False  False  False  True   True
10001  x-mac-japanese           False  False  False  False  False  True
10002  x-mac-chinesetrad        False  False  False  False  False  True
10003  x-mac-korean             False  False  False  False  False  True
10004  x-mac-arabic             False  False  False  False  True   True
10005  x-mac-hebrew             False  False  False  False  True   True
10006  x-mac-greek              False  False  False  False  True   True
10007  x-mac-cyrillic           False  False  False  False  True   True
10008  x-mac-chinesesimp        False  False  False  False  False  True
10010  x-mac-romanian           False  False  False  False  True   True
10017  x-mac-ukrainian          False  False  False  False  True   True
10021  x-mac-thai               False  False  False  False  True   True
10029  x-mac-ce                 False  False  False  False  True   True
10079  x-mac-icelandic          False  False  False  False  True   True
10081  x-mac-turkish            False  False  False  False  True   True
10082  x-mac-croatian           False  False  False  False  True   True
20000  x-Chinese-CNS            False  False  False  False  False  True
20001  x-cp20001                False  False  False  False  False  True
20002  x-Chinese-Eten           False  False  False  False  False  True
20003  x-cp20003                False  False  False  False  False  True
20004  x-cp20004                False  False  False  False  False  True
20005  x-cp20005                False  False  False  False  False  True
20105  x-IA5                    False  False  False  False  True   True
20106  x-IA5-German             False  False  False  False  True   True
20107  x-IA5-Swedish            False  False  False  False  True   True
20108  x-IA5-Norwegian          False  False  False  False  True   True
20127  us-ascii                 False  False  True   True   True   True
20261  x-cp20261                False  False  False  False  False  True
20269  x-cp20269                False  False  False  False  True   True
20273  IBM273                   False  False  False  False  True   True
20277  IBM277                   False  False  False  False  True   True
20278  IBM278                   False  False  False  False  True   True
20280  IBM280                   False  False  False  False  True   True
20284  IBM284                   False  False  False  False  True   True
20285  IBM285                   False  False  False  False  True   True
20290  IBM290                   False  False  False  False  True   True
20297  IBM297                   False  False  False  False  True   True
20420  IBM420                   False  False  False  False  True   True
20423  IBM423                   False  False  False  False  True   True
20424  IBM424                   False  False  False  False  True   True
20833  x-EBCDIC-KoreanExtended  False  False  False  False  True   True
20838  IBM-Thai                 False  False  False  False  True   True
20866  koi8-r                   True   True   True   True   True   True
20871  IBM871                   False  False  False  False  True   True
20880  IBM880                   False  False  False  False  True   True
20905  IBM905                   False  False  False  False  True   True
20924  IBM00924                 False  False  False  False  True   True
20932  EUC-JP                   False  False  False  False  False  True
20936  x-cp20936                False  False  False  False  False  True
20949  x-cp20949                False  False  False  False  False  True
21025  cp1025                   False  False  False  False  True   True
21866  koi8-u                   True   True   True   True   True   True
28591  iso-8859-1               True   True   True   True   True   True
28592  iso-8859-2               True   True   True   True   True   True
28593  iso-8859-3               False  False  True   True   True   True
28594  iso-8859-4               True   True   True   True   True   True
28595  iso-8859-5               True   True   True   True   True   True
28596  iso-8859-6               True   True   True   True   True   True
28597  iso-8859-7               True   True   True   True   True   True
28598  iso-8859-8               True   True   False  False  True   True
28599  iso-8859-9               True   True   True   True   True   True
28603  iso-8859-13              False  False  False  False  True   True
28605  iso-8859-15              False  True   True   True   True   True
29001  x-Europa                 False  False  False  False  True   True
38598  iso-8859-8-i             True   True   True   True   True   True
50220  iso-2022-jp              False  False  True   True   False  True
50221  csISO2022JP              False  True   True   True   False  True
50222  iso-2022-jp              False  False  False  False  False  True
50225  iso-2022-kr              False  False  True   False  False  True
50227  x-cp50227                False  False  False  False  False  True
51932  euc-jp                   True   True   True   True   False  True
51936  EUC-CN                   False  False  False  False  False  True
51949  euc-kr                   False  False  True   True   False  True
52936  hz-gb-2312               True   True   True   True   False  True
54936  GB18030                  True   True   True   True   False  True
57002  x-iscii-de               False  False  False  False  False  True
57003  x-iscii-be               False  False  False  False  False  True
57004  x-iscii-ta               False  False  False  False  False  True
57005  x-iscii-te               False  False  False  False  False  True
57006  x-iscii-as               False  False  False  False  False  True
57007  x-iscii-or               False  False  False  False  False  True
57008  x-iscii-ka               False  False  False  False  False  True
57009  x-iscii-ma               False  False  False  False  False  True
57010  x-iscii-gu               False  False  False  False  False  True
57011  x-iscii-pa               False  False  False  False  False  True
65000  utf-7                    False  False  True   True   False  True
65001  utf-8                    True   True   True   True   False  True
65005  utf-32                   False  False  False  False  False  True
65006  utf-32BE                 False  False  False  False  False  True
Windows locale                                      LCID (locale ID)  Collation designator     Code page
Afrikaans                                           0x436             Latin1_General           1252
Albanian                                            0x41C             Albanian                 1250
Arabic (Saudi Arabia)                               0x401             Arabic                   1256
Arabic (Iraq)                                       0x801             Arabic                   1256
Arabic (Egypt)                                      0xC01             Arabic                   1256
Arabic (Libya)                                      0x1001            Arabic                   1256
Arabic (Algeria)                                    0x1401            Arabic                   1256
Arabic (Morocco)                                    0x1801            Arabic                   1256
Arabic (Tunisia)                                    0x1C01            Arabic                   1256
Arabic (Oman)                                       0x2001            Arabic                   1256
Arabic (Yemen)                                      0x2401            Arabic                   1256
Arabic (Syria)                                      0x2801            Arabic                   1256
Arabic (Jordan)                                     0x2C01            Arabic                   1256
Arabic (Lebanon)                                    0x3001            Arabic                   1256
Arabic (Kuwait)                                     0x3401            Arabic                   1256
Arabic (United Arab Emirates)                       0x3801            Arabic                   1256
Arabic (Bahrain)                                    0x3C01            Arabic                   1256
Arabic (Qatar)                                      0x4001            Arabic                   1256
Basque                                              0x42D             Latin1_General           1252
Byelorussian                                        0x423             Cyrillic_General         1251
Bulgarian                                           0x402             Cyrillic_General         1251
Catalan                                             0x403             Latin1_General           1252
Chinese (Taiwan)                                    0x30404           Chinese_Taiwan_Bopomofo  950
Chinese (Taiwan)                                    0x404             Chinese_Taiwan_Stroke    950
Chinese (People's Republic of China)                0x804             Chinese_PRC              936
Chinese (People's Republic of China)                0x20804           Chinese_PRC_Stroke       936
Chinese (Singapore)                                 0x1004            Chinese_PRC              936
Croatia                                             0x41a             Croatian                 1250
Czech                                               0x405             Czech                    1250
Danish                                              0x406             Danish_Norwegian         1252
Dutch (Standard)                                    0x413             Latin1_General           1252
Dutch (Belgium)                                     0x813             Latin1_General           1252
English (United States)                             0x409             Latin1_General           1252
English (Britain)                                   0x809             Latin1_General           1252
English (Canada)                                    0x1009            Latin1_General           1252
English (New Zealand)                               0x1409            Latin1_General           1252
English (Australia)                                 0xC09             Latin1_General           1252
English (Ireland)                                   0x1809            Latin1_General           1252
English (South Africa)                              0x1C09            Latin1_General           1252
English (Caribbean)                                 0x2409            Latin1_General           1252
English (Jamaican)                                  0x2009            Latin1_General           1252
Estonian                                            0x425             Estonian                 1257
Faeroese                                            0x0438            Latin1_General           1252
Farsi                                               0x429             Arabic                   1256
Finnish                                             0x40B             Finnish_Swedish          1252
French (Standard)                                   0x40C             French                   1252
French (Belgium)                                    0x80C             French                   1252
French (Switzerland)                                0x100C            French                   1252
French (Canada)                                     0xC0C             French                   1252
French (Luxembourg)                                 0x140C            French                   1252
Georgian (Modern Sort)                              0x10437           Georgian_Modern_Sort     1252
German (PhoneBook Sort)                             0x10407           German_PhoneBook         1252
German (Standard)                                   0x407             Latin1_General           1252
German (Switzerland)                                0x807             Latin1_General           1252
German (Austria)                                    0xC07             Latin1_General           1252
German (Luxembourg)                                 0x1007            Latin1_General           1252
German (Liechtenstein)                              0x1407            Latin1_General           1252
Greek                                               0x408             Greek                    1253
Hebrew                                              0x40D             Hebrew                   1255
Hindi                                               0x439             Hindi                    Unicode only
Hungarian                                           0x40E             Hungarian                1250
Hungarian                                           0x1040E           Hungarian_Technical      1250
Icelandic                                           0x40F             Icelandic                1252
Indonesian                                          0x421             Latin1_General           1252
Italian                                             0x410             Latin1_General           1252
Italian (Switzerland)                               0x810             Latin1_General           1252
Japanese                                            0x411             Japanese                 932
Japanese (Unicode)                                  0x10411           Japanese_Unicode         932
Korean (Extended Wansung)                           0x412             Korean_Wansung           949
Korean                                              0x412             Korean_Wansung_Unicode   949
Latvian                                             0x426             Latvian                  1257
Lithuanian                                          0x427             Lithuanian               1257
Lithuanian                                          0x827             Lithuanian_Classic       1257
Macedonian (Former Yugoslav Republic of Macedonia)  0x42F             Cyrillic_General         1251
Norwegian (Bokmål)                                  0x414             Danish_Norwegian         1252
Norwegian (Nynorsk)                                 0x814             Danish_Norwegian         1252
Polish                                              0x415             Polish                   1250
Portuguese (Portugal)                               0x816             Latin1_General           1252
Portuguese (Brazil)                                 0x416             Latin1_General           1252
Romanian                                            0x418             Romanian                 1250
Russian                                             0x419             Cyrillic_General         1251
Serbian (Latin)                                     0x81A             Cyrillic_General         1251
Serbian (Cyrillic)                                  0xC1A             Cyrillic_General         1251
Slovak                                              0x41B             Slovak                   1250
Slovenian                                           0x424             Slovenian                1250
Spanish (Mexico)                                    0x80A             Traditional_Spanish      1252
Spanish (Traditional Sort)                          0x40A             Traditional_Spanish      1252
Spanish (Modern Sort)                               0xC0A             Modern_Spanish           1252
Spanish (Guatemala)                                 0x100A            Modern_Spanish           1252
Spanish (Costa Rica)                                0x140A            Modern_Spanish           1252
Spanish (Panama)                                    0x180A            Modern_Spanish           1252
Spanish (Dominican Republic)                        0x1C0A            Modern_Spanish           1252
Spanish (Venezuela)                                 0x200A            Modern_Spanish           1252
Spanish (Colombia)                                  0x240A            Modern_Spanish           1252
Spanish (Peru)                                      0x280A            Modern_Spanish           1252
Spanish (Argentina)                                 0x2C0A            Modern_Spanish           1252
Spanish (Ecuador)                                   0x300A            Modern_Spanish           1252
Spanish (Chile)                                     0x340A            Modern_Spanish           1252
Spanish (Uruguay)                                   0x380A            Modern_Spanish           1252
Spanish (Paraguay)                                  0x3C0A            Modern_Spanish           1252
Spanish (Bolivia)                                   0x400A            Modern_Spanish           1252
Swedish                                             0x41D             Finnish_Swedish          1252
Thai                                                0x41E             Thai                     874
Turkish                                             0x41F             Turkish                  1254
Ukrainian                                           0x422             Ukrainian                1251
Urdu                                                0x420             Arabic                   1256
Vietnamese                                          0x42A             Vietnamese               1258
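The LCID values in the table are packed bit fields, which is why all the Spanish rows end in "0A". A minimal Python sketch of the decoding follows, based on the documented Windows layout (low 16 bits are the language ID, bits 16-19 the sort ID; within the language ID the low 10 bits are the primary language and the high 6 bits the sublanguage). The function name `split_lcid` is made up for illustration.

```python
def split_lcid(lcid: int):
    """Split a Windows LCID into (sort_id, primary_language, sublanguage)."""
    sort_id = (lcid >> 16) & 0xF   # bits 16-19: sort order variant
    lang_id = lcid & 0xFFFF        # bits 0-15: language identifier
    primary = lang_id & 0x3FF      # low 10 bits of the language ID
    sublang = lang_id >> 10        # high 6 bits of the language ID
    return sort_id, primary, sublang

# Rows taken from the table above.
traditional = split_lcid(0x40A)    # Spanish (Traditional Sort)
mexico = split_lcid(0x80A)         # Spanish (Mexico)
modern = split_lcid(0xC0A)         # Spanish (Modern Sort), used by ESMWIN
phonebook = split_lcid(0x10407)    # German (PhoneBook Sort)
```

The three Spanish entries share primary language 0x0A and differ only in the sublanguage (1, 2 and 3), while the German phone book entry is plain German (0x407) with sort ID 1.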
The current eswin and eswinm (esmwin) are set up to use the ISO-8859-1
CP internally. If this is right, they are in fact wrongly named
es_win_, and they should be named es_iso_. I assumed the
content is right, so I copied them to es_iso_ with the exact same
content; this way the filename and internal CP are in sync.
Now, if the end result is not right, it can mean two things:
1) The CP marking in eswin is wrongly set, and should
be Windows-something. In this case eswin should be
corrected to use proper CP, and esiso should be converted
to use ISO-8859-1.
2) The CP marking in eswin is right, in this case the
content should be changed to be Windows-something CP,
and Windows CP strings, and new esiso files kept as correct
ones.
But having an ESWIN codepage internally using ISO-8859-1,
is completely misleading.
'SVWIN' BTW has a similar problem. Name 'win', internal CP ISO.
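
A minimal Python sketch of how close the two candidate CP markings are, assuming Python's built-in `latin-1` (ISO-8859-1) and `cp1252` (Windows-1252) codecs; the variable names are made up for illustration. The Spanish letters encode to the same bytes in both, and the two code pages differ only in the byte range 0x80-0x9F:

```python
# Letters and punctuation specific to Spanish text.
SPANISH = "áéíóúüñÁÉÍÓÚÜÑ¿¡"

# Same bytes in ISO-8859-1 and Windows-1252 for every Spanish character.
same_for_spanish = SPANISH.encode("latin-1") == SPANISH.encode("cp1252")

# Bytes whose meaning differs between the two code pages. Bytes that
# Python's cp1252 codec leaves undefined decode to U+FFFD here.
differing = [
    b for b in range(256)
    if bytes([b]).decode("latin-1")
    != bytes([b]).decode("cp1252", errors="replace")
]

# The euro sign is one such difference: byte 0x80 in Windows-1252,
# absent from ISO-8859-1.
euro_cp1252 = "€".encode("cp1252")
try:
    "€".encode("latin-1")
    euro_in_latin1 = True
except UnicodeEncodeError:
    euro_in_latin1 = False
```

So the collation order of Spanish letters is unaffected by which of the two names is used; only characters such as the euro sign depend on it.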
Brgds,
Viktor
The ESMWIN codepage is a consensus among various programmers to make it
compatible with the "Modern_Spanish" codepage that Microsoft uses in
its applications, and that is also compatible with all other "Latin"
languages without characters above 256 (utf8).
For example, this collation is compatible with French, Portuguese,
Spanish, Catalan, Galician and Italian (COMPATIBLE).
These codepages are ANSI collation codepages. They are not ISO; really,
they are a merge of codepages, so that only one is needed in countries
where more than one language is spoken, such as Spain.
If possible, please revert the name change and remove the new
copy.
What about backward compatibility?
Best regards,
Miguel Angel Marchuet
Szakáts Viktor wrote:
2008-11-03 11:20 UTC+0200 Viktor Szakats (harbour.01 syenar hu)
* common.mak
* source/codepage/Makefile
- source/codepage/cpesmwin.c
+ source/codepage/cpeswinm.c
+ source/codepage/cpesiso.c
+ source/codepage/cpesisom.c
* Renamed cpesmwin -> cpeswinm (ESMWIN -> ESWINM) (INCOMPATIBLE)
+ Added Spanish ISO natsort modules. Besides their ID, they
are identical with current ESWIN* natsorts, because the WIN
versions for some reason are using ISO-8859 CP instead of
Windows-*. This is IMO wrong, even if these CPs are similar or
identical for the Spanish language.
SORRY, BUT IT IS NOT AN ISO CODEPAGE
--
Brgds,
Viktor
_______________________________________________
Harbour mailing list
Harbour@harbour-project.org
http://lists.harbour-project.org/mailman/listinfo/harbour