> On 19 May 2015, at 11:52, toki <toki.kant...@gmail.com> wrote:
> 
> 
> 
> On 19/05/2015 08:05, Dirk-Willem van Gulik wrote:
> 
>>> In testing out various grammar and spell checkers, I've come across a
>>> couple of instances, where different languages/dialects share the same
>>> ISO-639-# code.
>> 
>> Can you give an example
> 
> ISO 639-1 is xh
> ISO 639-2 is xho
> ISO 639-3 is xho
> Glottolog is xhos1239
> ISO 3166-1 ZA / ZAF / 710
> ISO 3166-2 ZA-EC
> 
> and
> 
> ISO 639-1 is xh
> ISO 639-2 is xho
> ISO 639-3 is xho
> Glottolog is mpon1252
> ISO 3166-1 ZA / ZAF / 710
> ISO 3166-2 ZA-NL
> (Please skip the debate about whether or not the enclaves are KwaZulu,
> the Eastern Cape, or Lesotho.)

OK - good examples. So the 639 codes all map to

        http://www.ethnologue.com/language/xho

Two map to the actual language in current use; one maps to the language
families and group that xho and its dialects, like Mpondo, belong to.

And:

        -1 (xh equivalent)
        -2 and -3: (xho)

to
                http://www-01.sil.org/iso639-3/documentation.asp?id=xho

so the -1, -2 and -3 are equivalent. And 1:1 with xhos1239 in Glottolog? And -5
is a red herring - it maps to the language families and group of xho
languages.

Now as far as I can see - mpon1252 is a dialect (Mpondo) within xhos1239.

It has no entry of its own in -3 or within -5; so its closest match is
xh/xho/xho in -1, -2, -3; and it for sure belongs in -5 xho.

Or in other words: SIL.org (or the US Library of Congress for -5) has not
assigned it a code (yet).

So in ISO 639-X the most accurately you can pinpoint it is xh and then xho.

And in Glottolog, you have mpon1252 as its most precise identifier.

Now as it *happens*, this language is spoken in an area fully covered by a
single country, so you can use a 3166 code as a country (-1, ZA) or region (-2,
ZA-EC, ZA-NL) specifier, and then refine it. It just so happens that the region
more or less maps to the language spoken there (and let's assume that no other
languages are spoken in that region or country).

> For a slightly different example, I give you Koine Greek and Attic Greek.
> Linguist-List codes them as grc-koi & grc-att, respectively.
> ISO 639-2 code is GRC. ISO 639-3 is GRC. No ISO 639-1 code.
> 
> I wish all dialects/languages were as accommodating as:
> Glottolog lush1251
> ISO 639-1 none;
> ISO 639-2 none;
> ISO 639-3 LUT;
> ISO 639-3 SKA;
> ISO 639-3 SNO;
> ISO 639-3 SLH;
> (Note: AFAIK, there are no spell checkers or grammar checkers for those
> dialects, for any office suite.)

So also good examples - and I think the same applies

-       you get broad specifiers at the -1, -2 level.
-       you may get granular specifiers in -3 and -5 for the rarer/older
        languages.
-       for dialects and more refined pinpointing you hit the limits of 639(-5)
        and have two options: petition SIL/Library of Congress to add one (the
        above examples are all in scope), or rely on Glottolog.

and

-       using regional coding (3166) does not really help you, as those codes
        do not define language.

Pragmatically that means using an exact -3 if you have it (i.e. the exact 
language match); relying on the nearest ‘above’ -5 language family identifier 
when there is no -3 match to be had; and ONLY in the -5 case add whatever you 
can, e.g. the glottolog identifier, to refine it.
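
As a rough sketch of that rule in Python (the function name, the tuple layout
and the way missing codes are passed as None are assumptions made purely for
illustration; the example codes are the ones from this thread):

```python
from typing import Optional, Tuple


def pick_identifier(
    iso639_3: Optional[str],
    iso639_5: Optional[str],
    glottolog: Optional[str] = None,
) -> Tuple[str, str, Optional[str]]:
    """Return (standard, code, refinement) for a language or dialect.

    Hypothetical helper: an exact -3 match wins; otherwise fall back to the
    nearest -5 family/group code and only then attach a refinement.
    """
    if iso639_3:
        # Exact language match: no refinement is added.
        return ("iso-639-3", iso639_3.lower(), None)
    if iso639_5:
        # No -3 match: use the family/group code and refine it if we can.
        return ("iso-639-5", iso639_5.lower(), glottolog)
    raise ValueError("need at least an ISO 639-3 or ISO 639-5 code")


# Xhosa has its own -3 code, so the Glottolog id is not used.
print(pick_identifier("xho", "xho", "xhos1239"))
# Mpondo has no -3 code of its own: family/group plus Glottolog id.
print(pick_identifier(None, "xho", "mpon1252"))
```

Note that the refinement is deliberately dropped in the -3 case, so an exact
language match is never mixed with a second, possibly conflicting, identifier.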

And because -3 and -5 use similar identifiers for languages actually spoken
(xho) and the language group (xho) to which Mpondo belongs, the identifier you
expose should probably be something like


        iso-639-3:lang                  lang = alpha-3 language identifier
or
        iso-639-5:langgroup[:other]     langgroup = alpha-3 language families
                                                    and groups identifier
                                        other = optional identifier, taken
                                                from Glottolog when available

or something along those lines. And discourage -1 and 3166 use, though permit
it in :other if there is no Glottolog entry.
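
To make the proposed shape concrete, here is a minimal Python sketch of a
parser and formatter for such identifiers. The names (LangId, parse_lang_id,
format_lang_id) are made up for this example, and the exact syntax (lowercase
alpha-3 codes, no colons inside :other, no :other on the -3 form) is my reading
of the proposal above rather than anything agreed:

```python
import re
from typing import NamedTuple, Optional

# Assumed syntax: a scheme, a lowercase alpha-3 code, and an optional
# free-form refinement that contains no colon.
_PATTERN = re.compile(r"(iso-639-3|iso-639-5):([a-z]{3})(?::([^:]+))?")


class LangId(NamedTuple):
    standard: str
    code: str
    other: Optional[str]


def parse_lang_id(text: str) -> LangId:
    """Split an identifier such as 'iso-639-5:xho:mpon1252' into its parts."""
    m = _PATTERN.fullmatch(text.strip())
    if m is None:
        raise ValueError(f"not a valid identifier: {text!r}")
    standard, code, other = m.groups()
    if standard == "iso-639-3" and other is not None:
        # Only the -5 form carries a refinement in the proposal.
        raise ValueError("iso-639-3 identifiers take no ':other' part")
    return LangId(standard, code, other)


def format_lang_id(lang: LangId) -> str:
    """Inverse of parse_lang_id."""
    base = f"{lang.standard}:{lang.code}"
    return f"{base}:{lang.other}" if lang.other else base


print(parse_lang_id("iso-639-3:xho"))
print(parse_lang_id("iso-639-5:xho:mpon1252"))
```

The refinement is kept as an opaque string, so a 3166 value such as ZA-EC could
go there when there is no Glottolog entry.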

Dw.

