Hi,
First we should agree on the final version and resolve some problems.
1. How to keep translations in memory.
IMHO it should be a single serialized item, and a hash item is a very
good choice because it can also be easily extended to keep additional
attributes. All translation strings will be indexed by string keys,
and all other attributes, such as the CP encoding or language name,
will be indexed by numbers. E.g.:
#define HB_I18N_CPID 1
#define HB_I18N_DOMAIN 2
[...]
hTrans[ HB_I18N_CPID ] := "PLISO"
I was thinking about:
{"LANGUAGE"=>"PL_PL", "CPID"=>"PLISO", "TABLE"=>{...}}
but using numeric keys is also possible.
2. How to keep translations in the final binary files.
hb_itemSerialize() seems to be the natural form.
I think hb_itemSerialize() is good for the table storage, but the
language and cpid could be stored in a header. A header containing at
least a signature and a file version would be a good idea. It can help
various tools identify the file without decoding the serialized data,
e.g., /etc/magic.
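To make the header idea concrete, here is a minimal C sketch. The signature bytes, the 8-byte size and the function names are only my assumptions for illustration; nothing of this is decided or exists in Harbour yet.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout: 4 signature bytes, 2 version bytes (little endian),
   2 reserved bytes. The serialized table would follow the header. */
#define I18N_HEADER_SIZE 8

static const unsigned char s_sig[4] = { 'H', 'B', 'I', 'N' };

/* Writes the header into buf, which must hold I18N_HEADER_SIZE bytes. */
void i18n_header_write(unsigned char *buf, uint16_t version)
{
    memcpy(buf, s_sig, sizeof s_sig);
    buf[4] = (unsigned char)(version & 0xFF);
    buf[5] = (unsigned char)(version >> 8);
    buf[6] = 0;   /* reserved */
    buf[7] = 0;   /* reserved */
}

/* Returns the file version, or -1 if buf does not start with a valid header.
   No serialized data has to be decoded for this check. */
int i18n_header_check(const unsigned char *buf, size_t len)
{
    if (len < I18N_HEADER_SIZE || memcmp(buf, s_sig, sizeof s_sig) != 0)
        return -1;
    return buf[4] | (buf[5] << 8);
}
```

A tool only needs the first 8 bytes of a file to call i18n_header_check(), which is what makes a fixed signature convenient for /etc/magic style detection.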
3. We should add an API to easily create/update translation tables.
It does not have to be optimized for speed and can be written
as .prg code so programmers can easily create a user interface for
translations in their programs. We can also create such a program
as a new hbtool, e.g. hbi18n, so it can be used externally to the
application.
To avoid hardcoding the internal representation in such programs we
will need functions to initialize a new translation set, add a new
translation item and extract the final set as an item or string.
OK. I think the API for creating/updating translations is the easiest part.
4. We should add support for loading .pot files directly and creating
translation tables from them.
Sure.
5. For easy use it is necessary to have a function in the core code that
works like printf(). Otherwise it will be hard to create strings for
translation which are not context dependent.
The basic escape sequences of such a function, like %s or %d, should be
compatible with C so we can use dedicated tools to operate on .pot
files. Anyhow, for better flexibility we should add support for
stringifying any item and formatting its size. E.g. %s should also work
for numeric, date and logical items. In general it should be possible to
create formatting similar to that given by transform and the picture
clause.
We can add support for passing a picture clause directly in the
formatted string or as a parameter. We can also add some additional
extensions.
The function name is less important. We can call it hb_strFormat().
It's time to implement it.
Yes. We need it. The question is whether we have to be C compatible or
need to implement our own Harbour-specific type specifiers.
A simple implementation could be:
%d = LTRIM(STR(nValue)) or LTRIM(STR(ROUND(nValue, 0))) ???
%s = cValue or TRIM(cValue) ???
but we will always want some new extensions. I still cannot decide
what the specifiers should be. Do you have any ideas about it?
From the i18n point of view a very important thing is the %1$d
extension. It is not related to the specifier type problem, so we need
to implement it for sure.
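To show why %1$d matters for translators, here is a minimal C sketch of a formatter with positional arguments. The names str_format and fmt_arg are hypothetical, and this is not a proposal for the final hb_strFormat(). It assumes only %s, %d, %N$s, %N$d and %% are supported (no widths or flags), and that a value is rendered according to its own type, so %s also accepts a number.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical argument holder: 's' = string, 'd' = integer. */
typedef struct {
    char        type;
    const char *s;
    long        d;
} fmt_arg;

/* Formats fmt into out (at most outlen bytes including the NUL). */
char *str_format(char *out, size_t outlen, const char *fmt,
                 const fmt_arg *args, int nargs)
{
    size_t pos = 0;
    int next = 0;                       /* next non-positional argument */

    if (outlen == 0)
        return out;

    while (*fmt && pos + 1 < outlen) {
        int idx, positional = 0, w;
        const char *p;
        char conv;

        if (*fmt != '%') { out[pos++] = *fmt++; continue; }
        fmt++;                          /* skip '%' */
        if (*fmt == '%') { out[pos++] = *fmt++; continue; }

        /* Optional "N$" prefix selects argument N (1-based). */
        idx = next;
        p = fmt;
        if (isdigit((unsigned char) *p)) {
            int n = 0;
            while (isdigit((unsigned char) *p))
                n = n * 10 + (*p++ - '0');
            if (*p == '$') { idx = n - 1; fmt = p + 1; positional = 1; }
        }

        conv = *fmt;
        if (conv == '\0')
            break;
        fmt++;
        if (!positional)
            next++;
        if ((conv != 's' && conv != 'd') || idx < 0 || idx >= nargs)
            continue;                   /* unsupported or missing: skip */

        if (args[idx].type == 's')
            w = snprintf(out + pos, outlen - pos, "%s",
                         args[idx].s ? args[idx].s : "");
        else
            w = snprintf(out + pos, outlen - pos, "%ld", args[idx].d);
        if (w < 0)
            break;
        /* snprintf reports the untruncated length; clamp to what fits. */
        pos += (size_t) w < outlen - pos ? (size_t) w : outlen - pos - 1;
    }
    out[pos] = '\0';
    return out;
}
```

With the arguments ( "report.txt", 3 ), the format "%2$d files in %1$s" gives "3 files in report.txt", so a translator can reorder the values without touching the program code.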
6. We should decide if we want to add support for plural form translations.
Currently we do not support it. In many cases we can live without it,
but sometimes it's useful. It will be important for the final
translation representation, though it seems that if we use hashes then
it can be added later. We can simply add translations for plural forms
as additional hash item attributes.
In the Lithuanian language plural forms do not always solve the problem,
because we need more than 2 word endings. AFAIK, Polish translations
have the same problem. I guess we can live without plurals.
8. The C interface should allow the use of different low-level
implementations, so if someone wants to use the real gettext API then
he can register his own wrappers.
Do you mean just overloading the i18n C functions from another module,
or some more complex way to plug in an alternative gettext?
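If it is the former, this is roughly how I understand "registering wrappers", as a minimal C sketch. All names here (i18n_set_backend, i18n_gettext, demo_backend) are hypothetical and not an existing Harbour API; demo_backend only stands in for a real wrapper, which could for example call dgettext() from libintl.

```c
#include <stddef.h>
#include <string.h>

/* Signature every low-level implementation would have to provide. */
typedef const char *(*i18n_gettext_func)(const char *domain,
                                         const char *msgid);

/* Built-in fallback: no translation table, return msgid unchanged. */
static const char *default_gettext(const char *domain, const char *msgid)
{
    (void) domain;
    return msgid;
}

static i18n_gettext_func s_gettext = default_gettext;

/* Installs a replacement backend and returns the previous one.
   Passing NULL restores the built-in fallback. */
i18n_gettext_func i18n_set_backend(i18n_gettext_func func)
{
    i18n_gettext_func prev = s_gettext;
    s_gettext = func ? func : default_gettext;
    return prev;
}

/* The only entry point the rest of the code would call. */
const char *i18n_gettext(const char *domain, const char *msgid)
{
    return s_gettext(domain, msgid);
}

/* Demo backend with a single hardcoded Polish translation. */
const char *demo_backend(const char *domain, const char *msgid)
{
    (void) domain;
    return strcmp(msgid, "Yes") == 0 ? "Tak" : msgid;
}
```

After i18n_set_backend( demo_backend ) every call to i18n_gettext() goes through the registered wrapper, and the caller does not need to know which implementation is active.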
9. We should add support for automatic CP translation of output strings.
Otherwise we will end up with many different lang modules for different
encodings, like in the msg*.c files. There is a question whether we
also want to add translation for input strings, but I do not think it's
very important. We can leave it open for future decisions.
According to http://www.gnu.org/software/gettext/manual/gettext.html
11.2.4: Note that the msgid argument to gettext is not subject to
character set conversion. Also, when gettext does not find a translation
for msgid, it returns msgid unchanged – independently of the current
output character set. It is therefore recommended that all msgids be
US-ASCII strings.
So, I think we do not need to convert input strings.
7. We should give a more precise meaning to domain names in our implementation.
I think that using them as the language ID is quite a good idea.
10. We should decide about global settings which will control the translation
module:
- default domain/language
- default path to translation files
If we want to make them thread local then they should be controlled by
a _SET_* structure. In that case they will be inherited by child threads.
According to the gettext manual, a domain is used as a synonym for a
package or library. The default domain is "messages". My knowledge
of the original gettext is only theoretical. All I know is
http://www.gnu.org/software/gettext/manual/gettext.html, but I've never
used it in real life. So, it is hard for me to say if we can mix domain
and language. They are different things in gettext.
BTW, in my application I had more need for context than for domain,
because sometimes the same word needs different translations, e.g.,
"Exit" has a different translation depending on its meaning: "An
exit" or "To exit".
I'm currently using my own hackish context implementation, but it would
be nice if Harbour's i18n supported contexts.
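One cheap way to get contexts without changing the table structure is to make the context a part of the lookup key. As far as I know, gettext itself glues the context and the msgid together with an EOT byte (\004) in its compiled catalogs. A minimal C sketch of that idea, where the function name i18n_ctx_key is hypothetical:

```c
#include <stddef.h>
#include <string.h>

#define I18N_CTX_GLUE '\004'   /* separator between context and msgid */

/* Builds "context\004msgid" (or just "msgid" when ctx is NULL or empty)
   in out. Returns the key length, or -1 if out is too small. */
int i18n_ctx_key(char *out, size_t outlen, const char *ctx,
                 const char *msgid)
{
    size_t clen = ctx ? strlen(ctx) : 0;
    size_t mlen = strlen(msgid);
    size_t need = (clen ? clen + 1 : 0) + mlen + 1;   /* includes NUL */

    if (need > outlen)
        return -1;
    if (clen) {
        memcpy(out, ctx, clen);
        out[clen] = I18N_CTX_GLUE;
        out += clen + 1;
    }
    memcpy(out, msgid, mlen + 1);                     /* copies the NUL too */
    return (int) (need - 1);
}
```

With this, "Exit" in the context "noun" and "Exit" in the context "verb" become two different keys in the same hash, and strings without a context keep their plain msgid as the key.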
I can implement the base C code once I hear your opinion about the
above points.
It would be good if you or someone else could work on hb_strFormat() and
the hbi18n tool for creating translations.
I'll try to do as much as possible. Implementing the API to create/edit
translation tables is not a problem. I only doubt my ability to
implement the final tool, because I have always used my own GUI library,
and I've never tried to do browse() or @ 1, 1 SAY "Hello".
Best regards,
Mindaugas