Hi Yegor, as far as I understand your issue, you just have to use data_coding=0x08 on the ESME side and encode the message body as UCS-2.
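A minimal sketch of that re-encoding in Python, in case it helps. It assumes the ESME (or a small filter in front of it) can rewrite the message body before submitting it. The function and constant names are hypothetical and are not part of Kannel or opensmppbox. Python's "utf-16-be" codec writes big-endian 16-bit units without a BOM, which is identical to UCS-2 for characters inside the Basic Multilingual Plane.

```python
DATA_CODING_UCS2 = 0x08  # SMPP data_coding value for UCS-2


def encode_ucs2_be(text: str) -> bytes:
    """Encode text as big-endian 16-bit code units without a BOM."""
    # UCS-2 cannot represent characters outside the BMP (they would need
    # UTF-16 surrogate pairs), so reject them instead of sending them.
    if any(ord(ch) > 0xFFFF for ch in text):
        raise ValueError("character outside the BMP is not valid UCS-2")
    return text.encode("utf-16-be")


def normalize_payload(raw: bytes) -> bytes:
    """Hypothetical helper: convert the two payload forms described in the
    quoted message (UTF-16 LE with a leading FF FE BOM, or single-byte
    ISO-8859-1) to UCS-2 BE."""
    if raw.startswith(b"\xff\xfe"):
        # Drop the 2-byte BOM, then decode the rest as little-endian.
        text = raw[2:].decode("utf-16-le")
    else:
        # Assumption: anything without a BOM is ISO-8859-1.
        text = raw.decode("iso-8859-1")
    return encode_ucs2_be(text)


if __name__ == "__main__":
    body = encode_ucs2_be("asdfÆäæë фテ")
    print(hex(DATA_CODING_UCS2), len(body), "octets")
```

The BOM check is only a heuristic: an ISO-8859-1 message that happens to start with the bytes FF FE ("ÿþ") would be misread, so a real filter should rely on the data_coding value the ESME sends rather than on the payload alone.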
Alex

> On 21.05.2015 at 12:06, Yegor Ivaschenko
> <yegor_ivasche...@exigenebit.com> wrote:
>
> Dear Aurel,
> Thank you for your quick reply.
> I am not really sure what I should do next: reply to you directly, only
> to users@kannel, or to both.
> Please teach me the procedure.
>
> Anyway, I did a lot of testing of my ESME's behaviour.
> It has a web interface with a message input box and a status bar that
> shows how many characters have been entered and the upper limit for the
> current message (i.e. < 5 / 140 >).
> As far as I know,
>
> UTF-16 (16-bit Unicode Transformation Format) is a character encoding
> capable of encoding all 1,112,064 possible characters in Unicode. The
> encoding is variable-length, as code points are encoded with one or two
> 16-bit code units. UTF-16 developed from an earlier fixed-width 16-bit
> encoding known as UCS-2 (for 2-byte Universal Character Set) once it
> became clear that a fixed-width 2-byte encoding could not encode enough
> characters to be truly universal.
> (http://en.wikipedia.org/wiki/UTF-16).
>
> So UCS-2 is always a fixed-width 16-bit encoding.
> Also, UCS-2 is always BE. This is what the wiki says:
>
> UCS-2 encoding is defined to be big-endian only. In practice most
> software defaults to little-endian,[citation needed] and handles a
> leading BOM to define the byte order just as in UTF-16. Although the
> similar designations UCS-2BE and UCS-2LE imitate the UTF-16 labels, they
> do not represent official encoding schemes.
> (http://en.wikipedia.org/wiki/UTF-16)
>
> Well, when I enter something like "asdfÆäæë " in my ESME (these are all
> characters from the ISO8859-1 table), the status bar shows (9/140).
> When I send it to opensmppbox, Wireshark analysis shows that it uses a
> one-byte-per-character encoding.
> So it can't be UCS-2 or UTF-16. It's definitely Latin-1 (or some other
> member of the ISO-8859 family).
>
> When I add a non-Latin-1 character (like Russian "ф" or Japanese "テ"
> (te)), as in "asdfÆäæë фテ", the status bar of the ESME's message input
> box changes the upper limit to 70. In the Wireshark dump I can see that
> the message sent to opensmppbox has a leading "FF FE" byte-order mark,
> which indicates LE encoding. Each character is encoded with 2 bytes.
> So again, it's not UCS-2 (which, according to the standard, has to be
> big-endian without a BOM). It's either UTF-16 LE or unofficial UCS-2.
>
> So my ESME actually uses two encoding schemes, depending on the message
> itself. If needed, I can provide dumps and logs.
>
> So, one more time: the SMSC needs pure UCS-2 (or UTF-16 BE without a
> BOM) only. My ESME sends messages using ISO-8859-(1) and UTF-16 LE.
> My question is: is there any way to convert all messages coming from the
> ESME to UTF-16 BE, by means of Kannel or some additional software paired
> with Kannel?
>
> By the way, can you tell me where the ticket concerning this problem is?
> Maybe I can help by providing further information such as Wireshark
> dumps, Kannel logs and configs, etc.
>
> Best regards,
> Ivashchenko Yegor
>
> From: Aurel Branzeanu <branzeanu.au...@gmail.com>
> Sent: 21 May 2015 0:54
> To: Yegor Ivaschenko
> CC: users@kannel.org
> Subject: Re: ESME - Opensmppbox - Kannel - SMSC Encoding problem
>
> Hello, Yegor!
>
> On Wed, May 20, 2015 at 4:30 PM, Yegor Ivaschenko
> <yegor_ivasche...@exigenebit.com> wrote:
> > My ESME, based on characters used in message use 2 encoding type.
>
> I think it does not use two encodings, but just UCS-2, whose Latin part
> matches ISO8859-1.
>
> There is, however, a very nasty bug when receiving messages to
> opensmppbox in UCS-2 (data_coding = 8) and using two alphabets, say,
> English and Russian. I will submit a bug report right now.
>
> --
> Sincerely yours,
>
> Aurel Branzeanu,
>
> mailto: branzeanu.au...@gmail.com
> Skype: tvorogov
> GSM Orange: +373 6 940-7700
> GSM Moldcell: +373 7 940-7700