Marco van de Voort asked me to reuse the extensive functionality of the charset 
unit (see his comment on the bugtracker), so I have extended my test application. 
Here are the results:

...

ISO-8859-1 >> UTF-8 using LConvEncoding ¦ Input string has 256 characters.
---------------------------------------------------------------------------
Evaluating LConvEncoding.ConvertEncoding(string,iso88591,utf8):string           
                     100000 times, Time: 0,312 [s] : Result is correct.
Evaluating LConvEncoding.ISO_8859_1ToUTF8(string):string                        
                     100000 times, Time: 0,249 [s] : Result is correct.

ISO-8859-1 >> UTF-8 using Charset ¦ Input string has 256 characters.
---------------------------------------------------------------------
Charset does not support conversions to UTF8, using utf8-unit for that
Evaluating utf8.UnicodeToUTF8(Charset.getunicode(string,iso88591)):string   
                     100000 times, Time: 2,480 [s] : Result is correct.

ISO-8859-1 >> UTF-8 using Codepages ¦ Input string has 256 characters.
-----------------------------------------------------------------------
Evaluating DirectConversion(string,chEncISO-8859-1,chEncUTF-8):string         
                     100000 times, Time: 0,187 [s] : Result is correct.
Evaluating ConvertToUTF8(string,chEncISO-8859-1):string                         
                     100000 times, Time: 0,234 [s] : Result is correct.

...

ISO-8859-1 >> UTF-16 using Charset ¦ Input string has 256 characters.
----------------------------------------------------------------------
Charset does not support conversions to UTF16, using utf16-unit for that
Evaluating utf8.UnicodeToUTF16(Charset.getunicode(string,iso88591)):widestring  
                     100000 times, Time: 7,847 [s] : Result is correct.

ISO-8859-1 >> UTF-16 using Codepages ¦ Input string has 256 characters.
------------------------------------------------------------------------
Evaluating DirectConversion(string,chEncISO-8859-1,chEncUTF-16):widestring      
                     100000 times, Time: 0,203 [s] : Result is correct.

ISO-8859-2 >> UTF-16 using Charset ¦ Input string has 256 characters.
----------------------------------------------------------------------
Charset does not support conversions to UTF16, using utf16-unit for that
Evaluating utf8.UnicodeToUTF16(Charset.getunicode(string,iso88592)):widestring  
                     100000 times, Time: 7,831 [s] : Result is correct.

ISO-8859-2 >> UTF-16 using Codepages ¦ Input string has 256 characters.
------------------------------------------------------------------------
Evaluating DirectConversion(string,chEncISO-8859-2,chEncUTF-16):widestring      
                     100000 times, Time: 0,219 [s] : Result is correct.

....

ISO-8859-1 >> ISO-8859-2 using LConvEncoding ¦ Input string has 256 characters.
--------------------------------------------------------------------------------
Evaluating LConvEncoding.ConvertEncoding(string,iso88591,iso88592):string       
                     100000 times, Time: 0,873 [s]

ISO-8859-1 >> ISO-8859-2 using Charset ¦ Input string has 256 characters.
----------------------------------------------------------------------------
Evaluating Charset.getascii(Charset.getunicode(string,iso88591),iso88592):string
                     100000 times, Time: 9,079 [s]

ISO-8859-1 >> ISO-8859-2 using Codepages ¦ Input string has 256 characters.
----------------------------------------------------------------------------
Evaluating DirectConversion(string,chEncISO-8859-1,chEncISO-8859-2):string      
                     100000 times, Time: 0,218 [s]

....

SHIFT_JIS >> UTF-8 using LConvEncoding ¦ Input string has 14843 characters.
----------------------------------------------------------------------------
Evaluating LConvEncoding.ConvertEncoding(string,cp932,utf8):string              
                     1000 times, Time: 24,321 [s]
  Length(Result)=22078 Length(Reference)=22173 : 79 characters are different.
Evaluating LConvEncoding.CP932ToUTF8(string):string                             
                     1000 times, Time: 24,414 [s]
  Length(Result)=22078 Length(Reference)=22173 : 79 characters are different.

SHIFT_JIS >> UTF-8 using Charset ¦ Input string has 14843 characters.
----------------------------------------------------------------------
Charset does not support conversions to UTF8, using utf8-unit for that
Evaluating utf8.UnicodeToUTF8(Charset.getunicode(string,cp932)):string          
                     1000 times, Time: 1,560 [s]
  Length(Result)=39233 Length(Reference)=22173 : 21798 characters are different.

SHIFT_JIS >> UTF-8 using Codepages ¦ Input string has 14843 characters.
------------------------------------------------------------------------
Evaluating DirectConversion(string,chEncCP932,chEncUTF-8):string                
                     1000 times, Time: 0,234 [s] : Result is correct.
Evaluating ConvertToUTF8(string,chEncCP932):string                              
                     1000 times, Time: 0,218 [s] : Result is correct.
Evaluating CP932ToUTF8(string):string                                           
                     1000 times, Time: 0,218 [s] : Result is correct.


Hmm, the SHIFT_JIS >> UTF-8 conversion using the Charset unit ended up as a 
complete mess. The reason is that charset, for all its extensive functionality, 
has no means of converting double-byte charsets to Unicode. :(
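
To make the double-byte problem concrete, here is a small sketch in Python (not the Pascal units tested above; Python's built-in codecs only stand in for them). It assumes a converter that looks up every input byte on its own in a 256-entry table, which is one plausible way such garbling can arise; I have not verified that charset fails in exactly this way. The sample text and the use of latin-1 as the stand-in table are choices made for this illustration.

```python
# Illustration only: Python codecs stand in for the Pascal conversion units.
# Sample text (an assumption of this sketch): three kanji, U+65E5 U+672C U+8A9E.
text = "\u65e5\u672c\u8a9e"
sjis = text.encode("cp932")          # each of these characters takes two bytes

# Double-byte aware decoding: lead byte and trail byte are consumed together.
good = sjis.decode("cp932").encode("utf-8")

# Byte-by-byte decoding: every byte is treated as a character of its own.
# latin-1 is used here as a stand-in for a single-byte lookup table.
bad = sjis.decode("latin-1").encode("utf-8")

print("input bytes      :", len(sjis))
print("correct UTF-8 len:", len(good))
print("garbled UTF-8 len:", len(bad))
print("characters seen  :", len(good.decode("utf-8")), "vs", len(bad.decode("utf-8")))
```

The byte-by-byte variant sees twice as many characters as the input really contains, so its UTF-8 output no longer matches the reference in either length or content.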

The complete test results are in the attachment.

I will publish the test program on the bugtracker.
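
Until then, the general shape of such a measurement can be sketched as follows. This is Python, not the actual Pascal test program, and the function name `benchmark`, the repeat count and the 256-byte input are assumptions made for this sketch: run one conversion many times, measure the elapsed time, and compare the last result against a reference.

```python
import time

def benchmark(convert, data, reference, repeats):
    """Run convert(data) `repeats` times (repeats >= 1).

    Returns (elapsed_seconds, mismatch_count) for the last result.
    """
    start = time.perf_counter()
    for _ in range(repeats):
        result = convert(data)
    elapsed = time.perf_counter() - start
    # Count differing positions, plus any difference in length.
    mismatches = sum(a != b for a, b in zip(result, reference))
    mismatches += abs(len(result) - len(reference))
    return elapsed, mismatches

# Assumed input for this sketch: all 256 byte values, read as ISO-8859-1.
data = bytes(range(256))
reference = data.decode("iso-8859-1").encode("utf-8")

elapsed, mismatches = benchmark(
    lambda b: b.decode("iso-8859-1").encode("utf-8"), data, reference, 1000)

verdict = ("Result is correct." if mismatches == 0
           else f"{mismatches} characters are different.")
print(f"1000 times, Time: {elapsed:.3f} [s] : {verdict}")
```

The absolute times from such a sketch are not comparable with the figures in this mail; only the structure of the measurement is.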

Greetings


OS Name Microsoft Windows 7 Home Premium
Version 6.1.7600 Build 7600
System Manufacturer     Gigabyte Technology Co., Ltd.
System Model            GA-870A-UD3
System Type             x64-based PC
Processor               AMD Phenom(tm) II X6 1090T Processor, 3200 Mhz, 6 Core(s), 6 Logical Processor(s)
BIOS Version/Date       Award Software International, Inc. F1, 15.04.2010
Total Physical Memory   8,00 GB
Total Virtual Memory    14,0 GB

Testing character conversion.
System is little endian
Using half translation tables.
Using UTF8-translation tables.

ISO-8859-1 >> UTF-8 using LConvEncoding ¦ Input string has 256 characters.
---------------------------------------------------------------------------
Evaluating LConvEncoding.ConvertEncoding(string,iso88591,utf8):string           
                     100000 times, Time: 0,312 [s] : Result is correct.
Evaluating LConvEncoding.ISO_8859_1ToUTF8(string):string                        
                     100000 times, Time: 0,249 [s] : Result is correct.

ISO-8859-1 >> UTF-8 using Charset ¦ Input string has 256 characters.
---------------------------------------------------------------------
Charset does not support conversions to UTF8, using utf8-unit for that
Evaluating utf8.UnicodeToUTF8(Charset.getunicode(string,iso88591)):string       
                     100000 times, Time: 2,480 [s] : Result is correct.

ISO-8859-1 >> UTF-8 using Codepages ¦ Input string has 256 characters.
-----------------------------------------------------------------------
Evaluating DirectConversion(string,chEncISO-8859-1,chEncUTF-8):string           
                     100000 times, Time: 0,187 [s] : Result is correct.
Evaluating ConvertToUTF8(string,chEncISO-8859-1):string                         
                     100000 times, Time: 0,234 [s] : Result is correct.
Evaluating ISO_8859_1ToUTF8(string):string                                      
                     100000 times, Time: 0,234 [s] : Result is correct.
Evaluating DirectConversion(string,chEncISO-8859-1,chEncUTF-8):widestring       
                     100000 times, Time: 0,250 [s] : Result is correct.
Evaluating DirectConversion(string,chEncISO-8859-1,chEncUTF-16):widestring      
                     100000 times, Time: 0,202 [s] : Result is correct.

ISO-8859-1 >> UTF-16 using Charset ¦ Input string has 256 characters.
----------------------------------------------------------------------
Charset does not support conversions to UTF16, using utf16-unit for that
Evaluating utf8.UnicodeToUTF16(Charset.getunicode(string,iso88591)):widestring  
                     100000 times, Time: 7,847 [s] : Result is correct.

ISO-8859-1 >> UTF-16 using Codepages ¦ Input string has 256 characters.
------------------------------------------------------------------------
Evaluating DirectConversion(string,chEncISO-8859-1,chEncUTF-16):widestring      
                     100000 times, Time: 0,203 [s] : Result is correct.

ISO-8859-2 >> UTF-16 using Charset ¦ Input string has 256 characters.
----------------------------------------------------------------------
Charset does not support conversions to UTF16, using utf16-unit for that
Evaluating utf8.UnicodeToUTF16(Charset.getunicode(string,iso88592)):widestring  
                     100000 times, Time: 7,831 [s] : Result is correct.

ISO-8859-2 >> UTF-16 using Codepages ¦ Input string has 256 characters.
------------------------------------------------------------------------
Evaluating DirectConversion(string,chEncISO-8859-2,chEncUTF-16):widestring      
                     100000 times, Time: 0,219 [s] : Result is correct.

ISO-8859-2 >> UTF-8 using LConvEncoding ¦ Input string has 256 characters.
---------------------------------------------------------------------------
Evaluating LConvEncoding.ConvertEncoding(string,iso88592,utf8):string           
                     100000 times, Time: 0,297 [s] : Result is correct.
Evaluating LConvEncoding.ISO_8859_2ToUTF8(string):string                        
                     100000 times, Time: 0,250 [s] : Result is correct.

ISO-8859-2 >> UTF-8 using Codepages ¦ Input string has 256 characters.
-----------------------------------------------------------------------
Evaluating DirectConversion(string,chEncISO-8859-2,chEncUTF-8):string           
                     100000 times, Time: 0,203 [s] : Result is correct.
Evaluating ConvertToUTF8(string,chEncISO-8859-2):string                         
                     100000 times, Time: 0,234 [s] : Result is correct.
Evaluating ISO_8859_2ToUTF8(string):string                                      
                     100000 times, Time: 0,234 [s] : Result is correct.
Evaluating DirectConversion(string,chEncISO-8859-2,chEncUTF-8):widestring       
                     100000 times, Time: 0,281 [s] : Result is correct.
Evaluating DirectConversion(string,chEncISO-8859-2,chEncUTF-16):widestring      
                     100000 times, Time: 0,203 [s] : Result is correct.

ISO-8859-1 >> ISO-8859-2 using LConvEncoding ¦ Input string has 256 characters.
--------------------------------------------------------------------------------
Evaluating LConvEncoding.ConvertEncoding(string,iso88591,iso88592):string       
                     100000 times, Time: 0,873 [s]

ISO-8859-1 >> ISO-8859-2 using Charset ¦ Input string has 256 characters.
----------------------------------------------------------------------------
Evaluating Charset.getascii(Charset.getunicode(string,iso88591),iso88592):string
                     100000 times, Time: 9,079 [s]

ISO-8859-1 >> ISO-8859-2 using Codepages ¦ Input string has 256 characters.
----------------------------------------------------------------------------
Evaluating DirectConversion(string,chEncISO-8859-1,chEncISO-8859-2):string      
                     100000 times, Time: 0,218 [s]

UTF-16 >> UTF-16BE using Codepages ¦ Input string has 256 characters.
----------------------------------------------------------------------
Evaluating DirectConversion(widestring,chEncUTF-16,chEncUTF-16BE):widestring    
                     100000 times, Time: 0,234 [s] : Result is correct.

UTF-16 >> UTF-8 using Codepages ¦ Input string has 256 characters.
-------------------------------------------------------------------
Evaluating DirectConversion(widestring,chEncUTF-16,chEncUTF-8):string           
                     100000 times, Time: 0,203 [s] : Result is correct.


SHIFT_JIS >> UTF-8 using LConvEncoding ¦ Input string has 14843 characters.
----------------------------------------------------------------------------
Evaluating LConvEncoding.ConvertEncoding(string,cp932,utf8):string              
                     1000 times, Time: 24,321 [s]
  Length(Result)=22078 Length(Reference)=22173 : 79 characters are different.
Evaluating LConvEncoding.CP932ToUTF8(string):string                             
                     1000 times, Time: 24,414 [s]
  Length(Result)=22078 Length(Reference)=22173 : 79 characters are different.

SHIFT_JIS >> UTF-8 using Charset ¦ Input string has 14843 characters.
----------------------------------------------------------------------
Charset does not support conversions to UTF8, using utf8-unit for that
Evaluating utf8.UnicodeToUTF8(Charset.getunicode(string,cp932)):string          
                     1000 times, Time: 1,560 [s]
  Length(Result)=39233 Length(Reference)=22173 : 21798 characters are different.

SHIFT_JIS >> UTF-8 using Codepages ¦ Input string has 14843 characters.
------------------------------------------------------------------------
Evaluating DirectConversion(string,chEncCP932,chEncUTF-8):string                
                     1000 times, Time: 0,234 [s] : Result is correct.
Evaluating ConvertToUTF8(string,chEncCP932):string                              
                     1000 times, Time: 0,218 [s] : Result is correct.
Evaluating CP932ToUTF8(string):string                                           
                     1000 times, Time: 0,218 [s] : Result is correct.

UTF-8 >> SHIFT_JIS using LConvEncoding ¦ Input string has 22173 characters.
----------------------------------------------------------------------------
Evaluating LConvEncoding.ConvertEncoding(string,utf8,cp932):string              
                     1000 times, Time: 24,367 [s] : 370 characters are different.
Evaluating LConvEncoding.UTF8ToCp932(string):string                             
                     1000 times, Time: 24,399 [s] : 370 characters are different.

UTF-8 >> SHIFT_JIS using Codepages ¦ Input string has 22173 characters.
------------------------------------------------------------------------
Evaluating DirectConversion(string,chEncUTF-8,chEncCP932):string                
                     1000 times, Time: 0,328 [s] : Result is correct.
Evaluating UTF8ToCp932(string):string                                           
                     1000 times, Time: 0,344 [s] : Result is correct.
--
_______________________________________________
Lazarus mailing list
[email protected]
http://lists.lazarus.freepascal.org/mailman/listinfo/lazarus
