[fpc-pascal] Unicode filenames
How does the RTL support using Unicode filenames (e.g. file names that cannot be represented by the ANSI char set)?

For example the FileExists function takes a string which is encoded in the system char set. If the system char set is UTF-8, like on most Linux systems and Mac OS X, then there is no problem. But on Windows using a Western European charset, I cannot check for the existence of a file with Cyrillic characters, even though I can enter them in Windows Explorer and create such files.

Vincent
___
fpc-pascal maillist - fpc-pascal@lists.freepascal.org
http://lists.freepascal.org/mailman/listinfo/fpc-pascal
Re: [fpc-pascal] small bug with pass-by-reference on array elem
On 28 Jun 2008, at 23:55, David Emerson wrote:

> {$mode objfpc}
>
> procedure assign_it (out x : longint);
> begin
>   x := 7;
> end;
>
> var
>   a : array [0..1] of longint;
>   b : longint;
>
> begin
>   assign_it (b);    // no warning
>   assign_it (a[1]); // errant warning issued
> end.

This bug has already been fixed and the fix will be in the next release.

Jonas

PS: please do not send new messages to the list by replying to older ones, because that puts your new message in the same thread as the older messages in mail clients which are configured to group threads (mail clients add a header referring to the message ID of the message you are replying to for exactly this purpose).
Re: [fpc-pascal] Unicode filenames
On 29 Jun 2008, at 09:27, Vincent Snijders wrote:

> How does the RTL support using Unicode filenames (e.g. file names that
> cannot be represented by the ANSI char set)?

It doesn't.

Jonas
Re: [fpc-pascal] Unicode filenames
On Sunday 29 June 2008 09.27:24 Vincent Snijders wrote:

> How does the RTL support using Unicode filenames (e.g. file names that
> cannot be represented by the ANSI char set)?
>
> For example the FileExists function takes a string which is encoded in
> the system char set. If the system char set is UTF-8, like on most Linux
> systems and Mac OS X, then there is no problem. But on Windows using a
> Western European charset, I cannot check for the existence of a file with
> Cyrillic characters, even though I can enter them in Windows Explorer and
> create such files.

In MSEgui I use my own set of widestring-based file utilities to overcome the problem. They are located in lib/common/msefileutils.pas and the system-specific msesysintf.pas. The Windows version currently converts the MSEgui widestring filenames to the system encoding before doing system calls. I plan to call the *W versions of the system routines instead, where available (post version 1.8).

Martin
Re: [fpc-pascal] Unicode filenames
On 29 Jun 2008, at 09:43, Jonas Maebe wrote:

> On 29 Jun 2008, at 09:27, Vincent Snijders wrote:
>
>> How does the RTL support using Unicode filenames (e.g. file names that
>> cannot be represented by the ANSI char set)?
>
> It doesn't.

See also http://bugs.freepascal.org/view.php?id=7863

Jonas
Re: [fpc-pascal] Unicode filenames
> How does the RTL support using Unicode filenames (e.g. file names that
> cannot be represented by the ANSI char set)?

As said on IRC, AFAIK decisions about this have been continuously postponed. There are some problems:

- If the boundary condition that w9x must remain supported persists, this becomes an order of magnitude more work. A way to deal with this has to be found (two win32 FPC releases, one advocated as D2..D2006 compatible + w9x, one as NT + Unicode/Tiburon?)
- Also, I have some doubts that using two different encodings is a good thing for a portable compiler, so a decision has to be made about that too (e.g. always support UTF-8 and allow overloading with UTF-16 if the platform defaults to it)
- Do we have a non-COM widestring on Windows?
- Tiburon compatibility. It doesn't make sense to roll our own slightly incompatible scheme. At least we should have a look at it before we decide whether we support it (if it can be folded into the multi-platform vision and it is sane from a native perspective)

What are the exact plans of Lazarus in this? Is there some wiki page with how Lazarus plans to tackle this with all multi-platform concerns?

Btw, I saw that Cygwin is giving up 9x support in the next release.
Re: [fpc-pascal] Unicode filenames
Marco van de Voort wrote:

> What are the exact plans of Lazarus in this? Is there some wiki page with
> how Lazarus plans to tackle this with all multi-platform concerns?

All strings in the LCL are UTF-8, see
http://wiki.lazarus.freepascal.org/LCL_Unicode_Support

For Windows this means all strings in the LCL are converted to ANSI if the OS is win9x, and to widestring if the OS is NT or higher.

See in particular:
http://wiki.lazarus.freepascal.org/LCL_Unicode_Support#Dealing_with_directory_and_filenames

IMHO there should be a better solution than converting file and directory names to/from ANSI. I think this is the Achilles heel of the Lazarus Unicode support: it depends on an RTL that has insufficient Unicode support.

Vincent
Re: [fpc-pascal] Unicode filenames
> Marco van de Voort wrote:
> > What are the exact plans of Lazarus in this? Is there some wiki page with
> > how Lazarus plans to tackle this with all multi-platform concerns?
>
> All strings in the LCL are UTF-8, see
> http://wiki.lazarus.freepascal.org/LCL_Unicode_Support
>
> For Windows this means all strings in the LCL are converted to ANSI if
> the OS is win9x, and to widestring if the OS is NT or higher.
>
> See in particular:
> http://wiki.lazarus.freepascal.org/LCL_Unicode_Support#Dealing_with_directory_and_filenames
>
> IMHO there should be a better solution than converting file and
> directory names to/from ANSI. I think this is the Achilles heel of the
> Lazarus Unicode support: it depends on an RTL that has insufficient
> Unicode support.

Yes. And with the above question I meant more what you want long term, and the reasons that we can't see (widgetset related, db components related), not which hacks you employ now to work around that :-)

If we want to make an informed decision, we have to put all requirements on the table, e.g.

- Base principle for me: requiring too much hand coding is not desirable. In an application (not RTL/FCL) I don't want to have to insert manual conversions for each string operation and/or parameter passing. Some of this must be automated; it is the Delphi way IMHO.
- Which encoding(s) to support (UTF-8 and/or UTF-16 mostly).
- The and/or in the question above: one primary encoding or two? If you have just one, you have to convert on some OSes to access the API and widgetset, and the translated headers for that OS must be redone with glue code doing the transforms. If you have two, each general-purpose string routine must be implemented twice (or face conversion chaos).
- Keep one Windows release per Windows target (win32, win64) or have two (ANSI + w9x compatible, Unicode + NT only)? This is because if we significantly step up Unicode API use, keeping runtime compatibility with w9x will require a lot of hackish code. Split them up, and you just have a few Unicode vs ANSI include files, and way less glue code to make mistakes in.
- Do we (long term) let UTF-8 piggyback on ansistring, or do we have a distinct type for it, so that the compiler knows that the type is UTF-8 (and consequently that ansistring isn't)? I have a feeling that this might be required, even if we decide on UTF-8 as the universal encoding, to avoid having to add too many checks and hand transformations.
Re: [fpc-pascal] Unicode filenames
2008/6/29 Vincent Snijders:
> How does the RTL support using Unicode filenames (e.g. file names that
> cannot be represented by the ANSI char set)?
>
> For example the FileExists function takes a string which is encoded in the
> system char set. If the system char set is UTF-8, like on most Linux systems
> and Mac OS X, then there is no problem. But on Windows using a Western
> European charset, I cannot check for the existence of a file with Cyrillic
> characters, even though I can enter them in Windows Explorer and create
> such files.

In fpGUI we use UTF-8 for everything. We have wrapper file access functions which replace the RTL ones. The unit is called gfx_utils.pas, e.g.:

  function fpgFileExists(const FileName: TfpgString): Boolean;
  begin
    Result := FileExists(fpgToOSEncoding(FileName));
  end;

fpgToOSEncoding() is then implemented in platform-dependent include files (like FPC also does with many functions).

Linux & *BSD X11:
---
  // yes we assume UTF-8. Only very old Linux versions don't use UTF-8, but that is very rare now.
  function fpgToOSEncoding(aString: TfpgString): string;
  begin
    Result := aString;
  end;

Windows GDI:
---
  function fpgToOSEncoding(aString: TfpgString): string;
  begin
    Result := Utf8ToAnsi(aString);
  end;

Regards,
 - Graeme -
___
fpGUI - a cross-platform Free Pascal GUI toolkit
http://opensoft.homeip.net/fpgui/
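As a point of comparison with the wrapper above, here is a minimal sketch of an existence check that bypasses the ANSI code page on Windows by calling the wide Win32 entry point directly. It is not fpGUI or RTL code: the name Utf8FileExists and the constant InvalidAttrs are hypothetical, and the non-Windows branch simply assumes a UTF-8 file system encoding, as the Linux include file above does.

```pascal
program WideFileExistsSketch;
{$mode objfpc}{$H+}

uses
  {$IFDEF WINDOWS}Windows,{$ENDIF}
  SysUtils;

{ Hypothetical helper, not part of fpGUI or the RTL.
  AFileName is expected to be UTF-8 encoded. }
function Utf8FileExists(const AFileName: string): Boolean;
{$IFDEF WINDOWS}
const
  // Value GetFileAttributesW returns on failure (INVALID_FILE_ATTRIBUTES).
  InvalidAttrs = DWORD($FFFFFFFF);
var
  WideName: WideString;
  Attrs: DWORD;
begin
  // Convert UTF-8 to UTF-16 instead of to the ANSI code page.
  WideName := UTF8Decode(AFileName);
  Attrs := GetFileAttributesW(PWideChar(WideName));
  // Treat directories as "not a file", like FileExists does.
  Result := (Attrs <> InvalidAttrs) and
            ((Attrs and FILE_ATTRIBUTE_DIRECTORY) = 0);
end;
{$ELSE}
begin
  // Assumption: the file system encoding is UTF-8, so no conversion is needed.
  Result := FileExists(AFileName);
end;
{$ENDIF}

begin
  WriteLn(Utf8FileExists('no-such-file-7f3a9c.tmp'));
end.
```

Note that the wide entry points are only dependable on NT-based Windows, so a sketch like this does not address the w9x concern raised earlier in the thread.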
Re: [fpc-pascal] Unicode filenames
Graeme Geldenhuys wrote:

> 2008/6/29 Vincent Snijders:
>> How does the RTL support using Unicode filenames (e.g. file names that
>> cannot be represented by the ANSI char set)?
>>
>> For example the FileExists function takes a string which is encoded in the
>> system char set. If the system char set is UTF-8, like on most Linux systems
>> and Mac OS X, then there is no problem. But on Windows using a Western
>> European charset, I cannot check for the existence of a file with Cyrillic
>> characters, even though I can enter them in Windows Explorer and create
>> such files.
>
> In fpGUI we use UTF-8 for everything. We have wrapper file access
> functions which replace the RTL ones. The unit is called gfx_utils.pas, e.g.:
>
> Windows GDI:
> ---
>   function fpgToOSEncoding(aString: TfpgString): string;
>   begin
>     Result := Utf8ToAnsi(aString);
>   end;

I see you are crippled in the same way as the LCL, because you can only handle ANSI filenames correctly.

Vincent
Re: [fpc-pascal] Unicode filenames
On Sunday 29 June 2008 13.10:33 Marco van de Voort wrote:

> - Which encoding(s) to support (utf-8 and/or utf-16 mostly)

To complement Graeme's mail: MSEgui uses widestrings for everything.

Martin
Re: [fpc-pascal] Unicode filenames
2008/6/29 Vincent Snijders:
>
> I see you are crippled in the same way as the LCL, because you can only
> handle ANSI filenames correctly.

fpGUI was tested under Windows with a Russian locale and Russian filenames. Not tested by me, but by a co-developer (Vladimir). He reported that the file dialogs and other file-related functions worked correctly. He actually implemented the locale file handling.

Regards,
 - Graeme -
Re: [fpc-pascal] Unicode filenames
On Sun, Jun 29, 2008 at 6:32 AM, Marco van de Voort wrote:
> - If the border condition that w9x must remain supported persists, this
> becomes a magnitude more work. A way to deal with this has to be found
> (two win32 FPC releases, one advocated as D2..D2006 compat + w9x, one as
> NT + Unicode/Tiburon?)

  procedure AnyFileRoutineInWin32(AFileName: widestring);
  begin
    if UnicodeEnabledOS then
      SomeWin32APIW()
    else
      AnsiToWideString(SomeWin32ApiA())
  end;

Not very hard to keep 9x support.

--
Felipe Monteiro de Carvalho
Re: [fpc-pascal] Unicode filenames
On Sun, Jun 29, 2008 at 3:19 PM, Graeme Geldenhuys wrote:
> fpGUI was tested under Windows with a Russian locale and Russian filenames.
> Not tested by me, but by a co-developer (Vladimir). He reported that the
> file dialogs and other file-related functions worked correctly. He
> actually implemented the locale file handling.

What if you have a Russian directory on a Windows system with a Western Latin locale? What if you have a Chinese directory on a Linux system with a non-UTF-8 locale?

We also support a Russian directory on a Russian Windows; the problem only appears if the locale cannot represent the characters in the file name.

--
Felipe Monteiro de Carvalho
Re: [fpc-pascal] Unicode filenames
> procedure AnyFileRoutineInWin32(AFileName: widestring);
> begin
>   if UnicodeEnabledOS then SomeWin32APIW()
>   else AnsiToWideString(SomeWin32ApiA())
> end;

If you want even more details: you can initialize UnicodeEnabledOS very easily by reading the operating system version and the operating system type (NT/9x).

--
Felipe Monteiro de Carvalho
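The dispatch described above could look roughly like the following sketch. It assumes that checking SysUtils' Win32Platform against VER_PLATFORM_WIN32_NT is an acceptable way to detect NT-based systems; FileExistsWide is a hypothetical name chosen for illustration, and UnicodeEnabledOS is written here as a function rather than an initialized variable.

```pascal
program UnicodeDispatchSketch;
{$mode objfpc}{$H+}

uses
  {$IFDEF WINDOWS}Windows,{$ENDIF}
  SysUtils;

{ True when the wide (*W) Win32 entry points can be relied on.
  The name comes from the pseudocode in the thread; the body is an assumption. }
function UnicodeEnabledOS: Boolean;
begin
  {$IFDEF WINDOWS}
  Result := Win32Platform = VER_PLATFORM_WIN32_NT;
  {$ELSE}
  Result := False; // the A/W split does not exist outside Windows
  {$ENDIF}
end;

{ Hypothetical wide-string file check that picks the A or W call at run time. }
function FileExistsWide(const AFileName: WideString): Boolean;
{$IFDEF WINDOWS}
const
  InvalidAttrs = DWORD($FFFFFFFF); // failure value of GetFileAttributes*
var
  AnsiName: AnsiString;
  Attrs: DWORD;
begin
  if UnicodeEnabledOS then
    Attrs := GetFileAttributesW(PWideChar(AFileName))
  else
  begin
    // Lossy on 9x: characters outside the ANSI code page cannot survive.
    AnsiName := AnsiString(AFileName);
    Attrs := GetFileAttributesA(PAnsiChar(AnsiName));
  end;
  Result := (Attrs <> InvalidAttrs) and
            ((Attrs and FILE_ATTRIBUTE_DIRECTORY) = 0);
end;
{$ELSE}
begin
  // Assumption: UTF-8 file system encoding on non-Windows targets.
  Result := FileExists(UTF8Encode(AFileName));
end;
{$ENDIF}

begin
  WriteLn('Unicode-enabled OS: ', UnicodeEnabledOS);
  WriteLn(FileExistsWide('no-such-file-7f3a9c.tmp'));
end.
```

In this sketch the conversion happens before the ANSI call, since the routine receives a widestring and the A entry point expects an ANSI one.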
Re: [fpc-pascal] Unicode filenames
Graeme Geldenhuys wrote:

> 2008/6/29 Vincent Snijders:
>> I see you are crippled in the same way as the LCL, because you can only
>> handle ANSI filenames correctly.
>
> fpGUI was tested under Windows with a Russian locale and Russian filenames.
> Not tested by me, but by a co-developer (Vladimir). He reported that the
> file dialogs and other file-related functions worked correctly. He
> actually implemented the locale file handling.

Yes, it works OK if all the characters used are part of the system locale's code page; for a Russian locale, that covers the ASCII characters and the Cyrillic characters.

Try using a path that contains characters from more than one code page, for example Cyrillic characters in the file name and French accented characters in the directory name. That is when you need real Unicode support.

Vincent
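To make the mixed-code-page case concrete, here is a rough sketch that flags a path containing both accented Latin and Cyrillic characters. It is only a heuristic under a stated assumption: it looks at just two Unicode ranges (U+00C0..U+00FF and U+0400..U+04FF) rather than real code page tables, and the function names are hypothetical. A path that triggers it can be represented neither in Windows-1252 (no Cyrillic) nor in Windows-1251 (no accented Latin letters).

```pascal
program MixedScriptPathSketch;
{$mode objfpc}{$H+}

{ Returns True if S contains at least one UTF-16 code unit in Lo..Hi. }
function HasCharIn(const S: WideString; Lo, Hi: Word): Boolean;
var
  I: Integer;
begin
  Result := False;
  for I := 1 to Length(S) do
    if (Ord(S[I]) >= Lo) and (Ord(S[I]) <= Hi) then
      Exit(True);
end;

{ Rough heuristic, not a real code page check: accented Latin letters
  (U+00C0..U+00FF) together with Cyrillic letters (U+0400..U+04FF). }
function MixesAccentedLatinAndCyrillic(const S: WideString): Boolean;
begin
  Result := HasCharIn(S, $00C0, $00FF) and HasCharIn(S, $0400, $04FF);
end;

var
  Path: WideString;
begin
  // A French directory name with e-acute and o-circumflex, followed by a
  // Cyrillic file name, built from code points to avoid source-encoding issues.
  Path := WideString('d') + WideChar($00E9) + 'p' + WideChar($00F4) + 't\' +
          WideChar($0444) + WideChar($0430) + WideChar($0439) + WideChar($043B) +
          '.txt';
  WriteLn(MixesAccentedLatinAndCyrillic(Path));
end.
```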
Re: [fpc-pascal] Unicode filenames
Graeme Geldenhuys wrote:

> 2008/6/29 Vincent Snijders:
>> I see you are crippled in the same way as the LCL, because you can only
>> handle ANSI filenames correctly.
>
> fpGUI was tested under Windows with a Russian locale and Russian filenames.
> Not tested by me, but by a co-developer (Vladimir). He reported that the
> file dialogs and other file-related functions worked correctly. He
> actually implemented the locale file handling.

If you create a file with a Russian name on your hard disk, can you use fpgFileExists to check for its existence?

Vincent
Re: [fpc-pascal] Unicode filenames
2008/6/29 Vincent Snijders:
>
> If you create a file with a Russian name on your hard disk, can you use
> fpgFileExists to check for its existence?

My Windows doesn't have a Russian locale, but I tried it with French, German, etc. names and it works. I asked Vladimir to create a zip file containing English and Russian directory and file names. I'll unzip that and give it another try as soon as he emails the file. I'll let you know what happens. :)

Regards,
 - Graeme -
Re: [fpc-pascal] Unicode filenames
2008/6/29 Vincent Snijders:
>
> If you create a file with a Russian name on your hard disk, can you use
> fpgFileExists to check for its existence?

I did a test and sent a screenshot of the results. I don't know what the attachment size limit is on this mailing list, so let me know if the attachment didn't go through. The file was 22 KB in size.

Regards,
 - Graeme -
Re: [fpc-pascal] Unicode filenames
On Sun, Jun 29, 2008 at 8:06 PM, Graeme Geldenhuys wrote:
> I did a test and sent a screenshot of the results. I don't know what the
> attachment size limit is on this mailing list, so let me know if the
> attachment didn't go through. The file was 22 KB in size.

Nothing arrived here. Can't you just say if it worked or not?

--
Felipe Monteiro de Carvalho
Re: [fpc-pascal] Unicode filenames
On 29 Jun 08, at 20:48, Felipe Monteiro de Carvalho wrote:
> On Sun, Jun 29, 2008 at 8:06 PM, Graeme Geldenhuys wrote:
> > I did a test and sent a screenshot of the results. I don't know what the
> > attachment size limit is on this mailing list, so let me know if the
> > attachment didn't go through. The file was 22 KB in size.
>
> Nothing arrived here. Can't you just say if it worked or not?

I suspect that this may be difficult for Graeme to assess if he doesn't know Russian. ;-)

As a side note related to Graeme's e-mail: French and German should obviously be supported properly when using the "Western" charsets, so no wonder that this worked.

Finally, please note that only the latest Info-ZIP beta (!) versions of zip and unzip (specifically, the zip 3.0 and unzip 6.0 betas) support Unicode paths, so zip archives may not be the best candidates for this test. Other zip and unzip implementations may or may not already support Unicode paths even in released versions, but the Info-ZIP implementation is certainly the most common one on many platforms (all Unix/Linux for sure). I'm not sure about the situation with other archive formats (I believe that Unicode is supposed to be supported by RAR, although I have no idea whether that also applies to e.g. the Linux version of unrar).

Tomas
Re: [fpc-pascal] Unicode filenames
Graeme Geldenhuys wrote:

> 2008/6/29 Vincent Snijders:
>> If you create a file with a Russian name on your hard disk, can you use
>> fpgFileExists to check for its existence?
>
> My Windows doesn't have a Russian locale, but I tried it with French,
> German, etc. names and it works. I asked Vladimir to create a zip file
> containing English and Russian directory and file names. I'll unzip
> that and give it another try as soon as he emails the file. I'll let
> you know what happens. :)

Even if it doesn't have a Russian locale, you should be able to create such a file in Windows Explorer, for example by copying and pasting the file name while renaming the file. Then let your fpGUI program check for its existence.

Vincent