Re: Windows Unicode and GCC

2006-05-01 Thread Nicolas De Rico
Hello, As a quick reminder, the problem that I encountered arose when trying to compile source files that are NOT encoded with the same encoding as the system header files. My current Linux machine uses UTF-8, but I am trying to compile files that were created using Windows "unicode". To ma

Re: Windows Unicode and GCC

2006-04-27 Thread Zack Weinberg
On Thu, Apr 27, 2006 at 05:16:10PM -0700, Joe Buck wrote: > On Thu, Apr 27, 2006 at 07:58:29PM -0400, Zack Weinberg wrote: > [ Unicode, UTF-{8,16}, BOMs, etc ] > > It would also be good to take advantage of the fact that 95+% of C > > source files start with "/*", "//", "#i", or "#d" to distinguish
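The snippet above is cut off, so what exactly the common prefixes are meant to distinguish is not visible here. One plausible reading is that they could help guess the encoding of a file that has no BOM. The following Python sketch works under that assumption; `guess_from_prefix` and the candidate list are hypothetical names for illustration and are not taken from GCC.

```python
# The four prefixes quoted in the message above.
_PREFIXES = ("/*", "//", "#i", "#d")

# Candidate encodings to try (an assumption for this sketch). A "utf-8"
# result only says the first two bytes are ASCII-compatible.
_CANDIDATES = ("utf-8", "utf-16-le", "utf-16-be", "utf-32-le", "utf-32-be")


def guess_from_prefix(data):
    """Guess an encoding from the first two characters of a BOM-less file.

    Returns a codec name, or None when the file does not start with one
    of the common prefixes in any candidate encoding.
    """
    for enc in _CANDIDATES:
        for prefix in _PREFIXES:
            if data.startswith(prefix.encode(enc)):
                return enc
    return None
```

The byte patterns do not collide: `"/*"` is `2F 2A` in UTF-8, `2F 00 2A 00` in UTF-16LE and `2F 00 00 00 2A 00 00 00` in UTF-32LE, so the order of the candidates does not change the result. A file that begins with anything else (blank lines, a declaration) simply yields `None`, and some other default would have to apply.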

Re: Windows Unicode and GCC

2006-04-27 Thread Joe Buck
[ Unicode, UTF-{8,16}, BOMs, etc ] On Thu, Apr 27, 2006 at 07:58:29PM -0400, Zack Weinberg wrote: > complicated) "Local Variables:" marker near the end of the file; > other editors have similar, but of course incompatible, conventions > (I know Vim has one but I don't know what it looks like). It

Re: Windows Unicode and GCC

2006-04-27 Thread Zack Weinberg
> I think that CPP should try to determine the encoding for each file > and not use a single encoding for every file. It should look for > a unicode header when it opens a file (original c source or any > include), and if it doesn't find one, use the default: -finput-charset, > LC_CTYPE, UTF-8, un
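The per-file lookup proposed in the quoted text can be sketched roughly as follows. This is a Python illustration, not how cpp is implemented. `choose_encoding` and its `input_charset` parameter are hypothetical stand-ins for the `-finput-charset` value, and the Python locale query stands in for LC_CTYPE. The quoted fallback list is truncated after "UTF-8", so the sketch stops there.

```python
import codecs
import locale


def choose_encoding(data, input_charset=None):
    """Pick an encoding for one file: BOM first, then the defaults in order."""
    # UTF-32 is tested before UTF-16 because FF FE 00 00 starts with FF FE.
    # The returned codec names are ones that consume the BOM when decoding.
    for bom, name in (
        (codecs.BOM_UTF32_LE, "utf-32"),
        (codecs.BOM_UTF32_BE, "utf-32"),
        (codecs.BOM_UTF8, "utf-8-sig"),
        (codecs.BOM_UTF16_LE, "utf-16"),
        (codecs.BOM_UTF16_BE, "utf-16"),
    ):
        if data.startswith(bom):
            return name
    if input_charset:
        # Stand-in for an explicit -finput-charset value.
        return input_charset
    # Stand-in for LC_CTYPE, falling back to UTF-8 if the locale is silent.
    return locale.getpreferredencoding(False) or "utf-8"
```

Because the function is called once per file, a UTF-16 source can include UTF-8 headers and each gets its own answer, which is the point of the proposal.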

Re: Windows Unicode and GCC

2006-04-25 Thread Eric Christopher
Presumably, cpp wants everything from libiconv in UTF-8 with no BOM. Yes. -eric
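As an illustration of "UTF-8 with no BOM", here is a small Python sketch of the conversion step. It is not libiconv or cpp code. The helper name `to_utf8_no_bom` is hypothetical, and the caller is assumed to already know the source encoding.

```python
def to_utf8_no_bom(data, encoding):
    """Transcode raw file bytes to UTF-8 and drop a leading BOM if present."""
    text = data.decode(encoding)
    # Codecs such as "utf-16" and "utf-8-sig" consume the BOM themselves;
    # others ("utf-8", "utf-16-le") leave it in the text as U+FEFF.
    if text.startswith("\ufeff"):
        text = text[1:]
    return text.encode("utf-8")
```

With this helper, a UTF-8 file without a BOM, the same file with a BOM, and a UTF-16 file with a BOM all normalize to identical bytes.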

Re: Windows Unicode and GCC

2006-04-25 Thread Nicolas De Rico
Hello and thank you for the reply. I created 3 files (very simple hello world program): hi.c: UTF-8 without BOM hi-8.c: UTF-8 with BOM hi-16.c: UTF-16 with BOM I ran iconv twice for each file. Once with the -f option which explicitly indicates the encoding, and once without the -f option to s

Re: Windows Unicode and GCC

2006-04-25 Thread Eric Christopher
It seems that BOM is a Unicode UTF facility that MS thought was a great thing to implement, and I certainly agree with that assessment. BOM tells even more than its name implies. A program can detect if a file is encoded in UTF-8, 16LE, 16BE, 32LE and 32BE in a very easy way. I think

Re: Windows Unicode and GCC

2006-04-25 Thread Nicolas De Rico
Hi, Yes, I was talking about the byte order mark (BOM): http://www.unicode.org/faq/utf_bom.html It seems that BOM is a Unicode UTF facility that MS thought was a great thing to implement, and I certainly agree with that assessment. BOM tells even more than its name implies. A program can de
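The BOM check described above is short enough to sketch. The Python below is an illustration only: `detect_bom` is a hypothetical helper, and the byte signatures come from the standard `codecs` module.

```python
import codecs

# Longest signatures first: the UTF-32LE BOM (FF FE 00 00) begins with the
# UTF-16LE BOM (FF FE), so checking UTF-16 first would misreport UTF-32.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def detect_bom(data):
    """Return (encoding, bom_length) for a leading BOM, or (None, 0)."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name, len(bom)
    return None, 0
```

One caveat: a UTF-16LE file whose first character after the BOM is U+0000 would be reported as UTF-32LE. That should be rare for C source, but it means the check is a heuristic rather than a proof.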

Re: Windows Unicode and GCC

2006-04-25 Thread Ranjit Mathew
Mike Hearn wrote: > On Mon, 24 Apr 2006 15:27:07 -0400, Nicolas De Rico wrote: >> I would like to compile files created on Windows and encoded in >> "Unicode" (UTF-8 or UTF-16). Microsoft puts a little header at the >> beginning of files to indicate

Re: Windows Unicode and GCC

2006-04-25 Thread Mike Hearn
On Mon, 24 Apr 2006 15:27:07 -0400, Nicolas De Rico wrote: > I would like to compile files created on Windows and encoded in > "Unicode" (UTF-8 or UTF-16). Microsoft puts a little header at the > beginning of files to indicate that they are UTF-16, UTF-8, etc. I > believe that this header is s

Windows Unicode and GCC

2006-04-24 Thread Nicolas De Rico
Hello, I would like to compile files created on Windows and encoded in "Unicode" (UTF-8 or UTF-16). Microsoft puts a little header at the beginning of files to indicate that they are UTF-16, UTF-8, etc. I believe that this header is standard unicode btw, not an extension! When I try to com