Any news on this item? Does what I'm saying make sense? I understand most people who use clojure are probably English-speaking and couldn't care less about internationalization, but this has to be addressed if clojure is to get any semblance of semi-mainstream adoption. In fact, one of the reasons I chose clojure myself is that internationalization is a solved problem in Java (and hence, I thought, in clojure as well). If the perception is that the problem is "limited" to Windows, well, that's 90% of the deployed PCs out there.
Since the fix seems so trivial and requires changes in only about 5 lines of code, I'm not sure what prevents this from being fixed. At least, is there a clojure bug tracking site where I could add this issue?

Thanks,

Max

On Mar 7, 2:21 pm, max3000 <maxime.lar...@gmail.com> wrote:
> The default character set on WinXP (which I use) is windows-1252
> (cp1252). Check out http://www.rgagnon.com/javadetails/java-0505.html.
>
> If I were to change my source file encodings to UTF-8 that would
> probably get me some mileage. Of course, I would have to use an editor
> that supports it and not all editors would (on windows). However, it
> wouldn't change anything in the REPL. Presumably, stdin in Java is
> tied to the platform's default encoding and there is probably no way
> to change that. My understanding is that clojure assumes reading a
> file and reading stdin is the same thing encoding-wise. That's a
> faulty assumption.
>
> Typically, I believe clojure should read and write to/from the default
> character set unless specifically told otherwise. UTF-8 is not the
> default on all platforms.
>
> Thanks,
>
> Max
>
> On Mar 7, 10:03 am, Toralf Wittner <toralf.witt...@gmail.com> wrote:
>
> > On Sat, 2009-03-07 at 05:43 -0800, max3000 wrote:
> > > Ok, so I ended up doing this in my code:
>
> > > String resource = "/exmentis/rules_main.clj";
> > > InputStream is = getClass().getResourceAsStream(resource);
> > > String script = ... read in is as a String (like slurp) ...
> > > StringReader r = new StringReader(script);
> > > clojure.lang.Compiler.load(r, null, resource);
>
> > > Note I use clojure.lang.Compiler directly because RT has no methods to
> > > do what I want.
>
> > > The above works fine, and requires no modifications to the clojure
> > > source code.
>
> > Hi Max,
>
> > Please tell us a bit about your environment (locale settings, OS). It
> > looks to me like your settings are different from UTF-8 and the reason
> > why the above procedure works is because Java will use the default
> > character set when decoding your source file. Within Java (or Clojure)
> > you can get the default character set with:
>
> > (java.nio.charset.Charset/defaultCharset)
>
> > which in my case produces #<UTF_8 UTF-8>. If you are using a different
> > character set (e.g. ISO-8859-1), some characters can not be mapped
> > directly between this and UTF-8. While I am not aware of any explicit
> > requirements regarding Clojure source file encodings, it seems that de
> > facto UTF-8 is assumed. Try encoding your sources as UTF-8 and things
> > should work as expected.
>
> > Cheers,
> > Toralf
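
For anyone who wants to reproduce the mismatch discussed in this thread outside of Clojure, here is a minimal Java sketch. It is not Clojure's actual loading code. It only shows that the same bytes decode differently depending on the charset handed to the reader. The class name EncodingSketch and the helper readAll are hypothetical names made up for this illustration, and it assumes windows-1252 is available in the running JVM (it is on mainstream JDKs, but only a handful of charsets are guaranteed by the platform).

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingSketch {

    // Hypothetical helper: read the whole stream as text with an explicit
    // charset instead of relying on the platform default.
    public static String readAll(InputStream in, Charset charset) {
        StringBuilder out = new StringBuilder();
        try (Reader reader = new InputStreamReader(in, charset)) {
            char[] buf = new char[4096];
            int n;
            while ((n = reader.read(buf)) != -1) {
                out.append(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Assumption: windows-1252 is present in this JVM.
        Charset cp1252 = Charset.forName("windows-1252");

        // A one-line "source file" containing a non-ASCII character (e-acute),
        // written as a Unicode escape so this file's own encoding is irrelevant.
        String source = "(def greeting \"caf\u00e9\")";
        byte[] onDisk = source.getBytes(cp1252);

        String asCp1252 = readAll(new ByteArrayInputStream(onDisk), cp1252);
        String asUtf8 = readAll(new ByteArrayInputStream(onDisk), StandardCharsets.UTF_8);

        System.out.println("platform default: " + Charset.defaultCharset());
        System.out.println("round-trips as windows-1252: " + source.equals(asCp1252));
        System.out.println("round-trips as UTF-8: " + source.equals(asUtf8));
    }
}
```

Passing the charset explicitly, as readAll does, is the same idea as the StringReader workaround quoted above: the decoding decision is made by the caller rather than assumed by the loader.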