Re: file formats: MacRoman and UTF-8...

2011-03-28 Thread Patrick Diviacco
ilto:patrick.divia...@gmail.com]
> > Sent: Monday, March 28, 2011 9:17 AM
> > To: java-user@lucene.apache.org
> > Subject: Re: file formats: MacRoman and UTF-8...
> >
> > hi, I'm using my own code:
> >
> > Writer writer = null;
>

RE: file formats: MacRoman and UTF-8...

2011-03-28 Thread Uwe Schindler
H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de

> -----Original Message-----
> From: Patrick Diviacco [mailto:patrick.divia...@gmail.com]
> Sent: Monday, March 28, 2011 9:17 AM
> To: java-user@lucene.apache.org
> Subject: Re: file formats: MacRoman and

Re: file formats: MacRoman and UTF-8...

2011-03-28 Thread Patrick Diviacco
Hi, I'm using my own code:

Writer writer = null;
try {
    //File fileOutput = new File("output.trectext");
    File fileOutput = new File(args[1]);
    writer = new BufferedWriter(new FileWriter(fileOutput));
    writer.write(contents.toString());
} catch (FileNotFoundException e) {
    e.printStackTrace();
} cat

RE: file formats: MacRoman and UTF-8...

2011-03-28 Thread Uwe Schindler
Hi,

You have to specify the Charset when creating the Writer. If you give no charset, Java uses the platform default. This question has nothing to do with Lucene; it is better suited to a general XML or Java forum.

Uwe

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
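
As a rough illustration of the advice above, the sketch below wraps a FileOutputStream in an OutputStreamWriter with an explicit UTF-8 charset instead of using FileWriter, which falls back to the platform default. The class name Utf8WriterSketch, the helper writeUtf8, and the temporary output file are hypothetical and not part of the original code; it assumes Java 7 or later for StandardCharsets and try-with-resources.

```java
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

// Hypothetical class name, used only for this sketch.
class Utf8WriterSketch {

    // Writes the text to the file as UTF-8, independent of the platform default charset.
    static void writeUtf8(File file, String text) throws IOException {
        try (Writer writer = new BufferedWriter(
                new OutputStreamWriter(new FileOutputStream(file), StandardCharsets.UTF_8))) {
            writer.write(text);
        }
    }

    public static void main(String[] args) throws IOException {
        // A temporary file stands in for the output path (args[1] in the original post).
        File out = File.createTempFile("output", ".trectext");
        out.deleteOnExit();
        // \u00e9 is the letter e with an acute accent, escaped so the source file's
        // own encoding does not matter.
        writeUtf8(out, "caf\u00e9\n");
        System.out.println("wrote " + out.length() + " bytes to " + out.getPath());
    }
}
```

The same idea applies when reading: an InputStreamReader built with an explicit charset avoids depending on the platform default on that side as well.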

Re: file formats: MacRoman and UTF-8...

2011-03-28 Thread Paul Libbrecht
java -Dfile.encoding=utf-8 should do the trick.

Or... which Java app are you using?

paul

On 28 March 2011 at 09:03, Patrick Diviacco wrote:

> When I run my Lucene app and parse an XML file I get the following error
> due to some characters such as "é" written in the text file.
>
> If I save the
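
To see why the default charset matters here, the following sketch prints the charset the JVM picked up and shows what happens when the UTF-8 bytes of "é" are decoded with a different charset. ISO-8859-1 is used only as a stand-in for a non-UTF-8 platform default (it is not MacRoman), and the class name DefaultCharsetSketch and helper decode are hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical class name, used only for this sketch.
class DefaultCharsetSketch {

    // Decodes raw bytes with the given charset; the wrong charset yields garbled text.
    static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // The value depends on the JVM version, the platform, and any
        // -Dfile.encoding setting passed at startup.
        System.out.println("default charset: " + Charset.defaultCharset());

        // \u00e9 is the letter e with an acute accent; in UTF-8 it takes two bytes.
        byte[] utf8 = "\u00e9".getBytes(StandardCharsets.UTF_8);
        System.out.println("UTF-8 byte count: " + utf8.length);

        // Decoding those two bytes as ISO-8859-1 produces two characters instead of one.
        String wrong = decode(utf8, StandardCharsets.ISO_8859_1);
        System.out.println("chars after wrong decode: " + wrong.length());

        // Decoding with the charset that was used for writing restores the original.
        String right = decode(utf8, StandardCharsets.UTF_8);
        System.out.println("chars after correct decode: " + right.length());
    }
}
```

Running it once with and once without the flag is a quick way to check whether the setting actually reaches the application.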