Hi, FileReader is a broken class; this is well known. It always decodes with the platform default charset and offers no constructor to override it, so the result depends on how each host is configured. For that reason it is part of the forbidden-apis list, which is also used by Lucene to prevent issues like yours in our source code. To correctly specify the character set for reading a file, you have to use a FileInputStream and wrap it with an InputStreamReader. On the InputStreamReader you can give the charset.
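A minimal sketch of that wrapping, in case it helps. The class name CharsetAwareReaders and the helpers open and readAll are made up for illustration, and it assumes your source files really are UTF-8, which you would have to confirm for your data:

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

// Hypothetical helper class, not part of Lucene or forbidden-apis.
public class CharsetAwareReaders {

    // Opens the file with an explicit charset instead of the platform default.
    public static Reader open(File file, Charset charset) throws IOException {
        return new BufferedReader(
                new InputStreamReader(new FileInputStream(file), charset));
    }

    // Reads the whole file into a String using the given charset.
    public static String readAll(File file, Charset charset) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (Reader reader = open(file, charset)) {
            char[] buf = new char[4096];
            int n;
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // Write a small UTF-8 test file, then read it back with an explicit charset.
        File file = File.createTempFile("charset-demo", ".txt");
        file.deleteOnExit();
        String text = "G\u00fclery\u00fcz";
        Files.write(file.toPath(), text.getBytes(StandardCharsets.UTF_8));

        // This is the value FileReader would have used; it can differ per host.
        System.out.println("default charset: " + Charset.defaultCharset());
        System.out.println("round trip ok: "
                + readAll(file, StandardCharsets.UTF_8).equals(text));
    }
}
```

In the indexer, the Reader returned by open(file, StandardCharsets.UTF_8) would be passed to TextField(String name, Reader reader) in place of new FileReader(file). Printing Charset.defaultCharset() on each host is a quick way to see whether the default differs between machines.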
See https://github.com/policeman-tools/forbidden-apis

Uwe

-----
Uwe Schindler
Achterdiek 19, D-28357 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de

> -----Original Message-----
> From: Kudrettin Güleryüz [mailto:kudret...@gmail.com]
> Sent: Tuesday, May 23, 2017 9:13 PM
> To: java-user@lucene.apache.org
> Subject: Re: utf-8 issues depending on host
>
> I create the object as new FileReader(file)
> Where file is read from File.listFiles() as below:
> cwd.listFiles(getSourceCodeFilter())
> File file : files
>
> FileReader doesn't seem to have a constructor that lets me specify an
> encoding, and in fact I feel like I should not be setting it to UTF-8 by
> default, anyways.
>
> Let me revise my question, how can I make sure all hosts running this
> indexer code behave as expected? It certainly runs as expected on one
> machine while not on others. One that runs as expected is Debian 8.3 others
> are Debian 7.4.
>
> Thank you
>
> On Tue, May 23, 2017 at 10:45 AM Adrien Grand <jpou...@gmail.com>
> wrote:
>
> > The issue is likely due to how you create the FileReader that you pass to
> > TextField. Maybe you don't give it the right encoding?
> >
> > On Tue, May 23, 2017 at 16:38, Kudrettin Güleryüz <kudret...@gmail.com>
> > wrote:
> >
> > > Hi,
> > >
> > > Depending on the host running indexer, UTF-8 characters are not stored
> > > (not correctly, anyways) in Lucene index.
> > >
> > > Interestingly, locale output is identical on all hosts but the output is
> > > different.
> > >
> > > Apparently using FileReader could be the culprit. I am currently using
> > > TextField(String name, Reader reader)
> > >
> > > How can I improve this? What is the suggested way for handling this using
> > > 5.2.1? TextField(String name, String value, Store store)?
> > >
> > > Thank you