Yegor, I'm glad I could help you track down this issue. The non-break space is exactly the problem. For now I'm working around it in my own code by replacing these characters. Although beta4 is not out yet, I'm still using SXSSF in my project because I need to write an Excel file with more than 100,000 rows. Do you know when beta4 will be released?
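A minimal sketch of that replacement workaround, in case it helps anyone else hitting this before the fix ships. The class name `CellText` and the method `sanitize` are hypothetical names used only for illustration, and the choice to also strip control characters is an assumption that goes beyond the non-break space identified in this thread. Note that Java's `\s` does not match U+00A0, so the character has to be named explicitly in the pattern.

```java
import java.util.regex.Pattern;

// Hypothetical helper (not part of Apache POI): cleans a string before it is
// handed to setCellValue(...).
final class CellText {

    // Space-like code points that Java's \s and Character.isWhitespace do not
    // treat as whitespace: U+00A0 (no-break space), U+2007 (figure space),
    // U+202F (narrow no-break space).
    private static final Pattern NO_BREAK_SPACES =
            Pattern.compile("[\\u00A0\\u2007\\u202F]");

    // Assumption: control characters other than tab, line feed and carriage
    // return are not wanted in cell text, so they are dropped.
    private static final Pattern CONTROL_CHARS =
            Pattern.compile("[\\p{Cntrl}&&[^\\t\\n\\r]]");

    private CellText() {
    }

    // Replaces no-break spaces with ordinary spaces and removes control
    // characters; null is passed through unchanged.
    static String sanitize(String value) {
        if (value == null) {
            return null;
        }
        String spaced = NO_BREAK_SPACES.matcher(value).replaceAll(" ");
        return CONTROL_CHARS.matcher(spaced).replaceAll("");
    }
}
```

With this in place, the call from the reproduction case would become `row.createCell(0).setCellValue(CellText.sanitize(name));`, where `name` is the string read from the database. Replacing with a plain space rather than deleting the character keeps the name readable.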
Cheers,
José Guilherme Macedo Vieira

2011/8/3 Yegor Kozlov <yegor.koz...@dinom.ru>

> The culprit is the non-break space (charcode=\u00a0). I was able to
> reproduce the trouble with the following code:
>
>   Workbook wb = new SXSSFWorkbook();
>   Sheet sh = wb.createSheet();
>   Row row = sh.createRow(0);
>   row.createCell(0).setCellValue("ALEXANDRE\u00a0MARINHO DE SOUZA");
>   FileOutputStream out = new FileOutputStream("/temp/test.xlsx");
>   wb.write(out);
>   out.close();
>
> The fix is coming soon and will be included in 3.8-beta4.
>
> Cheers,
> Yegor
>
> On Wed, Aug 3, 2011 at 1:59 PM, Guilherme Vieira <jguilherm...@gmail.com> wrote:
> > Dear Yegor,
> >
> > Your tip didn't work. So I guessed that there was a non-printable
> > character instead of white spaces. That said, I tried to encode it with
> > URLEncoder.encode("the name goes here","ASCII"); and guess what? The
> > encoded name is as below:
> >
> > ALEXANDRE%3BF+MARINHO+DE+SOUZA
> >
> > It's interesting, because I can't remove it with replaceAll since we
> > have non-printable characters. So, I'm trying to find a regular
> > expression that matches these sequences (%3BF and +, respectively). It
> > would be nice if I could find a regular expression that matches any
> > special non-printable character. So, how do I proceed?
> >
> > And thanks in advance for your answer, as well as for your GREAT work in
> > Apache POI with the Big Grid Demo approach. It is just wonderful. Can't
> > wait for the final release (3.8-beta4).
> >
> > Best regards,
> > José Guilherme Macedo Vieira
> >
> > 2011/8/3 Yegor Kozlov <yegor.koz...@dinom.ru>
> >
> >> Tweak your report generator and try the following tricks before
> >> passing strings to SXSSFCell:
> >>
> >> (a) string.replaceAll("\\s+", " "); // replace multiple white spaces
> >>     with a single space
> >> (b) string.replace(' ', '_'); // replace white spaces with underscore
> >>
> >> Does either (a) or (b) help?
> >>
> >> My hunch is that the problem is in something else, not in double white
> >> spaces. At least, I can't reproduce the problem with the following
> >> code snippet:
> >>
> >>   Workbook wb = new SXSSFWorkbook();
> >>   Sheet sh = wb.createSheet();
> >>   for(int i = 0; i < 10000; i++) {
> >>     Row row = sh.createRow(i);
> >>     row.createCell(0).setCellValue("ALEXANDRE__MARINHO DE SOUZA");
> >>     row.createCell(1).setCellValue("ALEXANDRE MARINHO DE SOUZA");
> >>     row.createCell(2).setCellValue("ALEXANDRE MARINHO DE SOUZA");
> >>     row.createCell(3).setCellValue("ALEXANDRE MARINHO DE SOUZA");
> >>   }
> >>
> >>   FileOutputStream out = new FileOutputStream("/temp/test.xlsx");
> >>   wb.write(out);
> >>   out.close();
> >>
> >> The generated file is readable and all spaces are there.
> >>
> >> Yegor
> >>
> >> On Tue, Aug 2, 2011 at 11:49 PM, Guilherme Vieira
> >> <jguilherm...@gmail.com> wrote:
> >> > So, I've searched column by column in the problematic line in order to
> >> > identify the problem. The problem is quite weird. It's a string column
> >> > in the database. This column stores people's names.
> >> >
> >> > In my problem the name is: ALEXANDRE__MARINHO DE SOUZA
> >> >
> >> > Of course, without the underline character. Instead it is a whitespace
> >> > character. So, with the double whitespace character, the file is
> >> > corrupted. And when I manually remove one of the whitespaces in the
> >> > IDE, the file is also corrupted. But when I change the whole name
> >> > manually in the IDE, setting the value to ALEXANDRE_MARINHO DE SOUZA,
> >> > it works. It's strange. I don't know why SXSSF is not accepting two
> >> > whitespaces.
> >> >
> >> > Does anyone have a clue?
> >> >
> >> > 2011/8/2 jguilhermemv <jguilherm...@gmail.com>
> >> >
> >> >> I tried without merged regions and it didn't work. So, I noticed that
> >> >> there is a line in the file which presents the error.
> >> >> It's line 2451; up to line 2450 everything works great. But for some
> >> >> reason, when it reaches line 2450 it just doesn't work. I checked
> >> >> whether there were any null values, but there weren't. The writing
> >> >> routine is right, otherwise it wouldn't write up to line 2450.
> >> >>
> >> >> What can I do now?
> >> >>
> >> >> Best regards.
> >> >> José Guilherme Macedo Vieira
> >> >>
> >> >> 2011/8/2 Nick Burch-11 [via Apache POI] <
> >> >> ml-node+4658878-753894702-237...@n5.nabble.com>
> >> >>
> >> >> > On Tue, 2 Aug 2011, jguilhermemv wrote:
> >> >> > > Regarding the file, it makes use of some CellStyles and Merged
> >> >> > > Regions.
> >> >> >
> >> >> > Try without them, and see if that fixes it. You need to narrow your
> >> >> > problem down before you can figure out what to correct. Try to
> >> >> > identify the simplest file that fails, and the most complex one that
> >> >> > works; the gap there is your issue.
> >> >> >
> >> >> > Nick
> >> >> --
> >> >> View this message in context:
> >> >> http://apache-poi.1045710.n5.nabble.com/Apache-POI-3-8-SXSSFWorkbook-Unreadable-Content-tp4658852p4659737.html
> >> >> Sent from the POI - Dev mailing list archive at Nabble.com.
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: dev-unsubscr...@poi.apache.org
> >> For additional commands, e-mail: dev-h...@poi.apache.org