On Fri, Feb 25, 2011 at 9:09 AM, Bernd Fehling
<bernd.fehl...@uni-bielefeld.de> wrote:
> Hi Yonik,
>
> good point, yes we are using Jetty.
> Do you know if Tomcat has this limitation?

Tomcat's defaults are worse - you need to configure it to use UTF-8 by
default for URLs. Once you do, it passes all those tests (last I checked).
Those tests are really about UTF-8 working in GET/POST query arguments.
Solr may still be able to handle indexing and returning full UTF-8, but
with Jetty you wouldn't be able to query for it w/o using surrogates.

It would be good to test though - does anyone know how to add a char
above the BMP to utf8-example.xml?

-Yonik
http://lucidimagination.com

> Regards,
> Bernd
>
> On 25.02.2011 14:54, Yonik Seeley wrote:
>> On Fri, Feb 25, 2011 at 8:48 AM, Bernd Fehling
>> <bernd.fehl...@uni-bielefeld.de> wrote:
>>> So Solr trunk should already handle Unicode above BMP for field type string?
>>> Strange...
>>
>> One issue is that jetty doesn't support UTF-8 beyond the BMP:
>>
>> /opt/code/lusolr/solr/example/exampledocs$ ./test_utf8.sh
>> Solr server is up.
>> HTTP GET is accepting UTF-8
>> HTTP POST is accepting UTF-8
>> HTTP POST defaults to UTF-8
>> ERROR: HTTP GET is not accepting UTF-8 beyond the basic multilingual plane
>> ERROR: HTTP POST is not accepting UTF-8 beyond the basic multilingual plane
>> ERROR: HTTP POST + URL params is not accepting UTF-8 beyond the basic
>> multilingual plane
>>
>> -Yonik
>> http://lucidimagination.com
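
For anyone who wants to try this, here is a small Java sketch of what "beyond the BMP" means at the byte level. It is an illustration only, not taken from test_utf8.sh. The class name, the helper names and the choice of U+1D11E (MUSICAL SYMBOL G CLEF) are my own assumptions. It shows the surrogate pair Java uses internally, the four UTF-8 bytes a container has to decode correctly in a GET/POST parameter, and an XML numeric character reference, which is one way (possibly not the one intended for utf8-example.xml) to put such a character into an XML file without depending on the editor's encoding.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of Solr or its example scripts.
class SupplementaryCharDemo {
    // U+1D11E MUSICAL SYMBOL G CLEF, picked only as a sample supplementary code point.
    static final int CODE_POINT = 0x1D11E;

    // Build the Java String for a code point (two UTF-16 chars when above the BMP).
    static String asJavaString(int cp) {
        return new String(Character.toChars(cp));
    }

    // Hex dump of the UTF-8 encoding of a string.
    static String utf8Hex(String s) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    // XML numeric character reference for a code point, e.g. for use inside a field value.
    static String xmlCharRef(int cp) {
        return "&#x" + Integer.toHexString(cp).toUpperCase() + ";";
    }

    public static void main(String[] args) throws Exception {
        String s = asJavaString(CODE_POINT);

        // One code point, but two UTF-16 code units (a surrogate pair: D834 DD1E).
        System.out.println("chars=" + s.length()
                + " codePoints=" + s.codePointCount(0, s.length()));
        System.out.printf("surrogates=%04X %04X%n", (int) s.charAt(0), (int) s.charAt(1));

        // Four UTF-8 bytes: F0 9D 84 9E.
        System.out.println("utf8=" + utf8Hex(s));

        // What a client would send in a query string: %F0%9D%84%9E.
        System.out.println("urlencoded=" + URLEncoder.encode(s, "UTF-8"));

        // Encoding-independent form for an XML document: &#x1D11E;
        System.out.println("xml=" + xmlCharRef(CODE_POINT));
    }
}
```

If the container decodes the percent-encoded form as anything other than a single four-byte UTF-8 sequence, the query term will not match what was indexed, which would be consistent with the failures test_utf8.sh reports above.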