It is getting really amusing to see how some characters change when they travel a few times between Finland, France and Poland :)
-Jukka-

> -----Original message-----
> From: Even Rouault [mailto:even.roua...@mines-paris.org]
> Sent: 3 January 2012 16:44
> To: Rahkonen Jukka
> Cc: 'gdal-dev@lists.osgeo.org'
> Subject: Re: [gdal-dev] Re: WFS and -where with non-ASCII characters
>
> Quoting Rahkonen Jukka <jukka.rahko...@mmmtike.fi>:
>
> > Mateusz Łoskot wrote:
> > >
> > > Jukka Rahkonen wrote:
> > > > I took the successful query sent by Ari from the TinyOWS log and
> > > > copied it literally into Windows and this way it works:
> > > >
> > > > -where name='Hämeenkylä'
> > >
> > > Windows Command Prompt can work with UTF-8 characters if you change
> > > codepage to UTF-8:
> > >
> > > 0) Open new prompt (cmd.exe)
> > > 1) Change font to Lucida Console
> > > 2) chcp 65001
> > >
> > > And OGR can consume filter without problems:
> > >
> > > -where "name=\"Hämeenkylä\""
> > >
> > > Note, the \"\" is needed so as not to confuse the OGR SQL compiler;
> > > otherwise the value Hämeenkylä will be parsed as OGR SQL type
> > > SNT_COLUMN instead of SNT_CONSTANT for field value.
> > >
> > > However, I think the problem may be with TinyOWS. It throws error:
> > >
> > > <ows:ExceptionText>QUERY_STRING contains forbidden characters</ows:ExceptionText>
> > >
> > > which is generated by TinyOWS:
> > >
> > > http://www.tinyows.org/trac/browser/trunk/src/struct/cgi_request.c?rev=525#L208
> > >
> > > where TinyOWS simply tests characters passed in request against fixed
> > > range: A-Za-zà -ÿ
> > > Comparing extended ASCII codes, the value 'ä' is outside of this
> > > range anyway.
> > >
> > > I get no WFS exception nor OGR error when querying with some (not all)
> > > Polish diacritics:
> > >
> > > ogrinfo WFS:http://hip.latuviitta.org/cgi-bin/tinyows
> > > lv:pks_tilastoalue_piste -where "name=\"ąęśćł\""
> > >
> > > Certainly, it gives empty resultset.
> > >
> > > I think it would be a good idea to try against different WFS server.
> >
> > I followed your example but changing the font and chcp 65001 did not
> > actually change anything as far as I can see. OGR may consume
> > -where "name=\"Hämeenkylä\"" OK but, as you said, TinyOWS denies it.
> > However, -where name='Hämeenkylä' gives correct result. But
> > it gave correct result even before changing the font and codepage.
> >
> > TinyOWS log shows your -where "name=\"ąęśćł\"" like "aescl" but I am
> > not sure if the characters have changed or if my console just shows
> > them as ASCII characters.
> >
> > Mapserver behaves also as it did before. My codepage is now 65001 and
> > -where "name=\"Hämeenkylä\"" gives http 500 error while
> > -where name='Hämeenkylä' gives correct result.
>
> Yes, your observation confirms my little testing. Mateusz' trick with chcp
> indeed fixes the display of UTF-8 characters in the console, but when I
> enter an accented character, the command line utilities consume it as
> Latin1.
> Note: I'm on Windows XP.
>
> I've verified it with a trivial code compiled with MSVC:
>
> #include <stdio.h>
> #include <string.h>
>
> int main(int argc, char* argv[])
> {
>     /* strlen counts bytes, so the result reveals the encoding of argv[1] */
>     printf("%d\n", (int)strlen(argv[1]));
>     return 0;
> }
>
> If I try "test éven", it prints 4, whereas it should print 5 if it was
> really UTF-8.
>
> > -Jukka Rahkonen-
_______________________________________________ gdal-dev mailing list gdal-dev@lists.osgeo.org http://lists.osgeo.org/mailman/listinfo/gdal-dev