On 5/19/05, Randy Kobes <[EMAIL PROTECTED]> wrote:
> On Wed, 18 May 2005, Jay Savage wrote:
>
> > On 5/18/05, angie ahl <[EMAIL PROTECTED]> wrote:
> > > I can confirm that it's happening before the data's gone
> > > to the database or anything. I'm getting the params from
> > > CGI.pm and then decoding via decode("utf8", $v). The page
> > > the params came from is set as utf-8 in the http header
> > > and content type, and firefox is believing the page is
> > > utf-8.
> > >
> > > It looks as though the browser isn't sending the data as
> > > UTF-8 unless it contains text that has to be. As soon as
> > > I add a € or some other character that's utf-8 it comes
> > > through fine. Checking the params before it's decoded
> > > showed the £ as I expected to see it after it had been
> > > decoded, leading me to think the form hasn't been passed
> > > as utf-8. Any clues..... anyone?
> >
> > That sounds about right. Most (english) browsers default
> > to Latin-1 even when they say they don't. Make sure you
> > have "enctype" set in the opening form tag. If it still
> > doesn't work, you'll need to figure out (or ask the client)
> > what the encoding is, and translate it by manipulating the
> > layers and/or encodings. But the bottom line is: if you're
> > not putting utf-8 in at some point, you won't get utf-8
> > out.
>
> For
> http://perl.wtsbroadcast.com/about/Angies_second_test_page.html
> if (in Firefox on Win32) I set
> View -> Character Encoding -> Western (Windows-1252)
> I get the £ displayed.
>
> --
> best regards,
> randy kobes

Just for the record: the browser was passing the form params as Latin-1 unless they contained a character that couldn't be represented in Latin-1. Only then would it do as it was told and pass them as utf-8.

In the end I had to use Encode::Guess to see whether the data was utf-8; if so I decode it as that, otherwise I decode it as iso-8859-1. To make it a tiny bit more stable, and after a lot of trial and error, I ended up doing this:

1. Concatenate all the values that were passed in the form into one string.
2. Run Encode::Guess on that string, so it has enough data to have a fair crack at it.
3. If $decoder is set, use it to decode the form values; otherwise use iso-8859-1.

Not very pretty, I grant you, but it's the only thing that actually works, seeing as the browser won't pass values as utf-8 all the time. Or maybe it's the OS that's entering the text as iso-8859-1.

HTH someone someday.
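
For anyone who wants to try it, the approach described above could be sketched roughly like this. It is not the original code: decode_form_values is a hypothetical helper name, it assumes the values are still raw, undecoded byte strings (for example straight from CGI.pm's param()), and it relies on Encode::Guess's default suspects (ascii and utf8), so anything else falls through to iso-8859-1.

```perl
use strict;
use warnings;
use Encode qw(decode);
use Encode::Guess;    # exports guess_encoding(); default suspects are ascii and utf8

# Hypothetical helper (not from the original post): takes a hash ref of raw
# form values (byte strings) and returns a hash ref of decoded strings.
sub decode_form_values {
    my ($raw) = @_;

    # 1. Concatenate all non-empty values into one sample string.
    my $sample = join ' ', grep { defined($_) && length($_) } values %$raw;

    # 2. guess_encoding() returns an encoding object on success and a plain
    #    error string on failure, so the result is tested with ref() below.
    my $decoder = length($sample) ? guess_encoding($sample) : undef;

    # 3. Decode every value with the guessed encoding, else as iso-8859-1.
    my %decoded;
    for my $name (keys %$raw) {
        my $octets = $raw->{$name};    # copy, so the caller's data is untouched
        if (!defined $octets) {
            $decoded{$name} = undef;
            next;
        }
        $decoded{$name} = ref($decoder)
            ? $decoder->decode($octets)
            : decode('iso-8859-1', $octets);
    }
    return \%decoded;
}
```

Checking ref($decoder) rather than plain truth matters here, because a failed guess comes back as a non-empty error string. The guess is still only a guess: Latin-1 input that happens to be valid utf-8 will be decoded as utf-8.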