Re: [9fans] utf-8 text files from httpd

2009-10-19 Thread lucio
> 2009/10/19 erik quanstrom :
>> why try that hard?  just call it utf-8.  i can't think of
>> any browsers that would have a problem with that today.
>
> the instance of the problem that i had was when
> adding an attachment to a upas mail.
> file -m is useful when the attachment might be
> binary

Re: [9fans] utf-8 text files from httpd

2009-10-19 Thread erik quanstrom
On Mon Oct 19 10:36:51 EDT 2009, rogpe...@gmail.com wrote:
> 2009/10/19 erik quanstrom :
> > why try that hard?  just call it utf-8.  i can't think of
> > any browsers that would have a problem with that today.
>
> the instance of the problem that i had was when
> adding an attachment to a upas ma

Re: [9fans] utf-8 text files from httpd

2009-10-19 Thread roger peppe
2009/10/19 erik quanstrom :
> why try that hard?  just call it utf-8.  i can't think of
> any browsers that would have a problem with that today.

the instance of the problem that i had was when adding an attachment to a upas mail. file -m is useful when the attachment might be binary.

Re: [9fans] utf-8 text files from httpd

2009-10-19 Thread erik quanstrom
On Mon Oct 19 09:51:33 EDT 2009, rogpe...@gmail.com wrote:
> there's another problem with file -m that
> i've been bitten by before: it ignores any
> stuff after the first 6000 bytes.
>
> so if you've got a mostly-ascii file with some
> utf-8 characters 8K in, then it won't be picked up.
>
> i th

Re: [9fans] utf-8 text files from httpd

2009-10-19 Thread roger peppe
there's another problem with file -m that i've been bitten by before: it ignores any stuff after the first 6000 bytes. so if you've got a mostly-ascii file with some utf-8 characters 8K in, then it won't be picked up. i think file -m should read the whole file, but that's just IMHO.
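
The effect described here can be reproduced with a small sketch. This is not how file(1) is implemented; `sniff_charset` and its `limit` parameter are hypothetical names, and the 6000-byte figure is simply the one quoted in the message. The sketch assumes a classifier that looks only at a prefix of the data, and compares it with one that reads everything.

```python
import codecs


def sniff_charset(data: bytes, limit=None) -> str:
    """Classify bytes as 'us-ascii', 'utf-8', or 'binary'.

    `limit` is a hypothetical knob: when set, only the first `limit`
    bytes are inspected, mimicking a prefix-only check.
    """
    sample = data if limit is None else data[:limit]
    truncated = limit is not None and len(data) > limit
    decoder = codecs.getincrementaldecoder("utf-8")()
    try:
        # final=False tolerates a multi-byte sequence cut in half by the limit
        text = decoder.decode(sample, final=not truncated)
    except UnicodeDecodeError:
        return "binary"
    if "\x00" in text:
        return "binary"
    return "us-ascii" if all(ord(c) < 128 for c in text) else "utf-8"


# A mostly-ASCII file whose only non-ASCII character sits 8K in.
late = b"a" * 8192 + "☺\n".encode("utf-8")
print(sniff_charset(late, limit=6000))  # prefix-only check
print(sniff_charset(late))              # whole-file check
```

With the prefix-only call the smiley is never seen, so the data is labelled plain ASCII; reading the whole input is the only way to be sure, at the cost of scanning large files in full.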

Re: [9fans] utf-8 text files from httpd

2009-10-19 Thread erik quanstrom
> Is the output of file(1) appropriate for this purpose?
> Shouldn't your sample file also be sent as UTF-8?

it should be.  for example since

	; echo ☺ | file
	stdin: short UTF text	# sic

one would expect that echo ☺ | file -m would yield
text/plain; charset=utf-8.

> file(1) speak

Re: [9fans] utf-8 text files from httpd

2009-10-19 Thread Kenji Arisawa
I think it is difficult to make a web server work correctly when we have text files in a variety of charsets on the server. Although we can manually select the charset in the browser menu, the selection is useless when the page is written in Javascript that fills some portion of a page reading a tex

Re: [9fans] utf-8 text files from httpd

2009-10-19 Thread Akshat Kumar
new/sendfd.c:243 c old/sendfd.c:243
<
---
> /*
new/sendfd.c:246 c old/sendfd.c:246
<
---
> */

(context: text/plain -> text/plain; charset=utf-8)

Now my text files can be read in the proper encoding by default, and are not interpreted by browsers (as well as certain applications) to be whack ASCII
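
The mapping this change enables (text/plain becomes text/plain; charset=utf-8) can be sketched as follows. This is an illustration of the idea only, not the code in sendfd.c; the function name `with_default_charset` is hypothetical, and treating every text/* type the same way is an assumption.

```python
def with_default_charset(content_type: str, charset: str = "utf-8") -> str:
    """Append a charset parameter to text/* types that lack one."""
    media_type, _, params = content_type.partition(";")
    if not media_type.strip().lower().startswith("text/"):
        return content_type  # leave non-text types alone
    if "charset=" in params.lower():
        return content_type  # an explicit charset wins
    return f"{content_type.strip()}; charset={charset}"


print(with_default_charset("text/plain"))
print(with_default_charset("application/pdf"))
```

Leaving an existing charset parameter untouched keeps the default from overriding files that are deliberately served in another encoding.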

Re: [9fans] utf-8 text files from httpd

2009-10-19 Thread Eris Discordia
The decision whether to open in place or save to disk based on MIME type is up to the browser. For example, I set my browsers to ask to save to disk application/pdf documents (rather than opening them with Adobe Acrobat's problem plugin). A MIME type of text/plain (without any specification of

Re: [9fans] utf-8 text files from httpd

2009-10-18 Thread erik quanstrom
> Thus, hard coding "charset=utf-8" in http header will bring other
> problem
> because that coding disables a line in html header such as:
>

that should not be a problem on a plan 9 system; plan 9's
character set is utf-8.

- erik

Re: [9fans] utf-8 text files from httpd

2009-10-18 Thread Kenji Arisawa
we should note also http://www.w3.org/TR/html4/charset.html#h-5.2.2.
the document says:

   To sum up, conforming user agents must observe the following
   priorities when determining a document's character encoding
   (from highest priority to lowest):

   1. An HTTP "charset" parameter i
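
For reference, the list in that section of HTML 4.01 ranks the HTTP "charset" parameter first, then a META declaration with http-equiv set to Content-Type, then the charset attribute on an element that designates an external resource. A rough sketch of that ordering is below. The function and parameter names are hypothetical, and the `fallback` value is an assumption: the specification leaves the no-information case to heuristics and user settings.

```python
def effective_charset(http_charset=None, meta_charset=None,
                      link_charset=None, fallback="iso-8859-1"):
    """Pick a charset using the HTML 4.01 priority order.

    Highest priority first: HTTP header parameter, META declaration,
    charset attribute on the referring element. `fallback` is an
    assumed value used when none of the three is present.
    """
    for candidate in (http_charset, meta_charset, link_charset):
        if candidate:
            return candidate.lower()
    return fallback


# A hard-coded header charset overrides what the page itself declares.
print(effective_charset(http_charset="utf-8", meta_charset="Shift_JIS"))
# Without the header parameter, the META declaration is used.
print(effective_charset(meta_charset="Shift_JIS"))
```

This is why a server that always sends charset=utf-8 makes a conflicting in-page declaration ineffective.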

Re: [9fans] utf-8 text files from httpd

2009-10-18 Thread Kenji Arisawa
according to rfc2616, the default charset when sending a text file is ISO-8859-1:

   The "charset" parameter is used with some media types to define the
   character set (section 3.4) of the data. When no explicit charset
   parameter is provided by the sender, media subtypes of the "text"
   type are defined

Re: [9fans] utf-8 text files from httpd

2009-10-18 Thread andrey mirtchovski
your mimetypes are probably maim-typed (heh). see /sys/lib/mimetype for a fix, or put this in your page's section:

On Sun, Oct 18, 2009 at 6:34 PM, Akshat Kumar wrote:
> I'm trying to put up a plain text file containing UTF-8
> characters from httpd, but when viewing it from any
> br

Re: [9fans] utf-8 text files from httpd

2009-10-18 Thread erik quanstrom
On Sun Oct 18 20:37:23 EDT 2009, aku...@mail.nanosouffle.net wrote:
> I'm trying to put up a plain text file containing UTF-8
> characters from httpd, but when viewing it from any
> browser, it comes off as an ASCII file that needs to
> be downloaded (so, those characters are garbled).
> Is this du

[9fans] utf-8 text files from httpd

2009-10-18 Thread Akshat Kumar
I'm trying to put up a plain text file containing UTF-8 characters from httpd, but when viewing it from any browser, it comes off as an ASCII file that needs to be downloaded (so, those characters are garbled). Is this due to some behaviour of httpd? ak