Re: Unicode in cgi-script with apache2

Dominique Ramaekers Sat, 16 Aug 2014 15:51:53 -0700

Hi Peter,

Your code seems interesting.

I've tried using sys.stdout (in a slightly different form) but it gave the same error.

I also read about people who fixed the error by changing the server's locale to en_US.UTF-8. The people who posted these fixes also said that you can only use en_US.UTF-8 (and not, e.g., nl_BE.UTF8)... Anyway, it didn't work for me. And I find this a dirty fix, because I don't want to use a US locale...

Please excuse me for not trying out your specific solutions. I've already started to implement WSGI over CGI. See my previous message...


grz

On 16-08-14 at 13:17, Peter Otten wrote:

Dominique Ramaekers wrote:

I've got a little script:

#!/usr/bin/env python3
print("Content-Type: text/html")
print("Cache-Control: no-cache, must-revalidate")    # HTTP/1.1
print("Expires: Sat, 26 Jul 1997 05:00:00 GMT") # Date in the past
print("")
f = open("/var/www/cgi-data/index.html", "r")
for line in f:
      print(line,end='')

If I run the script in the terminal, it nicely prints the webpage
'index.html'.

If I access the script through a web browser, apache gives an error:
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position
1791: ordinal not in range(128)

I've spent a whole afternoon reading forums and blogs, but I haven't found a
solution.

Can anyone help me?

If the input and output encoding are the same you can avoid the byte-to-text
(and subsequent text-to-byte) conversion and serve the binary contents of
the index.html file directly:

#!/usr/bin/env python3
import sys

print("Content-Type: text/html")
print("Cache-Control: no-cache, must-revalidate")    # HTTP/1.1
print("Expires: Sat, 26 Jul 1997 05:00:00 GMT") # Date in the past
print("")
sys.stdout.flush()
with open("/var/www/cgi-data/index.html", "rb") as f:
     for line in f:
         sys.stdout.buffer.write(line)

The flush() is necessary to write pending data before accessing the low-level
stdout.buffer. Instead of the loop you can use any of these:

sys.stdout.buffer.write(f.read())  # not for huge files, but should be OK for
                                   # typical html file sizes
sys.stdout.buffer.writelines(f)
shutil.copyfileobj(f, sys.stdout.buffer)  # needs "import shutil"; show off
                                          # your knowledge of the stdlib ;)
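
As a minimal sketch of the binary approach, the header and body writing can be pulled into one function that takes any binary stream, which makes it easy to try outside Apache. The name send_file, its parameters, and the HEADERS constant are hypothetical and not part of the script above; the Expires header is left out only to keep the sketch short.

```python
import shutil

# Hypothetical constant: the same headers as above, already encoded as bytes.
HEADERS = (
    b"Content-Type: text/html\n"
    b"Cache-Control: no-cache, must-revalidate\n"
    b"\n"
)


def send_file(path, out):
    """Write the headers, then the raw bytes of `path`, to the binary stream `out`."""
    out.write(HEADERS)
    with open(path, "rb") as f:
        # Copies in chunks; the file is never decoded, so the locale is irrelevant.
        shutil.copyfileobj(f, out)
    out.flush()
```

In the CGI script this would be called as send_file("/var/www/cgi-data/index.html", sys.stdout.buffer), with nothing printed through sys.stdout beforehand.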


Alternatively you could choose an encoding via the locale:

#!/usr/bin/env python3
import locale
locale.setlocale(locale.LC_ALL, "en_US.UTF-8")

print("Content-Type: text/html")
print("Cache-Control: no-cache, must-revalidate")    # HTTP/1.1
print("Expires: Sat, 26 Jul 1997 05:00:00 GMT") # Date in the past
print("")
with open("/var/www/cgi-data/index.html") as f:
     for line in f:
         print(line, end='')

Python should then use UTF-8 as the default for i/o and the resulting
script looks more familiar.
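
If you would rather not depend on the server's locale at all, another option (a different technique from setlocale, sketched here under the assumption that index.html is UTF-8) is to name the encoding explicitly on both the input and the output side. Since sys.stdout picks its encoding when the interpreter starts, a later setlocale() call may not affect it, which is one reason to be explicit. copy_text is a hypothetical helper name.

```python
import io


def copy_text(path, binary_out, encoding="utf-8"):
    """Read `path` as text and write it to `binary_out`, naming the encoding on both sides."""
    # Wrap the binary stream in a text layer that encodes with `encoding`
    # instead of whatever the locale suggests.
    text_out = io.TextIOWrapper(
        binary_out, encoding=encoding, newline="\n", write_through=True
    )
    with open(path, encoding=encoding) as f:
        for line in f:
            text_out.write(line)
    text_out.flush()
    # Detach so that discarding the wrapper does not close binary_out.
    text_out.detach()
```

In the script this would be copy_text("/var/www/cgi-data/index.html", sys.stdout.buffer), called after sys.stdout.flush() so that the headers go out first.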


--
https://mail.python.org/mailman/listinfo/python-list
