On Wed, 05 Oct 2011 21:39:17 -0700, Greg wrote:
> Here is the final code for those who are struggling with similar
> problems:
>
> ## open and decode file
> # In this case, the encoding comes from the charset argument in a meta tag
> # e.g.
> fileObj = open(filePath,"r").read()
> fileContent =
xDog Walker writes:
> What is this io of which you speak?
It was introduced in Python 2.6.
--
John Gordon A is for Amy, who fell down the stairs
gor...@panix.com B is for Basil, assaulted by bears
-- Edward Gorey, "The Gashl
On Thursday 2011 October 06 10:41, jmfauth wrote:
> or (Python2/Python3)
>
> >>> import io
> >>> with io.open('abc.txt', 'r', encoding='iso-8859-2') as f:
> ...     r = f.read()
> ...
> >>> repr(r)
> u'a\nb\nc\n'
> >>> with io.open('def.txt', 'w', encoding='utf-8-sig') as f:
> ... t
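For readers following the io.open() suggestion, here is a minimal sketch of what the 'utf-8-sig' codec does on write and on read. The scratch file path is hypothetical (the 'abc.txt' and 'def.txt' files from the session are not assumed to exist), and newline="\n" is passed only to keep the bytes identical across platforms. It should run on Python 2.6+ and Python 3.

```python
import io
import os
import tempfile

# Hypothetical scratch file used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "demo.txt")

# Writing with 'utf-8-sig' prepends a UTF-8 byte order mark (BOM).
with io.open(path, "w", encoding="utf-8-sig", newline="\n") as f:
    f.write(u"a\nb\nc\n")

# Read the raw bytes back to see what actually landed on disk.
with io.open(path, "rb") as f:
    raw = f.read()

# Reading with the same codec strips the BOM again.
with io.open(path, "r", encoding="utf-8-sig") as f:
    text = f.read()

print(repr(raw))
print(repr(text))
```

Plain 'utf-8' would write the same bytes without the three leading BOM bytes, so the choice only matters when the consumer of the file expects (or chokes on) a BOM.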
On 6 Oct, 06:39, Greg wrote:
> Brilliant! It worked. Thanks!
>
> Here is the final code for those who are struggling with similar
> problems:
>
> ## open and decode file
> # In this case, the encoding comes from the charset argument in a meta tag
> # e.g.
> fileObj = open(filePath,"r").read()
>
On Thu, Oct 6, 2011 at 8:29 PM, Ulrich Eckhardt wrote:
> Just wondering, why do you split the latter two parts? I would have used
> codecs.open() to open the file and define the encoding in a single step. Is
> there a downside to this approach?
>
Those two steps still happen, even if you achieve
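A small sketch of the point under discussion: whether the encoding is handed to the file object or applied by hand before a binary write, the same bytes should end up on disk. This uses io.open() rather than codecs.open() (a substitution on my part, since io.open() was already mentioned in the thread), and the sample text and file names are made up for the illustration.

```python
import io
import os
import tempfile

# Hypothetical sample text and scratch files.
text = u"zak\u0142adnik\u00f3w\n"
tmp = tempfile.mkdtemp()
one_step_path = os.path.join(tmp, "one_step.txt")
two_step_path = os.path.join(tmp, "two_step.txt")

# "Single step": the file object encodes the text as it is written.
with io.open(one_step_path, "w", encoding="utf-8", newline="\n") as f:
    f.write(text)

# Two explicit steps: encode to bytes, then write the bytes in binary mode.
data = text.encode("utf-8")
with io.open(two_step_path, "wb") as f:
    f.write(data)

# Compare what each approach put on disk.
with io.open(one_step_path, "rb") as f:
    one_step_bytes = f.read()
with io.open(two_step_path, "rb") as f:
    two_step_bytes = f.read()

print(one_step_bytes == two_step_bytes)
```

The single-step form is mostly a convenience; the explicit form makes it obvious where text becomes bytes, which can help when hunting an encoding bug.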
On 06.10.2011 05:40, Steven D'Aprano wrote:
> (4) Do all your processing in Unicode, not bytes.
> (5) Encode the text into bytes using UTF-8 encoding.
> (6) Write the bytes to a file.
Just wondering, why do you split the latter two parts? I would have used
codecs.open() to open the file and defi
On Thu, Oct 6, 2011 at 3:39 PM, Greg wrote:
> Brilliant! It worked. Thanks!
>
> Here is the final code for those who are struggling with similar
> problems:
>
> ## open and decode file
> # In this case, the encoding comes from the charset argument in a meta tag
> # e.g.
> fileContent = fileObj.
Brilliant! It worked. Thanks!
Here is the final code for those who are struggling with similar
problems:
## open and decode file
# In this case, the encoding comes from the charset argument in a meta tag
# e.g.
fileObj = open(filePath,"r").read()
fileContent = fileObj.decode("iso-8859-2")
fileSo
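Since the posted code is cut off above, here is a rough, self-contained sketch of the general read, decode, encode, write pattern it appears to follow. It is not Greg's actual code: the transcode() helper, the file names, and the sample bytes are all hypothetical, and the parsing step in between is left out entirely.

```python
import io
import os
import tempfile


def transcode(src_path, dst_path, source_encoding, target_encoding="utf-8"):
    """Hypothetical helper: decode a file from one encoding, re-encode to another."""
    # Read raw bytes; nothing is assumed about the encoding yet.
    with io.open(src_path, "rb") as f:
        raw = f.read()
    # Decode to Unicode using the encoding declared by the source
    # (for an HTML page, the charset from its meta tag).
    text = raw.decode(source_encoding)
    # Encode only at the output boundary and write the bytes.
    with io.open(dst_path, "wb") as f:
        f.write(text.encode(target_encoding))
    return text


# Made-up input: ISO-8859-2 bytes for a short Polish phrase.
tmp = tempfile.mkdtemp()
src = os.path.join(tmp, "page.txt")
dst = os.path.join(tmp, "out.txt")
with io.open(src, "wb") as f:
    f.write(b"Branie zak\xb3adnik\xf3w")

decoded = transcode(src, dst, "iso-8859-2")

# Read the result back to check what was written.
with io.open(dst, "rb") as f:
    written = f.read()

print(repr(decoded))
print(repr(written))
```

Opening the file in binary mode avoids any implicit decoding, so the only encoding in play is the one named explicitly.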
On Wed, 05 Oct 2011 16:35:59 -0700, Greg wrote:
> Hi, I am having some encoding problems when I first parse stuff from a
> non-english website using BeautifulSoup and then write the results to a
> txt file.
If you haven't already read this, you should do so:
http://www.joelonsoftware.com/article
Hi, I am having some encoding problems when I first parse stuff from a
non-english website using BeautifulSoup and then write the results to
a txt file.
I have the text both as a normal (text) and as a unicode string
(utext):
print repr(text)
'Branie zak\xc2\xb3adnik\xc3\xb3w'
print repr(utext)
u
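One plausible reading of the byte string shown for text (an inference from the bytes, not something stated in the post): the page was ISO-8859-2, it was decoded as if it were Latin-1, and the result was then encoded as UTF-8. Under that assumption, the steps can be undone as sketched below.

```python
# The byte string from the post.
garbled = b"Branie zak\xc2\xb3adnik\xc3\xb3w"

# Step 1: undo the UTF-8 encoding. This yields text containing
# U+00B3 and U+00F3, i.e. the Latin-1 view of the original bytes.
as_latin1_text = garbled.decode("utf-8")

# Step 2: map each character back to the single byte it came from.
original_bytes = as_latin1_text.encode("latin-1")

# Step 3: decode those bytes with the encoding the page really used
# (assumed here to be ISO-8859-2).
repaired = original_bytes.decode("iso-8859-2")

print(repr(original_bytes))
print(repr(repaired))
```

If the assumption holds, byte 0xB3 comes out as U+0142 (the Polish l with stroke) instead of a superscript three. Decoding the raw page with the right charset in the first place avoids the need for this kind of repair.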