On 4/9/2013 07:38, Ferrous Cranus wrote:
> On 4/9/2013 2:26 PM, Dave Angel wrote:
>>
>>>>
>>>> So first in the interpreter, I ran
>>>>
>>>> >>> f = open("junk.txt", "w")
>>>> >>> f.write(b'\xb6\xe3\xed\xf9\xf3\xf4\xef\xfc\xed\xef\xec\xe1 \xf3\xf5\xf3\xf4\xde\xec\xe1\xf4\xef\xf2\n')
>>>> >>> f.close()
>>>>
<snip>
>> So since the tets.py file was a sidetrack, I just ran those three lines
>> in the interpreter.
>>
> I'm still confused about this.
>
> say we save those 3 lines inside junk.txt and we save it by default as utf-8
>
> when we 'file junk.txt'
>
> what will file respond with?

junk2.txt: ASCII text

> filename's charset?
>
> or
>
> will it look at the bytestring within to decide what encoding it uses?
>

'file' isn't magic. And again, it doesn't look at the filename, it looks
at the content. What heuristics it uses, I don't know, but it has
hundreds of them.

(I wish you hadn't confused the issue by using the same name junk.txt
for an entirely different purpose.)

When it looks at a file like this one, it looks only at the bytes within
it. In this case, the instance of 'file' on my machine decides it's an
ASCII file. If I add a silly shebang line

#!/usr/tmp/pyttthon

it says

junk2.txt: a /usr/tmp/pyttthon script, ASCII text executable

It doesn't know it's python, it just trusts the shebang line. And it
identifies it as ASCII, not utf-8, since there are no non-ascii
characters in it. It certainly does not try to interpret the b'xxxx'
byte string by Python syntax rules.

-- 
DaveA
-- 
https://mail.python.org/mailman/listinfo/python-list
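
To make the "it looks only at the bytes" point concrete, here is a rough sketch in Python. It is not how 'file' is actually implemented; guess_text_encoding is a hypothetical helper that applies one simple content-only check, and the byte string is shortened to the first three escapes from the post for readability.

```python
def guess_text_encoding(data: bytes) -> str:
    """Hypothetical content-only guess; NOT the real file(1) algorithm."""
    # Every byte below 0x80 means the content is plain ASCII.
    if all(byte < 0x80 for byte in data):
        return "ASCII text"
    # Otherwise, see whether the bytes happen to form valid UTF-8.
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return "non-ASCII, non-UTF-8 data"
    return "UTF-8 text"


# The three interpreter lines saved as *source text*. The escapes are spelled
# out as backslash, 'x', and two hex digits, so the file holds only ASCII
# bytes. (Byte string shortened here; an assumption made for illustration.)
source = (
    'f = open("junk.txt", "w")\n'
    "f.write(b'\\xb6\\xe3\\xed\\n')\n"
    "f.close()\n"
).encode("utf-8")

# The bytes those escapes stand for once Python has evaluated the literal.
raw = b"\xb6\xe3\xed\n"

print(guess_text_encoding(source))
print(guess_text_encoding(raw))
```

The first call sees only ASCII bytes, because nothing in the check evaluates the b'...' literal by Python's rules. The second call sees the high bytes themselves, so the result differs even though both inputs "contain" the same escapes in some sense.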