while c = f.read(1)
I have a Python snippet:

    f = open("blah.txt", "r")
    while True:
        c = f.read(1)
        if c == '': break # EOF
        # ... work on c

Is there some way to make this code more compact and simple? It's a bit spaghetti.

This is what I would ideally like:

    f = open("blah.txt", "r")
    while c = f.read(1):
        # ... work on c

But I get a syntax error.

    while c = f.read(1):
            ^
    SyntaxError: invalid syntax

And read() doesn't work that way anyway because it returns '' on EOF and '' != False. If I try:

    f = open("blah.txt", "r")
    while (c = f.read(1)) != '':
        # ... work on c

I get a syntax error also. :(

Is this related to Python's expression vs. statement syntactic separation? How can I write this code more nicely?

Thanks

--
http://mail.python.org/mailman/listinfo/python-list
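For readers on Python 3.8 or later: the assignment expression operator `:=` (PEP 572) allows a loop very close to the one asked for above. A minimal sketch, assuming an in-memory io.StringIO object stands in for blah.txt; `collect_chars` is a hypothetical helper name, not something from the thread:

```python
import io

def collect_chars(fileobj):
    """Read one character at a time until EOF and return them as a list."""
    chars = []
    # read(1) returns '' at EOF; '' tests false, so the loop ends there.
    while c := fileobj.read(1):
        chars.append(c)  # ... work on c
    return chars

f = io.StringIO("abc")  # stand-in for open("blah.txt", "r")
print(collect_chars(f))
```

On older interpreters the `while True: ... break` form in the message remains the way to write this.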
Re: while c = f.read(1)
Okay, the 1st option seems more complicated, and the 2nd option, while simpler to my eye, makes me worry about file descriptors, resource management and memory running out. My files are large, hence 1 character at a time, not f.read().

This is code from another employee and I'm just in the stages of going through it and doing a basic clean-up before I get on to a proper efficiency assessment, hence I don't want to change the way it works, just make it as short and lucid as I can.

Your suggestion using the generator expression again seems more complex than the original.

They're good suggestions though. Thank you! I'm just now catching up with everybody's comments. Some of them seem quite intriguing.
Re: while c = f.read(1)
The 2nd option has real potential for me. Although the total amount of code is greater, it factors some complexity out of the actual job, so that code is not obscured by unnecessary complexity. IMHO that's great practice.

I like it! Thank you!

The assurance that the code was clear was good too - that is also what I need to hear (decisively dismissing the quest for something shorter and punchier will make me happy as much as succeeding in it).

I'll read all the other suggestions first though.
Re: while c = f.read(1)
Donn Cave wrote:
> Actually I'd make it a little less compact -- put the "break"
> on its own line -- but in any case this is fine. It's a natural
> and ordinary way to express this in Python.
>
> ...
> | But I get a syntax error.
> |
> |     while c = f.read(1):
> |             ^
> | SyntaxError: invalid syntax
> |
> | And read() doesn't work that way anyway because it returns '' on EOF
> | and '' != False. If I try:
>
> This is the part I really wanted to respond to. Python managed
> without a False for years (and of course without a True), and if
> the introduction of this superfluous boolean type really has led
> to much of this kind of confusion, then it was a bad idea for sure.

Sorry, that was patently untrue - as far as the while loop is concerned '' and False both cause the loop to exit, so for all intents and purposes, in that context, '' may as well equal False.

I think True and False constants are a good idea. And I like the way it worked. The reason I got confused though was because I cannot do this:

    while c = f.read(1):
        # ...

> The condition that we're looking at here, and this is often the
> way to look at conditional expressions in Python, is basically
> something vs. nothing. In this and most IO reads, the return
> value will be something, until at end of file it's nothing.
> Any type of nothing -- '', {}, [], 0, None - will test "false",
> and everything else is "true". Of course True is true too, and
> False is false, but as far as I know they're never really needed.

I'm all in favour of logic that looks like:

    if '' or 0 or {} or None:
        # never gets done
    else:
        # always gets done

My confusion stems from the syntax error, not from Python's boolean logic. Sorry, my statement at the end regarding '' != False was a bit of a red herring. :\

I'm also in the habit of doing things like this:

    def f(dict=None):
        dict = dict or {}
        # ...

> You are no doubt wondering when I'm going to get to the part where
> you can exploit this to save you those 3 lines of code. Sorry,
> it won't help with that.

Ah, but if it helps me to understand why Python forces me to do it, then although it won't save 3 lines it will stop me from complaining and posting naff newbie questions. ;)

> | Is this related to Python's expression vs. statement syntactic
> | separation? How can I be write this code more nicely?
>
> Yes, exactly. Don't worry, it's nice as can be. If this is
> the worst problem in your code, you're far better off than most
> of us.

Okay, well, that is reassuring but a little disappointing.

Are there any plans (PEPs?) in the distant future to unify statements and expressions in the Python syntax so I can generally do things like this:

    x = if aboolean: else:

and

    while c = f.read(1):
        # ...

I couldn't find any general PEPs along these lines, only specific ones (e.g. 308 re. an if-then-else expression).

It does seem quirky given Python's indentation syntax. :(

However I notice I can already do some things I wouldn't expect given a strict statement/expression separation, such as:

    x = y = z = 0

I suppose I have to learn these as special cases. (?)
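The "something vs. nothing" rule discussed in this message can be checked directly. A small sketch; `is_nothing` and `make_registry` are hypothetical names chosen for illustration. It also shows one consequence of the `dict = dict or {}` habit: a caller's empty dict tests false too, so it is replaced by a new dict rather than used.

```python
def is_nothing(value):
    """Return True when value tests false in an if/while condition."""
    return not value

# Each "nothing" value tests false, exactly as False itself does.
nothings = ['', {}, [], 0, None, False]
print(all(is_nothing(v) for v in nothings))

def make_registry(mapping=None):
    # `or` yields its right operand whenever the left one tests false,
    # so both None and a caller-supplied empty dict produce a new dict.
    mapping = mapping or {}
    return mapping

shared = {}
print(make_registry(shared) is shared)   # the empty dict is not reused
full = {'a': 1}
print(make_registry(full) is full)       # a non-empty dict is passed through
```

If the caller's empty dict must be kept, testing `if mapping is None` instead of relying on truthiness avoids the replacement.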
Re: while c = f.read(1)
I've always accepted the None vs. 0 as a caveat of the added convenience but I think it's ultimately worth it.

Sorry, I didn't want to start a "nothing values evaluate to false" argument.

I'll go read python-dev archives a bit and see if there's anything useful for me to know.

Thanks
Re: while c = f.read(1)
John Machin wrote:
> > Is some way to make this code more compact and simple? It's a bit
> > spaghetti.
>
> Not at all, IMHO. This is a simple forward-branching exit from a loop in
> explicable circumstances (EOF). It is a common-enough idiom that doesn't
> detract from readability & understandability. Spaghetti is like a GOTO
> that jumps backwards into the middle of a loop for no discernable reason.

While that is true, I had high hopes that I could do this:

    while c = f.read(1):
        # ...

And relative to that, it is more complex. And although I am nit-picking to try to simplify this code, I wanted to understand why Python works in this way (even if that's just "historical reasons"), and check to see if there was not some shorter, more modern Pythonic alternative.

I did actually like Robert Kern's suggestion which used an iterator and a function to factor out the complexity of setting it up. I think that is actually better code than the original.

It matches my philosophy in programming of pushing complexity *out* of the code which does the actual job, even if it means writing a few support functions/classes/whatever. I know they can be re-used and refined, and I know that it is the code that does the actual job that is most likely to be rewritten in future revisions of the code with shifts in requirements.

> You have a bit of a misunderstanding here that needs correcting:
>
> In "if " and "while ", is NOT restricted to being in
> (True, False). See section 5.10 of the Python Reference Manual:

I'm sorry! I realise that now and I'm sorry to have caused the traffic I did. Thank you for pointing it out to me though - it's pretty fundamental Python!

*Greg thumbtacks a note to his forehead*

> How about
>     for c in f.read():
> ?
> Note that this reads the whole file into memory (changing \r\n to \n on
> Windows) ... performance-wise for large files you've spent some memory
> but clawed back the rather large CPU time spent doing f.read(1) once per
> character. The "more nicely" factor improves outasight, IMHO.

I would if only I had any kind of guarantee on the file size, but I don't - this code is for reading a header out of a binary file which uses delimiters and escape characters to mark out its fields. I didn't design the format, but after cleaning up the code that deals with it, I may *re*design it. ;)

> Mild curiosity: what are you doing processing one character at a time
> that can't be done with a built-in function, a standard module, or a
> 3rd-party module?

Our company is designing a new file type. *sigh*. Confidentiality prevents me from saying any more, too. If that bugs you because it's not open source, sorry, I need a job.

Don't worry though, I'm developing an open source remote GUI for the code management system we're using called Aegis (http://aegis.sf.net). It's not sufficiently implemented yet to warrant posting publicly (I'd describe its current state as "framework") but if anybody reading this is interested then give me a yell and it'll have a public project page or something overnight. ;)
Re: while c = f.read(1)
That is both clever and useful! I never would have thought of doing that.

This seems to me like a general way to "workaround" the Python statement/expression separation woes I've been having, in cases where I really really want it.

Now, where can I copy this out to so I will be able to find it when the occasion arises? Hmm... *jot jot jot*

Thanks muchly! :-)
Re: while c = f.read(1)
> On the other hand, if you've already planned another pass over the code, that might be the time to look into this.

Exactly. And when I do that pass I will definitely try buffering the data 10 or 100 meg at a time before entering the 1-char-at-a-time loop, or using mmap to similar ends.
Re: while c = f.read(1)
John Machin wrote:
> Sigh indeed. If you need to read it a character at a time to parse it,
> the design is f***ed.

There is always the potential to do 2k buffered reads and, once in memory, pick the contents apart character-wise. I assume something similar would happen for tokenising XML and HTML, which would presumably often 'read until "<"'.
Re: while c = f.read(1)
Robert Kern wrote:
> > Robert> Please quote the message you are replying to. We have no
> > Robert> idea what "the 2nd option" is.
> >
> > I think he means the second option you presented
> >
> >   If you must read one character at a time,
> >
> >     def reader(fileobj, blocksize=1):
> >         """Return an iterator that reads blocks of a given size from a
> >         file object until EOF.
> >         ...snip
> >
> > With a decent threaded news/mail reader, the thread provides
> > sufficient context, no?
>
> Not taking into account the python-list gateway or GMane. I see his
> message threaded directly under his original one.
>
> And dammit, I'm vain enough that if people are complimenting my code, I
> want to be sure about it. ;-)

Sorry Robert, I'm using Google Groups until I figure out the news settings for our ISP at work (which is most unhelpful). I'm not used to using it and the default 'Reply' option doesn't quote. :\ Not a good excuse, I know.

Let's see... to summarise the responses I got, I liked yours the best, Robert. It was:

    def reader(fileobj, blocksize=1):
        """Return an iterator that reads blocks of a given size from a
        file object until EOF.
        """
        # Note that iter() can take a function to call repeatedly until it
        # receives a given sentinel value, here ''.
        return iter(lambda: fileobj.read(blocksize), '')

    f = open('blah.txt', 'r')
    try:
        for c in reader(f):
            # ...
    finally:
        f.close()

I like it because I can make 'reader' a stock library function I can potentially re-use, and it removes complexity from the area where I want to place the domain-specific logic (where I call reader()), which I have a personal preference for.

Potentially the added complexity of buffering larger chunks at a time for efficiency could also be put into the reader() function to keep the rest of the code super clean and neat.
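The buffering idea in the last paragraph could look roughly like the following. This is only a sketch under stated assumptions: `buffered_chars` is a hypothetical name, the 2048 default is an arbitrary illustrative block size, and io.StringIO stands in for the real file. It keeps Robert's iter()-with-sentinel trick for the block reads and hands characters out one at a time.

```python
import io

def buffered_chars(fileobj, blocksize=2048):
    """Yield single characters while reading blocksize characters per call."""
    # iter() with a sentinel calls the lambda until it returns '' (EOF).
    for block in iter(lambda: fileobj.read(blocksize), ''):
        # Pick the in-memory block apart character-wise.
        yield from block

with io.StringIO("header;data") as f:  # stand-in for open('blah.txt', 'r')
    for c in buffered_chars(f, blocksize=4):
        print(c, end='')
    print()
```

For a file opened in binary mode, read() returns bytes, so the sentinel would need to be b'' rather than ''.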
Re: Django Vs Rails
Diez B. Roggisch wrote:
> I tried to find out if subway and
> rails can do the same - that is, generate the sql. For subway the lack
> of documentation prevented that, and I didn't find it in rails , too.

In Rails you can do that with the command:

    $ rake db_structure_dump

However I think it's not the prescribed way of using it because it tends to involve losing all your data every time you make a schema change. I think they recommend doing this once at the start of development if you don't have a DB schema yet. Once you're up and running, if you rely on ALTER TABLE type commands and manually update your code to match the new schema, although it's more work it leaves your data intact. I think that's the idea anyway.

> And there is at least one shortcoming to the first approach, when using
> the most popular RDBMS, MySQL: The lack of foreign key constraints makes
> me wonder how to automatically infer 1:n or m:n relationships. From a
> rails tutorial, I see that one has to declare these too:
>
> http://www.onlamp.com/pub/a/onlamp/2005/01/20/rails.html?page=5
>
>
> But maybe someone who has experience with subway or rails can elaborate
> on this?

From my experience with Rails, the OR mapping isn't hugely automated. That is, some manual work is required if you change the schema, to update the code that operates upon it. However, although this is another step, it is pretty trivial due to the metaprogramming style methods. Besides, if you're altering the schema you tend to have to update some code anyway as you're most likely altering the functionality of the system a little.

And as I said above, a loose/manual OR mapping has its benefits; I don't feel anxious about losing my data/schema when using Rails because I know it leaves it alone.
Appending paths relative to the current file to sys.path
Out of interest, are there any standard Python modules that do this:

    def appendRelativeIncludePath(*relpath):
        dir = os.path.abspath(os.path.join(os.path.dirname(__file__), *relpath))
        if not dir in sys.path:
            sys.path.append(dir)

I ask because I often find myself doing this:

    # myproject/lib/mymodule/test/test.py
    if __name__ == '__main__':
        import os.path
        import sys
        dir = os.path.abspath(os.path.join(os.path.dirname(__file__), '..', '..'))
        if not dir in sys.path:
            sys.path.append(dir)
        import mymodule # it's always 2 directories up from here

And it seems like a lot to type so often. I can't factor out this functionality unless I create a little module for doing this and install it in a standard include path. You see, I dream of doing this:

    # myproject/lib/mymodule/test/test.py
    if __name__ == '__main__':
        import astandardmodule
        astandardmodule.appendRelativeIncludePath('..', '..')
        import mymodule

Which, as you can see, is much shorter. ;)

--
Greg McIntyre
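One detail matters if the function above is moved into a module of its own: inside that module, `__file__` names the helper module, not the script that calls it. A sketch that passes the calling file in explicitly; `append_relative_path` and its `base_file` parameter are hypothetical names, and the path in the usage lines is just the layout from the message:

```python
import os
import sys

def append_relative_path(base_file, *relpath):
    """Append a directory, given relative to base_file's directory, to sys.path.

    The directory is added only if it is not already listed, and its
    absolute path is returned.
    """
    directory = os.path.abspath(
        os.path.join(os.path.dirname(base_file), *relpath))
    if directory not in sys.path:
        sys.path.append(directory)
    return directory

# A real script would pass __file__; a literal path is used here so the
# sketch runs anywhere.
example = os.path.join('myproject', 'lib', 'mymodule', 'test', 'test.py')
print(append_relative_path(example, '..', '..'))
```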
Re: the python way?
I like this solution. For some reason it made me want to write a Ruby version of it. Possibly something about the title and the fact that Ruby enthusiasts are always going on about "the Ruby way".

    VOWELS = 'aeiouy'.split('')

    def reinterpolate(word)
      letters = word.split('').sort_by{rand}
      vowels = letters.find_all{|l| VOWELS.include?(l) }
      consonants = letters - vowels
      return vowels.zip(consonants).map{|v, c| "#{c}#{v}" }.join('')
    end

    puts reinterpolate("encyclopedias")