On Thu, Jun 5, 2014 at 6:27 AM, <ps16thypresenceisfullnessof...@gmail.com> wrote: > I'm completely new to SQL, and recently started using SQLite in one of my > Python programs. I've gotten what I wanted to work, but I'm not sure if I'm > doing it in the best/most efficient way. I have attached some sample code and > would appreciate any (polite) comments about how the SQL (or Python) in it > could be improved. The code is written in Python 2, but I think it should > work in Python 3 if the 4 print statements are changed to function calls. Am > I correct that the function 'set_description2' should work the same way as > 'set_description'? >
Happy to help out! But before I look into the code itself, two small points. Firstly, you're using Google Groups, which miswraps text and double-spaces all quoted text. This is considered to be extremely annoying on this list/newsgroup, and while it's not technically your fault, it will make people less inclined to read and respond to your posts. I advise signing up for the mailing list instead: https://mail.python.org/mailman/listinfo/python-list Alternatively, you can read the comp.lang.python newsgroup in any good newsreader, or you can use Gmane. The second point that you may want to consider: If you think the code is that close to Python 3 compatible, and if your Python 2 is either version 2.6 or 2.7, just put this line at the top of your script: from __future__ import print_function Then you'll be using print() as a function, and your code may well work unchanged on both versions. But do still tell us what version you're running this under; "Python 2" is a start, but it'd be useful to know which minor version, and which operating system. I don't think it makes any difference here, but it's easy enough to just say "Tested on Python 2.6 on Debian Linux" or "Tested on Python 2.7.7 on Windows 7" or somesuch. On to the code! *plays Lord Phillip theme from 'The Beauty Stone'* > def get_description(conn, name): > cur = conn.cursor() > cur.execute("SELECT description FROM ProgrammingLanguages WHERE Name=?", > (name,)) > row = cur.fetchone() > if row: > return row[0] > return None Since cur.fetchone() returns None if there's no row, you can simplify this down: return row and row[0] Not everyone likes that sort of style, so take your pick. > def set_description(conn, name, description): > cur = conn.cursor() > cur.execute("SELECT 1 FROM ProgrammingLanguages WHERE Name=?", (name,)) > row = cur.fetchone() > if description: > with conn: > if not row: > conn.execute("INSERT INTO ProgrammingLanguages VALUES(?,?)", > (name, description)) > else: > conn.execute("UPDATE ProgrammingLanguages SET Description=? " > \ > "WHERE Name=?", (description, name)) > elif row: > with conn: > conn.execute("DELETE FROM ProgrammingLanguages WHERE Name=?", > (name,)) > conn.commit() There's a general principle in Python APIs that a "mode switch" parameter isn't a good thing. Even more strongly, I would be very surprised if attempting to set a blank description deleted the row instead of setting the description to blank. My recommendation: Split this into two functions, set_description and delete_language. In delete_language, just unconditionally delete, don't bother checking for the row's presence first. The insert-or-update concept is a bit of a tricky one. With a simple database, you probably don't need to worry about the race condition of seeing there's no row, then going to insert, and finding that you can't; but still consider an EAFP model instead - attempt the update, and if nothing got updated, do the insert (or, conversely, attempt the insert, and if you get back an error, update instead). That still has a chance of a race (rows can still be inserted or deleted in the middle), but if you do the more likely case first (do you expect to mostly be updating or inserting?), you'll catch most of your work with a single query, which is that bit safer (not to mention faster). However, trying to use an SQL table as if it were a Python dict is fundamentally hard. No matter how you work it, it won't be perfect. > def set_description2(conn, name, description): > with conn: > if description: > conn.execute("INSERT OR REPLACE INTO ProgrammingLanguages " \ > "VALUES(?,?)", (name, description)) > else: > conn.execute("DELETE FROM ProgrammingLanguages WHERE Name=?", > (name,)) > conn.commit() In theory, yes, this will probably achieve the same as your set_description. But since, as I said above, it's fundamentally hard to do the "INSERT OR REPLACE" operation, there's no guarantee it'll be any better than doing it manually; and this is something that not all database backends support, so if you ever change to a more high-end database (eg PostgreSQL), you'll have to recode this anyway. This function, even more than the other, highlights the value of splitting it into two. There's almost no code common to the two branches - just the 'with' block (and I'm not sure what that accomplishes here) and the commit call. May as well split them out and make it clear when you're deleting something. > conn.execute("CREATE TABLE IF NOT EXISTS ProgrammingLanguages(name TEXT " \ > "PRIMARY KEY, description TEXT)") I'd recommend using a triple-quoted string, unless you have a good reason not to. Also, breaking it in the middle of one declaration is usually a bad idea - makes it harder to read. Here's how I'd lay that out: conn.execute("""CREATE TABLE IF NOT EXISTS ProgrammingLanguages ( name TEXT PRIMARY KEY, description TEXT)""") If you have more columns to add, you could break it at any comma; sometimes, you can put logically-related columns onto one line, like this: conn.execute("""create table if not exists ProgrammingLanguages ( Name text primary key, Description text not null, Year_Introduced smallint not null, Year_Withdrawn smallint not null, -- you should be able to put SQL comments here, too Uses_Braces boolean not null, Supports_Unicode boolean not null, Provides_Bignums boolean not null, )""") (I tend to put SQL keywords in lowercase, and I generally put some uppercase letters in field names like this. Neither change has any significant effect - or shouldn't, with a properly SQL compliant database - so do whatever you find more readable.) Note, by the way, that I've peppered 'not null' everywhere. Apart from the primary key (which is implicitly 'not null'), all fields are nullable unless you say otherwise. Generally you don't need that, and it's one less possibility to have to worry about. (You may want to use 'with default' so you can insert into the table without having to specify every field.) > set_description(conn, "Assembly", > "Making Easy Things Very Hard & Hard Things Impossible") Hey, that's not fair! Assembly language makes some hard things really easy, like segfaulting your process. Credit where it's due! :) > set_description(conn, "Assembly", None) > print "Assembly: %s" % get_description(conn, "Assembly") # Should be None This is where I would be using an explicit delete operation. The only advantage of the merged API is that you could get "a string or None" from somewhere and pass it straight to set_description(), and have it figure out whether to update or delete; but in the very rare situation where you'd do that, you can have the explicit "if desc is None" check outside the function just as easily as inside. (And I would advise using "if desc is None" rather than just "if desc", because a blank description shouldn't normally be a problem - or if it is, that should be commented somewhere.) By and large, you have some reasonably decent code there. The points I've made are all fairly minor, and several of them are matters of stylistic taste more than anything else. SQL is sufficiently simple (for simple jobs, at least) that it's hard to do anything really horrendously wrong; you clearly know about parameterized queries, which would be the one biggest unobviosity to learn, and other than that... if it works, it's probably right. Once you start getting into more complicated queries (multi-table joins, true transactional processing, stuff like that), there are more ways to go wrong, but even then, it's pretty easy to figure stuff out. One final point, before I hit send. When you decide that SQLite isn't enough for your purposes (maybe when you need multiple applications to be able to hammer the database - SQLite isn't designed for high throughput), skip MySQL and go straight to PostgreSQL. Even though MySQL might look tempting at first (it has "REPLACE INTO" which does your INSERT/UPDATE job - PostgreSQL doesn't, the SQL standard doesn't), it's full of annoying traps, and its defaults leave a lot to be desired. Go with PostgreSQL; the defaults may be unideal as regards performance, but at least they're conservative as regards reliability, and that's what you need. All the best! ChrisA -- https://mail.python.org/mailman/listinfo/python-list