On 07/01/2019 02:15, ram wrote: > I load these 7000+ records into a Pandas DataFrame and update a subset of > these records. > > What's the optimal way to update this subset of records to SQL table?
First 7000 records is not a lot so performance shouldn't be a big issue even if you do update them all. But since its nice not to waste cycles there are (at least) two approaches: 1) tag your records in Pandas when they get updates then loop over the pandas frame updating the tagged records. 2) Use SQL to only update those records which have changes (since its only 3 fields the SQL should not be too difficult. I don't know SQL Server syntax but most major databases allow you to access the existing value from within in a SQL statement. A quick google suggests that it may be using the value attribute in SQL Server. Something like UPDATE.... WHERE field1.value <> Value1 OR field2.value <> Value2 OR field3.value <> value3 Do a bit of digging in your SQL reference you should find it. But it will be non portable since its not part of the SQL standard but a common vendor extension. Of the two approaches Python is easiest to debug but less scalable. For production use doing it on the server will be more efficient. For your volumes either should be fine. -- Alan G Author of the Learn to Program web site http://www.alan-g.me.uk/ http://www.amazon.com/author/alan_gauld Follow my photo-blog on Flickr at: http://www.flickr.com/photos/alangauldphotos _______________________________________________ Tutor maillist - Tutor@python.org To unsubscribe or change subscription options: https://mail.python.org/mailman/listinfo/tutor