Hi Sarfaraz,

If you need to preserve existing properties, you should write your own transaction function and do all the work in there. get_or_insert is simply a convenience function for this:
entity = cls.get_by_key_name(key_name)
if not entity:
    entity = cls(key_name=key_name, **kwargs)
    entity.put()
return entity

-Nick Johnson

On Fri, Apr 8, 2011 at 9:43 AM, Sarfaraz <[email protected]> wrote:

> Thank you so much for the help, Nick!
>
> I still have a problem: I am not overwriting all the properties; there are
> some properties which are not changed by the CSV.
> If I follow the sample you gave me:
>
> entities = []
> for row in rows:
>     q = Quotes(key_name=cells[0])
>     # Fill in q, or pass the values as arguments to the constructor above
>     entities.append(q)
> db.put(entities)
>
> the other properties become None.
>
> If I use
>
> q = Quotes.get_by_key_name(cells[0])
>
> instead of
>
> q = Quotes(key_name=cells[0])
>
> the other properties are not affected; only those I wish to update from the
> CSV are updated. This works for me, but when I run the cron every 1 minute
> it still uses 2% of CPU per hour.
>
> Can I improve on this?
>
> Regards,
>
> Warm Regards
> Sarfaraz Farooqui
> --
> Strong and bitter words indicate a weak cause
>
> On Thu, Apr 7, 2011 at 7:33 AM, Nick Johnson (Google) <
> [email protected]> wrote:
>
>> Hi Sarfaraz,
>>
>> Which sort of deadline exceeded error are you getting? If it's from the
>> URLFetch call, you can increase the deadline; on offline requests (task
>> queue and cron jobs) it can be up to 10 minutes.
>>
>> You can generally ignore the 'too much CPU' warning; there are no longer
>> per-handler or per-page quotas, and offline requests may use as much CPU
>> as they need (providing you have the quota for it!).
>> You can improve the efficiency of your code, though: instead of calling
>> .put() in each iteration of the loop, batch the results up and store them
>> with a single put call:
>>
>> entities = []
>> for row in rows:
>>     q = Quotes(key_name=cells[0])
>>     # Fill in q, or pass the values as arguments to the constructor above
>>     entities.append(q)
>> db.put(entities)
>>
>> The other modification, above, is to simply call the constructor instead
>> of get_or_insert. get_or_insert executes a transaction to look for an
>> existing entity, and inserts one if it doesn't exist. You're simply
>> overwriting all the values of the entity, though, so there's no need to
>> fetch the old one or do it transactionally - you can simply create a new
>> entity, overwriting any old data.
>>
>> -Nick Johnson
>>
>> On Thu, Apr 7, 2011 at 7:00 AM, Sarfaraz <[email protected]> wrote:
>>
>>> Hi,
>>> I am running a cron every 5 minutes to get CSV data (stock prices) from
>>> a URL using urllib2. After splitting the data by newline and comma I am
>>> storing it in the datastore.
>>>
>>> I am getting many DeadlineExceeded errors and warning messages that the
>>> cron "uri uses a high amount of cpu and may soon exceed its quota".
>>>
>>> I am attaching a few snapshots from my dashboard, and also pasting my
>>> code below for review: is this normal, or am I doing something wrong?
>>>
>>> *Model*
>>>
>>> class Quotes(db.Model):  # Symbol is stored as the key_name
>>>     PriceDate = db.StringProperty()  # Using a string here because I
>>>     # only display this value; no calculations or sorting on it.
>>>     Open = db.FloatProperty()
>>>     High = db.FloatProperty()
>>>     Low = db.FloatProperty()
>>>     Last = db.FloatProperty()
>>>     Change = db.FloatProperty()
>>>     PerChange = db.FloatProperty()
>>>     PrevClose = db.FloatProperty()
>>>     Vol = db.IntegerProperty()
>>>     Val = db.IntegerProperty()
>>>     UpdatedOn = db.DateTimeProperty(auto_now=True)
>>>
>>> *Code for URL fetch, parsing and storing into datastore:*
>>>
>>> I first get the CSV data, which consists of 147 rows and 9 columns
>>> (comma-separated values); the number of rows may increase in future.
>>> First I split by the newline character to get all rows (approx 147).
>>> Then I loop through each row and split it by comma to get the column
>>> values (approx 9 columns).
>>> While looping over the rows I fetch an entity using get_or_insert(key_name),
>>> then I update all the properties and call put().
>>>
>>> Please check the detailed code below (I have removed error handling and
>>> logging for brevity).
>>>
>>> These are a few lines from the CSV data I receive from the URL:
>>> ***********************************************************************
>>> 1010,4/6/2011 3:32:01 PM,26.00,26.20,26.00,26.20,0.10,207359,5423720
>>> 1020,4/6/2011 3:32:01 PM,19.25,19.60,19.10,19.60,0.35,739595,14399067
>>> 1030,4/6/2011 3:32:01 PM,20.20,20.30,20.15,20.30,0.10,31833,643936
>>> 1040,4/6/2011 3:32:01 PM,30.40,30.80,30.40,30.80,0.00,10621,325830
>>> 1050,4/6/2011 3:32:01 PM,49.30,50.50,49.00,50.00,1.20,126361,6326252
>>> ***********************************************************************
>>>
>>> HERE IS THE CODE
>>>
>>> url = "http://www.example.com/somefile.ashx"
>>>
>>> result = urllib2.urlopen(url)
>>> result = result.read()
>>> rows = result.split("\n")
>>> for row in rows:
>>>     cells = row.split(",")
>>>     q = Quotes.get_or_insert(str(cells[0]).strip())  # cells[0] contains my key_name
>>>     q.PriceDate = cells[1]
>>>     q.Open = float(cells[2])   # opening price
>>>     q.High = float(cells[3])   # day's high price
>>>     q.Low = float(cells[4])    # day's low price
>>>     q.Last = float(cells[5])
>>>     q.Change = float(cells[6])
>>>     q.PrevClose = float(q.Last - q.Change)
>>>     q.PerChange = round((q.Change * 100) / q.PrevClose, 2)
>>>     q.Vol = int(cells[7])
>>>     q.Val = int(cells[8])
>>>     q.put()
>>>
>>> *Is this using too much CPU and exceeding the deadline because I am
>>> fetching one entity at a time and calling put() for each entity, that is,
>>> 147 times? And also because I am doing some calculations such as
>>> q.PrevClose = (q.Last - q.Change) directly on the properties instead of
>>> calculating them in a separate variable and then assigning them? Are
>>> these the reasons? Please help me get this right.*
>>>
>>> *Attached: Dashboard screenshots*
>>>
>>> --
>>> You received this message because you are subscribed to the Google Groups
>>> "Google App Engine" group.
>>> To post to this group, send email to [email protected].
>>> To unsubscribe from this group, send email to
>>> [email protected].
>>> For more options, visit this group at
>>> http://groups.google.com/group/google-appengine?hl=en.
>>
>> --
>> Nick Johnson, Developer Programs Engineer, App Engine

--
Nick Johnson, Developer Programs Engineer, App Engine
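The arithmetic in the code under review can be checked against the sample CSV rows in the thread. Below is a minimal pure-Python sketch of the parsing and the derived fields (PrevClose, PerChange) with no App Engine dependencies; parse_row is an illustrative helper, not part of the original handler:

```python
def parse_row(line):
    """Split one comma-separated quote row and compute the derived
    fields the same way the handler in the thread does."""
    cells = line.split(",")
    last = float(cells[5])
    change = float(cells[6])
    prev_close = last - change                        # previous close
    per_change = round(change * 100 / prev_close, 2)  # percent change
    return {
        "Symbol": cells[0].strip(),  # used as the datastore key_name
        "PriceDate": cells[1],
        "Open": float(cells[2]),
        "High": float(cells[3]),
        "Low": float(cells[4]),
        "Last": last,
        "Change": change,
        "PrevClose": prev_close,
        "PerChange": per_change,
        "Vol": int(cells[7]),
        "Val": int(cells[8]),
    }

# One of the sample rows from the thread: Last 50.00 minus Change 1.20
# gives PrevClose 48.80, and 1.20 is about 2.46% of that.
row = parse_row("1050,4/6/2011 3:32:01 PM,49.30,50.50,49.00,50.00,1.20,126361,6326252")
```

One more detail worth a guard in the real loop: if the feed ends with a newline, result.split("\n") produces an empty final row, so something like `if not row.strip(): continue` before the split avoids indexing into an empty cell list.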

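The "preserve existing properties" pattern from the top of the thread can be sketched in plain Python. Here a dict stands in for the datastore purely for illustration, and update_quote is a hypothetical helper (real App Engine code would do this read-modify-write inside db.run_in_transaction):

```python
# Illustration only: a plain dict stands in for the datastore, and
# update_quote is a hypothetical helper, not an App Engine API.
datastore = {}  # key_name -> dict of properties

def update_quote(key_name, **changed):
    """Fetch the existing entity (or create it), then overwrite only
    the properties supplied by the CSV, preserving everything else."""
    entity = datastore.setdefault(key_name, {})
    entity.update(changed)  # untouched properties keep their old values
    return entity

# First write includes a property the CSV feed never supplies:
update_quote("1010", Notes="hand-entered", Last=26.20)
# A later CSV update changes Last but leaves Notes intact:
update_quote("1010", Last=26.50, Change=0.30)
```

This is the behaviour Sarfaraz gets from get_by_key_name followed by put(): only the assigned properties change, while constructing a fresh Quotes(key_name=...) resets every unassigned property to its default.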