On 03/07/18 20:39, Daryl Heppner wrote: > The idea of a class is something I'd unsuccessfully tried but I'm up > for the challenge! I'm very comfortable in Transact SQL but OOP is > new to me.
OK, What I meant by using a class was to define one (or more?) that encompassed all the data in one Deal. Then have the init method take a row of XML and populate all the fields. Something like this (Note this is not pretending to be correct, its just your own code reformed slightly): class DealTerm: def __init__(self,xml_dt): self.id = xml_dt.get("ID") self.start = pr.parse(dts.find("CommencementDate").text) self.term = int(xml_dt.find("LeaseTerm").text) self.area = xml_dt.find("RentableArea").text self.baseRents = [BaseRent(br) for br in xml_dt.findall("BaseRents/BaseRent")] class BaseRent: def __init__(self, xml_br): self.id = xml_br.get("ID") self.begmo = int(xml_br.find("BeginIn").text) if xml_br.find("Duration").text is not None: duration = int(xml_br.find("Duration").text) else: duration = 0 self.amt = xml_br.find("Rent").text self.per = xml_br.find("Period").text self.perst = dstart + rd.relativedelta(months=begmo-1) self.perend = perst + rd.relativedelta(months=duration-1) self.months = [dt.strftime("%Y-%m-%d") for dt in rrule(MONTHLY, dtstart=perst, until=perend)] class Deal: def __init__(self, dl): # use XT to parse the xml string self.id = dl.get("DealID") self.terms = [DealTerm(dts) for dts in dl.findall("DealTerms/DealTerm")] for billdt in BillDate(perst, perend): billdt def __str__(self): # returns string for print() etc ''' only uses first term value for convenience''' return self.id + terms[0].id + terms[0].baseRents[0].id + ... # The billdt line above makes no sense since it will do nothing. # Although in the >>> interpreter it would display the value # in a script it will do nothing. # I'm also not sure if the loop should be part of the # BaseRent or the DealTerm or the Deal. I suspect it # should be in BaseRent but I've left it where it was for now! # Now the main code just needs the outer loops: tree = ET.parse(r'\\DealData.xml') root = tree.getroot() deals = [] for deal in root.findall("Deals"): for dl in deal.findall("Deal"): deals.append( Deal(dl) ) if dl.id == "706880": print(dl) #uses __str__() to create string Hopefully you can see how, by putting the xml parsing into the classes that represent the actual real-world data objects that you are dealing with you simplify the logic of the higher level code and make it clearer what is being read and where? Notice also that in Deal I'm using a list to hold the DealTerms. And in DealTerm I use a list to hold the BaseRents. That way we don't just throw away the early values as your code currently does. You can extend this to cover the other data blocks in your XML if necessary, eg. Commission, Tenant, User etc. > Here's my updated code with a sample XML. The full XML > file is over 600K rows. > > I'd appreciate any feedback. > > Daryl > > import xml.etree.ElementTree as ET > import pyodbc > import dateutil.relativedelta as rd > import dateutil.parser as pr > from dateutil.rrule import rrule, MONTHLY > > tree = ET.parse(r'\\DealData.xml') > root = tree.getroot() > > class BillDate: > def __init__(self, begin, end): > self.current = begin > self.end = end > > def __iter__(self): > return self > > def __next__(self): > if self.current < self.end: > billmo = self.current > self.current += rd.relativedelta(months=1) > return billmo > else: > raise StopIteration Since this acts like a virtual collection you should probably call it BillDates - plural. Then the for loop reads more naturally as for date in BillDates(start, end): # do something with each date > > > for deal in root.findall("Deals"): > for dl in deal.findall("Deal"): > dealid = dl.get("DealID") > for dts in dl.findall("DealTerms/DealTerm"): > dtid = dts.get("ID") > dstart = pr.parse(dts.find("CommencementDate").text) > dterm = int(dts.find("LeaseTerm").text) > darea = dts.find("RentableArea").text > for brrent in dts.findall("BaseRents/BaseRent"): > brid = brrent.get("ID") > begmo = int(brrent.find("BeginIn").text) > if brrent.find("Duration").text is not None: > duration = int(brrent.find("Duration").text) > else: > duration = 0 > brentamt = brrent.find("Rent").text > brper = brrent.find("Period").text > perst = dstart + rd.relativedelta(months=begmo-1) > perend = perst + rd.relativedelta(months=duration-1) > months = [dt.strftime("%Y-%m-%d") for dt in rrule(MONTHLY, > dtstart=perst, until=perend)] > for billdt in BillDate(perst, perend): > billdt > > if dealid == "706880": > print(dealid, dtid, brid, begmo, dstart, dterm, darea, > brentamt, brper, duration, perst, \ > perend, billdt) > > <?xml version="1.0" encoding="UTF-8"?> > <DealExport> > <jobid>c03528f241fdc8e4</jobid> > <Deals> > <Deal DealID="706880"> > <Properties> > <Property ID="69320"> > <Name>12345_Place - Bldg 1</Name> > <CreatedAt>2015-10-28T04:09:07Z</CreatedAt> > <UpdatedAt>2018-04-24T21:56:25Z</UpdatedAt> > <Symbols> > <Type>SourceID</Type> > <Value>12345~12345</Value> > </Symbols> > <Address> > <City>Calgary</City> > <State>AB</State> > <Street>1234 5 Ave S.W.</Street> > <PostalCode>T3P 0N7</PostalCode> > <Country>Canada</Country> > </Address> > <Spaces> > <Space ID="848294"> > <Suite>800</Suite> > <SpaceAvailable>11225</SpaceAvailable> > <FloorName>8</FloorName> > <FloorPosition>7</FloorPosition> > <FloorRentableArea/> > > <CreatedAt>2016-04-20T13:47:13Z</CreatedAt> > <UpdatedAt>2017-10-02T09:17:20Z</UpdatedAt> > <Symbols> > <Type>SourceID</Type> > <Value>800</Value> > </Symbols> > </Space> > </Spaces> > </Property> > </Properties> > <DealType>new</DealType> > <LastModified>2017-12-20T22:36:21Z</LastModified> > <CreatedDate>2017-06-06T15:09:35Z</CreatedDate> > <Stage>dead_deal</Stage> > <Probability>0</Probability> > <CompetitiveSet/> > > <MoveInDate>2017-12-01</MoveInDate> > <RequirementType>Office</RequirementType> > <Tenant> > <Name>Tenant Name</Name> > <Industry>financial services</Industry> > <Affiliation/> > > <TenantID>1013325</TenantID> > <TenantWebsite/> > > <TenantCurrentLocation/> > > <TenantRSFFrom>6000</TenantRSFFrom> > <TenantRSFTo>7000</TenantRSFTo> > </Tenant> > <Brokers></Brokers> > <TenantContacts></TenantContacts> > <DealStages> > <DealStage ID="1533265"> > <Stage>proposal</Stage> > <CreatedDate>2017-06-06T15:09:35Z</CreatedDate> > <LastModified>2017-06-06T15:09:22Z</LastModified> > <StartDate>2017-06-06T15:09:22Z</StartDate> > <EndDate>2017-12-20</EndDate> > <DurationInDays>197</DurationInDays> > <DeadDealReasons></DeadDealReasons> > <User> > <UserId>17699</UserId> > <FirstName>Jessica</FirstName> > <LastName>Rabbit</LastName> > <Email>jessica.rab...@themovie.com</Email> > <Phone></Phone> > <PhoneExtension></PhoneExtension> > <Title></Title> > <BrokerLicenseNumber></BrokerLicenseNumber> > </User> > </DealStage> > <DealStage ID="2174388"> > <Stage>dead_deal</Stage> > <CreatedDate>2017-12-20T22:36:01Z</CreatedDate> > <LastModified>2017-12-20T22:36:01Z</LastModified> > <StartDate>2017-12-20T22:36:01Z</StartDate> > <EndDate>2017-12-20</EndDate> > <DurationInDays>0</DurationInDays> > <DeadDealReasons></DeadDealReasons> > <User> > <UserId>24934</UserId> > <FirstName>Leigh</FirstName> > <LastName>Vaughan</LastName> > <Email>leigh.vaug...@emails.com</Email> > <Phone></Phone> > <PhoneExtension></PhoneExtension> > <Title></Title> > <BrokerLicenseNumber></BrokerLicenseNumber> > </User> > </DealStage> > <DealStage ID="2174390"> > <Stage>dead_deal</Stage> > <CreatedDate>2017-12-20T22:36:21Z</CreatedDate> > <LastModified>2017-12-20T22:36:21Z</LastModified> > <StartDate>2017-12-20T22:36:21Z</StartDate> > <DurationInDays>169</DurationInDays> > <DeadDealReasons> > <DeadDealReason>price</DeadDealReason> > </DeadDealReasons> > <User> > <UserId>24934</UserId> > <FirstName>Leigh</FirstName> > <LastName>Vaughan</LastName> > <Email>leigh.vaug...@emails.com</Email> > <Phone></Phone> > <PhoneExtension></PhoneExtension> > <Title></Title> > <BrokerLicenseNumber></BrokerLicenseNumber> > </User> > </DealStage> > </DealStages> > <DealComments></DealComments> > <DealTerms> > <DealTerm ID="4278580"> > <ProposalType>landlord</ProposalType> > <DiscountRate>0.08</DiscountRate> > <RentableArea>6200</RentableArea> > <LeaseType>triple net</LeaseType> > <LeaseTerm>60</LeaseTerm> > <CommencementDate>2018-01-01</CommencementDate> > <TenantImprovements></TenantImprovements> > <BuildingImprovements></BuildingImprovements> > <FreeRents></FreeRents> > <CreatedDate>2017-06-06T15:16:23Z</CreatedDate> > <SecurityDeposit/> > > <NER>20714.1144670763</NER> > <NEROverride/> > > <DateEntered>2017-06-06</DateEntered> > <OpEx ID=""> > <BaseOpEx/> > > <YearType/> > > <LeaseType/> > > <Description/> > </OpEx> > <RealEstateTaxes ID="33598712"> > <BaseRealEstateTaxes>7.45</BaseRealEstateTaxes> > <YearType/> > > <LeaseType/> > > <Description/> > </RealEstateTaxes> > <NPVperSqFt>103367.15936951</NPVperSqFt> > <TotalNPV>640876388.09096</TotalNPV> > <MiscDescription/> > > <NetCashFlow>642137720.0</NetCashFlow> > <TotalIncome>502200.0</TotalIncome> > <Concessions>64480.0</Concessions> > <Payback>1</Payback> > <IRR/> > > <Latest>true</Latest> > <BaseRents> > <BaseRent ID="45937180"> > <BeginIn>1</BeginIn> > <Rent>15.0</Rent> > <Period>rsf/year</Period> > <Duration>36</Duration> > </BaseRent> > <BaseRent ID="45937181"> > <BeginIn>37</BeginIn> > <Rent>18.0</Rent> > <Period>rsf/year</Period> > <Duration>24</Duration> > </BaseRent> > </BaseRents> > <RentEscalations></RentEscalations> > <OtherCredits></OtherCredits> > <Commissions> > <Commission ID="260593"> > <RepType>landlord</RepType> > <Firm>Company1</Firm> > <Amount>2.9</Amount> > <Unit>rsf</Unit> > </Commission> > <Commission ID="260594"> > <RepType>landlord</RepType> > <Firm>Company2</Firm> > <Amount>7.5</Amount> > <Unit>rsf</Unit> > </Commission> > </Commissions> > <Rights></Rights> > <ProposalID>186607</ProposalID> > <ProposalEnteredDate>2017-06-06</ProposalEnteredDate> > <RecoveryIncomes> > <RecoveryIncome ID="33598711"> > <RecoveryType>ReimbursableExpenseCam</RecoveryType> > <ExpenseAmount>12.87</ExpenseAmount> > <RecoveryMethod>net</RecoveryMethod> > <RecoveryAmount>12.87</RecoveryAmount> > <Description/> > </RecoveryIncome> > <RecoveryIncome ID="33598712"> > <RecoveryType>RealEstateTax</RecoveryType> > <ExpenseAmount>7.45</ExpenseAmount> > <RecoveryMethod>net</RecoveryMethod> > <RecoveryAmount>7.45</RecoveryAmount> > <Description/> > </RecoveryIncome> > </RecoveryIncomes> > </DealTerm> > </DealTerms> > <Budgets></Budgets> > <Appraisals></Appraisals> > <Tours></Tours> > </Deal> > </Deals> > </DealExport> -- Alan G Author of the Learn to Program web site http://www.alan-g.me.uk/ http://www.amazon.com/author/alan_gauld Follow my photo-blog on Flickr at: http://www.flickr.com/photos/alangauldphotos _______________________________________________ Tutor maillist - Tutor@python.org To unsubscribe or change subscription options: https://mail.python.org/mailman/listinfo/tutor