Easy way to get a list of tuples.
Hi

I have been toying with JSON, and there is one particular area where I cannot get the desired result: a list of tuples as my return value. The JSON from the API is far too long to post, but I don't think that will matter.

    # ... after hitting the URL
    data = r.json()

    for item in data["RaceDay"]['Meetings'][0]['Races']:
        raceDetails = item['RacingFormGuide']['Event']['Race']
        print(raceDetails)

This returns

    {'Number': 1, 'NumberDisplay': '01', 'Distance': 1000, 'DistanceDisplay': '1000 METRES', 'Name': 'CLASS 3 HANDICAP', 'NameForm': 'HIGHWAY-C3'}
    {'Number': 2, 'NumberDisplay': '02', 'Distance': 1600, 'DistanceDisplay': '1600 METRES', 'Name': 'BM 90 HANDICAP', 'NameForm': 'BM90'}
    {'Number': 3, 'NumberDisplay': '03', 'Distance': 1100, 'DistanceDisplay': '1100 METRES', 'Name': 'HERITAGE STAKES', 'NameForm': 'HERITAGE'}
    {'Number': 4, 'NumberDisplay': '04', 'Distance': 1400, 'DistanceDisplay': '1400 METRES', 'Name': 'BILL RITCHIE HANDICAP', 'NameForm': 'RITCHIE'}
    {'Number': 5, 'NumberDisplay': '05', 'Distance': 1400, 'DistanceDisplay': '1400 METRES', 'Name': 'TEA ROSE STAKES', 'NameForm': 'TEA ROSE'}
    {'Number': 6, 'NumberDisplay': '06', 'Distance': 1600, 'DistanceDisplay': '1600 METRES', 'Name': 'GEORGE MAIN STAKES', 'NameForm': 'GEO MAIN'}
    {'Number': 7, 'NumberDisplay': '07', 'Distance': 1100, 'DistanceDisplay': '1100 METRES', 'Name': 'THE SHORTS', 'NameForm': 'THE SHORTS'}
    {'Number': 8, 'NumberDisplay': '08', 'Distance': 2000, 'DistanceDisplay': '2000 METRES', 'Name': 'KINGTON TOWN STAKES', 'NameForm': 'KING TOWN'}
    {'Number': 9, 'NumberDisplay': '09', 'Distance': 1200, 'DistanceDisplay': '1200 METRES', 'Name': 'BM 84 HANDICAP', 'NameForm': 'BM84'}

My goal is to select a few elements and create a list of 3-element tuples like this

    [('CLASS 3 HANDICAP', 1, 1000), ('BM 90 HANDICAP', 2, 1600), ('HERITAGE STAKES', 3, 1100), ('BILL RITCHIE HANDICAP', 4, 1400), ('TEA ROSE STAKES', 5, 1400), ('GEORGE MAIN STAKES', 6, 1600), ('THE SHORTS', 7, 1100), ('KINGTON TOWN STAKES', 8, 2000), ('BM 84 HANDICAP', 9, 1200)]

I get close, creating a flat list of elements, but every attempt to create the list of tuples fails. This is my closest code

    data = r.json()

    raceData = []

    for item in data["RaceDay"]['Meetings'][0]['Races']:
        raceDetails = item['RacingFormGuide']['Event']['Race']
        raceData += (raceDetails['Name'],raceDetails['Number'],raceDetails['Distance'])

    print(raceDetails)

which returns

    ['CLASS 3 HANDICAP', 1, 1000, 'BM 90 HANDICAP', 2, 1600, 'HERITAGE STAKES', 3, 1100, 'BILL RITCHIE HANDICAP', 4, 1400, 'TEA ROSE STAKES', 5, 1400, 'GEORGE MAIN STAKES', 6, 1600, 'THE SHORTS', 7, 1100, 'KINGTON TOWN STAKES', 8, 2000, 'BM 84 HANDICAP', 9, 1200]

How do I get the tuples?

Cheers

Sayth
-- https://mail.python.org/mailman/listinfo/python-list
Re: Easy way to get a list of tuples.
On Thursday, 21 September 2017 20:31:28 UTC+10, Thomas Jollans wrote:
> On 2017-09-21 12:18, Sayth Renshaw wrote:
> > This is my closest code
> >
> > data = r.json()
> >
> > raceData = []
> >
> > for item in data["RaceDay"]['Meetings'][0]['Races']:
> >     raceDetails = item['RacingFormGuide']['Event']['Race']
> >     raceData += (raceDetails['Name'],raceDetails['Number'],raceDetails['Distance'])
> >
> > print(raceDetails)
>
> You're close!
>
> The operator += extends a list with the items of another sequence (or
> iterable). What you're looking for is the method .append(), which adds a
> single element.
>
> Observe:
>
> Python 3.6.0 |Continuum Analytics, Inc.| (default, Dec 23 2016, 12:22:00)
> [GCC 4.4.7 20120313 (Red Hat 4.4.7-1)] on linux
> Type "help", "copyright", "credits" or "license" for more information.
> py> a_list = []
> py> a_list += 1,2,3
> py> a_list
> [1, 2, 3]
> py> a_list.append(4)
> py> a_list
> [1, 2, 3, 4]
> py> a_list += 4
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> TypeError: 'int' object is not iterable
> py> a_list.append((5,6,7))
> py> a_list
> [1, 2, 3, 4, (5, 6, 7)]
> py>
>
> --
> Thomas Jollans

Thanks Thomas, yes, you are right about append. I have tried it but just can't get it to work yet, as append takes only 1 argument and I wish to give it 3.

I am really having trouble creating the groups of 3, since I am getting one continuous stream.

Cheers

Sayth
Re: Easy way to get a list of tuples.
> > Thanks Thomas yes you are right with append. I have tried it but just
> > can't get it yet as append takes only 1 argument and I wish to give it 3.
>
> You have not shown us what you tried, but you are probably missing a pair
> of brackets.
>
> C:\Users\User>python
> Python 3.6.0 (v3.6.0:41df79263a11, Dec 23 2016, 08:06:12) [MSC v.1900 64 bit (AMD64)] on win32
> Type "help", "copyright", "credits" or "license" for more information.
> >>> x = []
> >>> x.append(('a', 'b', 'c'))
> >>> x.append(('p', 'q', 'r'))
> >>> x
> [('a', 'b', 'c'), ('p', 'q', 'r')]
> >>>
>
> Does this help?
>
> Frank Millman

Oh yes, I just had one set of brackets with my append.

Thanks Frank
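A minimal sketch of the fix discussed in this thread, using a trimmed, hypothetical stand-in (`races`) for the parsed API response rather than the real JSON. It contrasts `append` with a tuple against `+=`, and shows the equivalent list comprehension.

```python
# Hypothetical stand-in for the 'Race' dicts in the API response;
# only the three fields used in the thread are kept.
races = [
    {"Number": 1, "Distance": 1000, "Name": "CLASS 3 HANDICAP"},
    {"Number": 2, "Distance": 1600, "Name": "BM 90 HANDICAP"},
    {"Number": 3, "Distance": 1100, "Name": "HERITAGE STAKES"},
]

# append() takes exactly one argument, so the three values are wrapped
# in their own pair of brackets to form a single tuple.
race_data = []
for race in races:
    race_data.append((race["Name"], race["Number"], race["Distance"]))

# The same result written as a list comprehension.
race_data_lc = [(r["Name"], r["Number"], r["Distance"]) for r in races]

# For contrast: += extends the list with the tuple's items,
# which produces the flat list seen in the original question.
flat = []
for race in races:
    flat += (race["Name"], race["Number"], race["Distance"])

print(race_data)
print(flat)
```

The comprehension is usually the shorter choice when the list is built in one pass; the explicit loop is easier to extend if filtering or error handling is needed later.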
None is None but not working
Hi

I have a working script that rotates through dates and downloads JSON data from the URL.

As the API returns status 200 whether or not the request was successful, I want to check whether the file returned is unsuccessful.

When a file doesn't exist the API returns

    {'RaceDay': None, 'ErrorInfo': {'SystemId': 200, 'ErrorNo': 55013, 'DisplayMessage': 'File Not Found.', 'ContactSupport': False, 'SupportErrorReference': '200-55013'}, 'Success': False}

When I call data = r.json(), it says its type is None if the request is not successful, so I thought it easier to check that.

However, checking for None does not work; the flow in my if/else falls straight through to else.

    for dates in fullUrl:
        r = requests.get(dates)
        data = r.json()
        if data is None:
            print("Nothing here")
        else:
            print(data["RaceDay"])

and I get output of

    None
    None
    {'MeetingDate': '2017-01- ... and so on.

How can I actually get this to check?

If I use type(data) I also get None.

Cheers

Sayth
Re: None is None but not working
Thank you, it was data["RaceDay"] that was needed.

    data = r.json()
    if data["RaceDay"] is None:
        print("Nothing here")
    else:
        print(data["RaceDay"])

    Nothing here
    Nothing here
    Nothing here
    {'MeetingDate': '2017-01-11T00:00:00', .

Thanks

Sayth
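The point of this thread is that `data` itself is a dict (never `None`); it is the value under `"RaceDay"` that is `None` on failure. A small sketch of that check follows. The helper name `race_day_or_none` is hypothetical, and the shape of a successful response (`'Success': True`) is an assumption; only the failure response is shown in the original post.

```python
def race_day_or_none(data):
    """Return the 'RaceDay' payload, or None if the API reported a failure.

    Hypothetical helper; assumes the response is a dict with 'RaceDay'
    and 'Success' keys, as in the failure example from the thread.
    """
    if not isinstance(data, dict):
        return None
    # Treat an explicit Success == False as a failure even if RaceDay is set.
    if data.get("Success") is False:
        return None
    return data.get("RaceDay")


# Failure response copied (abridged) from the original post.
missing = {
    "RaceDay": None,
    "ErrorInfo": {"SystemId": 200, "ErrorNo": 55013, "DisplayMessage": "File Not Found."},
    "Success": False,
}
# Assumed shape of a successful response.
found = {"RaceDay": {"MeetingDate": "2017-01-11T00:00:00"}, "ErrorInfo": None, "Success": True}

for data in (missing, found):
    race_day = race_day_or_none(data)
    if race_day is None:
        print("Nothing here")
    else:
        print(race_day)
```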
Suggestions on storing, caching, querying json
Hi

I am looking for suggestions around JSON libraries for Python, specifically a long-term solution to store and query JSON documents across many files.

I will be accessing an API and downloading approximately 20 JSON files a week. Having downloaded this year's data, I have over 200 files already, so it will grow at a reasonable rate.

What I have initially done is store them in a MongoDB database. Now I am wondering if this is useful or prudent since, other than querying the JSON, I won't have much use for other Mongo features.

When querying the JSON files, though, queries will use multiple JSON files at once, not just retrieve a single record. The usage is for data analysis.

Is there a good JSON storage option, with caching, optimal querying, etc.?

Regarding querying, I did find a library for JSON searching called ObjectPath, written in Python: http://objectpath.org/reference.html

Looking to leverage your experience.

Cheers

Sayth
Re: Suggestions on storing, caching, querying json
On Thursday, 5 October 2017 15:13:43 UTC+11, Sayth Renshaw wrote:
> HI
>
> Looking for suggestions around json libraries. with Python. I am looking for
> suggestions around a long term solution to store and query json documents
> across many files.
>
> I will be accessing an api and downloading approx 20 json files from an api a
> week. Having downloaded this year I have over 200 files already. So it will
> grow at a reasonable rate.
>
> What I have initially done is store them into a mongo db. Now I am wondering
> if this is useful or prudent since other than querying the json I wont have
> much use of other mongo features.
>
> When querying the json files though queries will utilise multiple json files
> at once, not just retrieving a single record. The usage is for data analysis.
>
> Is there a good json storage option, with caching and optimal querying etc.
>
> Regarding querying I did find a library for json searching called ObjectPath
> written in Python http://objectpath.org/reference.html
>
> Looking to leverage your experience.
>
> Cheers
>
> Sayth

There is a new extension for Redis, ReJSON, which together with redis-py allows using Redis and Python as a JSON store: http://rejson.io/ and https://github.com/andymccurdy/redis-py.

I am not sure this has much more upside than Mongo, other than having a more familiar query language in the style of JsonPath: http://rejson.io/path/

Cheers

Sayth
pathlib PurePosixPath
Hi

How do I create a valid file name and directory with pathlib?

When I create it using PurePosixPath I end up with an OSError due to an obviously invalid path being created.

    import pathlib

    for dates in fullUrl:
        # print(dates)
        time.sleep(0.3)
        r = requests.get(dates)
        data = r.json()
        if data["RaceDay"] is not None:
            file_name = data["RaceDay"]["Meetings"][0]["VenueName"] + data["RaceDay"]["MeetingDate"] + '.json'
            result_path = pathlib.PurePosixPath(r'C:\Users\Sayth\Projects\results', file_name)
            with open(result_path, 'a') as f:
                f.write(data)

##Output

    C:\Users\Sayth\Anaconda3\envs\json\python.exe C:/Users/Sayth/PycharmProjects/ubet_api_mongo/json_download.py
    Traceback (most recent call last):
      File "C:/Users/Sayth/PycharmProjects/ubet_api_mongo/json_download.py", line 40, in <module>
        with open(result_path, 'a') as f:
    OSError: [Errno 22] Invalid argument: 'C:\\Users\\Sayth\\Projects\\results/Warwick Farm2017-09-06T00:00:00.json'

    Process finished with exit code 1

I am not sure exactly how to fix it.

Cheers

Sayth
Re: pathlib PurePosixPath
> > Hi
> >
> > How do I create a valid file name and directory with pathlib?
> >
> > When I create it using PurePosixPath I end up with an OSError due to an
> > obviously invalid path being created.
>
> You're on Windows. The rules for POSIX paths don't apply to your file
> system, and...
>
> > OSError: [Errno 22] Invalid argument:
> > 'C:\\Users\\Sayth\\Projects\\results/Warwick Farm2017-09-06T00:00:00.json'
>
> ... the colon is invalid on Windows file systems. You'll have to
> replace those with something else.
>
> ChrisA

Thanks. I updated the script. But shouldn't it create the file if it doesn't exist? None of them will exist yet.

    for dates in fullUrl:
        # print(dates)
        time.sleep(0.3)
        r = requests.get(dates)
        data = r.json()
        if data["RaceDay"] is not None:
            a = data["RaceDay"]["MeetingDate"]
            b = a[:7]
            file_name = data["RaceDay"]["Meetings"][0]["VenueName"] + '_' + b + '.json'
            result_path = pathlib.PurePath(r'C:\Users\Sayth\Projects\results', file_name)
            with open(result_path, 'a') as f:
                f.write(data)

##Output

      File "C:/Users/Sayth/PycharmProjects/ubet_api_mongo/json_download.py", line 42, in <module>
        with open(result_path, 'a') as f:
    FileNotFoundError: [Errno 2] No such file or directory: 'C:\\Users\\Sayth\\Projects\\results\\Warwick Farm_2017-09.json'

    Process finished with exit code 1

Cheers

Sayth
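One likely cause of the FileNotFoundError above is that open() in 'a' mode creates a missing file but not a missing parent directory. A minimal sketch of one way around that follows, under stated assumptions: the helper name `save_meeting` is hypothetical, the date is cut to `YYYY-MM-DD` (the thread used `[:7]`), the file is written with mode `'w'`, and `json.dump` is used because `f.write` expects a string rather than a dict. A temporary directory stands in for the real results folder.

```python
import json
import tempfile
from pathlib import Path


def save_meeting(data, results_dir):
    """Write one API response to <results_dir>/<venue>_<date>.json.

    Hypothetical helper; assumes the response layout shown in the thread.
    """
    race_day = data["RaceDay"]
    venue = race_day["Meetings"][0]["VenueName"]
    # Keep only the date part; this also drops the colons from the time,
    # which are not allowed in Windows file names.
    date = race_day["MeetingDate"][:10]

    results_dir = Path(results_dir)
    # Create the directory (and any missing parents) before opening the file.
    results_dir.mkdir(parents=True, exist_ok=True)

    result_path = results_dir / "{}_{}.json".format(venue, date)
    with open(result_path, "w") as f:
        json.dump(data, f)  # serialise the dict; f.write(data) would raise TypeError
    return result_path


# Trimmed, made-up response with only the fields the helper reads.
sample = {
    "RaceDay": {
        "MeetingDate": "2017-09-06T00:00:00",
        "Meetings": [{"VenueName": "Warwick Farm"}],
    }
}

with tempfile.TemporaryDirectory() as tmp:
    path = save_meeting(sample, Path(tmp) / "results")
    saved_name = path.name
    with open(path) as f:
        round_trip = json.load(f)

print(saved_name)
```

Using a concrete `Path` (rather than `PurePath` or `PurePosixPath`) gives access to `mkdir` and picks the right path flavour for the current platform.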
Looping on a list in json
Hi

I want to get a result from a largish JSON API. One section of the JSON structure returns lists of data, and I want to get each resulting list returned.

This is my code.

    import json
    from pprint import pprint

    with open(r'/home/sayth/Projects/results/Canterbury_2017-01-20.json', 'rb') as f, open('socks3.json','w') as outfile:
        to_read = json.load(f)

        print(to_read.keys())
        # pprint(to_read)
        meet = to_read["RaceDay"]["Meetings"]
        meeting_id = to_read["RaceDay"]["Meetings"][0]
        pprint(meeting_id.keys())
        # result = meeting_id["Races"][1]["RacingFormGuide"]["Event"]["Runners"]
        result = meeting_id["Races"]
        #failing
        for item in result:
            pprint(["RacingFormGuide"]["Event"]["Runners"])

The key to the issue is that

    result = meeting_id["Races"][0]["RacingFormGuide"]["Event"]["Runners"]
    result = meeting_id["Races"][1]["RacingFormGuide"]["Event"]["Runners"]
    result = meeting_id["Races"][2]["RacingFormGuide"]["Event"]["Runners"]

the index in the lines above could go from 0 to 10.

What is the best way to loop over these and return the data? I would just save meeting_id["Races"] as my result; however, there are a lot of other junk dictionaries and lists that I am filtering out.

Cheers

Sayth
Re: Looping on a list in json
On Sunday, 5 November 2017 09:53:37 UTC+11, Cameron Simpson wrote: > >I want to get a result from a largish json api. One section of the json > >structure returns lists of data. I am wanting to get each resulting list > >returned. > > > >This is my code. > >import json > >from pprint import pprint > > > >with open(r'/home/sayth/Projects/results/Canterbury_2017-01-20.json', 'rb') > >as f, open('socks3.json','w') as outfile: > >to_read = json.load(f) > [...] > >meeting_id = to_read["RaceDay"]["Meetings"][0] > >result = meeting_id["Races"] > >#failing > >for item in result: > >pprint(["RacingFormGuide"]["Event"]["Runners"]) > > I'd just keep the interesting runners, along with their race numbers, in a > dict. The enumerate function is handy here. Something like (untested): > > runner_lists = {} > for n, item in enumerate(result): > if this one is interested/not-filtered: > runner_lists[n] = result["RacingFormGuide"]["Event"]["Runners"] > > and just return runner_lists. That way you know what the race numbers were > for > each list of runners. > > >What is the best way to and return the data? > > The basic idea is to make a small data structure of your own (just the > dictionary runner_lists in the example above) and fill it in with the > infomation you care about in a convenient and useful shape. Then just return > the data structure. > > The actual data structure will depend on what you need to do with this later. > > Cheers, Thank you. That does seem a good approach. I was intending to merge other dictionary data from other dicts within the json structure and that's where the trouble starts i guess trying to get too much from json. Thanks Sayth -- https://mail.python.org/mailman/listinfo/python-list
Re: Read Firefox sqlite files with Python
On Sunday, 5 November 2017 04:32:26 UTC+11, Steve D'Aprano wrote: > I'm trying to dump a Firefox IndexDB sqlite file to text using Python 3.5. > > > import sqlite3 > con = sqlite3.connect('foo.sqlite') > with open('dump.sql', 'w') as f: > for line in con.iterdump(): > f.write(line + '\n') > > > The error I get is: > > Traceback (most recent call last): > File "", line 2, in > File "/usr/local/lib/python3.5/sqlite3/dump.py", line 30, in _iterdump > schema_res = cu.execute(q) > sqlite3.DatabaseError: file is encrypted or is not a database > > > If I open the file in a hex editor, it starts with: > > SQLite format 3 > > and although I can see a few human readable words, the bulk of the file looks > like noise. > > > > > -- > Steve > “Cheer up,” they said, “things could be worse.” So I cheered up, and sure > enough, things got worse. https://stackoverflow.com/a/18601429 Version mismatch between sqlite CLI and python sqlite API? I created again my db from the script instead of the CLI. Now insert and select work from the script, but not from the CLI. $sqlite -version returns 2.8.17, while the python version is 2.7.3. Sayth -- https://mail.python.org/mailman/listinfo/python-list
Re: Looping on a list in json
> I'd just keep the interesting runners, along with their race numbers, in a > dict. The enumerate function is handy here. Something like (untested): > > runner_lists = {} > for n, item in enumerate(result): > if this one is interested/not-filtered: > runner_lists[n] = result["RacingFormGuide"]["Event"]["Runners"] > > and just return runner_lists. That way you know what the race numbers were > for > each list of runners. > > > Cheers, > Cameron Simpson The main issue is that enumerate doesn't enumerate on the lists when trying to filter. result = meeting_id["Races"] so yes enumerating this works, showing just printing n and item. runner_lists = {} for n, item in enumerate(result): # if this one is interested / not -filtered: print(n, item) 0 {'FeatureRaceBonusActive': 'Disabled', 'FixedPriceSummary': {'FixedPrices': [{'SportId': 8, 'LeagueId': 102, 'MeetingId': 1218, 'MainEventId': 650350, 'SubEventId': 3601361, 'Status': 'F', 'StatusDescription': 'FINALISED', 'BetTypeName': 'Win', 'EnablePlaceBetting': True}]}, 'RacingFormGuide': {'Copyright': . and so on it goes through the 7 items in this file. but including runner_lists = {} for n, item in enumerate(result): # if this one is interested / not -filtered: print(n, item) runner_lists[n] = result["RacingFormGuide"]["Event"]["Runners"] ## Produces Traceback (most recent call last): dict_keys(['RaceDay', 'ErrorInfo', 'Success']) File "/home/sayth/PycharmProjects/ubet_api_mongo/parse_json.py", line 31, in runner_lists[n] = result["RacingFormGuide"]["Event"]["Runners"] TypeError: list indices must be integers or slices, not str Cheers Sayth -- https://mail.python.org/mailman/listinfo/python-list
Re: Looping on a list in json
Sorry figured it. Needed to use n to iterate when creating. runner_lists = {} for n, item in enumerate(result): # if this one is interested / not -filtered: print(n, item) runner_lists[n] = result[n]["RacingFormGuide"]["Event"]["Runners"] Sayth -- https://mail.python.org/mailman/listinfo/python-list
Re: Looping on a list in json
One drawback I found after playing with this is that the enumerate value ends up in the output, which is a dictionary. The enumerate value has no key name, which makes it invalid JSON if dumped.

It is not a massive issue, but getting the effect of enumerate without polluting the output would be the winner.

>runner_lists = {}
>for n, item in enumerate(result):
>    # if this one is interested / not -filtered:
>    print(n, item)
>    runner_lists[n] = result[n]["RacingFormGuide"]["Event"]["Runners"]

Sayth
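A short sketch related to the loop above, using a trimmed, made-up `result` list (the runner field `"Name"` is hypothetical). Two things it illustrates: inside the loop, `item` is already `result[n]`, so no second lookup is needed; and if the index is not wanted in the output, a plain list of runner lists avoids it. As a side note, the standard `json` module converts integer dict keys to strings when dumping, so the integer-keyed dict still serialises.

```python
import json

# Made-up stand-in for meeting_id["Races"]; real entries carry many more keys.
result = [
    {"RacingFormGuide": {"Event": {"Runners": [{"Name": "A"}, {"Name": "B"}]}}},
    {"RacingFormGuide": {"Event": {"Runners": [{"Name": "C"}]}}},
]

# enumerate yields (index, element) pairs, so item is the same object as result[n].
runner_lists = {}
for n, item in enumerate(result):
    runner_lists[n] = item["RacingFormGuide"]["Event"]["Runners"]

# If the race index is not needed as a key, keep a list instead of a dict.
runners_only = [item["RacingFormGuide"]["Event"]["Runners"] for item in result]

# json.dumps turns the integer keys into strings ("0", "1", ...).
dumped = json.dumps(runner_lists)
print(dumped)
print(json.dumps(runners_only))
```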
generator function - Called and accepts XML attributes- Python 3.5
Hi I want to create a generator function that supplies my calling function a file, how though do I get the generator to accept attributes in the argument when called? This works not as a generator for filename in sorted(file_list): with open(dir_path + filename) as fd: doc = xmltodict.parse(fd.read()) for item in doc['meeting']['race']: for detail in item['nomination']: print(item['@id'] + "\t" + detail['@id'] + "\t" + detail['@number'] + "\t" + detail['@horse']) And what I want to do is simplify for detail in item['nomination']: print(item['@id'] + "\t" + detail['@id'] + "\t" + detail['@number'] + "\t" + detail['@horse']) As I will have several implementations to do and would like to be able to neatly handoff to sqlalchemy. I have it as something like this but am not quite creating it correctly. def return_files(file_list): """ Take a list of files and return file when called Calling function to supply attributes """ for filename in sorted(file_list, *attribs): with open(dir_path + filename) as fd: doc = xmltodict.parse(fd.read()) for item in doc([attribs]): yield item Thanks Sayth -- https://mail.python.org/mailman/listinfo/python-list
Re: generator function - Called and accepts XML attributes- Python 3.5
> The way I'm reading your code, it's not the generator that's the
> difference here. Consider these lines:
>
> > for item in doc['meeting']['race']:
> >
> > def return_files(file_list):
> >     for filename in sorted(file_list, *attribs):
> >         for item in doc([attribs]):
>
> Firstly, did you mean for *attribs to be part of the signature of
> return_files? As it is, it's being given to sorted(), which won't work
> (the other args to sorted are keyword-only).
>
> Assuming that to be the case, what you're trying to do is subscript an
> object with a variable set of attributes:
>
> return_files(files, "meeting", "race")
>
> That can best be done with a little loop:
>
> def return_files(file_list, *attribs):
>     ...
>     for attr in attribs:
>         doc = doc[attr]
>     for item in doc:
>
> If that's not what you want, can you further clarify the question?
>
> ChrisA

Thanks ChrisA. What I want to do is create a generator for my file input that only pulls the next file once the last one has been processed.

I would then have separate functions to process the files as required, as each will have different attribute combinations and processing.

My thinking is based on having been reading about SQLAlchemy: I will be creating the models to represent the tables and the relationships.

So if I had 3 tables (one for name and people attributes, another for location (meeting) details, and a central table for event details), then when the generator pulls a file I can parse it with each function. Each function would update the model and then call back to the generator for another file.

My concern is that, as some of the processing for each table will be longish, it is better to keep that separate and just call the functions from a main function.

Maybe the generator should just stop at parsing the file at the root XML level so that each calling function can then hook in from its own node.

Is that clear or a massive brain dump?

Sayth
Re: generator function - Called and accepts XML attributes- Python 3.5
This seems to work as a starter. def return_files(file_list): """ Take a list of files and return file when called Calling function to supply attributes """ for filename in sorted(file_list): with open(dir_path + filename) as fd: doc = xmltodict.parse(fd.read()) for item in doc['meeting']['race']: yield item my_generator = return_files(file_list) def gets_id(): for value in my_generator: for asset in value['nomination']: print(asset['@id']) gets_id() Cheers Sayth -- https://mail.python.org/mailman/listinfo/python-list
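A sketch of the `*attribs` idea suggested earlier in the thread, written against already-parsed documents so that it runs without files or xmltodict. The function name `iter_items` and the sample `docs` are hypothetical; in the real script each document would come from `xmltodict.parse(fd.read())`.

```python
def iter_items(docs, *attribs):
    """Lazily yield the items found at doc[attribs[0]][attribs[1]]... for each doc.

    Hypothetical helper: `docs` is any iterable of parsed documents
    (nested dicts), and `attribs` is the key path supplied by the caller.
    """
    for doc in docs:
        node = doc
        # Walk down the key path one level at a time.
        for attr in attribs:
            node = node[attr]
        # The next document is only touched once these items are consumed.
        for item in node:
            yield item


# Made-up documents shaped like the xmltodict output used in the thread.
docs = [
    {"meeting": {"race": [{"@id": "1", "nomination": [{"@id": "10"}, {"@id": "11"}]}]}},
    {"meeting": {"race": [{"@id": "2", "nomination": [{"@id": "20"}]}]}},
]

# Each caller chooses its own path; here it is meeting -> race.
nomination_ids = [
    nom["@id"]
    for race in iter_items(docs, "meeting", "race")
    for nom in race["nomination"]
]
print(nomination_ids)
```

Creating the generator inside each consuming function (rather than sharing one module-level generator) means every consumer gets a fresh pass over the files.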
strings and ints consistency - isinstance
Hi

I am trying to clarify why ints and strings aren't treated the same.

You can get a ValueError from trying to convert a non-int to an int, as in int("3.0"); however, you cannot get one from converting a non-string with str(a).

This means that you should likely use try and except, catching ValueError, to test whether a user enters a non-int.

However, as you can't call str() and get a ValueError, you use conditional logic with strings.

Therefore, to try to keep with Python's "only one obvious way of doing things", should I prefer conditional logic for all cases, using isinstance?

That way, regardless of input type, my code flows the same and more explicitly states the intended type.

Sayth
Re: strings and ints consistency - isinstance
To answer all the good replies.

I adapted a simple vector example from the Springer book, a practical primer on science with Python.

Having solved the actual problem, I thought that checking with the user that they had the correct entries, or would like to amend them, would be a good addition. This leads to a Y or N response question, which isn't hard, but if someone accidentally types 4, then you get to where I got stuck: you can't test for a ValueError on an int if you expect a string.

This was my code

    import sys

    v0 = float(input("What velocity would you like? "))
    g = float(input("What gravity would you like? "))
    t = float(input("What time decimal would you like? "))

    print("""
    We have the following inputs.
    v0 is %d
    g is %d
    t is %d
    Is this correct? [Y/n]
    """ % (v0, g, t))
    while True:
        try:
            answer = input("\t >> ").isalpha()
            print(v0 * t - 0.5 * g * t ** 2)
        except ValueError as err:
            print("Not a valid entry", err.args)
            sys.exit()
        finally:
            print("would you like another?")
            break

___

When I look at this SO question, the votes are split: half choose try/except, the other half conditional logic. Neither is wrong, but which is the more obvious Python way? https://stackoverflow.com/questions/2020598/in-python-how-should-i-test-if-a-variable-is-none-true-or-false

___

I actually thought this would have resolved my issues and still returned an error if ints were entered; however, it still passes through.

    answer = input("\t >> ")
    if isinstance(answer, str) is True:
        print(v0 * t - 0.5 * g * t ** 2)
    elif int(answer) is True:
        raise ValueError("Ints aren't valid input")
        sys.exit()
    else:
        print("Ok please amend your entries")

Thanks

Sayth
Re: strings and ints consistency - isinstance
This ends up being the code I can use to get it to work. It seems clear and Pythonic; open to opinion on that :-)

    answer = input("\t >> ")
    if isinstance(int(answer), int) is True:
        raise ValueError("Ints aren't valid input")
        sys.exit()
    elif isinstance(answer, str) is True:
        print(v0 * t - 0.5 * g * t ** 2)
    else:
        print("Ok please amend your entries")

Cheers

Sayth
Re: Data Types
> >> Are there any other data types that will give you type(A) or type(B) = <class 'bool'>
> >> besides True and False?
> >
> > No types but any variable or expression containing True or False will be
> > a bool type (or class bool):
>
> "Containing" True or False? Certainly not:
>
> py> type( [1, 2, True] )
> <class 'list'>
> py> type( False or 99 )
> <class 'int'>
>
> Based on your examples, I think you mean any expression *evaluating to* True
> or False will be a bool. Um, yeah, of course it will. Because it evaluates
> to one of the only two bool values, True or False.
>
> > A = 10<20
> > print (type(A)) => <class 'bool'>
>
> That's because the value of A is True.
>
> > print (10<20)=> True
> > print (type(10<20)) => <class 'bool'>
>
> 10<20 shouldn't be thought of as some alternative value which is a bool, any
> more than we should think of 1+1 as being a different value to 2.

What about 0 and 1? They compare equal to False and True like no other numbers do. What category do they fall into with regard to booleans?

In [6]: 0 == False
Out[6]: True

In [7]: 1 == True
Out[7]: True

In [8]: 2 == True
Out[8]: False

In [9]: 3 == True
Out[9]: False

In [10]: 3 == False
Out[10]: False

Sayth
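A short sketch of why 0 and 1 behave this way: in Python, `bool` is a subclass of `int`, with `False` equal to 0 and `True` equal to 1. Other numbers are truthy or falsy when tested with `bool()` or in an `if`, but they do not compare equal to `True` or `False`. The dict name `facts` is just for this illustration.

```python
# Collect a few checks so they can be printed together.
facts = {
    "issubclass(bool, int)": issubclass(bool, int),
    "True == 1": True == 1,
    "False == 0": False == 0,
    "2 == True": 2 == True,          # equality compares values: 2 != 1
    "bool(2)": bool(2),              # truthiness: any non-zero number is truthy
    "True + True": True + True,      # bools take part in integer arithmetic
    "type(10 < 20)": type(10 < 20).__name__,
    "type(False or 99)": type(False or 99).__name__,  # 'or' returns an operand
}

for expression, value in facts.items():
    print(expression, "->", value)
```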
Re: strings and ints consistency - isinstance
> > This ends being the code I can use to get it to work, seems clear and > > pythonic, open to opinion on that :-) > > Neither clear, nor Pythonic. Sadly imo clearer than the many hack attempts on SO. > > > answer = input("\t >> ") > > Since input() returns a string in Python 3, this will always be a string. > > > > if isinstance(int(answer), int) is True: > > int(answer) will either succeed, or it will fail. If it fails, it will raise > ValueError, and your code will fail with an exception. > > If it succeeds, then it will return an int. Testing whether int(answer) > returns an int is a waste of time -- it *always* returns an int, if it > returns at all. So if it returns, it will return an int, isinstance() will > always return True, and True is True. So your code will then > > > raise ValueError("Ints aren't valid input") > > Which means your code will ALWAYS raise ValueError: > > if answer is a numeric string, like "123", then int() will succeed, the if > block will run, and ValueError is raised; > > but if answer is NOT a numeric string, like "abc", then int() will raise > ValueError. > > So we can replace your entire block of code with a single line: > > raise ValueError > > since that is the only result possible. The rest of your code is dead > code -- it cannot be executed. > > But if it could... > > > > sys.exit() > > It seems a bit harsh to exit the application just because the user types the > wrong value. Shouldn't you try again, let them type another string? > > > > elif isinstance(answer, str) is True: > > print(v0 * t - 0.5 * g * t ** 2) > > Since input() returns a string, answer is always a string, and isinstance() > will always return True. So True is True will always evaluate to True, and > the print statement with the mysterious formula will always print. > > > > else: > > print("Ok please ammend your entries") > True it failed, just actually happy to get it to fail or pass successfully on int input. 
I just felt it was a clearer and more consistent approach to verifying input than most of the varied and rather inconsistent approaches I have seen while trying to get this to work. Half opt for try/except, the other half for if/else, and then they implement them quite differently. There are many varied approaches: str2bool(), isalpha(), using a list with isinstance(var, [int, str, bool]), etc.

Anyway, back to the old drawing board.

Cheers

Sayth
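One possible shape for the input checking discussed in this thread, offered as a sketch rather than the one right way. Since input() always returns a string, each parser below inspects or converts that string and returns None when it is not acceptable, and a small loop asks again instead of exiting. All names (`parse_yes_no`, `parse_float`, `ask`) are hypothetical; the `read` parameter exists only so the demo can run with scripted replies instead of a keyboard.

```python
def parse_yes_no(text, default=True):
    """Return True for yes, False for no, `default` for empty input, else None."""
    answer = text.strip().lower()
    if answer == "":
        return default  # matches the [Y/n] convention: Enter means yes
    if answer in ("y", "yes"):
        return True
    if answer in ("n", "no"):
        return False
    return None  # anything else, such as "4", is rejected


def parse_float(text):
    """Return the float value of text, or None if it is not a number."""
    try:
        return float(text)
    except ValueError:
        return None


def ask(prompt, parser, read=input):
    """Keep asking until the parser accepts the reply, then return its value."""
    while True:
        value = parser(read(prompt))
        if value is not None:
            return value
        print("Not a valid entry, please try again.")


# Scripted replies stand in for a user who first types 4, then "maybe", then y.
replies = iter(["4", "maybe", "y"])
confirmed = ask("\t >> ", parse_yes_no, read=lambda prompt: next(replies))
print(confirmed)
```

Here try/except is used where a conversion can fail (`float`), and plain comparisons are used where the set of valid strings is known in advance, which is one way to reconcile the two styles mentioned above.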
how to append to list in list comprehension
I have a list of lists of numbers like this excerpt.

    ['0', '0', '0', '0']
    ['0', '0', '0', '0']
    ['0', '0', '0', '0']
    ['0', '0', '0', '0']
    ['0', '0', '0', '0']
    ['0', '0', '0', '0']
    ['7', '2', '1', '0', '142647', '00']
    ['7', '2', '0', '1', '87080', '00']
    ['6', '1', '1', '1', '51700', '00']
    ['4', '1', '1', '0', '36396', '00']

I want to go through and, for each list that would raise an IndexError at [4], append a 0.

I have called the lists fups.

    p = re.compile('\d+')
    fups = p.findall(nomattr['firstup'])
    [x[4] for x in fups if IndexError fups.append(0)]
    print(fups)

I am unsure why I cannot use append in this instance. How can I modify this to achieve the desired output?

Desired Output

    ['0', '0', '0', '0', '0']
    ['0', '0', '0', '0', '0']
    ['0', '0', '0', '0', '0']
    ['0', '0', '0', '0', '0']
    ['0', '0', '0', '0', '0']
    ['0', '0', '0', '0', '0']
    ['7', '2', '1', '0', '142647', '00']
    ['7', '2', '0', '1', '87080', '00']
    ['6', '1', '1', '1', '51700', '00']
    ['4', '1', '1', '0', '36396', '00']

Thanks

Sayth
Re: how to append to list in list comprehension
> > I want to go threw and for each index error at [4] append a 0.
>
> You want to append 0 if the list does not have at least 5 items?
>
> > p = re.compile('\d+')
> > fups = p.findall(nomattr['firstup'])
> > [x[4] for x in fups if IndexError fups.append(0)]
> > print(fups)
>
> > Unsure why I cannot use append in this instance
>
> Because that's incorrect syntax.
>
> > how can I modify to acheive desired output?
>
> for f in fups:
>     if len(f) < 5:
>         f.append(0)
>
> Or, if you really want to use a list comprehension:
>
> [f.append(0) for f in fups if len(f) < 5]
>
> However there's no reason to use a list comprehension here. The whole
> point of list comprehensions is to create a *new* list, which you don't
> appear to need; you just need to modify the existing fups list.
>
> --
> John Gordon
> A is for Amy, who fell down the stairs
> B is for Basil, assaulted by bears
> -- Edward Gorey, "The Gashlycrumb Tinies"

You are right, John, in that I don't want a new list; I just wish to modify in place to achieve the desired output.

I had no direct desire to use a list comprehension; it just seemed an option.

Ultimately, once it works, I will abstract it into a function for other lists that will have a similar issue.

    def listClean(fups):
        holder = [(f + ['0'] if len(f) < 5 else f) for f in fups ]
        return holder[0], holder[1], holder[2], holder[3], holder[4]

and then call it in the csv.writer that I have, which currently errors, quite correctly, because it cannot write index [4] when some of my lists lack it.

I do like [(f + ['0'] if len(f) < 5 else f) for f in fups ]. Rustom, if there are better non-list-comprehension options I would like to know, as generally I find them confusing.

Cheers

Sayth
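A sketch of the padding step as a reusable function, assuming (as in the excerpt) that each row is a list of strings. The name `pad_rows` and its `width` and `fill` parameters are hypothetical. It pads with the string `'0'` to match the existing entries, handles rows that are short by more than one item, and returns new rows rather than changing the input; an in-place variant using a plain loop is shown alongside.

```python
def pad_rows(rows, width=5, fill="0"):
    """Return new rows, each right-padded with `fill` to at least `width` items."""
    # [fill] * n is an empty list when n is zero or negative,
    # so rows that are already long enough are copied unchanged.
    return [row + [fill] * (width - len(row)) for row in rows]


def pad_rows_in_place(rows, width=5, fill="0"):
    """Same padding, but modifying the existing lists (no list comprehension)."""
    for row in rows:
        while len(row) < width:
            row.append(fill)


fups = [
    ["0", "0", "0", "0"],
    ["7", "2", "1", "0", "142647", "00"],
]

padded = pad_rows(fups)
print(padded)

# The in-place version is applied to a copy so both results can be compared.
fups_copy = [list(row) for row in fups]
pad_rows_in_place(fups_copy)
print(fups_copy)
```

Returning the whole padded list, rather than five fixed elements, keeps the function usable for inputs with any number of rows.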
Re: how to append to list in list comprehension
On Saturday, 1 October 2016 14:17:06 UTC+10, Rustom Mody wrote: > On Saturday, October 1, 2016 at 9:08:09 AM UTC+5:30, Sayth Renshaw wrote: > > I do like [(f + ['0'] if len(f) < 5 else f) for f in fups ] Rustom, if > > there are better non list comprehension options I would like to know as > > generally I find then confusing. > > Two points here — best taken independently: > 1. List comprehensions are confusing > 2. When to want/not want them > > > For 1 I suggest you (privately) rewrite them with '|' for 'for' and '∈' for > 'in' > Once you do that they will start looking much more like the origin that > inspires > them — set builder notation: > https://en.wikipedia.org/wiki/Set-builder_notation > > From there I suggest you play with replacing '[]' with '{}' ie actually try > out set comprehensions and then others like dict-comprehensions — very nifty > and oft-neglected. And the mother of all — generator comprehensions. > > Of course to check it out in python you will need to invert the translation: > '|' for 'for' and '∈' for 'in' > the point of which is to use python as a kind of math assembly language > *into* which you *code* but not in which you *think* > > For 2 its important that you always keep in front of you whether you want to > approach a problem declaratively (the buzzword FP!) or imperatively. > Python is rather unique in the extent to which it allows both > This also makes it uniquely difficult because its all too easy to garble the > two styles as John's .append inside a LC illustrates. > > And the way to ungarble your head is by asking yourself the meta-question: > Should I be asking "How to solve this (sub)problem?" or more simply > "What is the (sub)problem I wish to solve?" > > How questions naturally lead to imperative answers; whats to declarative > > You may be helped with [plug!] 
my writings on FP: > http://blog.languager.org/search/label/FP > > Particularly the tables in: > http://blog.languager.org/2016/01/primacy.html Thank You Rustom -- https://mail.python.org/mailman/listinfo/python-list
Re: how to append to list in list comprehension
On Saturday, 1 October 2016 14:17:06 UTC+10, Rustom Mody wrote: > On Saturday, October 1, 2016 at 9:08:09 AM UTC+5:30, Sayth Renshaw wrote: > > I do like [(f + ['0'] if len(f) < 5 else f) for f in fups ] Rustom, if > > there are better non list comprehension options I would like to know as > > generally I find then confusing. > > Two points here — best taken independently: > 1. List comprehensions are confusing > 2. When to want/not want them > > > For 1 I suggest you (privately) rewrite them with '|' for 'for' and '∈' for > 'in' > Once you do that they will start looking much more like the origin that > inspires > them — set builder notation: > https://en.wikipedia.org/wiki/Set-builder_notation > > From there I suggest you play with replacing '[]' with '{}' ie actually try > out set comprehensions and then others like dict-comprehensions — very nifty > and oft-neglected. And the mother of all — generator comprehensions. > > Of course to check it out in python you will need to invert the translation: > '|' for 'for' and '∈' for 'in' > the point of which is to use python as a kind of math assembly language > *into* which you *code* but not in which you *think* > > For 2 its important that you always keep in front of you whether you want to > approach a problem declaratively (the buzzword FP!) or imperatively. > Python is rather unique in the extent to which it allows both > This also makes it uniquely difficult because its all too easy to garble the > two styles as John's .append inside a LC illustrates. > > And the way to ungarble your head is by asking yourself the meta-question: > Should I be asking "How to solve this (sub)problem?" or more simply > "What is the (sub)problem I wish to solve?" > > How questions naturally lead to imperative answers; whats to declarative > > You may be helped with [plug!] 
my writings on FP: > http://blog.languager.org/search/label/FP > > Particularly the tables in: > http://blog.languager.org/2016/01/primacy.html

Your insight has helped. It may lack elegance, but I have got it working.

    from lxml import etree
    import csv
    import re


    def clean(attr):
        p = re.compile(r'\d+')
        myList = p.findall(attr)
        if len(myList) < 5:
            myList.append('0')
        return myList[0], myList[1], myList[2], myList[3], myList[4]


    with open("20161001RAND0.xml", 'rb') as f, open(
            "output/310916RABD.csv", 'w', newline='') as csvf:
        tree = etree.parse(f)
        root = tree.getroot()
        race_writer = csv.writer(csvf, delimiter=',')

        for meet in root.iter("meeting"):
            for race in root.iter("race"):
                for nom in root.iter("nomination"):
                    meetattr = meet.attrib
                    raceattr = race.attrib
                    nomattr = nom.attrib
                    if nomattr['number'] != '0':
                        print(clean(nomattr['firstup']))

Cheers

Sayth
--
https://mail.python.org/mailman/listinfo/python-list
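One thing worth checking in the nested loops above: calling root.iter(...) at every level pairs every race in the document with every nomination in the document, not only with its own. A sketch of iterating from the parent element instead follows. It uses the standard library's xml.etree.ElementTree as a stand-in, since lxml elements offer the same iter method, and the sample XML and the function name nomination_rows are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical document; the real file layout is only partly visible in the thread.
SAMPLE = """
<meeting id="m1">
  <race id="r1">
    <nomination number="1" firstup="3:1-0-2"/>
    <nomination number="0" firstup=""/>
  </race>
  <race id="r2">
    <nomination number="4" firstup="5:0-1-1"/>
  </race>
</meeting>
"""


def nomination_rows(root):
    """Collect (meeting id, race id, nomination number) for real nominations."""
    rows = []
    for meet in root.iter("meeting"):
        for race in meet.iter("race"):            # races of this meeting only
            for nom in race.iter("nomination"):   # nominations of this race only
                if nom.get("number") != "0":
                    rows.append((meet.get("id"), race.get("id"), nom.get("number")))
    return rows


rows = nomination_rows(ET.fromstring(SAMPLE))
print(rows)
```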
generator no iter - how do I call it from another function
Evening

I have turned my file list handler into a generator. Before it was a generator I was able to use iter on my lxml root objects; now I cannot.

    $ python3 race.py data/ -e *.xml
    Traceback (most recent call last):
      File "race.py", line 83, in <module>
        dataAttr(rootObs)
      File "race.py", line 61, in dataAttr
        for meet in roots.iter("meeting"):
    AttributeError: 'generator' object has no attribute 'iter'

How do I now pull the next iterations through from the other function?

    from lxml import etree
    import csv
    import re
    import argparse
    import os

    parser = argparse.ArgumentParser()
    parser.add_argument("path", type=str, nargs="+")
    parser.add_argument(
        '-e', '--extension', default='', help='File extension to filter by.')
    # >python race.py XML_examples/ -e .xml

    args = parser.parse_args()
    name_pattern = "*" + args.extension
    my_dir = args.path[0]

    for dir_path, subdir_list, file_list in os.walk(my_dir):
        for name_pattern in file_list:
            full_path = os.path.join(dir_path, name_pattern)


    def return_files(file_list):
        """
        Take a list of files and return file when called.

        Calling function to supply attributes
        """
        for filename in sorted(file_list):
            with open(dir_path + filename) as fd:
                tree = etree.parse(fd)
                root = tree.getroot()
                yield root


    def clean(attr):
        """
        Split list into lists of 5 elements.

        if list is less than 5 in length then a 0
        is appended and the list returned
        """
        p = re.compile(r'\d+')
        myList = p.findall(attr)
        if len(myList) < 5:
            myList.append('0')
        return myList[0], myList[1], myList[2], myList[3], myList[4]


    def dataAttr(roots):
        """Get the root object and iter items."""
        with open("output/first2.csv", 'w', newline='') as csvf:
            race_writer = csv.writer(csvf, delimiter=',')

            for meet in roots.iter("meeting"):
                print(meet)
                for race in roots.iter("race"):
                    for nom in roots.iter("nomination"):
                        meetattr = meet.attrib
                        raceattr = race.attrib
                        nomattr = nom.attrib
                        if nomattr['number'] != '0':
                            firsts = clean(nomattr['firstup'])
                            race_writer.writerow(
                                [meetattr['id'], meetattr['date'],
                                 meetattr['venue'], raceattr['id'],
                                 raceattr['number'], raceattr['distance'],
                                 nomattr['id'], nomattr['barrier'],
                                 nomattr['weight'], nomattr['rating'],
                                 nomattr['description'], nomattr['dob'],
                                 nomattr['age'], nomattr['decimalmargin'],
                                 nomattr['saddlecloth'], nomattr['sex'],
                                 firsts[4]])


    rootObs = return_files(file_list)
    dataAttr(rootObs)

--
https://mail.python.org/mailman/listinfo/python-list
Re: rocket simulation game with just using tkinter
On Saturday, 1 October 2016 08:59:28 UTC+10, Irmen de Jong wrote: > Hi, > > I've made a very simple rocket simulation game, inspired by the recent > success of SpaceX > where they managed to land the Falcon-9 rocket back on a platform. > > I was curious if you can make a simple graphics animation game with just > using Tkinter, > instead of using other game libraries such as PyGame. > As it turns out, that works pretty well and it was quite easy to write. > Granted, there's > not much going on on the screen, but still the game runs very smoothly and I > think it is > fun for a little while where you try to learn to control the rocket and > attempt to > successfully land it on the other launchpad! > > The physics simulation is tied to the game's frame rate boohoo, but the > upside is that > you can change the framerate to control the game's difficulty. It's easy at > <=20, fun at > 30 and impossible at 60 :) It's running on 30 by default. > > > You can get the code here if you want to give it a try: > https://github.com/irmen/rocketsimulator > > So you just need python 2/3 with tkinter to play this! > > > Have fun > Irmen Well done. An interesting listen that might be up your alley, how-i-built-an-entire-game-and-toolchain-100-in-python on talkpython https://talkpython.fm/episodes/show/78/how-i-built-an-entire-game-and-toolchain-100-in-python Sayth -- https://mail.python.org/mailman/listinfo/python-list
Re: generator no iter - how do I call it from another function
My main issue is that usually it's just "for x in ..." for a generator. But if I change the code from

    for meet in roots.iter("meeting"):

to

    for meet in roots("meeting"):

well, it's invalid, but I need to be able to reference the node. How do I achieve this?

Sayth
--
https://mail.python.org/mailman/listinfo/python-list
Re: generator no iter - how do I call it from another function
> Steve
> “Cheer up,” they said, “things could be worse.” So I cheered up, and sure
> enough, things got worse.

Loving life.

I first started with a simple with open on a single file, which allowed me to use code that calls iter. For example:

    for meet in root.iter("meeting"):
        for race in root.iter("race"):
            for nom in root.iter("nomination"):
                meetattr = meet.attrib

I then got correct output, so I wanted to scale up to passing a directory of files. Skipping the code that deals with getting it off the command line, I implemented a generator to yield each file's root object as needed. This is that code:

    def return_files(file_list):
        """
        Take a list of files and return file when called.

        Calling function to supply attributes
        """
        for filename in sorted(file_list):
            with open(dir_path + filename) as fd:
                tree = etree.parse(fd)
                root = tree.getroot()
                yield root

My question is about the function that now receives the roots. Its first few lines are

    def dataAttr(roots):
        """Get the root object and iter items."""
        with open("output/first2.csv", 'w', newline='') as csvf:
            race_writer = csv.writer(csvf, delimiter=',')

            for meet in roots.iter("meeting"):

which I call as

    rootObs = return_files(file_list)
    dataAttr(rootObs)

So if I use a generator to pass the lxml root objects to a function, how do I call iter, given that Python reports that the generator object has no iter attribute? This is the error:

    $ python3 race.py data/ -e *.xml
    Traceback (most recent call last):
      File "race.py", line 77, in <module>
        dataAttr(rootObs)
      File "race.py", line 55, in dataAttr
        for meet in roots.iter("meeting"):
    AttributeError: 'generator' object has no attribute 'iter'

Cheers

Sayth
--
https://mail.python.org/mailman/listinfo/python-list
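A sketch of one way round the AttributeError, assuming the intent is to visit every "meeting" element in every file: the generator itself has no iter method, but each root it yields does, so an outer for loop steps through the generator first. The standard library's xml.etree.ElementTree stands in for lxml here, and roots_from_strings and meeting_ids are hypothetical names.

```python
import xml.etree.ElementTree as ET


def roots_from_strings(documents):
    """Generator standing in for return_files: yields one root per document."""
    for text in documents:
        yield ET.fromstring(text)


def meeting_ids(roots):
    """Loop over the generator first, then call .iter on each yielded root."""
    ids = []
    for root in roots:                      # one root per input document
        for meet in root.iter("meeting"):   # .iter belongs to the element
            ids.append(meet.get("id"))
    return ids


found = meeting_ids(roots_from_strings(['<meeting id="a"/>', '<meeting id="b"/>']))
print(found)
```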
Re: RASTER analysis(slope)
On Sunday, 2 October 2016 04:52:13 UTC+11, Xristos Xristoou wrote:
> hello team
>
> i want to calculate slope and aspect from some RASTER
> IMAGE(.grid,tiff,geotiff)
> who is the better method to can i do this ?
> with numpy and scipy or with some package?
>
> thnx you team

I don't know much about the topic; however, pyDEM seems like the appropriate library to use.
https://pypi.python.org/pypi/pyDEM/0.1.1

Here is a Stanford article on it
https://pangea.stanford.edu/~samuelj/musings/dems-in-python-pt-3-slope-and-hillshades-.html

and a PDF from the SciPy conference.
https://conference.scipy.org/proceedings/scipy2015/pdfs/mattheus_ueckermann.pdf

Sayth
--
https://mail.python.org/mailman/listinfo/python-list
inplace text filter - without writing file
Hi I have a fileobject which was fine however now I want to delete a line from the file object before yielding. def return_files(file_list): for filename in sorted(file_list): with open(dir_path + filename) as fd: for fileItem in fd: yield fileItem Ned gave an answer over here http://stackoverflow.com/a/6985814/461887 for i, line in enumerate(input_file): if i == 0 or not line.startswith('#'): output.write(line) which I would change because it is the first line and I want to rid
Re: inplace text filter - without writing file
Thank you -- https://mail.python.org/mailman/listinfo/python-list
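For the question in this thread about dropping a line before yielding, a minimal sketch, assuming the unwanted line is always the first one in the file. The name lines_without_first and the in-memory sample are made up for illustration; itertools.islice does the skipping without writing anything back to disk.

```python
import io
from itertools import islice


def lines_without_first(fd):
    """Yield every line of an open file object except the first."""
    yield from islice(fd, 1, None)


# io.StringIO stands in for a real file opened with open(...).
sample = io.StringIO("header to drop\nrow 1\nrow 2\n")
kept = list(lines_without_first(sample))
print(kept)
```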
Re: inplace text filter - without writing file
On Sunday, 2 October 2016 12:14:43 UTC+11, MRAB wrote: > On 2016-10-02 01:21, Sayth Renshaw wrote: > > Hi > > > > I have a fileobject which was fine however now I want to delete a line from > > the file object before yielding. > > > > def return_files(file_list): > > for filename in sorted(file_list): > > When joining paths together, it's better to use 'os.path.join'. > > > with open(dir_path + filename) as fd: > > for fileItem in fd: > > yield fileItem > > > > Ned gave an answer over here http://stackoverflow.com/a/6985814/461887 > > > > for i, line in enumerate(input_file): > > if i == 0 or not line.startswith('#'): > > output.write(line) > > > > which I would change because it is the first line and I want to rid
Re: inplace text filter - without writing file
On Sunday, 2 October 2016 16:19:14 UTC+11, Sayth Renshaw wrote: > On Sunday, 2 October 2016 12:14:43 UTC+11, MRAB wrote: > > On 2016-10-02 01:21, Sayth Renshaw wrote: > > > Hi > > > > > > I have a fileobject which was fine however now I want to delete a line > > > from the file object before yielding. > > > > > > def return_files(file_list): > > > for filename in sorted(file_list): > > > > When joining paths together, it's better to use 'os.path.join'. > > > > > with open(dir_path + filename) as fd: > > > for fileItem in fd: > > > yield fileItem > > > > > > Ned gave an answer over here http://stackoverflow.com/a/6985814/461887 > > > > > > for i, line in enumerate(input_file): > > > if i == 0 or not line.startswith('#'): > > > output.write(line) > > > > > > which I would change because it is the first line and I want to rid
Create a map for data to flow through
Is there a standard library feature that lets you define a declarative map or statement describing the data, the objects to be parsed, and the output format?

Just wondering, as for loops are good, but when I end up 3-4 for loops deep and want multiple matches at each level I am finding it harder to manage.

I am reading https://docs.python.org/3/howto/functional.html

Is there a better way than loops on loops on loops? I am thinking that a for loop is quick at the start, but there is probably a more direct way which, while slower, may be clearer over the long run.

Cheers

Sayth
--
https://mail.python.org/mailman/listinfo/python-list
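This is not a standard library feature, only a sketch of one way to get a more declarative feel: keep the nested loops in a single generator and describe the output columns as data. The FIELDS table, the function names and the sample XML are all hypothetical, and xml.etree.ElementTree stands in for lxml.

```python
import xml.etree.ElementTree as ET

# Hypothetical column map: (column name, element, attribute).
FIELDS = [
    ("meet_id", "meeting", "id"),
    ("race_number", "race", "number"),
    ("nom_id", "nomination", "id"),
]


def contexts(root):
    """Yield one dict of elements per nomination; the only nested loops."""
    for meet in root.iter("meeting"):
        for race in meet.iter("race"):
            for nom in race.iter("nomination"):
                yield {"meeting": meet, "race": race, "nomination": nom}


def rows(root, fields=FIELDS):
    """Build one output row per nomination from the column map."""
    return [[ctx[element].get(attribute) for _, element, attribute in fields]
            for ctx in contexts(root)]


SAMPLE = ('<meeting id="m1"><race number="1">'
          '<nomination id="n1"/><nomination id="n2"/></race></meeting>')
table = rows(ET.fromstring(SAMPLE))
print(table)
```

Adding a column then means adding one entry to FIELDS and leaving the loops alone.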
scipy tutorial question
I would ask on scipy mailing list as it may provide a better response. https://www.scipy.org/scipylib/mailing-lists.html Sayth -- https://mail.python.org/mailman/listinfo/python-list
lxml ignore none in getchildren
I am following John Shipmans example from http://infohost.nmt.edu/~shipman/soft/pylxml/web/Element-getchildren.html >>> xml = ''' ... ''' >>> pen = etree.fromstring(xml) >>> penContents = pen.getchildren() >>> for content in penContents: ... print "%-10s %3s" % (content.tag, content.get("n", "0")) ... horse2 cow 17 cowboy 2 >>> If I make one minor modification to the xml and change an n to an m as in my example below the getchildren will return none for none matches, how can I ignore nones? In [2]: from lxml import etree In [3]: xml = ''' ''' In [4]: pen =etree.fromstring(xml) In [5]: pencontents = pen.getchildren() In [6]: for content in pencontents: ...: print(content.get('n')) 2 None 2 Because In [17]: for content in pencontents: : if content is not None: : print(content.get('n')) Sayth -- https://mail.python.org/mailman/listinfo/python-list
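A sketch of two ways to deal with the missing attribute, assuming the aim is either to skip children that lack "n" or to substitute a default. Each child element is a real object, so "content is not None" is always true; it is content.get('n') that returns None. The XML below is a made-up stand-in because the original markup did not survive in the post, and xml.etree.ElementTree is used in place of lxml (the get method takes the same arguments in both).

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the pen example; the middle child uses m instead of n.
xml = '<pen><horse n="2"/><cow m="17"/><cowboy n="2"/></pen>'
pen = ET.fromstring(xml)

# Option 1: keep only the children that actually carry the attribute.
values = [child.get("n") for child in pen if child.get("n") is not None]

# Option 2: keep every child but supply a default, as in the tutorial's get("n", "0").
defaults = [child.get("n", "0") for child in pen]

print(values)
print(defaults)
```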
Generator comprehension - list None
I was solving a problem to create a generator comprehension with 'Got ' and a number for each in range 10.

This I did however I also get a list of None. I don't understand where none comes from. Can you please clarify?

    a = (print("Got {0}".format(num[0])) for num in enumerate(range(10)))

    # for item in a:
    #     print(item)

    b = list(a)
    print(b)

Output

    Got 0
    Got 1
    Got 2
    Got 3
    Got 4
    Got 5
    Got 6
    Got 7
    Got 8
    Got 9
    [None, None, None, None, None, None, None, None, None, None]
    => None

Thanks

Sayth
--
https://mail.python.org/mailman/listinfo/python-list
Re: Generator comprehension - list None
Thank you, that was quite easy. I was trying to work around it inside the generator; the print needs to be outside the generator to avoid collecting "None".

I wasn't really liking comprehensions, though Python 3 dict comprehensions are a really nice utility.

Sayth
--
https://mail.python.org/mailman/listinfo/python-list
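A short sketch of the point reached in this thread: print always returns None, so a generator expression built around print yields None for every item. Building the strings in the generator and printing afterwards keeps the values.

```python
# print() returns None, which is what ended up in the list.
result = print("Got 0")

# Generate the strings lazily, then print them outside the generator.
messages = ("Got {0}".format(num) for num in range(10))
collected = list(messages)

for line in collected:
    print(line)
```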
Re: need help for an assignment plz noob here
> > Hello guys. so my assignment consists in creating a key generator so i can
> > use it in later steps of my work. In my first step i have to write a
> > function called key_generator that receives an argument, letters, that
> > consists in a tuple of 25 caracters. The function returns a tuple of 5
> > tuples of caracters, each tuple with 5 elements.
> > Sounds confusing right? well I have the final product but i dont know how
> > to get there. I was told by my professor that I dont really have to use any
> > complicated knowledge of python so I have to keep it simple. Can u help me
> > plz? just give me a hint how to do it and i'll do everything plz.
> >
> > example:
> > Python Shell
> > >>> letters = (‘A’, ‘B’, ‘C’, ‘D’, ‘E’, ‘F’, ‘G’, \
> > ... ‘H’, ‘I’, ‘J’, ‘ ’, ‘L’, ‘M’, ‘N’, \
> > ... ‘O’, ‘P’, ‘Q’, ‘R’, ‘S’, ‘T’, ‘U’, \
> > ... ‘V’, ‘X’, ‘Z’, ‘.’)
> > >>> key_generator(letters)
> > ... ((‘A’, ‘B’, ‘C’, ‘D’, ‘E’),
> > ... (‘F’, ‘G’, ‘H’, ‘I’, ‘J’),
> > ... (‘ ’, ‘L’, ‘M’, ‘N’, ‘O’),
> > ... (‘P’, ‘Q’, ‘R’, ‘S’, ‘T’),
> > ... (‘U’, ‘V’, ‘X’, ‘Z’, ‘.’))
>
> Start by playing with letters in the shell and seeing what you can do
> with it. The word you're looking for is 'slices'. Figure out
> interactively what you need to do, then write it up.

I would assume a key generator to be random, so this may help.
http://stackoverflow.com/a/306417

Also, about tuples:
http://openbookproject.net/thinkcs/python/english3e/tuples.html

From the link:

    Of course, even if we can’t modify the elements of a tuple, we can always make the julia variable reference a new tuple holding different information. To construct the new tuple, it is convenient that we can slice parts of the old tuple and join up the bits to make the new tuple. So if julia has a new recent film, we could change her variable to reference a new tuple that used some information from the old one:

    >>> julia = julia[:3] + ("Eat Pray Love", 2010) + julia[5:]
    >>> julia
    ("Julia", "Roberts", 1967, "Eat Pray Love", 2010, "Actress", "Atlanta, Georgia")

Sayth
--
https://mail.python.org/mailman/listinfo/python-list
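The 'slices' hint in the quoted reply can be sketched as follows, assuming the 25 characters simply need cutting into consecutive groups of five with no randomness involved. This is only one possible reading of the assignment, not the professor's expected answer.

```python
def key_generator(letters):
    """Cut a 25-item tuple into five consecutive 5-item tuples using slices."""
    return tuple(letters[start:start + 5] for start in range(0, 25, 5))


letters = ('A', 'B', 'C', 'D', 'E', 'F', 'G',
           'H', 'I', 'J', ' ', 'L', 'M', 'N',
           'O', 'P', 'Q', 'R', 'S', 'T', 'U',
           'V', 'X', 'Z', '.')

key = key_generator(letters)
print(key)
```

Slicing a tuple already gives a tuple, so no conversion is needed for the inner groups.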
Inplace shuffle function returns none
If shuffle is an "in place" function and returns None, how do I obtain the values from it?

    from random import shuffle

    a = [1,2,3,4,5]
    b = shuffle(a)
    print(b[:3])

For example, here I just want to slice the first 3 numbers, which should be shuffled. However, you can't slice the NoneType object that b becomes.

So how do I get shuffle to give me my numbers?

Cheers

Sayth
--
https://mail.python.org/mailman/listinfo/python-list
Re: Inplace shuffle function returns none
So why can't I assign the resulting slice to a variable b? It just keeps getting None.

Sayth
--
https://mail.python.org/mailman/listinfo/python-list
Re: Inplace shuffle function returns none
OK, I think I do understand it. I searched the Python documentation for in-place functions but couldn't find a specific reference.

Is there a particular part of the docs, or a blog, that covers it? Or is it so fundamental that it is not treated explicitly on any one page?

Thanks

Sayth
--
https://mail.python.org/mailman/listinfo/python-list
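A short sketch of the in-place behaviour discussed in this thread. random.shuffle reorders the list it is given and returns None, as do list.sort and list.append, so the slice has to be taken from the list itself. random.sample is shown as an alternative that returns a new list and leaves the original alone.

```python
from random import sample, shuffle

a = [1, 2, 3, 4, 5]

returned = shuffle(a)   # a is reordered in place; the call itself returns None
b = a[:3]               # slice the shuffled list, not the return value

c = sample(a, 3)        # three distinct items in random order; a is not modified

print(returned, b, c)
```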
Re: Inplace shuffle function returns none
Hi Chris

I read this last night and thought I might wake with a frightfully witty response. I didn't, however.

Thanks :-)
--
https://mail.python.org/mailman/listinfo/python-list
Re: HTML templating tools
On Thursday, 20 October 2016 20:51:56 UTC+11, Tony van der Hoff wrote:
> For a long time, under Python 2.7, I have been using htmltmpl to
> generate htmjl pages programmatically from Python.
>
> However, htmltmpl is not available for python3, and doesn't look as if
> it ever will be. Can anyone recommend a suitable replacement (preferably
> compatible with htmltmpl)?
>
> Cheers, Tony
>
> --
> Ariège, France |

I had not heard of that; from googling, it seems it came to Python from Perl.

There is a smartypants version for Python (https://bitbucket.org/livibetter/smartypants.py); however, I will show you one I like, mainly because I don't like Jinja and this seems more stylish and clear to me. It is not on many wikis yet, but should be: pyjade, a Python port of Jade, which I think is probably the best template language.
https://github.com/syrusakbary/pyjade

Sayth
--
https://mail.python.org/mailman/listinfo/python-list
windows utf8 & lxml
Hi

I have been trying to get a script that works on Mint to work on Windows. The key blocker has been UTF-8 errors, most of which I have solved.

For the last error I am trying to overcome, the solution appears to be to use .decode('windows-1252') to correct an ASCII error.

I am using lxml to read my content, and decode is not supported there. Are there any known ways to read with lxml and fix Unicode faults?

The key part of my script is

    for content in roots:
        utf8_parser = etree.XMLParser(encoding='utf-8')
        fix_ascii = utf8_parser.decode('windows-1252')
        mytree = etree.fromstring(
            content.read().encode('utf-8'), parser=fix_ascii)

Without the added .decode my code looks like

    for content in roots:
        utf8_parser = etree.XMLParser(encoding='utf-8')
        mytree = etree.fromstring(
            content.read().encode('utf-8'), parser=utf8_parser)

However, doing it in such a fashion returns this error:

    UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 0: invalid start byte

I found this SO answer for it, http://stackoverflow.com/a/29217546/461887, but cannot seem to implement it with lxml.

Ideas?

Sayth
--
https://mail.python.org/mailman/listinfo/python-list
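A byte 0xff at position 0 would be consistent with a UTF-16 byte order mark, although that is only a guess from the error text. If that is the cause, one approach is to hand the parser the raw bytes and let it work out the encoding from the BOM and the XML declaration, rather than decoding to text first. The sketch below uses the standard library's xml.etree.ElementTree (lxml's etree.fromstring also accepts bytes), and the sample document is invented.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for a file saved as UTF-16; the encoded bytes start with a BOM.
raw = ('<?xml version="1.0" encoding="utf-16"?>'
       '<meeting venue="Randwick"/>').encode("utf-16")

# Bytes in: the parser detects the encoding itself, so no .decode() or .encode() is needed.
root = ET.fromstring(raw)
venue = root.get("venue")
print(venue)
```

With a real file, opening it in binary mode ('rb') and passing the file object to etree.parse follows the same idea.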
windows utf8 & lxml
Possibly I will have to use a different method from lxml, like this: http://stackoverflow.com/a/29057244/461887

Sayth
--
https://mail.python.org/mailman/listinfo/python-list
Re: windows utf8 & lxml
On Tuesday, 20 December 2016 22:54:03 UTC+11, Sayth Renshaw wrote:
> Hi
>
> I have been trying to get a script to work on windows that works on mint. The
> key blocker has been utf8 errors, most of which I have solved.
>
> Now however the last error I am trying to overcome, the solution appears to
> be to use the .decode('windows-1252') to correct an ascii error.
>
> I am using lxml to read my content and decode is not supported are there any
> known ways to read with lxml and fix unicode faults?
>
> The key part of my script is
>
> for content in roots:
>     utf8_parser = etree.XMLParser(encoding='utf-8')
>     fix_ascii = utf8_parser.decode('windows-1252')
>     mytree = etree.fromstring(
>         content.read().encode('utf-8'), parser=fix_ascii)
>
> Without the added .decode my code looks like
>
> for content in roots:
>     utf8_parser = etree.XMLParser(encoding='utf-8')
>     mytree = etree.fromstring(
>         content.read().encode('utf-8'), parser=utf8_parser)
>
> However doing it in such a fashion returns this error:
>
> UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 0:
> invalid start byte
> Which I found this SO for http://stackoverflow.com/a/29217546/461887 but
> cannot seem to implement with lxml.
>
> Ideas?
>
> Sayth

Why is windows so hard. Sort of running out of ideas, tried methods in the docs SO etc.

Currently

    for xml_data in roots:
        parser_xml = etree.XMLParser()
        mytree = etree.parse(xml_data, parser_xml)

Returns

    C:\Users\Sayth\Anaconda3\envs\race\python.exe C:/Users/Sayth/PycharmProjects/bs4race/race.py data/ -e *.xml
    Traceback (most recent call last):
      File "C:/Users/Sayth/PycharmProjects/bs4race/race.py", line 100, in <module>
        data_attr(rootObs)
      File "C:/Users/Sayth/PycharmProjects/bs4race/race.py", line 55, in data_attr
        mytree = etree.parse(xml_data, parser_xml)
      File "src/lxml/lxml.etree.pyx", line 3427, in lxml.etree.parse (src\lxml\lxml.etree.c:81110)
      File "src/lxml/parser.pxi", line 1832, in lxml.etree._parseDocument (src\lxml\lxml.etree.c:118109)
      File "src/lxml/parser.pxi", line 1852, in lxml.etree._parseFilelikeDocument (src\lxml\lxml.etree.c:118392)
      File "src/lxml/parser.pxi", line 1747, in lxml.etree._parseDocFromFilelike (src\lxml\lxml.etree.c:117180)
      File "src/lxml/parser.pxi", line 1162, in lxml.etree._BaseParser._parseDocFromFilelike (src\lxml\lxml.etree.c:111907)
      File "src/lxml/parser.pxi", line 595, in lxml.etree._ParserContext._handleParseResultDoc (src\lxml\lxml.etree.c:105102)
      File "src/lxml/parser.pxi", line 702, in lxml.etree._handleParseResult (src\lxml\lxml.etree.c:106769)
      File "src/lxml/lxml.etree.pyx", line 324, in lxml.etree._ExceptionContext._raise_if_stored (src\lxml\lxml.etree.c:12074)
      File "src/lxml/parser.pxi", line 373, in lxml.etree._FileReaderContext.copyToBuffer (src\lxml\lxml.etree.c:102431)
    io.UnsupportedOperation: read

    Process finished with exit code 1

Thoughts?

Sayth
--
https://mail.python.org/mailman/listinfo/python-list
for loop iter next if file bad
Hi

I am looping over a list of files and want to skip any empty files. I get an error that str is not an iterator, which I sort of understand but can't see a workaround for.

How do I make this an iterator so I can use next on the file if my test returns true?

Currently my code is:

    for dir_path, subdir_list, file_list in os.walk(my_dir):
        for name_pattern in file_list:
            full_path = os.path.join(dir_path, name_pattern)


    def return_files(file_list):
        """
        Take a list of files and return file when called.

        Calling function to supply attributes
        """
        for file in file_list:
            with open(os.path.join(dir_path, file), 'rb') as fd:
                if os.stat(fd.name).st_size == 0:
                    next(file)
                else:
                    yield fd

Exact error is:

    C:\Users\Sayth\Anaconda3\envs\race\python.exe C:/Users/Sayth/PycharmProjects/bs4race/race.py data/ -e *.xml
    Traceback (most recent call last):
      File "C:/Users/Sayth/PycharmProjects/bs4race/race.py", line 98, in <module>
        data_attr(rootObs)
      File "C:/Users/Sayth/PycharmProjects/bs4race/race.py", line 51, in data_attr
        for xml_data in roots:
      File "C:/Users/Sayth/PycharmProjects/bs4race/race.py", line 32, in return_files
        next(file)
    TypeError: 'str' object is not an iterator

Sayth
--
https://mail.python.org/mailman/listinfo/python-list
Re: for loop iter next if file bad
Ah yes. Thanks ChrisA http://www.tutorialspoint.com/python/python_loop_control.htm The continue Statement: The continue statement in Python returns the control to the beginning of the while loop. The continue statement rejects all the remaining statements in the current iteration of the loop and moves the control back to the top of the loop. The continue statement can be used in both while and for loops. Sayth -- https://mail.python.org/mailman/listinfo/python-list
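A sketch of the continue approach settled on in this thread, assuming the goal is simply to pass over zero-byte files. The name non_empty_files is made up; os.path.getsize checks the size without opening the file, and the temporary directory exists only to make the example self-contained.

```python
import os
import tempfile


def non_empty_files(paths):
    """Yield only the paths whose files contain at least one byte."""
    for path in paths:
        if os.path.getsize(path) == 0:
            continue          # skip to the next path in the for loop
        yield path


with tempfile.TemporaryDirectory() as tmp:
    empty = os.path.join(tmp, "empty.xml")
    full = os.path.join(tmp, "full.xml")
    open(empty, "w").close()              # zero-byte file
    with open(full, "w") as fh:
        fh.write("<meeting/>")
    kept = [os.path.basename(p) for p in non_empty_files([empty, full])]

print(kept)
```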
Screwing Up looping in Generator
Hi

This is simple, but it's getting me confused.

I have a csv writer that opens a file, loops over each line of each input file, and then closes, writing one file. I want to alter the behaviour so that one file is written for each input file. I saw a roundrobin example; however, it failed for me, as you cannot get len(generator) to drive a while loop. It exhausts. Should I use the same for again after the with open?

rootobs in this code is my generator, and I am already looping it, however

    def data_attr(roots):
        """Get the root object and iter items."""
        for file in rootobs:
            base = os.path.basename(file.name)
            write_to = os.path.join("output", os.path.splitext(base)[0] + ".csv")
            with open(write_to, 'w', newline='') as csvf:
                race_writer = csv.writer(csvf, delimiter=',')
                race_writer.writerow(
                    ["meet_id", "meet_venue", "meet_date", "meet_rail",
                     ...
                     # other categories here
                     ...
                     "jockeysurname", "jockeyfirstname"])
                for xml_data in roots:
                    ...
                    # parsing code
                    for noms in race_child:
                        if noms.tag == 'nomination':
                            race_writer.writerow(
                                [meet_id, meet_venue, meet_date,
                                 ...
                                 #parsing info removed
                                 noms.get("jockeyfirstname")])

Cheers

Sayth
--
https://mail.python.org/mailman/listinfo/python-list
Re: Screwing Up looping in Generator
So can I call the generator twice and receive the same file twice in 2 for loops?

Once to get the files name and the second to process?

    for file in rootobs:
        base = os.path.basename(file.name)
        write_to = os.path.join("output", os.path.splitext(base)[0] + ".csv")
        with open(write_to, 'w', newline='') as csvf:
            for file in rootobs:
                # create and write csv

Cheers

Sayth
--
https://mail.python.org/mailman/listinfo/python-list
Re: Screwing Up looping in Generator
On Wednesday, 4 January 2017 12:36:10 UTC+11, Sayth Renshaw wrote:
> So can I call the generator twice and receive the same file twice in 2 for
> loops?
>
> Once to get the files name and the second to process?
>
> for file in rootobs:
>     base = os.path.basename(file.name)
>     write_to = os.path.join("output", os.path.splitext(base)[0] + ".csv")
>     with open(write_to, 'w', newline='') as csvf:
>         for file in rootobs:
>             # create and write csv
>
> Cheers
>
> Sayth

I just need it to write after each file. However, the

    with open(#file) as csvf:

keeps everything open until every file has been processed, so it all ends up in one output file with the name of the first file in the generator.

Sayth
--
https://mail.python.org/mailman/listinfo/python-list
Re: Screwing Up looping in Generator
Untested as i wrote this in notepad at work but, if i first use the generator to create a set of filenames and then iterate it then call the generator anew to process file may work? Good idea or better available? def get_list_of_names(generator_arg): name_set = set() for name in generator_arg: base = os.path.basename(name.name) filename = os.path.splitext(base)[0] name_set.add(filename) return name_set for file_name in name_set: directory = "output" with open(os.path.join(directory, filename, 'w', newline='') as csvf: for file in rootobs: # create and write csv Sayth -- https://mail.python.org/mailman/listinfo/python-list
Re: Screwing Up looping in Generator
For completeness: I was close. This is the working code.

    def get_list_of_names(generator_arg):
        name_set = set()
        for name in generator_arg:
            base = os.path.basename(name.name)
            filename = os.path.splitext(base)[0]
            name_set.add(filename)
        return name_set


    def data_attr(roots):
        """Get the root object and iter items."""
        for name in names:
            directory = "output"
            write_name = name + ".csv"
            with open(os.path.join(directory, write_name), 'w', newline='') as csvf:
                race_writer = csv.writer(csvf, delimiter=',')

Thanks for your time and assistance. It's much appreciated.

Sayth
--
https://mail.python.org/mailman/listinfo/python-list
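An alternative sketch for the same goal (one CSV per input file) that walks the inputs only once, so the generator never needs to be consumed twice. It assumes each input can be presented as a (path, rows) pair. The function names, the two header columns and the sample rows are hypothetical, and a temporary directory replaces the real output folder.

```python
import csv
import os
import tempfile


def output_path(input_path, out_dir="output"):
    """Map data/name.xml to out_dir/name.csv."""
    base = os.path.basename(input_path)
    return os.path.join(out_dir, os.path.splitext(base)[0] + ".csv")


def write_one_csv_per_input(inputs, out_dir):
    """inputs is an iterable of (path, rows); each pair gets its own file."""
    written = []
    for path, rows in inputs:
        target = output_path(path, out_dir)
        # The with block closes this file before the next input is opened.
        with open(target, "w", newline="") as csvf:
            writer = csv.writer(csvf, delimiter=",")
            writer.writerow(["meet_id", "meet_venue"])
            writer.writerows(rows)
        written.append(target)
    return written


with tempfile.TemporaryDirectory() as out_dir:
    inputs = [("data/a.xml", [["1", "Randwick"]]),
              ("data/b.xml", [["2", "Rosehill"]])]
    written = write_one_csv_per_input(inputs, out_dir)
    names = sorted(os.path.basename(p) for p in written)

print(names)
```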
Is there a good process or library for validating changes to XML format?
Afternoon Is there a good library or way I could use to check that the author of the XML doc I am using doesn't make small changes to structure over releases? Not fully over this with XML but thought that XSD may be what I need, if I search "python XSD" I get a main result for PyXB and generateDS (https://pythonhosted.org/generateDS/). Both seem to be libraries for generating bindings to structures for parsing so maybe I am searching the wrong thing. What is the right thing to search? Cheers Sayth -- https://mail.python.org/mailman/listinfo/python-list
Re: Is there a good process or library for validating changes to XML format?
It definitely has more features than I knew: http://xmlsoft.org/xmllint.html

Essentially, though, it appears to be aimed at checking the validity and compliance of XML. I want to check the structure of one XML file against the previously known structure to ensure there are no changes.

Cheers

Sayth
--
https://mail.python.org/mailman/listinfo/python-list
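Short of writing a schema, one rough way to compare the structure of two releases is to collect every element path and attribute name from each document and diff the two sets. This is only a sketch under that assumption; it ignores ordering, repetition counts and values. The structure function and both sample documents are made up, and xml.etree.ElementTree is used so the example needs nothing beyond the standard library.

```python
import xml.etree.ElementTree as ET


def structure(element, prefix=""):
    """Return the set of element paths and 'path@attribute' names under element."""
    path = prefix + "/" + element.tag
    found = {path}
    found.update(path + "@" + name for name in element.attrib)
    for child in element:
        found |= structure(child, path)
    return found


# Hypothetical old and new releases: an attribute was renamed from number to num.
old = structure(ET.fromstring('<meeting id="1"><race number="1"/></meeting>'))
new = structure(ET.fromstring('<meeting id="2"><race num="1"/></meeting>'))

added = new - old
removed = old - new
print(sorted(added), sorted(removed))
```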
Re: Screwing Up looping in Generator
On Wednesday, 4 January 2017 12:36:10 UTC+11, Sayth Renshaw wrote: > So can I call the generator twice and receive the same file twice in 2 for loops? > > Once to get the files name and the second to process? > > for file in rootobs: > base = os.path.basename(file.name) > write_to = os.path.join("output", os.path.splitext(base)[0] + ".csv") > with open(write_to, 'w', newline='') as csvf: > for file in rootobs: > # create and write csv > > Cheers > > Sayth I just need it to write after each file however the with open(#file) as csvf: Keeps it all open until every file processed in an output file with the name of the first file in the generator. Sayth -- https://mail.python.org/mailman/listinfo/python-list
Re: Screwing Up looping in Generator
So can I call the generator twice and receive the same file twice in 2 for loops? Once to get the files name and the second to process? for file in rootobs: base = os.path.basename(file.name) write_to = os.path.join("output", os.path.splitext(base)[0] + ".csv") with open(write_to, 'w', newline='') as csvf: for file in rootobs: # create and write csv Cheers Sayth -- https://mail.python.org/mailman/listinfo/python-list
Re: Screwing Up looping in Generator
Untested as i wrote this in notepad at work but, if i first use the generator to create a set of filenames and then iterate it then call the generator anew to process file may work? Good idea or better available? def get_list_of_names(generator_arg): name_set = set() for name in generator_arg: base = os.path.basename(name.name) filename = os.path.splitext(base)[0] name_set.add(filename) return name_set for file_name in name_set: directory = "output" with open(os.path.join(directory, filename, 'w', newline='') as csvf: for file in rootobs: # create and write csv Sayth -- https://mail.python.org/mailman/listinfo/python-list
Re: Screwing Up looping in Generator
For completeness I was close this is the working code. def get_list_of_names(generator_arg): name_set = set() for name in generator_arg: base = os.path.basename(name.name) filename = os.path.splitext(base)[0] name_set.add(filename) return name_set def data_attr(roots): """Get the root object and iter items.""" for name in names: directory = "output" write_name = name + ".csv" with open(os.path.join(directory, write_name), 'w', newline='') as csvf: race_writer = csv.writer(csvf, delimiter=',' ) thanks for your time and assistance. It's much appreciated Sayth -- https://mail.python.org/mailman/listinfo/python-list
Is there a good process or library for validating changes to XML format
Afternoon Is there a good library or way I could use to check that the author of the XML doc I am using doesn't make small changes to structure over releases? Not fully over this with XML but thought that XSD may be what I need, if I search "python XSD" I get a main result for PyXB and generateDS (https://pythonhosted.org/generateDS/). Both seem to be libraries for generating bindings to structures for parsing so maybe I am searching the wrong thing. What is the right thing to search? Cheers Sayth -- https://mail.python.org/mailman/listinfo/python-list
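Validating against an XSD is the formal route and needs a third-party library. As a lighter alternative, the structure of two releases can be compared with the standard library alone. The sketch below is an assumption about what "small changes to structure" means (element paths and attribute names); structure_signature and the two sample documents are made up for illustration.

```python
import xml.etree.ElementTree as ET


def structure_signature(xml_text):
    """Return the set of (element path, sorted attribute names) in a document."""
    signature = set()

    def walk(element, prefix):
        path = prefix + "/" + element.tag
        signature.add((path, tuple(sorted(element.attrib))))
        for child in element:
            walk(child, path)

    walk(ET.fromstring(xml_text), "")
    return signature


# Two hypothetical releases; the second renames the "horse" attribute.
old = '<meeting id="1"><race number="1"><nomination id="7" horse="A"/></race></meeting>'
new = '<meeting id="1"><race number="1"><nomination id="7" name="A"/></race></meeting>'

removed = structure_signature(old) - structure_signature(new)
added = structure_signature(new) - structure_signature(old)
```

An empty removed and added means the element and attribute layout is unchanged. It says nothing about value types or required content, which is where a real schema would be needed.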
Re: Data Integrity Parsing json
On Thursday, 26 April 2018 07:57:28 UTC+10, Paul Rubin wrote: > Sayth Renshaw writes: > > What I am trying to figure out is how I give myself surety that the > > data I parse out is correct or will fail in an expected way. > > JSON is messier than people think. Here's an article with some > explanation and test code: > > http://seriot.ch/parsing_json.php Thanks Paul there is a lot of good information in that article. Sayth -- https://mail.python.org/mailman/listinfo/python-list
Format list of list sub elements keeping structure.
I have data which is a list of lists of all the full paths in a json document. How can I change the format to be usable when selecting elements?

    data = [['glossary'],
            ['glossary', 'title'],
            ['glossary', 'GlossDiv'],
            ['glossary', 'GlossDiv', 'title'],
            ['glossary', 'GlossDiv', 'GlossList'],
            ['glossary', 'GlossDiv', 'GlossList', 'GlossEntry'],
            ['glossary', 'GlossDiv', 'GlossList', 'GlossEntry', 'ID'],
            ['glossary', 'GlossDiv', 'GlossList', 'GlossEntry', 'SortAs'],
            ['glossary', 'GlossDiv', 'GlossList', 'GlossEntry', 'GlossTerm'],
            ['glossary', 'GlossDiv', 'GlossList', 'GlossEntry', 'Acronym'],
            ['glossary', 'GlossDiv', 'GlossList', 'GlossEntry', 'Abbrev'],
            ['glossary', 'GlossDiv', 'GlossList', 'GlossEntry', 'GlossDef'],
            ['glossary', 'GlossDiv', 'GlossList', 'GlossEntry', 'GlossDef', 'para'],
            ['glossary', 'GlossDiv', 'GlossList', 'GlossEntry', 'GlossDef', 'GlossSeeAlso'],
            ['glossary', 'GlossDiv', 'GlossList', 'GlossEntry', 'GlossDef', 'GlossSeeAlso', 0],
            ['glossary', 'GlossDiv', 'GlossList', 'GlossEntry', 'GlossDef', 'GlossSeeAlso', 1],
            ['glossary', 'GlossDiv', 'GlossList', 'GlossEntry', 'GlossSee']]

I am trying to change it to be:

    [['glossary'],
     ['glossary']['title'],
     ['glossary']['GlossDiv'],
    ]

Currently when I am formatting I am flattening the structure (accidentally).

    for item in data:
        for elem in item:
            out = ("[{0}]").format(elem)
            print(out)

Which gives

    [glossary]
    [title]
    [GlossDiv]
    [title]
    [GlossList]
    [GlossEntry]
    [ID]
    [SortAs]
    [GlossTerm]
    [Acronym]
    [Abbrev]
    [GlossDef]
    [para]
    [GlossSeeAlso]
    [0]
    [1]
    [GlossSee]

Cheers
Sayth
Re: Format list of list sub elements keeping structure.
> > > > for item in data: > > for elem in item: > > out = ("[{0}]").format(elem) > > print(out) > > Hint: print implicitly adds a newline to the output string. So collect all > the values of each sublist and print a line-at-time to output, or use the > end= argument of Py3's print, or find another solution. Also remember that > indention is significant in Python. Thanks Getting closer. answer = [] for item in data: for elem in item: out = ("[{0}]").format(elem) answer.append(out) print(answer) Think I need to bring it in a list not an element of a list and process it. Cheers Sayth -- https://mail.python.org/mailman/listinfo/python-list
Re: Format list of list sub elements keeping structure.
I am very close to the end result. I now have it as

Output

    [ ['[glossary]'],
      ['[glossary]', '[title]'],
      ['[glossary]', '[GlossDiv]'],
      ['[glossary]', '[GlossDiv]', '[title]'],
      ['[glossary]', '[GlossDiv]', '[GlossList]'],
      ['[glossary]', '[GlossDiv]', '[GlossList]', '[GlossEntry]'],
      .]

I used:

    elements = [['[{0}]'.format(element) for element in elements]for elements in data]

Is there a good way to strip the ', ' and make the list a list of 1 element lists? Thoughts on achieving this? So:

    [ ['[glossary]'],
      ['[glossary][title]'],
    ]

Cheers
Sayth
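One way to get a list of one-element lists is to join the formatted pieces of each sublist before wrapping the result in a list. A minimal sketch, using a shortened copy of the data from the thread:

```python
data = [["glossary"], ["glossary", "title"], ["glossary", "GlossDiv", "GlossList"]]

# Long form: build one string per sublist, then wrap it in a list.
joined = []
for sublist in data:
    text = ""
    for element in sublist:
        text += "[{0}]".format(element)
    joined.append([text])

# The same thing as a comprehension.
joined_short = [["".join("[{0}]".format(e) for e in sublist)] for sublist in data]
```

The inner loop handles "all elements of each list" and the append happens once per sublist, which keeps the outer structure intact.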
Re: Format list of list sub elements keeping structure.
On Tuesday, 24 July 2018 14:25:48 UTC+10, Rick Johnson wrote: > Sayth Renshaw wrote: > > > elements = [['[{0}]'.format(element) for element in elements]for elements > > in data] > > I would suggest you avoid list comprehensions until you master long-form > loops. My general issue is that I want to pick up all the elements in each sublist and operate on them to concatenate them together. However, using for loops I get each list then each element of each list. When in my head I want each list then all elements of each list. Sayth -- https://mail.python.org/mailman/listinfo/python-list
Re: Format list of list sub elements keeping structure.
On Tuesday, 24 July 2018 14:25:48 UTC+10, Rick Johnson wrote:
> Sayth Renshaw wrote:
>
> > elements = [['[{0}]'.format(element) for element in elements]for elements in data]
>
> I would suggest you avoid list comprehensions until you master long-form loops.

I actually have the answer, except for a glitch where one list element is an int. My code:

    for item in data:
        out = '[{0}]'.format("][".join(item))
        print(out)

which prints out

    [glossary]
    [glossary][title]
    [glossary][GlossDiv]
    [glossary][GlossDiv][title]
    [glossary][GlossDiv][GlossList]

However, in my source I have two lines like this

    ['glossary', 'GlossDiv', 'GlossList', 'GlossEntry', 'GlossDef', 'GlossSeeAlso', 0],
    ['glossary', 'GlossDiv', 'GlossList', 'GlossEntry', 'GlossDef', 'GlossSeeAlso', 1],

When it hits these lines I get

    TypeError: sequence item 6: expected str instance, int found

Do I need to do an explicit check for these 2 cases or is there a simpler way?

Cheers
Sayth
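str.join only accepts strings, so converting every part first avoids the TypeError without a special case for the two int entries. A short sketch on one of the failing paths; the second variant uses repr formatting, which is one possible way (an assumption about the desired output) to get quoted keys and bare indexes.

```python
path = ["glossary", "GlossDiv", "GlossList", "GlossEntry", "GlossDef", "GlossSeeAlso", 0]

# Convert each part to str before joining; ints no longer raise TypeError.
plain = "[{0}]".format("][".join(str(part) for part in path))

# repr keeps quotes around string keys and leaves integer indexes bare.
subscript = "".join("[{!r}]".format(part) for part in path)
```

The subscript form is the shape asked for at the start of the thread, with string keys quoted and list indexes left as numbers.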
Re: Format list of list sub elements keeping structure.
> myjson = ...
> path = "['foo']['bar'][42]"
> print(eval("myjson" + path))
>
> ?
>
> Wouldn't it be better to keep 'data' as is and use a helper function like
>
> def get_value(myjson, path):
>     for key_or_index in path:
>         myjson = myjson[key_or_index]
>     return myjson
>
> path = ['foo', 'bar', 42]
> print(get_value(myjson, path))
>
> ?

Currently I do leave the data as is; I extract the keys as full paths. If I use pprint as suggested I get close.

    [['glossary'],
     ['glossary', 'title'],
     ['glossary', 'GlossDiv'],
     ['glossary', 'GlossDiv', 'title'],
     ['glossary', 'GlossDiv', 'GlossList'],
     ['glossary', 'GlossDiv', 'GlossList', 'GlossEntry'],
     ['glossary', 'GlossDiv', 'GlossList', 'GlossEntry', 'ID'],
     ['glossary', 'GlossDiv', 'GlossList', 'GlossEntry', 'SortAs'],
     ['glossary', 'GlossDiv', 'GlossList', 'GlossEntry', 'GlossTerm'],
     ['glossary', 'GlossDiv', 'GlossList', 'GlossEntry', 'Acronym'],
     ...]

But to select elements from the json I need the format json['elem1']['elem2']. I want to be able to take any json in future, pass it into my function and return a list of all json key elements.

Then using this cool answer on SO https://stackoverflow.com/a/14692747/461887

    from functools import reduce  # forward compatibility for Python 3
    import operator

    def getFromDict(dataDict, mapList):
        return reduce(operator.getitem, mapList, dataDict)

    def setInDict(dataDict, mapList, value):
        getFromDict(dataDict, mapList[:-1])[mapList[-1]] = value

Then get the values from the keys

    >>> getFromDict(dataDict, ["a", "r"])
    1

That would mean that, if I get my function right, I could feed it any json, get all the full paths nicely printed, and then feed them back to the SO formula to get the values. It would essentially process itself and let me get a summary of all keys and their data.

Thanks
Sayth
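The two halves described here, listing every path and then reading the value at a path, can be sketched together as follows. iter_paths and get_from_path are hypothetical names, and the document is a cut-down version of the glossary example; get_from_path is the same reduce idea as the SO answer quoted above.

```python
from functools import reduce
import operator


def iter_paths(node, prefix=()):
    """Yield the full key/index path of every value nested inside node."""
    if isinstance(node, dict):
        children = node.items()
    elif isinstance(node, list):
        children = enumerate(node)
    else:
        return  # scalars have no children
    for key, child in children:
        path = prefix + (key,)
        yield list(path)
        yield from iter_paths(child, path)


def get_from_path(document, path):
    """Follow a list of keys and indexes down into the document."""
    return reduce(operator.getitem, path, document)


document = {
    "glossary": {
        "title": "example glossary",
        "GlossDiv": {"GlossSeeAlso": ["GML", "XML"]},
    }
}

paths = list(iter_paths(document))
last_value = get_from_path(document, paths[-1])
```

Keeping the paths as lists of keys and indexes means no string formatting or eval is needed to read the values back.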
Re: Format list of list sub elements keeping structure.
> > Then using this cool answer on SO [...] > > Oh. I thought you wanted to learn how to solve problems. I had no idea you > were auditioning for the James Dean part. My bad. Awesome response burn lol. I am trying to solve problems. Getting tired of dealing with JSON and having to figure out the structure each time. Just want to automate that part so I can move through the munging part and spend more time on higher value tasks. Cheers Sayth -- https://mail.python.org/mailman/listinfo/python-list
Re: Format list of list sub elements keeping structure.
> > Well, your code was close. All you needed was a little tweak > to make it work like you requested. So keep working at it, > and if you have a specific question, feel free to ask on the > list. > > Here's a tip. Try to simplify the problem. Instead of > looping over a list of lists, and then attempting to do a > format in the middle of an iteration, a format that you > really don't know how to do in a vacuum (no pressure, > right???), pull one of the sublists out and try to format it > _first_. IOWs: isolate the problem. > > And, when you can format _one_ list the way you want -- > spoiler alert! -- you can format an infinite number of lists > the way you want. Loops are cool like that. Well, most of > the time. > > The key to solving most complex problems is to (1) break > them down into small parts, (2) solve each small part, and > (3) assemble the whole puzzle. This is a skill you must > master. And it's really not difficult. It just requires a > different way of thinking about tasks. Thank you Rick, good advice. I really am enjoying coding at the moment, got myself and life in a good headspace. Cheers Sayth -- https://mail.python.org/mailman/listinfo/python-list
Use a function arg in soup
Hi. I want to use a function argument as an argument to a bs4 search for attributes. I had this working, but not as a function:

    # noms = soup.findAll('nomination')
    # nom_attrs = []
    # for attr in soup.nomination.attrs:
    #     nom_attrs.append(attr)

But as I wanted to keep finding other elements, I thought I would turn it into a generator function.

    def attrKey(element):
        for attr in soup.element.attrs:
            yield attr

This results in a NoneType error, as soup.element.attrs is used literally and the argument isn't substituted.

    # Attempt 2
    def attrKey(element):
        for attr in "soup.{}.attrs".format(element):
            yield attr

    # Attempt 3
    def attrKey(element):
        search_attr = "soup.{}.attrs".format(element)
        for attr in search_attr:
            yield attr

So I would call it like

    attrKey('nomination')

Any ideas on how the function argument can be used as the search attribute?

Thanks
Sayth
Re: Use a function arg in soup
Thanks Peter ### (2) attrs is a dict, so iterating over it will lose the values. Are you sure you want that? ### Yes initially I want just the keys as a list so I can choose to filter them out to only the ones I want. # getAttr Thanks very much will get my function up and working. Cheers Sayth -- https://mail.python.org/mailman/listinfo/python-list
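The underlying point is that soup.element looks for a tag literally called "element"; a name held in a variable has to be passed as a value, which in BeautifulSoup is what getattr(soup, element) or soup.find(element) do. To keep the sketch runnable without bs4, the example below shows the same pattern with the standard library's ElementTree as a stand-in; attr_keys and the sample XML are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def attr_keys(root, element):
    """Yield the attribute names of the first <element> found under root."""
    found = root.find(".//" + element)  # the tag name is passed as a value
    if found is None:
        return
    for key in found.attrib:
        yield key


root = ET.fromstring('<meeting><nomination id="171115" horse="Vergara"/></meeting>')
keys = list(attr_keys(root, "nomination"))
missing = list(attr_keys(root, "race"))
```

Returning early when nothing is found avoids the NoneType error from the original attempt when a tag name does not exist in the document.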
Advice on Python build tools
Hi Looking at the wiki list of build tools https://wiki.python.org/moin/ConfigurationAndBuildTools Has anyone much experience in build tools as i have no preference or experience to lean on. Off descriptions only i would choose invoke. My requirements, simply i want to learn and build a simple static website generator. Many i am not liking design of or are overkill so its a good opportunity to learn, logya is a good starting point for what i think a good python static generator should be. Second i want to use Jade templates (js) as i think they are more pythonic than jinja and mako so being able to have mixed js and python support would be needed. Thoughts? Sayth -- https://mail.python.org/mailman/listinfo/python-list
Re: OT: Anyone here use the ConEmu console app?
Win 10 will have full bash provided by project between Ubuntu and MS so that's pretty cool Sayth -- https://mail.python.org/mailman/listinfo/python-list
Re: Advice on Python build tools
On Tuesday, 12 April 2016 19:48:43 UTC+10, Sayth Renshaw wrote: > Hi > > Looking at the wiki list of build tools > https://wiki.python.org/moin/ConfigurationAndBuildTools > > Has anyone much experience in build tools as i have no preference or > experience to lean on. > > Off descriptions only i would choose invoke. > > My requirements, simply i want to learn and build a simple static website > generator. Many i am not liking design of or are overkill so its a good > opportunity to learn, logya is a good starting point for what i think a good > python static generator should be. > > Second i want to use Jade templates (js) as i think they are more pythonic > than jinja and mako so being able to have mixed js and python support would > be needed. > > Thoughts? > > Sayth Just to add if would affect your opinion I will be using Python 3 only so no py2 dependency and this is the project logya I was referring to https://github.com/yaph/logya Sayth -- https://mail.python.org/mailman/listinfo/python-list
Re: Advice on Python build tools
Thanks for the tips. Doit does look interesting. Regarding template plugins with Nikola, the plugins would only be for Python template alternatives such as Mako. Mainly I find the whitespace and readability of Jade/Pug far more Pythonic than all the brackets {% %}. Yes, it's a minor thing, but so much clearer. Anyway, I checked out Mako, which has some improvement. I might see if there is another with support, create a Nikola plugin and then give it a try.

Cheers
Sayth

On Thu, 14 Apr 2016 1:19 am Chris Warrick wrote:
> On 12 April 2016 at 11:48, Sayth Renshaw wrote:
> > Hi
> >
> > Looking at the wiki list of build tools
> > https://wiki.python.org/moin/ConfigurationAndBuildTools
> >
> > Has anyone much experience in build tools as i have no preference or experience to lean on.
> >
> > Off descriptions only i would choose invoke.
> >
> > My requirements, simply i want to learn and build a simple static website generator. Many i am not liking design of or are overkill so its a good opportunity to learn, logya is a good starting point for what i think a good python static generator should be.
> >
> > Second i want to use Jade templates (js) as i think they are more pythonic than jinja and mako so being able to have mixed js and python support would be needed.
> >
> > Thoughts?
> >
> > Sayth
> > --
> > https://mail.python.org/mailman/listinfo/python-list
>
> Here’s a great static site generator (disclaimer, I’m a core dev over there):
>
> https://getnikola.com/
>
> We use doit, which is on that list. With doit, we get an existing build system, and incremental rebuilds — for free. I recommend you try Nikola, and if you don’t like it and still want to build something yourself, doit is going to be a great way to do it. That said, incremental builds often involve trial-and-error and subtle bugs when you start working on it. And if you don’t like doit, you can always write your own build micro-system. Because if you want to write something simple and minimal, an existing large build system will just make things harder.
>
> As for Jade templates, you can’t do that reasonably. You would need to produce some hack to spawn a JavaScript subprocess, and it would limit what you can use in templates. Instead, look for a template system that is written in Python and that has similar syntax.
>
> (also, I wouldn’t consider such weird-thing-into-real-HTML template engines pythonic)
>
> --
> Chris Warrick <https://chriswarrick.com/>
> PGP: 5EAAEA16
Create a forecast estimate updated with actuals weekly
Hi

Wondering if someone has this knowledge, and please forgive my maths expressions. If I want to estimate the results needed to achieve a goal at the end of a term, updated weekly with real results, how would I structure this?

So, as an example to illustrate my thought process (which could be wrong), these are Bill's results for the first two weeks. Bill's goal is 60% after 5 weeks.

            wk1    wk2    wk3   wk4   wk5
    Bill    54.5%  57.1%

These are the results that get it to the current position:

    wk1          wk2
    get  opp     get  opp
    6    11      4    7

So 6/11 etc. I am thinking that to achieve this I then need to estimate an average opportunity rate, rounded up: (11 + 7)/2 = 9. So if we had a 5 week term:

    wk3            wk4            wk5
    get  avgopp    get  avgopp    get  avgopp
    X    9         X    9         X    9

So doing it manually I would discover Bill needs 6 out of 9 each week, which results in:

    sumget  28
    sumopp  45
    result  0.6

But how do I structure this so that the new results, when known for week 3, update and adjust the following estimates?

Thanks
Sayth
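One way to structure this is a function that takes the actual results so far and recomputes the target for the remaining weeks each time it is called. This is a sketch under the assumptions stated in the post (average opportunities rounded up, the same target for every remaining week); weekly_target is a hypothetical name and the week 3 result used at the end is made up.

```python
from fractions import Fraction
from math import ceil


def weekly_target(results, goal, total_weeks):
    """Return (estimated opportunities per week, gets needed per remaining week).

    results is a list of (gets, opportunities) pairs for the weeks played so far.
    """
    weeks_left = total_weeks - len(results)
    if weeks_left <= 0:
        raise ValueError("no weeks left to plan for")
    gets = sum(g for g, _ in results)
    opps = sum(o for _, o in results)
    # Average opportunities per week so far, rounded up as in the post.
    avg_opp = ceil(Fraction(opps, len(results)))
    projected_opps = opps + avg_opp * weeks_left
    # Total gets required to reach the goal over the projected opportunities.
    still_needed = max(0, ceil(goal * projected_opps) - gets)
    return avg_opp, ceil(Fraction(still_needed, weeks_left))


goal = Fraction(60, 100)
after_week_2 = weekly_target([(6, 11), (4, 7)], goal, 5)

# Hypothetical week 3 result appended; the estimate is simply recomputed.
after_week_3 = weekly_target([(6, 11), (4, 7), (8, 10)], goal, 5)
```

With the two real weeks this reproduces the manual answer of 6 out of 9 per week. Each new week is just another pair appended to the list, so the later estimates adjust automatically.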
What iterable method should I use for Lists of Lists
Hi

I have an XML file and am using pyquery to obtain the elements within it and then write them to csv.

What is the best, most reliable way to take dictionaries of each element and print them (csv write later) based on each position, so get item 0 of each list, then item 1, and so on?

Any other code I post is open to criticism. Because there are many attributes I will want to collect, my thought is to create a list of lists. Again, this seems a bit clunky, so I could be wrong.

    from pyquery import PyQuery as pq

    d = pq(filename='20160319RHIL0_edit.xml')
    res = d('nomination')
    # myAt = pq.each(res.attr('bbid'))
    # print(repr(res))
    # myAt = [res.eq(i).attr('horse') for i in range(len(res))]
    # print(myAt)

    nomID = [res.eq(i).attr('id') for i in range(len(res))]
    horseName = [res.eq(i).attr('horse') for i in range(len(res))]
    group = [nomID, horseName]

    for items in group:
        print(items)

This is my source.

    Of $15 and trophies of $1000. First $9 and trophies of $1000 to owner, second $3, third $15000, fourth $7500, fifth $3000, sixth $1500, seventh $1500, eighth $1500
    No class restriction, Set Weights plus Penalties, For Three-Years-Old and Upwards, Fillies and Mares, (Group 3)
    No Allowances for apprentices. Field Limit: 14 + 4 EM

If I do this

    nomID = [res.eq(i).attr('id') for i in range(len(res))]
    horseName = [res.eq(i).attr('horse') for i in range(len(res))]
    print(nomID, horseName)

it comes out correctly

    In [7]: 171115 Vergara

Since I will be taking another 10 attributes out of the nomination category, an efficient way that ensures data integrity would be valued.

Thanks
Sayth
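Taking "item 0 of each list, then item 1" is what zip does when the column lists are unpacked into it. A minimal sketch with the standard library only; apart from the one id and horse name printed in the post, the values are placeholders.

```python
import csv
import io

# One list per attribute, in the same nomination order.
nom_ids = ["171115", "171116"]          # second id is a placeholder
horse_names = ["Vergara", "Horse B"]    # second name is a placeholder
group = [nom_ids, horse_names]

# zip(*group) pairs up position 0 of every list, then position 1, and so on.
rows = list(zip(*group))

# Write the rows to an in-memory buffer; a real file opened with newline=''
# would work the same way.
buffer = io.StringIO()
csv.writer(buffer).writerows(rows)
csv_text = buffer.getvalue()
```

Note that zip stops at the shortest list, so a missing attribute in one column would silently drop rows. Building each row per nomination, as a list of attributes for that one element, avoids that risk.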
Re: What iterable method should I use for Lists of Lists
On Monday, 18 April 2016 12:05:39 UTC+10, Sayth Renshaw wrote: > Hi > > I have an XML and using pyquery to obtain the elements within it and then > write it to csv. > > What is the best most reliable way to take dictionaries of each element, and > print them(csv write later) based on each position so get item 0 of each list > and then it 1 and so on. > > Any other code I post is open to criticism. Because there are many attributes > I will want to collect my thought is to create a list of lists, again seems a > bit clunky so could be wrong. > > from pyquery import PyQuery as pq > > > d = pq(filename='20160319RHIL0_edit.xml') > res = d('nomination') > # myAt = pq.each(res.attr('bbid')) > # print(repr(res)) > # myAt = [res.eq(i).attr('horse') for i in range(len(res))] > # print(myAt) > > nomID = [res.eq(i).attr('horse') for i in range(len(res))] > horseName = [res.eq(i).attr('horse') for i in range(len(res))] > group = [nomID, horseName] > > for items in group: > print(items) > > > This is my source. > > > date="2016-03-19T00:00:00" gearchanges="-1" stewardsreport="-1" gearlist="-1" > racebook="0" postracestewards="0" meetingtype="TAB" rail="Timing - Electronic > : Rail - +2m" weather="Fine " trackcondition="Good" > nomsdeadline="2016-03-14T11:00:00" weightsdeadline="2016-03-15T16:00:00" > acceptdeadline="2016-03-16T09:00:00" jockeydeadline="2016-03-16T12:00:00"> >website="http://"; /> >stage="Results" distance="1900" minweight="0" raisedweight="0" class="~ > " age="3U" grade="0" weightcondition="SWP " trophy="1000" > owner="1000" trainer="0" jockey="0" strapper="0" totalprize="15" > first="9" second="3" third="15000" fourth="7500" fifth="3000" > time="2016-03-19T12:40:00" bonustype=" " nomsfee="0" acceptfee="0" > trackcondition="Good " timingmethod="Electronic" fastesttime="1-56.83 > " sectionaltime="600/35.3 " formavailable="0" racebookprize="Of $15 and > trophies of $1000. 
First $9 and trophies of $1000 to owner, second > $3, third $15000, fourth $7500, fifth $3000, sixth $1500, seventh $1500, > eighth $1500"> > Of $15 and trophies of $1000. First $9 and > trophies of $1000 to owner, second $3, third $15000, fourth $7500, fifth > $3000, sixth $1500, seventh $1500, eighth $1500 > No class restriction, Set Weights plus Penalties, For > Three-Years-Old and Upwards, Fillies and Mares, (Group 3) > No Allowances for apprentices. Field Limit: 14 + 4 > EM > idnumber="" regnumber="" blinkers="1" trainernumber="38701" > trainersurname="Cummings" trainerfirstname="Anthony" trainertrack="Randwick" > rsbtrainername="Anthony Cummings" jockeynumber="86876" > jockeysurname="McDonald" jockeyfirstname="James" barrier="7" weight="55" > rating="93" description="B M 5 Snippetson x Graces Spirit (Flying Spur)" > colours="Yellow, Red Epaulettes And Cap" owners="Anthony Cummings > Thoroughbreds Pty Ltd Syndicate (Mgrs: A & B Cummings) & P C Racing > Investments Syndicate (Mgr: P J Carroll) " dob="2010-10-07T00:00:00" age="6" > sex="M" career="30-7-4-2 $295445.00" thistrack="6-1-1-0 $90500.00" > thisdistance="0-0-0-0" goodtrack="17-3-2-2 $101440.00" heavytrack="5-0-1-0 > $20200.00" slowtrack="" deadtrack="" fasttrack="0-0-0-0" firstup="7-2-1-2 > $108340.00" secondup="7-1-1-0 $43200.00" mindistancewin="0" > maxdistancewin="0" finished="1" weightvariation="0" variedweight="55" > decimalmargin="0.00" penalty="0" pricestarting="$12" sectional200="0" sectional400="0" sectional600="0" sectional800="0" sectional1200="0" bonusindicator="" /> > idnumber="" regnumber="" blinkers="0" trainernumber="736" > trainersurname="Martin" trainerfirstname="Tim" trainer
Re: scipy install error,need help its important
On Monday, 18 April 2016 13:53:30 UTC+10, Xristos Xristoou wrote: > guys i have big proplem i want to install scipy > but all time show me error > i have python 2.7 and windows 10 > i try to use pip install scipy and i take that error > > raise NotFoundError('no lapack/blas resources found') > numpy.distutils.system_info.NotFoundError: no lapack/blas resources found > > > Command "C:\Python27\python.exe -u -c "import setuptools, > tokenize;__file__='c:\\users\\name\\appdata\\local\\temp\\pip-build-a3fjaf\\scipy\\setup.py';exec(compile(getattr(tokenize, > 'open', open)(__file__).read().replace('\r\n', '\n'), __file__, 'exec'))" > install --record > c:\users\name\appdata\local\temp\pip-pgtkuz-record\install-record.txt > --single-version-externally-managed --compile" failed with error code 1 in > c:\users\name\appdata\local\temp\pip-build-a3fjaf\scipy\ Either install and use anaconda https://www.continuum.io/downloads or use these builds to install. http://www.lfd.uci.edu/~gohlke/pythonlibs/ Cheers Sayth -- https://mail.python.org/mailman/listinfo/python-list
Re: scipy install error,need help its important
Oh, and I would choose the Anaconda route. Then you can use conda to easily manage libraries that could otherwise be difficult to install on Windows.

Sayth
Re: What iterable method should I use for Lists of Lists
> > You're getting very chatty with yourself, which is fine... but do you > need to quote your entire previous message every time? We really don't > need another copy of your XML file in every post. > > Thanks! > > ChrisA Oops sorry, everytime I posted I then thought of another resource and kept reading. I have a working messy solution hopefully I can resolve it to something nicer. Sayth -- https://mail.python.org/mailman/listinfo/python-list
How to track files processed
Hi If you are parsing files in a directory what is the best way to record which files were actioned? So that if i re-parse the directory i only parse the new files in the directory? Thanks Sayth -- https://mail.python.org/mailman/listinfo/python-list
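One simple approach is to keep a small manifest of the file names already handled and skip those on the next run. The sketch below is only one possible design and is not taken from the replies in the thread: the function names, the JSON manifest and the demo file names are all assumptions.

```python
import json
import os
import tempfile


def load_manifest(manifest_path):
    """Return the set of file names recorded as already processed."""
    if not os.path.exists(manifest_path):
        return set()
    with open(manifest_path) as fh:
        return set(json.load(fh))


def unprocessed(directory, manifest_path):
    """List files in directory that the manifest does not mention yet."""
    seen = load_manifest(manifest_path)
    return sorted(name for name in os.listdir(directory) if name not in seen)


def mark_processed(names, manifest_path):
    """Add names to the manifest after they were handled successfully."""
    seen = load_manifest(manifest_path) | set(names)
    with open(manifest_path, "w") as fh:
        json.dump(sorted(seen), fh)


# Demo with throwaway directories; the manifest lives outside the data directory.
data_dir = tempfile.mkdtemp()
manifest = os.path.join(tempfile.mkdtemp(), "processed.json")
for name in ("20160402RAND0.xml", "20160409RAND0.xml"):
    open(os.path.join(data_dir, name), "w").close()

first_run = unprocessed(data_dir, manifest)
mark_processed(first_run, manifest)

open(os.path.join(data_dir, "20160416RAND0.xml"), "w").close()
second_run = unprocessed(data_dir, manifest)
```

Tracking by name assumes files are never modified after they arrive. If they can change, recording a modification time or a hash alongside the name would be needed.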
Re: How to track files processed
Thank you Martin and Peter.

To clarify, Peter: at the moment I am only writing to CSV, but I want to set up an item pipeline to an SQL database next.

Martin, I will have a go at your examples and see how I go.

Thank you both for taking the time to help.

Sayth
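A minimal sketch of one common way to track processed files, not necessarily what Martin or Peter suggested: keep a small JSON manifest of file names already handled and skip those on the next run. The names `load_seen`, `process_new_files`, `manifest_path` and `handler` are hypothetical, chosen only for this illustration.

```python
import json
import os


def load_seen(manifest_path):
    """Return the set of file names recorded in the manifest (empty if none)."""
    if not os.path.exists(manifest_path):
        return set()
    with open(manifest_path) as f:
        return set(json.load(f))


def process_new_files(directory, manifest_path, handler):
    """Call handler(path) for each file not yet in the manifest, then record it."""
    seen = load_seen(manifest_path)
    processed = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if name in seen or not os.path.isfile(path):
            continue  # already handled on an earlier run, or not a regular file
        handler(path)
        seen.add(name)
        processed.append(name)
    # Rewrite the manifest so the next run skips everything handled so far.
    with open(manifest_path, "w") as f:
        json.dump(sorted(seen), f)
    return processed
```

The manifest is assumed to live outside the scanned directory, otherwise it would be picked up as a data file itself. Once the data goes to an SQL database, the same idea could probably be expressed as a table of processed file names instead of a JSON file.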
Why are my files in in my list - os module used with sys argv
Hi

Why would it be that my files are not being found in this script?

    from pyquery import PyQuery as pq
    import pandas as pd
    import os
    import sys

    if len(sys.argv) == 2:
        print("no params")
        sys.exit(1)

    dir = sys.argv[1]
    mask = sys.argv[2]
    files = os.listdir(dir)

    fileResult = filter(lambda x: x.endswith(mask), files)

    # d = pq(filename='20160319RHIL0_edit.xml')
    data = []

    for file in fileResult:
        print(file)

    for items in fileResult:
        d = pq(filename=items)
        res = d('nomination')
        attrs = ('id', 'horse')
        data = [[res.eq(i).attr(x) for x in attrs] for i in range(len(res))]

    # from nominations
    # res = d('nomination')
    # nomID = [res.eq(i).attr('id') for i in range(len(res))]
    # horseName = [res.eq(i).attr('horse') for i in range(len(res))]
    # attrs = ('id', 'horse')

    frames = pd.DataFrame(data)
    print(frames)

I am running this from the bash prompt as

    (pyquery)sayth@sayth-E6410:~/Projects/pyquery$ python jqxml.py samples *.xml

my directory structure

    (pyquery)sayth@sayth-E6410:~/Projects/pyquery$ ls -a
    .  ..  environment.yml  .git  .gitignore  #jqxml.py#  jqxml.py  samples

and samples contains

    (pyquery)sayth@sayth-E6410:~/Projects/pyquery/samples$ ls -a
    .   20160319RHIL0_edit.xml  20160409RAND0.xml
    ..  20160402RAND0.xml       20160416RAND0.xml

yet I get no files out of the print statement.

Ideas?

Sayth
Re: Why are my files in in my list - os module used with sys argv
On Tuesday, 19 April 2016 18:17:02 UTC+10, Peter Otten wrote:
> Steven D'Aprano wrote:
>
> > On Tue, 19 Apr 2016 09:44 am, Sayth Renshaw wrote:
> >
> >> Hi
> >>
> >> Why would it be that my files are not being found in this script?
> >
> > You are calling the script with:
> >
> >     python jqxml.py samples *.xml
> >
> > This does not do what you think it does: under Linux shells, the glob
> > *.xml will be expanded by the shell. Fortunately, in your case, you have
> > no files in the current directory matching the glob *.xml, so it is not
> > expanded and the arguments your script receives are:
> >
> >     "python jqxml.py"  # not used
> >     "samples"          # dir
> >     "*.xml"            # mask
> >
> > You then call:
> >
> >     fileResult = filter(lambda x: x.endswith(mask), files)
> >
> > which looks for file names which end with a literal string (asterisk, dot,
> > x, m, l) in that order. You have no files that match that string.
> >
> > At the shell prompt, enter this:
> >
> >     touch samples/junk\*.xml
> >
> > and run the script again, and you should see that it now matches one file.
> >
> > Instead, what you should do is:
> >
> > (1) Use the glob module:
> >
> >     https://docs.python.org/2/library/glob.html
> >     https://docs.python.org/3/library/glob.html
> >
> >     https://pymotw.com/2/glob/
> >     https://pymotw.com/3/glob/
> >
> > (2) When calling the script, avoid the shell expanding wildcards by
> > escaping them or quoting them:
> >
> >     python jqxml.py samples "*.xml"
>
> (3) *Use* the expansion mechanism provided by the shell instead of fighting
> it:
>
>     $ python jqxml.py samples/*.xml
>
> This requires that you change your script
>
>     from pyquery import PyQuery as pq
>     import pandas as pd
>     import sys
>
>     fileResult = sys.argv[1:]
>
>     if not fileResult:
>         print("no files specified")
>         sys.exit(1)
>
>     for file in fileResult:
>         print(file)
>
>     for items in fileResult:
>         try:
>             d = pq(filename=items)
>         except FileNotFoundError as e:
>             print(e)
>             continue
>         res = d('nomination')
>         # you could move the attrs definition before the loop
>         attrs = ('id', 'horse')
>         # probably a bug: you are overwriting data on every iteration
>         data = [[res.eq(i).attr(x) for x in attrs] for i in range(len(res))]
>
> I think this is the most natural approach if you are willing to accept the
> quirk that the script tries to process the file 'samples/*.xml' if the
> samples directory doesn't contain any files with the .xml suffix. Common
> shell tools work that way:
>
>     $ ls samples/*.xml
>     samples/1.xml  samples/2.xml  samples/3.xml
>     $ ls samples/*.XML
>     ls: cannot access samples/*.XML: No such file or directory
>
> Unrelated: instead of working with sys.argv directly you could use argparse
> which is part of the standard library. The code to get at least one file is
>
>     import argparse
>
>     parser = argparse.ArgumentParser()
>     parser.add_argument("files", nargs="+")
>     args = parser.parse_args()
>
>     print(args.files)
>
> Note that this doesn't fix the shell expansion oddity.
Hi

Thanks for the insight. After doing a little reading I found this post, which uses both argparse and glob and attempts to cover both the Windows and bash handling of wildcards:
http://breathmintsforpenguins.blogspot.com.au/2013/09/python-crossplatform-handling-of.html

    import argparse
    from glob import glob

    def main(file_names):
        print file_names

    if __name__ == "__main__":
        parser = argparse.ArgumentParser()
        parser.add_argument("file_names", nargs='*')
        # nargs='*' tells it to combine all positional arguments into a single list
        args = parser.parse_args()
        file_names = list()
        # go through all of the arguments and replace ones with wildcards with the expansion
        # if a string does not contain a wildcard, glob will return it as is.
        for arg in args.file_names:
            file_names += glob(arg)
        main(file_names)

And, way beyond my needs for such a tiny script, there is Click, which I think is the Flask developers' package for creating Python CLIs and is based on optparse:
http://click.pocoo.org/5/why/#why-not-argparse

> # probably a bug: you are overwriting data on every iteration
> data = [[res.eq(i).attr(x) for x in attrs] for i in range(len(res))]

Thanks for picking this up. I will have to append to it on each iteration for each attribute.

Thank you

Sayth
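Steven's point about the mask being compared literally can be illustrated with a small sketch. The file names below are taken from the samples listing earlier in the thread plus one made-up `notes.txt`; the standard-library `fnmatch` module is used here as one way to treat the mask as a shell-style pattern inside the script.

```python
import fnmatch

names = ["20160319RHIL0_edit.xml", "20160402RAND0.xml", "notes.txt"]
mask = "*.xml"

# str.endswith compares literally, so a name would need to end in "*.xml",
# asterisk included, to match.
literal = [n for n in names if n.endswith(mask)]

# fnmatch.filter interprets the mask as a shell-style wildcard pattern.
pattern = fnmatch.filter(names, mask)

print(literal)
print(pattern)
```

With the quoted form `python jqxml.py samples "*.xml"` the script receives the mask unexpanded, so a pattern match of this kind (or `glob.glob(os.path.join(dir, mask))`) is what would pick up the files.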
Re: Why are my files in in my list - os module used with sys argv
On Tuesday, 19 April 2016 23:46:01 UTC+10, Peter Otten wrote:
> Sayth Renshaw wrote:
>
> > Thanks for the insight, after doing a little reading I found this post
> > which uses both argparse and glob and attempts to cover the windows and
> > bash expansion of wildcards,
> > http://breathmintsforpenguins.blogspot.com.au/2013/09/python-crossplatform-handling-of.html
>
> I hope you read the comment section of that page carefully.
> On Linux your script's behaviour will be surprising.

Yes, I have gone your way now and am parsing the files. Working out where my data is going will have to wait until after I sleep.

Thanks for the advice.

    from pyquery import PyQuery as pq
    import pandas as pd
    import argparse
    # from glob import glob

    parser = argparse.ArgumentParser(description=None)


    def GetArgs(parser):
        """Parser function using argparse"""
        # parser.add_argument('directory', help='directory use',
        #                     action='store', nargs='*')
        parser.add_argument("files", nargs="+")
        return parser.parse_args()

    fileList = GetArgs(parser)
    print(fileList.files)
    # d = pq(filename='20160319RHIL0_edit.xml')
    data = []
    attrs = ('id', 'horse')
    for items in fileList.files:
        d = pq(filename=items)
        res = d('nomination')
        dataSets = [[res.eq(i).attr(x) for x in attrs]
                    for i in range(len(res))]
        resultList = data.append(dataSets)

    frames = pd.DataFrame(resultList)
    print(frames)

--

    (pyquery)sayth@sayth-E6410:~/Projects/pyquery$ python jqxml.py samples/*.xml
    ['samples/20160319RHIL0_edit.xml', 'samples/20160402RAND0.xml', 'samples/20160409RAND0.xml', 'samples/20160416RAND0.xml']
    Empty DataFrame
    Columns: []
    Index: []
    (pyquery)sayth@sayth-E6410:~/Projects/pyquery$

Thanks

Sayth
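The empty frame in that output is consistent with `resultList` being `None`: `list.append` changes the list in place and returns nothing, so `pd.DataFrame(resultList)` is built from `None`. A minimal sketch of the difference, using made-up rows in place of the parsed XML (pandas is left out so the snippet runs on its own):

```python
# Hypothetical rows standing in for the per-file output of the list comprehension.
rows_per_file = [
    [["1", "Horse A"], ["2", "Horse B"]],  # first file
    [["3", "Horse C"]],                    # second file
]

data = []
for rows in rows_per_file:
    result = data.append(rows)  # append mutates data in place and returns None

flat = []
for rows in rows_per_file:
    flat.extend(rows)  # extend adds each row, giving one flat list of rows

print(result)  # None
print(flat)
```

Passing the accumulated list itself (here `flat`) to `pd.DataFrame` after the loop would probably give the table that was intended, one row per nomination across all files.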
Controlling the passing of data
Hi

This file contains my biggest roadblock with programming, and that's the abstract nature of needing to pass data from one thing to the next.

In my file here I need to traverse and modify the XML. I don't want to re-store it or put it in a new variable or another format; I just want to alter it and let it flow on to the list comprehensions as they were.

Once I can get on top of this mentally I will be able to do so much better. I think I am trying to manage it in my head as if it were water and plumbing.

In particular, here I am taking the id from each race and putting it into the children of that race, called nomination.

I have put a comment above the new code which is causing the difficulty.

    from pyquery import PyQuery as pq
    import pandas as pd
    import argparse
    import numpy as np
    # from glob import glob

    parser = argparse.ArgumentParser(description=None)


    def GetArgs(parser):
        """Parser function using argparse"""
        # parser.add_argument('directory', help='directory use',
        #                     action='store', nargs='*')
        parser.add_argument("files", nargs="+")
        return parser.parse_args()

    fileList = GetArgs(parser)
    # print(fileList.files)

    data = []
    horseattrs = ('race_id', 'id', 'horse', 'number', 'finished', 'age', 'sex',
                  'blinkers', 'trainernumber', 'career', 'thistrack', 'firstup',
                  'secondup', 'variedweight', 'weight', 'pricestarting')
    meetattrs = ('id', 'venue', 'date', 'rail', 'weather', 'trackcondition')
    raceattrs = ('id', 'number', 'shortname', 'stage', 'distance', 'grade',
                 'age', 'weightcondition', 'fastesttime', 'sectionaltime')
    clubattrs = ('code')
    frames = pd.DataFrame([])
    noms = []

    for items in fileList.files:
        d = pq(filename=items)
        meet = d('meeting')
        club = d('club')
        race = d('race')
        res = d('nomination')
        # d('p').filter(lambda i: i == 1)
        # Here i need to traverse and modify but I don't want to restore the
        # structure just pass it on. So I can use it in the following list
        # comprehensions as I had before.
        for race_el in d('race'):
            race = pq(race_el)
            race_id = race.attr('id')
            for nom_el in race.items('nomination'):
                res.append((pq(nom_el).attr('raceid', race_id)))

        resdata = [[res.eq(i).attr(x) for x in horseattrs]
                   for i in range(len(res))]
        # print(dataSets)
        meetdata = [[meet.eq(i).attr(x) for x in meetattrs]
                    for i in range(len(meet))]
        racedata = [[race.eq(i).attr(x) for x in raceattrs]
                    for i in range(len(race))]
        clubdata = [[club.eq(i).attr(x) for x in clubattrs]
                    for i in range(len(club))]
        raceid = [row[0] for row in racedata]
        # L = [x + [0] for x in L]
        # print(resdata)
        # resdata = [raceid[i] for i in raceid x + i for x in resdata]
        # for number of classes equalling nomination in the each category of
        # race inset raceid into resdata
        #
        # print(resdata)
        # clubdf = pd.DataFrame(clubdata)
        # meetdf = pd.DataFrame(meetdata)
        # racedf = pd.DataFrame(racedata)
        # resdf = pd.DataFrame(resdata)
        # frames = frames.append(clubdf)
        # frames = frames.append(meetdf)
        # # frames = frames.append(racedf)
        # frames = frames.append(resdf)
        # print(frames)
        # frames.to_csv('~/testingFrame5.csv', encoding='utf-8')

Thanks

Sayth
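A minimal sketch of the "alter it and let it flow on" idea, using the standard-library `xml.etree.ElementTree` rather than pyquery so that it runs on its own. The XML snippet and its values are invented for illustration; only the element and attribute names (`meeting`, `race`, `nomination`, `id`, `horse`, `race_id`) follow the post.

```python
import xml.etree.ElementTree as ET

# Invented sample data; the real files have many more attributes.
xml_text = """
<meeting id="m1">
  <race id="r1"><nomination id="n1" horse="A"/><nomination id="n2" horse="B"/></race>
  <race id="r2"><nomination id="n3" horse="C"/></race>
</meeting>
"""
root = ET.fromstring(xml_text)

# Modify the tree in place: copy each race's id onto its direct child nominations.
for race in root.iter("race"):
    for nom in race.findall("nomination"):
        nom.set("race_id", race.get("id"))

# The later comprehension reads from the same, already modified tree,
# so no intermediate copy or extra variable is needed.
horseattrs = ("race_id", "id", "horse")
resdata = [[nom.get(name) for name in horseattrs]
           for nom in root.iter("nomination")]
print(resdata)
```

Two details in the posted code look worth checking: the attribute is written as `'raceid'` while `horseattrs` reads `'race_id'`, and `clubattrs = ('code')` is a plain string rather than a one-element tuple (`('code',)`), so iterating over it yields single characters.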
Re: Controlling the passing of data
> Your actual problem is drowned in too much source code. Can you restate it
> in English, optionally with a few small snippets of Python?
>
> It is not even clear what the code you provide should accomplish once it's
> running as desired.
>
> To give at least one code-related advice: You have a few repetitions of the
> following structure
>
> > meetattrs = ('id', 'venue', 'date', 'rail', 'weather', 'trackcondition')
>
> > meet = d('meeting')
>
> > meetdata = [[meet.eq(i).attr(x)
> >              for x in meetattrs] for i in range(len(meet))]
>
> You should move the pieces into a function that works for meetings, clubs,
> races, and so on. Finally (If I am repeating myself so be it): the occurrence
> of range(len(something)) in your code is a strong indication that you are
> not using Python the way Guido intended it. Iterate over the `something`
> directly whenever possible.

Hi Peter

> meetattrs = ('id', 'venue', 'date', 'rail', 'weather', 'trackcondition')

is created to define a list of attrs in the XML. Rather than referencing each attr individually, I create a list and pass it into

> meetdata = [[meet.eq(i).attr(x)
>              for x in meetattrs] for i in range(len(meet))]

This list comprehension reads the XML attr by attr, using meet = d('meeting') as the call to pyquery to locate the class in the XML and identify the attrs.

I do apologise for the lack of output. I asked a question about parsing, which I always seem to get wrong: I overthink it and then find the solution is simpler than I thought.

The output is 4 tables, one per class, each holding the selected attributes, e.g. meetattrs = ('id', 'venue', 'date', 'rail', 'weather', 'trackcondition') from the meeting class of the XML.

I want to keep the flexibility and the relational structure they have defined. However, when exporting the nominations section via pandas to CSV, I found that I had no way to determine which race each nomination belonged to. That is, there was no race_id in the nominations to relate them back to the race, and also no meeting id in the race to relate it back to the meeting.

So I wanted to traverse all the classes (Meeting, Race and Nomination) and insert the id of each class into its direct children only. Since there are many races per meeting and many nominations per race, I need to ensure that it is the direct children only.

It was otherwise working: the code supplied parsed the data, pushed it to pandas and used its CSV writing capability.

So I inserted

    for race_el in d('race'):
        race = pq(race_el)
        race_id = race.attr('id')
        for nom_el in race.items('nomination'):
            res.append((pq(nom_el).attr('raceid', race_id)))

which traverses and inserts the race_id into the child nominations. However, what boggles me is how to pass this to the list comprehension that was working, without changing the data from XML or creating another intermediate step and variable. I just want to parse it as it was, but with the extra race_id included.

Thanks

Sayth
Re: Controlling the passing of data
On Friday, 29 April 2016 01:19:28 UTC+10, Dan Strohl wrote:
> If I am reading this correctly... you have something like (you will have to
> excuse my lack of knowledge about what kinds of information these actually
> are):
>
> 1234
> first
>
> 5678
> second
>
> And you want something like:
> nominations = [(1,1234), (2,5678)]
> meetings = [(1,'first'),(2,'second')]
>
> if that is correct, my suggestion is to do something like (this is pseudo
> code, I didn't look up the exact calls to use):
>
> nomination_list = []
> meeting_list = []
>
> for race_element in xml_file('race'):
>     id = race_element.get_attr('id')
>     for nomination_element in race_element('nomination'):
>         nomination = nomination_element.get_text()
>         nomination_list.append((id, nomination))
>
>     for meeting_element in race_element('meeting'):
>         meeting = meeting_element.get_text()
>         meeting_list.append((id, meeting))

Yes, in essence that is what I am trying to achieve. However, the XML I have has many attributes like this. For example, this is one nomination.

Therefore I thought that if I tried to do it like the code you posted it would soon become unwieldy.

> for race_element in xml_file('race'):
>     id = race_element.get_attr('id')
>     for nomination_element in race_element('nomination'):
>         nomination = nomination_element.get_text()
>         nomination_list.append((id, nomination))

So I created a list of the attributes of each class (meeting, race, nomination) and then passed that list through the list comprehension.

On writing out the output, though, I realised that whilst each class worked, I had no way to relate the race to the meeting or the nomination to the race, so if I then loaded the CSV or created SQL to push it to a db it would lose its relations.

So when I say

    meetattrs = ('id', 'venue', 'date', 'rail', 'weather', 'trackcondition')

in my thinking this is a table:

    Meeting
    id    venue    date    rail    weather    trackcondition

There is no foreign key relation to race, so in this question I am asking: shouldn't I put the meeting_id as a foreign key into the race attributes before parsing race, so that I have an 'id' in meeting related to the new 'race_id' in race?

The id of race would then be put in nomination before parsing, and I would do the same?

Hoping this is clearer. I am probably a little too close to the problem to express it clearly, so I apologise for that.

Sayth
Re: Pivot table of Pandas
On Friday, 29 April 2016 09:56:13 UTC+10, David Shi wrote:
> Hello, Matt,
> Please see the web link: Pandas Pivot Table Explained (pbpython.com)
>
> Debra and Fred have their own groups.
> How to split the pivot table into separate tables?
> What types of objects are these pivot tables?
> How to access the "manager" column?
> the pivot table is interesting to users, but it is very different from
> databases which we normally know.
> Looking forward to hearing from you.
> Regards.
> David

I am unsure of your exact requirements, but this doc on reshaping seems to cover what you posted:
http://pandas.pydata.org/pandas-docs/stable/reshaping.html

You would not get 2 tables; that would be counter-intuitive for a pivot table. You would just define a group in the rows and the columns as the data, or vice versa. So:

                      data of some sort
    Debra
      team_member1    44
      team_member2    56
    Fred
      team_member1    62
      team_member2    33

Sayth
Re: Controlling the passing of data
> because a set avoids duplicates. If you say "I want to document my
> achievements for posterity" I would recommend that you print to a file
> rather than append to a list and the original code could be changed to
>
> with open("somefile", "w") as f:
>     for achievement in my_achievements:
>         print(achievement.description, file=f)
>
> Back to my coding hint: Don't repeat yourself. If you move the pieces
>
> >> > meetattrs = ('id', 'venue', 'date', 'rail', 'weather',
> >> > 'trackcondition')
> >>
> >> > meet = d('meeting')
> >>
> >> > meetdata = [[meet.eq(i).attr(x)
> >> > for x in meetattrs] for i in range(len(meet))]
>
> into a function
>
> def extract_attrs(nodes, attrs):
>     return [[nodes.eq(i).attr(name) for name in attrs]
>             for i in range(len(nodes))]
>
> You can reuse it for clubs, races, etc.:
>
> meetdata = extract_attrs(d("meeting"), meetattrs)
> racedata = extract_attrs(d("race"), raceattrs)
>
> If you put the parts into a dict you can generalize even further:
>
> tables = {
>     "meeting": ([], meetattrs),
>     "race": ([], raceattrs),
> }
> for name, (data, attrs) in tables.items():
>     data.extend(extract_attrs(d(name), attrs))

I find that really cool. It reads well too; I hadn't considered approaching it that way at all.

> So you want to go from a tree structure to a set of tables that preserves
> the structure by adding foreign keys. You could try a slightly different
> approach, something like
>
> for meeting in meetings:
>     meeting_table.append(...meeting attrs...)
>     meeting_id = ...
>     for race in meeting:
>         race_table.append(meeting_id, ...race attrs...)
>         race_id = ...
>         for nomination in race:
>             nomination_table.append(race_id, ...nomination attrs...)
>
> I don't know how to spell this in PyQuery -- with lxml you could do
> something like
>
> meeting_table = []
> race_table = []
> nomination_table = []
> tree = lxml.etree.parse(filename)
> for meeting in tree.xpath("/meeting"):
>     meeting_table.append([meeting.attrib[name] for name in meetattrs])
>     meeting_id = meeting.attrib["id"]
>     for race in meeting.xpath("./race"):
>         race_table.append(
>             [meeting_id] + [race.attrib[name] for name in raceattrs])
>         race_id = race.attrib["id"]
>         for nomination in race.xpath("./nomination"):
>             nomination_table.append(
>                 [race_id]
>                 + [nomination.attrib[name] for name in horseattrs])
>
> Not as clean and not as general as I would hope -- basically I'm neglecting
> my recommendation from above -- but if it works for you I might take a
> second look later.

I need to play around with this just to understand it more; I really like it. I might try to implement your advice from before and put it in a function.

Sayth
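As a rough, self-contained variant of the nested-loop idea above, the following sketch builds the three tables with foreign keys using the standard-library `xml.etree.ElementTree` instead of lxml, and pulls the repeated row-building into one helper. The sample XML, its values, the shortened attribute tuples and the helper name `row` are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Invented sample; attribute names follow the thread, values are made up.
SAMPLE = """<meeting id="m1" venue="Randwick">
  <race id="r1" distance="1000">
    <nomination id="n1" horse="Alpha"/>
    <nomination id="n2" horse="Bravo"/>
  </race>
  <race id="r2" distance="1600">
    <nomination id="n3" horse="Charlie"/>
  </race>
</meeting>"""

meetattrs = ("id", "venue")
raceattrs = ("id", "distance")
horseattrs = ("id", "horse")


def row(node, attrs, *foreign_keys):
    """Return the foreign keys followed by the selected attributes of node."""
    return list(foreign_keys) + [node.get(name) for name in attrs]


meeting_table, race_table, nomination_table = [], [], []
meeting = ET.fromstring(SAMPLE)  # the root element is the meeting
meeting_table.append(row(meeting, meetattrs))
for race in meeting.findall("race"):  # findall with a bare tag: direct children only
    race_table.append(row(race, raceattrs, meeting.get("id")))
    for nomination in race.findall("nomination"):
        nomination_table.append(row(nomination, horseattrs, race.get("id")))

print(race_table)
print(nomination_table)
```

Each child row starts with its parent's id, so the tables can be written to CSV or loaded into SQL and still be joined back together. Iterating over the elements directly also avoids the `range(len(...))` pattern mentioned earlier in the thread.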
Code Opinion - Enumerate
I have been looking at various Python implementations of Conway's Game of Life and came across one on Rosetta Code using defaultdict.
http://rosettacode.org/wiki/Conway%27s_Game_of_Life#Python

Just looking for your opinion on style: would you write it like this, continually calling range, or would you use enumerate instead, or neither (something far better)?

    import random
    from collections import defaultdict

    printdead, printlive = '-#'
    maxgenerations = 3
    cellcount = 3,3
    celltable = defaultdict(int, {
        (1, 2): 1,
        (1, 3): 1,
        (0, 3): 1,
        } ) # Only need to populate with the keys leading to life

    ##
    ## Start States
    ##
    # blinker
    u = universe = defaultdict(int)
    u[(1,0)], u[(1,1)], u[(1,2)] = 1,1,1

    for i in range(maxgenerations):
        print "\nGeneration %3i:" % ( i, )
        for row in range(cellcount[1]):
            print " ", ''.join(str(universe[(row,col)])
                               for col in range(cellcount[0])).replace(
                                   '0', printdead).replace('1', printlive)
        nextgeneration = defaultdict(int)
        for row in range(cellcount[1]):
            for col in range(cellcount[0]):
                nextgeneration[(row,col)] = celltable[
                    ( universe[(row,col)],
                      -universe[(row,col)] + sum(universe[(r,c)]
                                                 for r in range(row-1,row+2)
                                                 for c in range(col-1, col+2) )
                    ) ]
        universe = nextgeneration

I have just finished watching Ned Batchelder's talk and am wondering how far I should take his advice.
http://nedbatchelder.com/text/iter.html

Thanks

Sayth
Re: Code Opinion - Enumerate
Also not using enumerate, but with no ugly "for i in range" loops: this implementation from Code Review uses a generator over live cells only.
http://codereview.stackexchange.com/a/108121/104381

    def neighbors(cell):
        x, y = cell
        yield x - 1, y - 1
        yield x, y - 1
        yield x + 1, y - 1
        yield x - 1, y
        yield x + 1, y
        yield x - 1, y + 1
        yield x, y + 1
        yield x + 1, y + 1

    def apply_iteration(board):
        new_board = set([])
        candidates = board.union(set(n for cell in board for n in neighbors(cell)))
        for cell in candidates:
            count = sum((n in board) for n in neighbors(cell))
            if count == 3 or (count == 2 and cell in board):
                new_board.add(cell)
        return new_board

    if __name__ == "__main__":
        board = {(0,1), (1,2), (2,0), (2,1), (2,2)}
        number_of_iterations = 10
        for _ in xrange(number_of_iterations):
            board = apply_iteration(board)
        print board

Sayth
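For comparison, a rough Python 3 rendering of the same set-based idea, checked against the blinker from the Rosetta Code version. It is a sketch rather than a port of either original: the function name `step` and the use of `itertools.product` to generate the eight neighbour offsets are choices made here for brevity.

```python
from itertools import product


def neighbors(cell):
    """Yield the eight cells surrounding the given (x, y) cell."""
    x, y = cell
    for dx, dy in product((-1, 0, 1), repeat=2):
        if (dx, dy) != (0, 0):
            yield x + dx, y + dy


def step(board):
    """Return the next generation for a set of live cells."""
    # Only live cells and their neighbours can be alive in the next generation.
    candidates = board | {n for cell in board for n in neighbors(cell)}
    new_board = set()
    for cell in candidates:
        count = sum(n in board for n in neighbors(cell))
        if count == 3 or (count == 2 and cell in board):
            new_board.add(cell)
    return new_board


blinker = {(1, 0), (1, 1), (1, 2)}
after_one = step(blinker)
after_two = step(after_one)
print(sorted(after_one))
```

Neither `range` nor `enumerate` is needed for the grid here, because the loops run over the cells themselves rather than over row and column indices.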