On Sunday, August 30, 2015 at 1:16:12 PM UTC-4, MRAB wrote:
> On 2015-08-30 17:31, kbtyo wrote:
> > On Saturday, August 29, 2015 at 10:50:18 PM UTC-4, MRAB wrote:
> >> On 2015-08-30 03:05, kbtyo wrote:
> >> > I am using Jupyter Notebook and Python 3.4. I have a data structure in
> >> > the format, (type list):
> >> >
> >> > [{'AccountNumber': N,
> >> >   'Amount': '0',
> >> >   'Answer': '12:00:00 PM',
> >> >   'ID': None,
> >> >   'Type': 'WriteLetters',
> >> >   'Amount': '10',
> >> >  {'AccountNumber': Y,
> >> >   'Amount': '0',
> >> >   'Answer': ' 12:00:00 PM',
> >> >   'ID': None,
> >> >   'Type': 'Transfer',
> >> >   'Amount': '2'}]
> >> >
> >> > The end goal is to write this out to CSV.
> >> >
> >> > For the above example the output would look like:
> >> >
> >> > AccountNumber, Amount, Answer, ID, Type, Amount
> >> > N,0,12:00:00 PM,None,WriteLetters,10
> >> > Y,2,12:00:00 PM,None,Transfer,2
> >> >
> >> > Below is the function that I am using to write out this data structure.
> >> > Please excuse any indentation formatting issues. The data structure is
> >> > returned through the function "construct_results(get_just_xml_data)".
> >> >
> >> > The data that is returned is in the format as above.
> >> > "construct_headers(get_just_xml_data)" returns a list of headers.
> >> > Writing out the row for "headers_list" works.
> >> >
> >> > The list comprehension "data" is to maintain the integrity of the column
> >> > headers and the values for each new instance of the data structure
> >> > (where the keys in the dictionary are the headers and values - row
> >> > instances). The keys in this specific data structure are meant to check
> >> > if there is a value instance, and if there is not - place an ''.
> >> >
> >> > def write_to_csv(results, headers):
> >> >
> >> >     headers = construct_headers(get_just_xml_data)
> >> >     results = construct_results(get_just_xml_data)
> >> >     headers_list = list(headers)
> >> >
> >> >     with open('real_csv_output.csv', 'wt') as f:
> >> >         writer = csv.writer(f)
> >> >         writer.writerow(headers_list)
> >> >         for row in results:
> >> >             data = [row.get(index, '') for index in results]
> >> >         writer.writerow(data)
> >> >
> >> >
> >> >
> >> > However, when I run this, I receive this error:
> >> >
> >> > ---------------------------------------------------------------------------
> >> > TypeError                                 Traceback (most recent call last)
> >> > <ipython-input-747-7746797fc9a5> in <module>()
> >> > ----> 1 write_to_csv(results, headers)
> >> >
> >> > <ipython-input-746-c822437eeaf0> in write_to_csv(results, headers)
> >> >       9         writer.writerow(headers_list)
> >> >      10         for item in results:
> >> > ---> 11             data = [item.get(index, '') for index in results]
> >> >      12         writer.writerow(data)
> >> >
> >> > <ipython-input-746-c822437eeaf0> in <listcomp>(.0)
> >> >       9         writer.writerow(headers_list)
> >> >      10         for item in results:
> >> > ---> 11             data = [item.get(index, '') for index in results]
> >> >      12         writer.writerow(data)
> >> >
> >> > TypeError: unhashable type: 'dict'
> >> >
> >> >
> >> > I have done some research, namely, the following:
> >> >
> >> > https://mail.python.org/pipermail//tutor/2011-November/086761.html
> >> >
> >> > http://stackoverflow.com/questions/27435798/unhashable-type-dict-type-error
> >> >
> >> > http://stackoverflow.com/questions/1957396/why-dict-objects-are-unhashable-in-python
> >> >
> >> > However, I am still perplexed by this error. Any feedback is welcomed.
> >> > Thank you.
> >> >
> >> You're taking the index values from 'results' instead of 'headers'.
> >
> > Would you be able to elaborate on this? I partially understand what you
> > mean. However, each dictionary (of results) has the same keys to map to
> > (aka, headers when written out to CSV). I am wondering if you would be able
> > to explain how the index is being used in this case?
> >
> In the list comprehension on line 11, you have "item.get(index, '')".
>
> What is 'index'?
>
> You have "for index in results" in the list comprehension, and 'results'
> is a list of dicts, therefore 'index' is a _dict_.
>
> That means that you're trying to look up an entry in the 'item' dict
> using a _dict_ as the key.
>
> Oh, and incidentally, line 12 should be indented to the same level as
> line 11.
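
For reference, the point made above about 'index' can be sketched roughly like this: the comprehension walks the header names rather than 'results', so each lookup key is a string. This is only an illustration, not the original code; the write_rows name, the sample rows and the in-memory buffer are made up here, and sorting the headers is an assumption to get a stable column order out of a set.

```python
import csv
import io


def write_rows(out, results, headers):
    """Write one CSV row per dict, with columns in the order of the headers."""
    # A set has no defined order, so fix one before writing anything.
    headers_list = sorted(headers)
    writer = csv.writer(out)
    writer.writerow(headers_list)
    for row in results:
        # Look up each header *name* in the row; use '' where the key is missing.
        data = [row.get(name, '') for name in headers_list]
        # Written inside the loop, so every row is emitted, not just the last one.
        writer.writerow(data)


# Hypothetical sample data, only to show the behaviour.
results = [
    {'AccountNumber': 'N', 'Type': 'WriteLetters', 'Amount': '10'},
    {'AccountNumber': 'Y', 'Type': 'Transfer'},
]
headers = {'AccountNumber', 'Amount', 'Type'}

buf = io.StringIO()  # stands in for the open output file
write_rows(buf, results, headers)
print(buf.getvalue())
```

csv.DictWriter with restval='' would give the same fill-in for missing keys without the explicit comprehension.
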
Yes, as mentioned in my OP, please forgive the formatting issues with
indentation.

I feel that I need to provide some context to avoid any confusion over my
motivations for choosing to do something. My original task was to parse an XML
data structure stored in a CSV file alongside other data types, and then add
the elements back as headers and their text as row values.

I went back to the drawing board and created a "results" list of dictionaries
in which each key maps to a list of values, using this:

    def convert_list_to_dict(get_just_xml_data):
        d = {}
        for item in get_just_xml_data(get_all_data):
            for k, v in item.items():
                try:
                    d[k].append(v)
                except KeyError:
                    d[k] = [v]
        return d

This creates a dictionary with one entry per XML tag - for example:

    {'Number1': ['0'],
     'Number2': ['0'],
     'Number3': ['0'],
     'Number4': ['0'],
     'Number5': ['0'],
     'RepgenName': [None],
     'RTpes': ['Execution', 'Letters'],
     'RTID': ['3', '5']}

I then used this to create a "headers" set (to prevent duplicates from being
added) and the list of dictionaries that I mentioned in my OP. I achieve this
via:

    # just headers
    def construct_headers(convert_list_to_dict):
        header = set()
        with open('real.csv', 'rU') as infile:
            reader = csv.DictReader(infile)
            for row in reader:
                xml_data = convert_list_to_dict(get_just_xml_data)  # get_just_xml_data(get_all_data)
                row.update(xml_data)
                header.update(row.keys())
        return header

    # get all of the results
    def construct_results(convert_list_to_dict):
        header = set()
        results = []
        with open('real.csv', 'rU') as infile:
            reader = csv.DictReader(infile)
            for row in reader:
                xml_data = convert_list_to_dict(get_just_xml_data)  # get_just_xml_data(get_all_data)
                # print(row)
                row.update(xml_data)
                # print(row)
                results.append(row)
                # print(results)
                header.update(row.keys())
        # print(type(results))
        return results

I guess I am using the headers list originally written out. My initial thought
is to just write out the values corresponding with each transaction.
For example, citing this data structure:

    {'Number1': ['0'],
     'Number2': ['0'],
     'Number3': ['0'],
     'Number4': ['0'],
     'Number5': ['0'],
     'RPN': [None],
     'RTypes': ['Execution', 'Letters'],
     'RTID': ['3', '5']}

I would get a CSV like this:

    Number1, Number2, Number3, Number4, Number5, RPN, RTypes, RTID
    0, 0, 0, 0, 0, None, Execution, 3
    None, None, None, None, None, Letters, 5

I am wondering how I would achieve this when the headers set is not sorted
(should I sort it before writing this out?). Also, since I have millions of
transactions, I want to make sure that the values for each header are placed
in sequence.

Any guidance would be very helpful. Thanks.
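
To make the question concrete, here is a rough sketch of one way the dictionary of lists above could be turned into rows, using itertools.zip_longest to pad the shorter columns so every row has one value per header. It is a sketch under assumptions rather than a recommended solution: the columns_to_rows name and the explicit fieldnames list are made up for illustration, and an in-memory buffer stands in for the output file.

```python
import csv
import io
from itertools import zip_longest


def columns_to_rows(columns, fieldnames):
    """Turn {'header': [v1, v2, ...]} into rows, padding short columns with None."""
    # Pull the value lists out in a fixed header order; a missing header gives [].
    value_lists = [columns.get(name, []) for name in fieldnames]
    # zip_longest keeps going until the longest column is exhausted.
    return [list(row) for row in zip_longest(*value_lists, fillvalue=None)]


xml_data = {
    'Number1': ['0'], 'Number2': ['0'], 'Number3': ['0'], 'Number4': ['0'],
    'Number5': ['0'], 'RPN': [None],
    'RTypes': ['Execution', 'Letters'], 'RTID': ['3', '5'],
}

# Assumed column order; sorted(headers) is another way to fix an order for a set.
fieldnames = ['Number1', 'Number2', 'Number3', 'Number4', 'Number5',
              'RPN', 'RTypes', 'RTID']

rows = columns_to_rows(xml_data, fieldnames)

buf = io.StringIO()  # stands in for the open output file
writer = csv.writer(buf)
writer.writerow(fieldnames)
writer.writerows(rows)
print(buf.getvalue())
```

Note that csv.writer writes None as an empty field, so the padded cells come out blank rather than as the text "None". Because the header order is fixed once and reused for every row, the values stay lined up with their columns however many transactions are written.
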