Hi all, I started learning Python just a little while ago, and I am stuck on something that is really simple but that I just can't figure out.
Essentially I need to take a text document with some chemical information in Czech and organize it into another text file. The information is always EINECS number, CAS, chemical name, and formula in tables. I need to organize them into lines with | in between. So it goes from:

200-763-1 71-73-8 nátrium-tiopentál C11H18N2O2S.Na

to:

200-763-1|71-73-8|nátrium-tiopentál|C11H18N2O2S.Na

but if I have a chemical like: kyselina močová

I get:

200-720-7|69-93-2|kyselina|močová |C5H4N4O3|200-763-1|71-73-8|nátrium-tiopentál

and then it is all off.

How can I get Python to realize that a chemical name may have a space in it?

Thank you,
Patrick

So far I have:

#take tables in one text file and organize them into lines in another
import codecs

path = "c:\\text_samples\\chem_1_utf8.txt"
path2 = "c:\\text_samples\\chem_2.txt"
input = codecs.open(path, 'r','utf8')
output = codecs.open(path2, 'w', 'utf8')

#read and enter into a list
chem_file = []
chem_file.append(input.read())

#split words and store them in a list
for word in chem_file:
    words = word.split()

#starting values in list
e=0 #EINECS
c=1 #CAS
ch=2 #chemical name
f=3 #formula
n=0
loop=1
x=len(words) #counts how many words there are in the file

print '-'*100
while loop==1:
    if n<x and f<=x:
        print words[e], '|', words[c], '|', words[ch], '|', words[f], '\n'
        output.write(words[e])
        output.write('|')
        output.write(words[c])
        output.write('|')
        output.write(words[ch])
        output.write('|')
        output.write(words[f])
        output.write('\r\n')
        #increase variables by 4 to get next set
        e = e + 4
        c = c + 4
        ch = ch + 4
        f = f + 4
        # increase by 1 to repeat
        n=n+1
    else:
        loop=0

input.close()
output.close()
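One possible way to handle names that contain spaces, offered only as a sketch: if each record sits on its own line, and only the chemical name can contain spaces, then the first two tokens are the EINECS and CAS numbers, the last token is the formula, and everything in between is the name. Both of those conditions are assumptions about the input file, not something the post confirms. The helper name format_record is hypothetical, and the sketch is written for Python 3 rather than the Python 2 used above.

```python
# Sketch only (Python 3). Assumes one record per line and that only the
# chemical name may contain spaces. format_record is a hypothetical helper.

def format_record(line):
    """Turn 'EINECS CAS name with spaces FORMULA' into a |-separated line."""
    parts = line.split()
    if len(parts) < 4:
        raise ValueError("expected at least 4 fields: %r" % line)
    einecs, cas = parts[0], parts[1]   # first two tokens
    formula = parts[-1]                # last token
    name = " ".join(parts[2:-1])       # everything in between is the name
    return "|".join([einecs, cas, name, formula])


# Sample records taken from the post.
sample = [
    "200-720-7 69-93-2 kyselina močová C5H4N4O3",
    "200-763-1 71-73-8 nátrium-tiopentál C11H18N2O2S.Na",
]
results = [format_record(line) for line in sample]
for row in results:
    print(row)
```

If the table rows are not one per line in the real file, this line-based split would not apply as written, and the record boundaries would have to be found some other way, for example by recognizing the EINECS number pattern.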