beginner in python
Hi everybody,

I am a beginner in Python. I have to fetch the redundant entries from a file. My code:

    import re
    L = []
    fh = open('ARCHITECTURE_MAIN.txt','r')
    for line in fh.readlines():
        data = line.strip()
        # splitted = data.split('#')
        L.append(data)
    fh.close()
    M = L
    for x in L:
        x = x.split('#')
        for y in M:
            y = y.split('#')
            x_data = x[0],x[1],x[2],x[3]
            #print x_data
            y_data = y[0],y[1],y[2],y[3]
            #print y_data
            if x_data[0] == y_data[0]:
                print x_data

I get the result as a tuple. The text file has data separated by hashes, and I have to check only up to the fourth field (entry#isoform#start#stop#):

    00250_1#ARCH_104#61#89#Literature#9224948#00250
    00250_1#ARCH_104#97#126#Literature#9224948#00250
    00250_1#ARCH_104#139#186#Literature#9224948#00250
    00251_1#ARCH_463#7#59#SMART##00251
    00251_1#ARCH_463#91#121#SMART##00251
    00251_1#ARCH_463#251#414#SMART##00251
    00251_1#ARCH_463#540#624#SMART##00251
    00252_1#ARCH_474#1#21#Literature#8136357#00252
    00252_1#ARCH_393#481#501#Literature#8136357#00252
    00252_1#ARCH_463#523#553#SMART##00252
    00253_1#ARCH_82#37#362#SMART##00253
    00253_1#ARCH_54#365#522#SMART##00253
    00253_1#ARCH_104#589#617#SMART##00253
    00253_1#ARCH_104#619#647#SMART##00253
    00253_1#ARCH_104#684#712#SMART##00253
    00254_1#ARCH_82#27#352#SMART##00254
    00254_1#ARCH_54#355#510#SMART##00254
    00254_1#ARCH_104#576#604#SMART##00254
    00254_1#ARCH_104#606#634#SMART##00254
    00254_1#ARCH_104#671#699#SMART##00254
    00255_1#ARCH_82#56#425#SMART##00255
    00255_1#ARCH_54#428#582#SMART##00255
    00255_1#ARCH_104#696#724#SMART##00255

Can you suggest what improvements I should make to the above code?

Regards,
Shafreen
-- 
http://mail.python.org/mailman/listinfo/python-list
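One possible improvement, sketched in Python 3 (the post uses Python 2): the nested loop compares every line with every other line, whereas a dictionary keyed on the entry ID groups the records in a single pass. The function name `group_by_entry` and the reading of "redundant" as "an entry ID that occurs on more than one line" are assumptions, not something the post states.

```python
from collections import defaultdict

def group_by_entry(lines, n_fields=4):
    """Group records by entry ID, keeping only the first n_fields of each."""
    groups = defaultdict(list)
    for line in lines:
        fields = line.strip().split('#')
        if len(fields) < n_fields:
            continue  # skip blank or malformed lines
        groups[fields[0]].append(tuple(fields[:n_fields]))
    # Assumption: an entry is "redundant" if its ID occurs on more than one line.
    return {entry: recs for entry, recs in groups.items() if len(recs) > 1}
```

Passing an open file handle as `lines` works as well, since the function only iterates over it.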
beginner in python
Hi everybody,

I am a beginner in Python. I have to calculate the Euclidean distance between the atoms in a PDB file. I have written the code, but it shows me an error. The code:

    import re
    import string
    import math
    ab = []
    x_value = []
    y_value = []
    z_value = []
    fh = open("1K5N.pdb",'r')
    for atom in fh.readlines():
        a = atom.strip()
        pattern = re.compile('^ATOM.*')
        atom_file = pattern.search(a)
        if atom_file:
            atom_data = atom_file.group()
            x_coordinate = atom_data[31:38]
            y_coordinate = atom_data[39:46]
            z_coordinate = atom_data[47:54]
            x_value.append(x_coordinate)
            y_value.append(y_coordinate)
            z_value.append(z_coordinate)
    for x in range(len(x_value)):
        x_co = float(x_value[x])-float(x_value[x+1])
        y_co = float(y_value[x])-float(y_value[x+1])
        z_co = float(z_value[x])-float(z_value[x+1])
        data = math.sqrt(x_co)*(x_co)+(y_co)*(y_co)+(z_co)*(z_co)
        print data

And the error message:

      File "pdb_fetching.py", line 22, in ?
        x_co = float(x_value[x])-float(x_value[x+1])
    IndexError: list index out of range

Can you point out the mistake I have made?

Regards,
Shafreen
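The IndexError arises because on the last pass of the loop `x+1` is one past the end of the list. Separately, `math.sqrt` is applied only to `x_co` rather than to the whole sum of squares. A minimal sketch in Python 3, assuming the distance wanted is between each atom and the next one in the file; the function names are hypothetical, and the slices follow the PDB ATOM record layout (x in columns 31-38, y in 39-46, z in 47-54, counted from 1):

```python
import math

def parse_atom_xyz(line):
    """Return (x, y, z) from a PDB ATOM record (columns 31-38, 39-46, 47-54)."""
    return float(line[30:38]), float(line[38:46]), float(line[46:54])

def consecutive_distances(coords):
    """Euclidean distance between each atom and the one that follows it."""
    distances = []
    # zip(coords, coords[1:]) stops one short of the end,
    # because the last atom has no successor to compare with.
    for (x1, y1, z1), (x2, y2, z2) in zip(coords, coords[1:]):
        distances.append(math.sqrt((x1 - x2) ** 2 + (y1 - y2) ** 2 + (z1 - z2) ** 2))
    return distances
```

With a file, the coordinates could be collected as `[parse_atom_xyz(l) for l in fh if l.startswith('ATOM')]`, which also removes the need for the regular expression.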
clarification
Hi everybody,

I have compared two files. Code:

    fh = open('HPRD_MAIN_20.txt','r')
    for line in fh.readlines():
        data = line.strip().split('#')
        fh1 = open('NOMENCLATURE_MAIN_20.txt','r')
        for line1 in fh1.readlines():
            data1 = line1.strip().split('#')
            if data1[0] == data[0]:
                result = data[0] +'#'+data[3]+'|'+ data[4]+'|'+data[9]+'|'+ data1[3]
                print result

The result was as given below:

    00017#ACTG1|actin, gamma 1|Actin gamma 1|ACTG
    00017#ACTG1|actin, gamma 1|Actin gamma 1|Actin gamma
    00017#ACTG1|actin, gamma 1|Actin gamma 1|Cytoskeletal gamma actin

But I need the result to be like this, without redundancy and with the names on the same line separated by commas:

    00017#ACTG1|actin, gamma 1|Actin gamma 1|ACTG,Actin gamma,Cytoskeletal gamma actin

Please suggest what I should do to get the result like this.

Regards,
Shafreen
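One way to get a single line per ID, sketched in Python 3 (the post uses Python 2): read the nomenclature file once into a dictionary of name lists, then join each list with commas when writing the HPRD record. The function name `merge_names` is hypothetical; the field positions (0, 3, 4, 9) are the ones used in the post.

```python
from collections import defaultdict

def merge_names(hprd_lines, nomenclature_lines):
    """Return one output line per HPRD record, with all names comma-joined."""
    names = defaultdict(list)
    for line in nomenclature_lines:
        fields = line.strip().split('#')
        if len(fields) > 3 and fields[3] not in names[fields[0]]:
            names[fields[0]].append(fields[3])  # keep file order, drop repeats
    merged = []
    for line in hprd_lines:
        data = line.strip().split('#')
        if len(data) > 9 and data[0] in names:
            prefix = data[0] + '#' + '|'.join([data[3], data[4], data[9]])
            merged.append(prefix + '|' + ','.join(names[data[0]]))
    return merged
```

This also avoids reopening the nomenclature file for every line of the HPRD file.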
clarification
Hi everybody,

I have a file with data separated by tabs. My data:

    fhl1    fkh2
    dfp1    chk1
    mal3    alp14
    mal3    moe1
    mal3    spi1
    mal3    bub1
    mal3    bub3
    mal3    mph1
    mal3    mad3
    hob1    nak1
    hob1    wsp1
    hob1    rad3
    cdr2    cdc13
    cdr2    cdc2

The two values on each line are separated by a tab, so they form two columns. I have to find the data common to column 1 and column 2. My code:

    data = []
    data1 = []
    result = []
    fh = open('sheet1','r')
    for line in fh.readlines():
        splitted = line.strip().split('\t')
        data.append(splitted[0])
        data1.append(splitted[1])
    for k in data:
        if k in data1:
            result.append(k)
    print result
    fh.close()

Can you tell me what the problem with my script is and what I should do?

Regards,
Shafreen
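A short sketch in Python 3 of the same check using sets, which removes repeated values and makes the membership test fast. The function name `common_values` is hypothetical, and it assumes "common" means a value that appears somewhere in column 1 and somewhere in column 2:

```python
def common_values(lines):
    """Return the values that occur in both tab-separated columns."""
    col1, col2 = set(), set()
    for line in lines:
        parts = line.strip().split('\t')
        if len(parts) < 2:
            continue  # skip blank or single-column lines
        col1.add(parts[0])
        col2.add(parts[1])
    return sorted(col1 & col2)
```

For the sample shown in the post no value appears in both columns, so an empty result there would be expected rather than a sign of a bug.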
swapping
Hi everybody,

I have a file with data:

    fhl1    fkh2
    dfp1    chk1
    mal3    alp14
    mal3    moe1
    mal3    spi1
    mal3    bub1
    mal3    bub3
    mal3    mph1
    mal3    mad3
    hob1    nak1

I have written code to check for redundant pairs. My code:

    data = []
    data1 = []
    fh = open('sheet1','r')
    for line in fh:
        if line not in data:
            data.append(line)
        else:
            print line
    fh.close()
    fh1 = open('sheet2','r')
    for line1 in fh1:
        if line1 not in data1:
            data1.append(line1)
        else:
            print line1
    fh1.close()

Result:

    klp5    bub1
    apn1    apn2

But I have to do the same for the reverse, so that a pair like this is also found, for example:

    apn2    apn1

What is the concept for doing this?

Regards,
Shafreen
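One common idea is to store each pair under a key that ignores order, so that `apn1 apn2` and `apn2 apn1` look the same. A sketch in Python 3, with the hypothetical name `find_repeated_pairs` and the assumption that each line holds exactly two whitespace-separated names:

```python
def find_repeated_pairs(lines):
    """Return pairs that were already seen in either order."""
    seen = set()
    repeats = []
    for line in lines:
        parts = line.split()
        if len(parts) != 2:
            continue  # not a pair; ignore
        key = frozenset(parts)  # same key for (a, b) and (b, a)
        if key in seen:
            repeats.append(tuple(parts))
        else:
            seen.add(key)
    return repeats
```

A `frozenset` is used because it is hashable and unordered; `tuple(sorted(parts))` would serve equally well.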
hi everybody
Hi everybody,

I have written code to fetch a URL and access the NM and NP entries. My code:

    import re
    import urllib2
    import time
    Gene_id = raw_input("Please enter the gene_id:")
    fh = urllib2.urlopen(' http://www.ncbi.nlm.nih.gov/sites/entrez?db=gene&cmd=search&term='+Gene_id)
    for line in fh.readlines():
        pattern = re.compile('(NM_\d+.\d{0,5}).*(NP_\d+.\d{0,5})')
        m = pattern.search(line)
        if m:
            nm_entry = m.group(1)
            np_entry = m.group(2)
            length = len(np_entry)
            #data = raw_input("There are %s entry, They are:" %(length))
            fh1 = urllib2.urlopen(' http://www.ncbi.nlm.nih.gov/entrez/viewer.fcgi?val='+nm_entry)
            for line1 in fh1.readlines():
                p1 = re.compile('source\s*(\d{1}.*\d+)')
                m1 = p1.search(line1)
                if m1:
                    seq = m1.group(1)
                    seq_len = seq.split('..')
                    print nm_entry, 'Length of NM_seq:', seq_len[1],np_entry
            fh1.close()
    fh.close()
    time.sleep(2)

In my result, after the prompt "Please enter the gene_id:", I want to get a line of text followed by the data, e.g. "There are 11 entries, they are:" and then the NM and NP entries from the final print statement. I included the commented-out line shown below in the code, but it is repeated because it is inside the loop. Please check the code, post your comments, and tell me where to include this line to get the result properly:

    #data = raw_input("There are %s entry, They are:" %(length))

Regards,
Shafreen
easy but difficult
Hi everybody,

I have a file separated by hashes, as shown below:

    A#1
    B#2
    A#2
    A#3
    B#3

I need the result like this:

    A 1#2#3
    B 2#3

How can I generate this result from the above file? Can somebody tell me what I have to do? My code:

    fh = open('abc_file','r')
    for line in fh.readlines():
        data = line.strip().split('#')
        for data[0] in line
            print line

I tried, but I do not know how to create 1#2#3 on a single line.

Regards,
Shafreen
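A dictionary that maps each key to a list of its values is one way to collect `1#2#3` on a single line. A sketch in Python 3 (the name `group_values` is hypothetical); it relies on dictionaries preserving insertion order, which holds from Python 3.7 on:

```python
def group_values(lines, sep='#'):
    """Collect the values for each key and join them with the separator."""
    grouped = {}
    for line in lines:
        line = line.strip()
        if sep not in line:
            continue  # blank line or no separator: nothing to split
        key, value = line.split(sep, 1)
        grouped.setdefault(key, []).append(value)
    return ['%s %s' % (key, sep.join(values)) for key, values in grouped.items()]
```

The same function with a different join string would cover output such as `A 1|2|3`.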
iterating over the other and finding the greatest
Hi everybody,

I have a file with four columns. The content:

    column1     column2     col3    col4
    1175        1234        43      A_16_P03652190
    1277        1336        387     A_16_P41582022
    1723        1782        98      A_16_P03652191
    1880        1932        270     A_16_P41582024
    10001202    10001261    539     A_16_P41582025
    10001800    10001859    16      A_16_P41582026
    10001875    10001923    43      A_16_P21378376
    10001966    10002011    361     A_16_P03652193

For column 3 I have to carry out the following process: if a > b I should print the line, if b > c I should print the line, if c > d I should print the line, and so on. For example, 43 < 387, so the first row is omitted; 387 is greater than 98, so I can print the second row. My code:

    fh = open('364010_spacing','r')
    for lines in fh.readlines():
        data = lines.strip().split('\t')
        start = data[0].strip()
        end = data[1].strip()
        values = data[2].strip()
        id = data[3].strip()
        if a > b:    # hung up here
            print lines

But I am not able to do the comparison. Can you guide me in the right way so that I can proceed further?

Regards,
Shafreen
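The comparison needs two rows at once, so one option is to read all rows into a list and walk over it in adjacent pairs. A sketch in Python 3, with the hypothetical name `rows_greater_than_next`; it assumes the rows have already been split into fields and that the last row, having no successor, is never printed:

```python
def rows_greater_than_next(rows, column=2):
    """Keep each row whose value in `column` exceeds the next row's value."""
    kept = []
    for current, following in zip(rows, rows[1:]):
        # Compare as integers; as strings '98' would sort after '270'.
        if int(current[column]) > int(following[column]):
            kept.append(current)
    return kept
```

The rows could be built with `[line.strip().split('\t') for line in fh]`.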
calculating median distance
Hi everybody,

I have a file:

    1175        1234        43      A_16_P03652190
    1277        1336        387     A_16_P41582022
    1723        1782        98      A_16_P03652191
    1880        1932        270     A_16_P41582024
    10001202    10001261    539     A_16_P41582025
    10001800    10001859    16      A_16_P41582026
    10001875    10001923    43      A_16_P21378376
    10001966    10002011    361     A_16_P03652193
    10002372    10002422    49      A_16_P21378377
    10002471    10002527    118     A_16_P03652194
    10002645    10002704    187     A_16_P41582029
    10002891    10002941    130     A_16_P21378379
    10003071    10003121    415     A_16_P03652195
    10003536    10003595    -38     A_16_P03652196

How do I calculate the median spacing of the data in the file, focusing on the third column? Say, for example, the median spacing should be 635. How do we programmatically calculate the median spacing and sort the file according to the third column? I believe the median is the value at position (N+1)/2 of the sorted values. Can somebody help me with this?

Regards,
Shafreen
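A sketch in Python 3 using the standard `statistics` module; the name `median_and_sorted` is hypothetical, and sorting by the third column is only one reading of what the post asks for. The (N+1)/2 rule gives the middle position when N is odd; for an even N, `statistics.median` averages the two middle values.

```python
import statistics

def median_and_sorted(rows, column=2):
    """Return the median of one numeric column and the rows sorted by it."""
    values = [int(row[column]) for row in rows]
    ordered = sorted(rows, key=lambda row: int(row[column]))
    return statistics.median(values), ordered
```

The rows are assumed to be lists of fields, for example from `line.strip().split('\t')`.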
sorting data
Hi all,

I have a problem sorting my data. The file contains data as follows:

    chrX:    123343123    123343182    A_16_P41787782
    chrX:    123343417    123343476    A_16_P03762840
    chrX:    123343460    123343519    A_16_P41787783
    chrX:    12334336     12334395     A_16_P03655927
    chrX:    123343756    123343815    A_16_P03762841
    chrX:    123343807    123343866    A_16_P41787784
    chrX:    123343966    123344024    A_16_P21578670
    chrX:    123344059    123344118    A_16_P21578671
    chrX:    12334438     12334497     A_16_P21384637
    chrX:    123344776    123344828    A_16_P21578672
    chrX:    123344811    123344870    A_16_P03762842
    chrX:    123345165    123345224    A_16_P41787789
    chrX:    123345360    123345419    A_16_P41787790
    chrX:    123345380    123345439    A_16_P03762843
    chrX:    123345481    123345540    A_16_P41787792
    chrX:    123345873    123345928    A_16_P41787793
    chrX:    123345891    123345950    A_16_P03762844

How do I sort the file based on the values in columns 1 and 2? Using the sort option works for only one column and not for the other. How do I sort on both the first and second columns so that the third column stays with its row? My script:

    # sorting the file
    start_lis = []
    end_lis = []
    fh = open('chromosome_location_346010.bed','r')
    for line in fh.readlines():
        data = line.strip().split('\t')
        start = data[1].strip()
        end = data[2].strip()
        probe_id = data[3].strip()
        start_lis.append(start)
        end_lis.append(end)
    start_lis.sort()
    end_lis.sort()
    for k in start_lis:
        for i in end_lis
            print k , i , probe_id          (this does not work)
            result = start#end#probe_id     ---> this does not work...
            print result

What is the error, and how do I sort a file on two columns so that the fourth column comes along with them?

Regards,
Shafreen
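Sorting the start and end values in separate lists breaks the link between them and the probe ID. One alternative, sketched in Python 3, is to keep each line together as one record and sort the records with a two-part key. The name `sort_by_start_end` is hypothetical; the field positions (1 and 2) are the ones the script in the post uses.

```python
def sort_by_start_end(lines):
    """Sort whole records by (start, end), keeping each probe ID with its row."""
    records = [line.strip().split('\t') for line in lines if line.strip()]
    # Convert to int so that 12334336 sorts before 123343123;
    # as strings they would be compared character by character.
    records.sort(key=lambda rec: (int(rec[1]), int(rec[2])))
    return records
```

Each sorted record can then be written back with `'\t'.join(record)`.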
two file into a single file
Hi everybody,

I have two files.

File 1: 17097 17186 1723 17895 17906 18295 18311 1880 19160 19629

File 2: 17097 17186 1723 17895 17906 18295 18311 1880 19160 19629

How do I combine them into a single file, like this?

file 1 file 2 17097 17097 17186 17097 17186 1880 172317895 17895 17895 17906 17895 18295 8311 18311 188
two files into an alternate list
Hi everybody,

I have two files.

File 1: 1 2 3 4 5 6

File 2: a b c d e f

How do I combine the two files into a list like this?

    [1,a,2,b,3,c,4,d,5,e,6,f]

Regards,
Shafreen
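A sketch in Python 3 of interleaving two sequences with `zip`; the name `interleave` is hypothetical, and the two arguments stand for the items read from each file. Note that `zip` stops at the end of the shorter input.

```python
def interleave(first, second):
    """Return [first[0], second[0], first[1], second[1], ...]."""
    merged = []
    for a, b in zip(first, second):
        merged.extend([a, b])
    return merged
```

For files holding one item per line, the inputs could be built with `[line.strip() for line in open(name)]`; the items are then strings, so `1` appears as `'1'` unless converted with `int`.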
dictionary and list
Hi everybody,

I have a file:

    a          b          c                 d       e
    2722316    2722360    A_16_P03641972    150     -44
    2722510    2722554    A_16_P21360239    16      -44
    2722570    2722614    A_16_P03641973    44      -44
    2722658    2722702    A_16_P41563669    2187    -44
    2724889    2724948    A_16_P03641974    738     -59
    2725686    2725745    A_16_P03641975    422     -59
    2726167    2726219    A_16_P03641976    88      -52
    2726307    2726366    A_16_P41563677    2167    -59
    2728533    2728589    A_16_P21360249    5819    -56
    2734408    2734467    A_16_P21360257    -14     -59
    2734453    2734509    A_16_P03641977    376     -56
    2734885    2734929    A_16_P21360259    1987    -44

I need to build a dictionary like this, where d+1 means the d value of the next row:

    c[d,e,d+1] = A_16_P03641972[150,-44,16]

My script:

    d = {}
    fh = open('final_lenght_probe_span','r')
    for line in fh.readlines():
        data = line.strip().split('\t')
        probe_id = data[2].strip()
        span = data[3].strip()
        length = data[4].strip()
        d[probe_id] = []
        d[probe_id] = [span,length,span[0+1]]
    for key in d.keys():
        print key, d[key]

I do not end up with this result. How do I do it?
appending into a list
Hi everybody,

I have a file:

    A          B          C                 D       E
    2717353    2717412    A_16_P03641964    214     -59
    2717626    2717685    A_16_P41563655    25      -59
    2717710    2717754    A_16_P03641965    1250    -44
    2719004    2719063    A_16_P03641966    -36     -59
    2719027    2719086    A_16_P21360229    289     -59
    2719375    2719428    A_16_P03641967    60      -53
    2719488    2719542    A_16_P21360231    418     -54
    2719960    2720014    A_16_P03641968    727     -54
    2720741    2720786    A_16_P03641969    494     -45
    2721280    2721339    A_16_P03641970    -28     -59
    2721311    2721370    A_16_P21360234    150     -59
    2721520    2721569    A_16_P21360235    199     -49
    2721768    2721821    A_16_P03641971    139     -53
    2721960    2722004    A_16_P21360237    312     -44

I need to append columns D and E to a list in such a way that the list has the form [D,E,D,E,D,E]. How do I do it?

Regards,
Shafreen
redundancy_check
Hi everybody,

I have a file:

    a       b                        c
    1454    VALTGLTVAEYFR            8.9954e-07
    1454    VALTGLTVAEYFR            0.00404626
    1498    STLTDSLVSK               0.00404626
    1505    TIAMDGTEGLVR             1.50931e-05
    1528    GAEISAILEER              0.00055542
    1528    GAEISAILEER              0.00055542
    1538    YPIEHGIITNWDDMEK         0.0180397
    1540    YPIEHGIITNWDDMEK         3.69329e-05
    1552    AQIVGGFPIDISEAPYQISLR    0.015136

The file has redundant lines. I have to print the lines without redundancy: where two lines are redundant, I compare their column c values and keep the line whose column c value is lesser than the other. How do I do it?

Regards,
Shafreen
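A sketch in Python 3 that keeps, for each repeated record, the line with the smallest column c. The name `keep_lowest` is hypothetical, and it assumes two lines count as redundant when both column a and column b match; if column b alone should decide, the key would be `parts[1]` only.

```python
def keep_lowest(lines):
    """For each (a, b) pair keep the line with the smallest value in column c."""
    best = {}
    for line in lines:
        parts = line.strip().split('\t')
        if len(parts) < 3:
            continue  # skip blank or short lines
        key = (parts[0], parts[1])
        score = float(parts[2])  # float() also parses values like 8.9954e-07
        if key not in best or score < best[key][0]:
            best[key] = (score, line.strip())
    return [text for _, text in best.values()]
```

Exact duplicates (same a, b and c) collapse to a single line as a side effect.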
dictionary
Hi everybody,

I have a file:

    2709852    2709911    A_16_P21360207    405     -59
    2710316    2710367    A_14_P136880      -42     -51
    2710325    2710384    A_16_P21360209    876     -59
    2711260    2711319    A_16_P21360210    -22     -59
    2711297    2711356    A_16_P03641959    254     -59
    2711610    2711659    A_16_P03641960    982     -49
    2712641    2712696    A_16_P03641962    1011    -55
    2713707    2713765    A_16_P41563648    43      -58
    2713808    2713861    A_16_P03641963    -16     -53
    2713845    2713893    A_16_P41563649    3460    -48
    2717353    2717412    A_16_P03641964    214     -59
    2717626    2717685    A_16_P41563655    25      -59
    2717710    2717754    A_16_P03641965    1250    -44
    2719004    2719063    A_16_P03641966    -36     -59

I need the result like this:

    A_16_P21360207 [405, -59, -42]
    A_14_P136880 [-42, -51, 876]
    A_16_P21360209 [876, -59, -22]

That is, each list overlaps with the next line: its third element is the span of the following row. My script:

    d = {}
    fh = open('final_lenght_probe_span','r')
    while fh:
        a, b, c, span,len = fh.readline().strip().split('\t')
        a1,b1,c1,span1,len1 = fh.readline().strip().split('\t')
        probe_id = c
        d[probe_id] = []
        d[probe_id] = [span,len,span1]
    for key in d.keys():
        print key, d[key]

But I am not able to arrive at my result. What should I do?
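Reading two lines per pass means every second row never gets its own entry, and `while fh` never becomes false. One alternative, sketched in Python 3, is to load the rows first and look one row ahead by index. The name `spans_with_next` is hypothetical, and using `None` for the last row (which has no successor) is an assumption:

```python
def spans_with_next(rows):
    """Map probe_id -> [span, length, span of the next row]."""
    result = {}
    for i, (start, end, probe_id, span, length) in enumerate(rows):
        # The final row has no following row, so its third element is None.
        next_span = int(rows[i + 1][3]) if i + 1 < len(rows) else None
        result[probe_id] = [int(span), int(length), next_span]
    return result
```

The rows are assumed to be five-field lists, for example `[l.strip().split('\t') for l in fh if l.strip()]`.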
to sum a list in a dictionary
Hi everybody,

I need to sum a list in a dictionary. My script:

    d = {}
    probes = list(enumerate((i.split('\t')[2],i.split('\t')[3], i.split('\t')[4]) for i in open('final_lenght_probe_span')))
    for idx, (probe_id, span, length) in probes:
        try:
            l = [span,length.strip(),probes[idx+1][1][1]]
            d[probe_id] = sum[l]
        except IndexError:
            none = 0
            l = [span,length,None]
            d[probe_id] = sum[l]
    for key in d.keys():
        print key, d[key]

I used the built-in function sum to add up the list, but it results in an error. How do I do it? The file I used is:

    2709852    2709911    A_16_P21360207    405     -59
    2710316    2710367    A_14_P136880      -42     -51
    2710325    2710384    A_16_P21360209    876     -59
    2711260    2711319    A_16_P21360210    -22     -59
    2711297    2711356    A_16_P03641959    254     -59
    2711610    2711659    A_16_P03641960    982     -49
    2712641    2712696    A_16_P03641962    1011    -55
    2713707    2713765    A_16_P41563648    43      -58
    2713808    2713861    A_16_P03641963    -16     -53
    2713845    2713893    A_16_P41563649    3460    -48
    2717353    2717412    A_16_P03641964    214     -59
    2717626    2717685    A_16_P41563655    25      -59
    2717710    2717754    A_16_P03641965    1250    -44
Re: to sum a list in a dictionary
Hi everybody,

I have tried improving my code like this, but I face a problem because I am not able to add the strings in the list. If I do not use None, I will not get the list I require for the last row. Is there any solution to this?

    dd = {}
    dd2 = {}
    probes = list(enumerate((i.split('\t')[2],i.split('\t')[3], i.split('\t')[4]) for i in open('final_lenght_probe_span')))
    for idx, (probe_id, span, length) in probes:
        try:
            dd[probe_id] = [span,length.strip(),probes[idx+1][1][1]]
        except IndexError:
            None = 0        (is this the right way?)
            dd[probe_id] = [span,length,None]
    dd2 = dict(zip(dd.keys(), [[sum(item)] for item in dd.values()]))
    print dd2

1) The error shown when I assign None = 0:

        None = 0
    SyntaxError: assignment to None

2) The error shown when I do not assign None:

    TypeError: unsupported operand type(s) for +: 'int' and 'str'
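Three separate things go wrong in these two posts: `sum[l]` uses square brackets where a call `sum(l)` is needed, the fields read from the file are strings and must be converted before adding, and `None` is a built-in constant that cannot be assigned to. A sketch in Python 3 of a helper that handles the last two points; the name `sum_fields` and the choice to count a missing value as 0 are assumptions:

```python
def sum_fields(values):
    """Sum a list of numeric strings, counting None (a missing value) as 0."""
    # int() tolerates surrounding whitespace such as a trailing newline.
    return sum(int(v) for v in values if v is not None)
```

With such a helper the dictionary entry would become `dd2[probe_id] = sum_fields(dd[probe_id])`.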
looping
Hi everybody,

I have a file:

    A_16_P41851056    1730
    A_16_P03796992    165
    A_16_P21640222    360
    A_16_P21640223    240
    A_16_P03796993    168
    A_16_P41851059    1094
    A_16_P21640225    1035
    A_16_P03796994    154
    A_16_P21640226    1422
    A_16_P21640227    1262
    A_16_P41851063    107
    A_16_P03796995    78
    A_16_P03796996    273
    A_16_P21640230    687
    A_16_P03796997    417
    A_16_P21640233    320
    A_16_P03796998    205

Column 2 of my file has the values 1730, 165, 360 and so on. I need a condition that checks whether a value is greater than the next one; if so, I have to print the corresponding column 1 value. For example, 1730 is greater than 165, so I print the column 1 value A_16_P41851056; but 165 is less than both 1730 and 360, so I omit its column 1 value. I have to continue the loop to the end of the file, checking which values are greater than both the one above and the one below. How do I do it?
looping
Hi everybody,

I have a file:

    A_16_P21360207    304
    A_14_P136880      783
    A_16_P21360209    795
    A_16_P21360210    173
    A_16_P03641959    1177
    A_16_P03641960    1944
    A_16_P03641962    999
    A_16_P03641963    3391
    A_16_P41563649    3626
    A_16_P03641964    180
    A_16_P41563655    1216
    A_16_P03641965    1170
    A_16_P03641966    194
    A_16_P21360229    290

My script:

    import math
    values_lis = []
    fh = open('test5','r')
    for line in fh.readlines():
        data = line.strip().split('\t')
        probe = data[0].strip()
        values = data[1].strip()
        value = values[0].strip()
        value = float(values)
        old = 0
        if value >= old:
            old = value
            print '%s\t%s'%(probe,old)

I am aiming to remove the lowest values in the second column by comparing each row with the next. For example, 304 is less than the second-row value 783, so I omit the first row; in the same way, the third-row value 795 is greater than 783, so I omit the second row, and so on. I have written the script, but it does not remove the lesser values. Can you please check where I go wrong?
error ...1 value to unpack
Hi everybody,

I have a file:

    A_16_P21360207#304
    A_14_P136880#783
    A_16_P21360209#795
    A_16_P21360210#173
    A_16_P03641959#1177
    A_16_P03641960#1944
    A_16_P03641962#999
    A_16_P41563648#-31
    A_16_P03641963#3391
    A_16_P41563649#3626
    A_16_P03641964#180
    A_16_P41563655#1216
    A_16_P03641965#1170
    A_16_P03641966#194
    A_16_P21360229#290
    A_16_P03641967#425
    A_16_P21360231#1091
    A_16_P03641968#1167
    A_16_P03641969#421
    A_16_P03641970#63
    A_16_P21360234#290
    A_16_P21360235#289
    A_16_P03641971#398
    A_16_P21360237#418
    A_16_P03641972#122
    A_16_P21360239#16
    A_16_P03641973#2187
    A_16_P41563669#2881
    A_16_P03641974#1101
    A_16_P03641975#451

My script:

    fh = open('complete_span','r')
    data = fh.readline().split('#')
    old_probe = data[0].strip()
    old_value = data[1].strip()
    #print old_probe, old_value
    count = 1
    while fh:
        current_probe, current_value = fh.readline().strip().split('#')[0:2]
        probe = current_probe.strip()
        value = current_value.strip()
        if old_value > value:
            res_value = '%s\t%s'%(old_value, old_probe)
            print res_value
        if count == 244000:
            break
        old_probe,old_value = probe, value
    fh.close()

And I face this error:

    Traceback (most recent call last):
      File "count_values.py", line 8, in <module>
        current_probe, current_value = fh.readline().strip().split('#')[0:2]
    ValueError: need more than 1 value to unpack

Why do I get this, and what is the solution?

Regards,
Shafreen
count increment...
Hi everybody,

I have a file:

    A_16_P21360207    304
    A_14_P136880      783
    A_16_P21360209    795
    A_16_P21360210    173
    A_16_P03641959    1177
    A_16_P03641960    1944
    A_16_P03641962    999
    A_16_P41563648    -31
    A_16_P03641963    3391
    A_16_P41563649    3626
    A_16_P03641964    180
    A_16_P41563655    1216
    A_16_P03641965    1170
    A_16_P03641966    194
    A_16_P21360229    290
    A_16_P03641967    425
    A_16_P21360231    1091
    A_16_P03641968    1167
    A_16_P03641969    421
    A_16_P03641970    63
    A_16_P21360234    290
    A_16_P21360235    289
    A_16_P03641971    398
    A_16_P21360237    418
    A_16_P03641972    122
    A_16_P21360239    16
    A_16_P03641973    2187
    A_16_P41563669    2881
    A_16_P03641974    1101
    A_16_P03641975    451
    A_16_P03641976    2203
    A_16_P41563677    7927
    A_16_P21360249    5749
    A_16_P21360257    303
    A_16_P03641977    2307
    A_16_P21360259    2102
    A_16_P03641980    270

My script:

    #!/usr/bin/env python
    fh = open('complete_span','r')
    line = fh.readline().split('#')
    old_probe = line[0].strip()
    old_value = line[1].strip()
    print old_probe, old_value
    count = 1
    line = ""
    while line:
        line = fh.readline().strip()
        if line:
            current_probe, current_value = line.split('#')[0:2]
            probe = current_probe.strip()
            value = current_value.strip()
            if int(old_value) > int(value):
                res_value = '%s\t%s'%(old_value, old_probe)
                print res_value
            if count >= 244000:
                break
            old_probe,old_value = probe, value
    fh.close()

I need to keep reading lines until the line count reaches 244000. I have the script, but it does not work. Can anybody check it and let me know what the problem is?
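In the script above `line` is set to an empty string just before `while line:`, so the loop body never runs, and `count` is never incremented. A sketch in Python 3 that lets `enumerate` do the counting and skips lines that cannot be split. The name `values_greater_than_next` is hypothetical, and the `#` separator is an assumption taken from the `split('#')` call in the script:

```python
def values_greater_than_next(lines, limit=244000, sep='#'):
    """Return (probe, value) for each row whose value exceeds the next row's."""
    results = []
    previous = None
    for count, line in enumerate(lines, start=1):
        if count > limit:
            break  # stop once `limit` lines have been read
        line = line.strip()
        if sep not in line:
            continue  # blank or malformed line: nothing to unpack
        probe, value = line.split(sep)[:2]
        if previous is not None and int(previous[1]) > int(value):
            results.append((previous[0], int(previous[1])))
        previous = (probe.strip(), value.strip())
    return results
```

Skipping lines without a separator also avoids the "need more than 1 value to unpack" error that an empty line at the end of a file produces.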
hopping in a list
Hi everybody,

I have created a list. My code:

    res_value = []
    fh = open('test','r')
    for line in fh.readlines():
        data = line.strip().split('\t')
        current_span = data[3].strip()
        probe = data[2].strip()
        length = data[4].strip()
        res_value.append(current_span)
        res_value.append(probe)
        res_value.append(length)
    fh.close()

My result:

    L = ['35', 'A_16_P21404055', '-59', '355', 'A_16_P21404056', '-57',
         '167', 'A_16_P03667375', '-59', '5006', 'A_16_P21404058', '-55',
         '-38', 'A_16_P21404059', '-57', '261', 'A_16_P03667376', '-59',
         '200', 'A_16_P21404061', '-58', '-43', 'A_16_P03667377', '-59',
         '308', 'A_16_P21404062', '-58', '67', 'A_16_P03667378', '-54', '226']

I need to check L[0] < L[3] and L[3] < L[5] and L[5] < L[7] and so on. If the value is less, I have to add L[0]+L[2]+L[3] and print the values. How do I do it?
error :list out of range
Hi everybody,

I have written code to check which is the lowest value in a list. My list:

    ['94', 'A_16_P03647505', '-59', '42', 'A_16_P41573860', '-44',
     '513', 'A_16_P41573861', '-44', '66', 'A_16_P41573862', '-44',
     '327', 'A_16_P03647506', '-46', '77', 'A_16_P41573864', '-59',
     '52', 'A_16_P03647507', '-59', '307', 'A_16_P41573865', '-59',
     '111', 'A_16_P03647508', '-59', '167', 'A_16_P41573867', '-48',
     '223', 'A_16_P03647509', '-45', '124', 'A_16_P41573869', '-54',
     '206', 'A_16_P03647510', '-59', '52', 'A_16_P41573870', '-52',
     '549', 'A_16_P03647511', '-59', '2976']

My code:

    res_value = []
    fh = open('test','r')
    for line in fh.readlines():
        data = line.strip().split('\t')
        current_span = data[3].strip()
        probe = data[2].strip()
        length = data[4].strip()
        res_value.append(current_span)
        res_value.append(probe)
        res_value.append(length)
    #omplete_dataset.append(res_value)
    fh.close()
    for k in range(0,len(res_value),3):
        check = res_value[k:k+4]
        if check[0] < check[4]:
            print check

Error:

      File "app.py", line 16, in <module>
        if check[0] < check[4]:
    IndexError: list index out of range

How do I sort out this error to get the result?
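The slice `res_value[k:k+4]` holds at most four items, with indices 0 to 3, so `check[4]` can never exist; the next span is `check[3]`. The values are also strings, and as strings `'94' < '513'` is false. A sketch in Python 3 that regroups the flat list into records first; the names `triples` and `spans_less_than_next` are hypothetical:

```python
def triples(flat):
    """Split [span, probe, length, span, probe, length, ...] into 3-tuples."""
    usable = len(flat) - len(flat) % 3  # ignore a trailing partial record
    return [tuple(flat[i:i + 3]) for i in range(0, usable, 3)]

def spans_less_than_next(flat):
    """Return the records whose span is smaller than the next record's span."""
    records = triples(flat)
    return [cur for cur, nxt in zip(records, records[1:])
            if int(cur[0]) < int(nxt[0])]
```

Appending one tuple per line while reading the file, instead of three separate items, would make the regrouping step unnecessary.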
comparing dictionaries to find the identical keys
Hi everybody,

I need to compare the keys of two dictionaries. I have written a script:

    gene_symbol = {}
    probe_id = {}
    result = {}
    def getGene(fname):
        fh = open(fname , 'r')
        for line in fh:
            yield line
        fh.close()
    for line in getGene("symbol_hu133"):
        data1 = line.strip().split('#')
        probe_give = data1[0].strip()
        gene_give = data1[1].strip()
        gene_symbol[probe_give] = gene_give
        #print gene_symbol.keys()
    for line in getGene("gds1428.csv"):
        data = line.strip().split(',')
        probe_get = data[0].strip()
        probe_id[probe_get] = data
    if gene_symbol.keys() == probe_id.keys():
        print gene_symbol.keys(), probe_id.values()

Can anybody show me the error I make here while comparing the keys of the two dictionaries? I want to print the values of the dictionaries whose keys are identical.
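The `==` test compares the two complete key collections, so it is true only when both dictionaries have exactly the same keys. To find the keys they share, a set intersection is the usual tool. A sketch in Python 3, where dictionary key views support `&` directly (in Python 2, `set(gene_symbol) & set(probe_id)` gives the same result); the name `shared_entries` is hypothetical:

```python
def shared_entries(gene_symbol, probe_id):
    """Return {key: (gene symbol, probe record)} for keys present in both."""
    common = gene_symbol.keys() & probe_id.keys()
    return {key: (gene_symbol[key], probe_id[key]) for key in sorted(common)}
```

Each item of the result pairs the gene symbol with the matching row from the second file.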
joining rows
Hi everybody,

I have two columns in a file, separated by tabs. If the column 1 value is common to row 1 and row 2, then the column 2 values should be displayed on a single line. For example:

    col1    col2
    A       1
    A       2
    A       3
    B       1
    C       2
    D       3
    D       4

The result should be:

    A    1|2|3
    B    1
    C    2
    D    3|4

What should I do to get my results?
CSV
Hi all,

I have written a script to parse a CSV file:

    import csv
    def get_lines(fname):
        fhandle = csv.reader(open(fname,"rb"))
        for line in fhandle:
            while fhandle.next()[0] == "prot_hit_num":
                continue
            for row in fhandle:
                print row
    result = get_lines("file.csv")
    print result

I need to print the data from the line "prot_hit_num" up to, but not including, the line "peptide sequence". I am able to print everything from "prot_hit_num" to the end of the file, but I need to break before the line "peptide sequence". How should I do this?
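One way to express "start at this row, stop before that row" is with `itertools.dropwhile` and `itertools.takewhile`. A sketch in Python 3; the name `rows_between` is hypothetical, and it assumes both marker strings appear in the first cell of their rows, as the `[0]` test in the post suggests for `prot_hit_num`:

```python
import csv
import itertools

def rows_between(lines, start="prot_hit_num", stop="peptide sequence"):
    """Return the CSV rows from the `start` row up to, not including, `stop`."""
    reader = csv.reader(lines)
    # Skip everything before the row whose first cell equals `start` ...
    from_start = itertools.dropwhile(lambda row: not row or row[0] != start, reader)
    # ... then keep rows until one whose first cell equals `stop`.
    return list(itertools.takewhile(lambda row: not row or row[0] != stop, from_start))
```

`csv.reader` accepts any iterable of lines; with a real file in Python 3, `open(fname, newline='')` would take the place of the list of strings.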
module pickle
Hi,

I am a beginner in Python, and I am not able to understand the pickle concept. Can somebody explain the use of this module to me, with a few examples? That would help me a lot.

Regards,
Shafreen
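In short, `pickle` turns a Python object into a stream of bytes and back again, which is useful for saving a dictionary or list between runs of a program. A small sketch in Python 3; the sample record and the helper names `save_object` and `load_object` are made up for illustration:

```python
import pickle

record = {"probe": "A_16_P21360207", "values": [405, -59, -42]}

blob = pickle.dumps(record)    # serialise the object to bytes
restored = pickle.loads(blob)  # rebuild an equal, independent object

def save_object(obj, path):
    """Write obj to a file; pickle files must be opened in binary mode."""
    with open(path, 'wb') as handle:
        pickle.dump(obj, handle)

def load_object(path):
    """Read back an object previously written by save_object."""
    with open(path, 'rb') as handle:
        return pickle.load(handle)
```

Pickled data should only be loaded from sources you trust, because unpickling can run arbitrary code.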
printing dictionary and tuple
Hi everybody,

I am trying to print the dictionary values and a tuple on the same line, as below:

    print "\t".join(dict[a].values())+'\t'+"\t".join(b)

The error I get is a TypeError, since I have missing values in the dictionary. If I catch the exception, I will lose those lines. How should I print the data, separated by tabs, without losing the lines that raise the error?

-- 
Beema Shafreen
setting path for python interpreter
Hi all,

Can anybody suggest how to set the path so that Python 2.4 is the main interpreter instead of Python 2.5?

Regards,
-- 
Beema Shafreen
replace numbers in a string
Hi all,

I have a few lines in a file like this:

    "ttccatttctggacatgacgtctgt6901ggtttaagctttgtgaaagaatgtgctttgattcg"

I need to remove the numbers and keep only the letters. What should I do in such a case? Can anybody suggest something?

-- 
Beema Shafreen
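A regular expression that matches runs of digits and replaces them with nothing is one short way to do this. A sketch in Python 3; the name `strip_digits` is hypothetical:

```python
import re

def strip_digits(sequence):
    """Remove every run of digits, keeping all other characters."""
    return re.sub(r'\d+', '', sequence)
```

For example, `strip_digits("ac12gt3")` gives `"acgt"`. To drop everything except letters, the pattern `[^a-zA-Z]+` could be used instead.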
basic comparing files
Hi all,

I have a very basic doubt. I am comparing two files, A and B, which have three columns: a1, b1 of A and a2, b2 of B. For example, I need to compare a1 with a2; if they have values in common I have to display a1, b1, b2, or else I have to display a1, b1 or a1, b2. Is the set function going to be the best option, or is there any other way?

-- 
Beema Shafreen
set function
Hi all,

I need to find the intersection of 10 different files, with the IDs defined as a_items, b_items and so on:

    common_items = a_items&b_items&c_items&\
                   d_items&e_items&f_items\
                   &g_items&h_items&i_items&j_items

I have included the above line in the script. Is this the right way, and will my program accept it? What are the other options for comparing 10 such sets of items?

-- 
Beema Shafreen
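The chained `&` expression in the post is valid as long as every operand is a set. An alternative that does not need ten separate variable names is to keep the sets in a list and pass them all to `set.intersection`. A sketch in Python 3; the name `common_ids` is hypothetical:

```python
def common_ids(*id_sets):
    """Return the IDs present in every one of the given collections."""
    if not id_sets:
        return set()  # nothing to intersect
    return set.intersection(*(set(ids) for ids in id_sets))
```

With the sets held in a list called, say, `all_items`, the call would be `common_ids(*all_items)`.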
comparison of files using set function
I have files with two columns: column 1 holds an id and column 2 holds data (a sequence). My goal is to create a table in which the first column has all the ids from the files, the second column has the sequence from file 1 for that id, the third column has the sequence from the next file for that id, and so on.

The original files look like this:

45ytut 46erete 37 dfasf 45 dassdsd

and so on for all 10 files; that is, each has two columns as mentioned above. The output should look like this:

Idfile1 file2 file3 file4 file5 43ytuhytuh ytuhytuhytuh 46 erteee rty ryyy ertyu 47 yutiorrreeerr

The goal is to pick all the common ids in the files, with their respective information next to them. Various conditions can occur:

1) a common id is present in all the files, with the same information
2) a common id is present in all the files, but without the same information
3) a common id may not be present in all the files

But the goal is to find exactly the common ids in all the files and add their corresponding information from each file to the table as shown above.

My script:

def file1_search(*files1):
    for file1 in files1:
        gi1_lis = []
        fh = open(file1,'r')
        for line in fh.readlines():
            data1 = line.strip().split('\t')
            gi1 = data1[0].strip()
            seq1 = data1[1].strip()
            gi1_lis.append(gi1)
    return gi1_lis

def file2_search(**files2):
    for file2 in files2:
        for file in files2[file2]:
            gi2_lis = []
            fh1 = open(file,'r')
            for line1 in fh1.readlines():
                data2 = line1.strip().split('\t')
                gi2 = data2[0].strip()
                seq2 = data2[1].strip()
                gi2_lis.append(gi2)
    return gi2_lis

def set_compare(data1,data2,*files1,**files2):
    A = set(data1)
    B = set(data2)
    I = A&B # common between these two sets
    D = A-B #57 is the len of D
    C = B-A #176 is the len of C
    #print len(C)
    # print len(D)
    for file1 in files1:
        for gi in D:
            fh = open(file1,'r')
            for line in fh.readlines():
                data1 = line.strip().split('\t')
                gi1 = data1[0].strip()
                seq1 = data1[1].strip()
                if gi == gi1:
                    #print line.strip()
                    pass
    for file2 in files2:
        for file in files2[file2]:
            for gi in C:
                fh1 = open(file,'r')
                for line1 in fh1.readlines():
                    data2 = line1.strip().split('\t')
                    gi2 = data2[0].strip()
                    seq2 = data2[1].strip()
                    if gi == gi2:
                        # print line1.strip()
                        pass

if __name__ == "__main__":
    files1 = ["Fr20.txt",\
              "Fr22.txt",\
              "Fr24.txt",\
              "Fr60.txt",\
              "Fr62.txt"]
    files2 = {"data":["Fr64.txt",\
                      "Fr66.txt",\
                      "Fr68.txt",\
                      "Fr70.txt",\
                      "Fr72.txt"]}
    data1 = file1_search(*files1)
    """113 is the total number of gi"""
    data2 = file2_search(**files2)
    #for j in data2:
    #    print j
    """232 is the total number of gi found"""
    result = set_compare(data1,data2,*files1,**files2)

It does not work properly. Can somebody please suggest how I can proceed?

Thanks a lot
--
Beema Shafreen
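One possible way to build the table described in this post is to read each file into a dictionary keyed by id and then emit one row per id. This is only a minimal sketch in Python 3 syntax, not the poster's method: the helper names read_pairs and build_table are hypothetical, the files are assumed to be tab-separated with the id in the first column, and the sample ids and sequences are made up for illustration.

```python
def read_pairs(path):
    """Return {id: sequence} for one two-column, tab-separated file (assumed layout)."""
    pairs = {}
    with open(path) as fh:
        for line in fh:
            fields = line.rstrip("\n").split("\t")
            if len(fields) < 2:
                continue  # skip blank or malformed lines
            pairs[fields[0].strip()] = fields[1].strip()
    return pairs


def build_table(tables, common_only=False):
    """Build rows of [id, seq_in_file1, seq_in_file2, ...].

    tables is a list of {id: sequence} dicts, one per file.
    A missing id in a file is shown as an empty string.
    """
    id_sets = [set(t) for t in tables]
    if not id_sets:
        return []
    if common_only:
        ids = set.intersection(*id_sets)  # ids present in every file
    else:
        ids = set.union(*id_sets)         # ids present in at least one file
    return [[i] + [t.get(i, "") for t in tables] for i in sorted(ids)]


# Hypothetical in-memory data standing in for two of the files.
file1 = {"43": "ytuh", "46": "erteee"}
file2 = {"46": "rty", "47": "yuti"}

for row in build_table([file1, file2]):
    print("\t".join(row))
```

With real files, the dictionaries would come from something like [read_pairs(name) for name in file_names]. Passing common_only=True restricts the table to ids found in every file, which covers conditions 1 and 2 from the post; the default keeps every id and leaves gaps, which covers condition 3.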
Typeerror
Hi all,

I am getting the following error when I run my script. Can somebody help me solve this type error?

Traceback (most recent call last):
  File "get_one_prt_pep.py", line 59, in ?
    if len(data[res])<=1:
TypeError: string indices must be integers

--
Beema Shafreen
Re: Typeerror
Thanks a lot, I have solved the problem.

On Wed, May 21, 2008 at 2:32 PM, <[EMAIL PROTECTED]> wrote:
> On May 21, 9:58 am, Freaky Chris <[EMAIL PROTECTED]> wrote:
> > This is a simple error, you are passing the variable res as an integer to
> > use for a slice when whatever you are storing in res isn't an integer.
> >
> > Chris
> >
> > Beema shafreen wrote:
> > > Hi all,
> > > I am getting the following error when I run my script,
> > > can somebody help me regarding this to solve the type error problem
> > >
> > > Traceback (most recent call last):
> > >   File "get_one_prt_pep.py", line 59, in ?
> > >     if len(data[res])<=1:
> > > TypeError: string indices must be integers
> > >
> > > --
> > > Beema Shafreen
> >
> > --
> > View this message in context: http://www.nabble.com/Typeerror-tp17358659p17358932.html
> > Sent from the Python - python-list mailing list archive at Nabble.com.
>
> If it is an integer, but stored as a string you can use:
>
> if len(data[int(res)])<=1
>
> it's not pretty but it should sort out the type errors

--
Beema Shafreen
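The error in this thread can be reproduced in a few lines. The sketch below is in Python 3 syntax and uses made-up values, since the original script get_one_prt_pep.py was never posted; the names data, res and table are assumptions. It shows the int() conversion suggested in the reply, and also a second common cause: data being a plain string when a dictionary or list was expected.

```python
data = "MGNVFANLFK"  # hypothetical value: a str, so only integer indices are valid
res = "3"            # an index read from a text file arrives as a str

try:
    data[res]        # indexing a str with a str raises TypeError
    failed = False
except TypeError:
    failed = True

# Fix suggested in the reply: convert the index to an integer first.
char = data[int(res)]

# Another possibility: the lookup was meant to go into a dict keyed by
# strings, and data was accidentally rebound to a str somewhere earlier.
table = {"3": ["pep1", "pep2"]}  # hypothetical mapping
entries = table[res]

print(failed, char, len(entries))
```

Printing type(data) and type(res) just before the failing line is usually the quickest way to tell which of the two cases applies.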
e-value
Hi all,

I have a file which includes an e_value column. I want to fetch the lines whose e_value is less than 0.01. I have a script, but it doesn't work. Can you please tell me whether this is the right way? The script does not end with any error. It works if I give the condition e_value > 0.01.

My script:

>>> for line in fh:
...     gi, seq, e_value = line.strip().split('\t')
...     if e_value < 0.01:
...         print e_value
...
>>>

Sample data file:

gi|7290649|	IWHHTFYNELR	4.6e-02
gi|108883867|	TITLEVEPSDTIENVK	7.8e-02
gi|157018218|	LFEGGFDTLNK	2.2e-03
gi|34420406|	YMVGPIEEVVEK	7.5e-04
gi|118791575|	ATIKDEITHTGQFYEANDYR	9.4e-03
gi|78706974|	LLSGVTIAQGGVLPNIQAVLLPK	5.2e-02
gi|157015257|	VDDDVAVTDEK	1.0e-02
gi|28571691|	QAGEVTYADAHK	2.2e-02
gi|89954247|	VETGVLKPGTVVVFAPVNLTTEVK	4.4e-03
gi|78101790|	LFEGGFDTLNK	2.2e-03
gi|157021047|	LLSGVTIAQGGVLPNIQAVLLPK	7.3e-05
gi|157138410|	LLSGVTIAQGGVLPNIQAVLLPK	5.2e-02
gi|27820013|	LTDEEVDEMIR	2.6e-03
gi|56417572|	TITLEVEPSDTIENVK	7.8e-02
gi|157020596|	HPGSFEIVHVK	5.8e-02

Can anybody help me regarding this?

--
Beema Shafreen
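A likely cause is that e_value is still a string after split(), so the test compares a string with a float. In Python 2 such a comparison does not raise an error but orders numbers before strings, which would explain why `<` never matches and `>` always does; in Python 3 it raises TypeError instead. The sketch below (Python 3 syntax) converts the field with float() before comparing. The function name filter_by_evalue is hypothetical, and the three sample lines assume the columns are tab-separated, as the split('\t') in the post implies.

```python
def filter_by_evalue(lines, cutoff=0.01):
    """Return (gi, seq, e_value) tuples whose e-value is below cutoff."""
    kept = []
    for line in lines:
        fields = line.strip().split("\t")
        if len(fields) != 3:
            continue  # skip blank or malformed lines
        gi, seq, e_value = fields
        value = float(e_value)  # "2.2e-03" -> 0.0022
        if value < cutoff:
            kept.append((gi, seq, value))
    return kept


# Three rows taken from the sample data, with assumed tab separators.
sample = [
    "gi|7290649|\tIWHHTFYNELR\t4.6e-02\n",
    "gi|157018218|\tLFEGGFDTLNK\t2.2e-03\n",
    "gi|157015257|\tVDDDVAVTDEK\t1.0e-02\n",
]

for gi, seq, value in filter_by_evalue(sample):
    print(gi, seq, value)
```

The same function accepts an open file object in place of the list, since it only iterates over lines.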
sorting a file
Hi all,

I have a file with three columns. I need to sort the file with respect to the third column. How do I do it using Python? I tried the Linux sort command, but I was not able to do it. Can anybody advise me?

--
Beema Shafreen
Re: sorting a file
Thanks a lot for your valuable suggestions.

On Sun, Jun 15, 2008 at 4:04 AM, Dennis Lee Bieber <[EMAIL PROTECTED]> wrote:
> On Sat, 14 Jun 2008 12:45:47 +0530, "Beema shafreen"
> <[EMAIL PROTECTED]> declaimed the following in
> gmane.comp.python.general:
>
> Strange: I don't recall seeing this on comp.lang.py, just the first
> responder; and a search on message ID only found it on gmane...
>
> > Hi all,
> >
> > I have a file with three columns i need to sort the file with respect to
> > the third column. How do I do it using python. I used Linux command to do
> > this. Sort but i not able to do it ?
> > can any body suggest me
>
> Question 1: Will the file fit completely within the memory of a running
> Python program?
>
> Question 2: How are the columns defined? Fixed width, known in advance;
> tab separated; comma separated.
>
> If #1 is true, I'd read the file into a list of tuples/sublists (if line
> is fixed width columns, read line, manually split on column widths; if
> TSV or CSV use the proper options with the CSV module to read the file).
> Define a sort key function to extract the key column and use the
> built-in list sort method
>
>     data.sort(key=lambda x : x[2]) #warning, I'm not skilled at lambda
>
> Actually, if text sort order (not numeric value order) is okay, and the
> lines are fixed width columns, no need to manually split the columns
> into tuples; just read all lines into a list and define a key function
> that picks out the columns needed
>
>     data.sort(key=lambda x : x[colstart:colend])
>
> If #1 is FALSE (too big for memory) you will need to create a sort-merge
> procedure in which you read n-lines of the file; sort them, write to
> temporary file; alternating among 2+ temporary files keeping the same
> n-lines (except for the last packet). Then merge the 2+ temporaries over
> the n-lines in the batch to a new temporary file; after the first n
> lines have been merged (giving n*2+ lines in the batch) switch to
> another temporary file for the next batch. When all original batches
> are merged, repeat the merge using batches of size n*2+... Repeat until
> only one temporary file is left (ie, only one long merge batch is
> written).
>
> Or figure out how to call whatever system sort command is available
> with whatever parameters are needed -- after all, why reinvent the wheel
> if you can reach outside the snake and grab that is already in the snake
> pit ("outside the snake" => os.system(...); "snake pit" => the OS
> environment). Even WinXP has a command line sort command; as long as you
> don't need a multikey sort it can handle the simple text record sorting
> with limitations on memory size to use.
>
> --
> Wulfraed  Dennis Lee Bieber  KD6MOG
> [EMAIL PROTECTED]  [EMAIL PROTECTED]
> HTTP://wlfraed.home.netcom.com/
> (Bestiaria Support Staff: [EMAIL PROTECTED])
> HTTP://www.bestiaria.com/

--
Beema Shafreen
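The in-memory case from the reply (read everything, then sort with a key function) might look like the sketch below, written in Python 3 syntax. Since the thread never says how the columns are separated or whether the third column is numeric, both are assumptions here: tab-separated fields, and a numeric flag to choose between numeric and plain text ordering. The function name sort_by_third_column is hypothetical.

```python
def sort_by_third_column(lines, numeric=True):
    """Split tab-separated lines into rows and sort them on the third field."""
    rows = [line.rstrip("\n").split("\t") for line in lines if line.strip()]
    if numeric:
        key = lambda row: float(row[2])  # 2 sorts before 10
    else:
        key = lambda row: row[2]         # text order: "10" sorts before "2"
    return sorted(rows, key=key)


# Hypothetical three-column data.
lines = ["a\tx\t10\n", "b\ty\t2\n", "c\tz\t33\n"]

for row in sort_by_third_column(lines):
    print("\t".join(row))
```

Note that the key=lambda x: x[2] form quoted in the reply compares the fields as text, so the float() conversion matters whenever the third column holds numbers.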
Regular expression
Hi all,

How do I write a regular expression for this kind of sequence?

>gi|158028609|gb|ABW08583.1| CG8385-PF, isoform F [Drosophila melanogaster]
MGNVFANLFKGLFGKKEMRILMVGLDAAGKTTILYKLKLGEIVTTIPTIGFNVETVE

Thanks
--
Beema Shafreen
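One possible pattern for the header line in this post is sketched below (Python 3 syntax). It assumes every header follows the same ">gi|number|database|accession| description [organism]" layout as the single example given, which may not hold for other records; the group names and the uppercase-only sequence pattern are likewise assumptions made for illustration.

```python
import re

# Header layout assumed from the one example in the post.
HEADER_RE = re.compile(
    r"^>gi\|(?P<gi>\d+)\|(?P<db>\w+)\|(?P<accession>[^|]+)\|\s*"
    r"(?P<description>.*?)\s*\[(?P<organism>[^\]]+)\]\s*$"
)
# A sequence line is assumed to contain only uppercase residue letters.
SEQUENCE_RE = re.compile(r"^[A-Z]+$")

header = ">gi|158028609|gb|ABW08583.1| CG8385-PF, isoform F [Drosophila melanogaster]"
sequence = "MGNVFANLFKGLFGKKEMRILMVGLDAAGKTTILYKLKLGEIVTTIPTIGFNVETVE"

match = HEADER_RE.match(header)
fields = match.groupdict() if match else {}
is_sequence = SEQUENCE_RE.match(sequence) is not None

print(fields)
print(is_sequence)
```

For whole files of such records, a dedicated FASTA parser is usually more robust than a hand-written pattern, because it handles sequences wrapped over several lines.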
print "%s"
Hi all,

In my script I have to print a series of strings, so:

print "%s\t%s\t%s\t%s\t%s\t%s\t" %("a","v","t","R","s","f")

Instead of typing so many %s, can I write %6s in Python, as we do in a C program? What are the other options?

--
Beema Shafreen
Re: print "%s"
Thanks a lot for your kind suggestions.

On Mon, Aug 18, 2008 at 7:07 PM, gundlach <[EMAIL PROTECTED]> wrote:
> The string.join() approach is better for your purpose, but FYI you can
> multiply a string to repeat it:
>
> In [2]: "%s\t" * 6
> Out[2]: '%s\t%s\t%s\t%s\t%s\t%s\t'
>
> - Michael
>
> On Aug 18, 3:27 am, Bruno Desthuilliers <[EMAIL PROTECTED]> wrote:
> > Cameron Simpson a écrit :
> > > On 18Aug2008 11:58, Beema Shafreen <[EMAIL PROTECTED]> wrote:
> > > | In my script i have to print a series of string , so
> > > |
> > > | print "%s\t%s\t%s\t%s\t%s\t%s\t" %("a","v","t","R","s","f")
> > > |
> > > | I need to know instead of typing so many %s can i write %6s in python, as
> > > | we do in C progm.
> > >
> > > I hate to tell you this, but "%6s" in C does NOT print 6 strings. It
> > > prints 1 string, right justified, in no less than 6 characters.
> > > C is just like Python in this example.
> > >
> > > | What are the other options .
> > >
> > > Write a small loop to iterate over the strings. Print a tab before each
> > > string except the first.
> >
> > Or use the str.join method:
> >
> > print "\t".join(list("avtRsf"))

--
Beema Shafreen
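The three options mentioned in this thread can be compared side by side. The sketch below uses Python 3 syntax (print as a function), whereas the thread itself uses the Python 2 print statement; the variable names are chosen here for illustration only.

```python
fields = ["a", "v", "t", "R", "s", "f"]

# str.join puts a tab between items, with no trailing tab.
joined = "\t".join(fields)

# Multiplying the format string repeats "%s\t" once per field,
# which leaves a trailing tab, as in the original print statement.
fmt = "%s\t" * len(fields)
formatted = fmt % tuple(fields)

# "%6s" formats ONE string, right-justified in at least 6 characters.
padded = "%6s" % "a"

print(repr(joined))
print(repr(formatted))
print(repr(padded))
```

The join form is usually the easiest to maintain because it does not depend on the number of fields.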
biopython
Hi all,

I am using Biopython to fetch PubMed IDs. The module I use is Entrez (from Bio import Entrez), but I am getting this error:

>>> from Bio import Entrez
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
ImportError: cannot import name Entrez

What should I do now? Can anybody suggest an alternative for this?

--
Beema Shafreen
function return
Hi all,

I have a script using functions, and I have a problem returning the result. My script returns only one line. I do not know where the loop is going wrong. Can anyone suggest why this is happening and how to return all the lines?

def get_ptm():
    fh = open('file.txt','r')
    data_lis = []
    for line in fh.readlines():
        data = line.strip().split('\t')
        id = data[0].strip()
        gene_symbol = data[1].strip()
        ptms = data[8].strip()
        result = "%s\t%s\t%s" %(id,gene_symbol,ptms)
        return result
    fh.close()

--
Beema Shafreen
Re: function return
Thanks for your valuable comments. I could solve the problem with your comments.

On Thu, Sep 11, 2008 at 7:07 PM, Fredrik Lundh <[EMAIL PROTECTED]> wrote:
> make that:
>
>> note that you put the "return" statement inside the loop, so returning
>> only one line is the expected behaviour.
>
> to fix this, you can append the result strings to the data_lis list inside
> the loop:
>
>>     result = "%s\t%s\t%s" %(id,gene_symbol,ptms)
>>     data_lis.append(result)
>>
>> and then return the list when done:
>>
>>     fh.close()
>>     return data_lis

--
Beema Shafreen
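Putting the two pieces of the reply together gives something like the sketch below (Python 3 syntax). To keep it self-contained it takes any iterable of lines instead of opening 'file.txt' itself, and it skips lines with fewer than nine columns; both are assumptions added here, as is the made-up sample data.

```python
def get_ptm(lines):
    """Collect 'id<TAB>gene_symbol<TAB>ptms' strings for every input line."""
    data_lis = []
    for line in lines:
        data = line.rstrip("\n").split("\t")
        if len(data) < 9:
            continue  # assumption: ignore short or blank lines
        record_id = data[0].strip()
        gene_symbol = data[1].strip()
        ptms = data[8].strip()
        data_lis.append("%s\t%s\t%s" % (record_id, gene_symbol, ptms))
    return data_lis  # return AFTER the loop, so every line is included


# Hypothetical nine-column rows standing in for the real file.
rows = [
    "1\tTP53\ta\tb\tc\td\te\tf\tphospho\n",
    "2\tBRCA1\ta\tb\tc\td\te\tf\tacetyl\n",
]

for result in get_ptm(rows):
    print(result)
```

With a real file this would be called as get_ptm(fh) inside a `with open('file.txt') as fh:` block, which also closes the file once reading is done.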