On Aug 16, 2007, at 2:42 AM, Beema shafreen wrote:
Hi everybody,
I have compared two files:
code:
fh = open('HPRD_MAIN_20.txt','r')
for line in fh.readlines():
    data = line.strip().split('#')
    fh1 = open('NOMENCLATURE_MAIN_20.txt','r')
    for line1 in fh1.readlines():
        data1 = line1.strip().split('#')
        if data1[0] == data[0]:
            result = data[0] +'#'+data[3]+'|'+ data[4] +'|'+data[9]+'|'+ data1[3]
            print result
The result was as given below:
00017#ACTG1|actin, gamma 1|Actin gamma 1|ACTG
00017#ACTG1|actin, gamma 1|Actin gamma 1|Actin gamma
00017#ACTG1|actin, gamma 1|Actin gamma 1|Cytoskeletal gamma actin
But I need the result to be like this:
00017#ACTG1|actin, gamma 1|Actin gamma 1|ACTG,Actin gamma,Cytoskeletal gamma actin
without redundancy, and with the names on the same line separated by
commas.
Please suggest what I should do to get a result like this.
# untested
fh = open('HPRD_MAIN_20.txt','r')
for line in fh.readlines():
    data = line.strip().split('#')
    hprd = '%s#%s|%s|%s|' % (data[0], data[3], data[4], data[9])
    nomenclature = []
    fh1 = open('NOMENCLATURE_MAIN_20.txt','r')
    for line1 in fh1.readlines():
        data1 = line1.strip().split('#')
        if data1[0] == data[0]:
            nomenclature.append(data1[3])
    print '%s%s' % (hprd, ','.join(nomenclature))
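
The loop above re-reads NOMENCLATURE_MAIN_20.txt for every HPRD line and
does not drop repeated names. A possible variation, only a sketch and also
not run against the real files, reads the nomenclature file once into a
dictionary and skips names already seen. It assumes the field positions
used in the code above (ID in field 0, name in field 3 of the nomenclature
file; fields 3, 4 and 9 of the HPRD file). The function names parse_names,
merge and main are made up for illustration, and the print() call form is
used so the sketch also runs on newer Python versions. Unlike the loop
above, it prints nothing for an HPRD record with no matching names.

```python
from collections import defaultdict


def parse_names(nomenclature_lines):
    """Map each ID (field 0) to its distinct names (field 3), in file order."""
    names = defaultdict(list)
    for line in nomenclature_lines:
        fields = line.strip().split('#')
        # Skip short lines; keep only the first occurrence of each name.
        if len(fields) > 3 and fields[3] not in names[fields[0]]:
            names[fields[0]].append(fields[3])
    return names


def merge(hprd_lines, names):
    """Yield one merged line per HPRD record that has at least one name."""
    for line in hprd_lines:
        data = line.strip().split('#')
        if len(data) > 9 and data[0] in names:
            prefix = '%s#%s|%s|%s|' % (data[0], data[3], data[4], data[9])
            yield prefix + ','.join(names[data[0]])


def main(hprd_path, nomenclature_path):
    """Read the nomenclature file once, then print the merged HPRD lines."""
    with open(nomenclature_path) as fh1:
        names = parse_names(fh1)
    with open(hprd_path) as fh:
        for result in merge(fh, names):
            print(result)
```

Calling main('HPRD_MAIN_20.txt', 'NOMENCLATURE_MAIN_20.txt') should then
print one line per ID, with the names joined by commas.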
hth,
Michael
---
"I would rather use Java than Perl. And I'd rather be eaten by a
crocodile than use Java." — Trouser
--
http://mail.python.org/mailman/listinfo/python-list