On Aug 16, 2007, at 2:42 AM, Beema shafreen wrote:

Hi everybody,
I have compared two files with this code:

fh = open('HPRD_MAIN_20.txt','r')
for line in fh.readlines():
        data = line.strip().split('#')
        fh1 = open('NOMENCLATURE_MAIN_20.txt','r')
        for line1 in fh1.readlines():
                data1 = line1.strip().split('#')
                if  data1[0] == data[0]:
                        result = data[0] +'#'+data[3]+'|'+ data[4] +'|'+data[9]+'|'+ data1[3]
                        print result

The result was as follows:


00017#ACTG1|actin, gamma 1|Actin gamma 1|ACTG
00017#ACTG1|actin, gamma 1|Actin gamma 1|Actin gamma
00017#ACTG1|actin, gamma 1|Actin gamma 1|Cytoskeletal gamma actin


But I need the result to look like this:


00017#ACTG1|actin, gamma 1|Actin gamma 1|ACTG,Actin gamma,Cytoskeletal gamma actin


That is, without the redundancy, and with the names on the same line separated by commas.
Please suggest what I should do to get a result like this.

# untested
fh = open('HPRD_MAIN_20.txt','r')
for line in fh.readlines():
        data = line.strip().split('#')
        # fixed part of the output line, built once per HPRD record
        hprd = '%s#%s|%s|%s|' % (data[0], data[3], data[4], data[9])
        # collect every matching name instead of printing one line per match
        nomenclature = []
        fh1 = open('NOMENCLATURE_MAIN_20.txt','r')
        for line1 in fh1.readlines():
                data1 = line1.strip().split('#')
                if  data1[0] == data[0]:
                        nomenclature.append(data1[3])
        fh1.close()
        print '%s%s' % (hprd, ','.join(nomenclature))
fh.close()
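
The loop above rereads NOMENCLATURE_MAIN_20.txt once for every HPRD line and keeps repeated names. A possible variation, sketched here in Python 3 syntax and not taken from the thread, indexes the nomenclature records by ID in a single pass and drops duplicate names while keeping their first-seen order. The function name merge_names is hypothetical, and the sketch assumes the same '#'-separated layout as the code above (ID in field 0, name in field 3 of the nomenclature file). Unlike the loop above, it skips HPRD records that have no nomenclature entry at all, which matches the behaviour of the original question's code.

```python
from collections import OrderedDict


def merge_names(hprd_lines, nomenclature_lines):
    """Return one output line per HPRD record, alias names joined by commas."""
    # Index the nomenclature records by ID once. The inner OrderedDict is used
    # as an ordered set: repeated names collapse, first-seen order is kept.
    names_by_id = {}
    for line in nomenclature_lines:
        fields = line.strip().split('#')
        if len(fields) > 3:
            names_by_id.setdefault(fields[0], OrderedDict())[fields[3]] = None

    results = []
    for line in hprd_lines:
        fields = line.strip().split('#')
        # Skip malformed lines and IDs without any nomenclature entry.
        if len(fields) <= 9 or fields[0] not in names_by_id:
            continue
        prefix = '%s#%s|%s|%s|' % (fields[0], fields[3], fields[4], fields[9])
        results.append(prefix + ','.join(names_by_id[fields[0]]))
    return results
```

Passing the two open file objects as hprd_lines and nomenclature_lines and printing each returned string should give one line per ID, provided the nomenclature file fits in memory.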

hth,
Michael

---
"I would rather use Java than Perl. And I'd rather be eaten by a crocodile than use Java." — Trouser

