First of all, thanks for helping me out.
I have to admit I don't understand some of your suggestions, sorry.
I don't know what the "3D" thing is... Is there another way to make it
work, something simpler for a newbie like me? Thanks.
What I want to do is:
First, check all the files in a folder and analyze only the ones with the .Seq
extension.
Then get the reverse complement of each DNA sequence. If there is a problem
with some characters in the DNA sequence, I want the function to tell me.
Here are the comp and iupac:
iupac ="GgAaTtCcRrYyMmKkSsWwHhBbVvDdNn"
comp={"A":"T", "T":"A", "G":"C", "C":"G", "R":"Y", "Y":"R", "M":"K",
"K":"M", "S":"W", "W":"S", "B":"V", "V":"B", "D":"H", "H":"D", "r":"y",
"y":"r", "m":"k", "k":"m", "s":"w", "w":"s", "b":"v", "v":"b", "d":"h",
"h":"d", "a":"t", "t":"a", "g":"c", "c":"g", "N":"N","n":"n"}
So if a $ or Z appears in the DNA sequence, I want to know it.
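A minimal sketch of the behaviour described above, not the poster's own code. It rebuilds the comp table from its upper-case half so the block is self-contained, and the helper names find_invalid and reverse_complement are hypothetical. It assumes the sequence has already had whitespace and newlines removed, and it avoids print so it runs unchanged on Python 2 and Python 3.

```python
# Complement table for the upper-case IUPAC codes listed above; the
# lower-case entries are derived from it so the two cases stay in step.
_UPPER = {"A": "T", "T": "A", "G": "C", "C": "G", "R": "Y", "Y": "R",
          "M": "K", "K": "M", "S": "W", "W": "S", "B": "V", "V": "B",
          "D": "H", "H": "D", "N": "N"}
comp = dict(_UPPER)
for base, partner in _UPPER.items():
    comp[base.lower()] = partner.lower()


def find_invalid(seq):
    """Return (position, character) pairs for characters missing from comp."""
    return [(pos, char) for pos, char in enumerate(seq) if char not in comp]


def reverse_complement(seq):
    """Return the reverse complement of seq, or raise ValueError
    naming every character that is not an IUPAC code."""
    bad = find_invalid(seq)
    if bad:
        details = ", ".join("%r at position %d" % (char, pos)
                            for pos, char in bad)
        raise ValueError("invalid character(s): " + details)
    # Walk the sequence backwards and complement each base.
    return "".join(comp[char] for char in reversed(seq))
```

With this sketch, a sequence containing $ or Z raises a ValueError that lists each offending character and its position instead of returning a partial result. File contents usually end with a newline, so something like "".join(text.split()) would be needed first if the .Seq files are plain sequence text, which is an assumption here.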
My code so far:
# -*- coding: iso-8859-1 -*-
import sys
import os
from progadn import *

ab1seq = raw_input("Entrez le répertoire où sont les fichiers à analyser: ") or None
if ab1seq == None :
    print "Erreur: Pas de répertoire! \n" \
          "\nAu revoir \n"
    sys.exit()

listrep = os.listdir(ab1seq)
#print listrep

extseq=[]
for f in listrep:
    if f[-4:]==".Seq":
        extseq.append(f)
#print extseq

for x in extseq:
    f=open(x, "r")
    seq=f.read()
    f.close()
    #s=seq

def checkDNA(seq):
    """Retourne une liste des caractères non conformes à l'IUPAC."""
    junk=[]
    for c in range (len(seq)):
        if seq[c] not in iupac:
            junk.append([seq[c],c])
            #print junk
            print "ATTN: Il y a le caractère %s en position %s " % (seq[c],c)
    if junk == []:
        indinv=range(len(seq))
        indinv.reverse()
        resultat=""
        for i in indinv:
            resultat +=comp[seq[i]]
        return resultat

seq=checkDNA(seq)
-
From: gry@ll.mit.edu
Newsgroups: comp.lang.python
Subject: Re: problem with the logic of read files
Date: 12 Apr 2005 10:47:17 -0700
<[EMAIL PROTECTED]> wrote:
> I am new to python and I am not in computer science. In fact I am a
> biologist and I am trying to learn python. So if someone can help me, I
> will appreciate it.
> Thanks
>
>
> #!/cbi/prg/python/current/bin/python
> # -*- coding: iso-8859-1 -*-
> import sys
> import os
> from progadn import *
>
> ab1seq = raw_input("Entrez le répertoire où sont les fichiers à analyser: ") or None
> if ab1seq == None :
>     print "Erreur: Pas de répertoire! \n"
>           "\nAu revoir \n"
>     sys.exit()
>
> listrep = os.listdir(ab1seq)
> #print listrep
>
> extseq=[]
>
> for f in listrep:
## Minor -- this is better said as: if f.endswith(".Seq"):
>     if f[-4:]==".Seq":
>         extseq.append(f)
> # print extseq
>
> for x in extseq:
>     f = open(x, "r")
## seq=... discards previous data and refers only to that just read.
## It would be simplest to process each file as it is read:
@@ seq=f.read()
@@ checkDNA(seq)
>     seq=f.read()
>     f.close()
>     s=seq
>
> def checkDNA(seq):
>     """Retourne une liste des caractères non conformes à l'IUPAC."""
>
>     junk=[]
>     for c in range (len(seq)):
>         if seq[c] not in iupac:
>             junk.append([seq[c],c])
>             #print junk
>             print "ATTN: Il y a le caractère %s en position %s " % (seq[c],c)
>     if junk == []:
>         indinv=range(len(seq))
>         indinv.reverse()
>         resultat=""
>         for i in indinv:
>             resultat +=comp[seq[i]]
>         return resultat
>
> seq=checkDNA(seq)
> print seq
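The two remarks above (use endswith, and process each file as it is read) can be sketched as follows. This is only an illustration under stated assumptions: read_seq_files is a hypothetical helper, not part of the posted program, and the with statement needs a more recent Python than the one in use when this thread was written. It also joins the directory onto each name, because os.listdir returns bare file names and open(x, "r") would otherwise look in the current working directory.

```python
import os


def read_seq_files(directory, extension=".Seq"):
    """Yield (filename, contents) for each file in directory whose name
    ends with extension, one file at a time."""
    for name in sorted(os.listdir(directory)):
        if not name.endswith(extension):
            continue
        # os.listdir gives names without the directory, so rebuild the path.
        path = os.path.join(directory, name)
        with open(path, "r") as handle:
            yield name, handle.read()
```

Looping with "for name, text in read_seq_files(ab1seq):" and calling checkDNA on text inside that loop handles every file in turn, rather than only the last one read.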
# The program segment you posted did not define "comp" or "iupac",
# so it's a little hard to guess how it's supposed to work. It would
# be helpful if you gave a concise description of what you want the
# program to do, as well as a brief sample of input data.
# I hope this helps! -- Geor