Hi Paul,

Thanks for your interest. I am talking about multiple protein sequence alignments generated by the program clustalW (see http://www.ebi.ac.uk/clustalw/help.html for additional information). Since the sequences to be aligned can be very long, clustalW splits them into fragments in its output, and every fragment starts with the name of the aligned sequence (please check the attached file clustal.aln to see an example). This is what I previously called a block.

If the program parseclustal.pl (also attached) were working properly, the output should look like the file "parseclustal.aln" (see attachment), but instead the result is the one seen in the attached badparseclustal.aln. I ran the program with the following command: "parseclustal.pl clustal.aln > output"

Please let me know if you find a solution, and thanks for your help.
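
To make the intended behaviour concrete, here is a minimal sketch of the merging step described above: the fragments of each sequence are concatenated across blocks, keyed by the sequence name. It is written in Python rather than Perl and is not the attached parseclustal.pl. The function names join_clustal_blocks and format_alignment are hypothetical, and the 16-column name field is an assumption taken from the layout of clustal.aln.

```python
from collections import OrderedDict


def join_clustal_blocks(text):
    """Concatenate the fragments of each sequence across all blocks."""
    sequences = OrderedDict()  # keeps the order in which names first appear
    for line in text.splitlines():
        # Skip blank lines and the "CLUSTAL W ..." header line.
        if not line.strip() or line.startswith("CLUSTAL"):
            continue
        # The conservation line ("*", ".") has no name, so it starts
        # with whitespace; skip it as well.
        if line[0].isspace():
            continue
        fields = line.split()
        if len(fields) < 2:
            continue
        name, fragment = fields[0], fields[1]
        # Append this fragment to whatever was collected for the name so far.
        sequences[name] = sequences.get(name, "") + fragment
    return sequences


def format_alignment(sequences, name_width=16):
    """Print one line per sequence: padded name followed by the full sequence."""
    return "\n".join(
        f"{name:<{name_width}}{seq}" for name, seq in sequences.items()
    )
```

The key point is that each name is stored once and only its sequence grows, so every sequence is printed a single time, as in parseclustal.aln, no matter how many blocks the input contains.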


Regards,

Pedro Reche
 

-- 
***************************************************************************
PEDRO a. RECHE gallardo, pHD            TL: 617 632 3824                                                 
Scientist, Mol.Immunol.Foundation,      FX: 617 632 3351
Dana-Farber Cancer Institute,           EM: [EMAIL PROTECTED]
Harvard Medical School,                 URL: http://www.reche.org
44 Binney Street, D610C,                                
Boston, MA 02115                                
***************************************************************************
 
YPK1            SQLSWKRLLMKGYIPPYKPAVS-----NSMDTSNFDEEFTR---EKPIDSVVDEYLSESV------QKQF
YPK2            KDISWKKLLLKGYIPPYKPIVK-----SEIDTANFDQEFTK---EKPIDSVVDEYLSASI------QKQF
KPCA_HUMAN      RRIDWEKLENREIQPPFKPKVC------GKGAENFDKFFTR---GQPVLTPPDQLVIANID-----QSDF
KPCZ_HUMAN      RSIDWDLLEKKQALPPFQPQIT-----DDYGLDNFDTQFTS---EPVQLTPDDEDAIKRID-----QSEF
KAPA            KEVVWEKLLSRNIETPYEPPIQ----QGQGDTSQFDKYPE----EDINYGVQGEDPYADL------FRDF
KAPC            NEVIWEKLLARYIETPYEPPIQ----QGQGDTSQFDRYPE----EEFNYGIQGEDPYMDL------MKEF
KAPB            SEVVWERLLAKDIETPYEPPIT----SGIGDTSLFDQYPE----EQLDYGIQGDDPYAEY------FQDF
KS6_HUMAN       RHINWEELLARKVEPPFKPLLQ-----SEEDVSQFDSKFTR---QTPVDSP-DDSTLSESA-----NQVF
KPC1            RNINFDDILNLRVKPPYIPEIK-----SPEDTSYFEQEFTS---APPTLTPLPSVLTTSQ------QEEF
KRAC_BOVIN      ASIVWQDVYEKKLSPPFKPQVT-----SETDTRYFDEEFTA---QMITITPPDQDDSMEGVDS-ERRPHF
SCH9            ADIDWEALKQKKIPPPFKPHLV-----SETDTSNFDPEFTT---ASTSYMNKHQPMMTATPLSPAMQAKF
KGP1_DROME      LGFDWDGLASQLLIPPFVRPIA-----HPTDVRYFDRFPC------DLNEPPDELSGWDA--------DF
ARK2_RAT        KGIDWQYVYLRKYPPPLIPPRGEVNAADAFDIGSFDEEDTKG--IKLLDCDQDLYKNFPLMISERWQQEV
DBFB            AEINFETLRTS--SPPFIPQLD-----DETDAGYFDDFTNEEDMAKYADVFKRQNKLSAMVDDSAVDSKL
DBF2            ADINFSTLRSM--IPPFTPQLD-----SETDAGYFDDFTSEADMAKYADVFKRQDKLTAMVDDSAVSSKL
YPK1            SQLSWKRLLMKGYIPPYKPAVS-----NSMDTSNFDEEFTR---EKPIDSVVDEYLSESV------QKQF
YPK2            KDISWKKLLLKGYIPPYKPIVK-----SEIDTANFDQEFTK---EKPIDSVVDEYLSASI------QKQF
KPCA_HUMAN      RRIDWEKLENREIQPPFKPKVC------GKGAENFDKFFTR---GQPVLTPPDQLVIANID-----QSDF
KPCZ_HUMAN      RSIDWDLLEKKQALPPFQPQIT-----DDYGLDNFDTQFTS---EPVQLTPDDEDAIKRID-----QSEF
KAPA            KEVVWEKLLSRNIETPYEPPIQ----QGQGDTSQFDKYPE----EDINYGVQGEDPYADL------FRDF
KAPC            NEVIWEKLLARYIETPYEPPIQ----QGQGDTSQFDRYPE----EEFNYGIQGEDPYMDL------MKEF
KAPB            SEVVWERLLAKDIETPYEPPIT----SGIGDTSLFDQYPE----EQLDYGIQGDDPYAEY------FQDF
KS6_HUMAN       RHINWEELLARKVEPPFKPLLQ-----SEEDVSQFDSKFTR---QTPVDSP-DDSTLSESA-----NQVF
KPC1            RNINFDDILNLRVKPPYIPEIK-----SPEDTSYFEQEFTS---APPTLTPLPSVLTTSQ------QEEF
KRAC_BOVIN      ASIVWQDVYEKKLSPPFKPQVT-----SETDTRYFDEEFTA---QMITITPPDQDDSMEGVDS-ERRPHF
SCH9            ADIDWEALKQKKIPPPFKPHLV-----SETDTSNFDPEFTT---ASTSYMNKHQPMMTATPLSPAMQAKF
KGP1_DROME      LGFDWDGLASQLLIPPFVRPIA-----HPTDVRYFDRFPC------DLNEPPDELSGWDA--------DF
ARK2_RAT        KGIDWQYVYLRKYPPPLIPPRGEVNAADAFDIGSFDEEDTKG--IKLLDCDQDLYKNFPLMISERWQQEV
DBFB            AEINFETLRTS--SPPFIPQLD-----DETDAGYFDDFTNEEDMAKYADVFKRQNKLSAMVDDSAVDSKL
DBF2            ADINFSTLRSM--IPPFTPQLD-----SETDAGYFDDFTSEADMAKYADVFKRQDKLTAMVDDSAVSSKL
CLUSTAL W(1.60) multiple sequence alignment



YPK1            SQLSWKRLLMKGYIPPYKPAVS-----NSMDTSNFDEEFTR---EKPIDSVVDEYLSESV
YPK2            KDISWKKLLLKGYIPPYKPIVK-----SEIDTANFDQEFTK---EKPIDSVVDEYLSASI
KPCA_HUMAN      RRIDWEKLENREIQPPFKPKVC------GKGAENFDKFFTR---GQPVLTPPDQLVIANI
KPCZ_HUMAN      RSIDWDLLEKKQALPPFQPQIT-----DDYGLDNFDTQFTS---EPVQLTPDDEDAIKRI
KAPA            KEVVWEKLLSRNIETPYEPPIQ----QGQGDTSQFDKYPE----EDINYGVQGEDPYADL
KAPC            NEVIWEKLLARYIETPYEPPIQ----QGQGDTSQFDRYPE----EEFNYGIQGEDPYMDL
KAPB            SEVVWERLLAKDIETPYEPPIT----SGIGDTSLFDQYPE----EQLDYGIQGDDPYAEY
KS6_HUMAN       RHINWEELLARKVEPPFKPLLQ-----SEEDVSQFDSKFTR---QTPVDSP-DDSTLSES
KPC1            RNINFDDILNLRVKPPYIPEIK-----SPEDTSYFEQEFTS---APPTLTPLPSVLTTSQ
KRAC_BOVIN      ASIVWQDVYEKKLSPPFKPQVT-----SETDTRYFDEEFTA---QMITITPPDQDDSMEG
SCH9            ADIDWEALKQKKIPPPFKPHLV-----SETDTSNFDPEFTT---ASTSYMNKHQPMMTAT
KGP1_DROME      LGFDWDGLASQLLIPPFVRPIA-----HPTDVRYFDRFPC------DLNEPPDELSGWDA
ARK2_RAT        KGIDWQYVYLRKYPPPLIPPRGEVNAADAFDIGSFDEEDTKG--IKLLDCDQDLYKNFPL
DBFB            AEINFETLRTS--SPPFIPQLD-----DETDAGYFDDFTNEEDMAKYADVFKRQNKLSAM
DBF2            ADINFSTLRSM--IPPFTPQLD-----SETDAGYFDDFTSEADMAKYADVFKRQDKLTAM
                               *                  *.                        

YPK1            ------QKQF
YPK2            ------QKQF
KPCA_HUMAN      D-----QSDF
KPCZ_HUMAN      D-----QSEF
KAPA            ------FRDF
KAPC            ------MKEF
KAPB            ------FQDF
KS6_HUMAN       A-----NQVF
KPC1            ------QEEF
KRAC_BOVIN      VDS-ERRPHF
SCH9            PLSPAMQAKF
KGP1_DROME      --------DF
ARK2_RAT        MISERWQQEV
DBFB            VDDSAVDSKL
DBF2            VDDSAVSSKL
YPK1            SQLSWKRLLMKGYIPPYKPAVS-----NSMDTSNFDEEFTR---EKPIDSVVDEYLSESV------QKQF
YPK2            KDISWKKLLLKGYIPPYKPIVK-----SEIDTANFDQEFTK---EKPIDSVVDEYLSASI------QKQF
KPCA_HUMAN      RRIDWEKLENREIQPPFKPKVC------GKGAENFDKFFTR---GQPVLTPPDQLVIANID-----QSDF
KPCZ_HUMAN      RSIDWDLLEKKQALPPFQPQIT-----DDYGLDNFDTQFTS---EPVQLTPDDEDAIKRID-----QSEF
KAPA            KEVVWEKLLSRNIETPYEPPIQ----QGQGDTSQFDKYPE----EDINYGVQGEDPYADL------FRDF
KAPC            NEVIWEKLLARYIETPYEPPIQ----QGQGDTSQFDRYPE----EEFNYGIQGEDPYMDL------MKEF
KAPB            SEVVWERLLAKDIETPYEPPIT----SGIGDTSLFDQYPE----EQLDYGIQGDDPYAEY------FQDF
KS6_HUMAN       RHINWEELLARKVEPPFKPLLQ-----SEEDVSQFDSKFTR---QTPVDSP-DDSTLSESA-----NQVF
KPC1            RNINFDDILNLRVKPPYIPEIK-----SPEDTSYFEQEFTS---APPTLTPLPSVLTTSQ------QEEF
KRAC_BOVIN      ASIVWQDVYEKKLSPPFKPQVT-----SETDTRYFDEEFTA---QMITITPPDQDDSMEGVDS-ERRPHF
SCH9            ADIDWEALKQKKIPPPFKPHLV-----SETDTSNFDPEFTT---ASTSYMNKHQPMMTATPLSPAMQAKF
KGP1_DROME      LGFDWDGLASQLLIPPFVRPIA-----HPTDVRYFDRFPC------DLNEPPDELSGWDA--------DF
ARK2_RAT        KGIDWQYVYLRKYPPPLIPPRGEVNAADAFDIGSFDEEDTKG--IKLLDCDQDLYKNFPLMISERWQQEV
DBFB            AEINFETLRTS--SPPFIPQLD-----DETDAGYFDDFTNEEDMAKYADVFKRQNKLSAMVDDSAVDSKL
DBF2            ADINFSTLRSM--IPPFTPQLD-----SETDAGYFDDFTSEADMAKYADVFKRQDKLTAMVDDSAVSSKL

parseclustal.pl
