Hi Reza, happy new year!

The choice depends on your alignment: amino acids or nucleotides? Are the sequences closely or distantly related? Is it a large alignment? Are there many gaps? In any case, I think the safest, unbiased way to identify a group of outliers is to compute a phylogenetic tree and look for an outgroup. If the set of sequences is too large for that, you might first want to reduce it with a clustering algorithm such as CD-HIT (http://weizhongli-lab.org/cd-hit/).

HTH,
Javier
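For small alignments, a rough distance-based shortcut can be sketched in a few lines of Python. This is only an illustration, not the tree-based approach or CD-HIT: it assumes a simple p-distance (fraction of mismatches over columns where neither sequence has a gap) and a greedy max-min ("farthest-point") selection. The function names `p_distance` and `most_distinct` and the toy sequences are made up for the example, and a p-distance may be a poor metric for distantly related sequences.

```python
from itertools import combinations


def p_distance(a, b, gap="-"):
    """Fraction of mismatches over columns where neither sequence has a gap."""
    compared = differing = 0
    for x, y in zip(a, b):
        if x == gap or y == gap:
            continue  # skip gapped columns (an assumption of this sketch)
        compared += 1
        if x != y:
            differing += 1
    return differing / compared if compared else 0.0


def most_distinct(aligned, n):
    """Greedily pick n names from a dict {name: aligned sequence}.

    Starts from the most distant pair, then repeatedly adds the sequence
    whose nearest already-chosen neighbour is farthest away.
    """
    names = list(aligned)
    if not 2 <= n <= len(names):
        raise ValueError("n must be between 2 and the number of sequences")

    # Precompute all pairwise distances, stored symmetrically.
    dist = {}
    for a, b in combinations(names, 2):
        d = p_distance(aligned[a], aligned[b])
        dist[a, b] = dist[b, a] = d

    # Seed with the most distant pair.
    chosen = list(max(combinations(names, 2), key=lambda pair: dist[pair]))

    # Add the sequence that maximises its minimum distance to the chosen set.
    while len(chosen) < n:
        rest = [s for s in names if s not in chosen]
        chosen.append(max(rest, key=lambda s: min(dist[s, c] for c in chosen)))
    return chosen


# Hypothetical toy alignment (all sequences have the same aligned length).
toy = {
    "seqA": "ACGTACGTAC",
    "seqB": "ACGTACGTAT",
    "seqC": "TTGTACCTAC",
    "seqD": "ACGAAGGTCC",
    "seqE": "ACGTAC-TAC",
}
picked = most_distinct(toy, 3)
print(picked)  # ['seqC', 'seqD', 'seqB']
```

The greedy pick is a heuristic and is not guaranteed to find the globally most diverse subset; with many sequences, the all-against-all distance step also becomes the bottleneck, which is where a prior clustering step would help.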
On Thu, Jan 3, 2019 at 6:08 PM Ethan A Merritt <merr...@u.washington.edu> wrote:

> On Thursday, January 3, 2019 12:40:05 PM PST Reza Khayat wrote:
> > Hi,
> >
> > Happy new year to all! A bit of an off topic question. Does anyone
> > know of a method/program to extract the most distinct "n" (n>2) sequences
> > from a sequence alignment? Thanks.
>
> If these putative "most distinct" sequences are hypothesized to belong
> together, then I suggest K-means clustering. If they are hypothesized
> to be unrelated individual outliers, then I think you would just take the
> worst scores using whatever metric you used to create the original alignment.
>
> Ethan
>
> > Best wishes,
> > Reza
> >
> > Reza Khayat, PhD
> > Assistant Professor
> > City College of New York
> > Department of Chemistry
> > New York, NY 10031
>
> --
> Ethan A Merritt
> Biomolecular Structure Center, K-428 Health Sciences Bldg
> MS 357742, University of Washington, Seattle 98195-7742

--
Dr. Javier M. González
Instituto de Bionanotecnología del NOA (INBIONATEC-CONICET)
Universidad Nacional de Santiago del Estero (UNSE)
RN9, Km 1125. Villa El Zanjón. (G4206XCP)
Santiago del Estero. Argentina
Tel: +54-(0385)-4238352
Email <bio...@gmail.com>
LinkedIn <https://www.linkedin.com/in/javier-m-gonzalez-inbionatec>

########################################################################

To unsubscribe from the CCP4BB list, click the following link:
https://www.jiscmail.ac.uk/cgi-bin/webadmin?SUBED1=CCP4BB&A=1