On Sun, Dec 29, 2013 at 04:02:01PM -0500, Jing Ai wrote:
> Hello,
>
> I am trying to rewrite some contents on a long list that contains words
> within brackets and outside brackets and I'm having trouble extracting the
> words within brackets, especially since I have to add the append function
> for list as well. Does anyone have any suggestions? Thank you!
>
> *An example of list*:
>
> ['hypothetical protein BRAFLDRAFT_208408 [Branchiostoma floridae]\n',
> 'hypoxia-inducible factor 1-alpha [Mus musculus]\n', 'hypoxia-inducible
> factor 1-alpha [Gallus gallus]\n' ]
>
> *What I'm trying to extract out of this*:
>
> ['Branchiostoma floridae', 'Mus musculus', 'Gallus gallus']

You have a list of strings. Each string has exactly one pair of square
brackets []. You want the content of the square brackets.

Start with a function that extracts the content of the square brackets
from a single string.

    def extract(s):
        start = s.find('[')
        if start == -1:
            # No opening bracket found. Should this be an error?
            return ''
        start += 1  # skip the bracket, move to the next character
        end = s.find(']', start)
        if end == -1:
            # No closing bracket found after the opening bracket.
            # Should this be an error instead?
            return s[start:]
        else:
            return s[start:end]

Let's test it and see if it works:

    py> s = 'hypothetical protein BRAFLDRAFT_208408 [Branchiostoma floridae]\n'
    py> extract(s)
    'Branchiostoma floridae'

So far so good. Now let's write a loop:

    names = []
    for line in list_of_strings:
        names.append(extract(line))

where list_of_strings is your big list like the example above.

We can simplify the loop by using a list comprehension:

    names = [extract(line) for line in list_of_strings]

If you prefer to use a regular expression, that's simple enough. Here's
a new version of the extract function:

    import re

    def extract(s):
        mo = re.search(r'\[(.*)\]', s)
        if mo:
            return mo.group(1)
        return ''

The list comprehension remains the same.

-- 
Steven
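
A caveat on the regular-expression version above: `.*` is greedy, so if a
string ever contains more than one pair of brackets, it matches everything
from the first '[' to the last ']'. Here is a minimal sketch of one way
around that, assuming you want the last bracketed group on each line. The
names BRACKETED and extract_last are made up for this illustration; they
are not part of the original code.

```python
import re

# Hypothetical pattern (an assumption for this sketch): one [...] group
# whose content contains no brackets of its own.
BRACKETED = re.compile(r'\[([^\[\]]*)\]')


def extract_last(s):
    """Return the content of the last [...] group in s, or '' if none."""
    matches = BRACKETED.findall(s)
    if not matches:
        # No complete pair of brackets found. Should this be an error?
        return ''
    return matches[-1].strip()


list_of_strings = [
    'hypothetical protein BRAFLDRAFT_208408 [Branchiostoma floridae]\n',
    'hypoxia-inducible factor 1-alpha [Mus musculus]\n',
    'hypoxia-inducible factor 1-alpha [Gallus gallus]\n',
]

names = [extract_last(line) for line in list_of_strings]
print(names)

# The greedy pattern, for comparison, on a line with two bracket pairs:
greedy = re.search(r'\[(.*)\]', 'a [x] b [y]').group(1)
print(repr(greedy))   # 'x] b [y'
print(repr(extract_last('a [x] b [y]')))   # 'y'
```

If every line really does have exactly one pair of brackets, both versions
give the same answer and the simpler pattern is fine.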