On Thursday, August 5, 2021 at 11:29:15 PM UTC+8, ast wrote:
> On 05/08/2021 at 17:11, ast wrote:
> > On 05/08/2021 at 11:40, Jach Feng wrote:
> >> I want to distinguish between numbers with/without a dot attached:
> >>
> >>>>> text = 'ch 1. is\nch 23. is\nch 4 is\nch 56 is\n'
> >>>>> re.compile(r'ch \d{1,}[.]').findall(text)
> >> ['ch 1.', 'ch 23.']
> >>>>> re.compile(r'ch \d{1,}[^.]').findall(text)
> >> ['ch 23', 'ch 4 ', 'ch 56 ']
> >>
> >> I can guess why the 'ch 23' appears in the second list. But how to get
> >> rid of it?
> >>
> >> --Jach
> >>
> >
> > >>> import re
> >
> > >>> text = 'ch 1. is\nch 23. is\nch 4 is\nch 56 is\n'
> >
> > >>> re.findall(r'ch \d+\.', text)
> > ['ch 1.', 'ch 23.']
> >
> > >>> re.findall(r'ch \d+(?!\.)', text) # (?!\.) for negative lookahead
> > ['ch 2', 'ch 4', 'ch 56']
> import regex
>
> # regex is more powerful than re
> >>> text = 'ch 1. is\nch 23. is\nch 4 is\nch 56 is\n'
> >>> regex.findall(r'ch \d++(?!\.)', text)
>
> ['ch 4', 'ch 56']
>
> ## ++ means "possessive", no backtrack is allowed

Can someone explain how the difference arises? I just can't figure it out :-(

>>> text = 'ch 1. is\nch 23. is\nch 4 is\nch 56 is\n'
>>> re.compile(r'ch \d+[^.]').findall(text)
['ch 23', 'ch 4 ', 'ch 56 ']
>>> re.compile(r'ch \d+[^.0-9]').findall(text)
['ch 4 ', 'ch 56 ']

--Jach
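
For what it's worth, here is a minimal sketch of where the difference seems to come from, using only the standard re module. The point it tries to show is that \d+ is greedy but still allowed to backtrack: on 'ch 23.' it first takes '23', the next item fails on the dot, so the engine gives back the '3' and retries. With [^.] the '3' itself is an acceptable "not a dot" character, so 'ch 23' matches; with [^.0-9] it is not, so nothing matches there. The variable names (backtracked, no_dot_or_digit, lookahead) are only illustrative, and the lookahead pattern is one possible alternative, not something taken from the thread.

```python
import re

text = 'ch 1. is\nch 23. is\nch 4 is\nch 56 is\n'

# [^.] must consume one character. On 'ch 23.' the dot is rejected, so the
# engine backtracks: \d+ keeps only '2' and [^.] consumes the '3'.
backtracked = re.findall(r'ch \d+[^.]', text)
print(backtracked)        # ['ch 23', 'ch 4 ', 'ch 56 ']

# [^.0-9] rejects digits too, so after backtracking to '2' the '3' is
# refused as well and 'ch 23.' produces no match at all.
no_dot_or_digit = re.findall(r'ch \d+[^.0-9]', text)
print(no_dot_or_digit)    # ['ch 4 ', 'ch 56 ']

# Same idea as a negative lookahead: the number must not be followed by a
# dot or another digit. A lookahead consumes nothing, so the trailing
# space is not part of the match.
lookahead = re.findall(r'ch \d+(?![.\d])', text)
print(lookahead)          # ['ch 4', 'ch 56']
```

The lookahead form also matches a number at the very end of the string, whereas [^.0-9] needs one more character to consume.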