On Friday, August 6, 2021 at 4:10:05 PM UTC+8, jak wrote:
> On 05/08/2021 11:40, Jach Feng wrote:
> > I want to distinguish between numbers with/without a dot attached:
> >
> > >>> text = 'ch 1. is\nch 23. is\nch 4 is\nch 56 is\n'
> > >>> re.compile(r'ch \d{1,}[.]').findall(text)
> > ['ch 1.', 'ch 23.']
> > >>> re.compile(r'ch \d{1,}[^.]').findall(text)
> > ['ch 23', 'ch 4 ', 'ch 56 ']
> >
> > I can guess why the 'ch 23' appears in the second list. But how to get rid
> > of it?
> >
> > --Jach
> >
> import re
> t = 'ch 1. is\nch 23. is\nch 4 is\nch 56 is\n'
> r = re.compile(r'(ch +\d+\.)|(ch +\d+)', re.M)
>
> res = r.findall(t)
>
> dot = [x[1] for x in res if x[1] != '']
> udot = [x[0] for x in res if x[0] != '']
>
> print(f"dot: {dot}")
> print(f"undot: {udot}")
>
> out:
>
> dot: ['ch 4', 'ch 56']
> undot: ['ch 1.', 'ch 23.']

So the result can be influenced by the order of the re patterns?

>>> import re
>>> t = 'ch 1. is\nch 23. is\nch 4 is\nch 56 is\n'
>>> re.compile(r'(ch +\d+\.)|(ch +\d+)', re.M).findall(t)
[('ch 1.', ''), ('ch 23.', ''), ('', 'ch 4'), ('', 'ch 56')]
>>> re.compile(r'(ch +\d+)|(ch +\d+\.)', re.M).findall(t)
[('ch 1', ''), ('ch 23', ''), ('ch 4', ''), ('ch 56', '')]

--Jach
-- 
https://mail.python.org/mailman/listinfo/python-list
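
P.S. For reference, a minimal sketch of why the order matters and of one way to avoid depending on it. Python's re tries the alternatives of `|` from left to right and stops at the first one that matches; it does not look for the longest one. The negative lookahead `(?![\d.])` below is only one possible approach, not something proposed in this thread, and the variable names are made up for illustration.

```python
import re

text = 'ch 1. is\nch 23. is\nch 4 is\nch 56 is\n'

# Alternation is ordered: the first alternative that matches wins,
# so putting the shorter pattern first hides the dotted form.
dotted_first = re.findall(r'(ch +\d+\.)|(ch +\d+)', text)
plain_first = re.findall(r'(ch +\d+)|(ch +\d+\.)', text)

# Order-independent variant (an assumption, not from the thread):
# two separate patterns, with a negative lookahead that rejects a match
# followed by another digit or a dot. The digit check is what stops
# backtracking from returning 'ch 2' for 'ch 23.'.
with_dot = re.findall(r'ch +\d+\.', text)
without_dot = re.findall(r'ch +\d+(?![\d.])', text)

print(dotted_first)
print(plain_first)
print(with_dot)
print(without_dot)
```

With the lookahead, each list is built by its own pattern, so neither result depends on where the other pattern sits.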