On Nov 30, 5:24 pm, MRAB <pyt...@mrabarnett.plus.com> wrote:
> Jeremy wrote:
> > I am using re.split to... well, split a string into sections. I want
> > to split when, following a new line, there are 4 or fewer spaces. The
> > pattern I use is:
> >
> >     sections = re.split('\n\s{,4}[^\s]', lineoftext)
> >
> > This splits appropriately but I lose the character matched by [^\s]. I
> > know I can put parentheses around [^\s] and keep the matched character,
> > but the character is placed in its own element of the list instead of
> > with the rest of the lineoftext.
> >
> > Does anyone know how I can accomplish this without losing the matched
> > character?
>
> First of all, \s matches any character that's _whitespace_, such as
> space, "\t", "\n", "\r", "\f". There's also \S, which matches any
> character that's not whitespace.

Thanks for the reminder. I knew \S existed, but must have forgotten
about it.

> But in answer to your question, use a look-ahead:
>
>     sections = re.split('\n {,4}(?=\S)', lineoftext)

Yep, that does the trick. Thanks for the help!

Jeremy
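
For anyone finding this thread in the archive, here is a minimal sketch comparing the two patterns discussed above. The sample text and the names `lossy` and `kept` are made up for illustration and do not come from the thread. The patterns are written as raw strings, which the original posts did not do, so that `\s` and `\S` reach the regex engine untouched.

```python
import re

# Hypothetical sample text (an assumption, not from the thread): the second
# line is indented by 8 spaces, so it should stay attached to the first section.
lineoftext = "Section one\n        continuation\n  Section two\nSection three"

# Original pattern: [^\s] consumes the first character of each new section,
# so that character is lost from the split result.
lossy = re.split(r'\n\s{,4}[^\s]', lineoftext)

# Look-ahead pattern: (?=\S) only checks that a non-whitespace character
# follows; it matches no text, so the character stays in the next section.
kept = re.split(r'\n {,4}(?=\S)', lineoftext)

print(lossy)  # ['Section one\n        continuation', 'ection two', 'ection three']
print(kept)   # ['Section one\n        continuation', 'Section two', 'Section three']
```

The look-ahead version also swaps `\s{,4}` for a literal space followed by `{,4}`, so only spaces count as indentation. With `\s`, tabs and further newlines would count too.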