I'm not sure I follow what you mean...both regexes posted here preserve the punctuation...here is mine (ignore the names - it is in fact the same regex):

hotel_nlp.concretions.artefacts=> (pprint (hotel_nlp.protocols/run reg-seg
  "Statistics is closely related to probability theory, with which it is often grouped. The difference is, roughly, that probability theory starts from the given parameters of a total population to deduce probabilities that pertain to samples. Statistical inference, however, moves in the opposite direction---inductively inferring from samples to the parameters of a larger or total population!"))

["Statistics is closely related to probability theory, with which it is often grouped." "The difference is, roughly, that probability theory starts from the given parameters of a total population to deduce probabilities that pertain to samples." "Statistical inference, however, moves in the opposite direction---inductively inferring from samples to the parameters of a larger or total population!"]

A similar thing happens with Lars's simpler regex...just use 're-seq' instead of 'split'
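
To make the split vs re-seq point concrete, here is a rough sketch. The boundary regex is a trimmed-down stand-in for the one quoted below (most abbreviations left out), and the re-seq pattern is only my guess at what a "simpler" sentence regex might look like, not Lars's actual one. The function names are made up for the example.

```clojure
(require '[clojure.string :as str])

;; Trimmed-down stand-in for the boundary regex quoted below.
;; Both lookbehinds are zero-width, so the only text the regex
;; actually consumes is the whitespace between sentences.
(def sentence-boundary
  #"(?<=[.!?])(?<!e\.g\.|i\.e\.|Mr\.|Dr\.)\s+")

;; Hypothetical helper: split on the boundary. The terminating
;; punctuation stays attached because it is never part of the match.
(defn split-sentences [text]
  (str/split text sentence-boundary))

;; Hypothetical helper: match whole sentences instead of boundaries.
;; This pattern is an assumption for illustration and knows nothing
;; about abbreviations.
(defn match-sentences [text]
  (map str/trim (re-seq #"[^.!?]+[.!?]+" text)))

(split-sentences "Dr. Smith arrived late. Was it raining? Yes!")
;; => ["Dr. Smith arrived late." "Was it raining?" "Yes!"]

(match-sentences "It rained. Was it cold? Yes!")
;; => ("It rained." "Was it cold?" "Yes!")
```

With 'split' the pattern describes the gap between sentences, with 're-seq' it describes the sentences themselves, so in both cases the punctuation survives.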

Jim

On 06/07/13 22:05, Denis Papathanasiou wrote:


On Saturday, July 6, 2013 1:54:49 PM UTC-4, Jim foo.bar wrote:

    I usually use this regex. It's been a while since I last used it, so
    I don't remember how it performs...

    
    #"(?<=[.!?]|[.!?][\\'\"])(?<!e\.g\.|i\.e\.|vs\.|p\.m\.|a\.m\.|Mr\.|Mrs\.|Ms\.|St\.|Fig\.|fig\.|Jr\.|Dr\.|Prof\.|Sr\.|[A-Z]\.)\s+"

    and as Lars said all you need is clojure.string/split


Thanks, though as I replied to Lars, I did want to preserve the actual terminating punctuation, whatever it was, so that's why I'd looked into using partition-by.

Also, sorry for the double post (I didn't realize this group was moderated, so when I didn't see the first post appear, I re-submitted it a little while later).
--
--
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
---
You received this message because you are subscribed to the Google Groups "Clojure" group. To unsubscribe from this group and stop receiving emails from it, send an email to clojure+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


