I'm not sure I follow what you mean. Both regexes posted here preserve
the punctuation. Here is mine (ignore the names; it is in fact the
same regex):
hotel_nlp.concretions.artefacts=> (pprint (hotel_nlp.protocols/run reg-seg
"Statistics is closely related to probability theory, with which it is
often grouped. The difference is, roughly, that probability theory
starts from the given parameters of a total population to deduce
probabilities that pertain to samples. Statistical inference, however,
moves in the opposite direction---inductively inferring from samples to
the parameters of a larger or total population!"))
["Statistics is closely related to probability theory, with which it is
often grouped."
"The difference is, roughly, that probability theory starts from the
given parameters of a total population to deduce probabilities that
pertain to samples."
"Statistical inference, however, moves in the opposite
direction---inductively inferring from samples to the parameters of a
larger or total population!"]
A similar thing happens with Lars's simpler regex: just use 're-seq'
instead of 'split'.
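
To make the split vs. re-seq point concrete, here is a minimal sketch.
The sample text and both patterns are assumptions for illustration only;
they are not Lars's regex or the abbreviation-aware one quoted below.

```clojure
(require '[clojure.string :as str])

;; Hypothetical sample text, not taken from the thread.
(def text "It rained all day. Did you go out? No way!")

;; split: the delimiter is whitespace that is preceded by ., ! or ?.
;; The lookbehind is zero-width, so only the whitespace is consumed
;; and each sentence keeps its terminating punctuation.
(def by-split (str/split text #"(?<=[.!?])\s+"))

;; re-seq: match the sentences themselves, i.e. a run of
;; non-terminators followed by one or more terminators.
;; Matches after the first start with the separating space, hence trim.
(def by-re-seq (map str/trim (re-seq #"[^.!?]+[.!?]+" text)))

(doseq [s by-split] (println s))
```

Both approaches give the same three sentences on this input. The
difference is only whether the pattern describes the gap between
sentences (split) or the sentences themselves (re-seq).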
Jim
On 06/07/13 22:05, Denis Papathanasiou wrote:
On Saturday, July 6, 2013 1:54:49 PM UTC-4, Jim foo.bar wrote:
I usually use this regex. It's been a while since I last used it, so
I don't remember how it performs...
#"(?<=[.!?]|[.!?][\\'\"])(?<!e\.g\.|i\.e\.|vs\.|p\.m\.|a\.m\.|Mr\.|Mrs\.|Ms\.|St\.|Fig\.|fig\.|Jr\.|Dr\.|Prof\.|Sr\.|[A-Z]\.)\s+")
and as Lars said all you need is clojure.string/split
Thanks, though as I replied to Lars, I did want to preserve the actual
terminating punctuation, whatever it was, so that's why I'd looked into
using partition-by.
Also, sorry for the double post (I didn't realize this group was
moderated, so when I didn't see the first post appear, I re-submitted
it a little while later).
--
--
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient
with your first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
---
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To unsubscribe from this group and stop receiving emails from it, send
an email to clojure+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.