That's honestly closer to what I was originally envisioning--I've never 
really looked into graph dbs before, but I'll check out Neo4j tonight. Do 
you know whether you can model multiple edges between the same nodes? I'd 
love to be able to have POS-based wildcarding as a feature, so you could 
search for e.g. "the ADJ goose", but that's a whole other layer of stuff, 
so it might go in the "eventually, maybe" pile.
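
For what it's worth, the wildcard idea can be sketched without any graph database. The snippet below is only an illustration under loud assumptions: tokens arrive already tagged as (word, POS) pairs, the tag names (DET, ADJ, NOUN, ...) are made up for the example, and an all-uppercase pattern item is treated as a POS wildcard. The function name `find_matches` is hypothetical.

```python
def find_matches(pattern, tagged_tokens):
    """Return the word sequences in tagged_tokens that match pattern.

    Assumption: an all-uppercase pattern item (e.g. "ADJ") is a POS
    wildcard; any other item must equal the word itself.
    """
    n = len(pattern)
    if n == 0:
        return []
    hits = []
    # Slide a window of the pattern's length across the token list.
    for start in range(len(tagged_tokens) - n + 1):
        window = tagged_tokens[start:start + n]
        if all(
            (pos == item) if item.isupper() else (word == item)
            for item, (word, pos) in zip(pattern, window)
        ):
            hits.append([word for word, _ in window])
    return hits


# Hand-tagged example data; a real pipeline would need a POS tagger.
tagged = [
    ("the", "DET"), ("grey", "ADJ"), ("goose", "NOUN"), ("chased", "VERB"),
    ("the", "DET"), ("white", "ADJ"), ("goose", "NOUN"), ("and", "CONJ"),
    ("the", "DET"), ("goose", "NOUN"),
]

for hit in find_matches(["the", "ADJ", "goose"], tagged):
    print(" ".join(hit))
```

Here "the ADJ goose" matches "the grey goose" and "the white goose" but not the bare "the goose" at the end, since the wildcard has to consume exactly one token.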



On Tuesday, March 10, 2015 at 3:47:37 PM UTC-4, Ray Miller wrote:
>
> On 10 March 2015 at 17:58, Sam Raker <sam....@gmail.com> wrote: 
> > I more meant deciding on a maximum size and storing them qua ngrams--it 
> > seems limiting. On the other hand, after a certain size, they stop being 
> > ngrams and start being something else--"texts," possibly. 
>
> Exactly. When I first read your post, I almost suggested you model 
> this in a graph database like Neo4j or Titan. Each word would be a 
> node in the graph with an edge linking it to the next word in the 
> sentence. You could define an index on the words (so retrieving all 
> nodes for a given word would be fast), then follow edges to find and 
> count particular n-grams. This is more complicated than the relational 
> model I proposed, and will be a bit slower to query. But if you don't 
> want to put an upper-bound on the length of the n-gram when you index 
> the data, it might be the way to go. 
>
> Ray. 
>
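
The graph model Ray describes can be sketched in a few lines. This is a toy in-memory stand-in, not Neo4j or Titan, and every name in it (`WordGraph`, `add_sentence`, `count_ngram`) is hypothetical. It assumes one node per word occurrence, a single "next" edge from each node to the following word in the same sentence, and a plain dictionary acting as the index on words.

```python
from collections import defaultdict


class WordGraph:
    """Toy in-memory word graph; a stand-in for a real graph database."""

    def __init__(self):
        self.words = []                 # node id -> word
        self.next_node = {}             # node id -> id of the following node
        self.index = defaultdict(list)  # word -> ids of all nodes for that word

    def add_sentence(self, sentence):
        """Add one node per word and link each node to the next word."""
        prev = None
        for word in sentence.lower().split():
            node = len(self.words)
            self.words.append(word)
            self.index[word].append(node)
            if prev is not None:
                self.next_node[prev] = node
            prev = node

    def count_ngram(self, ngram):
        """Count occurrences of an n-gram of any length by following edges."""
        if not ngram:
            return 0
        count = 0
        # Use the index to find every node for the first word, then walk.
        for node in self.index.get(ngram[0], []):
            matched = True
            for word in ngram[1:]:
                node = self.next_node.get(node)
                if node is None or self.words[node] != word:
                    matched = False
                    break
            if matched:
                count += 1
        return count


graph = WordGraph()
graph.add_sentence("the grey goose chased the white goose")
graph.add_sentence("the goose slept")

print(graph.count_ngram(["the", "goose"]))
print(graph.count_ngram(["the", "grey", "goose"]))
```

Nothing in this sketch fixes a maximum n-gram length at indexing time; the cost is that each query walks edges instead of doing a single lookup, which is the trade-off mentioned above.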

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en