Hello, I'm currently evaluating solutions to index the contents of ~1TB of SEC (Securities and Exchange Commission) documents. File sizes range from a few KB to a couple hundred KB. I started with Riak because ease of setting up and ease of expanding a cluster are primary requirements (ElasticSearch and Solr will probably be evaluated as well).
Below I have a few specific questions that I was hoping people could help with:

* In going through the search querying documentation, I haven't found a way to extract a section of a result containing matches. Something similar to Google's search results page where you see an excerpt of the webpage contents that match your query. Is something like this built-in so that it doesn't have to be done by the application?

* Given that the documents total ~1TB of storage (not including the generated indexes), does something like decreasing the n_val make sense? Mostly the documents are bulk inserted on a daily or weekly basis – other than that all of the operations are read-only.

Other than these specific questions, if anyone can provide general insight on issues that would arise from a dataset like this within Riak, please feel free to mention them.

Thanks,

--
Hector

_______________________________________________
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
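
To make the n_val question above concrete, here is a rough back-of-the-envelope sketch of the raw storage involved. It assumes each object is simply stored n_val times across the cluster (Riak's default n_val is 3) and ignores search indexes and backend overhead, so the real footprint would be higher. The function name is hypothetical and only for illustration.

```python
def replicated_storage_tb(dataset_tb: float, n_val: int) -> float:
    """Estimate raw cluster storage if every object is kept n_val times.

    Assumption: footprint = dataset size * n_val. Index size and
    storage-backend overhead are deliberately left out.
    """
    if n_val < 1:
        raise ValueError("n_val must be at least 1")
    return dataset_tb * n_val


if __name__ == "__main__":
    dataset_tb = 1.0  # ~1TB of documents, as stated in the post
    for n_val in (3, 2, 1):
        total = replicated_storage_tb(dataset_tb, n_val)
        print(f"n_val={n_val}: ~{total:.0f} TB before indexes")
```

Lowering n_val reduces disk usage proportionally, but it also reduces the number of replicas available for reads and for surviving node failures, which is the trade-off the question is really about.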