I'd vote for csv then.

On Jul 31, 2013, at 12:00 PM, Ted Dunning <[email protected]> wrote:

On Wed, Jul 31, 2013 at 11:20 AM, Pat Ferrel <[email protected]> wrote:
A few architectural questions: http://bit.ly/18vbbaT

I created a local instance of the LucidWorks Search on my dev machine. I can 
quite easily save the similarity vectors from the DRMs into docs at special 
locations and index them with LucidWorks. But to ingest the docs and put them 
in separate fields of the same index we need some new code (unless I've missed 
some Lucid config magic) that does the indexing and integrates with LucidWorks.

I imagine two indexes. One index for the similarity matrix and optionally the 
cross-similarity matrix in two fields of type 'string'. Another index for 
users' history--we could put the docs there for retrieval by user ID. The user 
history docs then become the query on the similarity index and would return 
recommendations. Or any realtime collected or generated history could be used 
too.
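
To make the two-index idea above concrete, here is a minimal in-memory sketch in plain Python. It does not use Lucene, Solr, or LucidWorks at all. The dictionaries stand in for the two indexes, the field names (similar_items, history) and the IDs are hypothetical, and a simple overlap count stands in for whatever relevance score the search engine would actually compute.

```python
from collections import Counter

# Stand-in for the similarity index: one doc per item. The field holds the
# IDs of similar items as whitespace-separated tokens (hypothetical layout).
similarity_index = {
    "item1": {"similar_items": "item2 item3"},
    "item2": {"similar_items": "item1 item4"},
    "item3": {"similar_items": "item1"},
    "item4": {"similar_items": "item2"},
}

# Stand-in for the user-history index: one doc per user, retrievable by ID.
user_history_index = {
    "user42": {"history": "item1 item4"},
}


def recommend(user_id, top_n=3):
    """Use the user's history doc as the query against the similarity index."""
    history = set(user_history_index[user_id]["history"].split())
    scores = Counter()
    for item_id, doc in similarity_index.items():
        if item_id in history:
            continue  # do not recommend items the user already has
        # Score = number of history items found in this doc's field.
        overlap = len(history & set(doc["similar_items"].split()))
        if overlap:
            scores[item_id] = overlap
    # Highest score first; ties broken by item ID for a stable order.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item_id for item_id, _ in ranked[:top_n]]


print(recommend("user42"))  # ['item2', 'item3']
```

Realtime history would work the same way in this sketch: pass the collected item IDs in as the query terms instead of looking them up by user ID.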

Is this what you imagined, Ted? Especially WRT Lucid integration?

Yes.  And I note in a later email that you discovered how Lucid provides lots 
of connectors for different formats.  XML is fine.  I have also used CSV.
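
As a rough illustration of the CSV option, the sketch below writes one row per item with Python's standard csv module. The column names (id, similar_items, cross_similar_items) and the sample IDs are assumptions made up for this example, not a schema that LucidWorks requires, and how the columns get mapped to index fields would depend on the connector configuration.

```python
import csv
import io

# Hypothetical docs: one per item, with similar-item IDs as space-separated tokens.
docs = [
    {"id": "item1", "similar_items": "item2 item3", "cross_similar_items": "item9"},
    {"id": "item2", "similar_items": "item1 item4", "cross_similar_items": ""},
]


def to_csv(rows):
    """Serialize the docs to CSV text with a header row naming the fields."""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf,
        fieldnames=["id", "similar_items", "cross_similar_items"],
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


csv_text = to_csv(docs)
print(csv_text)
```

Keeping each matrix in its own column means the similarity and cross-similarity vectors can land in separate fields of the same doc.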
 
