Hi Nikita,

Speaking only for myself here: maybe explain in plain English what this library does and what problem it solves? I had to look up the paper (ha! a known item!): http://www.cs.cmu.edu/~callan/Papers/sigir03-pto.pdf (add it to the README so others don't have to search?)

To make it easy to add this to Lucene, you should:
* use and include ASL
* include ASL snippet in each Java class
* switch to Java for tests
* move to org.apache.lucene...

HTH,
Otis
--
Solr & ElasticSearch Support -- http://sematext.com/

On Fri, Jun 14, 2013 at 7:43 PM, Nikita Zhiltsov <[email protected]> wrote:
> Hi all,
>
> I've just published a tiny extension to Lucene 4.0, which enables a mixture
> of language models using standard FunctionQuery and ValueSource classes:
> https://github.com/nzhiltsov/lucene-mlm
>
> I'd like you to assess the possibility of integrating this code into Lucene.
> Appreciate any comments or fixes.
>
> NB. The implementation avoids using LMSimilarity per field basis, because it
> would break the computation of correct Dirichlet priors for non-matched
> terms, which the standard class LMSimilarity fails to include while
> calculating term frequencies and treats them as zero probability entries.
>
> --
>
> Nikita Zhiltsov
>
> Visiting Graduate Student
> Emory University
> Intelligent Information Access Lab
> E500 Emerson Hall, Atlanta, Georgia, USA
> Phone: (404) 834-5364
> E-mail: [email protected]
>
>
> ---------------------------------------------------------------------
> Graduate Student, Research Fellow
> Kazan Federal University
> Computational Linguistics Laboratory
> Russia, 420008
> Kazan, Prof. Nuzhina Str., 1/37 room 117
> Skype: nickita.jhiltsov
> Personal page: http://cll.niimm.ksu.ru/~nzhiltsov
> E-mail: [email protected]
>
> ---------------------------------------------------------------------
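
For anyone else reading along, the NB in the quoted message can be illustrated with a small sketch. This is not code from lucene-mlm and it uses no Lucene classes; every name below (MixtureLmSketch, FieldStats, dirichlet, logScore) is hypothetical. It assumes the usual Dirichlet-smoothed estimate P(t|field) = (tf + mu * P(t|C)) / (len + mu) and a weighted sum of per-field models for each query term, which is my reading of the mixture described in the paper linked above. The point it shows: a field that does not contain the term (tf = 0) still contributes its prior mass to the mixture instead of zero.

```java
import java.util.Map;

/** Illustrative only: all names are hypothetical, not taken from lucene-mlm or Lucene. */
final class MixtureLmSketch {

    /** Per-field statistics for one document (hypothetical container). */
    static final class FieldStats {
        final double weight;                 // mixture weight of this field (weights should sum to 1)
        final double mu;                     // Dirichlet prior parameter for this field
        final long length;                   // number of tokens in this field of the document
        final Map<String, Integer> termFreq; // term -> frequency in this field of the document
        final Map<String, Double> collProb;  // term -> P(term | collection) for this field

        FieldStats(double weight, double mu, long length,
                   Map<String, Integer> termFreq, Map<String, Double> collProb) {
            this.weight = weight;
            this.mu = mu;
            this.length = length;
            this.termFreq = termFreq;
            this.collProb = collProb;
        }
    }

    /** Dirichlet-smoothed term probability; stays positive when tf == 0 and collProb > 0. */
    static double dirichlet(long tf, long length, double collProb, double mu) {
        return (tf + mu * collProb) / (length + mu);
    }

    /** log P(q|d): for each query term, mix the smoothed per-field probabilities, then take the log. */
    static double logScore(String[] queryTerms, FieldStats[] fields) {
        double score = 0.0;
        for (String term : queryTerms) {
            double mixture = 0.0;
            for (FieldStats f : fields) {
                // A missing term is scored with tf = 0, not skipped.
                int tf = f.termFreq.getOrDefault(term, 0);
                double pc = f.collProb.getOrDefault(term, 0.0);
                mixture += f.weight * dirichlet(tf, f.length, pc, f.mu);
            }
            score += Math.log(mixture);
        }
        return score;
    }
}
```

If the non-matched fields were dropped or scored as zero, the per-term mixture would lose that prior mass and the document ranking could change, which is how I understand the concern raised in the NB.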
