Take a look at Apache Solr (http://lucene.apache.org/solr/). You can find Python clients for Solr here: http://wiki.apache.org/solr/SolPython
For a quick solution, add all your products to Solr with the product name as an indexed field. Add the *product id* as well if you plan on storing the specs elsewhere. Add all the spec names to the list of stopwords when creating your Solr index. Run the user's query against the index and you should get a list of products ranked by a match score between the query and the stored product names.

Then remove the product name from the user's query and match what remains against the spec the user is looking for. You can use regular expressions or https://code.google.com/p/esmre/ for this.

You should be able to account for spelling errors in the user's query with a little more work on the Solr side of things. Solr will also open up use cases where the user wants a list of phones which weigh 200g and cost <10k.

Regards,
Deepu

On Sun, Sep 8, 2013 at 12:04 PM, Gopalakrishnan Subramani <gopalakrishnan.subram...@gmail.com> wrote:

> I have database of specs in json format. This is not manual effort.
>
> Right now, NLTK seems to be hard to me. I will try a plain Python wrappers
> based on word match, approach NLTK later.
>
> Thanks.
>
> On Sun, Sep 8, 2013 at 11:29 AM, harish badrinath <harishbadrin...@gmail.com> wrote:
>
> > Hello,
> >
> > On Sun, Sep 8, 2013 at 2:34 AM, Gopalakrishnan Subramani <gopalakrishnan.subram...@gmail.com> wrote:
> >
> > > Dear All,
> > >
> > > I want to build a simple automatic text based chat bot for mobile, tablet
> > > specs for proof of concept.
> >
> > How do you plan to preseed the knowledge for the application (manually or
> > information extraction through webpages,etc).
> >
> > > The question is, when the user talks about "Samsung Galaxy S3 Weight",
> > > "Galaxy SIII Weight", can NLTK predict a product (ex: Galaxy SIII) and give
> > > me the unique _id of the product for further look up for group/attribute
> > > like weight?
> >
> > If it is manually enter the knowledge then nltk should not be required (
> > something like yacc plus a good database schema should suffice, again
> > depends on the type of input language you plan to support).
> >
> > Warm regards,
> > Harish Badrinath
> > _______________________________________________
> > BangPypers mailing list
> > BangPypers@python.org
> > https://mail.python.org/mailman/listinfo/bangpypers
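
P.S. Here is a rough sketch of the two matching steps from my reply at the top (find the product, then strip its name and look for the spec). It is plain Python with no Solr involved: the token-overlap score is only a crude stand-in for Solr's match score, and the `PRODUCTS` catalogue, the `SPEC_NAMES` list and all the function names are made up for illustration. Treat it as a starting point under those assumptions, not a finished design.

```python
import re

# Hypothetical in-memory catalogue (product id -> product name). In the setup
# described above, Solr would hold the names and return the product id.
PRODUCTS = {
    "p1": "Samsung Galaxy S3",
    "p2": "Nokia Lumia 920",
}

# Hypothetical spec names; these double as "stopwords" for the product match.
SPEC_NAMES = ["weight", "price", "screen size"]

SPEC_RE = re.compile(
    r"\b(" + "|".join(re.escape(s) for s in SPEC_NAMES) + r")\b",
    re.IGNORECASE,
)


def tokens(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def rank_products(query):
    """Crude stand-in for a Solr match score: count shared tokens."""
    # Drop spec names first so they do not influence the product match.
    query_tokens = set(tokens(SPEC_RE.sub(" ", query)))
    scored = [
        (len(query_tokens & set(tokens(name))), pid)
        for pid, name in PRODUCTS.items()
    ]
    # Best score first; products sharing no token with the query are dropped.
    return [pid for score, pid in sorted(scored, reverse=True) if score > 0]


def find_spec(query, product_id):
    """Remove the product-name tokens, then look for a known spec name."""
    name_tokens = set(tokens(PRODUCTS[product_id]))
    rest = " ".join(t for t in tokens(query) if t not in name_tokens)
    match = SPEC_RE.search(rest)
    return match.group(1).lower() if match else None


def answer(query):
    """Return (product_id, spec_name), or None if no product matches."""
    ranked = rank_products(query)
    if not ranked:
        return None
    best = ranked[0]
    return best, find_spec(query, best)
```

Calling `answer("Galaxy SIII Weight")` picks `p1` only because "galaxy" is the one shared token. Telling an S3 from an S4, or coping with misspellings, is where Solr's analyzers and scoring would have to take over from this toy ranking.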