Re: Indexing product keys with and without spaces in them

2012-01-03 Thread Christoph Kaser
Unfortunately, we don't have a designated field for product identifiers, and the product identifiers are from various manufacturers. So it is hard to normalize product keys, as we can't distinguish them from other parts of the document. Examples are xbox 360 (which might be searched as xbox360)

Re: Indexing product keys with and without spaces in them

2012-01-03 Thread Ian Lea
My suggestion wasn't to store/index the triplets, just a normalized version of the product key. So if you had
  id: CRXUSB2.0-16GB
  desc: some 16GB USB thing
you'd index, in your searchable words field, "CRXUSB2.016GB some 16GB USB thing". And then at search time you'd take "CRX USB2.0-16G" and nor
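A minimal sketch of the normalization described in this message, assuming "normalized" means only that whitespace and hyphens are removed. The class and method names are hypothetical, plain Java is used rather than any Lucene API, and case folding is left to whatever analyzer is already in place.

```java
public class ProductKeyNormalizer {

    // Remove all whitespace and hyphens; every other character
    // (letters, digits, dots) is kept unchanged.
    // Assumption: this is the whole normalization; the thread does not
    // specify any further rules.
    static String normalize(String key) {
        return key.replaceAll("[\\s-]+", "");
    }

    public static void main(String[] args) {
        // Index time: the stored product id.
        System.out.println(normalize("CRXUSB2.0-16GB"));
        // Search time: the same rule applied to what the user typed.
        System.out.println(normalize("CRX USB2.0-16G"));
    }
}
```

Applying the same function at index time and at search time is what lets differently spaced or hyphenated spellings of a key meet on one term.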

RE: Indexing product keys with and without spaces in them

2012-01-03 Thread Uwe Schindler
Hi,
> Has somebody ever tried something like this? Is there a way to do this without increasing the index to about 15 times (1+2+3+4+5) its original size?
The index will not be 15 times the size, as it is an inverted index and only indexes the unique parts of your tokens. In most cases it will ha

RE: Indexing product keys with and without spaces in them

2012-01-03 Thread Uwe Schindler
java-user@lucene.apache.org
> Subject: Re: Indexing product keys with and without spaces in them
>
> Hi Aditya,
>
> Thank you for your suggestion!
> Unfortunately, this is not possible, as there is no common format for all product keys. The products are not ours nor are they al

Re: Indexing product keys with and without spaces in them

2012-01-03 Thread Christoph Kaser
Hi Ian, thank you for your reply. Unfortunately this will be hard, as we have no way of knowing at which position the user might enter spaces, so we cannot expand the product keys at indexing time. The other way round (triplets without spaces or hyphens) might work, however we have no real

Re: Indexing product keys with and without spaces in them

2012-01-03 Thread Christoph Kaser
Hi Aditya, Thank you for your suggestion! Unfortunately, this is not possible, as there is no common format for all product keys. The products are not ours nor are they all from the same manufacturer, so we don't have any influence on what the product keys look like. Regards, Christoph On 03

Re: Indexing product keys with and without spaces in them

2012-01-03 Thread findbestopensource
Hi Christoph, my opinion is that you should not normalize or otherwise modify the product keys. They should be unique and should be used as they are. Instead of spaces you should have used only "-", but since the products are already out in the market, that cannot help. In your UI, you could provide multiple

Re: Indexing product keys with and without spaces in them

2012-01-03 Thread Ian Lea
When indexing you could normalise them down to a standard format without spaces or hyphens, but searching is much harder if you really can't identify possible product ids within user queries. Make triplets without spaces or hyphens? "CRX USB-2.0 16GB" ==> CRXUSB2.016GB but also "some random words
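The triplet idea can be sketched as follows, under the assumption that a "triplet" is three consecutive whitespace-separated words joined together with hyphens removed. The class and method names are hypothetical and this is plain Java, not a Lucene filter.

```java
import java.util.ArrayList;
import java.util.List;

public class QueryTriplets {

    // Slide a window of three words over the text and emit each window
    // as one token, joined without spaces and with hyphens stripped.
    static List<String> joinedTriplets(String text) {
        String[] words = text.trim().split("\\s+");
        List<String> triplets = new ArrayList<>();
        for (int i = 0; i + 3 <= words.length; i++) {
            String joined = words[i] + words[i + 1] + words[i + 2];
            triplets.add(joined.replace("-", ""));
        }
        return triplets;
    }

    public static void main(String[] args) {
        System.out.println(joinedTriplets("CRX USB-2.0 16GB"));
        System.out.println(joinedTriplets("some random words CRX USB-2.0 16GB"));
    }
}
```

Note that ordinary word sequences produce triplets as well, so most of the generated tokens will never match a product key.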

Indexing product keys with and without spaces in them

2012-01-03 Thread Christoph Kaser
Hello, we use Lucene as the search engine in an online shop. The products in this shop often contain product keys like CRXUSB2.0-16GB. We would like our customers to be able to find products by entering their key. The problem is that product keys sometimes contain spaces or dashes and customers so