Re: Using POS payloads for chunking

2017-06-15 Thread José Tomás Atria
Ah, good to know! I'm actually using lower-level calls, as I'm building the TokenStream by hand from UIMA annotations and not using any analyzer, but I'll keep that in mind for future projects. Thanks! On Thu, Jun 15, 2017 at 12:10 PM Erick Erickson wrote: > José: > > Do note that, while the byt

Re: Using POS payloads for chunking

2017-06-15 Thread Erick Erickson
José: Do note that, while the byte array isn't limited, prior to LUCENE-7705 most of the tokenizers you would use limited the incoming token to 256 characters at most. This is not at all a _Lucene_ limitation at a low level; rather, if you're indexing data with a delimited payload (say abc|your_payload_here) t
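Erick's point concerns analysis chains that split each incoming token on a delimiter and store the suffix as a payload. As a minimal illustrative sketch (a hypothetical Solr field type, not taken from this thread), such a chain could be configured with DelimitedPayloadTokenFilterFactory:

```xml
<!-- Hypothetical field type: tokens arrive as "term|payload";
     the filter strips the suffix after '|' and stores it as a payload. -->
<fieldType name="text_payload" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- encoder="identity" stores the raw bytes; "float" and "integer"
         encoders are also available -->
    <filter class="solr.DelimitedPayloadTokenFilterFactory"
            delimiter="|" encoder="identity"/>
  </analyzer>
</fieldType>
```

Note that the delimiter-splitting happens after tokenization, which is why the pre-LUCENE-7705 tokenizer length limit Erick mentions matters: the token plus its delimited payload must survive the tokenizer intact.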

Re: Using POS payloads for chunking

2017-06-15 Thread José Tomás Atria
Hi Markus, thanks for your response! Now I feel stupid; that is clearly a much simpler approach, and it has the added benefit that it would not require me to meddle with the scoring process, which I'm still a bit terrified of. Thanks for the tip. I guess the question is still valid though? i.e. h

Re: email field - analyzed and not analyzed in single field using custom analyzer

2017-06-15 Thread Steve Rowe
Hi Kumaran, WordDelimiterGraphFilter with PRESERVE_ORIGINAL should do what you want. Here's a test I added to TestWordDelimiterGraphFilter.java that passed for me:
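Steve's suggestion can be sketched as a Solr field type (an illustrative config using the factory wrapper and a placeholder address; the thread itself used a Lucene unit test, which is elided above):

```xml
<!-- Hypothetical field type: for a token like user@example.com,
     generateWordParts="1" emits the parts (user, example, com) and
     preserveOriginal="1" also keeps the whole address as one token,
     so the same field matches both exact and analyzed queries. -->
<fieldType name="text_email" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.WordDelimiterGraphFilterFactory"
            generateWordParts="1" preserveOriginal="1"/>
  </analyzer>
</fieldType>
```

Because the filter emits both the original token and its parts at the same position, no second copy of the field is needed.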

email field - analyzed and not analyzed in single field using custom analyzer

2017-06-15 Thread Kumaran Ramasubramanian
Hi All, I want to index email fields as both analyzed and not analyzed using a custom analyzer. For example: sm...@yahoo.com will.sm...@yahoo.com That is, indexing sm...@yahoo.com as a single token as well as analyzed tokens in the same email field... My existing custom analyzer, public class Custom