Re: Using POS payloads for chunking

2017-06-15 Thread José Tomás Atria
find chunks of multiple POS-tags. This would be the first approach I can think of. Treating them as regular tokens enables you to use regular search for them. Regards, Markus

Re: Using POS payloads for chunking

2017-06-15 Thread Erick Erickson
unks of multiple POS-tags. This would be the first approach I can think of. Treating them as regular tokens enables you to use regular search for them. Regards, Markus -Original message-

Re: Using POS payloads for chunking

2017-06-15 Thread José Tomás Atria
inal message- From: José Tomás Atria Sent: Wednesday 14th June 2017 22:29 To: java-user@lucene.apache.org Subject: Using POS payloads for chunking Hello! I'm not particularly familiar with Lucene's search

RE: Using POS payloads for chunking

2017-06-14 Thread Markus Jelsma
encode. Thanks! Markus -Original message- From: Erick Erickson Sent: Wednesday 14th June 2017 23:29 To: java-user Subject: Re: Using POS payloads for chunking

Re: Using POS payloads for chunking

2017-06-14 Thread Tommaso Teofili
s. Payloads are versatile! The downside of payloads is that they are limited to 8 bits. Although we can easily fit our reduced treebank in there, we also use single bits to signal for compound/subword, and stemmed/unstemmed and some others.
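A minimal sketch of the kind of bit packing described in this message, assuming a one-byte payload. The layout (5 bits for a tag id, one bit each for compound and stemmed), the class `PosPayload` and all of its names are hypothetical and are not taken from the thread. How the byte is attached to a token in Lucene is left out.

```java
final class PosPayload {
    // Hypothetical layout: the low 5 bits hold a tag id (0..31),
    // the next two bits are boolean flags.
    static final int TAG_MASK = 0x1F;
    static final int FLAG_COMPOUND = 1 << 5;
    static final int FLAG_STEMMED = 1 << 6;

    // Pack a tag id and two flags into a single byte.
    static byte encode(int tagId, boolean compound, boolean stemmed) {
        if (tagId < 0 || tagId > TAG_MASK) {
            throw new IllegalArgumentException("tag id out of range: " + tagId);
        }
        int bits = tagId;
        if (compound) {
            bits |= FLAG_COMPOUND;
        }
        if (stemmed) {
            bits |= FLAG_STEMMED;
        }
        return (byte) bits;
    }

    // Read the individual fields back out of the packed byte.
    static int tagId(byte payload) {
        return payload & TAG_MASK;
    }

    static boolean isCompound(byte payload) {
        return (payload & FLAG_COMPOUND) != 0;
    }

    static boolean isStemmed(byte payload) {
        return (payload & FLAG_STEMMED) != 0;
    }

    public static void main(String[] args) {
        byte payload = encode(7, false, true);
        System.out.println(tagId(payload) + " " + isCompound(payload) + " " + isStemmed(payload));
    }
}
```

With this layout one bit of the byte remains free for a further flag; a larger tag set would need more than one byte.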

RE: Using POS payloads for chunking

2017-06-14 Thread Markus Jelsma
-Original message- From: Erick Erickson Sent: Wednesday 14th June 2017 23:29 To: java-user Subject: Re: Using POS payloads for chunking Markus: I don't believe that payloads are limited in size at all. LUCENE-7705 was done in part because there

Re: Using POS payloads for chunking

2017-06-14 Thread Erick Erickson
ds, Markus -Original message- From: Erik Hatcher Sent: Wednesday 14th June 2017 23:03 To: java-user@lucene.apache.org Subject: Re: Using POS payloads for chunking Markus - how are you encoding payloads as bitsets and use them for scori

RE: Using POS payloads for chunking

2017-06-14 Thread Markus Jelsma
23:03 To: java-user@lucene.apache.org Subject: Re: Using POS payloads for chunking Markus - how are you encoding payloads as bitsets and use them for scoring? Curious to see how folks are leveraging them. Erik On Jun 14, 2017, at 4:45 PM, Mar

Re: Using POS payloads for chunking

2017-06-14 Thread Erik Hatcher
June 2017 22:29 To: java-user@lucene.apache.org Subject: Using POS payloads for chunking Hello! I'm not particularly familiar with Lucene's search API (as I've been using the library mostly as a dumb index rather than

RE: Using POS payloads for chunking

2017-06-14 Thread Markus Jelsma
h June 2017 22:29 To: java-user@lucene.apache.org Subject: Using POS payloads for chunking Hello! I'm not particularly familiar with Lucene's search API (as I've been using the library mostly as a dumb index rather than a search engine), but I a

Using POS payloads for chunking

2017-06-14 Thread José Tomás Atria
Hello! I'm not particularly familiar with Lucene's search API (as I've been using the library mostly as a dumb index rather than a search engine), but I am almost certain that, using its payload capabilities, it would be trivial to implement a regular chunker to look for patterns in sequences of p
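
For illustration, the general idea of a regular chunker over POS tags can be sketched in plain Java without touching Lucene at all. This is only a sketch under assumptions: the tag set, the one-character codes, the noun-phrase pattern and the class `PosChunker` are all hypothetical, and nothing here reflects how payloads are read from an index.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class PosChunker {
    // Hypothetical reduced tag set: each POS tag maps to one character,
    // so a tag sequence becomes a string a regular expression can scan.
    private static final Map<String, Character> TAG_CODES = Map.of(
            "DT", 'D', "JJ", 'J', "NN", 'N', "VB", 'V', "IN", 'P');

    // Assumed noun-phrase pattern: optional determiner, any number of
    // adjectives, then one or more nouns.
    private static final Pattern NOUN_PHRASE = Pattern.compile("D?J*N+");

    // Returns [start, end) token index pairs for every matching chunk.
    static List<int[]> chunk(List<String> tags) {
        StringBuilder encoded = new StringBuilder(tags.size());
        for (String tag : tags) {
            char code = TAG_CODES.getOrDefault(tag, 'X'); // 'X' = unknown tag
            encoded.append(code);
        }
        List<int[]> spans = new ArrayList<>();
        Matcher matcher = NOUN_PHRASE.matcher(encoded);
        while (matcher.find()) {
            // One character per token, so match offsets are token indices.
            spans.add(new int[] {matcher.start(), matcher.end()});
        }
        return spans;
    }

    public static void main(String[] args) {
        List<String> tokens = List.of("the", "quick", "fox", "jumps", "over", "the", "dog");
        List<String> tags = List.of("DT", "JJ", "NN", "VB", "IN", "DT", "NN");
        for (int[] span : chunk(tags)) {
            System.out.println(String.join(" ", tokens.subList(span[0], span[1])));
        }
    }
}
```

Encoding one character per token keeps the regular expression offsets aligned with token positions, which is what makes the spans directly usable as chunks.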