Right. TextField.TYPE_NOT_STORED should be used then.
On Thu, Apr 24, 2025 at 10:37 AM Saha, Rajib
wrote:
> Thanks Mikhail for the suggestion.
> Now the previous exception has gone. But a new exception has come from
> Field.java.
> Here below are the excep
> buffer[3] = (byte)(uid >>> 24);
> PayloadAttributeImpl attributeImpl =
>     new PayloadAttributeImpl(new BytesRef(buffer));
> addAttributeImpl(attributeImpl);
> returnToken = true;
> }
> public boolean incrementToken() throws IOException {
>     if (returnToken) {
>         returnToken = false;
>         return true;
>     } else {
>         return false;
>     }
> }
> }
>
> Regards
> Rajib
>
>
--
Sincerely yours
Mikhail Khludnev
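The quoted PayloadTokenStream code above stores a uid in a 4-byte payload buffer. A minimal plain-Java sketch of that packing follows. Only the `buffer[3] = (byte)(uid >>> 24)` assignment is visible in the quoted code, so the layout of the lower three bytes, the class name `UidPayload`, and the `unpack` helper are assumptions made for illustration; no Lucene classes are used.

```java
import java.util.Arrays;

final class UidPayload {
    // Pack a 32-bit uid into 4 bytes, least significant byte first.
    // Assumption: bytes 0..2 follow the same pattern as the visible
    // buffer[3] = (byte)(uid >>> 24) assignment in the quoted code.
    static byte[] pack(int uid) {
        byte[] buffer = new byte[4];
        buffer[0] = (byte) uid;
        buffer[1] = (byte) (uid >>> 8);
        buffer[2] = (byte) (uid >>> 16);
        buffer[3] = (byte) (uid >>> 24);
        return buffer;
    }

    // Hypothetical inverse: rebuild the uid from the 4 payload bytes.
    static int unpack(byte[] buffer) {
        return (buffer[0] & 0xFF)
                | (buffer[1] & 0xFF) << 8
                | (buffer[2] & 0xFF) << 16
                | (buffer[3] & 0xFF) << 24;
    }

    public static void main(String[] args) {
        int uid = 0x12345678;
        byte[] packed = pack(uid);
        System.out.println(Arrays.toString(packed));
        System.out.println(unpack(packed) == uid);
    }
}
```

Masking with `0xFF` on the way back matters because Java bytes are signed; without it, a byte such as 0x80 would sign-extend and corrupt the higher bits.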
include both "licence" and "license"), but the phrase
> substitutions are not. "http", "proxy" and "server " are there, but none of
> the conjunctions appear.
>
>
>
> I don't think synonym replacement should be occurring at search time, if
> only for performance reasons, but what have I missed in how this should
> work? Am I chasing the impossible dream?
>
>
>
> cheers
>
> T
>
>
>
>
>
>
--
Sincerely yours
Mikhail Khludnev
I cannot rule out spelling mistakes
> > On 02.03.2025 at 08:18, Mikhail Khludnev wrote:
> >
> > Hi Daniel.
> > Given >Lucene41<, my bet is that it was written by a 4.1..4.9 version.
> > Presumably you may get 4.9 (a decade old, heh) and invok
th the same version as the files above?
>
> Cheers.
>
> -
> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java-user-h...@lucene.apache.org
>
>
--
Sincerely yours
Mikhail Khludnev
t; 500), so I think this strategy is OK, but I don't like the
> idea of having "dynamic fields".
>
> Given the intersection query requirement, is there a better way to model
> the index, aside from creating multiple documents per Root entry?
>
> Regards
>
--
Sincerely yours
Mikhail Khludnev
ucene-analyzers-common-4.7.0.jar,
> lucene-queries-4.7.0.jar, lucene-queryparser-4.7.0.jar,
> lucene-sandbox-4.7.0.jar. When Lucene core is upgraded, is it recommended to
> upgrade all these jars?
>
>
>
> Regards,
>
> Lavanya
> --
> Lavanya
> Give out ' What you want most' to come back
>
--
Sincerely yours
Mikhail Khludnev
> Cosmos
> > > DB?
> > >
> > > *Thanks and Regards,*
> > > *Ashwini Singh*
> > >
> >
>
>
> --
>
>
> *Thanks and Regards,*
> *Ashwini Singh*
>
--
Sincerely yours
Mikhail Khludnev
Thanks for the clarification, Michael!
On Tue, Dec 3, 2024 at 1:56 PM Michael Sokolov wrote:
> Sparse means two different things here. In the case you found, Mikhail,
> it means not every document has a value for some vector field. I think the
> question here is about very high di
.
On Mon, Dec 2, 2024 at 8:03 PM Viacheslav Dobrynin
wrote:
> Hi!
>
> I need to index sparse vectors, whereas as I understand it,
> KnnFloatVectorField is designed for dense vectors.
> Therefore, it seems that this approach will not work.
>
> On Sun, Dec 1, 2024 at 18:36, Mikhai
Hi,
Might it look like KnnFloatVectorField(... DOT_PRODUCT)
and KnnFloatVectorQuery?
ilder.add(FieldValueAsScoreQuery(field_name, value),
> BooleanClause.Occur.SHOULD)
> return builder.build()
>
> it seems to work, but I'm not sure if it's a good way to implement it.
> Example 2:
> I would also like to use this mechanism for the following index:
> term1 -> (doc_id1, score), (doc_idN, score), ...
> termN -> (doc_id1, score), (doc_idN, score), ...
> Where resulting score will be calculated as:
> sum(scores) by doc_id for terms in some query
>
> Thank you in advance!
>
> Best Regards,
> Viacheslav Dobrynin!
>
--
Sincerely yours
Mikhail Khludnev
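The second example in the quoted question above (term -> (doc_id, score) postings, with the final score being the sum of scores per doc_id over the query terms) can be sketched in plain Java. This is only an illustration of the arithmetic, not a Lucene implementation; the class name `TermScoreSum` and the in-memory map layout are assumptions.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class TermScoreSum {
    // postings: term -> (docId -> score); hypothetical in-memory layout.
    // Returns docId -> sum of the scores of all query terms that hit the doc.
    static Map<Integer, Double> sumScores(Map<String, Map<Integer, Double>> postings,
                                          List<String> queryTerms) {
        Map<Integer, Double> totals = new HashMap<>();
        for (String term : queryTerms) {
            Map<Integer, Double> docs = postings.get(term);
            if (docs == null) {
                continue; // term is not in the index, contributes nothing
            }
            for (Map.Entry<Integer, Double> e : docs.entrySet()) {
                totals.merge(e.getKey(), e.getValue(), Double::sum);
            }
        }
        return totals;
    }

    public static void main(String[] args) {
        Map<String, Map<Integer, Double>> postings = Map.of(
                "term1", Map.of(1, 0.5, 2, 1.0),
                "term2", Map.of(1, 0.25));
        System.out.println(sumScores(postings, List.of("term1", "term2")));
    }
}
```

In Lucene terms this corresponds to a disjunction of SHOULD clauses whose per-clause score is the stored value, which is the shape of the BooleanQuery built in the quoted code.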
ired by some code.
I suppose it's up to custom code around
org.events.business.search.operations.SearchOperation.doRun(SearchOperation.java:202)
--
Sincerely yours
Mikhail Khludnev
r
>kind:"VISI.Story" or kind:" VISI.Dataset" or kind:DataDiscoveryAlbum or
>kind:DataDiscovery)
>
>
>
> Any comment on the different result set for the above two queries would be
> really appreciated.
>
>
>
> Regards
>
> Rajib
>
>
>
--
Sincerely yours
Mikhail Khludnev
> Any guidance or insights would be greatly appreciated. Thank you for your
> time and assistance.
>
> Hari
>
--
Sincerely yours
Mikhail Khludnev
ValueTermAttribute.toString();
>
> //How to get startOffset & endOffset as like in Lucene 2.4
>
> //Do some calculation based on startOffset & endOffset
> }
>
> Please let me know if any further information is required from
> my side.
>
> Regards
> Rajib
>
--
Sincerely yours
Mikhail Khludnev
ther information from my side.
>
> Thanks In Advance.
>
> Regards
> Rajib
>
>
--
Sincerely yours
Mikhail Khludnev
e preserving the
> NumericRangeQuery type?
> BoostQuery doesn't allow this and I haven't found a way.
>
> Thanks for your help.
>
> Claude Lepère
>
--
Sincerely yours
Mikhail Khludnev
will return results and the second
> will not.
>
>
>
> However would a query like "NOT product:c" be OK as a filter query if it
> was
> combined with other queries as per the pseudocode above?
>
>
>
> I don't think it's significant but for what it's worth this application is
> still using Lucene 8.6.3.
>
>
>
> cheers
>
> T
>
>
>
>
--
Sincerely yours
Mikhail Khludnev
essing it again.
> This seems wasteful. Is there a solution to this? Or would I have to
> implement my own Codec or some such? I started digging down that route and
> it doesn’t look pretty. 😊
>
>
>
> Tony
>
>
--
Sincerely yours
Mikhail Khludnev
it's something over there
https://github.com/apache/lucene/blob/4e2ce76b3e131ba92b7327a52460e6c4d92c5e33/lucene/highlighter/src/java/org/apache/lucene/search/highlight/Highlighter.java#L159
On Sun, Nov 12, 2023 at 11:42 PM Michael Wechner
wrote:
> Hi Mikhail
>
> Thank you very
> with the code above?
> I can do this, but want to make sure, that I don’t update it in a wrong
> way.
>
>
>
>
--
Sincerely yours
Mikhail Khludnev
Hello,
I'm surprised, and I doubt that it can happen. Would you mind uploading a short
test reproducing it?
On Wed, Sep 20, 2023 at 11:44 PM Amitesh Kumar
wrote:
> Thanks Mikhail!
>
> I have tried all other tokenizers from Lucene4.4. In case of
> WhitespaceTokenizer, it loses roma
>
> On the implementation front, I am using a set of filters like
> lowerCaseFilter, EnglishPossessiveFilter etc in addition to base tokenizer
> StandardTokenizer.
>
> Per my analysis, StandardTokenizer strips off the % sign and hence the
> behavior. Has someone faced a similar requirement? Any help/guidance is highly
> appreciated.
>
--
Sincerely yours
Mikhail Khludnev
hael Wechner <
>> michael.wech...@wyona.com> wrote:
>>
>>> Hi Together
>>>
>>> You might be interested in this paper / article
>>>
>>> https://arxiv.org/abs/2308.14963
>>>
>>> Thanks
>>>
>>> Michael
>>>
>>> -
>>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>>> For additional commands, e-mail: dev-h...@lucene.apache.org
>>>
>>>
--
Sincerely yours
Mikhail Khludnev
rld = iw.getPooledInstance(sci, true);
> segmentReader = rld.getReader(IOContext.READ);
>
> //process all live docs similar to above using the segmentReader.
>
> rld.release(segmentReader);
> iw.release(rld);
> }finally{
>if (iwRef != null) {
>iwRef.decref();
> }
> }
>
> Help would be much appreciated!
>
> Thanks,
> Rahul
>
--
Sincerely yours
Mikhail Khludnev
ple, "keywords" field has 78
> tokens. I think its field_length (dl) is 78, but Lucene handled it as
> 76 (approximate), as described in the function explainTF(Explanation freq, long
> norm).
> Thank you very much for your reading and look forward to your
> answer!
>
>
> Koo
> Drive development engineer
--
Sincerely yours
Mikhail Khludnev
statistics on those terms or
> proceed with this document without affecting its boolean score.
>
> What is the best way to achieve this?
>
--
Sincerely yours
Mikhail Khludnev
OK
https://lucene.apache.org/core/8_11_2/core/org/apache/lucene/search/Weight.html#matches-org.apache.lucene.index.LeafReaderContext-int-
On Mon, Jul 10, 2023 at 2:08 PM nedyalko.zhe...@freelance.de.INVALID
wrote:
> Hi Mikhail,
>
> I don't see the matches `searcher.matches(topDo
ch.Query-int-
On Mon, Jul 10, 2023 at 12:19 PM nedyalko.zhe...@freelance.de.INVALID
wrote:
> Hello Mikhail,
>
> Great, thanks for the very fast response! The link that you provided is
> very useful and informative.
>
> Though, I have an understanding issue. After I have search
other words, I'd like to get the matches in
> a form of terms with properties like frequency and positions.
> How can I achieve this?
>
> Thanks in advance!
> Ned
>
>
--
Sincerely yours
Mikhail Khludnev
(Lucene) fields with different content.
If your logic is that comprehensive, you may also consider extracting the
analysis logic completely:
https://solr.apache.org/guide/solr/latest/indexing-guide/external-files-processes.html#the-preanalyzedfield-type
On Tue, Apr 25, 2023 at 4:08 PM Wang, Guan wrote:
>
Guan,
I hardly grasp the particular obstacle. But I don't think that the task is
out of reach overall. Can you share a test case formally describing the
desired behavior?
On Tue, Apr 25, 2023 at 12:29 AM Wang, Guan wrote:
> Hi Mikhail,
>
> Thank you for introducing
Well.. maybe something like
https://lucene.apache.org/core/8_5_1/analyzers-common/org/apache/lucene/analysis/miscellaneous/ConditionalTokenFilter.html
?
On Mon, Apr 24, 2023 at 11:40 PM Wang, Guan wrote:
> Hi Mikhail,
>
> Thank you for the definitive answer!
>
> I could "sol
ld not
> be used for urgent or sensitive issues
>
--
Sincerely yours
Mikhail Khludnev
https://t.me/MUST_SEARCH
A caveat: Cyrillic!
Nope, it's embedded completely.
You can find Trie.java in lucene-8.11.2 sources. And compiled class
in lucene-analyzers-stempel-8.11.2.jar as well.
On Mon, Apr 3, 2023 at 12:03 PM Saha, Rajib
wrote:
> Hi Mikhail,
>
> In top stack,
> java.lang.
va:1757)
> at
> org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1400)
>
> Regards
> Rajib
>
--
Sincerely yours
Mikhail Khludnev
https://t.me/MUST_SEARCH
A caveat: Cyrillic!
.
Enjoy.
On Fri, Mar 3, 2023 at 3:48 PM Saha, Rajib
wrote:
> Hi Mikhail, Uwe,
>
> We have been able to overcome several hurdles.
> Thanks for your suggestions, which helped us a lot. 😊
>
> We need one more suggestion. Previously, we had used a sample
(BLOOMBERG/ 919 3RD A) <
lkotzanie...@bloomberg.net> wrote:
> Hi Mikhail,
>
> Thanks for the quick reply and the suggestion. This is definitely good to
> know about. In my case however, there are several such NLP/data extraction
> systems and I am not sure if they all use the same
sense does a similar solution already exist? If
> it doesn’t exist yet would it be something that would be of interest to the
> community?
> Any thoughts on this would be much appreciated.
>
> Thanks,
> Luke
--
Sincerely yours
Mikhail Khludnev
https://t.me/MUST_SEARCH
A caveat: Cyrillic!
gory/volume, but unfortunately the highlighter.getBestTextFragments()
> method marks all the occurrences of "note" and "extra" in the content too.
> This we don't want.
>
> I can't see how to separate that part of the query out in the highlighter
> methods, and I wonder what best practice would be here. I'm probably being
> naive in using a single query for the whole job. Do I need to run a query
> for category/volume, and then a subquery on text and title, and just use
> the
> subquery in the highlighter? If that's the approach, is there a nice simple
> explanation somewhere you could point me to? Because I'm a simple user who
> has never done anything beyond using the simple QueryParser for everything.
>
>
>
> cheers
>
> T
>
>
>
>
>
>
>
>
--
Sincerely yours
Mikhail Khludnev
https://t.me/MUST_SEARCH
A caveat: Cyrillic!
> > happens for a bunch of the languages; I just presented 2 examples.
> > Feel free to propose any changes, comments, fixes :)
> > Thanks a lot in advance,
> > Thanos
>
--
Sincerely yours
Mikhail Khludnev
https://t.me/MUST_SEARCH
A caveat: Cyrillic!
Hello, Rajib.
On Mon, Jan 30, 2023 at 4:07 PM Saha, Rajib
wrote:
> Hi Mikhail,
>
> Thanks for your suggestion. It solved lots of cases today in my end. 😊
>
> I need some more suggestions from your end. I am putting together as below
> one by one:
>
ne9.
>
> But the "DirectPostingFormat" is still in Lucene9.
>
> Could anyone help me to understand how to replace the DirectDocValueFormat
> in Lucene9?
>
> Thanks
> Regards
> MyCoy
>
--
Sincerely yours
Mikhail Khludnev
https://t.me/MUST_SEARCH
A caveat: Cyrillic!
-org.apache.lucene.index.IndexReader-java.lang.String-
On Sun, Jan 29, 2023 at 2:08 PM Saha, Rajib
wrote:
> Hi Mikhail,
>
> Thanks for the reference link.
> It really helped me.
>
> In one of my requirements, I need to extract all the Terms in an
> IndexReader.
> I was trying the refere
Right. SynonymMap.html#WORD_SEPARATOR
<https://lucene.apache.org/core/8_0_0/analyzers-common/org/apache/lucene/analysis/synonym/SynonymMap.html#WORD_SEPARATOR>
was
a redundant complication. Spaces work fine.
On Thu, Jan 19, 2023 at 4:26 AM Anh Dũng Bùi wrote:
> Thanks Mikhail!
>
>
:18 PM _ SATNAM wrote:
> Hey Mikhail and Anh Dung Bui
> i am also struggling with synonym query
> my use case for eg
> I created synonyms for word
> API --> Application program interface
> UI -> user interface
>
> doc 1 ---> This is API and it is cal
{
> //Some internal function to process the doc.
> forEach.process(termDocs.doc());
> }
>
> }
>
> Regards
> Rajib
>
--
Sincerely yours
Mikhail Khludnev
https://t.me/MUST_SEARCH
A caveat: Cyrillic!
>
> Currently I badly need some examples of using TokenStream,
> tokenAttributes, *Filter.
> I need to replace the uses of "Token".
>
> Could somebody please help me in it?
>
> Regards
> Rajib
>
>
>
--
Sincerely yours
Mikhail Khludnev
https://t.me/MUST_SEARCH
A caveat: Cyrillic!
--
Sincerely yours
Mikhail Khludnev
https://t.me/MUST_SEARCH
A caveat: Cyrillic!
are computed? As I understand SynonymWeight
> will
> > > consider all terms as exactly the same while BooleanQuery will favor
> the
> > > documents with more matched terms.
> > > - Is it worth it to support multi-term synonyms in SynonymQuery? My
> > feeling
> > > is that it's better to just use BooleanQuery in those cases, since to
> > > support multi-term synonyms it needs to accept a list of Query, which
> > would
> > > make it behave like a BooleanQuery. Also how scoring works with
> > multi-term
> > > is another problem.
> > >
> > > Thanks & Regards!
> > >
> >
> >
>
--
Sincerely yours
Mikhail Khludnev
https://t.me/MUST_SEARCH
A caveat: Cyrillic!
se BooleanQuery in those cases, since to
> support multi-term synonyms it needs to accept a list of Query, which would
> make it behave like a BooleanQuery. Also how scoring works with multi-term
> is another problem.
>
> Thanks & Regards!
>
--
Sincerely yours
Mikhail Khludnev
tectorOp.java#L39
> ) at production scale and discovered really bad performance during certain
> conditions which I attribute to this unnecessary synching. I suspect this
> may have impacted others as well
> https://stackoverflow.com/questions/42960569/indexing-taking-long-time-when-using-opennlp-lemmatizer-with-solr
> > Many thanks,
> > Luke Kot-Zaniewski
> >
>
--
Sincerely yours
Mikhail Khludnev
We may have dozens
> of such fields in our index, thus there isn't any one field that can be
> used to sort the index. So I guess my question is whether what I am trying to
> achieve is possible. I tried to look through the Solr codebase, but so far
> couldn't come up with anything. Code example is here
> https://pastebin.com/i05E2wZy . I am using 9.4.1. Thanks in advance.
>
> Andrei
>
>
--
Sincerely yours
Mikhail Khludnev
> real impact on the retrieving quality and performance.
>
> I'm wondering if there is any best practice, e.g. how many docs should be
> in a single graph?
> Or does anyone have some production experience to share?
>
> Thanks & Regards
> MyCoy
>
--
Sincerely yours
Mikhail Khludnev
example, I've studied the "KnnVectors" a little.
> The "PerFieldKnnVectorsFormat.FieldsWriter" actually uses the
> "Lucene94HnswVectorsFormat".
> But why do we have this kind of structure?
>
> Thanks & Regards
>
> MyCoy
>
--
Sincerely yours
Mikhail Khludnev
question about lucene suggester APIs. If I build multiple FSTs
> using a suggester, is there a way to merge two generated FSTs?
>
> --
>
> Nitish Jain
>
--
Sincerely yours
Mikhail Khludnev
ment, outside of
> Lucene?
>
> Kendall
>
>
--
Sincerely yours
Mikhail Khludnev
> instances representing all of the Lucene Doc IDs in the index, with
> the bits turned on for those documents we want to be included in search
> results.
>
> If this has already been answered in a forum post, I apologize. Or if
> there's a Lucene specific forum somewhere I could look at, if you could
> kindly point me there, I would appreciate it.
>
> Any help/insight is greatly appreciated.
>
> Thanks,
> Scott Robey
>
--
Sincerely yours
Mikhail Khludnev
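The quoted question above describes bit sets spanning all doc IDs in the index, with bits turned on for the documents that may appear in search results. A minimal plain-Java sketch of that idea follows, using `java.util.BitSet` in place of any Lucene bit set class; the class name `AllowedDocs` and the array of hit doc IDs are hypothetical.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

final class AllowedDocs {
    // Keep only the hits whose doc ID has its bit set in the allow-list.
    static List<Integer> filter(int[] hitDocIds, BitSet allowed) {
        List<Integer> kept = new ArrayList<>();
        for (int docId : hitDocIds) {
            if (allowed.get(docId)) {
                kept.add(docId);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        BitSet allowed = new BitSet(4); // one bit per doc ID in a 4-doc index
        allowed.set(1);
        allowed.set(3);
        System.out.println(filter(new int[] {0, 1, 2, 3}, allowed));
    }
}
```

Applying the bit set as a filter inside the search, rather than post-filtering hits as done here, avoids scoring documents that would be discarded anyway.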
verhead of function calls can cause delay.
> As a result I'm looking for a trick to skip the function call and have
> no scoring at all on my whole query
>
> Is it possible to ignore this step?
>
> thanks a million
>
--
Sincerely yours
Mikhail Khludnev
xisting index for search? Also, is there a way to configure the
> benchmark to use multiple threads for indexing (looks to me that it’s a
> single-threaded indexing)?
>
> --Regards,
> Balmukund
>
--
Sincerely yours
Mikhail Khludnev
/browse/LUCENE-9136 ,
https://issues.apache.org/jira/browse/LUCENE-9322 . I see that there is some
related work and related PRs. What is the current state of this functionality?
--
Thanks,
Mikhail
> --
> Vincenzo D'Amore
>
--
Sincerely yours
Mikhail Khludnev
> implications of going down this route, especially when dealing with large
> result sets.
>
> @Mikhail: Thanks for the suggestion! I actually hadn't thought of that.
> Could you please provide more details on how we could approach the problem
> from this angle?
>
> Tha
i
>
> [1]
>
> https://lucene.apache.org/core/8_5_1/join/org/apache/lucene/search/join/JoinUtil.html
> [2]
>
> https://lucene.472066.n3.nabble.com/access-to-joined-documents-td4412376.html
> [3] https://issues.apache.org/jira/browse/LUCENE-3602
>
--
Sincerely yours
Mikhail Khludnev
to achieve
> this?
>
>
> Regards
> Kumaran R
>
--
Sincerely yours
Mikhail Khludnev
Pass TopDocsCollector as the first arg into TimeLimitingCollector.
On Thu, Feb 27, 2020 at 2:31 PM wrote:
> Hi,-
>
> Sometimes the search takes too long even with PhraseWildcardQuery, so i
> would like to limit the search time via TimeLimitingCollector API.
>
>
> Thank
> Is there such an API or plan to implement one?
>
>
> Best regards
>
>
>
--
Sincerely yours
Mikhail Khludnev
>
There is none.
> Best regards
>
> On 2/4/20 11:14 AM, baris.ka...@oracle.com wrote:
> >
> > Thanks but i thought this class would have a mechanism to fix this issue.
> > Thanks
> >
> >> On Feb 4, 2020, at 4:14 AM, Mikhail Khludnev wrote:
> >>
--
Sincerely yours
Mikhail Khludnev
't use fixed number of Fields to
> query on. Even if there are fixed number of fields, the query has to check
> for each field to match at least one word.
>
> Is it possible to handle this requirement using Lucene? or should I go for
> other options?
>
> I am new to Lucene, any help would be appreciated.
>
>
>
> Thanks,
>
> Kart
>
>
--
Sincerely yours
Mikhail Khludnev
>
> > > I use SmartChineseAnalyzer to do the indexing, and add a document with a
> > > TextField whose value is a long sentence which, when analyzed, yields 18
> > > terms.
> > >
> > > Then I use the same value to construct a PhraseQuery, setting slop to 2,
> > > and adding the 18 terms consecutively...
> > >
> > > I expect the search api to find this document, but it returns empty.
> > >
> > > Where am i wrong?
> > >
> >
> >
> > --
> > Adrien
> >
>
--
Sincerely yours
Mikhail Khludnev
, How can Lucene's Query API become high-order composable? Lucene's
> "LeafContext" concept is really confusing me...
>
--
Sincerely yours
Mikhail Khludnev
o
reduce memory footprint by storing only top candidate results in a binary
heap.
IIRC it's described in this classic paper
http://www.savar.se/media/1181/space_optimizations_for_total_ranking.pdf
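As a rough illustration of the point above, here is a plain-Java sketch of keeping only the top k candidates in a binary heap, so that memory stays O(k) however many candidates are scored. It uses `java.util.PriorityQueue` (a binary min-heap) rather than Lucene's own priority queue, and the class name `TopK` is hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

final class TopK {
    // Returns the k highest scores in descending order while holding
    // at most k entries in memory at any time.
    static List<Double> topK(double[] scores, int k) {
        if (k <= 0) {
            return new ArrayList<>();
        }
        // Min-heap: the root is the weakest of the current top candidates.
        PriorityQueue<Double> heap = new PriorityQueue<>(k);
        for (double score : scores) {
            if (heap.size() < k) {
                heap.add(score);
            } else if (score > heap.peek()) {
                heap.poll();     // evict the weakest candidate
                heap.add(score); // admit the better one
            }
        }
        List<Double> result = new ArrayList<>(heap);
        result.sort(Collections.reverseOrder());
        return result;
    }

    public static void main(String[] args) {
        System.out.println(topK(new double[] {0.3, 0.9, 0.1, 0.7, 0.5}, 3));
    }
}
```

Each candidate costs at most O(log k) to process, and most candidates are rejected after a single comparison against the heap root.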
--
Sincerely yours
Mikhail Khludnev
ese doubts. I like to quote
this talk https://www.youtube.com/watch?v=T5RmMNDR5XI
>
> On Fri, Dec 27, 2019 at 5:05 PM, Mikhail Khludnev wrote:
>
> > Hello,
> > It's by design: StringFields are searchable and filled by analysis
> output,
> > StoredFields are returned input value
ll) {
>     String term = byteRef.utf8ToString();
>     terms.add(term);
> }
> } catch (IOException e) {
>     e.printStackTrace();
>     log.error(e.getMessage(), e);
> }*
>
> To my surprise, terms seems to return only the STORED value, which is the
> original value form, but I expected them to be the terms I put in each
> StringField!
>
> Is this a design miss or impl. limit?
>
--
Sincerely yours
Mikhail Khludnev
> I have a document set; most fields to index are plain text, suited for a
> StandardAnalyzer or a SmartChineseAnalyzer. But the problem is, I have a
> special field which is a KeywordList type, like "A;B;C", for which I hope I can
> fully control the analyzing step.
>
> How to do this in Lucene?
>
--
Sincerely yours
Mikhail Khludnev
s my conditions:
> 1) Uses a StandardAnalyzer
> 2) Does the actual query.toString() return lowercase J and S
>
> David Shifflett
>
>
> On 10/22/19, 10:44 AM, "Mikhail Khludnev" wrote:
>
> On Tue, Oct 22, 2019 at 5:26 PM Shifflett, David [USA] <
On Tue, Oct 22, 2019 at 5:26 PM Shifflett, David [USA] <
shifflett_da...@bah.com> wrote:
> Mikhail,
>
> Thanks for running those tests.
> I haven’t looked into the test, but can you confirm it uses an analyzer
> with the lowercase filter?
>
Look at his diff. It
"~2
> Type of query : ComplexPhraseQuery
>
> If I change teststr to "\"Foo Bar\""
> I get
> Query : "Foo Bar"
> Type of query : ComplexPhraseQuery
>
> If I change teststr to "Foo Bar"
> I get
> Query : content:foo content:bar
> Type of query : BooleanQuery
>
>
> In the first two cases I was expecting the search terms to be switched to
> lowercase.
>
> Were the Foo and Bar left as originally specified because the terms are
> inside double quotes?
>
> How can I specify a search term that I want treated as a Phrase,
> but also have the query parser apply the LowerCaseFilter?
>
> I am hoping to avoid the need to handle this using PhraseQuery,
> and continue to use the QueryParser.
>
>
> Thanks in advance for any help you can give me,
> David Shifflett
>
>
--
Sincerely yours
Mikhail Khludnev
>
>
>
>
>
>
> --
> Sent from:
> https://lucene.472066.n3.nabble.com/Lucene-Java-Users-f532864.html
>
--
Sincerely yours
Mikhail Khludnev
I'm essentially looking for something similar to `add-distinct`
> and `remove` from Solr's atomic updates functionality, just directly in
> Lucene.
>
--
Sincerely yours
Mikhail Khludnev
> parents, I mean, it is already required to be the last document in the
> block, why do we need to provide a query for them?
> > >
> > > > On July 3, 2019 at 10:52 AM ANDREI SOLODIN
> wrote:
> > > >
> > > >
> > > > Thanks Mikhai
On Wed, Jul 3, 2019 at 6:11 PM ANDREI SOLODIN wrote:
>
> This returns "id3", which is unexpected.
>
> Please check the ToParentBlockJoinQuery (ToPBJQ) javadoc. It's absolutely expected.
--
Sincerely yours
Mikhail Khludnev
& won't work for multi-sort field queries or out-of-order scoring etc.
>
> But, in general will this be a good idea to explore or something that is
> best not attempted?
>
> Any help is much appreciated
>
> --
> Ravi
>
--
Sincerely yours
Mikhail Khludnev
example, at least to understand how to start a minimal basic
> project?
>
> Thanks
--
Sincerely yours
Mikhail Khludnev
are not any subsequent terms in the field?
>
> -Mike
>
--
Sincerely yours
Mikhail Khludnev
base
query (in the worst case it's MatchAllDocsQuery) and custom
DoubleValuesSource by calling FunctionScoreQuery.boostByValue(Query,
DoubleValuesSource).
On Sun, Jan 27, 2019 at 9:34 PM MarcoR
wrote:
> Thanks Mikhail,
>
> I'm afraid I don't understand your sugge
e query type,
> but I'm stuck.
>
>
>
>
>
>
>
> --
> Sent from:
> http://lucene.472066.n3.nabble.com/Lucene-Java-Users-f532864.html
>
--
Sincerely yours
Mikhail Khludnev
e, search "redHotChilly"
> instead of "red hot chilly" - you should use your own pattern tokenizer to
> divide the query by a regex pattern.
>
> Regards
> Vadim Gindin
>
> On Thu, Oct 4, 2018 at 11:58 AM Gordin, Ira wrote:
>
> > Hi friends,
> >
> > How can I implement Camel case search with Lucene?
> >
> > Thanks,
> > Ira
> >
> >
> >
>
--
Sincerely yours
Mikhail Khludnev
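The regex-based split suggested in the quoted reply above can be sketched in plain Java. This is only one possible pattern, chosen for illustration (a zero-width boundary between a lowercase and an uppercase letter); the class name `CamelCaseSplitter` is hypothetical, and in an actual Lucene setup the same pattern would be plugged into a pattern-based tokenizer rather than applied to strings directly.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

final class CamelCaseSplitter {
    // Zero-width boundary: preceded by a lowercase letter, followed by an
    // uppercase letter. Assumption: plain ASCII camel case such as
    // "redHotChilly"; acronyms and digits would need a richer pattern.
    private static final String BOUNDARY = "(?<=[a-z])(?=[A-Z])";

    static List<String> split(String text) {
        List<String> tokens = new ArrayList<>();
        for (String part : text.split(BOUNDARY)) {
            tokens.add(part.toLowerCase(Locale.ROOT));
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(split("redHotChilly"));
    }
}
```

Applying the same split at index time and at query time lets "redHotChilly" and "red hot chilly" produce the same terms.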
there any
> Combined Index structure like multiple-column indexes in MySQL? I wonder, is
> there any solution that extends the FST so that the FINAL state connects to
> another FST?
>
>
> THANKS
--
Sincerely yours
Mikhail Khludnev
ave a way to see directly indexed data (Luke seems obsolete,
> Marple does not work with lucene 7.4.0 yet)?
>
> Thanks very much for the help, Lisheng
>
--
Sincerely yours
Mikhail Khludnev
highlighting, just a list of the words. So if
> I
> search for 'ski' and I match on 'skier' and 'skiis', I would like to get
> back a list that includes 'skier' and 'skiis'.
>
> Is there an API call that provides this?
>
>
>
> Thanks
>
> Mike
>
>
--
Sincerely yours
Mikhail Khludnev
I mean, you'd rather need offsets not positions, but I don't have something
definite to suggest.
On Tue, Jun 26, 2018 at 1:29 PM Gordin, Ira wrote:
> Hello Mikhail,
>
> I see in the link you sent that PositionIncrementAttribute determines the
> position of this token re
e I will get the 'a' positions in TokenStream.
> Additional question how I can get the line numbers and the positions
> inside the line.
> Many thanks in advance for your help,
> Ira
>
>
--
Sincerely yours
Mikhail Khludnev
ted that SearchContext will be propagated to a Query, but I haven't
> found a way to get it. I only have LeafReaderContext or LeafReader.
> Could you advise me?
>
> Regards,
> Vadim Gindin
>
--
Sincerely yours
Mikhail Khludnev
> > > > Apologies if I completely misunderstood, but if you are looking to do a
> > > > full doc match, you could duplicate the doc into another field that
> > > > is a true full text index of the document.
> > > >
>
ion. When explain(id) is called it checks specified id in this
> > collection and outputs "matched"/"not matched".
> >
> > The questions.
> > 0. This document is found by the plugin, but the explain(id) method takes
> > the wrong ID. Why? It happens in the real installation, but in the test
> > case it works fine.
> > 1. ID=342 and others come to the explain(id) method. Note, it is not a
> > document id; it is the ID of the nested object (category). Why does it
> > happen?
> > 2. I have a test case, based on ESIntegTestCase. It works fine with this
> > document. But this document is not found in the real index.
> >
> > Regards,
> > Vadim Gindin
> >
>
--
Sincerely yours
Mikhail Khludnev
ference
> between these 20 implementations and which of them can be really useful?
>
> Regards,
> Vadim Gindin
>
--
Sincerely yours
Mikhail Khludnev
: what terms are matched to what fields and so on.
>
>
> It seems that BooleanQuery/BooleanScorer is not a good place to accumulate
> information from child Queries/Scorers.
>
--
Sincerely yours
Mikhail Khludnev