Hi
We are indexing some Chinese text (using the following OutputStreamWriter with
UTF-8 encoding):
OutputStreamWriter outputFileWriter = new OutputStreamWriter(new
        FileOutputStream(outputFile), "UTF-8");
We are trying to inspect the index in Luke 3.4.0 (we have chosen the UTF-8
option in Luke).
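A minimal round-trip sketch of the writer setup above, using only the JDK (the file name and sample string are illustrative; `StandardCharsets` requires Java 7+ and avoids charset-name typos entirely):

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

public class Utf8RoundTrip {
    public static void main(String[] args) throws IOException {
        File outputFile = File.createTempFile("tokens", ".txt");
        // Name the charset explicitly; relying on the platform default
        // encoding is a common source of mojibake with Chinese text.
        try (Writer w = new OutputStreamWriter(
                new FileOutputStream(outputFile), StandardCharsets.UTF_8)) {
            w.write("中文测试");
        }
        // Read the bytes back with the same charset to confirm a round trip.
        byte[] bytes = java.nio.file.Files.readAllBytes(outputFile.toPath());
        String roundTrip = new String(bytes, StandardCharsets.UTF_8);
        System.out.println(roundTrip.equals("中文测试")); // prints "true"
        outputFile.delete();
    }
}
```

If the file looks garbled in a viewer even though this round trip succeeds, the viewer (here, Luke) is the component decoding with the wrong charset, not the writer.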
using Lucene 3.2. The analyzer is
new LimitTokenCountAnalyzer(new
SmartChineseAnalyze
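The analyzer line above is truncated; a hedged completion in the Lucene 3.2 contrib API (the 10000 token limit is an arbitrary illustration, not from the original):

```java
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.LimitTokenCountAnalyzer;
import org.apache.lucene.analysis.cn.smart.SmartChineseAnalyzer;
import org.apache.lucene.util.Version;

// Wrap SmartChineseAnalyzer so only the first N tokens of each field
// are indexed; tokens past the limit are silently dropped, which can
// make terms "missing" when the index is inspected later.
Analyzer analyzer = new LimitTokenCountAnalyzer(
        new SmartChineseAnalyzer(Version.LUCENE_32), 10000);
```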
Hi
I have a newbie question. I am trying to use the SweetSpotSimilarity (SSS)
class.
http://lucene.apache.org/core/old_versioned_docs/versions/3_5_0/api/contrib-misc/org/apache/lucene/misc/SweetSpotSimilarity.html
I understand the scoring behavior of Lucene
http://lucene.apache.org/core/old_ve
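For context, SweetSpotSimilarity is a drop-in Similarity, so the mechanics are to install it at both index and search time; a hedged sketch against the 3.5 API (the length-norm values below are illustrative tuning parameters, not recommendations):

```java
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.misc.SweetSpotSimilarity;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.util.Version;

SweetSpotSimilarity sss = new SweetSpotSimilarity();
// Fields whose length falls in [3, 25] tokens all get the same (best)
// length norm; outside that range the norm falls off.
sss.setLengthNormFactors(3, 25, 0.5f, true);

// Norms are baked in at index time, so the same Similarity must be
// set on the IndexWriterConfig and on the searcher.
IndexWriterConfig writerConfig =
        new IndexWriterConfig(Version.LUCENE_35, analyzer);
writerConfig.setSimilarity(sss);
searcher.setSimilarity(sss);
```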
Hi
I am migrating from Lucene 3.6.1 to 4.3.0. I am however not sure how to migrate
my custom collector below to 4.3.0 (this page
http://lucene.apache.org/core/4_3_0/MIGRATE.html gives some hints, but the
instructions are incomplete, and looking at the source examples of custom
collectors ma
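Since the collector itself is truncated above, here is a hedged skeleton of the 4.3 Collector contract; the main migration point is that `setNextReader(IndexReader, int)` from 3.x became `setNextReader(AtomicReaderContext)`:

```java
import java.io.IOException;
import org.apache.lucene.index.AtomicReaderContext;
import org.apache.lucene.search.Collector;
import org.apache.lucene.search.Scorer;

public class MyCollector extends Collector {
    private Scorer scorer;
    private int docBase;

    @Override
    public void setScorer(Scorer scorer) throws IOException {
        this.scorer = scorer;
    }

    @Override
    public void collect(int doc) throws IOException {
        // doc is segment-relative; add docBase for an index-wide id,
        // just as the 3.x docBase argument was used.
        int globalDoc = docBase + doc;
        float score = scorer.score();
        // ... record (globalDoc, score) as the 3.6 collector did ...
    }

    @Override
    public void setNextReader(AtomicReaderContext context) throws IOException {
        // Replaces setNextReader(IndexReader reader, int docBase) from 3.x.
        this.docBase = context.docBase;
    }

    @Override
    public boolean acceptsDocsOutOfOrder() {
        return true; // true if the logic doesn't need increasing doc ids
    }
}
```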
Hi Adrien
Thank you very much. It worked.
Have a good day.
On Jun 18, 2013, at 5:35 AM, Adrien Grand wrote:
> Hi,
>
> You didn't say specifically what your problem is so I assume it is
> with the following method:
>
> On Tue, Jun 18, 2013 at 4:37 AM, P
Hi
The problem I would like to solve is determining the Lucene score of a word in
_a particular_ given document. The two candidates I have been trying are
- QueryWrapperFilter
- BooleanQuery
Both are to restrict search within a search space. But according to Doug
Cutting QueryWrapperFilter opti
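A hedged third option for "score of a word in one specific document" that needs no filter at all: `IndexSearcher.explain()` computes the score a query would receive against exactly one document (field and variable names below are illustrative):

```java
import java.io.IOException;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.Explanation;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.TermQuery;

// Score of one term against one document, without restricting the
// search space: explain() evaluates the query for that docId only.
static float scoreOfWordInDoc(IndexSearcher searcher, String word, int docId)
        throws IOException {
    Explanation expl = searcher.explain(
            new TermQuery(new Term("content", word)), docId);
    return expl.getValue(); // 0 if the doc doesn't match
}
```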
at you've got for the two fields.
>
> As for performance, first narrow down where it is taking the time. If
> it is in lucene, read
> http://wiki.apache.org/lucene-java/ImproveSearchingSpeed
>
>
> --
> Ian.
>
> On Wed, Sep 21, 2011 at 5:38 PM, Peyman Faratin
p words removed etc.
>>>
>>> Maybe you need your "word" as TermQuery, assuming it is lowercased
>>> etc., and pass the title through query parser. In other words,
>>> reverse what you've got for the two fields.
>>>
>>> As for perfo
Hi
Newbie question. I'm trying to set the max field length property of the
IndexWriter to unlimited. The old API is now deprecated, but I can't seem to
be able to figure out how to set the field with the new (IndexWriterConfig)
API. I've tried IndexWriterConfig.maxFieldLength(Integer.MAX_VALUE)
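A hedged sketch of how this looks in the 3.x API: `IndexWriterConfig` has no `maxFieldLength` at all; unlimited is simply the default, and a limit, when wanted, is imposed by wrapping the analyzer (version and analyzer choice below are illustrative):

```java
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.LimitTokenCountAnalyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.util.Version;

// Unlimited field length: just don't wrap the analyzer.
Analyzer unlimited = new StandardAnalyzer(Version.LUCENE_34);
IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_34, unlimited);

// To cap it explicitly instead (the old maxFieldLength behavior):
Analyzer capped = new LimitTokenCountAnalyzer(unlimited, 10000);
```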
line, you'll likely be
> interested in the Filter variant of the above-linked Analyzer wrapper:
>
> <http://lucene.apache.org/java/3_4_0/api/core/org/apache/lucene/analysis/LimitTokenCountFilter.html>
>
>
> Steve
>
>> -Original Message-
>> From: P
Hi
I have a sentence
"i'll email you at x...@abc.com"
and I am looking at the tokens a StandardAnalyzer (which uses the
StandardTokenizer) produces
1: [i'll:0->4:]
2: [email:5->10:]
3: [you:11->14:]
5: [x:18->19:]
6: [abc.com:20->27:]
I am using the following constructor
new Standar
or you could look at UAX29URLEmailTokenizer which should
> pick up the email component, although probably not the apostrophe.
>
>
> --
> Ian.
>
>
> On Thu, Sep 29, 2011 at 7:51 PM, Peyman Faratin
> wrote:
>> Hi
>>
>> I have a sentence
>>
>
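Ian's UAX29URLEmailTokenizer suggestion above can be sketched as follows (the e-mail address is illustrative, since the original is obfuscated in the archive):

```java
import java.io.StringReader;
import org.apache.lucene.analysis.standard.UAX29URLEmailTokenizer;
import org.apache.lucene.util.Version;

// Unlike StandardTokenizer, which splits "x@abc.com" into "x" and
// "abc.com" (as seen in the token dump above), this tokenizer keeps
// e-mail addresses and URLs as single tokens.
UAX29URLEmailTokenizer tok = new UAX29URLEmailTokenizer(
        Version.LUCENE_34, new StringReader("i'll email you at x@abc.com"));
```

As Ian notes, the apostrophe handling ("i'll") is unchanged; only URL/e-mail recognition differs.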
Hi
I am trying to understand why I am not able to retrieve docs I have indexed
with a ShingleAnalyzer. The setup is as follows:
During indexing I do the following:
PerFieldAnalyzerWrapper wrapper =
        DocFieldAnalyzerWrapper.getDocFieldAnalyzerWrapper(Stopwords);
Hi
I have the following ShingleFilter chain (Lucene 3.2):
public TokenStream tokenStream(String fieldName, Reader reader) {
    StandardTokenizer first = new StandardTokenizer(Version.LUCENE_32, reader);
    StandardFilter second = new StandardFilter(Version.LUCEN
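Since the chain above is truncated, here is a hedged sketch of a typical Lucene 3.2 shingle chain; the filter order after `StandardFilter` and the bigram setting are assumptions, not the original code:

```java
import java.io.Reader;
import org.apache.lucene.analysis.LowerCaseFilter;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.shingle.ShingleFilter;
import org.apache.lucene.analysis.standard.StandardFilter;
import org.apache.lucene.analysis.standard.StandardTokenizer;
import org.apache.lucene.util.Version;

public TokenStream tokenStream(String fieldName, Reader reader) {
    TokenStream stream = new StandardTokenizer(Version.LUCENE_32, reader);
    stream = new StandardFilter(Version.LUCENE_32, stream);
    stream = new LowerCaseFilter(Version.LUCENE_32, stream);
    // maxShingleSize = 2: emit word bigrams alongside the unigrams.
    // Note shingles only appear at search time if the query-side
    // analyzer produces the same shingles.
    return new ShingleFilter(stream, 2);
}
```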
have expected there to be some shingles in there.
> Are we both missing something?
>
>
> --
> Ian.
>
>
> On Tue, Oct 11, 2011 at 3:25 PM, Peyman Faratin
> wrote:
>> Hi
>>
>> I have the following shinglefilter (Lucene 3.2)
>>
>>
Hi
I have a field that is indexed as follows
for (String c : article.getCategories()) {
    doc.add(new Field("categories", c.toLowerCase(),
            Field.Store.YES, Field.Index.ANALYZED));
}
I have a search space of 2 million docs and I need to access the category field
of each hitdoc. I woul
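One hedged option for fast per-hit field access over 2 million docs is `FieldCache`, which loads one value per document into memory instead of reading stored fields per hit (variable names are illustrative):

```java
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.search.FieldCache;

// Loads the whole column once; subsequent lookups are array accesses.
String[] categoryByDoc = FieldCache.DEFAULT.getStrings(indexReader, "categories");
String categoryOfHit = categoryByDoc[hitDocId];
```

Caveat: `FieldCache` holds a single value per document, while the loop above adds multiple "categories" values per doc; multi-valued fields need stored fields (`Field.Store.YES` is already set) or a different layout, such as one concatenated category string per doc.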
Hi
A client is considering moving from Lucene to ElasticSearch. What is the
community's opinion on ES?
thank you
Peyman
-
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-use
Thank you all for the feedback and your points of view.
Peyman
On Nov 18, 2011, at 2:47 AM, Peter Karich wrote:
> Hi Lukáš, hi Mark
>
>> https://issues.apache.org/jira/browse/SOLR-839
>
>
> thanks for pointing me there
>
>
>>> although some parameters are available as URL parameters as w
Hi
I know how to get the docFreq of a term in a single field (say the "content"
field):
int docFreqInIndex = indexReader.docFreq(new Term("content", q));
But is it possible to get the docFreq of a boolean query consisting of matches
across two or more fields? For instance,
BooleanQuery booleanQuer
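Since `docFreq()` is a per-term statistic, there is no direct analogue for a `BooleanQuery`; one hedged approach is to count matching documents by running the query with a `TotalHitCountCollector` (the second field name "title" is illustrative):

```java
import org.apache.lucene.index.Term;
import org.apache.lucene.search.BooleanClause;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.search.TotalHitCountCollector;

// Count docs where q appears in BOTH fields (use Occur.SHOULD for either).
BooleanQuery booleanQuery = new BooleanQuery();
booleanQuery.add(new TermQuery(new Term("content", q)), BooleanClause.Occur.MUST);
booleanQuery.add(new TermQuery(new Term("title", q)), BooleanClause.Occur.MUST);

// TotalHitCountCollector skips scoring and just counts matches.
TotalHitCountCollector counter = new TotalHitCountCollector();
searcher.search(booleanQuery, counter);
int docsMatchingBothFields = counter.getTotalHits();
```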
19 matches