, 2000 and 5678), you
might map them down (to 0, 1, 2 and 3 for this example) and store them
as a byte.
Currently Lucene only supports atomic types for numerics in the
FieldCache, so the smallest one is byte. It is possible to use only
ceil(log2(#unique_values)) bits/document, although that
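A minimal stand-alone sketch of that mapping (class name and two of the values are hypothetical; the thread only names 2000 and 5678): sort the unique values once, store a small ordinal per document instead of the raw number, and note the theoretical ceil(log2(#unique_values)) bits/document lower bound:

```java
import java.util.Arrays;

// Hypothetical sketch of the ordinal trick: map a small set of unique
// long values down to compact ordinals that fit in a byte per document.
public class OrdinalMap {
    private final long[] sortedUnique; // ordinal -> original value

    public OrdinalMap(long[] uniqueValues) {
        this.sortedUnique = uniqueValues.clone();
        Arrays.sort(this.sortedUnique);
    }

    // Ordinal for a value; assumes the value is one of the unique values.
    public byte ordinal(long value) {
        return (byte) Arrays.binarySearch(sortedUnique, value);
    }

    public long value(byte ordinal) {
        return sortedUnique[ordinal];
    }

    // Theoretical lower bound: ceil(log2(#unique_values)) bits/document.
    public int bitsPerDocument() {
        return 32 - Integer.numberOfLeadingZeros(sortedUnique.length - 1);
    }
}
```

With four unique values this needs only 2 bits/document in theory, but a byte per document is the smallest unit the FieldCache supports.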
ing drive for the large
slow stuff. Nowadays, a 30GB index (or 100GB for that matter) falls into
the small low-latency bucket. SSDs speed up almost everything, save
RAM and spare a lot of work hours optimizing I/O speed.
Regards,
On Thu, 2013-03-14 at 04:11 +0100, dizh wrote:
> each document has a timestamp identify the time which it is indexed, I
> want search the documents using sort, the sort field is the timestamp,
[...]
> but when you do paging, for example in a web app , the user want to go
> to the last 4980-50
On Thu, 2013-03-14 at 11:03 +0100, Toke Eskildsen wrote:
> (timestamp_in_ms << 10) & counter++
This should be
(timestamp_in_ms << 10) | counter++
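A minimal sketch of that packed sort key (class name is hypothetical; the 10 low bits hold a per-millisecond counter, as in the thread). Since the shifted timestamp has zeroes in its low 10 bits and the counter lives only in those bits, OR merges the two disjoint ranges, while AND would always produce 0 there:

```java
// Sketch of the combined sort key from the thread: pack a millisecond
// timestamp and a per-millisecond counter (0-1023) into one long.
public class TimestampKey {
    public static long key(long timestampMs, int counter) {
        // (timestampMs << 10) has its low 10 bits zeroed, so OR slots the
        // counter in; AND would zero the counter bits out entirely.
        return (timestampMs << 10) | (counter & 0x3FF);
    }

    public static long timestampMs(long key) {
        return key >>> 10;
    }

    public static int counter(long key) {
        return (int) (key & 0x3FF);
    }
}
```

Keys built this way sort primarily by timestamp and secondarily by insertion order within the same millisecond.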
a.com/wiki/display/BOBO/Create+a+Browse+Index
The implicit requirement is that the values for your facet fields are already
indexed so that the analyzed content fits your faceting requirements.
- Toke Eskildsen
ight want to switch to a
setup where the index writer is persistent.
- Toke Eskildsen, state and University Library, Denmark
jvm.html
Testing the Zing with MMapDirectory vs. RAMDirectory would be a great
addition to Mike's blog post.
I wonder if Java's ByteBuffer could be used to make a more GC-friendly
RAMDirectory?
Regards,
Toke Eskildsen, State and
o's and discard the ones that only have one? That is simpler:
(A:foo AND B:foo) OR A:"foo foo"~1000 OR B:"foo foo"~1000
This all works under the assumption that you have less than 1000 terms
in each instance of your fields. Adjust accordingly.
- Toke Eskildsen, State
On Sun, 2013-09-08 at 15:15 +0200, Mirko Sertic wrote:
> I have to check, but my usecase does not require sorting or even
> scoring at all. I still do not get what the difference is...
Please describe how you perform your measurements. How do you ensure
that the index is warmed equally for the two
Rob Bygrave [robin.bygr...@gmail.com] wrote:
> Has anyone done a performance comparison for an index on a Solid State Drive
> (vs any other hard drive ... SATA/SCSI)?
We did a fair amount of testing two years ago and put some graphs at
http://wiki.statsbiblioteket.dk/summa/Hardware The short vers
On Thu, 2010-06-10 at 04:03 +0200, fujian wrote:
> Another thing is about unique. I thought it was unique "field value". If it
> means unique term, for English even loading all around 300,000 terms it
> won't take much memory, right? (Suppose the average length of term is 10,
> the total memory usa
On Tue, 2010-07-13 at 23:49 +0200, Christopher Condit wrote:
> * 20 million documents [...]
> * 140GB total index size
> * Optimized into a single segment
I take it that you do not have frequent updates? Have you tried to see
if you can get by with more segments without significant slowdown?
> Th
u can also take a look at the rank for the most common terms. If it is
very high this would explain the long execution times for compound
queries that uses one or more of these terms. A stopword filter would
help in this case if such a filter is acceptable for you.
Regards,
Toke Eskildsen
and for that we used our standard setup with logged queries in
order to emulate the production setting.
Regards,
Toke Eskildsen
-
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org
m, but most
of their thoughts and solutions can be used for clean data too.
Regards,
Toke Eskildsen
unting with sync instead of async which gave us much
better response times during copying at the cost of a substantially
slower copy. dirsync should also be worth looking into.
Regards,
Toke Eskildsen
ifying an
existing order-array is cheaper than a full re-sort or not depends on
your batch size.
Regards,
Toke Eskildsen
On Fri, 2010-08-27 at 05:34 +0200, Shelly_Singh wrote:
> I have a lucene index of 100 million documents. [...] total index size is
> 7GB.
[...]
> I get a response time of over 2 seconds.
How many documents match such a query and how many of those documents do
you process (i.e. extract a term f
s.
Switching to the Java part, try using visualvm
https://visualvm.dev.java.net/
with the Visual GC plugin to see where the time is spent.
- Toke Eskildsen
On Thu, 2010-10-21 at 05:01 +0200, Sahin Buyrukbilen wrote:
> Unfortunately both methods didnt go through. I am getting memory error even
> at reading the directory contents.
Then your problem is probably not Lucene related, but the sheer number
of files returned by listFiles.
A Java File contain
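A sketch of a lighter alternative, assuming Java 7+ NIO (which postdates this 2010 thread): iterate the directory entries one at a time instead of materializing a full array of File objects with listFiles():

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Sketch: count entries in a huge directory without File.listFiles().
// DirectoryStream yields entries one at a time, so memory use stays
// constant no matter how many files the directory holds.
public class DirCount {
    public static long countEntries(Path dir) throws IOException {
        long count = 0;
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
            for (Path ignored : stream) {
                count++;
            }
        }
        return count;
    }
}
```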
ision is relatively modest machines with quad-core i7, 16GB of RAM and
consumer-grade SSDs (Intel or SandForce). As we have mirrored servers
and since no one dies if they can't find a book at our library, using
enterprise-
e same as above. For the faceting method, just reverse the order in
the bi-grams.
Regards,
Toke Eskildsen
size makes the existing 256GB/machine a tight
fit. I seem to remember that there are two free slots in our servers, so
adding 2 new consumer-class SSDs is the obvious upgrade. We're switching
to a more memory- and CPU-efficient way of handling sorting and
faceting, so we should not need to boost
On Thu, 2010-12-02 at 03:54 +0100, David Linde wrote:
> Has anyone figured out a way to logically prove that lucene indexes every
> word properly?
The "Precision and recall in lucene"-thread seems relevant here.
> Our company has done a lot of research into lucene, all of our IT department
> is rea
On Wed, 2010-12-15 at 09:42 +0100, Ganesh wrote:
> What is the advantage of going for 64 Bit.
Larger maximum heap, more memory in the machine.
> People claim performance and usage of more RAM.
Yes, pointers normally take up 64 bits on a 64-bit machine. Depending on
the application, the overhead can
e shard, then multiply the
performance of a single index created by merging 10 shards with that number.
Regards,
Toke Eskildsen
On Fri, 2011-02-04 at 05:54 +0100, Ganesh wrote:
> 2. Consider a scenario I am sharding based on the User, I am having single
> search server and It is handling 1000 members. Now as the memory consumption
> is high, I have added one more search server. New users could access the
> second server
On Mon, 2011-02-28 at 22:44 +0100, Zhang, Lisheng wrote:
> Very sorry I made a typo, what I meant to say is that lucene sort produced
> wrong
> result in English names (String ASC):
>
> liu yu
> l yy
The standard Java Collator ignores whitespace. It can be hacked, but you
will have to write your
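The claim above can be demonstrated with a small stand-alone snippet (the wrapper class and locale choice are assumptions, not from the mail): with the space treated as ignorable, "liu yu" collates like "liuyu" and therefore sorts before "l yy", the opposite of a plain code-point comparison:

```java
import java.text.Collator;
import java.util.Locale;

// Demonstrates the behaviour from the thread: the default Java Collator
// treats whitespace as ignorable, so "liu yu" collates as if it were
// "liuyu" and comes before "l yy" ("liuyu" < "lyy" since 'i' < 'y').
public class CollatorSpace {
    public static int collate(String a, String b) {
        return Collator.getInstance(Locale.ENGLISH).compare(a, b);
    }
}
```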
On Mon, 2011-05-09 at 13:56 +0200, Samarendra Pratap wrote:
> We have an index directory of 30 GB which is divided into 3 subdirectories
> (idx1, idx2, idx3) which are again divided into 21 sub-subdirectories
> (idx1-1, idx1-2, , idx2-1, , idx3-1, , idx3-21).
So each part is about ½ G
On Fri, 2011-05-13 at 12:11 +0200, Samarendra Pratap wrote:
> Comparison between - single index Vs 21 indexes
> Total Size - 18 GB
> Queries run - 500
> % improvement - roughly 18%
I was expecting a lot more. Could you test whether this is an IO-issue
by selecting a slow query and performing the e
On Tue, 2011-05-31 at 08:52 +0200, Maciej Klimczuk wrote:
> I did some testing with 3.1.0 demo on Windows and encountered some strange
> behaviour. I tried to index ~6 small text documents using the demo.
> - First trial took about 18 minutes.
> - Second and third trial took about 2 minutes.
On Thu, 2011-06-02 at 21:51 +0200, Clint Gilbert wrote:
> We're also considering a home-grown scheme involving normalizing the
> denominators of all the index components in all our indices, based on
> the sums of counts obtained from all the indices. This feels like
> re-inventing the wheel, and i
On Mon, 2011-06-06 at 15:29 +0200, zhoucheng2008 wrote:
> I read the lucene in action book and just tested the
> FSversusRAMDirectoryTest.java with the following uncommented:
> [...]Here is the output:
>
> RAMDirectory Time: 805 ms
>
> FSDirectory Time : 728 ms
This is the code, right?
http://ja
On Fri, 2011-06-10 at 10:38 +0200, Sowmya V.B. wrote:
> I am looking for a possibility of boosting a given document at query-time,
> based on the values of a particular field : instead of plainly sorting the
> normal lucene results based on this field.
I think you misunderstand Eric's answer, as h
e vs. performance looked like the power law: Heavy performance
degradation in the beginning, less later. It makes sense when we look at
caching and it means that if you do not require stellar performance, you
can have very large indexes on few machines (cu
build your Query by code, you can use ConstantScoreRangeQuery or
RangeQuery for the range part, where you can call setBoost(float).
- Toke Eskildsen
On Thu, 2011-06-23 at 22:41 +0200, Tim Eck wrote:
> I don't want to accuse anyone of bad code but always preallocating a
> potentially large array in org.apache.lucene.util.PriorityQueue seems
> non-ideal for the search I want to run.
The current implementation of IndexSearcher uses threaded s
On Thu, 2011-06-30 at 11:45 +0200, Guru Chandar wrote:
> Thanks for the response. The documents are all distinct. My (limited)
> understanding on partitioning the indexes will lead to results being
> different from the case where you have all in one partition, due to
> Lucene currently not supp
On Tue, 2011-07-05 at 17:50 +0200, Hiller, Dean x66079 wrote:
> We are using a sort of nosql environment and deleting 200 gig on one machine
> from the database is fast, but then we go and delete 5 gigs of indexes that
> were created and it takes forever
8 million indexes is at a minimum 16
On Mon, 2011-08-22 at 18:49 +0200, Rich Cariens wrote:
> Does anyone have any experiences or stories they can share about how SSDs
> impacted search performance for better or worse?
Our measurements are getting old, but since spinning disks haven't
improved and SSDs have improved substantially since
On Tue, 2011-08-23 at 10:23 +0200, Dawid Weiss wrote:
> This one is humorous (watch for foul language though). It does get to
> the point, however, and Bergman is a clever guy:
>
http://www.livestream.com/oreillyconfs/video?clipId=pla_3beec3a2-54f5-4a19-8aaf-35a839b6ecaa
We installed SSDs in all
On Tue, 2011-08-23 at 11:52 +0200, Federico Fissore wrote:
> we are probably running out of topic here, but for the record, there is
> also someone lamenting about ssd
I find all of this highly on-topic. SSD reliability is an important
issue. We use consumer-grade SSDs (Intel 510 were the latest
On Tue, 2011-08-23 at 14:07 +0200, Marvin Humphrey wrote:
> I'm a little confused. What do you mean by a "full to-hardware flush"
> and how is that different from the sync()/fsync() calls that Lucene
> makes by default on each IndexWriter commit()?
A standard flush from the operating system flu
. I would suggest checking with S.M.A.R.T-tool to
see if it provides you with write-statistics. I would be surprised if
they were that high.
Regards,
Toke Eskildsen
hem out is
unfounded.
Regards,
Toke Eskildsen
statistics on models and recalls would come in handy.
> fede
Heh. I'm sorry, but in Danish "fede" means "fatty". On the other hand, I
also know what "Toke" means in English.
Regards,
Toke Eskildsen
be a very bad wear-leveling strategy. Keeping a counter for
each cell and selecting the free cell with the lowest count is trivial.
However, given the bumpy road to great SSDs, I am sure that some vendors
have done it this way.
Regards,
Toke Eskildsen
On Wed, 2011-08-24 at 11:46 +0200, David Nemeskey wrote:
> Theoretically, in the case described above, it would be possible to move
> 'static' data (data of cells that have not been written to for a long time)
> to
> the 5GB in question and use the 'fresher' cells as free space; this could be
>
On Sat, 2011-09-03 at 20:09 +0200, Michael Bell wrote:
> To be exact, there are about 300 million documents. This is running on a 64
> bit JVM/64 bit OS with 24 GB(!) RAM allocated.
How much memory is allocated to the JVM?
> Now, their searches are working fine IF you do not SORT the results. If
On Tue, 2011-09-06 at 17:32 +0200, Saurabh Gokhale wrote:
> Then I saw index size started exponentially increasing and by the end of 1
> year worth of data processing, I was expecting the index to be 60 to 70 GB
> but the size grew to more than 120GB.
>
> 1. Is it an expected behavior?
No, quite
On Sat, 2011-09-17 at 03:57 +0200, Charlie Hubbard wrote:
> I really just want to be called back when a new document is found by the
> searcher, and I can load the Document, find my object, and drop that to a
> file. I thought that's essentially what a Collector is, being an interface
> that is c
.
The 2K does not always make sense BTW: Newer harddrives use 4K as the smallest
physical entity: http://en.wikipedia.org/wiki/Disk_sector
- Toke Eskildsen
, then the scores
between them will be very poorly comparable.
> If so, what can I do to make the scores from multiple indices comparable?
Wait for https://issues.apache.org/jira/browse/SOLR-1632 or ensure that the
content (and sizes) of your indices are homogeneous
produce a single large file. I guess you are
performing an optimize. Don't do that (it is not really recommended anyway) and
you should have multiple smaller files.
If that was not clear, then please show us the part of your code that handles
index updates.
give you poor performance:
http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html
Stick to MMapDirectory: As you are considering using a RAMDirectory,
your index must be smaller than the amount of free RAM, which means that
everything will be fully cached and fast.
- Toke Eskildsen,
problem or switch to SSD.
- Toke Eskildsen
adays and even the enterprise ones
are not that pricey. Same goes for RAM as long as we're talking about a
relative small amount such as 32GB.
- Toke Eskildsen, State and University Library, Denmark
tter to keep it a couple of minutes? That way further
searches from the same client would be fast.
Overall, I worry about your architecture. It scales badly with the
number of documents/client. You might not have any clients with more
than 500 documents right now, but can you be sure that this will no
ith Lucene seems like the absolute worst
of both worlds.
Does the DB-selector do anything that cannot easily be replicated in
Lucene?
- Toke Eskildsen, State and University Library, Denmark
-queries are all simple matching?
No complex joins and such? If so, this calls even more for a full
Lucene-index solution, which handles all aspect of the search process.
>
- Toke Eskildsen, State and University Library, Denmark
.
Some observations you might find relevant:
https://sbdevel.wordpress.com/2013/06/06/memory-is-overrated/
- Toke Eskildsen, State and University Library, Denmark
why you get so many more I/O operations with your
16 segments.
Do you have some typical response times from the optimized index and the
segmented one, after some hundred or thousand queries have been processed and
the OS cache is properly warmed?
Can you give us a representa
the right number? I do not
see a hardware upgrade changing that with the fine machine you're using.
What is your search speed if you disable continuous updates?
When you restart the searcher, how long does the first search take?
- Toke Eskildsen, State and University Li
ng updates
- Limit page size
- Limit lookup of returned fields
- Disable highlighting
- Simpler queries
- Whatever else you might think of
At some point along the way I would expect a sharp increase in
performance.
> I've requested access to the indexes so that we can perform further testing.
e. A searchAfter that takes a
position would either need to use some clever caching or build the giant
sorted collection when called.
- Toke Eskildsen
he most compact way and perform sorting on the full
collection afterwards.
- Toke Eskildsen, State and University Library, Denmark
rence.
I am no expert there, but I would advise you to check how much free
memory your JVM has when it is running searches. GC tweaks do not help
much if the JVM is nearly out of memory.
- Toke Eskildsen, State and University Library, Denmark
this with Lucene? If so, which API functions do I need to call?
InPlaceMergeSorter is a nice one to extend. But again, with 50K result
sets, this seems like overkill.
- Toke Eskildsen, State and University Library, Denmark
atency? Increasing throughput?
More complex queries?
- Toke Eskildsen, State and University Library, Denmark
outcome of your test?
- Toke Eskildsen, State and University Library, Denmark
other services that are the
bottleneck.
- Toke Eskildsen, State and University Library, Denmark
get right, but the only
somewhat-sound approximation of real world performance.
- Toke Eskildsen
ChannelImpl.map(FileChannelImpl.java:846)
That error can also be thrown when the number of open files exceeds the given
limit. "OutOfMemory" should really have been named "OutOfResources".
Check the maximum number of open files with 'ulimit -n'. Try r
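As a sketch of that check (the raise value is an example; pick one that suits your segment/index count):

```shell
# Inspect the per-process open-file limits; a Lucene process with many
# segments or indexes can exhaust a low soft limit and surface it as an
# OutOfMemory-style error.
soft=$(ulimit -Sn)
hard=$(ulimit -Hn)
echo "soft=$soft hard=$hard"
# To raise the soft limit for this shell (up to the hard limit), e.g.:
# ulimit -n 8192
```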
to 16k, which had been working well.
If you don't use compound indexes and all your indexes are handled under the
same process constraint, then 16K seems quite low for hundreds of indexes. You
could check by issuing a file count on your index fol
) and how ca we do such request ?
Luke has term statistics built-in. I don't remember the details, but I recall
that it was straightforward.
- Toke Eskildsen
se 'cp'
as it is smart enough to bypass the operation if the destination is /dev/null.
Caveat: This does not guarantee that your index stays fully cached. It can be
evicted just like all other disk cache, if other programs
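A hedged sketch of warming the disk cache with cat instead of cp (the index path here is a stand-in created so the snippet runs; point it at your real index directory):

```shell
# Read every index file once to pull it into the OS disk cache.
# GNU cp detects a /dev/null destination and skips the actual read,
# so use cat (or dd) for warming instead.
INDEX_DIR=/tmp/example-index            # assumption: stand-in for your index
mkdir -p "$INDEX_DIR"
echo "segment data" > "$INDEX_DIR/_0.cfs"   # stand-in index file
cat "$INDEX_DIR"/* > /dev/null              # real reads: warms the cache
```

As the caveat above says, this does not pin the index in memory; other I/O can evict it again.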
our response times grow about linearly (with a bump at one point, due to the
switch from sparse to non-sparse docsets) as a function of hitcount, there is
not much to do about it besides sharding, with the current single-threaded
processing of lucene qu
ndex time, your indexes are tiny. What you are
seeing is probably just statistical flukes. Try re-running your tests a
few times and you will see the numbers change.
- Toke Eskildsen
oblems
If that does not help, give us some information to work with: How large
is your index (byte size and document count), what hardware do you have,
how large is your JVM heap, how many documents do you request at a time,
what is a typical query?
- Toke Eskildsen, State and University Library, D
onvention and the special method being
BytesRef#shallowCopyOf(BytesRef).
But we are where we are, so I don't find it viable to change behaviour.
More explicit documentation, as Dawid suggests, seems the best band aid.
- Toke Eskil
instead.
This seems contrary to
http://lucene.apache.org/core/5_2_0/core/org/apache/lucene/document/BinaryDocValuesField.html
Maybe you could update the JavaDoc for that field to warn against using it?
- Toke Eskildsen
Arjen van der Meijden wrote:
> On 9-8-2015 16:22, Toke Eskildsen wrote:
> > Maybe you could update the JavaDoc for that field to warn against using it?
> It (probably) depends on the contents of the values.
That was my impression too, but we both seem to be second-guessing Robert
ces to change code in order for it to take advantage of a changed
FixedBitSet. What is it you are trying to achieve?
- Toke Eskildsen
t okay to have a slow first-search but faster subsequent searches?
- Toke Eskildsen
simple to emulate in
your Lucene handling code:
http://lucidworks.com/blog/2013/12/12/coming-soon-to-solr-efficient-cursor-based-iteration-of-large-result-sets/
- Toke Eskildsen
l,
there is no need to store the length of the BytesRefs. They can be calculated
with bytesStarts[id+1] - bytesStarts[id]. This saves 1-2 bytes per entry and
upholds memory locality, so it should have the same performance as now (needs
to be tested of course).
- To
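A stand-alone sketch of that offsets-only layout (the array name follows the mail; the class itself is hypothetical): variable-length byte entries packed back-to-back, with an offsets array one longer than the entry count, so each length is recoverable as bytesStarts[id+1] - bytesStarts[id]:

```java
// Sketch: pack variable-length byte entries contiguously and keep only
// start offsets. Length of entry id = bytesStarts[id+1] - bytesStarts[id],
// so no per-entry length needs to be stored.
public class PackedBytes {
    final byte[] bytes;
    final int[] bytesStarts; // entryCount + 1 offsets

    public PackedBytes(byte[][] entries) {
        bytesStarts = new int[entries.length + 1];
        for (int i = 0; i < entries.length; i++) {
            bytesStarts[i + 1] = bytesStarts[i] + entries[i].length;
        }
        bytes = new byte[bytesStarts[entries.length]];
        for (int i = 0; i < entries.length; i++) {
            System.arraycopy(entries[i], 0, bytes, bytesStarts[i], entries[i].length);
        }
    }

    public int length(int id) {
        return bytesStarts[id + 1] - bytesStarts[id];
    }
}
```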
sue is that fewer, larger segments get
slower DocValues retrieval, compared to more smaller segments. So a
force merge to 1 segment can result in worse performance.
- Toke Eskildsen, the Royal Danish Library, Denmark
k.
And +1 to the issue BTW. It does not matter too much for us now, as we have
shifted to a setup where we build more indexes in parallel, but 3 years ago our
process was sequential so the 8 hour delay before building the next part was a
bit of
is simply the
number of set bits at the same locations: An AND and a POPCNT of the
bitmaps.
This does imply a sequential pass of all potential documents, which
means that it won't scale well. On the other hand each comparison is a
fast check with very low memory overhead, so I hope it will wor
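The AND-plus-POPCNT comparison described above can be sketched in a few lines (the class name is hypothetical; bitmaps are long[] words of equal length):

```java
// Sketch of the bitmap comparison: the overlap between two bitmaps is
// the popcount of their AND, computed word by word.
public class BitmapOverlap {
    public static int overlap(long[] a, long[] b) {
        int count = 0;
        for (int i = 0; i < a.length; i++) {
            count += Long.bitCount(a[i] & b[i]); // AND, then POPCNT per word
        }
        return count;
    }
}
```

Long.bitCount compiles to the POPCNT instruction on common JVMs/CPUs, which is what keeps each pairwise comparison cheap despite the sequential pass.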
d them but I figured I'd share.
A few of them, but not all. And your notes on the articles are great.
Thanks,
Toke Eskildsen, Royal Danish Library
On Thu, 2008-09-04 at 17:58 +0200, Cam Bazz wrote:
> anyone using ramdisks for storage? there is ramsam and there is also fusion
> io. but they are kinda expensive. any other alternatives I wonder?
We've done some comparisons of RAM (Lucene RAMDirectory) vs. Flash-SSD
vs. conventional harddrives.
On Fri, 2008-09-05 at 10:33 +0200, Cam Bazz wrote:
[RAM vs. Flash-SSD vs. harddrives]
> I have done similar test with ram vs. disk, and IO was the bottleneck.
> What flash ssd did you try with?
For disks (as in conventional 10.000/15.000 RPM harddrives), IO is
clearly the bottleneck for us also.
On Fri, 2008-09-05 at 11:00 +0200, Toke Eskildsen wrote:
> As for Flash-SSDs, we've tried 2 * MTRON 6000 32GB RAID 0, 2 * SanDisk
> 5000 32GB RAID 0 and SanDisk something (64GB model) both as single drive
> and 4 drives in RAID 0.
Update:
The "SanDisk something" tu
On Fri, 2008-10-24 at 16:01 +0200, Sudarsan, Sithu D. wrote:
> 4. We've tried using larger JVM space by defining -Xms1800m and
> -Xmx1800m, but it runs out of memory. Only -Xms1080m and -Xmx1080m seems
> stable. That is strange as we have 32 GB of RAM and 34GB swap space.
> Typically no other appli
Sudarsan, Sithu D. [EMAIL PROTECTED] wrote:
> There have been some earlier messages, where memory consumption issue
> for Lucene Documents due to 64 bit (double that of 32 bit).
All pointers are doubled, yes. While not a doubling in total RAM consumption,
it does give a substantial overhead.
> We
On Mon, 2008-11-03 at 04:42 +0100, Justus Pendleton wrote:
> 1. Why does the merge factor of 4 appear to be faster than the merge
> factor of 2?
Because you alternate between updating the index and searching? With 4
segments, chances are that most of the segment-data will be unchanged
between sear
On Mon, 2008-11-03 at 23:37 +0100, Justus Pendleton wrote:
> What constitutes a "proper warm up before measuring"?
The simplest way is to do a number of searches before you start
measuring. The first searches are always very slow, compared to later
searches.
If you look at http://wiki.statsbiblio
We use Lucene at our library for indexing from different sources into
the same logical index. The sources are very diverse and are prioritized
differently at index-time with document boosts. However, different
groups of users (or individual users for that matter) have different
preferences for the
On Thu, 2008-11-27 at 07:30 +0100, Karl Wettin wrote:
> The most scary part is that that you will have to score each and every
> document that has a source, probably all of the documents in your
> corpus.
I now see my query-logic was flawed. In order to avoid matching all
documents every time,
On Thu, 2008-11-27 at 20:55 +0100, Karl Wettin wrote:
> A cosmetic remark, I would personally choose a single field for the
> boosts and then one token per source. (groupboost:A^10 groupboost:B^1
> groupboost:C^0.1).
Agreed. Thanks.
> If I'm not mistaken CustomScoreQuery is a non matching qu