Erick Erickson wrote:
This has been discussed more than a few times. I suggest you take
a look at the searchable archive for things like privileges, access
privileges, etc. You'll find lots of information faster that way...
You mean Erik Hatcher's answer re SecurityFilter
http://archives.devshed.com/forums/apache-92/lucene-vs-sql-database-1416862.html
or Eugene Dzhurinsky's post where he is also asking re pre-filtering
http://mail-archives.apache.org/mod_mbox/lucene-java-user/200510.mbox/[EMAIL PROTECTED]
http://mail-archives.apache.org/mod_mbox/lucene-java-user/200510.mbox/[EMAIL PROTECTED]
http://mail-archives.apache.org/mod_mbox/lucene-java-user/200510.mbox/[EMAIL PROTECTED]
and Erik keeps suggesting the use of a filter
http://mail-archives.apache.org/mod_mbox/lucene-java-user/200510.mbox/[EMAIL PROTECTED]
whereas Hui also points out that filters are a problem regarding
performance/scalability
http://mail-archives.apache.org/mod_mbox/lucene-java-user/200510.mbox/[EMAIL PROTECTED]
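To make the filter idea concrete, here is a minimal sketch in plain Python (not the Lucene API). It assumes each document carries a path in the hierarchy and each user has a list of granted branches; the names allowed_doc_ids, filtered_search, doc_paths and postings are hypothetical and only serve the illustration. The set of allowed documents is computed once per user and intersected with the hits, so protected documents never reach the result list. The cost moves into building and refreshing that set whenever the access control changes, which may be one source of the scalability concern.

```python
from typing import Dict, Iterable, List, Set


def allowed_doc_ids(doc_paths: Dict[int, str],
                    granted_branches: Iterable[str]) -> Set[int]:
    """Collect the ids of all documents that live under a granted branch."""
    # Normalise to a trailing slash so "/sales" does not match "/salesforce".
    prefixes = tuple(b.rstrip("/") + "/" for b in granted_branches)
    return {
        doc_id
        for doc_id, path in doc_paths.items()
        if (path.rstrip("/") + "/").startswith(prefixes)
    }


def filtered_search(postings: Dict[str, List[int]], term: str,
                    allowed: Set[int], page_size: int = 10) -> List[int]:
    """Return up to page_size ranked hits for term that are in the allowed set."""
    hits = (doc_id for doc_id in postings.get(term, []) if doc_id in allowed)
    return [doc_id for _, doc_id in zip(range(page_size), hits)]


# Hypothetical toy data: four documents in a small hierarchy.
doc_paths = {
    1: "/sales/emea/q1",
    2: "/sales/apac/q1",
    3: "/hr/payroll/2008",
    4: "/salesforce/notes",
}
postings = {"report": [3, 1, 4, 2]}  # doc ids in ranked order

allowed = allowed_doc_ids(doc_paths, ["/sales"])
result = filtered_search(postings, "report", allowed)
print(result)  # [1, 2]
```

With 3000 users and daily changes to the access control, the open question is how often each user's set has to be rebuilt and how much memory the cached sets take.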
I actually searched quite a bit before my post, but didn't find an
answer that really addressed the performance/scalability issues.
But if it was answered before and somebody knows about it, then I would
very much appreciate any concrete URLs/pointers.
Thanks
Michael
Best
Erick
On Mon, Nov 10, 2008 at 2:52 PM, Michael Wechner
<[EMAIL PROTECTED]> wrote:
Hi
We have about 1 million documents (and growing) arranged in a hierarchy (3 to
20 levels deep) and about 3000 people accessing these nodes. Some people
have access to certain branches, other people to other branches, and some
branches are shared. The access control of these nodes changes every day
and also contains shortcuts which allow people to glimpse into parts of
branches which they otherwise do not have access to.
Currently we have one index for all nodes, which is OK
performance/scalability-wise, but before displaying the results we need to
filter based on the access privileges each user has. This is very bad
performance-wise, because the first 10K hits might all be protected for
this user, and hence it can take a very long time to find a result that
the user is actually allowed to see.
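The cost of this post-filtering can be illustrated with a small plain-Python sketch (not Lucene code). The function first_page_post_filtered and the may_read callback are hypothetical names standing in for the result loop and the per-document access check; the data simply reproduces the worst case described above, where the first 10K hits are all protected.

```python
from typing import Callable, Iterable, List, Tuple


def first_page_post_filtered(
    ranked_hits: Iterable[int],
    may_read: Callable[[int], bool],
    page_size: int = 10,
) -> Tuple[List[int], int]:
    """Return (visible hits, number of hits examined) for one result page."""
    page: List[int] = []
    examined = 0
    for doc_id in ranked_hits:
        examined += 1
        # One access check per hit; in a real system this is the slow part.
        if may_read(doc_id):
            page.append(doc_id)
            if len(page) == page_size:
                break
    return page, examined


# Hypothetical worst case: 20,000 ranked hits, the first 10,000 protected.
hits = range(20_000)
page, examined = first_page_post_filtered(hits, lambda doc_id: doc_id >= 10_000)
print(len(page), examined)  # 10 10010
```

Filling a page of 10 results takes 10,010 access checks here, and the number grows with the share of protected documents near the top of the ranking.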
We were thinking about introducing an index for each user which only
contains the documents the user is actually allowed to see, but this
doesn't scale well either as the number of users grows.
Any hints on how other people approach such a situation would be very
much appreciated.
Thanks
Michael
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]