Hello everyone, I have a use case for my users which I'm having issues implementing. Hoping to find some insights here.
We are trying to let our users search almost any content data they have, while respecting access control policies. My users are grouped into teams, and policies are applied to the content based on teams.

Currently we store every piece of data as a document, e.g. { type: contact_number, value: 1234567890, aclId: 1_contact_number }, in our content index (6 million documents), and { aclId: 1_contact_number, canRead: TEAM_A } in our acl index (2 million documents). DocValues is enabled for aclId on both indexes.

At query time, we query the content index and use the Join Query Parser in the fq, like so:

    fq={!join from=aclId fromIndex=acl to=aclId}canRead:(TEAM_A OR TEAM_B)

where the user is part of TEAM_A and TEAM_B. This takes close to 8 seconds for uncached queries.

Based on my understanding, this is slow because Solr has to:
1. Retrieve all hits from the acl index
2. Comb through the entire content index, finding documents whose aclId matches those hits from the acl index
3. Apply any remaining content query to filter the results from the content index

We have also tried using {!join ... score=none} (based on what we Googled).

Thoughts on improving this:
- Thought of using streaming expressions, but using /export on the content index requires sorting by fields other than the score
- Querying the content index based on just the content, then filtering the results by acl on our backend until we have the first 10 results
  - This requires us to load the entire acl
  - We would have to repeatedly query the content index if documents keep getting dropped because of acl
  - Benefit is that we don't have to comb the entire content index
  - This could be a Plugin? (not sure if it's worth the effort)

Am I barking up the wrong tree?
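
P.S. To make the backend-filtering idea above concrete, here is a rough Python sketch of what I have in mind. It is only an illustration under assumptions, not code we run: fetch_page is a hypothetical callable that would wrap a plain content query with start/rows paging, and acl is a hypothetical in-memory dict built by loading the whole acl index (aclId to the set of teams in canRead).

```python
from typing import Callable, Dict, Iterable, List, Set

Doc = Dict[str, object]


def first_readable(
    fetch_page: Callable[[int, int], List[Doc]],  # hypothetical: (start, rows) -> docs
    acl: Dict[str, Set[str]],                     # hypothetical: aclId -> teams that can read
    user_teams: Iterable[str],
    wanted: int = 10,
    page_size: int = 50,
    max_pages: int = 20,
) -> List[Doc]:
    """Collect the first `wanted` hits the user may read, paging as needed."""
    teams = set(user_teams)
    results: List[Doc] = []
    for page in range(max_pages):
        # Query the content index on content only, one page at a time.
        docs = fetch_page(page * page_size, page_size)
        if not docs:
            break  # the content query has no more hits
        for doc in docs:
            allowed = acl.get(str(doc.get("aclId")), set())
            if allowed & teams:  # user is in at least one team that can read
                results.append(doc)
                if len(results) == wanted:
                    return results
    # Fewer than `wanted` readable hits were found within max_pages.
    return results
```

The max_pages cap is there because a user who can read very little would otherwise make us page through most of the content hits, which is the "repeatedly query" cost mentioned above.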