[ https://issues.apache.org/jira/browse/SOLR-4470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13666130#comment-13666130 ]

Per Steffensen commented on SOLR-4470:
--------------------------------------

Sorry to interrupt again, but Jan is clearly still the only one who really 
understands what this is and is not about. It is a shame.

In this discussion "security" seems to be treated as just one thing. People talk 
about aspects of security that this issue has nothing to do with AT ALL.

Security is about a lot of aspects, among others:
* Authentication: Allowing the client to identify itself as being someone or 
something, and doing it in a way so that you (the server-side) trust it. Examples 
of someone/something:
** a) The one(s) knowing a specific set of username/password
** b) The one(s) holding the private part of a certificate-pair (e.g. RSA)
** c) The one(s) able to send requests from a machine with a certain IP-address
* Authorization: Basically a map from the "key" associated with your 
authentication to a set of things you are allowed to do (see the small sketch 
after this list). "Things you are allowed to do" can e.g. be "functions/operations 
you are allowed to do" or, more fine-grained, "data you are allowed to read or 
update or delete". For a) the "key" will be the username, for b) the "key" will be 
the public part of the certificate-pair (which you know in advance), and for c) 
the "key" will be the IP-address
* Integrity: E.g. on the transport-layer, that data has not been changed on the 
way from when the client (the authenticated party) sent it until the server 
receives it
* Confidentiality: E.g. on the transport-layer, that data has not been read (and 
understood) on the way from when the client sent it until the server receives it
* Dealing with those aspects on storage-, application-, transport-, OS-, 
etc. levels
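
Purely as an illustration of the authorization point above (nothing from the 
patch; the users and operations are made-up examples), authorization can be 
thought of as a lookup from the authenticated "key" to the set of allowed 
operations:
{code}
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Illustration only: authorization as a map from the authentication "key"
// (username, public certificate part, IP-address, ...) to allowed operations.
public class AuthorizationMapSketch {
  public static void main(String[] args) {
    Map<String, Set<String>> allowedOperations = new HashMap<String, Set<String>>();
    allowedOperations.put("search-user",
        new HashSet<String>(Arrays.asList("select", "terms", "get")));
    allowedOperations.put("admin-user",
        new HashSet<String>(Arrays.asList("select", "terms", "get", "update", "admin")));

    String authenticatedKey = "search-user"; // produced by the authentication step
    boolean mayUpdate = allowedOperations.get(authenticatedKey).contains("update");
    System.out.println("update allowed for " + authenticatedKey + ": " + mayUpdate); // false
  }
}
{code}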

Different kinds of technology claim to be able to guarantee different sets of 
those aspects. E.g. a webcontainer is required to be able to deal with 
authentication and authorization on the application-layer and with integrity and 
confidentiality on the transport-layer - at least if it wants to be certified, 
because it is part of the spec that it needs to implement. The fact that a 
webcontainer is able to deal with those things for you is something people 
would like to take advantage of - whether you guys are willing to accept it 
or not. At my company we want to do it, and as I understand it, Jan knows about 
at least two others that also want to do it. I would claim that at least 
90% of Solr users who want "to do security" would like to take advantage of 
the fact that a webcontainer can handle those things for you.

bq. Providing security features is not just something you do, like adding any 
other feature. You need to have people with a real security background who know 
what the fuck they are doing to ensure correctness. You need to deal with the 
inevitable security vulnerabilities and fixes to those. I don't think this is 
something our PMC should waste its time with.

I agree. But this is NOT about Solr going into "security" in the way that "we 
handle/guarantee this and that kind of security aspect for you". That is still 
left to other technologies like e.g. the webcontainer. This issue is all about 
enabling a SolrCloud cluster to work, IF you (a Solr user) choose to have 
another technology enforce certain security aspects for you. If a Solr user 
sets up any kind of security technology that requires incoming traffic to a 
Solr-node to be authenticated (by HTTP basic auth), and this is also enforced for 
Solr-node to Solr-node traffic, the SolrCloud cluster will not work, and you 
cannot (easily) make it work in a secure way without Solr changes. If you use 
the webcontainer (running Solr) to enforce those security aspects, it will be 
enforced for Solr-node to Solr-node traffic also.
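
To make the point concrete, here is a minimal sketch (not code from the patch; 
the host, collection and credentials are made-up examples) of what any 
Solr-node to Solr-node sub-request has to do once the container demands basic 
auth - without the Authorization header the container answers 401 and the 
distributed request fails:
{code}
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.Base64;

// Sketch only: an "internal" request from one Solr-node to another when the
// webcontainer on the target node requires HTTP basic auth.
public class InternalRequestSketch {
  public static void main(String[] args) throws Exception {
    String internalUser = "solr-internal"; // assumed internal account
    String internalPassword = "secret";    // assumed password
    URL url = new URL(
        "http://solr-node-2:8983/solr/collection1/select?q=*:*&distrib=false");

    HttpURLConnection conn = (HttpURLConnection) url.openConnection();
    // Without this header the container rejects the sub-request with 401,
    // and the distributed search it is part of fails.
    String token = Base64.getEncoder().encodeToString(
        (internalUser + ":" + internalPassword).getBytes("UTF-8"));
    conn.setRequestProperty("Authorization", "Basic " + token);

    System.out.println("HTTP " + conn.getResponseCode());
    InputStream in = conn.getInputStream();
    in.close();
  }
}
{code}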

bq. So when issues come up about security-related things in lucene/solr that 
can be done elsewhere outside of it in a more secure way instead (e.g. 
encrypting directories and so on), you can expect me to push back too.

You might be able to do it "elsewhere" but it will certainly not be "more 
secure".
Encrypting directories is something completely different; it deals with 
confidentiality aspects at the storage-level. This issue has absolutely nothing 
to do with that.

bq. we could recommend ipsec instead for example

IPsec is mainly about integrity and confidentiality on the transport-level 
(IP-layer). That is a valid alternative to letting the webcontainer deal with 
integrity and confidentiality on the transport-level (basically requiring HTTPS 
transport). Using IPsec for authentication and authorization is very 
complicated, and unless you really want to do major work, it is only able to 
deal with those aspects based on certificates. People want to use usernames and 
passwords in a lot of use-cases. You do not see facebook or twitter or ... 
wanting you to generate your own RSA certificate-pairs and send them the public 
part. I know Solr has the philosophy that it is not supposed to be 
exposed directly - instead it is exposed indirectly through some kind of gateway 
(where authentication and authorization wrt "outer users" can be enforced). But 
if you are fairly paranoid you do not necessarily want to trust those gateways 
(they might do bad things both intentionally and unintentionally), and 
therefore you will also likely want to set up security around your SolrCloud 
cluster itself. Activating the webcontainers' (the ones running Solr) ability to 
do it for you is just an obvious way.

bq. That's why I don't understand how someone would use "forwarding 
credentials" feature if Solr does not provide any way (best practices, recipes, 
whatever) to enforce authz policies / security. How do you do that in your 
application? How do you specify who can do what? Where do you enforce that - in 
custom UpdateProcessor, SearchComponent, SolrDispathFilter?

We use the webcontainer's ability to enforce those aspects of security. For 
recipes I have added a lot to http://wiki.apache.org/solr/SolrSecurity - go 
read it.

To spell it out, we do the following:
* Add to Solr web.xml AT THE VERY TOP
{code}
  <filter>
    <filter-name>RegExpAuthorizationFilter</filter-name>
    <filter-class>org.apache.solr.servlet.security.RegExpAuthorizationFilter</filter-class>
    <init-param>
      <param-name>search-constraint</param-name>
      <param-value>1|search-role,admin-role|^.*/select$</param-value>
    </init-param>
    <init-param>
      <param-name>terms-constraint</param-name>
      <param-value>2|search-role,admin-role|^.*/terms$</param-value>
    </init-param>
    <init-param>
      <param-name>get-constraint</param-name>
      <param-value>3|search-role,admin-role|^.*/get$</param-value>
    </init-param>
    <init-param>
      <param-name>admin-constraint</param-name>
      <param-value>4|admin-role|^.*$</param-value>
    </init-param>
  </filter>
{code}
* Add to Solr web.xml (at the spot where it belongs)
{code}
  <security-constraint>
    <web-resource-collection>
      <web-resource-name>All resources need authentication</web-resource-name>
      <url-pattern>/*</url-pattern>
    </web-resource-collection>
    <auth-constraint>
      <role-name>search-role</role-name>
      <role-name>admin-role</role-name>
    </auth-constraint>
  </security-constraint>

  <login-config>
    <auth-method>BASIC</auth-method>
    <realm-name>My Realm</realm-name>
  </login-config>
{code}
* Add to jetty.xml
{code}
    <Call name="addBean">
      <Arg>
        <New class="xxx.yyy.zzz.MyZKLoginService">
          <Set name="name">My Realm</Set>
        </New>
      </Arg>
    </Call>
{code}

This basically asks Jetty to handle authentication and authorization for you - 
AT the application-layer. See details on http://wiki.apache.org/solr/SolrSecurity 
about how it works and why it is done the way it is.
* Actually we only let the webcontainer deal with the authentication part. We 
want to do authorization based on URL-patterns, which a webcontainer is able to 
do. But due to limitations on <url-pattern>, the way Solr-URLs are 
structured and our requirements for URL-based authorization (basically we want a 
"search-user" allowed to do searches only and an "admin-user" allowed to do 
anything), we need to deal with authorization in another way. We deal with 
URL-based authorization by adding the RegExpAuthorizationFilter filter in 
web.xml. It does URL-pattern based authorization, just as the webcontainer itself 
is able to do for you, but this solution allows reg-exp URL-patterns, enabling us 
to enforce the rules we want (a sketch of such a filter is shown right after 
this list).
* We use our own Realm, but you can use one of those that come out of the box 
with Jetty - see http://wiki.apache.org/solr/SolrSecurity. Our realm uses 
data that we put in ZooKeeper. ZooKeeper has some properties that make it a 
nice persistence layer for a realm. It is distributed (unlike local files), so 
it is easy to make sure that all Solr-nodes at any time authenticate and 
authorize with the same set of credentials and roles. It also has this nice 
push-thing (watchers), enabling us to make changes to the realm-data-foundation 
(in ZK) and have all realms (living in the webcontainers) become aware of the 
changes without having to poll for changes all the time (see the second sketch 
after this list).
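
For readers who want an idea of what such a reg-exp authorization filter looks 
like, here is a minimal sketch. It is NOT the actual 
org.apache.solr.servlet.security.RegExpAuthorizationFilter from the patch (see 
the patch/wiki for that), just an illustration of the init-param format 
"<order>|<roles>|<regexp>" used in the web.xml above:
{code}
import java.io.IOException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.Enumeration;
import java.util.List;
import java.util.regex.Pattern;

import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

// Sketch only - not the class from the patch. Each init-param value is
// "<order>|<comma-separated-roles>|<regexp>"; the first (lowest order)
// pattern matching the request URI decides which roles are allowed.
public class RegExpAuthorizationFilterSketch implements Filter {

  private static class Constraint {
    int order;
    String[] roles;
    Pattern pattern;
  }

  private final List<Constraint> constraints = new ArrayList<Constraint>();

  public void init(FilterConfig config) throws ServletException {
    Enumeration<String> names = config.getInitParameterNames();
    while (names.hasMoreElements()) {
      String[] parts = config.getInitParameter(names.nextElement()).split("\\|");
      Constraint c = new Constraint();
      c.order = Integer.parseInt(parts[0].trim());
      c.roles = parts[1].split(",");
      c.pattern = Pattern.compile(parts[2]);
      constraints.add(c);
    }
    // evaluate the most specific constraints (lowest order) first
    Collections.sort(constraints, new Comparator<Constraint>() {
      public int compare(Constraint a, Constraint b) { return a.order - b.order; }
    });
  }

  public void doFilter(ServletRequest req, ServletResponse resp, FilterChain chain)
      throws IOException, ServletException {
    HttpServletRequest httpReq = (HttpServletRequest) req;
    for (Constraint c : constraints) {
      if (c.pattern.matcher(httpReq.getRequestURI()).matches()) {
        for (String role : c.roles) {
          if (httpReq.isUserInRole(role.trim())) {
            chain.doFilter(req, resp); // authenticated user has a permitted role
            return;
          }
        }
        ((HttpServletResponse) resp).sendError(HttpServletResponse.SC_FORBIDDEN);
        return;
      }
    }
    chain.doFilter(req, resp); // no constraint matched this URL
  }

  public void destroy() {}
}
{code}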
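
And, purely as a sketch of the ZooKeeper "push" behaviour mentioned above (the 
znode path is a made-up example; our real realm implementation is not shown 
here), a realm can keep its user/role data current without polling like this:
{code}
import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooKeeper;

// Sketch only: keep an in-memory copy of realm data stored in a ZK znode and
// refresh it when ZooKeeper pushes a change notification (a watch event).
public class ZkRealmDataWatcher implements Watcher {

  private final ZooKeeper zk;
  private final String path; // e.g. "/security/realm-data" (assumed layout)
  private volatile byte[] realmData;

  public ZkRealmDataWatcher(ZooKeeper zk, String path) throws Exception {
    this.zk = zk;
    this.path = path;
    reload();
  }

  private void reload() throws KeeperException, InterruptedException {
    // ZK watches are one-shot, so re-register this watcher on every read
    realmData = zk.getData(path, this, null);
  }

  public void process(WatchedEvent event) {
    if (event.getType() == Event.EventType.NodeDataChanged) {
      try {
        reload(); // all realms see new credentials/roles almost immediately
      } catch (Exception e) {
        // a real realm would log and retry here
      }
    }
  }

  public byte[] currentRealmData() {
    return realmData;
  }
}
{code}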

bq. supporting basic auth and https primarily at the container level is much 
less contraversial

Agree. That is exactly why you want to enable a SolrCloud cluster to still 
work if the Solr admin chooses to let the container enforce that kind of 
security.

bq. SolrCloud is still at an early enough phase that I'm not really willing to 
spend a lot of time considering security as I add new features or refactor 
older code. Nor do I want to be on the line when some big company has a 
security breach due to my code changes.

You will not have to deal with the big company. If the enforcement of security 
does not work, it is because the technology they use to enforce it does not 
work. Solr is not enforcing security - the webcontainer or something else is. 
This patch only introduces the ability in SolrCloud to keep working if the Solr 
admin chooses to let the container handle security for you.

bq. can you setup some basic auth at the container level

No, basically not. Not before this patch.

bq. ...and run most things over https?

No, but that is another issue with Solr - or at least it was the last time I 
checked it. IPsec is a valid alternative, though.

bq. I think ssl stuff should be working after recent http client upgrade and 
switch to SystemDefaultHttpClient. Now I believe you can set up your key and 
trust stores using standard Java properties and it should work.

Well, you are kind of right, even though you mix up concepts a little. SSL (or 
HTTPS = HTTP over SSL) is about the transport-layer - nothing to do with this 
issue, SystemDefaultHttpClient or key- and trust-stores. But SSL uses 
certificate-pairs to do the encryption over the transport-layer. Those same 
certificate-pairs can be used for authentication, but that is another aspect and 
has nothing to do with SSL. There is one big difference, though, in how easy it 
is to use certificates for encrypted transport vs. using them for 
authentication: to use them for authentication you need to pre-exchange the 
public parts of your certificates; to use them for encryption on the 
transport-layer you do not have to pre-exchange anything.

SystemDefaultHttpClient enables us to do certificate-based authentication, 
yes. But it requires setting up key- and trust-stores and pre-exchanging 
certificates. People want to use username/password based authentication in a 
lot of use-cases.
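
For reference, these are the standard JSSE system properties the quote refers 
to (the paths and passwords are made-up examples; in practice you would 
typically pass them as -D options when starting the container rather than set 
them in code):
{code}
// Sketch only: the standard javax.net.ssl system properties that the JVM's
// default SSL machinery (and hence e.g. SystemDefaultHttpClient) picks up.
public class SslSystemPropertiesSketch {
  public static void main(String[] args) {
    // key store: this node's own certificate-pair (private + public part)
    System.setProperty("javax.net.ssl.keyStore", "/path/to/solr-node-keystore.jks");
    System.setProperty("javax.net.ssl.keyStorePassword", "changeit");
    // trust store: the pre-exchanged public parts of the peers you trust
    System.setProperty("javax.net.ssl.trustStore", "/path/to/truststore.jks");
    System.setProperty("javax.net.ssl.trustStorePassword", "changeit");
  }
}
{code}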

bq. Correct me if I'm wrong, but all handling of security on inbound requests 
to Solr is still handled fully by the container, even with this patch. I.e. no 
code that you add to SolrCloud will be able to open a hole for accepting 
incoming search requests that should not have been accepted. The user 
configures the realms, user/pass etc fully on the container level.

Correct!

bq. With one exception, and that is the 
-DinternalAuthCredentialsBasicAuthPassword=<password> passed to Solr code, 
enabling system-initiated inter-node communication. If this is snapped up by 
foreigners, they potentially gain full access to Solr if they have physical 
network access. We should find a better way than passing this on the 
command-line

Agree with your concern. The VM-param way of handing over the passwords to Solr 
is the easiest way, though. I wanted to limit the patch, so that is the only 
thing directly supported for now. But the solution actually does enable you to 
do it a different way. You can override the CredentialsProviders, or you can 
choose to use the default one (finding credentials in VM params) but not set 
the VM params on the command-line. We do the latter in my company - basically we 
pipe credentials in through stdin, have a small bean read from stdin when the 
container starts and add whatever it reads as VM params. Voila, the passwords 
are not to be found in the environment or on the command-line (exposed by e.g. 
"ps -ef" or "ps eww").
                
> Support for basic http auth in internal solr requests
> -----------------------------------------------------
>
>                 Key: SOLR-4470
>                 URL: https://issues.apache.org/jira/browse/SOLR-4470
>             Project: Solr
>          Issue Type: New Feature
>          Components: clients - java, multicore, replication (java), SolrCloud
>    Affects Versions: 4.0
>            Reporter: Per Steffensen
>            Assignee: Jan Høydahl
>              Labels: authentication, https, solrclient, solrcloud, ssl
>             Fix For: 4.4
>
>         Attachments: SOLR-4470_branch_4x_r1452629.patch, 
> SOLR-4470_branch_4x_r1452629.patch, SOLR-4470_branch_4x_r1454444.patch, 
> SOLR-4470.patch
>
>
> We want to protect any HTTP-resource (url). We want to require credentials no 
> matter what kind of HTTP-request you make to a Solr-node.
> It can fairly easily be achieved as described on 
> http://wiki.apache.org/solr/SolrSecurity. The problem is that Solr-nodes 
> also make "internal" requests to other Solr-nodes, and for it to work 
> credentials need to be provided here also.
> Ideally we would like to "forward" credentials from a particular request to 
> all the "internal" sub-requests it triggers, e.g. for search and update 
> requests.
> But there are also "internal" requests
> * that are only indirectly/asynchronously triggered from "outside" requests 
> (e.g. shard creation/deletion/etc. based on calls to the "Collection API")
> * that do not in any way relate to an "outside" "super"-request (e.g. 
> replica syncing stuff)
> We would like to aim at a solution where "original" credentials are 
> "forwarded" when a request directly/synchronously triggers a subrequest, with 
> a fallback to configured "internal credentials" for the 
> asynchronous/non-rooted requests.
> In our solution we would aim at only supporting basic HTTP auth, but we would 
> like to make a "framework" around it, so that not too much refactoring is 
> needed if you later want to add support for other kinds of auth (e.g. digest).
> We will work on a solution but create this JIRA issue early in order to get 
> input/comments from the community as early as possible.
