[ 
https://issues.apache.org/jira/browse/SOLR-4470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13883290#comment-13883290
 ] 

Per Steffensen commented on SOLR-4470:
--------------------------------------

bq.  This product does not, and even worse, the core Dev team seems intent on 
NEVER doing so!

At least most of them, yes. It is really a shame.

bq. As the lead Java architect for Distributed Systems Engineering at a fortune 
100 company, security is my single most important concern

As the tech lead on the largest REAL SolrCloud installation on the planet, I 
agree :-) I believe I can say that we have the largest installation in the 
world for two reasons
* Upgrading from one version of SolrCloud to the next does not seem to be very 
important in this product. At least it is hard to do, and there seems to be no 
testing of it when a new release 4.y comes out - no testing that you can 
actually upgrade to it from 4.x. This makes me believe that no one, or at 
least only a few, has an installation so big that installing 4.y and 
storing/indexing all data from the old 4.x installation from scratch is not an 
option. If others actually had to do upgrades where this is not possible, lots 
of complaints would pop up - and they don't.
* Our biggest system stores and indexes 1-2 billion documents per day, and has 
2 years of history. That is about 1000 billion documents in Solr at any time, 
with 1-2 billion going in every day (and 30-60 billion going out every month). 
To be able to run such a system we needed to do numerous optimizations; in 
general, without optimizations you will never get such a big system working. I 
do not see much talk around here about optimizations of that kind - probably 
because people have not run into the problems yet.

bq. I like Solr. I like what it does and how it does it.

Me too. In that respect it actually has numerous advantages over e.g. 
ElasticSearch. We used ES to begin with, and we liked it, but for political 
reasons we were not allowed to keep using it, so we had to find an 
alternative. At that point in time SolrCloud (4.x) was only in its startup 
phase (a year before 4.0 was released), but we believed so much in the idea 
behind it that we decided to go for it.

bq. However, it's lack of internal security hooks is a complete show stopper 
for use at my firm

For us, too. That is why we made our own fix to it - provided as a patch here 
and also available at https://github.com/steff1193/lucene-solr

bq. Using this patch as our starting point

I am happy to hear that. Please feel free to contact me if you have any 
problems making it work or understanding what it does. I might also be able to 
provide a few tips on making it extra secure :-)

bq. and have our own Solr-like engine

We made the same decision years ago. We have had our own version of Solr in our 
own VCS for years. Just recently I put the code on 
https://github.com/steff1193/lucene-solr. No releases (incl maven artifacts) 
yet. But that will come soon. Until then you will have to build it yourself 
from source.

bq. Also, Mavenize the damned thing! Modern projects still use Ant? I haven't 
opened a build.xml script in half a decade or more....

Already done. 
{code}
ant [-Dversion=$VERSION] get-maven-poms
{code}
This will build the Maven structure in the folder "maven-build".
E.g. if you use Eclipse:
{code}
ant eclipse
{code}
In Eclipse, right-click the root folder, choose "Import..." and "Existing Maven 
Projects". Import all Maven pom.xml files from the maven-build folder.

bq. We have absolutely no idea what servlet container the user is going to use 
for running the solr war.

It is not important for this issue. Protecting the HTTP endpoints with 
authentication and authorization is standardized in the servlet spec. All 
web-containers have to live up to that standard (to be certified). The only 
place where the standardization is not very clear is how to install a realm 
(the thingy knowing about user credentials and roles), but all containers have 
plenty of documentation on how to do it.

It is very important to understand that this issue, and the patch I provided, 
will work for any web-container. This issue is not about enforcing the 
protection - let the web-container do that. This issue and the patch are ONLY 
about enabling Solr to send credentials in its Solr-node-to-Solr-node requests, 
so that things will keep working if/when you make the obvious security 
decision and use the security features provided to you for free by the 
container.
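
To illustrate the idea only (this is not code from the patch), here is a minimal Java sketch of what "sending credentials in a node-to-node request" amounts to with basic http auth: pick the credentials of the outside request when there is one, otherwise fall back to configured internal credentials, and turn them into an Authorization header value. The class InterNodeAuthSketch, the Credentials holder and both method names are hypothetical names made up for this example; only java.util.Base64 and the "Basic" header format are standard.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.Optional;

// Hypothetical sketch, not the SOLR-4470 patch: all names here are made up.
public class InterNodeAuthSketch {

    /** Plain holder for a username/password pair (assumed type). */
    public static final class Credentials {
        public final String username;
        public final String password;

        public Credentials(String username, String password) {
            this.username = username;
            this.password = password;
        }
    }

    /**
     * Forward the credentials of the triggering outside request if present;
     * otherwise fall back to the configured internal credentials
     * (e.g. for asynchronous or non-rooted requests such as replica syncing).
     */
    public static Credentials chooseCredentials(Optional<Credentials> fromOuterRequest,
                                                Credentials internal) {
        return fromOuterRequest.orElse(internal);
    }

    /** Build the value of the HTTP "Authorization" header for basic auth. */
    public static String basicAuthHeader(Credentials c) {
        String pair = c.username + ":" + c.password;
        String encoded = Base64.getEncoder()
                .encodeToString(pair.getBytes(StandardCharsets.UTF_8));
        return "Basic " + encoded;
    }

    public static void main(String[] args) {
        // Assumed example values; a real setup would read these from configuration.
        Credentials internal = new Credentials("solr-internal", "secret");
        Credentials chosen = chooseCredentials(Optional.empty(), internal);
        System.out.println("Authorization: " + basicAuthHeader(chosen));
    }
}
```

The receiving node does nothing special in this picture: its web-container checks the header against its realm, exactly as it does for requests coming from outside.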

bq. Solr has no control over the server-side HTTP layer right now, so anything 
we try to do will almost certainly be wrong as soon as the user changes 
containers or decides to modify their container config.

NO!

bq. Solr 5.0 will not ship as a .war file

Bad idea. This is one of the points where Solr made a better decision than ES.

bq.  Once Solr is a "real" application that owns and fully controls the HTTP 
layer, security will not be such a nightmare

Web-containers let you control exactly what you ought to be able to control 
through configuration.
Security is not a nightmare today. If you remove the web-container it will 
become a nightmare.
In web-containers it is standardized and most people know how it works. The 
web-container vendors (JBoss, Glassfish, Tomcat, WebLogic, Jetty, Geronimo, 
WebSphere, Trifork T4, etc etc) have spent years building and stabilizing the 
security implementation. Thinking that we can do better by ourselves in Solr is 
just naive.
Web-container vendors have also spent years and years developing, stabilizing 
and optimizing all the other stuff that web-containers give us for free 
(optimized HTTP end-points, thread control/pooling, security etc etc). 
Believing that Solr can do better in those areas is just naive. Why not let 
web-containers deal with what they have been made to deal with, and let Solr 
deal with its core area of business?

The decision to move away from letting the web-container handle this is also 
kinda contradictory to what Mark said earlier: "We have kept security related 
things at the container and user level for a reason in the past - moving from 
that stance should require a lot of buy in." I agree with Mark on that one, so 
why move away from that? Web-containers are very good at handling security - 
why not let them do that job?

> Support for basic http auth in internal solr requests
> -----------------------------------------------------
>
>                 Key: SOLR-4470
>                 URL: https://issues.apache.org/jira/browse/SOLR-4470
>             Project: Solr
>          Issue Type: New Feature
>          Components: clients - java, multicore, replication (java), SolrCloud
>    Affects Versions: 4.0
>            Reporter: Per Steffensen
>            Assignee: Jan Høydahl
>              Labels: authentication, https, solrclient, solrcloud, ssl
>             Fix For: 4.7
>
>         Attachments: SOLR-4470.patch, SOLR-4470.patch, 
> SOLR-4470_branch_4x_r1452629.patch, SOLR-4470_branch_4x_r1452629.patch, 
> SOLR-4470_branch_4x_r1454444.patch
>
>
> We want to protect any HTTP-resource (url). We want to require credentials no 
> matter what kind of HTTP-request you make to a Solr-node.
> It can fairly easily be achieved as described on 
> http://wiki.apache.org/solr/SolrSecurity. The problem is that Solr-nodes 
> also make "internal" requests to other Solr-nodes, and for it to work 
> credentials need to be provided here also.
> Ideally we would like to "forward" credentials from a particular request to 
> all the "internal" sub-requests it triggers. E.g. for search and update 
> request.
> But there are also "internal" requests
> * that are only indirectly/asynchronously triggered from "outside" requests 
> (e.g. shard creation/deletion/etc based on calls to the "Collection API")
> * that do not in any way have relation to an "outside" "super"-request (e.g. 
> replica synching stuff)
> We would like to aim at a solution where "original" credentials are 
> "forwarded" when a request directly/synchronously triggers a subrequest, and 
> fall back to configured "internal credentials" for the 
> asynchronous/non-rooted requests.
> In our solution we would aim at only supporting basic http auth, but we would 
> like to make a "framework" around it, so that not too much refactoring is 
> needed if you later want to add support for other kinds of auth (e.g. digest)
> We will work at a solution but create this JIRA issue early in order to get 
> input/comments from the community as early as possible.


