jtuglu1 commented on PR #19236:
URL: https://github.com/apache/druid/pull/19236#issuecomment-4238743315

   > > Maybe a hybrid approach would work? We could introduce `scopeForUser` in 
core and run it at submit time. In your custom extension, rather than applying 
vended credentials at scope/submit time, you could use `scopeForUser` to embed 
the user's own credentials in the input source. We could add a 
`PasswordProvider` field to `IcebergInputSource` to support that. Then you 
could use them at runtime in the task to acquire vended credentials.
   > 
   > I think this makes sense – my only concern is, for example, being able to 
reliably "touch up" these inputSource fields in arbitrary specs (both on the 
overlord and on the broker) to apply `scopeForUser`. For example, you can have an 
MSQ query that queries multiple Iceberg tables and joins against a Druid table.
   > 
   > In general, having the identity at the task level also opens up the ability 
for Druid to authenticate with other resources that tasks might read from (e.g. 
Kafka), so I generally want to lean in the direction of exposing 
identity/credentials at the task level (not just the end credentials for 
whatever is needed).
   
   
   
   
   Another thing to add here: I'd like to (if possible) avoid putting the 
burden of "injecting" this user identity on the caller. IMO, if we can always 
do it at the overlord level (and leave it to the task type implementor to 
actually do something with a valid, provided identity), that should be more 
maintainable, especially in a world with multiple input sources. Also IMO, code 
related to input sources (e.g. Kafka, Iceberg, Delta) should not be involved in 
anything but a supervisor thread and the task process itself.
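
   To make the overlord-level injection concrete, here is a rough sketch of the 
shape I have in mind. Every name in it (`TaskIdentity`, `IdentityAwareTask`, 
`IcebergReadTask`, `Overlord.submit`) is hypothetical and does not exist in 
Druid; it only illustrates the split of responsibilities, not a proposed API.

```java
import java.util.Optional;

// Hypothetical sketch: none of these types exist in Druid.
public class IdentityInjectionSketch {

    /** Identity the overlord resolved from the authenticated request. */
    public static final class TaskIdentity {
        private final String user;

        public TaskIdentity(String user) {
            this.user = user;
        }

        public String user() {
            return user;
        }
    }

    /** Task types decide what, if anything, to do with the provided identity. */
    public interface IdentityAwareTask {
        String run(Optional<TaskIdentity> identity);
    }

    /** Stand-in for a task that reads Iceberg and wants per-user credentials. */
    public static final class IcebergReadTask implements IdentityAwareTask {
        @Override
        public String run(Optional<TaskIdentity> identity) {
            // Only the task process touches input-source-specific logic.
            return identity
                .map(id -> "vended-credentials-for:" + id.user())
                .orElse("static-credentials");
        }
    }

    /** Stand-in for the overlord: attaches the identity once, for every task type. */
    public static final class Overlord {
        public String submit(IdentityAwareTask task, String authenticatedUser) {
            // The caller never edits the spec; the overlord supplies the identity.
            Optional<TaskIdentity> identity =
                Optional.ofNullable(authenticatedUser).map(TaskIdentity::new);
            return task.run(identity);
        }
    }

    public static void main(String[] args) {
        Overlord overlord = new Overlord();
        System.out.println(overlord.submit(new IcebergReadTask(), "alice"));
        System.out.println(overlord.submit(new IcebergReadTask(), null));
    }
}
```

   The point of the split is that the overlord has no Iceberg/Kafka/Delta 
knowledge at all: it hands over an identity, and each task type chooses how to 
turn that into credentials at runtime.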


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

