potiuk commented on issue #18334: URL: https://github.com/apache/airflow/issues/18334#issuecomment-922252101
Yeah.... DataClasses are fine, but I think I agree with @jedcunningham it would be great to eventually have a separate Database-related package that keeps only those for the future. The "model" package is not best in this context, I think more appropriate will be 'sqlachemy' or 'db' or smth like that. Maybe this is really what the problem is :). And I think we should have a bit different criteria here than whether something is a DataClass type or not. Some more context: I can imagine that in order to implement the true workload isolation and multi-tenancy in the future (which is something we likely have to implement sooner or later), we will have to separate the raw DB access through SQLAlchemy with real "Internal API" of Airflow that can be used by workers (without accessing the DB directly). This likely mean that we should be very strict about which of Airflow classes can (and should be) accessed by "workers" (this is really now all about isolating workers - all other parts are pretty much done for multi-tenancy). This likely means that things like BaseOperator (and maybe crypto - depending where we will take care about decrypting Fernet-encrypted fields) should be moved to a separate package, together with some other "worker available internal APIs" - variables, connections, context, etc. ). For those I think we should have some non-DB-related "facades" (which should not even require importing SQLAlchemy to work). And having a separate package where we have all non-SQL Alchemy related APIs and DataClasses clearly separated from SQL-Alchemy ones is - I believe a prerequisite for that. Separating those via separate packages, sounds like good idea. Actually 'crypto.*' is an interesting one and I think where it should be depends on the decision we make on where to do decryption/encryption. Even if it is not an SQLAlchemy based, it is actually deeply connected with the DB and we could make a choice (in the future) that all Fernet decryption is done outside of workers. If we decide to implement an "internal API" server that will provide APIs for the workers to communicate, decryption could be done there. In this case Crypto actually would make sense to be in the `sqlalchemy`-related package (even if it is not SQLAlchemy related). But if we decide that's better to decrypt those in-workers and transmit encrypted values to the workers (which we might well do as well - both approaches have its pros and cons) - then Crypto will belong to the "Internal API" part. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
