fzhan opened a new issue, #28445: URL: https://github.com/apache/superset/issues/28445
### Bug description

After upgrading to 4.0.0, the workers keep restarting.

### How to reproduce the bug

Install the Helm chart with this `values.yaml`:

```yaml
#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements.  See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License.  You may obtain a copy of the License at
#
#    http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

# Default values for superset.
# This is a YAML-formatted file.
# Declare variables to be passed into your templates.

# A README is automatically generated from this file to document it, using helm-docs (see https://github.com/norwoodj/helm-docs)
# To update it, install helm-docs and run helm-docs from the root of this chart

# -- Provide a name to override the name of the chart
nameOverride: ~
# -- Provide a name to override the full names of resources
fullnameOverride: ~

# -- User ID directive. This user must have enough permissions to run the bootstrap script
# Running containers as root is not recommended in production. Change this to another UID - e.g. 1000 to be more secure
runAsUser: 0

# -- Specify service account name to be used
serviceAccountName: ~
serviceAccount:
  # -- Create custom service account for Superset. If create: true and serviceAccountName is not provided, `superset.fullname` will be used.
  create: false
  annotations: {}

# -- Install additional packages and do any other bootstrap configuration in this script
# For production clusters it's recommended to build own image with this step done in CI
# @default -- see `values.yaml`
bootstrapScript: |
  #!/bin/bash
  pip install --upgrade pip &&\
  pip install sqlalchemy-bigquery redis \
    pymssql pyodbc pymysql mysql-connector-python holidays==0.32 prophet==1.1.4 authlib
  apt-get update
  apt-get install -y wget zip libodbc2
  wget https://dl.google.com/linux/direct/google-chrome-stable_current_amd64.deb
  apt-get install -y --no-install-recommends ./google-chrome-stable_current_amd64.deb
  CHROMEVER=$(google-chrome --product-version | grep -o "[^\.]*\.[^\.]*\.[^\.]*")
  DRIVERVER=$(curl -s "https://chromedriver.storage.googleapis.com/LATEST_RELEASE_$CHROMEVER")
  wget "https://storage.googleapis.com/chrome-for-testing-public/124.0.6367.60/linux64/chromedriver-linux64.zip"
  unzip chromedriver-linux64.zip
  ls -la
  chmod +x chromedriver-linux64/chromedriver
  mv chromedriver-linux64/chromedriver /usr/local/bin
  apt-get autoremove -yqq --purge
  apt-get clean
  rm -f google-chrome-stable_current_amd64.deb chromedriver_linux64.zip
  if [ ! -f ~/bootstrap ]; then echo "Running Superset with uid {{ .Values.runAsUser }}" > ~/bootstrap; fi

# -- The name of the secret which we will use to generate a superset_config.py file
# Note: this secret must have the key superset_config.py in it and can include other files as well
configFromSecret: '{{ template "superset.fullname" . }}-config'

# -- The name of the secret which we will use to populate env vars in deployed pods
# This can be useful for secret keys, etc.
envFromSecret: '{{ template "superset.fullname" . }}-env'
# -- This can be a list of templated strings
envFromSecrets: []

# -- Extra environment variables that will be passed into pods
extraEnv:
  # Different gunicorn settings, refer to the gunicorn documentation
  # https://docs.gunicorn.org/en/stable/settings.html#
  # These variables are used as Flags at the gunicorn startup
  # https://github.com/apache/superset/blob/master/docker/run-server.sh#L22
  # Extend timeout to allow long running queries.
  GUNICORN_TIMEOUT: 300
  # Increase the gunicorn worker amount, can improve performance drastically
  # See: https://docs.gunicorn.org/en/stable/design.html#how-many-workers
  SERVER_WORKER_AMOUNT: 4
  WORKER_MAX_REQUESTS: 0
  WORKER_MAX_REQUESTS_JITTER: 0
  SERVER_THREADS_AMOUNT: 20
  GUNICORN_KEEPALIVE: 2
  SERVER_LIMIT_REQUEST_LINE: 0
  SERVER_LIMIT_REQUEST_FIELD_SIZE: 0

  SUPERSET_WEBSERVER_PROTOCOL: "http"
  WEBDRIVER_BASEURL_USER_FRIENDLY: "https://company-domain.***.com/"

  SMTP_HOST: "mail.***.com"
  SMTP_PORT: 2525
  SMTP_STARTTLS: 'True'
  SMTP_SSL_SERVER_AUTH: 'True'
  SMTP_SSL: 'False'
  SMTP_USER: "smtp@***.com"
  SMTP_MAIL_FROM: "noreply@***.com"
  EMAIL_REPORTS_SUBJECT_PREFIX: "[*** BI] "

  # OAUTH_HOME_DOMAIN: ..
  # # If a whitelist is not set, any address that can use your OAuth2 endpoint will be able to login.
  # # this includes any random Gmail address if your OAuth2 Web App is set to External.
  # OAUTH_WHITELIST_REGEX: ...

# -- Extra environment variables in RAW format that will be passed into pods
extraEnvRaw:
  # Load DB password from other secret (e.g. for zalando operator)
  - name: REDIS_PASS
    valueFrom:
      secretKeyRef:
        name: redis
        key: redis-password

# -- Extra environment variables to pass as secrets
extraSecretEnv:
  SMTP_PASSWORD: "***"
  MAPBOX_API_KEY: '***'
  SUPERSET_SECRET_KEY: '***'
  # MAPBOX_API_KEY: ...
  # # Google API Keys: https://console.cloud.google.com/apis/credentials
  # GOOGLE_KEY: ...
  # GOOGLE_SECRET: ...
  # # Generate your own secret key for encryption. Use openssl rand -base64 42 to generate a good key
  # SUPERSET_SECRET_KEY: 'CHANGE_ME_TO_A_COMPLEX_RANDOM_SECRET'

# -- Extra files to mount on `/app/pythonpath`
extraConfigs:
  custom_sso_security_manager.py: |
    import logging

    from superset.security import SupersetSecurityManager

    logger = logging.getLogger("azure_ad")

    class CustomSsoSecurityManager(SupersetSecurityManager):
        from flask_appbuilder.security.manager import (AUTH_DB, AUTH_OAUTH)

        def oauth_user_info(self, provider, response=None):
            # In case the userDetails url is not working, use the response attribute to get the user details like name, mail etc.
            logger.debug("Oauth2 provider: {0}.".format(provider))
            if provider == 'azure':
                # As example, this line requests a GET to base_url + '/' + userDetails with Bearer Authentication,
                # and expects that the authorization server checks the token, and responds with user details
                # me = self.appbuilder.sm.oauth_remotes[provider].get('userDetails').data
                # logging.debug("user_data: {0}".format(me))
                # check the values of user_name, mail and other values in the me variable
                # return { 'name' : me['User_name'], 'email' : me['mail'], 'id' : me['user_name'], 'username' : me['user_name'], 'first_name':'', 'last_name':''}
                logger.debug("Azure response received : {0}".format(response))
                id_token = response["id_token"]
                logger.debug(str(id_token))
                # logging.debug(" -- oauth user detail -- : {0}".format(self.appbuilder.sm.oauth_remotes[provider].get('userDetails')))
                me = self._decode_and_validate_azure_jwt(id_token)
                logger.debug("Parse JWT token : {0}".format(me))
                return {
                    "name": me.get("name", ""),
                    "email": me["upn"],
                    "first_name": me.get("given_name", ""),
                    "last_name": me.get("family_name", ""),
                    "id": me["unique_name"],
                    "username": me["upn"],
                    "role_keys": me.get("roles", []),
                }
  # import_datasources.yaml: |
  #     databases:
  #     - allow_file_upload: true
  #       allow_ctas: true
  #       allow_cvas: true
  #       database_name: example-db
  #       extra: "{\r\n    \"metadata_params\": {},\r\n    \"engine_params\": {},\r\n    \"\
  #         metadata_cache_timeout\": {},\r\n    \"schemas_allowed_for_file_upload\": []\r\n\
  #         }"
  #       sqlalchemy_uri: example://example-db.local
  #       tables: []

# -- Extra files to mount on `/app/pythonpath` as secrets
extraSecrets: {}

extraVolumes:
  - name: v
    configMap:
      name: v
      items:
        - key: k
          path: v
  # - name: customConfig
  #   configMap:
  #     name: '{{ template "superset.fullname" . }}-custom-config'
  # - name: additionalSecret
  #   secret:
  #     secretName: my-secret
  #     defaultMode: 0600

extraVolumeMounts:
  - name: v
    mountPath: /tmp
    subPath: v
    readOnly: true
  # - name: customConfig
  #   mountPath: /mnt/config
  #   readOnly: true
  # - name: additionalSecret:
  #   mountPath: /mnt/secret

# -- A dictionary of overrides to append at the end of superset_config.py - the name does not matter
# WARNING: the order is not guaranteed
# Files can be passed as helm --set-file configOverrides.my-override=my-file.py
configOverrides:
  feature_flags: |
    import ast

    FEATURE_FLAGS = {
        "ALERT_REPORTS": True
    }

    SMTP_HOST = os.getenv("SMTP_HOST", "localhost")
    SMTP_STARTTLS = ast.literal_eval(os.getenv("SMTP_STARTTLS", "True"))
    SMTP_SSL = ast.literal_eval(os.getenv("SMTP_SSL", "False"))
    SMTP_USER = os.getenv("SMTP_USER", "superset")
    SMTP_PORT = os.getenv("SMTP_PORT", 25)
    SMTP_PASSWORD = os.getenv("SMTP_PASSWORD", "superset")
    SMTP_MAIL_FROM = os.getenv("SMTP_MAIL_FROM", "[email protected]")
    ALERT_REPORTS_NOTIFICATION_DRY_RUN = ast.literal_eval(os.getenv("ALERT_REPORTS_NOTIFICATION_DRY_RUN", "False"))
    SLACK_API_TOKEN = os.getenv("SLACK_API_TOKEN", None)
  celery_conf: |
    from celery.schedules import crontab

    class CeleryConfig:
        broker_url = f"redis://:{env('REDIS_PASS')}@{env('REDIS_HOST')}:{env('REDIS_PORT')}/0"
        imports = (
            "superset.sql_lab",
            "superset.tasks.cache",
            "superset.tasks.scheduler",
        )
        result_backend = f"redis://:{env('REDIS_PASS')}@{env('REDIS_HOST')}:{env('REDIS_PORT')}/0"
        worker_prefetch_multiplier = 10
        task_acks_late = True
        task_annotations = {
            "sql_lab.get_sql_results": {
                "rate_limit": "100/s",
            },
        }
        beat_schedule = {
            "reports.scheduler": {
                "task": "reports.scheduler",
                "schedule": crontab(minute="*", hour="*"),
            },
            "reports.prune_log": {
                "task": "reports.prune_log",
                'schedule': crontab(minute=0, hour=0),
            },
            'cache-warmup-hourly': {
                "task": "cache-warmup",
                "schedule": crontab(minute="*/30", hour="*"),
                "kwargs": {
                    "strategy_name": "top_n_dashboards",
                    "top_n": 10,
                    "since": "7 days ago",
                },
            }
        }

    CELERY_CONFIG = CeleryConfig
  cache_config: |
    from superset.tasks.types import ExecutorType

    FILTER_CACHE_REDIS_URL = f"{REDIS_BASE_URL}/{env('REDIS_DB', 2)}{REDIS_URL_PARAMS}"
    FILTER_STATE_CACHE_CONFIG = {
        'CACHE_TYPE': 'RedisCache',
        'CACHE_DEFAULT_TIMEOUT': 24*60*60,
        'CACHE_KEY_PREFIX': 'superset_filter_state',
        'CACHE_REDIS_URL': FILTER_CACHE_REDIS_URL,
    }

    THUMBNAIL_CACHE_REDIS_URL = f"{REDIS_BASE_URL}/{env('REDIS_DB', 3)}{REDIS_URL_PARAMS}"
    THUMBNAIL_CACHE_CONFIG = {
        'CACHE_TYPE': 'RedisCache',
        'CACHE_DEFAULT_TIMEOUT': 24*60*60*7,
        'CACHE_KEY_PREFIX': 'thumbnail_',
        'CACHE_REDIS_URL': THUMBNAIL_CACHE_REDIS_URL,
    }

    THUMBNAIL_SELENIUM_USER = "admin"
    THUMBNAIL_EXECUTE_AS = [ExecutorType.SELENIUM]
  reports: |
    EMAIL_PAGE_RENDER_WAIT = 60
    WEBDRIVER_BASEURL = "http://{{ template "superset.fullname" . }}:{{ .Values.service.port }}/"
    WEBDRIVER_BASEURL_USER_FRIENDLY = os.getenv("WEBDRIVER_BASEURL_USER_FRIENDLY", "https://www.example.com/")
    WEBDRIVER_TYPE = "chrome"
    WEBDRIVER_OPTION_ARGS = [
        "--force-device-scale-factor=2.0",
        "--high-dpi-support=2.0",
        "--headless",
        "--disable-gpu",
        "--disable-dev-shm-usage",
        # This is required because our process runs as root (in order to install pip packages)
        "--no-sandbox",
        "--disable-setuid-sandbox",
        "--disable-extensions",
    ]
    SCREENSHOT_LOCATE_WAIT = 100
    SCREENSHOT_LOAD_WAIT = 1200

    from superset.utils.urls import headless_url
    from superset.utils.machine_auth import MachineAuthProvider

    def auth_driver(driver, user):
        # Setting cookies requires doing a request first, but /login is redirected to the oauth provider, and gets stuck there.
        driver.get(headless_url("/doesnotexist"))
        cookies = MachineAuthProvider.get_auth_cookies(user)
        for cookie_name, cookie_val in cookies.items():
            driver.add_cookie(dict(name=cookie_name, value=cookie_val))
        return driver

    WEBDRIVER_AUTH_FUNC = auth_driver
  extend_timeout: |
    # Extend timeout to allow long running queries.
    SUPERSET_WEBSERVER_TIMEOUT = 300
  enable_oauth: |
    import sys
    sys.path.append('/app/configs/')
    print(sys.path)
    from flask_appbuilder.security.manager import AUTH_OAUTH
    from custom_sso_security_manager import CustomSsoSecurityManager
    CUSTOM_SECURITY_MANAGER = CustomSsoSecurityManager  # For managing User data after fetching from SSO app

    AUTH_TYPE = AUTH_OAUTH
    OAUTH_PROVIDERS = [
        {
            'name': 'azure',
            'token_key': 'access_token',  # Name of the token in the response of access_token_url
            'icon': 'fa-address-card',  # Icon for the provider
            'remote_app': {
                'client_id': '***',  # Client Id (Identify Superset application)
                'client_secret': '***',  # Secret for this Client Id (Identify Superset application)
                'client_kwargs': {
                    'scope': 'User.read name preferred_username email profile upn groups'  # Scope for the Authorization
                },
                'access_token_method': 'POST',  # HTTP Method to call access_token_url
                # 'access_token_params': {  # Additional parameters for calls to access_token_url
                #     'client_id': 'myClientId'
                # },
                # 'jwks_uri': 'https://myAuthorizationServe/adfs/discovery/keys',  # may be required to generate token
                # 'access_token_headers': {  # Additional headers for calls to access_token_url
                #     'Authorization': 'Basic Base64EncodedClientIdAndSecret'
                # },
                'api_base_url': 'https://login.microsoftonline.com/***/oauth2',
                'access_token_url': 'https://login.microsoftonline.com/***/oauth2/token',
                'authorize_url': 'https://login.microsoftonline.com/***/oauth2/authorize'
            }
        }
    ]
    AUTH_ROLE_PUBLIC = 'Public'
    AUTH_USER_REGISTRATION = True
  additional_config: |
    APP_NAME = '*** BI'
    APP_ICON = "/static/assets/images/logo.png"
    APP_ICON_WIDTH = 200
    LOGO_TARGET_PATH = '/'
    LOGO_TOOLTIP = '*** BI'
    SUPERSET_WEBSERVER_PROTOCOL = os.getenv("SUPERSET_WEBSERVER_PROTOCOL", "https")
    EMAIL_REPORTS_SUBJECT_PREFIX = os.getenv("EMAIL_REPORTS_SUBJECT_PREFIX", "*** BI] ")
    ENABLE_PROXY_FIX = True
    RATELIMIT_ENABLED = False
    AUTH_RATE_LIMITED = False
    DEFAULT_FEATURE_FLAGS: dict[str, bool] = {
        "CLIENT_CACHE": False,  # deprecated
        "DISABLE_DATASET_SOURCE_EDIT": False,  # deprecated
        "DRUID_JOINS": False,
        "DYNAMIC_PLUGINS": False,
        "DISABLE_LEGACY_DATASOURCE_EDITOR": True,
        "ENABLE_EXPLORE_JSON_CSRF_PROTECTION": False,  # deprecated
        "ENABLE_TEMPLATE_PROCESSING": False,
        "ENABLE_TEMPLATE_REMOVE_FILTERS": True,  # deprecated
        "ENABLE_JAVASCRIPT_CONTROLS": False,
        "KV_STORE": True,
        "PRESTO_EXPAND_DATA": False,
        "THUMBNAILS": True,
        "THUMBNAILS_SQLA_LISTENERS": True,
        "DASHBOARD_CACHE": False,  # deprecated
        "REMOVE_SLICE_LEVEL_LABEL_COLORS": False,  # deprecated
        "SHARE_QUERIES_VIA_KV_STORE": True,
        "TAGGING_SYSTEM": True,
        "SQLLAB_BACKEND_PERSISTENCE": True,
        "LISTVIEWS_DEFAULT_CARD_VIEW": False,
        "ESCAPE_MARKDOWN_HTML": False,
        "DASHBOARD_NATIVE_FILTERS": True,  # deprecated
        "DASHBOARD_CROSS_FILTERS": True,
        "DASHBOARD_NATIVE_FILTERS_SET": False,  # deprecated
        "DASHBOARD_FILTERS_EXPERIMENTAL": False,  # deprecated
        "DASHBOARD_VIRTUALIZATION": False,
        "GLOBAL_ASYNC_QUERIES": False,
        "VERSIONED_EXPORT": True,  # deprecated
        "EMBEDDED_SUPERSET": False,
        "ALERT_REPORTS": True,
        "DASHBOARD_RBAC": True,
        "ENABLE_EXPLORE_DRAG_AND_DROP": True,  # deprecated
        "ENABLE_ADVANCED_DATA_TYPES": True,
        "ALERTS_ATTACH_REPORTS": True,
        "ALLOW_FULL_CSV_EXPORT": True,
        "GENERIC_CHART_AXES": True,  # deprecated
        "ALLOW_ADHOC_SUBQUERY": True,
        "USE_ANALAGOUS_COLORS": False,
        "RLS_IN_SQLLAB": False,
        "CACHE_IMPERSONATION": False,
        "CACHE_QUERY_BY_USER": True,
        "EMBEDDABLE_CHARTS": True,
        "DRILL_TO_DETAIL": True,
        "DRILL_BY": True,
        "DATAPANEL_CLOSED_BY_DEFAULT": False,
        "HORIZONTAL_FILTER_BAR": True,
        "ESTIMATE_QUERY_COST": False,
        "SSH_TUNNELING": False,
        "AVOID_COLORS_COLLISION": True,
        "MENU_HIDE_USER_INFO": False,
        "ENABLE_SUPERSET_META_DB": True,
        "PLAYWRIGHT_REPORTS_AND_THUMBNAILS": False,
    }
  # extend_timeout: |
  #    # Extend timeout to allow long running queries.
  #    SUPERSET_WEBSERVER_TIMEOUT = ...
  # enable_oauth: |
  #    from flask_appbuilder.security.manager import (AUTH_DB, AUTH_OAUTH)
  #    AUTH_TYPE = AUTH_OAUTH
  #    OAUTH_PROVIDERS = [
  #        {
  #            "name": "google",
  #            "whitelist": [ os.getenv("OAUTH_WHITELIST_REGEX", "") ],
  #            "icon": "fa-google",
  #            "token_key": "access_token",
  #            "remote_app": {
  #                "client_id": os.environ.get("GOOGLE_KEY"),
  #                "client_secret": os.environ.get("GOOGLE_SECRET"),
  #                "api_base_url": "https://www.googleapis.com/oauth2/v2/",
  #                "client_kwargs": {"scope": "email profile"},
  #                "request_token_url": None,
  #                "access_token_url": "https://accounts.google.com/o/oauth2/token",
  #                "authorize_url": "https://accounts.google.com/o/oauth2/auth",
  #                "authorize_params": {"hd": os.getenv("OAUTH_HOME_DOMAIN", "")}
  #            }
  #        }
  #    ]
  #    # Map Authlib roles to superset roles
  #    AUTH_ROLE_ADMIN = 'Admin'
  #    AUTH_ROLE_PUBLIC = 'Public'
  #    # Will allow user self registration, allowing to create Flask users from Authorized User
  #    AUTH_USER_REGISTRATION = True
  #    # The default user self registration role
  #    AUTH_USER_REGISTRATION_ROLE = "Admin"
  # secret: |
  #    # Generate your own secret key for encryption. Use `openssl rand -base64 42` to generate a good key
  #    SECRET_KEY = 'CHANGE_ME_TO_A_COMPLEX_RANDOM_SECRET'

# -- Same as above but the values are files
configOverridesFiles: {}
  # extend_timeout: extend_timeout.py
  # enable_oauth: enable_oauth.py

configMountPath: "/app/pythonpath"

extraConfigMountPath: "/app/configs"

image:
  repository: apachesuperset.docker.scarf.sh/apache/superset
  tag: 4.0.0-dev
  pullPolicy: Always

imagePullSecrets: []

initImage:
  repository: apache/superset
  tag: dockerize
  pullPolicy: Always

service:
  type: ClusterIP
  port: 8088
  annotations: {}
    # cloud.google.com/load-balancer-type: "Internal"
  loadBalancerIP: ~
  nodePort:
    # -- (int)
    http: nil

ingress:
  enabled: true
  ingressClassName: "public"
  annotations:
    kubernetes.io/tls-acme: "true"
    ## Extend timeout to allow long running queries.
    nginx.ingress.kubernetes.io/proxy-connect-timeout: "300"
    nginx.ingress.kubernetes.io/proxy-read-timeout: "300"
    nginx.ingress.kubernetes.io/proxy-send-timeout: "300"
    nginx.ingress.kubernetes.io/client_max_body_size: "50m"
    nginx.ingress.kubernetes.io/client-body-buffer-size: "50m"
    nginx.ingress.kubernetes.io/proxy-body-size: "50m"
    # kubernetes.io/tls-acme: "true"
    ## Extend timeout to allow long running queries.
    # nginx.ingress.kubernetes.io/proxy-connect-timeout: "300"
    # nginx.ingress.kubernetes.io/proxy-read-timeout: "300"
    # nginx.ingress.kubernetes.io/proxy-send-timeout: "300"
  path: /
  pathType: ImplementationSpecific
  hosts:
    - ***.com
  tls:
    - hosts:
        - ***.com
      secretName: ***.com-tls
  extraHostsRaw: []
  # - secretName: chart-example-tls
  #   hosts:
  #     - chart-example.local

resources: {}
  # We usually recommend not to specify default resources and to leave this as a conscious
  # choice for the user. This also increases chances charts run on environments with little
  # resources, such as Minikube. If you do want to specify resources, uncomment the following
  # lines, adjust them as necessary, and remove the curly braces after 'resources:'.
  # The limits below will apply to all Superset components. To set individual resource limitations refer to the pod specific values below.
  # The pod specific values will overwrite anything that is set here.
  # limits:
  #   cpu: 100m
  #   memory: 128Mi
  # requests:
  #   cpu: 100m
  #   memory: 128Mi

# -- Custom hostAliases for all superset pods
## https://kubernetes.io/docs/tasks/network/customize-hosts-file-for-pods/
hostAliases: []
# - hostnames:
#   - nodns.my.lan
#   ip: 18.27.36.45

# Superset node configuration
supersetNode:
  replicaCount: 1
  autoscaling:
    enabled: false
    minReplicas: 1
    maxReplicas: 100
    targetCPUUtilizationPercentage: 80
    # targetMemoryUtilizationPercentage: 80
  # -- Sets the [pod disruption budget](https://kubernetes.io/docs/tasks/run-application/configure-pdb/) for supersetNode pods
  podDisruptionBudget:
    # -- Whether the pod disruption budget should be created
    enabled: false
    # -- If set, maxUnavailable must not be set - see https://kubernetes.io/docs/tasks/run-application/configure-pdb/#specifying-a-poddisruptionbudget
    minAvailable: 1
    # -- If set, minAvailable must not be set - see https://kubernetes.io/docs/tasks/run-application/configure-pdb/#specifying-a-poddisruptionbudget
    maxUnavailable: 1
  # -- Startup command
  # @default -- See `values.yaml`
  command:
    - "/bin/sh"
    - "-c"
    - ". {{ .Values.configMountPath }}/superset_bootstrap.sh; /usr/bin/run-server.sh"
  connections:
    # -- Change in case of bringing your own redis and then also set redis.enabled:false
    redis_host: 'redis-***'
    redis_port: "6379"
    redis_user: ""
    redis_password: ***
    redis_cache_db: "1"
    redis_celery_db: "0"
    # Or SSL port is usually 6380
    # Update following for using Redis with SSL
    redis_ssl:
      enabled: false
      ssl_cert_reqs: CERT_NONE
    # You need to change the below configuration in case of bringing your own PostgreSQL instance and also set postgresql.enabled:false
    db_host: 'postgresql-***'
    db_port: "5432"
    db_user: ***
    db_pass: "***"
    db_name: ***
  env: {}
  # -- If true, forces deployment to reload on each upgrade
  forceReload: true
  # -- Init containers
  # @default -- a container waiting for postgres
  initContainers:
    - name: wait-for-postgres
      image: "{{ .Values.initImage.repository }}:{{ .Values.initImage.tag }}"
      imagePullPolicy: "{{ .Values.initImage.pullPolicy }}"
      envFrom:
        - secretRef:
            name: "{{ tpl .Values.envFromSecret . }}"
      command:
        - /bin/sh
        - -c
        - dockerize -wait "tcp://$DB_HOST:$DB_PORT" -timeout 120s
  # -- Launch additional containers into supersetNode pod
  extraContainers: []
  # -- Annotations to be added to supersetNode deployment
  deploymentAnnotations: {}
  # -- Labels to be added to supersetNode deployment
  deploymentLabels: {}
  # -- Affinity to be added to supersetNode deployment
  affinity: {}
  # -- TopologySpreadConstrains to be added to supersetNode deployments
  topologySpreadConstraints: []
  # -- Annotations to be added to supersetNode pods
  podAnnotations: {}
  # -- Labels to be added to supersetNode pods
  podLabels: {}
  startupProbe:
    httpGet:
      path: /health
      port: http
    initialDelaySeconds: 15
    timeoutSeconds: 1
    failureThreshold: 60
    periodSeconds: 5
    successThreshold: 1
  livenessProbe:
    httpGet:
      path: /health
      port: http
    initialDelaySeconds: 15
    timeoutSeconds: 1
    failureThreshold: 3
    periodSeconds: 15
    successThreshold: 1
  readinessProbe:
    httpGet:
      path: /health
      port: http
    initialDelaySeconds: 15
    timeoutSeconds: 1
    failureThreshold: 3
    periodSeconds: 15
    successThreshold: 1
  # -- Resource settings for the supersetNode pods - these settings overwrite any existing values from the global resources object defined above.
  resources: {}
    # limits:
    #   cpu: 100m
    #   memory: 128Mi
    # requests:
    #   cpu: 100m
    #   memory: 128Mi
  podSecurityContext: {}
  containerSecurityContext: {}
  strategy: {}
    # type: RollingUpdate
    # rollingUpdate:
    #   maxSurge: 25%
    #   maxUnavailable: 25%

# Superset Celery worker configuration
supersetWorker:
  replicaCount: 5
  autoscaling:
    enabled: false
    minReplicas: 1
    maxReplicas: 100
    targetCPUUtilizationPercentage: 80
    # targetMemoryUtilizationPercentage: 80
  # -- Sets the [pod disruption budget](https://kubernetes.io/docs/tasks/run-application/configure-pdb/) for supersetWorker pods
  podDisruptionBudget:
    # -- Whether the pod disruption budget should be created
    enabled: false
    # -- If set, maxUnavailable must not be set - see https://kubernetes.io/docs/tasks/run-application/configure-pdb/#specifying-a-poddisruptionbudget
    minAvailable: 1
    # -- If set, minAvailable must not be set - see https://kubernetes.io/docs/tasks/run-application/configure-pdb/#specifying-a-poddisruptionbudget
    maxUnavailable: 1
  # -- Worker startup command
  # @default -- a `celery worker` command
  command:
    - "/bin/sh"
    - "-c"
    - ". {{ .Values.configMountPath }}/superset_bootstrap.sh; celery --app=superset.tasks.celery_app:app worker"
  # -- If true, forces deployment to reload on each upgrade
  forceReload: true
  # -- Init container
  # @default -- a container waiting for postgres and redis
  initContainers:
    - name: wait-for-postgres-redis
      image: "{{ .Values.initImage.repository }}:{{ .Values.initImage.tag }}"
      imagePullPolicy: "{{ .Values.initImage.pullPolicy }}"
      envFrom:
        - secretRef:
            name: "{{ tpl .Values.envFromSecret . }}"
      command:
        - /bin/sh
        - -c
        - dockerize -wait "tcp://$DB_HOST:$DB_PORT" -wait "tcp://$REDIS_HOST:$REDIS_PORT" -timeout 120s
  # -- Launch additional containers into supersetWorker pod
  extraContainers: []
  # -- Annotations to be added to supersetWorker deployment
  deploymentAnnotations: {}
  # -- Labels to be added to supersetWorker deployment
  deploymentLabels: {}
  # -- Affinity to be added to supersetWorker deployment
  affinity: {}
  # -- TopologySpreadConstrains to be added to supersetWorker deployments
  topologySpreadConstraints: []
  # -- Annotations to be added to supersetWorker pods
  podAnnotations: {}
  # -- Labels to be added to supersetWorker pods
  podLabels: {}
  # -- Resource settings for the supersetWorker pods - these settings overwrite any existing values from the global resources object defined above.
  resources: {}
    # limits:
    #   cpu: 100m
    #   memory: 128Mi
    # requests:
    #   cpu: 100m
    #   memory: 128Mi
  podSecurityContext: {}
  containerSecurityContext: {}
  strategy: {}
    # type: RollingUpdate
    # rollingUpdate:
    #   maxSurge: 25%
    #   maxUnavailable: 25%
  livenessProbe:
    exec:
      # -- Liveness probe command
      # @default -- a `celery inspect ping` command
      command:
        - sh
        - -c
        - celery -A superset.tasks.celery_app:app inspect ping -d celery@$HOSTNAME
    initialDelaySeconds: 120
    timeoutSeconds: 60
    failureThreshold: 3
    periodSeconds: 60
    successThreshold: 1
  # -- No startup/readiness probes by default since we don't really care about its startup time (it doesn't serve traffic)
  startupProbe: {}
  # -- No startup/readiness probes by default since we don't really care about its startup time (it doesn't serve traffic)
  readinessProbe: {}

# Superset beat configuration (to trigger scheduled jobs like reports)
supersetCeleryBeat:
  # -- This is only required if you intend to use alerts and reports
  enabled: true
  # -- Sets the [pod disruption budget](https://kubernetes.io/docs/tasks/run-application/configure-pdb/) for supersetCeleryBeat pods
  podDisruptionBudget:
    # -- Whether the pod disruption budget should be created
    enabled: false
    # -- If set, maxUnavailable must not be set - see https://kubernetes.io/docs/tasks/run-application/configure-pdb/#specifying-a-poddisruptionbudget
    minAvailable: 1
    # -- If set, minAvailable must not be set - see https://kubernetes.io/docs/tasks/run-application/configure-pdb/#specifying-a-poddisruptionbudget
    maxUnavailable: 1
  # -- Command
  # @default -- a `celery beat` command
  command:
    - "/bin/sh"
    - "-c"
    - ". {{ .Values.configMountPath }}/superset_bootstrap.sh; celery --app=superset.tasks.celery_app:app beat --pidfile /tmp/celerybeat.pid --schedule /tmp/celerybeat-schedule"
  # -- If true, forces deployment to reload on each upgrade
  forceReload: false
  # -- List of init containers
  # @default -- a container waiting for postgres
  initContainers:
    - name: wait-for-postgres-redis
      image: "{{ .Values.initImage.repository }}:{{ .Values.initImage.tag }}"
      imagePullPolicy: "{{ .Values.initImage.pullPolicy }}"
      envFrom:
        - secretRef:
            name: "{{ tpl .Values.envFromSecret . }}"
      command:
        - /bin/sh
        - -c
        - dockerize -wait "tcp://$DB_HOST:$DB_PORT" -wait "tcp://$REDIS_HOST:$REDIS_PORT" -timeout 120s
  # -- Launch additional containers into supersetCeleryBeat pods
  extraContainers: []
  # -- Annotations to be added to supersetCeleryBeat deployment
  deploymentAnnotations: {}
  # -- Affinity to be added to supersetCeleryBeat deployment
  affinity: {}
  # -- TopologySpreadConstrains to be added to supersetCeleryBeat deployments
  topologySpreadConstraints: []
  # -- Annotations to be added to supersetCeleryBeat pods
  podAnnotations: {}
  # -- Labels to be added to supersetCeleryBeat pods
  podLabels: {}
  # -- Resource settings for the CeleryBeat pods - these settings overwrite any existing values from the global resources object defined above.
  resources: {}
    # limits:
    #   cpu: 100m
    #   memory: 128Mi
    # requests:
    #   cpu: 100m
    #   memory: 128Mi
  podSecurityContext: {}
  containerSecurityContext: {}

supersetCeleryFlower:
  # -- Enables a Celery flower deployment (management UI to monitor celery jobs)
  # WARNING: on superset 1.x, this requires a Superset image that has `flower<1.0.0` installed (which is NOT the case of the default images)
  # flower>=1.0.0 requires Celery 5+ which Superset 1.5 does not support
  enabled: false
  replicaCount: 1
  # -- Sets the [pod disruption budget](https://kubernetes.io/docs/tasks/run-application/configure-pdb/) for supersetCeleryFlower pods
  podDisruptionBudget:
    # -- Whether the pod disruption budget should be created
    enabled: false
    # -- If set, maxUnavailable must not be set - see https://kubernetes.io/docs/tasks/run-application/configure-pdb/#specifying-a-poddisruptionbudget
    minAvailable: 1
    # -- If set, minAvailable must not be set - see https://kubernetes.io/docs/tasks/run-application/configure-pdb/#specifying-a-poddisruptionbudget
    maxUnavailable: 1
  # -- Command
  # @default -- a `celery flower` command
  command:
    - "/bin/sh"
    - "-c"
    - "pip install redis authlib flower && celery --app=superset.tasks.celery_app:app flower"
  service:
    type: ClusterIP
    annotations: {}
    loadBalancerIP: ~
    port: 5555
    nodePort:
      # -- (int)
      http: nil
  startupProbe:
    httpGet:
      path: /api/workers
      port: flower
    initialDelaySeconds: 5
    timeoutSeconds: 1
    failureThreshold: 60
    periodSeconds: 5
    successThreshold: 1
  livenessProbe:
    httpGet:
      path: /api/workers
      port: flower
    initialDelaySeconds: 5
    timeoutSeconds: 1
    failureThreshold: 3
    periodSeconds: 5
    successThreshold: 1
  readinessProbe:
    httpGet:
      path: /api/workers
      port: flower
    initialDelaySeconds: 5
    timeoutSeconds: 1
    failureThreshold: 3
    periodSeconds: 5
    successThreshold: 1
  # -- List of init containers
  # @default -- a container waiting for postgres and redis
  initContainers:
    - name: wait-for-postgres-redis
      image: "{{ .Values.initImage.repository }}:{{ .Values.initImage.tag }}"
      imagePullPolicy: "{{ .Values.initImage.pullPolicy }}"
      envFrom:
        - secretRef:
            name: "{{ tpl .Values.envFromSecret . }}"
      command:
        - /bin/sh
        - -c
        - dockerize -wait "tcp://$DB_HOST:$DB_PORT" -wait "tcp://$REDIS_HOST:$REDIS_PORT" -timeout 120s
  # -- Launch additional containers into supersetCeleryFlower pods
  extraContainers: []
  # -- Annotations to be added to supersetCeleryFlower deployment
  deploymentAnnotations: {}
  # -- Affinity to be added to supersetCeleryFlower deployment
  affinity: {}
  # -- TopologySpreadConstrains to be added to supersetCeleryFlower deployments
  topologySpreadConstraints: []
  # -- Annotations to be added to supersetCeleryFlower pods
  podAnnotations: {}
  # -- Labels to be added to supersetCeleryFlower pods
  podLabels: {}
  # -- Resource settings for the CeleryFlower pods - these settings overwrite any existing values from the global resources object defined above.
  resources: {}
    # limits:
    #   cpu: 100m
    #   memory: 128Mi
    # requests:
    #   cpu: 100m
    #   memory: 128Mi
  podSecurityContext: {}
  containerSecurityContext: {}

supersetWebsockets:
  # -- This is only required if you intend to use `GLOBAL_ASYNC_QUERIES` in `ws` mode
  # see https://github.com/apache/superset/blob/master/CONTRIBUTING.md#async-chart-queries
  enabled: false
  replicaCount: 1
  # -- Sets the [pod disruption budget](https://kubernetes.io/docs/tasks/run-application/configure-pdb/) for supersetWebsockets pods
  podDisruptionBudget:
    # -- Whether the pod disruption budget should be created
    enabled: false
    # -- If set, maxUnavailable must not be set - see https://kubernetes.io/docs/tasks/run-application/configure-pdb/#specifying-a-poddisruptionbudget
    minAvailable: 1
    # -- If set, minAvailable must not be set - see https://kubernetes.io/docs/tasks/run-application/configure-pdb/#specifying-a-poddisruptionbudget
    maxUnavailable: 1
  ingress:
    path: /ws
    pathType: Prefix
  image:
    # -- There is no official image (yet), this one is community-supported
    repository: oneacrefund/superset-websocket
    tag: latest
    pullPolicy: IfNotPresent
  # -- The config.json to pass to the server, see https://github.com/apache/superset/tree/master/superset-websocket
  # Note that the configuration can also read from environment variables (which will have priority), see https://github.com/apache/superset/blob/master/superset-websocket/src/config.ts for a list of supported variables
  # @default -- see `values.yaml`
  config:
    {
      "port": 8080,
      "logLevel": "debug",
      "logToFile": false,
      "logFilename": "app.log",
      "statsd": { "host": "127.0.0.1", "port": 8125, "globalTags": [] },
      "redis":
        {
          "port": 6379,
          "host": "127.0.0.1",
          "password": "",
          "db": 0,
          "ssl": false,
        },
      "redisStreamPrefix": "async-events-",
      "jwtSecret": "CHANGE-ME",
      "jwtCookieName": "async-token",
    }
  service:
    type: ClusterIP
    annotations: {}
    loadBalancerIP: ~
    port: 8080
    nodePort:
      # -- (int)
      http: nil
  command: []
  resources: {}
  # -- Launch additional containers into supersetWebsockets pods
  extraContainers: []
  deploymentAnnotations: {}
  # -- Affinity to be added to supersetWebsockets deployment
  affinity: {}
  # -- TopologySpreadConstrains to be added to supersetWebsockets deployments
  topologySpreadConstraints: []
  podAnnotations: {}
  podLabels: {}
  strategy: {}
  podSecurityContext: {}
  containerSecurityContext: {}
  startupProbe:
    httpGet:
      path: /health
      port: ws
    initialDelaySeconds: 5
    timeoutSeconds: 1
    failureThreshold: 60
    periodSeconds: 5
    successThreshold: 1
  livenessProbe:
    httpGet:
      path: /health
      port: ws
    initialDelaySeconds: 5
    timeoutSeconds: 1
    failureThreshold: 3
    periodSeconds: 5
    successThreshold: 1
  readinessProbe:
    httpGet:
      path: /health
      port: ws
    initialDelaySeconds: 5
    timeoutSeconds: 1
    failureThreshold: 3
    periodSeconds: 5
    successThreshold: 1

init:
  # Configure resources
  # Warning: fab command consumes a lot of ram and can
  # cause the process to be killed due to OOM if it exceeds limit
  # Make sure you are giving a strong password for the admin user creation (else make sure you are changing it after setup)
  # Also change the admin email to your own custom email.
  resources: {}
    # limits:
    #   cpu:
    #   memory:
    # requests:
    #   cpu:
    #   memory:
  # -- Command
  # @default -- a `superset_init.sh` command
  command:
    - "/bin/sh"
    - "-c"
    - ". {{ .Values.configMountPath }}/superset_bootstrap.sh; . {{ .Values.configMountPath }}/superset_init.sh"
  enabled: false
  jobAnnotations:
    "helm.sh/hook": post-install,post-upgrade
    "helm.sh/hook-delete-policy": "before-hook-creation"
  loadExamples: false
  createAdmin: false
  adminUser:
    username: ***
    firstname: ***
    lastname: ***
    email: ***
    password: ***
  # -- List of initContainers
  # @default -- a container waiting for postgres
  initContainers:
    - name: wait-for-postgres
      image: "{{ .Values.initImage.repository }}:{{ .Values.initImage.tag }}"
      imagePullPolicy: "{{ .Values.initImage.pullPolicy }}"
      envFrom:
        - secretRef:
            name: "{{ tpl .Values.envFromSecret . }}"
      command:
        - /bin/sh
        - -c
        - dockerize -wait "tcp://$DB_HOST:$DB_PORT" -timeout 120s
  # -- A Superset init script
  # @default -- a script to create admin user and initialize roles
  initscript: |-
    #!/bin/sh
    pip install holidays==0.23
    set -eu
    echo "Upgrading DB schema..."
    superset db upgrade
    echo "Initializing roles..."
    superset init
    {{ if .Values.init.createAdmin }}
    echo "Creating admin user..."
    superset fab create-admin \
                    --username {{ .Values.init.adminUser.username }} \
                    --firstname {{ .Values.init.adminUser.firstname }} \
                    --lastname {{ .Values.init.adminUser.lastname }} \
                    --email {{ .Values.init.adminUser.email }} \
                    --password {{ .Values.init.adminUser.password }} \
                    || true
    {{- end }}
    {{ if .Values.init.loadExamples }}
    echo "Loading examples..."
    superset load_examples
    {{- end }}
    if [ -f "{{ .Values.extraConfigMountPath }}/import_datasources.yaml" ]; then
      echo "Importing database connections...."
      superset import_datasources -p {{ .Values.extraConfigMountPath }}/import_datasources.yaml
    fi
  # -- Launch additional containers into init job pod
  extraContainers: []
  ## Annotations to be added to init job pods
  podAnnotations: {}
  podSecurityContext: {}
  containerSecurityContext: {}
  ## Tolerations to be added to init job pods
  tolerations: []
  ## Affinity to be added to init job pods
  affinity: {}
  # -- TopologySpreadConstrains to be added to init job
  topologySpreadConstraints: []

# -- Configuration values for the postgresql dependency.
# ref: https://github.com/bitnami/charts/tree/main/bitnami/postgresql
# @default -- see `values.yaml`
postgresql:
  ##
  ## Use the PostgreSQL chart dependency.
  ## Set to false if bringing your own PostgreSQL.
  enabled: false
  ## Authentication parameters
  auth:
    ## The name of an existing secret that contains the postgres password.
    existingSecret:
    ## PostgreSQL name for a custom user to create
    username: ***
    ## PostgreSQL password for the custom user to create. Ignored if `auth.existingSecret` with key `password` is provided
    password: ***
    ## PostgreSQL name for a custom database to create
    database: ***
  image:
    tag: "14.6.0-debian-11-r13"
  ## PostgreSQL Primary parameters
  primary:
    ##
    ## Persistent Volume Storage configuration.
    ## ref: https://kubernetes.io/docs/user-guide/persistent-volumes
    persistence:
      ##
      ## Enable PostgreSQL persistence using Persistent Volume Claims.
      enabled: true
      ##
      ## Persistent class
      # storageClass: classname
      ##
      ## Access modes:
      accessModes:
        - ReadWriteOnce
    ## PostgreSQL port
    service:
      ports:
        postgresql: "5432"

# -- Configuration values for the Redis dependency.
# ref: https://github.com/bitnami/charts/blob/master/bitnami/redis
# More documentation can be found here: https://artifacthub.io/packages/helm/bitnami/redis
# @default -- see `values.yaml`
redis:
  ##
  ## Use the redis chart dependency.
  ##
  ## If you are bringing your own redis, you can set the host in supersetNode.connections.redis_host
  ##
  ## Set to false if bringing your own redis.
  enabled: false
  ##
  ## Set architecture to standalone/replication
  architecture: standalone
  ##
  ## Auth configuration:
  ##
  auth:
    ## Enable password authentication
    enabled: false
    ## The name of an existing secret that contains the redis password.
    existingSecret: ""
    ## Name of the key containing the secret.
    existingSecretKey: ""
    ## Redis password
    password: ***
  ##
  ## Master configuration
  ##
  master:
    ##
    ## Image configuration
    # image:
      ##
      ## docker registry secret names (list)
      # pullSecrets: nil
    ##
    ## Configure persistance
    persistence:
      ##
      ## Use a PVC to persist data.
      enabled: false
      ##
      ## Persistent class
      # storageClass: classname
      ##
      ## Access mode:
      accessModes:
        - ReadWriteOnce

nodeSelector: {}

tolerations: []

affinity: {}

# -- TopologySpreadConstrains to be added to all deployments
topologySpreadConstraints: []
```

### Screenshots/recordings

Logs from the worker pod

```
logging was configured successfully
2024-05-13 02:26:25,220:INFO:superset.utils.logging_configurator:logging was configured successfully
2024-05-13 02:26:25,222:INFO:root:Configured event logger of type <class 'superset.utils.log.DBEventLogger'>
2024-05-13 02:26:26,655:DEBUG:celery.utils.functional: def chord(self, header, body, partial_args=0, interval=1, countdown=2, max_retries=3, eager=4, **kwargs): return 1
2024-05-13 02:26:26,655:DEBUG:celery.utils.functional: def accumulate(self, *args, **kwargs): return 1
2024-05-13 02:26:26,656:DEBUG:celery.utils.functional: def cache_chart_thumbnail(current_user, chart_id, force=0, window_size=1, thumb_size=2): return 1
2024-05-13 02:26:26,656:DEBUG:celery.utils.functional: def execute(self, report_schedule_id): return 1
2024-05-13 02:26:26,656:DEBUG:celery.utils.functional: def unlock_chord(self, group_id, callback, interval=0, max_retries=1, result=2, Result=3, GroupResult=4, result_from_tuple=5, **kwargs): return 1
2024-05-13 02:26:26,656:DEBUG:celery.utils.functional: def backend_cleanup(): return 1
2024-05-13 02:26:26,657:DEBUG:celery.utils.functional: def cache_dashboard_thumbnail(current_user, dashboard_id, force=0, thumb_size=1): return 1
2024-05-13 02:26:26,657:DEBUG:celery.utils.functional: def xmap(task, it): return 1
2024-05-13 02:26:26,657:DEBUG:celery.utils.functional: def prune_log(): return 1
2024-05-13 02:26:26,657:DEBUG:celery.utils.functional: def xstarmap(task, it): return 1
2024-05-13 02:26:26,657:DEBUG:celery.utils.functional: def cache_warmup(strategy_name, *args, **kwargs): return 1
2024-05-13 02:26:26,658:DEBUG:celery.utils.functional: def get_sql_results(query_id, rendered_query, return_results=0, store_results=1, username=2, start_time=3, expand_data=4, log_params=5): return 1
2024-05-13 02:26:26,658:DEBUG:celery.utils.functional: def fetch_url(data, headers): return 1
2024-05-13 02:26:26,658:DEBUG:celery.utils.functional: def chunks(task, it, n): return 1
2024-05-13 02:26:26,658:DEBUG:celery.utils.functional: def group(self, tasks, result, group_id, partial_args, add_to_parent=0): return 1
2024-05-13 02:26:26,659:DEBUG:celery.utils.functional: def scheduler(): return 1
2024-05-13 02:26:26,659:DEBUG:celery.utils.functional: def chain(*args, **kwargs): return 1
/usr/local/lib/python3.10/site-packages/celery/platforms.py:829: SecurityWarning: You're running the worker with superuser privileges: this is
absolutely not recommended!

Please specify a different user using the --uid option.
User information: uid=0 euid=0 gid=0 egid=0

  warnings.warn(SecurityWarning(ROOT_DISCOURAGED.format(
['/usr/local/bin', '/app/pythonpath', '/usr/local/lib/python310.zip', '/usr/local/lib/python3.10', '/usr/local/lib/python3.10/lib-dynload', '/usr/local/lib/python3.10/site-packages', '/app', '/app/configs/']
Loaded your LOCAL configuration at [/app/pythonpath/superset_config.py]

 -------------- celery@superset-worker-54bc9dffd4-skkk8 v5.3.6 (emerald-rush)
--- ***** -----
-- ******* ---- Linux-5.15.0-102-generic-x86_64-with-glibc2.36 2024-05-13 02:26:26
- *** --- * ---
- ** ---------- [config]
- ** ---------- .> app:         __main__:0x7f934c950a90
- ** ---------- .> transport:   redis://:**@redis-master:6379/0
- ** ---------- .> results:     redis://:**@redis-master:6379/0
- *** --- * --- .> concurrency: 24 (prefork)
-- ******* ---- .> task events: OFF (enable -E to monitor tasks in this worker)
--- ***** -----
 -------------- [queues]
                .> celery           exchange=celery(direct) key=celery

[2024-05-13 02:26:29,760: WARNING/MainProcess] /usr/local/lib/python3.10/site-packages/celery/worker/consumer/consumer.py:507: CPendingDeprecationWarning: The broker_connection_retry configuration setting will no longer determine
whether broker connection retries are made during startup in Celery 6.0 and above.
If you wish to retain the existing behavior for retrying connections on startup,
you should set broker_connection_retry_on_startup to True.
  warnings.warn(

[2024-05-13 02:26:29,768: WARNING/MainProcess] /usr/local/lib/python3.10/site-packages/celery/worker/consumer/consumer.py:507: CPendingDeprecationWarning: The broker_connection_retry configuration setting will no longer determine
whether broker connection retries are made during startup in Celery 6.0 and above.
If you wish to retain the existing behavior for retrying connections on startup,
you should set broker_connection_retry_on_startup to True.
  warnings.warn(
```

### Superset version

4.0.0

### Python version

3.10

### Node version

16

### Browser

Chrome

### Additional context

_No response_

### Checklist

- [X] I have searched Superset docs and Slack and didn't find a solution to my problem.
- [X] I have searched the GitHub issue tracker and didn't find a similar bug report.
- [X] I have checked Superset's logs for errors and if I found a relevant Python stacktrace, I included it here as text in the "additional context" section.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
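A note on the `CPendingDeprecationWarning` at the end of the worker log: it is a deprecation notice from Celery 5.3, not necessarily the cause of the restarts, and it can be silenced by setting `broker_connection_retry_on_startup` as the message suggests. A minimal sketch of what that could look like in `superset_config.py`, assuming the `CeleryConfig`-class pattern from the Superset docs; the `redis-master` URLs below are placeholders mirroring the host visible in the worker banner:

```python
# Hypothetical superset_config.py fragment -- adapt names/URLs to your deployment.
class CeleryConfig:
    # Broker/result backend placeholders matching the redis-master host in the log banner.
    broker_url = "redis://redis-master:6379/0"
    result_backend = "redis://redis-master:6379/0"
    # Keep the pre-Celery-6.0 behavior of retrying the broker connection on
    # startup; this also silences the CPendingDeprecationWarning in the log.
    broker_connection_retry_on_startup = True

# Superset picks up the Celery settings from the CELERY_CONFIG config key.
CELERY_CONFIG = CeleryConfig
```

With the warning out of the way, the remaining lead in the log is the `SecurityWarning` about running the worker as uid 0, which comes from `runAsUser: 0` in the values file.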
