villebro commented on code in PR #31757:
URL: https://github.com/apache/superset/pull/31757#discussion_r1936105716


##########
superset/charts/api.py:
##########
@@ -564,8 +569,14 @@ def cache_screenshot(self, pk: int, **kwargs: Any) -> 
WerkzeugResponse:
                 schema:
                   $ref: '#/components/schemas/screenshot_query_schema'
           responses:
+            200:
+                description: Chart async result
+                content:
+                  application/json:
+                    schema:
+                      $ref: 
"#/components/schemas/ChartCacheScreenshotResponseSchema"
             202:
-              description: Chart async result
+              description: Chart async created

Review Comment:
   nit, while we're at it:
   ```suggestion
                 description: Chart async task created
   ```



##########
superset/charts/schemas.py:
##########
@@ -304,6 +304,21 @@ class ChartCacheScreenshotResponseSchema(Schema):
     image_url = fields.String(
         metadata={"description": "The url to fetch the screenshot"}
     )
+    update_status = fields.String(
+        metadata={"description": "The status of the async screenshot"}
+    )
+    updated_at = fields.String(
+        metadata={"description": "The timestamp of the last change in status"}
+    )

Review Comment:
   "update" seems ambiguous here:
   ```suggestion
       task_status = fields.String(
           metadata={"description": "The status of the async screenshot task"}
       )
       task_updated_at = fields.String(
           metadata={"description": "The timestamp of the last change in task 
status"}
       )
   ```



##########
superset/utils/screenshots.py:
##########
@@ -17,12 +17,14 @@
 from __future__ import annotations
 
 import logging
+from datetime import datetime
+from enum import Enum
 from io import BytesIO
-from typing import TYPE_CHECKING
+from typing import Optional, TYPE_CHECKING

Review Comment:
   Nit: we should not use `Optional` anymore, always just `from __future__ 
import annotations` and then `| None`.



##########
superset/models/slice.py:
##########
@@ -380,9 +380,7 @@ def event_after_chart_changed(
     _mapper: Mapper, _connection: Connection, target: Slice
 ) -> None:
     cache_chart_thumbnail.delay(
-        current_user=get_current_user(),
-        chart_id=target.id,
-        force=True,
+        current_user=get_current_user(), chart_id=target.id, force=True

Review Comment:
   Hmm, I'd maybe be ok with forcefully invalidating thumbs on chart/dashboard 
change, but recaching it? I'd propose adding a config flag 
`THUMBNAIL_ON_CHANGE: "invalidate" | "recalculate" | None` for admins to be 
able to specify if they want to just ignore changes, invalidate or recache 
every time the object changes.



##########
superset/charts/api.py:
##########
@@ -591,25 +603,36 @@ def cache_screenshot(self, pk: int, **kwargs: Any) -> 
WerkzeugResponse:
 
         chart_url = get_url_path("Superset.slice", slice_id=chart.id)
         screenshot_obj = ChartScreenshot(chart_url, chart.digest)
-        cache_key = screenshot_obj.cache_key(window_size, thumb_size)
+        cache_key = screenshot_obj.get_cache_key(window_size, thumb_size)
+        cache_payload = (
+            screenshot_obj.get_from_cache_key(cache_key) or 
ScreenshotCachePayload()
+        )
         image_url = get_url_path(
             "ChartRestApi.screenshot", pk=chart.id, digest=cache_key
         )
 
-        def trigger_celery() -> WerkzeugResponse:
+        def build_response(status_code: int) -> WerkzeugResponse:
+            return self.response(
+                status_code,
+                cache_key=cache_key,
+                chart_url=chart_url,
+                image_url=image_url,
+                updated_at=cache_payload.get_timestamp(),
+                update_status=cache_payload.get_status(),
+            )
+
+        if cache_payload.status in [StatusValues.PENDING, StatusValues.ERROR] 
or force:

Review Comment:
   Typically when a caching operation has errored out, we'd actually like to 
leave it in the cache with a specific TTL, as there's usually some access 
issue, or the chart is broken or something similar. Rerunning that screenshot 
every time is very expensive. Could we maybe introduce a config flag with an 
error TTL? This will likely save considerable compute on large deployments.



##########
superset/config.py:
##########
@@ -729,6 +729,7 @@ class D3TimeFormat(TypedDict, total=False):
 
 THUMBNAIL_CACHE_CONFIG: CacheConfig = {
     "CACHE_TYPE": "NullCache",
+    "CACHE_DEFAULT_TIMEOUT": 3600,

Review Comment:
   we should be using `timedelta` here. Also, 3600 seems rather short for a 
thumbnail. Maybe a week? Or even a month?
   ```suggestion
       "CACHE_DEFAULT_TIMEOUT": int(timedelta(days=7).total_seconds()),
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to