codeant-ai-for-open-source[bot] commented on code in PR #38408:
URL: https://github.com/apache/superset/pull/38408#discussion_r2885668442


##########
superset/mcp_service/chart/tool/generate_chart.py:
##########
@@ -384,6 +509,67 @@ async def generate_chart(  # noqa: C901
                 if form_data_key_list:
                     form_data_key = form_data_key_list[0]
 
+            # Compile check for preview-only mode
+            await ctx.report_progress(3, 5, "Running compile check (test 
query)")
+            numeric_dataset_id: int | None = None
+            if isinstance(request.dataset_id, int):
+                numeric_dataset_id = request.dataset_id
+            elif isinstance(request.dataset_id, str) and 
request.dataset_id.isdigit():
+                numeric_dataset_id = int(request.dataset_id)
+            else:
+                from superset.daos.dataset import DatasetDAO
+
+                ds = DatasetDAO.find_by_id(request.dataset_id, 
id_column="uuid")
+                if ds and has_dataset_access(ds):

Review Comment:
   **Suggestion:** In preview-only mode, the compile check runs queries against 
any numeric dataset ID (int or digit-only string) without verifying dataset 
existence or user access, unlike the saved-chart path which performs a 
`DatasetDAO` lookup and `has_dataset_access` check; this inconsistency can 
allow unauthorized users to probe or execute queries on datasets they should 
not access during compile checks. [security]
   
   <details>
   <summary><b>Severity Level:</b> Critical 🚨</summary>
   
   ```mdx
   - ❌ Preview compile check may query datasets without access validation.
   - ⚠️ Users can infer dataset existence via compile error messages.
   - ⚠️ Inconsistent security behavior between saved and preview modes.
   ```
   </details>
   
   ```suggestion
               numeric_dataset_id: int | None = None
               from superset.daos.dataset import DatasetDAO
   
               if isinstance(request.dataset_id, int) or (
                   isinstance(request.dataset_id, str) and 
request.dataset_id.isdigit()
               ):
                   dataset_id = (
                       int(request.dataset_id)
                       if isinstance(request.dataset_id, str)
                       else request.dataset_id
                   )
                   ds = DatasetDAO.find_by_id(dataset_id)
                   if ds and has_dataset_access(ds):
                       numeric_dataset_id = ds.id
               else:
                   ds = DatasetDAO.find_by_id(request.dataset_id, 
id_column="uuid")
                   if ds and has_dataset_access(ds):
                       numeric_dataset_id = ds.id
   ```
   <details>
   <summary><b>Steps of Reproduction ✅ </b></summary>
   
   ```mdx
   1. Invoke the MCP tool `generate_chart` (decorated with `@tool` in
   `superset/mcp_service/chart/tool/generate_chart.py`, around the `async def 
generate_chart`
   definition) with `save_chart=False` so that the preview-only branch 
(starting near line
   499) is taken.
   
   2. In the request payload, set `dataset_id` to a numeric identifier (e.g., 
`123`)
   corresponding to a dataset that would fail `has_dataset_access(dataset)` for 
the current
   user (i.e., a dataset they should not be able to query).
   
   3. When execution reaches the preview compile-check block at lines 514–523, 
the code
   assigns `numeric_dataset_id` directly from `request.dataset_id` for numeric 
IDs, without
   performing any `DatasetDAO.find_by_id` lookup or `has_dataset_access` check; 
only
   non-numeric IDs go through `DatasetDAO` and permission validation.
   
   4. Because `numeric_dataset_id` is non-None, the code proceeds into the `if
   numeric_dataset_id is not None:` block (lines ~526–571), calls 
`_compile_chart(form_data,
   numeric_dataset_id)`, which builds a `QueryContext` and runs 
`ChartDataCommand.run()`
   (lines 72–105) against that dataset ID; on failure, details from 
`compile_result.error`
   are returned inside a `ChartGenerationError` response (lines 539–568), 
allowing
   compile-time probing of datasets that were never checked with 
`has_dataset_access` for
   numeric IDs.
   ```
   </details>
   <details>
   <summary><b>Prompt for AI Agent 🤖 </b></summary>
   
   ```mdx
   This is a comment left during a code review.
   
   **Path:** superset/mcp_service/chart/tool/generate_chart.py
   **Line:** 514:523
   **Comment:**
        *Security: In preview-only mode, the compile check runs queries against 
any numeric dataset ID (int or digit-only string) without verifying dataset 
existence or user access, unlike the saved-chart path which performs a 
`DatasetDAO` lookup and `has_dataset_access` check; this inconsistency can 
allow unauthorized users to probe or execute queries on datasets they should 
not access during compile checks.
   
   Validate the correctness of the flagged issue. If correct, How can I resolve 
this? If you propose a fix, implement it and please make it concise.
   ```
   </details>
   <a 
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F38408&comment_hash=6dc185ed452b6964a876c6629002ace16461abc94ff2d35a75c8c88d191107e6&reaction=like'>👍</a>
 | <a 
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F38408&comment_hash=6dc185ed452b6964a876c6629002ace16461abc94ff2d35a75c8c88d191107e6&reaction=dislike'>👎</a>



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to