codeant-ai-for-open-source[bot] commented on code in PR #38408:
URL: https://github.com/apache/superset/pull/38408#discussion_r2885668442
##########
superset/mcp_service/chart/tool/generate_chart.py:
##########
@@ -384,6 +509,67 @@ async def generate_chart( # noqa: C901
if form_data_key_list:
form_data_key = form_data_key_list[0]
+ # Compile check for preview-only mode
+ await ctx.report_progress(3, 5, "Running compile check (test
query)")
+ numeric_dataset_id: int | None = None
+ if isinstance(request.dataset_id, int):
+ numeric_dataset_id = request.dataset_id
+ elif isinstance(request.dataset_id, str) and
request.dataset_id.isdigit():
+ numeric_dataset_id = int(request.dataset_id)
+ else:
+ from superset.daos.dataset import DatasetDAO
+
+ ds = DatasetDAO.find_by_id(request.dataset_id,
id_column="uuid")
+ if ds and has_dataset_access(ds):
Review Comment:
**Suggestion:** In preview-only mode, the compile check runs queries against
any numeric dataset ID (int or digit-only string) without verifying dataset
existence or user access, unlike the saved-chart path which performs a
`DatasetDAO` lookup and `has_dataset_access` check; this inconsistency can
allow unauthorized users to probe or execute queries on datasets they should
not access during compile checks. [security]
<details>
<summary><b>Severity Level:</b> Critical 🚨</summary>
```mdx
- ❌ Preview compile check may query datasets without access validation.
- ⚠️ Users can infer dataset existence via compile error messages.
- ⚠️ Inconsistent security behavior between saved and preview modes.
```
</details>
```suggestion
numeric_dataset_id: int | None = None
from superset.daos.dataset import DatasetDAO
if isinstance(request.dataset_id, int) or (
isinstance(request.dataset_id, str) and
request.dataset_id.isdigit()
):
dataset_id = (
int(request.dataset_id)
if isinstance(request.dataset_id, str)
else request.dataset_id
)
ds = DatasetDAO.find_by_id(dataset_id)
if ds and has_dataset_access(ds):
numeric_dataset_id = ds.id
else:
ds = DatasetDAO.find_by_id(request.dataset_id,
id_column="uuid")
if ds and has_dataset_access(ds):
numeric_dataset_id = ds.id
```
<details>
<summary><b>Steps of Reproduction ✅ </b></summary>
```mdx
1. Invoke the MCP tool `generate_chart` (decorated with `@tool` in
`superset/mcp_service/chart/tool/generate_chart.py`, around the `async def
generate_chart`
definition) with `save_chart=False` so that the preview-only branch
(starting near line
499) is taken.
2. In the request payload, set `dataset_id` to a numeric identifier (e.g.,
`123`)
corresponding to a dataset that would fail `has_dataset_access(dataset)` for
the current
user (i.e., a dataset they should not be able to query).
3. When execution reaches the preview compile-check block at lines 514–523,
the code
assigns `numeric_dataset_id` directly from `request.dataset_id` for numeric
IDs, without
performing any `DatasetDAO.find_by_id` lookup or `has_dataset_access` check;
only
non-numeric IDs go through `DatasetDAO` and permission validation.
4. Because `numeric_dataset_id` is non-None, the code proceeds into the `if
numeric_dataset_id is not None:` block (lines ~526–571), calls
`_compile_chart(form_data,
numeric_dataset_id)`, which builds a `QueryContext` and runs
`ChartDataCommand.run()`
(lines 72–105) against that dataset ID; on failure, details from
`compile_result.error`
are returned inside a `ChartGenerationError` response (lines 539–568),
allowing
compile-time probing of datasets that were never checked with
`has_dataset_access` for
numeric IDs.
```
</details>
<details>
<summary><b>Prompt for AI Agent 🤖 </b></summary>
```mdx
This is a comment left during a code review.
**Path:** superset/mcp_service/chart/tool/generate_chart.py
**Line:** 514:523
**Comment:**
*Security: In preview-only mode, the compile check runs queries against
any numeric dataset ID (int or digit-only string) without verifying dataset
existence or user access, unlike the saved-chart path which performs a
`DatasetDAO` lookup and `has_dataset_access` check; this inconsistency can
allow unauthorized users to probe or execute queries on datasets they should
not access during compile checks.
Validate the correctness of the flagged issue. If correct, How can I resolve
this? If you propose a fix, implement it and please make it concise.
```
</details>
<a
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F38408&comment_hash=6dc185ed452b6964a876c6629002ace16461abc94ff2d35a75c8c88d191107e6&reaction=like'>👍</a>
| <a
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F38408&comment_hash=6dc185ed452b6964a876c6629002ace16461abc94ff2d35a75c8c88d191107e6&reaction=dislike'>👎</a>
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]