villebro commented on code in PR #30522:
URL: https://github.com/apache/superset/pull/30522#discussion_r1792159262
##########
superset/config.py:
##########
@@ -955,6 +955,9 @@ class D3TimeFormat(TypedDict, total=False):
SQLLAB_SAVE_WARNING_MESSAGE = None
SQLLAB_SCHEDULE_WARNING_MESSAGE = None
+# Max payload size (MB) for SQL Lab to prevent browser hangs with large
results.
+SQL_LAB_PAYLOAD_MAX_MB = None
Review Comment:
To keep the config keys in line with the pre-existing ones above, let's
remove the underscore
```suggestion
SQLLAB_PAYLOAD_MAX_MB = None
```
##########
superset/sql_lab.py:
##########
@@ -601,6 +603,23 @@ def execute_sql_statements(
serialized_payload = _serialize_payload(
payload, cast(bool, results_backend_use_msgpack)
)
+
+ # Check the size of the serialized payload
+ if config.get("SQL_LAB_PAYLOAD_MAX_MB"):
+ serialized_payload_size = sys.getsizeof(serialized_payload)
+ sql_lab_payload_max_mb = config["SQL_LAB_PAYLOAD_MAX_MB"]
Review Comment:
nit: walrus operator can be used to simplify:
```suggestion
if sql_lab_payload_max_mb :=
config.get("SQL_LAB_PAYLOAD_MAX_MB"):
serialized_payload_size =
sys.getsizeof(serialized_payload)
```
##########
superset/sql_lab.py:
##########
@@ -214,7 +215,7 @@ def execute_sql_statement( # pylint:
disable=too-many-statements, too-many-loca
insert_rls = (
insert_rls_as_subquery
if database.db_engine_spec.allows_subqueries
- and database.db_engine_spec.allows_alias_in_select
+ and database.db_engine_spec.allows_alias_in_select
Review Comment:
I assume this will be flagged by the linter
```suggestion
and database.db_engine_spec.allows_alias_in_select
```
##########
tests/unit_tests/sql_lab_test.py:
##########
@@ -125,6 +130,90 @@ def test_execute_sql_statement_with_rls(
SupersetResultSet.assert_called_with([(42,)], cursor.description,
db_engine_spec)
+def test_execute_sql_statement_exceeds_payload_limit_log_check(mocker:
MockerFixture, caplog) -> None:
+ """
+ Test for `execute_sql_statements` when the result payload size exceeds the
limit,
+ and check if the correct log message is captured without raising the
exception.
+ """
Review Comment:
What is the purpose of this test? Shouldn't we always raise an exception if
the payload size exceeds the threshold?
##########
superset/sql_lab.py:
##########
@@ -601,6 +603,23 @@ def execute_sql_statements(
serialized_payload = _serialize_payload(
payload, cast(bool, results_backend_use_msgpack)
)
+
+ # Check the size of the serialized payload
+ if config.get("SQL_LAB_PAYLOAD_MAX_MB"):
+ serialized_payload_size = sys.getsizeof(serialized_payload)
+ sql_lab_payload_max_mb = config["SQL_LAB_PAYLOAD_MAX_MB"]
+ max_bytes = sql_lab_payload_max_mb * 1024 * 1024
+
+ if serialized_payload_size > max_bytes:
+ logger.info("Result size exceeds the allowed limit.")
+ raise SupersetErrorException(
+ SupersetError(
+ message=f"Result size
({serialized_payload_size / (1024 * 1024):.2f} MB) exceeds the allowed limit of
{sql_lab_payload_max_mb} MB.",
Review Comment:
nit: we're rewriting `1024 * 1024` here and in many places after this. To
keep the codebase DRY, let's define it only once as a const.
##########
requirements/base.txt:
##########
@@ -408,3 +408,4 @@ zipp==3.19.0
# via importlib-metadata
zstandard==0.22.0
# via flask-compress
+PyHive==0.7.0
Review Comment:
Unrelated change?
```suggestion
PyHive==0.7.0
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]