e-kotov opened a new issue, #537:
URL: https://github.com/apache/sedona-db/issues/537
# Issue Report: `datafusion.runtime.memory_limit` enforcement and visibility
## Description
The `datafusion.runtime.memory_limit` setting is correctly enforced by the
SedonaDB (DataFusion) runtime, blocking queries that exceed the limit. However,
the setting is "invisible" to the SQL `SHOW ALL` command, making it difficult
for users to verify their current configuration.
## Reproduction Script
```python
import sedonadb
import pandas as pd
import numpy as np
import pyarrow as pa
# 1. Setup Data (1 million rows, about 16 MB with two 8-byte columns)
print("--- Setup ---")
table = pa.Table.from_pandas(pd.DataFrame({
    "id": np.arange(1000000),
    "v": np.random.randn(1000000)
}))
sd = sedonadb.connect()
sd.create_data_frame(table).to_view("data")
# 2. PROOF OF INVISIBILITY
print("\n--- Test 1: Invisibility ---")
sd.sql("SET datafusion.runtime.memory_limit = '2G'").execute()
df_show = sd.sql("SHOW ALL").to_pandas()
is_visible = any(df_show["name"].str.contains("memory_limit"))
print(f"Setting 'memory_limit' found in SHOW ALL: {is_visible}")
# 3. PROOF OF ENFORCEMENT
print("\n--- Test 2: Enforcement ---")
sd.sql("SET datafusion.runtime.memory_limit = '1M'").execute()
print("Limit set to 1MB. Attempting a sort...")
try:
    # Sort triggers memory allocation
    sd.sql("SELECT * FROM data ORDER BY v").head(1).execute()
    print("Failure: Sort succeeded (limit was ignored)")
except Exception as e:
    print("Success: Enforcement confirmed. Execution blocked.")
    print(f"Error caught: {str(e)[:150]}...")
# 4. Cleanup/Recovery
sd.sql("SET datafusion.runtime.memory_limit = '10G'").execute()
print(f"\nFinal count after lifting limit: {sd.view('data').count()}")
```
## Actual Output
```text
--- Setup ---
--- Test 1: Invisibility ---
Setting 'memory_limit' found in SHOW ALL: False
--- Test 2: Enforcement ---
Limit set to 1MB. Attempting a sort...
Success: Enforcement confirmed. Execution blocked.
Error caught: External error: Resources exhausted: Additional allocation
failed with top memory consumers (across reservations) as:
TopK[0]#2(can spill: false) co...
Final count after lifting limit: 1000000
```
Perhaps the change tracked in
https://github.com/apache/datafusion/issues/18452 simply has not been
propagated to SedonaDB yet?
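
To narrow down whether only `memory_limit` is missing from `SHOW ALL` or the whole `datafusion.runtime.*` namespace is hidden, a small check along these lines might help. This is only a sketch: `settings_with_prefix` is a hypothetical helper (not part of SedonaDB or DataFusion), and it assumes nothing beyond the `name` column already used in the reproduction script. The sample names in the demo are illustrative stand-ins, not real `SHOW ALL` output.

```python
from typing import Iterable, List


def settings_with_prefix(
    names: Iterable[str], prefix: str = "datafusion.runtime."
) -> List[str]:
    """Return the sorted setting names that start with `prefix`.

    Hypothetical helper for this issue; not part of SedonaDB.
    """
    return sorted(name for name in names if name.startswith(prefix))


if __name__ == "__main__":
    # Illustrative stand-in for df_show["name"].tolist() from the script above.
    sample_names = [
        "datafusion.execution.batch_size",
        "datafusion.execution.target_partitions",
    ]
    # An empty result would mean the whole runtime namespace is hidden,
    # not just memory_limit.
    print(settings_with_prefix(sample_names))
```

With the reproduction script, the call would be `settings_with_prefix(df_show["name"].tolist())` right after the `SHOW ALL` query.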