The GitHub Actions job "Build and push images" on texera.git/main has failed. Run started by GitHub user bobbai00 (triggered by bobbai00).
Head commit for run: 253409a6ba8c07f573aa0dede1ea747ad6b2c97e / carloea2 <[email protected]>

refactor(dataset): Redirect multipart upload through File Service (#4136)

### What changes were proposed in this PR?

* **DB / schema**
  * Add `dataset_upload_session` to track multipart upload sessions, including:
    * `(uid, did, file_path)` as the primary key
    * `upload_id` (**UNIQUE**), `physical_address`
    * **`num_parts_requested`** to enforce the expected part count
  * Add `dataset_upload_session_part` to track per-part completion for a multipart upload (see the schema sketch after this description):
    * `(upload_id, part_number)` as the primary key
    * `etag` (`TEXT NOT NULL DEFAULT ''`) to persist per-part ETags for finalize
    * `CHECK (part_number > 0)` for sanity
    * `FOREIGN KEY (upload_id) REFERENCES dataset_upload_session(upload_id) ON DELETE CASCADE`
* **Backend (`DatasetResource`)**
  * Multipart upload API (server-side streaming to S3; LakeFS manages multipart state):
    * `POST /dataset/multipart-upload?type=init`
      * Validates permissions and input.
      * Creates a LakeFS multipart upload session.
      * Inserts a DB session row including `num_parts_requested`.
      * **Pre-creates placeholder rows** in `dataset_upload_session_part` for part numbers `1..num_parts_requested` with `etag = ''` (enables deterministic per-part locking and simple completeness checks).
      * **Rejects init if a session already exists** for `(uid, did, file_path)` (409 Conflict). The race is handled via PK/duplicate handling plus a best-effort LakeFS abort for the losing initializer.
    * `POST /dataset/multipart-upload/part?filePath=...&partNumber=...`
      * Requires dataset write access and an existing upload session.
      * **Requires `Content-Length`** for streaming uploads.
      * Enforces `partNumber <= num_parts_requested`.
      * **Per-part locking**: locks the `(upload_id, part_number)` row using `SELECT … FOR UPDATE NOWAIT` to prevent concurrent uploads of the same part (sketched below).
      * Uploads the part to S3 and **persists the returned ETag** into `dataset_upload_session_part.etag` (upsert/overwrite for retries).
      * Implements idempotency for retries by returning success if the ETag is already present for that part.
    * `POST /dataset/multipart-upload?type=finish`
      * Locks the session row using `SELECT … FOR UPDATE NOWAIT` to prevent concurrent finalize/abort (see the finish-time sketch below).
      * Validates completeness using DB state:
        * Confirms the part table has `num_parts_requested` rows for the `upload_id`.
        * Confirms **all parts have non-empty ETags** (no missing parts).
        * Optionally surfaces a bounded list of missing part numbers (without relying on error-message asserts in tests).
      * Fetches `(part_number, etag)` ordered by `part_number` from the DB and completes the multipart upload via LakeFS.
      * Deletes the DB session row; part rows are cleaned up via `ON DELETE CASCADE`.
      * **NOWAIT lock contention is handled** (mapped to “already being finalized/aborted”, 409).
    * `POST /dataset/multipart-upload?type=abort`
      * Locks the session row using `SELECT … FOR UPDATE NOWAIT`.
      * Aborts the multipart upload via LakeFS and deletes the DB session row (parts cascade-delete).
      * **NOWAIT lock contention is handled** similarly to `finish`.
  * Access control and dataset permissions remain enforced on all endpoints.
* **Frontend service (`dataset.service.ts`)**
  * `multipartUpload(...)` updated to reflect the server flow and return values (ETag persistence is server-side; the frontend does not need to track ETags).
* **Frontend component (`dataset-detail.component.ts`)**
  * Uses the same init/part/finish flow.
  * Abort triggers backend `type=abort` to clean up the upload session.
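For concreteness, here is a minimal PostgreSQL-flavored sketch of the two tables described above. The column types chosen for `uid`, `did`, `file_path`, `upload_id`, and `physical_address` are assumptions; the keys and constraints are the ones named in the bullets.

```sql
-- Sketch of the upload-session tables (PostgreSQL-flavored).
-- Column types other than etag are assumptions; keys/constraints follow the PR text.
CREATE TABLE dataset_upload_session (
    uid                 INTEGER NOT NULL,   -- assumed type
    did                 INTEGER NOT NULL,   -- assumed type
    file_path           TEXT    NOT NULL,   -- assumed type
    upload_id           TEXT    NOT NULL UNIQUE,
    physical_address    TEXT    NOT NULL,   -- assumed type
    num_parts_requested INTEGER NOT NULL,   -- expected part count, checked at upload/finish
    PRIMARY KEY (uid, did, file_path)
);

CREATE TABLE dataset_upload_session_part (
    upload_id   TEXT    NOT NULL,
    part_number INTEGER NOT NULL,
    etag        TEXT    NOT NULL DEFAULT '',  -- '' = placeholder, non-empty once uploaded
    PRIMARY KEY (upload_id, part_number),
    CHECK (part_number > 0),
    FOREIGN KEY (upload_id) REFERENCES dataset_upload_session (upload_id) ON DELETE CASCADE
);
```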
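The per-part locking and ETag persistence on the part endpoint can be pictured as one transaction per uploaded part. The statement shapes are illustrative rather than the PR's actual queries, and the `:param` tokens are hypothetical bind variables:

```sql
-- Illustrative per-part flow. The placeholder row was pre-created at init,
-- so NOWAIT fails fast (mapped to 409) if another request holds this part.
BEGIN;

SELECT etag
FROM dataset_upload_session_part
WHERE upload_id = :upload_id AND part_number = :part_number
FOR UPDATE NOWAIT;  -- errors immediately instead of waiting on the lock

-- If etag is already non-empty, this is a retried part: return success
-- without re-streaming. Otherwise stream the bytes to S3, then persist
-- the ETag that S3 returned so finish can assemble the part list:
UPDATE dataset_upload_session_part
SET etag = :returned_etag
WHERE upload_id = :upload_id AND part_number = :part_number;

COMMIT;
```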
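Finish-time validation then reduces to checks against DB state while holding the session lock. Again a sketch under the same assumptions; the bound of 10 on the missing-part listing is illustrative, not taken from the PR:

```sql
-- Illustrative finish flow. A concurrent finish/abort hits NOWAIT and is
-- mapped to "already being finalized/aborted" (409).
BEGIN;

SELECT num_parts_requested
FROM dataset_upload_session
WHERE upload_id = :upload_id
FOR UPDATE NOWAIT;

-- Surface a bounded list of missing parts ('' means never uploaded).
SELECT part_number
FROM dataset_upload_session_part
WHERE upload_id = :upload_id AND etag = ''
ORDER BY part_number
LIMIT 10;

-- If nothing is missing, fetch the ordered (part_number, etag) pairs and
-- hand them to LakeFS to complete the multipart upload.
SELECT part_number, etag
FROM dataset_upload_session_part
WHERE upload_id = :upload_id
ORDER BY part_number;

-- Deleting the session row cascades to the part rows.
DELETE FROM dataset_upload_session WHERE upload_id = :upload_id;

COMMIT;
```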
---

### Any related issues, documentation, discussions?

Closes #4110

---

### How was this PR tested?

* **Unit tests added/updated** (multipart upload spec):
  * Init validation (invalid numParts, invalid filePath, permission denied).
  * Upload part validation (missing/invalid Content-Length, partNumber bounds, minimum size enforcement for non-final parts).
  * **Per-part lock behavior** under contention (no concurrent streams for the same part; deterministic assertions).
  * Finish/abort locking behavior (NOWAIT contention returns 409).
  * Successful end-to-end path (init → upload parts → finish) with DB cleanup assertions.
  * **Integrity checks**: positive and negative SHA-256 tests that download the finalized object and verify it matches (or does not match) the expected concatenated bytes.
* Manual testing via the dataset detail page (single and multiple uploads) verified:
  * Progress, speed, and ETA updates.
  * Abort behavior (UI state and DB session cleanup).
  * Successful completion path (all expected parts uploaded, LakeFS object present, dataset version creation works).

---

### Was this PR authored or co-authored using generative AI tooling?

Partial use of GPT.

---------

Co-authored-by: Chen Li <[email protected]>

Report URL: https://github.com/apache/texera/actions/runs/20736103681

With regards,
GitHub Actions via GitBox
