The GitHub Actions job "Build and push images" on texera.git/main has failed.
Run started by GitHub user bobbai00 (triggered by bobbai00).

Head commit for run:
253409a6ba8c07f573aa0dede1ea747ad6b2c97e / carloea2 <[email protected]>
refactor(dataset): Redirect multipart upload through File Service (#4136)

### What changes were proposed in this PR?

* **DB / schema**

  * Add `dataset_upload_session` to track multipart upload sessions, including:

    * `(uid, did, file_path)` as the primary key
    * `upload_id` (**UNIQUE**), `physical_address`
    * **`num_parts_requested`** to enforce expected part count

  * Add `dataset_upload_session_part` to track per-part completion for a multipart upload:

    * `(upload_id, part_number)` as the primary key
    * `etag` (`TEXT NOT NULL DEFAULT ''`) to persist per-part ETags for finalize
    * `CHECK (part_number > 0)` for sanity
    * `FOREIGN KEY (upload_id) REFERENCES dataset_upload_session(upload_id) ON DELETE CASCADE`
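
  A minimal sketch of the two tables as described above, assuming PostgreSQL; column types other than `etag` are illustrative guesses, not copied from the actual migration:

  ```sql
  -- dataset_upload_session: one row per in-flight multipart upload.
  CREATE TABLE dataset_upload_session (
      uid                 INT  NOT NULL,   -- type assumed
      did                 INT  NOT NULL,   -- type assumed
      file_path           TEXT NOT NULL,
      upload_id           TEXT NOT NULL UNIQUE,
      physical_address    TEXT NOT NULL,
      num_parts_requested INT  NOT NULL,   -- expected part count
      PRIMARY KEY (uid, did, file_path)
  );

  -- dataset_upload_session_part: per-part completion state.
  CREATE TABLE dataset_upload_session_part (
      upload_id   TEXT NOT NULL,
      part_number INT  NOT NULL,
      etag        TEXT NOT NULL DEFAULT '',  -- '' = not yet uploaded
      PRIMARY KEY (upload_id, part_number),
      CHECK (part_number > 0),
      FOREIGN KEY (upload_id)
          REFERENCES dataset_upload_session (upload_id)
          ON DELETE CASCADE
  );
  ```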

* **Backend (`DatasetResource`)**

  * Multipart upload API (server-side streaming to S3; LakeFS manages multipart state):

    * `POST /dataset/multipart-upload?type=init`

      * Validates permissions and input.
      * Creates a LakeFS multipart upload session.
      * Inserts a DB session row including `num_parts_requested`.
      * **Pre-creates placeholder rows** in `dataset_upload_session_part` for part numbers `1..num_parts_requested` with `etag = ''` (enables deterministic per-part locking and simple completeness checks).
      * **Rejects init if a session already exists** for `(uid, did, file_path)` (409 Conflict). Race is handled via PK/duplicate handling + best-effort LakeFS abort for the losing initializer.
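
      A sketch of the init-time DB writes, assuming PostgreSQL and the table sketch above; literal values are placeholders:

      ```sql
      BEGIN;

      -- A duplicate (uid, did, file_path) violates the primary key here;
      -- the endpoint maps that to 409 Conflict and best-effort-aborts the
      -- LakeFS upload it just created.
      INSERT INTO dataset_upload_session
          (uid, did, file_path, upload_id, physical_address, num_parts_requested)
      VALUES
          (1, 42, 'data/train.csv', 'example-upload-id', 's3://bucket/key', 3);

      -- Placeholder rows 1..num_parts_requested with empty ETags; part
      -- uploads later lock and fill these.
      INSERT INTO dataset_upload_session_part (upload_id, part_number)
      SELECT 'example-upload-id', n
      FROM generate_series(1, 3) AS n;

      COMMIT;
      ```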

    * `POST /dataset/multipart-upload/part?filePath=...&partNumber=...`

      * Requires dataset write access and an existing upload session.
      * **Requires `Content-Length`** for streaming uploads.
      * Enforces `partNumber <= num_parts_requested`.
      * **Per-part locking**: locks the `(upload_id, part_number)` row using `SELECT … FOR UPDATE NOWAIT` to prevent concurrent uploads of the same part.
      * Uploads the part to S3 and **persists the returned ETag** into `dataset_upload_session_part.etag` (upsert/overwrite for retries).
      * Implements idempotency for retries by returning success if the ETag is already present for that part.
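
      A sketch of the per-part lock and ETag persistence, under the same assumptions:

      ```sql
      BEGIN;

      -- NOWAIT fails immediately (SQLSTATE 55P03) if another request holds
      -- this part's row, instead of queueing a second stream for the part.
      SELECT etag
      FROM dataset_upload_session_part
      WHERE upload_id = 'example-upload-id' AND part_number = 2
      FOR UPDATE NOWAIT;

      -- If the selected etag is already non-empty, the endpoint can return
      -- success without re-uploading (idempotent retry); otherwise it
      -- streams the part to S3 and records the ETag S3 returned.
      UPDATE dataset_upload_session_part
      SET etag = '"9b2cf535f27731c974343645a3985328"'
      WHERE upload_id = 'example-upload-id' AND part_number = 2;

      COMMIT;
      ```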

    * `POST /dataset/multipart-upload?type=finish`

      * Locks the session row using `SELECT … FOR UPDATE NOWAIT` to prevent concurrent finalize/abort.
      * Validates completeness using DB state:

        * Confirms the part table has `num_parts_requested` rows for the `upload_id`.
        * Confirms **all parts have non-empty ETags** (no missing parts).
        * Optionally surfaces a bounded list of missing part numbers (without relying on error-message asserts in tests).

      * Fetches `(part_number, etag)` ordered by `part_number` from DB and completes multipart upload via LakeFS.
      * Deletes the DB session row; part rows are cleaned up via `ON DELETE CASCADE`.
      * **NOWAIT lock contention is handled** (mapped to “already being finalized/aborted”, 409).
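
      A sketch of the finish-time queries, under the same assumptions:

      ```sql
      BEGIN;

      -- Serialize against a concurrent finish/abort on the same session.
      SELECT num_parts_requested
      FROM dataset_upload_session
      WHERE upload_id = 'example-upload-id'
      FOR UPDATE NOWAIT;

      -- Completeness: the part count must equal num_parts_requested ...
      SELECT COUNT(*)
      FROM dataset_upload_session_part
      WHERE upload_id = 'example-upload-id';

      -- ... and no part may still have an empty ETag; a bounded list of
      -- offenders can be surfaced in the error response.
      SELECT part_number
      FROM dataset_upload_session_part
      WHERE upload_id = 'example-upload-id' AND etag = ''
      ORDER BY part_number
      LIMIT 10;

      -- If complete: feed (part_number, etag) in order to LakeFS, then
      -- delete the session; part rows go away via ON DELETE CASCADE.
      SELECT part_number, etag
      FROM dataset_upload_session_part
      WHERE upload_id = 'example-upload-id'
      ORDER BY part_number;

      DELETE FROM dataset_upload_session
      WHERE upload_id = 'example-upload-id';

      COMMIT;
      ```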

    * `POST /dataset/multipart-upload?type=abort`

      * Locks the session row using `SELECT … FOR UPDATE NOWAIT`.
      * Aborts the multipart upload via LakeFS and deletes the DB session row (parts cascade-delete).
      * **NOWAIT lock contention is handled** similarly to `finish`.
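
      A sketch of abort-time cleanup, under the same assumptions; the LakeFS abort call happens between the lock and the delete:

      ```sql
      BEGIN;

      SELECT 1
      FROM dataset_upload_session
      WHERE upload_id = 'example-upload-id'
      FOR UPDATE NOWAIT;

      DELETE FROM dataset_upload_session
      WHERE upload_id = 'example-upload-id';  -- parts cascade-delete

      COMMIT;
      ```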

  * Access control and dataset permissions remain enforced on all endpoints.

* **Frontend service (`dataset.service.ts`)**

  * `multipartUpload(...)` updated to reflect the server flow and return values (ETag persistence is server-side; the frontend does not need to track ETags).

* **Frontend component (`dataset-detail.component.ts`)**

  * Uses the same init/part/finish flow.
  * Abort triggers backend `type=abort` to clean up the upload session.

---

### Any related issues, documentation, discussions?

Closes #4110

---

### How was this PR tested?

* **Unit tests added/updated** (multipart upload spec):

  * Init validation (invalid numParts, invalid filePath, permission denied).
  * Upload part validation (missing/invalid Content-Length, partNumber bounds, minimum size enforcement for non-final parts).
  * **Per-part lock behavior** under contention (no concurrent streams for the same part; deterministic assertions).
  * Finish/abort locking behavior (NOWAIT contention returns 409).
  * Successful end-to-end path (init → upload parts → finish) with DB cleanup assertions.
  * **Integrity checks**: positive + negative SHA-256 tests by downloading the finalized object and verifying it matches (or does not match) the expected concatenated bytes.
* Manual testing via the dataset detail page (single and multiple uploads) verified:

  * Progress, speed, and ETA updates.
  * Abort behavior (UI state + DB session cleanup).
  * Successful completion path (all expected parts uploaded, LakeFS object present, dataset version creation works).

---

### Was this PR authored or co-authored using generative AI tooling?

Yes, partial use of GPT.

---------

Co-authored-by: Chen Li <[email protected]>

Report URL: https://github.com/apache/texera/actions/runs/20736103681

With regards,
GitHub Actions via GitBox
