The GitHub Actions job "Build" on 
texera.git/claude/load-example-datasets-workflows-01EondBaTrVUSeKXoHXmKBxP has 
failed.
Run started by GitHub user bobbai00 (triggered by bobbai00).

Head commit for run:
683fa0453d9c1e650adf77020db2f5f2da9e88f2 / Claude <[email protected]>
feat: add Python-based example dataset and workflow loader

This commit introduces a Python-based example data loader for Texera,
providing an alternative to shell scripts for loading example datasets
and workflows.

Key features:
- Python modules for authentication, dataset, and workflow management
- Multipart upload support for large files using S3-compatible API
- Concurrent part uploads for improved performance
- Comprehensive error handling and logging
- Docker container support for easy deployment
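
As an illustration of the concurrent part uploads and failure handling listed above, here is a minimal sketch using Python's standard thread pool. It is not the code from this commit: `upload_parts`, `upload_part`, `abort_upload`, and `max_workers` are hypothetical names chosen for the example, and the actual loader may be structured differently.

```python
from concurrent.futures import ThreadPoolExecutor


def upload_parts(parts, upload_part, abort_upload, max_workers=4):
    """Upload all parts concurrently; call abort_upload() if any part fails.

    parts:        iterable of (part_number, data) tuples
    upload_part:  hypothetical callable(part_number, data) -> etag
    abort_upload: hypothetical callable() that cancels the multipart upload
    """
    try:
        with ThreadPoolExecutor(max_workers=max_workers) as pool:
            # Submit every part up front; the pool caps how many run at once.
            futures = {n: pool.submit(upload_part, n, data) for n, data in parts}
            # result() re-raises any exception raised in the worker thread.
            return {n: f.result() for n, f in futures.items()}
    except Exception:
        # Cancel the upload so no orphaned parts are left behind, then re-raise.
        abort_upload()
        raise
```

The function returns a mapping from part number to the value returned by `upload_part`, which is the kind of information a multipart upload typically needs in order to be completed.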

Components:
- login.py: JWT authentication module
- load_dataset.py: Dataset creation and multipart file upload
- load_workflow.py: Workflow creation and management
- main.py: Orchestration script with CLI interface
- requirements.txt: Python dependencies
- texera-examples-loader.dockerfile: Docker container definition

The multipart upload implementation:
- Follows the Texera file-service multipart upload API
- Calculates optimal part sizes based on file size and S3 limits
- Uploads parts concurrently with configurable concurrency
- Handles presigned URLs for direct S3 uploads
- Includes automatic abort on failure
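
The part-size calculation mentioned above could look roughly like the sketch below. It assumes the standard Amazon S3 multipart limits (5 MiB minimum part size except for the last part, 5 GiB maximum part size, at most 10,000 parts); the function name `plan_parts` and its `preferred_part_size` parameter are hypothetical, and the limits enforced by the Texera file service may differ.

```python
MIN_PART_SIZE = 5 * 1024 * 1024          # S3 minimum for every part except the last
MAX_PART_SIZE = 5 * 1024 * 1024 * 1024   # S3 maximum size of a single part
MAX_PARTS = 10_000                       # S3 maximum number of parts per upload


def ceil_div(a, b):
    """Integer ceiling division, avoiding floating-point rounding."""
    return -(-a // b)


def plan_parts(file_size, preferred_part_size=MIN_PART_SIZE):
    """Return (part_size, num_parts) for a file of file_size bytes."""
    if file_size <= 0:
        raise ValueError("file_size must be positive")
    # Grow the part size just enough to stay within the part-count limit.
    part_size = max(preferred_part_size, MIN_PART_SIZE,
                    ceil_div(file_size, MAX_PARTS))
    if part_size > MAX_PART_SIZE:
        raise ValueError("file is too large for a single multipart upload")
    return part_size, ceil_div(file_size, part_size)
```

For example, a 12 MiB file with the default preference is split into three parts of at most 5 MiB each, while a 100 GiB file gets a larger part size so that the part count stays at or below 10,000.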

Example data included:
- country_sales_small.csv: Sample sales dataset
- country_sales_analysis.json: Example workflow

Related to PR #3754 but uses Python instead of shell scripts.

Report URL: https://github.com/apache/texera/actions/runs/19626376804

With regards,
GitHub Actions via GitBox
