martin-g commented on code in PR #19560:
URL: https://github.com/apache/datafusion/pull/19560#discussion_r2653120583
##########
ci/scripts/rust_clippy.sh:
##########
@@ -17,5 +17,60 @@
# specific language governing permissions and limitations
# under the License.
-set -ex
-cargo clippy --all-targets --workspace --features
avro,integration-tests,extended_tests -- -D warnings
+set -euo pipefail
+
+SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+SCRIPT_NAME="$(basename "${BASH_SOURCE[0]}")"
+CLIPPY_FEATURES="avro,integration-tests,extended_tests"
+CLIPPY_ARGS=(--all-targets --workspace --features "$CLIPPY_FEATURES")
+CLIPPY_LINT_ARGS=(-- -D warnings)
+
+source "${SCRIPT_DIR}/utils/git.sh"
+
+MODE="check"
+ALLOW_DIRTY=0
+
+usage() {
+ cat >&2 <<EOF
+Usage: $SCRIPT_NAME [--write] [--allow-dirty]
+
+Runs \`cargo clippy\` to lint.
+--write Run \`cargo clippy --fix\` to apply fixes for clippy lints
(requires a clean git worktree, no uncommitted changes).
+--allow-dirty Allow \`--write\` to run even when the git worktree has
uncommitted or staged changes.
+EOF
+ exit 1
+}
+
+while [[ $# -gt 0 ]]; do
+ case "$1" in
+ --write)
+ MODE="write"
+ ;;
+ --allow-dirty)
+ ALLOW_DIRTY=1
+ ;;
+ -h|--help)
+ usage
+ ;;
+ *)
+ usage
+ ;;
+ esac
+ shift
+done
+
+if [[ "$MODE" == "write" && $ALLOW_DIRTY -eq 0 ]]; then
+ require_clean_work_tree "$SCRIPT_NAME" || exit 1
+fi
+
+if [[ "$MODE" == "write" ]]; then
+ ALLOW_DIRTY_ARGS=()
+ if [[ $ALLOW_DIRTY -eq 1 ]]; then
+ ALLOW_DIRTY_ARGS+=(--allow-dirty --allow-staged)
+ fi
+ echo "[${SCRIPT_NAME}] \`cargo clippy --fix --all-targets --workspace
--features ${CLIPPY_FEATURES} -- -D warnings\`"
+ cargo clippy --fix "${ALLOW_DIRTY_ARGS[@]}" "${CLIPPY_ARGS[@]}"
"${CLIPPY_LINT_ARGS[@]}"
Review Comment:
Expanding empty array with `set -u` will lead to `"unbound variable"` on
older Bash versions than 4.4 (e.g. MacOS comes with 3.2)
##########
ci/scripts/doc_prettier_check.sh:
##########
@@ -17,41 +17,68 @@
# specific language governing permissions and limitations
# under the License.
-SCRIPT_PATH="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)/$(basename
"${BASH_SOURCE[0]}")"
-
-MODE="--check"
-ACTION="Checking"
-if [ $# -gt 0 ]; then
- if [ "$1" = "--write" ]; then
- MODE="--write"
- ACTION="Formatting"
- else
- echo "Usage: $0 [--write]" >&2
- exit 1
- fi
+set -euo pipefail
+
+SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+SCRIPT_NAME="$(basename "${BASH_SOURCE[0]}")"
+PRETTIER_VERSION="2.7.1"
Review Comment:
Is there a reason to keep using this old version ?
Latest is https://www.npmjs.com/package/prettier/v/3.7.4
##########
ci/scripts/utils/git.sh:
##########
@@ -0,0 +1,27 @@
+#!/usr/bin/env bash
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied. See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
Review Comment:
```suggestion
set -euo pipefail
```
Not really needed at the moment but it does not do any harm and might become
useful in the future if/when someone adds more logic in this script.
##########
ci/scripts/license_header.sh:
##########
@@ -17,6 +17,62 @@
# specific language governing permissions and limitations
# under the License.
-# Check Apache license header
-set -ex
-hawkeye check --config licenserc.toml
+set -euo pipefail
+
+SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+SCRIPT_NAME="$(basename "${BASH_SOURCE[0]}")"
+
+source "${SCRIPT_DIR}/utils/git.sh"
+
+MODE="check"
+ALLOW_DIRTY=0
+HAWKEYE_CONFIG="licenserc.toml"
+
+usage() {
+ cat >&2 <<EOF
+Usage: $SCRIPT_NAME [--write] [--allow-dirty]
+
+Checks Apache license headers with \`hawkeye check --config $HAWKEYE_CONFIG\`.
+--write Run \`hawkeye format --config $HAWKEYE_CONFIG\` to
auto-add/fix headers (requires a clean git worktree, no uncommitted changes).
+--allow-dirty Allow \`--write\` to run even when the git worktree has
uncommitted changes.
+EOF
+ exit 1
+}
+
+while [[ $# -gt 0 ]]; do
+ case "$1" in
+ --write)
+ MODE="write"
+ ;;
+ --allow-dirty)
+ ALLOW_DIRTY=1
+ ;;
+ -h|--help)
+ usage
+ ;;
+ *)
+ usage
+ ;;
+ esac
+ shift
+done
+
+if [[ "$MODE" == "write" && $ALLOW_DIRTY -eq 0 ]]; then
+ require_clean_work_tree "$SCRIPT_NAME" || exit 1
+fi
+
+if [[ "$MODE" == "write" ]]; then
+ echo "[${SCRIPT_NAME}] \`hawkeye format --config ${HAWKEYE_CONFIG}\`"
+ if ! hawkeye format --config "${HAWKEYE_CONFIG}"; then
+ status=$?
Review Comment:
Here `$?` will be always `0` because it is the status of the `if`.
You will need to do something like:
```bash
hawkeye format --config "${HAWKEYE_CONFIG}" &> /dev/null
status=$1
...
```
##########
dev/rust_lint.sh:
##########
@@ -23,30 +23,103 @@
# Note: The installed checking tools (e.g., taplo) are not guaranteed to match
# the CI versions for simplicity, there might be some minor differences. Check
# `.github/workflows` for the CI versions.
+#
+#
+#
+# For each lint scripts:
+#
+# By default, they run in check mode:
+# ./ci/scripts/rust_fmt.sh
+#
+# With `--write`, scripts perform best-effort auto fixes:
+# ./ci/scripts/rust_fmt.sh --write
+#
+# The `--write` flag assumes a clean git repository (no uncommitted changes);
to force
+# auto fixes even if there are unstaged changes, use `--allow-dirty`:
+# ./ci/scripts/rust_fmt.sh --write --allow-dirty
+#
+# New scripts can use `rust_fmt.sh` as a reference.
+
+set -euo pipefail
+
+usage() {
+ cat >&2 <<EOF
+Usage: $0 [--write] [--allow-dirty]
+
+Runs the local Rust lint suite similar to CI.
+--write Run formatters, clippy and other non-functional checks in
best-effort write/fix mode (requires a clean git worktree, no uncommitted
changes; some checks are test-only and ignore this flag).
+--allow-dirty Allow \`--write\` to run even when the git worktree has
uncommitted changes.
+EOF
+ exit 1
+}
+
+ensure_tool() {
+ local cmd="$1"
+ local install_cmd="$2"
+ if ! command -v "$cmd" &> /dev/null; then
+ echo "Installing $cmd using: $install_cmd"
+ eval "$install_cmd"
+ fi
+}
+
+MODE="check"
+ALLOW_DIRTY=0
+
+while [[ $# -gt 0 ]]; do
+ case "$1" in
+ --write)
+ MODE="write"
+ ;;
+ --allow-dirty)
+ ALLOW_DIRTY=1
+ ;;
+ -h|--help)
+ usage
+ ;;
+ *)
+ usage
+ ;;
+ esac
+ shift
+done
+
+SCRIPT_NAME="$(basename "${BASH_SOURCE[0]}")"
+
+ensure_tool "taplo" "cargo install taplo-cli"
Review Comment:
no need of `--locked` ?
##########
ci/scripts/rust_fmt.sh:
##########
@@ -17,5 +17,52 @@
# specific language governing permissions and limitations
# under the License.
-set -ex
-cargo fmt --all -- --check
+set -euo pipefail
+
+SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+SCRIPT_NAME="$(basename "${BASH_SOURCE[0]}")"
+source "${SCRIPT_DIR}/utils/git.sh"
+
+MODE="check"
+ALLOW_DIRTY=0
+
+usage() {
+ cat >&2 <<EOF
+Usage: $0 [--write] [--allow-dirty]
+
+Runs \`cargo fmt --all -- --check\` by default to verify Rust formatting.
+--write Run \`cargo fmt --all\` to auto-fix formatting (requires a
clean git worktree, no uncommitted changes).
+--allow-dirty Allow \`--write\` to run even when the git worktree has
uncommitted changes.
+EOF
+ exit 1
+}
+
+while [[ $# -gt 0 ]]; do
+ case "$1" in
+ --write)
+ MODE="write"
+ ;;
+ --allow-dirty)
+ ALLOW_DIRTY=1
+ ;;
+ -h|--help)
+ usage
+ ;;
+ *)
+ usage
+ ;;
+ esac
+ shift
+done
+
+if [[ "$MODE" == "write" && $ALLOW_DIRTY -eq 0 ]]; then
+ require_clean_work_tree "$SCRIPT_NAME" || exit 1
+fi
+
+if [[ "$MODE" == "write" ]]; then
+ echo "[${SCRIPT_NAME}] \`cargo fmt --all\`"
+ cargo fmt --all
+else
+ echo "[${SCRIPT_NAME}] \`cargo fmt --all -- --check\`"
+ cargo fmt --all -- --check
+fi
Review Comment:
nit: It seems there is no newline (`\n`) at the end here (non POSIX
complaint).
It is recommended to have it otherwise "it may break some old tools".
Even Github UI marks it as a warning.
##########
dev/rust_lint.sh:
##########
Review Comment:
```suggestion
#!/usr/bin/env bash
```
since you are modernizing the script anyway
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]