Copilot commented on code in PR #8866:
URL: https://github.com/apache/gravitino/pull/8866#discussion_r2454287479
##########
core/src/main/java/org/apache/gravitino/job/JobManager.java:
##########
@@ -692,7 +694,11 @@ public static JobTemplate createRuntimeJobTemplate(
// For shell job template
if (content.jobType() == JobTemplate.JobType.SHELL) {
- List<String> scripts = fetchFilesFromUri(content.scripts(), stagingDir, TIMEOUT_IN_MS);
+ List<String> scripts =
+ content.scripts().stream()
+ .map(script -> replacePlaceholder(script, jobConf))
+ .map(script -> fetchFileFromUri(script, stagingDir, TIMEOUT_IN_MS))
Review Comment:
[nitpick] The stream processes each script twice with `.map()` operations.
While functional, this creates intermediate stream objects. Consider using a
single `.map()` with a lambda that performs both operations: `.map(script ->
fetchFileFromUri(replacePlaceholder(script, jobConf), stagingDir,
TIMEOUT_IN_MS))` for better performance.
```suggestion
.map(script -> fetchFileFromUri(replacePlaceholder(script, jobConf), stagingDir, TIMEOUT_IN_MS))
```
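The fusion the bot suggests can be checked in isolation. Below is a minimal, self-contained sketch; `replacePlaceholder` and `fetchFileFromUri` here are hypothetical stand-ins (with simplified signatures) for the JobManager methods, not the real implementations. It shows that two chained `.map()` stages and a single fused `.map()` produce identical results, since Java streams apply stages lazily per element:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class StreamFusionSketch {
    // Hypothetical stand-in: substitutes a single placeholder token.
    static String replacePlaceholder(String s) {
        return s.replace("{{name}}", "demo");
    }

    // Hypothetical stand-in: pretends to stage a file and returns its local path.
    static String fetchFileFromUri(String s) {
        return "/staging/" + s;
    }

    public static void main(String[] args) {
        List<String> scripts = Arrays.asList("run-{{name}}.sh", "setup.sh");

        // Two chained .map() stages, as in the PR: each call wraps the
        // pipeline in one more Stream stage object.
        List<String> twoMaps = scripts.stream()
            .map(StreamFusionSketch::replacePlaceholder)
            .map(StreamFusionSketch::fetchFileFromUri)
            .collect(Collectors.toList());

        // Single .map() composing both calls, as the review suggests.
        List<String> oneMap = scripts.stream()
            .map(s -> fetchFileFromUri(replacePlaceholder(s)))
            .collect(Collectors.toList());

        System.out.println(twoMaps.equals(oneMap)); // prints "true"
        System.out.println(oneMap);
    }
}
```

Either form is correct; the difference is one extra (lazy) pipeline stage per chain, so this is a readability/style call more than a measurable performance one.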
##########
core/src/main/java/org/apache/gravitino/job/JobManager.java:
##########
@@ -707,10 +713,25 @@ public static JobTemplate createRuntimeJobTemplate(
// For Spark job template
if (content.jobType() == JobTemplate.JobType.SPARK) {
- String className = content.className();
- List<String> jars = fetchFilesFromUri(content.jars(), stagingDir, TIMEOUT_IN_MS);
- List<String> files = fetchFilesFromUri(content.files(), stagingDir, TIMEOUT_IN_MS);
- List<String> archives = fetchFilesFromUri(content.archives(), stagingDir, TIMEOUT_IN_MS);
+ String className = replacePlaceholder(content.className(), jobConf);
+ List<String> jars =
+ content.jars().stream()
+ .map(jar -> replacePlaceholder(jar, jobConf))
+ .map(jar -> fetchFileFromUri(jar, stagingDir, TIMEOUT_IN_MS))
+ .collect(Collectors.toList());
+
+ List<String> files =
+ content.files().stream()
+ .map(file -> replacePlaceholder(file, jobConf))
+ .map(file -> fetchFileFromUri(file, stagingDir, TIMEOUT_IN_MS))
Review Comment:
[nitpick] The files stream uses two separate `.map()` operations. Consider
combining them into a single operation: `.map(file ->
fetchFileFromUri(replacePlaceholder(file, jobConf), stagingDir,
TIMEOUT_IN_MS))` for better performance.
```suggestion
.map(file -> fetchFileFromUri(replacePlaceholder(file, jobConf), stagingDir, TIMEOUT_IN_MS))
```
##########
core/src/main/java/org/apache/gravitino/job/JobManager.java:
##########
@@ -707,10 +713,25 @@ public static JobTemplate createRuntimeJobTemplate(
// For Spark job template
if (content.jobType() == JobTemplate.JobType.SPARK) {
- String className = content.className();
- List<String> jars = fetchFilesFromUri(content.jars(), stagingDir, TIMEOUT_IN_MS);
- List<String> files = fetchFilesFromUri(content.files(), stagingDir, TIMEOUT_IN_MS);
- List<String> archives = fetchFilesFromUri(content.archives(), stagingDir, TIMEOUT_IN_MS);
+ String className = replacePlaceholder(content.className(), jobConf);
+ List<String> jars =
+ content.jars().stream()
+ .map(jar -> replacePlaceholder(jar, jobConf))
+ .map(jar -> fetchFileFromUri(jar, stagingDir, TIMEOUT_IN_MS))
+ .collect(Collectors.toList());
+
+ List<String> files =
+ content.files().stream()
+ .map(file -> replacePlaceholder(file, jobConf))
+ .map(file -> fetchFileFromUri(file, stagingDir, TIMEOUT_IN_MS))
+ .collect(Collectors.toList());
+
+ List<String> archives =
+ content.archives().stream()
+ .map(archive -> replacePlaceholder(archive, jobConf))
+ .map(archive -> fetchFileFromUri(archive, stagingDir, TIMEOUT_IN_MS))
Review Comment:
[nitpick] Similar to scripts processing, the jars stream uses two separate
`.map()` operations. Consider combining them into a single operation: `.map(jar
-> fetchFileFromUri(replacePlaceholder(jar, jobConf), stagingDir,
TIMEOUT_IN_MS))` for better performance.
```suggestion
.map(jar -> fetchFileFromUri(replacePlaceholder(jar, jobConf), stagingDir, TIMEOUT_IN_MS))
.collect(Collectors.toList());
List<String> files =
content.files().stream()
.map(file -> fetchFileFromUri(replacePlaceholder(file, jobConf), stagingDir, TIMEOUT_IN_MS))
.collect(Collectors.toList());
List<String> archives =
content.archives().stream()
.map(archive -> fetchFileFromUri(replacePlaceholder(archive, jobConf), stagingDir, TIMEOUT_IN_MS))
```
##########
core/src/main/java/org/apache/gravitino/job/JobManager.java:
##########
@@ -707,10 +713,25 @@ public static JobTemplate createRuntimeJobTemplate(
// For Spark job template
if (content.jobType() == JobTemplate.JobType.SPARK) {
- String className = content.className();
- List<String> jars = fetchFilesFromUri(content.jars(), stagingDir, TIMEOUT_IN_MS);
- List<String> files = fetchFilesFromUri(content.files(), stagingDir, TIMEOUT_IN_MS);
- List<String> archives = fetchFilesFromUri(content.archives(), stagingDir, TIMEOUT_IN_MS);
+ String className = replacePlaceholder(content.className(), jobConf);
+ List<String> jars =
+ content.jars().stream()
+ .map(jar -> replacePlaceholder(jar, jobConf))
+ .map(jar -> fetchFileFromUri(jar, stagingDir, TIMEOUT_IN_MS))
+ .collect(Collectors.toList());
+
+ List<String> files =
+ content.files().stream()
+ .map(file -> replacePlaceholder(file, jobConf))
+ .map(file -> fetchFileFromUri(file, stagingDir, TIMEOUT_IN_MS))
+ .collect(Collectors.toList());
+
+ List<String> archives =
+ content.archives().stream()
+ .map(archive -> replacePlaceholder(archive, jobConf))
+ .map(archive -> fetchFileFromUri(archive, stagingDir, TIMEOUT_IN_MS))
Review Comment:
[nitpick] The archives stream uses two separate `.map()` operations.
Consider combining them into a single operation: `.map(archive ->
fetchFileFromUri(replacePlaceholder(archive, jobConf), stagingDir,
TIMEOUT_IN_MS))` for better performance.
```suggestion
.map(archive -> fetchFileFromUri(replacePlaceholder(archive, jobConf), stagingDir, TIMEOUT_IN_MS))
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]