[jira] [Created] (SOLR-16813) PackageLoader not copying manifest.json file to filestore
Will White created SOLR-16813: - Summary: PackageLoader not copying manifest.json file to filestore Key: SOLR-16813 URL: https://issues.apache.org/jira/browse/SOLR-16813 Project: Solr Issue Type: Bug Security Level: Public (Default Security Level. Issues are Public) Components: Package Manager Affects Versions: 9.2.1, 9.2 Reporter: Will White When a new Solr node enters the cluster and the package management system recreates the package from the source node, the filestore location is set up and the JAR files are copied across correctly, but the {{manifest.json}} file isn't (and the {{.manifest.json.json}} file isn't generated). This means that when trying to interact with the package (such as {{bin/solr package list-installed}}) the command errors out because of the missing expected file. This can be reproduced by: * Setup Zookeeper cluster * Create two solr nodes ("solr1" and "solr2") in cluster * Install a package on solr1 ** {{bin/solr package add-repo data-import-handler "https://raw.githubusercontent.com/searchscale/dataimporthandler/master/repo"}} ** {{bin/solr package install data-import-handler}} ** {{find /var/solr/data/filestore}} ({{manifest.json}} and {{.manifest.json.json}} should exist) ** {{bin/solr package list-installed}} * At this point, the package exists as expected on solr2 * Stop solr1 and solr2 * Restart the cluster, now with an additional solr3 node * solr3 will have the JAR files and their {{.json}} files, but the {{manifest.json}} file won't be present ** Running {{bin/solr package list-installed}} will throw an exception The difference appears to be that the {{org.apache.solr.packagemanager.RepositoryManager::installPackage}} method contains the following lines: {code:java} // org/apache/solr/packagemanager/RepositoryManager.java#182 if (release.manifest == null) { String manifestJson = PackageUtils.getFileFromJarsAsString(downloaded, "manifest.json"); if (manifestJson == null) { throw new SolrException(ErrorCode.NOT_FOUND, "No manifest found for package: " + packageName + ", version: " + version); } release.manifest = getMapper().readValue(manifestJson, SolrPackage.Manifest.class); } String manifestJson = getMapper().writeValueAsString(release.manifest); // We go on to write this file later {code} But in the equivalent space in {{org.apache.solr.pkg.SolrPackageLoader.Version()}}, the equivalent section to fetch the JAR files if missing doesn't reference the version.manifest anywhere: {code:java} // org/apache/solr/pkg/SolrPackageLoader.java#276 coreContainer.getPackageStoreAPI().validateFiles(version.files, true, s -> errs.add(s)); if (!errs.isEmpty()) { ... } for (String file : version.files) { paths.add(coreContainer.getPackageStoreAPI().getPackageStore().getRealpath(file)); } {code} Something like [this commit|https://github.com/apache/solr/commit/9fc5253eef573dc5e53edf52d98d4a2e55e00e1e] solves the problem by manually copying the file across, but I'm not sure if this is the best approach to take. I'm happy to put a full PR together, although I'd appreciate it if someone could confirm whether the {{manifest.json}} file should be transferred from the source node like this (and can be treated as mandatory), or if I've misunderstood how this should be working. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
[jira] [Commented] (SOLR-16813) PackageLoader not copying manifest.json file to filestore
[ https://issues.apache.org/jira/browse/SOLR-16813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17725017#comment-17725017 ] Will White commented on SOLR-16813: --- Brief discussion in -IRC- Slack: https://apachesolr.slack.com/archives/C01GVPZSSK0/p1684680323785689 Tossed a coin on priority, raised as Major because without manually copying the {{manifest.json}} file across to the new node, all commands to the new package break (which I've interpreted as a Major issue on the assumption of how the package manager might be used in a clustered environment). Happy to downgrade to a Minor if that's a less common use case than I've supposed. > PackageLoader not copying manifest.json file to filestore > -- > > Key: SOLR-16813 > URL: https://issues.apache.org/jira/browse/SOLR-16813 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: Package Manager >Affects Versions: 9.2, 9.2.1 >Reporter: Will White >Priority: Major > > When a new Solr node enters the cluster and the package management system > recreates the package from the source node, the filestore location is set up > and the JAR files are copied across correctly, but the {{manifest.json}} file > isn't (and the {{.manifest.json.json}} file isn't generated). This means that > when trying to interact with the package (such as {{bin/solr package > list-installed}}) the command errors out because of the missing expected file. > This can be reproduced by: > * Setup Zookeeper cluster > * Create two solr nodes ("solr1" and "solr2") in cluster > * Install a package on solr1 > ** {{bin/solr package add-repo data-import-handler > "https://raw.githubusercontent.com/searchscale/dataimporthandler/master/repo"}} > ** {{bin/solr package install data-import-handler}} > ** {{find /var/solr/data/filestore}} ({{manifest.json}} and > {{.manifest.json.json}} should exist) > ** {{bin/solr package list-installed}} > * At this point, the package exists as expected on solr2 > * Stop solr1 and solr2 > * Restart the cluster, now with an additional solr3 node > * solr3 will have the JAR files and their {{.json}} files, but the > {{manifest.json}} file won't be present > ** Running {{bin/solr package list-installed}} will throw an exception > > The difference appears to be that the > {{org.apache.solr.packagemanager.RepositoryManager::installPackage}} method > contains the following lines: > {code:java} > // org/apache/solr/packagemanager/RepositoryManager.java#182 > if (release.manifest == null) { > String manifestJson = PackageUtils.getFileFromJarsAsString(downloaded, > "manifest.json"); > if (manifestJson == null) { > throw new SolrException(ErrorCode.NOT_FOUND, "No manifest found for > package: " + packageName + ", version: " + version); > } > release.manifest = getMapper().readValue(manifestJson, > SolrPackage.Manifest.class); > } > String manifestJson = getMapper().writeValueAsString(release.manifest); > // We go on to write this file later > {code} > But in the equivalent space in > {{org.apache.solr.pkg.SolrPackageLoader.Version()}}, the equivalent section > to fetch the JAR files if missing doesn't reference the version.manifest > anywhere: > {code:java} > // org/apache/solr/pkg/SolrPackageLoader.java#276 > coreContainer.getPackageStoreAPI().validateFiles(version.files, true, s -> > errs.add(s)); > if (!errs.isEmpty()) { > ... > } > for (String file : version.files) { > > paths.add(coreContainer.getPackageStoreAPI().getPackageStore().getRealpath(file)); > } > {code} > Something like [this > commit|https://github.com/apache/solr/commit/9fc5253eef573dc5e53edf52d98d4a2e55e00e1e] > solves the problem by manually copying the file across, but I'm not sure if > this is the best approach to take. > I'm happy to put a full PR together, although I'd appreciate it if someone > could confirm whether the {{manifest.json}} file should be transferred from > the source node like this (and can be treated as mandatory), or if I've > misunderstood how this should be working. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
[jira] [Updated] (SOLR-16814) Package tool 'add-repo' command fails on mime type handling inside Docker image for text/plain sources
[ https://issues.apache.org/jira/browse/SOLR-16814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Will White updated SOLR-16814: -- Description: When building a Docker image from {{main}} the mime type handling no longer handles application/json presented as text/plain when it previously did so (as of 9.2). Cause of change: https://github.com/apache/solr/pull/1575 Reproduction steps: * Checkout Solr to {{main}} * Build local image via {{./gradlew docker}} * Run this image in cloud mode, including {{-Denable.packages=true}} * Open shell on container (e.g. {{docker compose exec NAME /bin/bash}}) * Run {{bin/solr package add-repo data-import-handler "https://raw.githubusercontent.com/searchscale/dataimporthandler/master/repo"}} This will attempt to pull down the repository.json file (which githubusercontent provides as text/plain), and it will throw an error. Error message: {code:bash} /opt/solr-10.0.0-SNAPSHOT$ bin/solr package add-repo data-import-handler "https://raw.githubusercontent.com/searchscale/dataimporthandler/master/repo"; Found 1 Solr nodes: Solr process 14 running on port 8983 com.fasterxml.jackson.databind.JsonMappingException: Error from server at https://raw.githubusercontent.com/searchscale/dataimporthandler/master/repo/repository.json?wt=json&version=2.2: Expected mime type in [application/json] but got text/plain. {code} was: When building a Docker image from {main} the mime type handling no longer handles application/json presented as text/plain when it previously did so (as of 9.2). Cause of change: https://github.com/apache/solr/pull/1575 Reproduction steps: * Checkout Solr to {{main}} * Build local image via {{./gradlew docker}} * Run this image in cloud mode, including {{-Denable.packages=true}} * Open shell on container (e.g. {{docker compose exec NAME /bin/bash}}) * Run {{bin/solr package add-repo data-import-handler "https://raw.githubusercontent.com/searchscale/dataimporthandler/master/repo"}} This will attempt to pull down the repository.json file (which githubusercontent provides as text/plain), and it will throw an error. Error message: {code:bash} /opt/solr-10.0.0-SNAPSHOT$ bin/solr package add-repo data-import-handler "https://raw.githubusercontent.com/searchscale/dataimporthandler/master/repo"; Found 1 Solr nodes: Solr process 14 running on port 8983 com.fasterxml.jackson.databind.JsonMappingException: Error from server at https://raw.githubusercontent.com/searchscale/dataimporthandler/master/repo/repository.json?wt=json&version=2.2: Expected mime type in [application/json] but got text/plain. {code} > Package tool 'add-repo' command fails on mime type handling inside Docker > image for text/plain sources > -- > > Key: SOLR-16814 > URL: https://issues.apache.org/jira/browse/SOLR-16814 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: Package Manager >Affects Versions: main (10.0) >Reporter: Will White >Priority: Minor > > When building a Docker image from {{main}} the mime type handling no longer > handles application/json presented as text/plain when it previously did so > (as of 9.2). > Cause of change: https://github.com/apache/solr/pull/1575 > Reproduction steps: > * Checkout Solr to {{main}} > * Build local image via {{./gradlew docker}} > * Run this image in cloud mode, including {{-Denable.packages=true}} > * Open shell on container (e.g. {{docker compose exec NAME /bin/bash}}) > * Run {{bin/solr package add-repo data-import-handler > "https://raw.githubusercontent.com/searchscale/dataimporthandler/master/repo"}} > This will attempt to pull down the repository.json file (which > githubusercontent provides as text/plain), and it will throw an error. > Error message: > {code:bash} > /opt/solr-10.0.0-SNAPSHOT$ bin/solr package add-repo data-import-handler > "https://raw.githubusercontent.com/searchscale/dataimporthandler/master/repo"; > Found 1 Solr nodes: > Solr process 14 running on port 8983 > com.fasterxml.jackson.databind.JsonMappingException: Error from server at > https://raw.githubusercontent.com/searchscale/dataimporthandler/master/repo/repository.json?wt=json&version=2.2: > Expected mime type in [application/json] but got text/plain. > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
[jira] [Created] (SOLR-16814) Package tool 'add-repo' command fails on mime type handling inside Docker image for text/plain sources
Will White created SOLR-16814: - Summary: Package tool 'add-repo' command fails on mime type handling inside Docker image for text/plain sources Key: SOLR-16814 URL: https://issues.apache.org/jira/browse/SOLR-16814 Project: Solr Issue Type: Bug Security Level: Public (Default Security Level. Issues are Public) Components: Package Manager Affects Versions: main (10.0) Reporter: Will White When building a Docker image from {main} the mime type handling no longer handles application/json presented as text/plain when it previously did so (as of 9.2). Cause of change: https://github.com/apache/solr/pull/1575 Reproduction steps: * Checkout Solr to {{main}} * Build local image via {{./gradlew docker}} * Run this image in cloud mode, including {{-Denable.packages=true}} * Open shell on container (e.g. {{docker compose exec NAME /bin/bash}}) * Run {{bin/solr package add-repo data-import-handler "https://raw.githubusercontent.com/searchscale/dataimporthandler/master/repo"}} This will attempt to pull down the repository.json file (which githubusercontent provides as text/plain), and it will throw an error. Error message: {code:bash} /opt/solr-10.0.0-SNAPSHOT$ bin/solr package add-repo data-import-handler "https://raw.githubusercontent.com/searchscale/dataimporthandler/master/repo"; Found 1 Solr nodes: Solr process 14 running on port 8983 com.fasterxml.jackson.databind.JsonMappingException: Error from server at https://raw.githubusercontent.com/searchscale/dataimporthandler/master/repo/repository.json?wt=json&version=2.2: Expected mime type in [application/json] but got text/plain. {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
[jira] [Updated] (SOLR-16814) Package tool 'add-repo' command fails on mime type handling inside Docker image for text/plain sources
[ https://issues.apache.org/jira/browse/SOLR-16814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Will White updated SOLR-16814: -- Description: When building a Docker image from {{main}} the mime type handling no longer handles application/json presented as text/plain when it previously did so (as of 9.2). Cause of change: https://github.com/apache/solr/pull/1575 Reproduction steps: * Checkout Solr to {{main}} * Build local image via {{./gradlew docker}} * Run this image in cloud mode, including {{-Denable.packages=true}} * Open shell on container (e.g. {{docker compose exec NAME /bin/bash}}) * Run {{bin/solr package add-repo data-import-handler "https://raw.githubusercontent.com/searchscale/dataimporthandler/master/repo"}} This will attempt to pull down the repository.json file (which githubusercontent provides as text/plain), and it will throw an error. Error message: {code:bash} /opt/solr-10.0.0-SNAPSHOT$ bin/solr package add-repo data-import-handler "https://raw.githubusercontent.com/searchscale/dataimporthandler/master/repo"; Found 1 Solr nodes: Solr process 14 running on port 8983 com.fasterxml.jackson.databind.JsonMappingException: Error from server at https://raw.githubusercontent.com/searchscale/dataimporthandler/master/repo/repository.json?wt=json&version=2.2: Expected mime type in [application/json] but got text/plain. {code} The contents of that file are valid JSON, but don't have the correct header type sent back. This does correctly reduce the attack surface of the image (as SOLR-16752 intends), but I've opened this ticket as a loss of behaviour which might impact SolrCloud users. was: When building a Docker image from {{main}} the mime type handling no longer handles application/json presented as text/plain when it previously did so (as of 9.2). Cause of change: https://github.com/apache/solr/pull/1575 Reproduction steps: * Checkout Solr to {{main}} * Build local image via {{./gradlew docker}} * Run this image in cloud mode, including {{-Denable.packages=true}} * Open shell on container (e.g. {{docker compose exec NAME /bin/bash}}) * Run {{bin/solr package add-repo data-import-handler "https://raw.githubusercontent.com/searchscale/dataimporthandler/master/repo"}} This will attempt to pull down the repository.json file (which githubusercontent provides as text/plain), and it will throw an error. Error message: {code:bash} /opt/solr-10.0.0-SNAPSHOT$ bin/solr package add-repo data-import-handler "https://raw.githubusercontent.com/searchscale/dataimporthandler/master/repo"; Found 1 Solr nodes: Solr process 14 running on port 8983 com.fasterxml.jackson.databind.JsonMappingException: Error from server at https://raw.githubusercontent.com/searchscale/dataimporthandler/master/repo/repository.json?wt=json&version=2.2: Expected mime type in [application/json] but got text/plain. {code} > Package tool 'add-repo' command fails on mime type handling inside Docker > image for text/plain sources > -- > > Key: SOLR-16814 > URL: https://issues.apache.org/jira/browse/SOLR-16814 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: Package Manager >Affects Versions: main (10.0) >Reporter: Will White >Priority: Minor > > When building a Docker image from {{main}} the mime type handling no longer > handles application/json presented as text/plain when it previously did so > (as of 9.2). > Cause of change: https://github.com/apache/solr/pull/1575 > Reproduction steps: > * Checkout Solr to {{main}} > * Build local image via {{./gradlew docker}} > * Run this image in cloud mode, including {{-Denable.packages=true}} > * Open shell on container (e.g. {{docker compose exec NAME /bin/bash}}) > * Run {{bin/solr package add-repo data-import-handler > "https://raw.githubusercontent.com/searchscale/dataimporthandler/master/repo"}} > This will attempt to pull down the repository.json file (which > githubusercontent provides as text/plain), and it will throw an error. > Error message: > {code:bash} > /opt/solr-10.0.0-SNAPSHOT$ bin/solr package add-repo data-import-handler > "https://raw.githubusercontent.com/searchscale/dataimporthandler/master/repo"; > Found 1 Solr nodes: > Solr process 14 running on port 8983 > com.fasterxml.jackson.databind.JsonMappingException: Error from server at > https://raw.githubusercontent.com/searchscale/dataimporthandler/master/repo/repository.json?wt=json&version=2.2: > Expected mime type in [application/json] but got text/plain. > {code} > The contents of that file are valid JSON, but don't have the correct header > type sent back. > This does correctly reduce the attack surface of the image (as SOLR-16752 > intends), but I've o
[jira] [Created] (SOLR-16820) PackageUtils collection validation is more restrictive than CreateCollectionAPI allows
Will White created SOLR-16820: - Summary: PackageUtils collection validation is more restrictive than CreateCollectionAPI allows Key: SOLR-16820 URL: https://issues.apache.org/jira/browse/SOLR-16820 Project: Solr Issue Type: Bug Security Level: Public (Default Security Level. Issues are Public) Components: Package Manager Reporter: Will White It's possible to create a collection via the CreateCollectionAPI which [passes validation from the SolrIdentifierValidation|https://github.com/apache/solr/blob/main/solr/solrj/src/java/org/apache/solr/client/solrj/util/SolrIdentifierValidator.java#L50-L52] (a regex which includes the '.' character), but that same collection name won't then pass validation when deployed/undeployed via the PackageTool because of the [packagemanager.PackageUtils validateCollection() method|https://github.com/apache/solr/blob/main/solr/core/src/java/org/apache/solr/packagemanager/PackageUtils.java#L271]. A change [like this, using the existing SolrIdentifierValidator|https://github.com/apache/solr/commit/638fd768ebd7ed7908029ced08e56bed05a4a2a5] would bring the two validation steps back in line, although there's presumably a better approach. *Potential risks* As highlighted by Gus Heck [in this thread|https://lists.apache.org/thread/h7hnksgqwxxl7nkwkhn01r6jn8xjkjjs] changing the validation of collection names could be a risky change to make. The source of the PackageUtils regex appears to be [https://github.com/apache/lucene-solr/pull/994] from before Solr split from the Lucene project, and it seems that the regex wasn't crafted for a specific subset of use cases that specifically excluded the '.' character - it just appears to be the regex implemented at the time. Using the {{SolrIdentifierValidator}} approach mentioned above as an example, other than disallowing a collection name that begins with a '-' character, the {{SolrIdentifierValidator.identifierPattern}} would be a strict expansion of the allowed collection names for the {{PackageUtils.validateCollections}}. Any other solution (such as [this more naive example|https://github.com/apache/solr/blame/998fffdccf51a0560589e2cb413e9da127a5f26e/solr/core/src/java/org/apache/solr/packagemanager/PackageUtils.java#L271]) could similarly mitigate a lot of the potential risk by only expanding the allowed collection names. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
[jira] [Updated] (SOLR-16820) PackageUtils collection validation is more restrictive than CreateCollectionAPI allows
[ https://issues.apache.org/jira/browse/SOLR-16820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Will White updated SOLR-16820: -- Description: It's possible to create a collection via the CreateCollectionAPI which [passes validation from the SolrIdentifierValidation|https://github.com/apache/solr/blob/main/solr/solrj/src/java/org/apache/solr/client/solrj/util/SolrIdentifierValidator.java#L50-L52] (a regex which among other elements includes the '.' character), but that same collection name won't then pass validation when deployed/undeployed via the PackageTool because of the [packagemanager.PackageUtils validateCollection() method|https://github.com/apache/solr/blob/main/solr/core/src/java/org/apache/solr/packagemanager/PackageUtils.java#L271]. A change [like this, using the existing SolrIdentifierValidator|https://github.com/apache/solr/commit/638fd768ebd7ed7908029ced08e56bed05a4a2a5] would bring the two validation steps back in line, although there's presumably a better approach. *Potential risks* As highlighted by Gus Heck [in this thread|https://lists.apache.org/thread/h7hnksgqwxxl7nkwkhn01r6jn8xjkjjs] changing the validation of collection names could be a risky change to make. The source of the PackageUtils regex appears to be [https://github.com/apache/lucene-solr/pull/994] from before Solr split from the Lucene project, and it seems that the regex wasn't crafted for a specific subset of use cases that specifically excluded the '.' character - it just appears to be the regex implemented at the time. Using the {{SolrIdentifierValidator}} approach mentioned above as an example, other than disallowing a collection name that begins with a '-' character, the {{SolrIdentifierValidator.identifierPattern}} would be a strict expansion of the allowed collection names for the {{{}PackageUtils.validateCollections{}}}. Any other solution (such as [this more naive example|https://github.com/apache/solr/blame/998fffdccf51a0560589e2cb413e9da127a5f26e/solr/core/src/java/org/apache/solr/packagemanager/PackageUtils.java#L271]) could similarly mitigate a lot of the potential risk by only expanding the allowed collection names. was: It's possible to create a collection via the CreateCollectionAPI which [passes validation from the SolrIdentifierValidation|https://github.com/apache/solr/blob/main/solr/solrj/src/java/org/apache/solr/client/solrj/util/SolrIdentifierValidator.java#L50-L52] (a regex which includes the '.' character), but that same collection name won't then pass validation when deployed/undeployed via the PackageTool because of the [packagemanager.PackageUtils validateCollection() method|https://github.com/apache/solr/blob/main/solr/core/src/java/org/apache/solr/packagemanager/PackageUtils.java#L271]. A change [like this, using the existing SolrIdentifierValidator|https://github.com/apache/solr/commit/638fd768ebd7ed7908029ced08e56bed05a4a2a5] would bring the two validation steps back in line, although there's presumably a better approach. *Potential risks* As highlighted by Gus Heck [in this thread|https://lists.apache.org/thread/h7hnksgqwxxl7nkwkhn01r6jn8xjkjjs] changing the validation of collection names could be a risky change to make. The source of the PackageUtils regex appears to be [https://github.com/apache/lucene-solr/pull/994] from before Solr split from the Lucene project, and it seems that the regex wasn't crafted for a specific subset of use cases that specifically excluded the '.' character - it just appears to be the regex implemented at the time. Using the {{SolrIdentifierValidator}} approach mentioned above as an example, other than disallowing a collection name that begins with a '-' character, the {{SolrIdentifierValidator.identifierPattern}} would be a strict expansion of the allowed collection names for the {{PackageUtils.validateCollections}}. Any other solution (such as [this more naive example|https://github.com/apache/solr/blame/998fffdccf51a0560589e2cb413e9da127a5f26e/solr/core/src/java/org/apache/solr/packagemanager/PackageUtils.java#L271]) could similarly mitigate a lot of the potential risk by only expanding the allowed collection names. > PackageUtils collection validation is more restrictive than > CreateCollectionAPI allows > -- > > Key: SOLR-16820 > URL: https://issues.apache.org/jira/browse/SOLR-16820 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: Package Manager >Reporter: Will White >Priority: Minor > Labels: packagemanager > > It's possible to create a collection via the CreateCollectionAPI which > [passes validation from the > SolrIdentifierValidation|https://github.com/apache/solr/blob/main/so
[jira] [Commented] (SOLR-16820) PackageUtils collection validation is more restrictive than CreateCollectionAPI allows
[ https://issues.apache.org/jira/browse/SOLR-16820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17727656#comment-17727656 ] Will White commented on SOLR-16820: --- Sure thing [~gerlowskija], I'd be happy to (y) It'll likely be Sunday at the earliest (busy week ahead!), hope that's ok time-wise. > PackageUtils collection validation is more restrictive than > CreateCollectionAPI allows > -- > > Key: SOLR-16820 > URL: https://issues.apache.org/jira/browse/SOLR-16820 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: Package Manager >Reporter: Will White >Priority: Minor > Labels: packagemanager > > It's possible to create a collection via the CreateCollectionAPI which > [passes validation from the > SolrIdentifierValidation|https://github.com/apache/solr/blob/main/solr/solrj/src/java/org/apache/solr/client/solrj/util/SolrIdentifierValidator.java#L50-L52] > (a regex which among other elements includes the '.' character), but that > same collection name won't then pass validation when deployed/undeployed via > the PackageTool because of the [packagemanager.PackageUtils > validateCollection() > method|https://github.com/apache/solr/blob/main/solr/core/src/java/org/apache/solr/packagemanager/PackageUtils.java#L271]. > A change [like this, using the existing > SolrIdentifierValidator|https://github.com/apache/solr/commit/638fd768ebd7ed7908029ced08e56bed05a4a2a5] > would bring the two validation steps back in line, although there's > presumably a better approach. > *Potential risks* > As highlighted by Gus Heck [in this > thread|https://lists.apache.org/thread/h7hnksgqwxxl7nkwkhn01r6jn8xjkjjs] > changing the validation of collection names could be a risky change to make. > The source of the PackageUtils regex appears to be > [https://github.com/apache/lucene-solr/pull/994] from before Solr split from > the Lucene project, and it seems that the regex wasn't crafted for a specific > subset of use cases that specifically excluded the '.' character - it just > appears to be the regex implemented at the time. > Using the {{SolrIdentifierValidator}} approach mentioned above as an example, > other than disallowing a collection name that begins with a '-' character, > the {{SolrIdentifierValidator.identifierPattern}} would be a strict expansion > of the allowed collection names for the > {{{}PackageUtils.validateCollections{}}}. Any other solution (such as [this > more naive > example|https://github.com/apache/solr/blame/998fffdccf51a0560589e2cb413e9da127a5f26e/solr/core/src/java/org/apache/solr/packagemanager/PackageUtils.java#L271]) > could similarly mitigate a lot of the potential risk by only expanding the > allowed collection names. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org