`info sort` (which mostly documents the POSIX `sort` command) says that LC_COLLATE is what affects sort order on POSIX-like systems. As far as I know, LC_ALL overrides it when set, and LANG only supplies the default when neither LC_ALL nor LC_COLLATE is set.
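
A quick way to check this on a given machine is a small shell sketch like the one below. It assumes a POSIX shell with `sort` on the PATH, and the file names are made up for illustration. Prefixing the command with `LC_ALL=C` should force byte-order comparison whatever LANG or LC_COLLATE are set to.

```shell
# Print the locale variables this shell would pass on to sort
# (empty means the variable is unset).
printf 'LANG=%s LC_COLLATE=%s LC_ALL=%s\n' "${LANG-}" "${LC_COLLATE-}" "${LC_ALL-}"

# Hypothetical file names, chosen only to show upper vs. lower case.
names='b.avsc
A.avsc
a.avsc
B.avsc'

# LC_ALL takes precedence over LC_COLLATE and LANG, so this sorts by
# raw byte value regardless of the surrounding environment.
c_sorted=$(printf '%s\n' "$names" | LC_ALL=C sort)
printf '%s\n' "$c_sorted"
```

Dropping the `LC_ALL=C` prefix shows whatever order the machine's own locale produces, which may differ from one system to the next.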
"C" is, as Michael says the "safest" and/or most predictable one, it sorts by the naive reading of the underlying bytes, no lexical sorting of numbers and lowercase numbers sort earlier than their uppercase counterparts (because they are smaller binary numbers) Hope this helps Lee Hambley http://lee.hambley.name/ +49 (0) 170 298 5667 On Thu, 28 Nov 2019 at 15:56, Michael A. Smith <mich...@smith-li.com> wrote: > On unixish systems it probably depends on the locale, as in LANG and > LC_COLLATE. In my experience, the least surprising behavior comes with > LANG=C, except when you're dealing with file names containing a lot of > non-ascii text. > > On Thu, Nov 28, 2019 at 04:29 Ryan Skraba <r...@skraba.com> wrote: > >> Effectively, the schemas are added in the order that the file system >> lists files: >> https://github.com/apache/avro/blob/f310ac8db5ab962a49d448f41b7b953488cdb033/lang/java/tools/src/main/java/org/apache/avro/tool/SpecificCompilerTool.java#L149 >> >> As you observed, this depends on the operating system and/or >> filesystem... I've experienced this in the past (with an unrelated >> tool that generated a classpath from a list of JARS, and seeing an >> unreliable order on Windows vs. linux). >> >> Just reading the code, it should be deterministic if you explicitly >> list the avsc files (or at least the "problem" file) with the >> required order: >> >> java -jar avro-tools-1.9.1.jar compile schemas/Component.avsc >> /schemas/Parent.avsc out-dir/ >> >> or >> >> java -jar avro-tools-1.9.1.jar compile schemas/Component.avsc schemas/ >> out-dir/ >> >> Would it be possible to give this workaround a try? >> >> I took a quick look at the avro-maven-plugin; it doesn't use >> listFiles() directly to discover files, but uses FileSetManager from >> the maven project. I'm hoping they've taken this into account! >> >> Thanks for the well-described, well-defined email! 
It would make an >> excellent bug report :D https://issues.apache.org/jira/browse/AVRO >> >> Ryan >> >> >> On Thu, Nov 28, 2019 at 12:05 AM Austin Cawley-Edwards >> <austin.caw...@gmail.com> wrote: >> > >> > Hi, >> > >> > We're trying to use the `compile {src dir} {output dir}` command in >> > `avro-tools` and finding that there are some non-deterministic >> > behaviors between systems, depending on how the OS sorts files. >> > >> > Example: >> > schemas/Component.avsc >> > - defines Component record type in the namespace `com.test` >> > >> > schemas/Parent.avsc >> > - defines a Parent record, in the same `com.test` namespace, with a >> > field of type `com.test.Component` >> > >> > >> > With the command, `java -jar avro-tools-1.9.1.jar compile schemas/ >> > out-dir/`, some systems compile the directory in the order Component, >> > Parent while others compile in the order Parent, Component. The latter >> > fails as Component has not been defined when it is referenced by >> > Parent. >> > >> > We have also tried using the IDL and importing the dependency types, >> > and then converting them to avsc, and finally compiling the entire >> > directory, but that fails as the generated avsc files embed/ duplicate >> > the "Component" types each time it is used. >> > >> > >> > Is there a way to deterministically compile a directory? Or compile >> > directly from IDL to java? >> > >> > >> > OS: >> > Linux 857aaf92e059 4.15.0-70-generic #79-Ubuntu SMP Tue Nov 12 >> > 10:36:11 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux >> > >> > Avro: >> > version 1.9.1 >> > >> > >> > >> > Thank you! >> > Austin Cawley-Edwards >> >
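
Putting the two suggestions in this thread together (an explicit file list, sorted in the C locale), a rough shell sketch might look like the following. The `schema_dir` and `out_dir` variables and the `list_schemas` helper are hypothetical names, and the `java -jar` invocation is copied in the form used earlier in this thread rather than checked against the tool. The sketch assumes file names without spaces, and it only prints the command so the order can be inspected first.

```shell
# Hypothetical locations, mirroring the example in this thread.
schema_dir=${schema_dir:-schemas}
out_dir=${out_dir:-out-dir}

# List the .avsc files under a directory in byte order, so the result
# does not depend on the OS, the filesystem, or the user's locale.
list_schemas() {
  find "$1" -type f -name '*.avsc' | LC_ALL=C sort
}

# Print (rather than run) the compile command with the files in that
# fixed order. Unquoted expansion is deliberate: one argument per file,
# which is why names containing spaces are not supported here.
if [ -d "$schema_dir" ]; then
  echo java -jar avro-tools-1.9.1.jar compile $(list_schemas "$schema_dir") "$out_dir"
fi
```

A fixed byte order only helps when each dependency happens to sort before the schema that uses it, as Component does before Parent. Otherwise the files would still need to be listed by hand in dependency order.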