Re: [numbers][gsoc] GSoC 2022 - NUMBERS-186 Proposal
On Sat, 11 Jun 2022 at 06:08, Sumanth Rajkumar wrote: > > The type change would have been source compatible where existing > applications would have to recompile (without any code changes) to run > with the new version? > Yes. > > So yes, we cannot change Complex type to interface to maintain binary > runtime compatibility > > I assume, binary compatibility would be maintained if we add additional > interfaces with new methods to Complex class? > Yes. To use new methods another application would have to recompile source. Those applications that do now recompile source are ignorant of any new methods.
Re: [numbers][gsoc] GSoC 2022 - NUMBERS-186 Proposal
On Sat, 11 Jun 2022 at 06:30, Sumanth Rajkumar wrote: > On Fri, Jun 10, 2022, 19:30 Gilles Sadowski wrote: > > > The current implementation of "Complex" encapsulates two "double" fields > > (not > > a "double[]"). > > Should we make two, at first separate, discussions: One for the > > implementation > > of the "complex number" concept, and another for (n-dimensional) lists of > > them? > > > > Yes. We could also change existing Complex type from two double fields to > double[] and even make it implement new ComplexNumber interface? This > should maintain runtime compatibility? > True. It does have the following disadvantages: - No longer immutable. Immutability would have to be enforced by encapsulation. - Increases memory consumption. Note this would compound the original issue with representing Complex in a List with inefficient memory consumption. So any application that does not update to the new implementation would suffer. > > I was thinking this would allow efficient reuse of functions processing > lists also to be used on single numbers (Complex) which is also a list of > size 1 > But we have also discussed other ways to share code between a single number and a list of them that are more safe, i.e. those that represent each entry of the list as a type to pass to the computation method which stores the result in the provided computation result. > > Jtransform has support for large arrays (size greater than integer.max with > array index of type long. > > Do we need to support that now or add that support later if needed? > No. You have to have a very good reason to use the undocumented sun.misc.Unsafe in code. I do not think we currently have that. Note that ~2^31 is a lot of numbers and to exceed this would be a specialist application for example using JTransforms 3D FFT on very large 3D data represented as a 1D array.
Re: [numbers][gsoc] GSoC 2022 - NUMBERS-186 Proposal
On Sat, 11 Jun 2022 at 07:09, Sumanth Rajkumar wrote: > Hi Alex and Gilles, > > For the complex functional interfaces, we have so far been working on > different options but all operate on single complex numbers one at a > time... > > Instead can we design around Functional Interfaces that work on List/Arrays > instead? > An interesting approach. IIUC this approach is to allow parallel optimisation. So the question is what computations would be performed in parallel on a large set of numbers? I am wondering if all the ISO c99 functions would ever be used in this way. And if they are, how well they would optimise given the high number of branch statements in many of the functions to handle edge cases. The optimisation is performing: for i in range: x(i) = f(x(i)) So you have to provide a function to operate on a number to obtain another number. How is this coded in the Java Vector API for simd support? Would a single layer of abstraction mapping a double[] (in the list) to ComplexNumber (for the function input) and back (from the function output) impact performance? <-- SNIP --> > > Interface ComplexDoubleArray{ > > DoubleStream realAndImgPairs (int index, int length) > > <-- SNIP --> A stream would have to have the pair of [real, imag] as the unit that the stream operates on. It cannot be a (possibly interleaved) stream of each part otherwise no function can operate on the whole complex number. Any API would require: Stream stream(int index, int length) with T the type of the complex number. ... > > } > > @FunctionalInterface > Interface ComplexDoubleUnaryOperator{ > > void apply(ComplexDoubleArray input, ComplexDoubleArray output, int > inputoffset, int outputoffset, int length); > > } > > This has following advantages > > 1) iterating implementations over lists will now be in the complex function > instead of ComplexList data structures allowing for different > implementations. For example we could have multi threaded java.util.stream > based operation on lists and single threaded cursor while loop based > implementation as suggested before by Alex. > Thinking about streams... For multi-threaded stream based operation this would: - Create the stream with a start and end point - Create a custom Spliterator holding the range limits - Allow division of the Spliterator which handles dividing the range until no more divisions are required (max threads reached) Thus the final unit to operate on is an unknown range (this is known to the Spliterator) of numbers. The unit for the functional interface is then ComplexDoubleArray: interface ComplexDoubleArray { Stream stream(int start, int length); } ComplexDoubleArray a; // Will use the Java 8 ForkJoinPool.commonPool() for parallel execution a.stream(start, length).parallel().forEach(x -> ComplexFunctions.conj(x, x)); class ComplexFunctions { static void conj(ComplexDoubleArray in, ComplexDoubleArray out); } So for streams the operations are required to operate on a range that is unknown. Internally the method must then have a way to access each entry of the ComplexDoubleArray, which is effectively a List.subList(...). Did you envision a different implementation? > 2) it will be required to take advantage of simd parallelism as discussed > before. For example conjugate operation on Complex list (double[] of real > and imaginary pairs) would be vector multiplication with array [ > 1,-1,1-1,..] > > If functional interface enforce operation on one complex number at a time, > we might not be able take advantage of simd optimizations? > IIUC the Java vector API is for multiplications, additions, and other arithmetic on primitives. The API is not finalised but it appears to have support for pow and sqrt elementary functions. So this would not allow full support for the ISO c99 elementary functions in Complex. Also the API requires a primitive array to be converted to a Vector species and operated on. So perhaps to support this only requires that any list of complex numbers can return the real and imaginary parts: interface ComplexList { void getReal(int start, int length, double[] dest); void getImaginary(int start, int length, double[] dest); // set methods } Then the functions can be written to extract the parts from the list, operate on them using the Java vector API and return them to the list. These functions would have an advantage on simple primitive operations. But any functions that require conditional logic may not be optimised (this is not clear from the examples I found). So for instance the multiplication of two complex numbers performed using: (a + i b)(c + i d) = (ac - bd) + i (ad + bc) final double ac = a * c; final double bd = b * d; final double ad = a * d; final double bc = b * c; double x = ac - bd; double y = ad + bc; The result x and y are then checked for NaN and suitable infinities recovered which requires a lot of logic. This logic may have to be moved out of the vector optimised loop and pe
Re: [numbers][gsoc] GSoC 2022 - NUMBERS-186 Proposal
Hello. > [...] > > interface ComplexDoubleArray { > Stream stream(int start, int length); > } > > ComplexDoubleArray a; > // Will use the Java 8 ForkJoinPool.commonPool() for parallel execution > a.stream(start, length).parallel().forEach(x -> ComplexFunctions.conj(x, > x)); > > class ComplexFunctions { > static void conj(ComplexDoubleArray in, ComplexDoubleArray out); > } > > [...] I have a hard time figuring out whether these bits of code are intended to become the application developer API... What data-structure(s) will be visible (from the application)? What will be hidden ("implementation details")? Do we have use-cases of non-trivial processing of N-dimensional cubes of complex numbers? [I imagine that the same API should be able to also process cubes of real numbers (without storing the "0" imaginary parts).] Gilles - To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org
RE: [collections] Add a list/deque faster than TreeList?
I'd like to point out that rodde has also asked this project to be included in OpenJDK: https://mail.openjdk.java.net/pipermail/core-libs-dev/2022-June/091412.html On 2022/06/10 19:29:43 Rodion Efremov wrote: Hi, I have this List/Deque data structure: https://github.com/coderodde/IndexedLinkedList In a versatile benchmark, it outperforms the O(log n) TreeList easily by the factor of 2. (Also, it requires less memory than TreeList.) What do you think? Does it deserve to be included in collections? Best regards, rodde - To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org
Re: [collections] Add a list/deque faster than TreeList?
Looks pretty interesting. I’ll be honest that my own data structure expertise revolves around immutable persistent patterns rather than mutable ones, but your class makes perfect sense as a mutable one. Do you have any notes on thread safety? While it’s neat that you’re submitting to the JDK as well, that sort of process is fairly long term, so I’d imagine that Collections would be a great place for it. If you’re trying to donate this to multiple projects, then Eclipse also has a collections library that might like it, and Guava might like it, too. — Matt Sicker > On Jun 10, 2022, at 14:30, Rodion Efremov wrote: > > Hi, > > I have this List/Deque data structure: > > https://github.com/coderodde/IndexedLinkedList > > In a versatile benchmark, it outperforms the O(log n) TreeList easily by > the factor of 2. (Also, it requires less memory than TreeList.) > > What do you think? Does it deserve to be included in collections? > > Best regards, > rodde
Re: [collections] Add a list/deque faster than TreeList?
Hi Matt and community, About thread safety: I keep an int counting modifications (called modCount). Now, spliterator/iterator/sublist check that modCount == expectedModCount, and if that is not the case, throws ConcurrentModificationException. Basically, it resembles the fail-fast behavior of ArrayList/LinkedList. In order to deal with concurrency, one should wrap an instance of IndexedLinkedList into Collections.synchronizedList(...). About Guava: they happily turned down my offer. Best regards, rodde la 11.6.2022 klo 19.44 Matt Sicker kirjoitti: > Looks pretty interesting. I’ll be honest that my own data structure > expertise revolves around immutable persistent patterns rather than mutable > ones, but your class makes perfect sense as a mutable one. > > Do you have any notes on thread safety? > > While it’s neat that you’re submitting to the JDK as well, that sort of > process is fairly long term, so I’d imagine that Collections would be a > great place for it. If you’re trying to donate this to multiple projects, > then Eclipse also has a collections library that might like it, and Guava > might like it, too. > > — > Matt Sicker > > > On Jun 10, 2022, at 14:30, Rodion Efremov wrote: > > > > Hi, > > > > I have this List/Deque data structure: > > > > https://github.com/coderodde/IndexedLinkedList > > > > In a versatile benchmark, it outperforms the O(log n) TreeList easily by > > the factor of 2. (Also, it requires less memory than TreeList.) > > > > What do you think? Does it deserve to be included in collections? > > > > Best regards, > > rodde >
Re: [collections] Add a list/deque faster than TreeList?
About thread safety, I was mostly interested in yes or no, though the details are neat. We have annotations for that so that users know which classes are applicable to different situations. Guava seems to be more of a giant toolbox than an easily embedded library like things at Commons, so I suppose that doesn’t surprise me too much. — Matt Sicker > On Jun 11, 2022, at 12:25, Rodion Efremov wrote: > > Hi Matt and community, > > About thread safety: I keep an int counting modifications (called > modCount). Now, spliterator/iterator/sublist check that modCount == > expectedModCount, and if that is not the case, throws > ConcurrentModificationException. Basically, it resembles the fail-fast > behavior of ArrayList/LinkedList. In order to deal with concurrency, one > should wrap an instance of IndexedLinkedList into > Collections.synchronizedList(...). > > About Guava: they happily turned down my offer. > > Best regards, > rodde > > la 11.6.2022 klo 19.44 Matt Sicker kirjoitti: > >> Looks pretty interesting. I’ll be honest that my own data structure >> expertise revolves around immutable persistent patterns rather than mutable >> ones, but your class makes perfect sense as a mutable one. >> >> Do you have any notes on thread safety? >> >> While it’s neat that you’re submitting to the JDK as well, that sort of >> process is fairly long term, so I’d imagine that Collections would be a >> great place for it. If you’re trying to donate this to multiple projects, >> then Eclipse also has a collections library that might like it, and Guava >> might like it, too. >> >> — >> Matt Sicker >> On Jun 10, 2022, at 14:30, Rodion Efremov wrote: >>> >>> Hi, >>> >>> I have this List/Deque data structure: >>> >>> https://github.com/coderodde/IndexedLinkedList >>> >>> In a versatile benchmark, it outperforms the O(log n) TreeList easily by >>> the factor of 2. (Also, it requires less memory than TreeList.) >>> >>> What do you think? Does it deserve to be included in collections? >>> >>> Best regards, >>> rodde >>
Re: [collections] Add a list/deque faster than TreeList?
Hello, I agree that this looks interesting. I personally would like to see the following before weighing in on whether or not to include it in commons: 1. A list of use cases where this data structure would be potentially more performant or useful than existing data structures. 2. A set of JMH[1] benchmarks that compares this data structure to TreeList and ArrayList (and others if needed). I see that there are basic benchmarks already in the code, but JMH will give us much more accurate and reproducible results. There are examples of such benchmarks in several commons projects, e.g., commons-numbers [2] and commons-geometry [3]. Regardless, thank you for your proposal and code, rodde. It is much appreciated. Regards, Matt J [1] https://openjdk.java.net/projects/code-tools/jmh/ [2] https://github.com/apache/commons-numbers/tree/master/commons-numbers-examples/examples-jmh [3] https://github.com/apache/commons-geometry/tree/master/commons-geometry-examples/examples-jmh On Sat, Jun 11, 2022 at 1:25 PM Rodion Efremov wrote: > > Hi Matt and community, > > About thread safety: I keep an int counting modifications (called > modCount). Now, spliterator/iterator/sublist check that modCount == > expectedModCount, and if that is not the case, throws > ConcurrentModificationException. Basically, it resembles the fail-fast > behavior of ArrayList/LinkedList. In order to deal with concurrency, one > should wrap an instance of IndexedLinkedList into > Collections.synchronizedList(...). > > About Guava: they happily turned down my offer. > > Best regards, > rodde > > la 11.6.2022 klo 19.44 Matt Sicker kirjoitti: > > > Looks pretty interesting. I’ll be honest that my own data structure > > expertise revolves around immutable persistent patterns rather than mutable > > ones, but your class makes perfect sense as a mutable one. > > > > Do you have any notes on thread safety? > > > > While it’s neat that you’re submitting to the JDK as well, that sort of > > process is fairly long term, so I’d imagine that Collections would be a > > great place for it. If you’re trying to donate this to multiple projects, > > then Eclipse also has a collections library that might like it, and Guava > > might like it, too. > > > > — > > Matt Sicker > > > > > On Jun 10, 2022, at 14:30, Rodion Efremov wrote: > > > > > > Hi, > > > > > > I have this List/Deque data structure: > > > > > > https://github.com/coderodde/IndexedLinkedList > > > > > > In a versatile benchmark, it outperforms the O(log n) TreeList easily by > > > the factor of 2. (Also, it requires less memory than TreeList.) > > > > > > What do you think? Does it deserve to be included in collections? > > > > > > Best regards, > > > rodde > > - To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org
Re: [numbers][gsoc] GSoC 2022 - NUMBERS-186 Proposal
Hello, Good discussion here! This is great! I lost track of what the overall goal here is while reading through the conversation. The goal of NUMBERS-186 is to "allow operations to be performed on lists of complex numbers". My first thought when looking at this is "how are we going to represent lists of complex numbers?" I don't think there is a single answer to this since the correct format for a list of complex numbers is whatever format the user already has them in. They could be in a file, in separate double arrays, in a single double array (alternating real and imaginary), in a float array, in a java.nio.Buffer, etc. So far, I haven't seen an API that can accomodate all of these. What I would like to see us create is an interface or abstract class like java.util.Buffer that allows us to wrap an underlying storage mechanism with complex number semantics. For example, if you have real and imaginary parts in two separate arrays, you could do something like ComplexBuffer buf = ComplexBuffer.fromRealAndImaginaryArrays(realArr, imArr); Similarly, with a DoubleBuffer: ComplexBuffer buf = ComplexBuffer.fromDoubleBuffer(buf); We can completely hide the classes implementing the wrappings here from the public API so that things are straightforward for the user. If we can first create a simple public API like this, then we can safely focus on performance improvements within the private code. Regards, Matt J On Sat, Jun 11, 2022 at 8:32 AM Gilles Sadowski wrote: > > Hello. > > > [...] > > > > interface ComplexDoubleArray { > > Stream stream(int start, int length); > > } > > > > ComplexDoubleArray a; > > // Will use the Java 8 ForkJoinPool.commonPool() for parallel execution > > a.stream(start, length).parallel().forEach(x -> ComplexFunctions.conj(x, > > x)); > > > > class ComplexFunctions { > > static void conj(ComplexDoubleArray in, ComplexDoubleArray out); > > } > > > > [...] > > I have a hard time figuring out whether these bits of code are > intended to become the application developer API... > What data-structure(s) will be visible (from the application)? > What will be hidden ("implementation details")? > Do we have use-cases of non-trivial processing of N-dimensional > cubes of complex numbers? [I imagine that the same API should > be able to also process cubes of real numbers (without storing the > "0" imaginary parts).] > > Gilles > > - > To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org > For additional commands, e-mail: dev-h...@commons.apache.org > - To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org
Re: [configuration] Jakarta mailapi 2.0.1
I'm glad I asked this question :-) Bruno actually submitted a PR with support for both the old and new namespaces [1] but we decided not to go with it since it felt like too much to support both versions of the API. However, this discussion is making me rethink that choice. For one, dropping support for the old API may actually require users to update their code (as described in my initial email), effectively breaking backwards compatibility. However, we do want to support the newer mailapi version as well so we do not force users to use an outdated dependency. So, unless anyone objects, I will plan on resurrecting Bruno's PR within the next few days and merging it in. Regards, Matt J [1] https://github.com/apache/commons-configuration/pull/107 On Fri, Jun 10, 2022 at 4:32 PM Emmanuel Bourg wrote: > > On 10/06/2022 22:16, Rob Spoor wrote: > > Since reflection is used to create the instances, isn't it possible to > > support both? > > +1, we should support both > > Emmanuel Bourg > > - > To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org > For additional commands, e-mail: dev-h...@commons.apache.org > - To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org
Re: [configuration] Jakarta mailapi 2.0.1
No objections from me. I'm +1 to normally killing old code too, but I think in this case it might be simple to keep both working in [configuration] as users appear to be still transitioning JEE apps to the jakarta namespace. We might just need to remember to remove the old package/namespace when we are certain a vast majority of users have migrated to jakarta. Sorry for the radio silence in that PR, and thanks for taking over Matt! 🎉 -Bruno On Sun, 12 Jun 2022 at 10:26, Matt Juntunen wrote: > I'm glad I asked this question :-) Bruno actually submitted a PR with > support for both the old and new namespaces [1] but we decided not to > go with it since it felt like too much to support both versions of the > API. However, this discussion is making me rethink that choice. For > one, dropping support for the old API may actually require users to > update their code (as described in my initial email), effectively > breaking backwards compatibility. However, we do want to support the > newer mailapi version as well so we do not force users to use an > outdated dependency. > > So, unless anyone objects, I will plan on resurrecting Bruno's PR > within the next few days and merging it in. > > Regards, > Matt J > > > [1] https://github.com/apache/commons-configuration/pull/107 > > On Fri, Jun 10, 2022 at 4:32 PM Emmanuel Bourg wrote: > > > > On 10/06/2022 22:16, Rob Spoor wrote: > > > Since reflection is used to create the instances, isn't it possible to > > > support both? > > > > +1, we should support both > > > > Emmanuel Bourg > > > > - > > To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org > > For additional commands, e-mail: dev-h...@commons.apache.org > > > > - > To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org > For additional commands, e-mail: dev-h...@commons.apache.org > >
Re: [numbers][gsoc] GSoC 2022 - NUMBERS-186 Proposal
On Sat, Jun 11, 2022, 18:02 Gilles Sadowski wrote: > I have a hard time figuring out whether these bits of code are > intended to become the application developer API... > What data-structure(s) will be visible (from the application)? > What will be hidden ("implementation details")? > Developer API will look something like this //1) Static methods to initialize array from external forms ComplexDoubleArray a = ComplexDoubleArray.fromBuffer(Double Buffer); ComplexDoubleArray b = ComplexDoubleArray.fromArray(double[]); ComplexDoubleArray c = ComplexDoubleArray.parseFile(File); //2) Methods on Array interface for complex operations. One for each functional interface type.. unary, binary, scalar operations @FunctionalInterface interface ComplexDoubleUnaryOperator { ComplexDoubleArray apply(ComplexDoubleArray input, ComplexDoubleArray result); } interface ComplexDoubleArray { int size(); //MutableComplexDoubleArray extends ComplexDoubleArray with setter, mutation and static methods to allocate capacity MutableComplexDoubleArray asMutable(); ComplexDoubleArray asImmutable(); default boolean isImmutable() { return true;} default ComplexDoubleArray apply(ComplexDoubleUnaryOperator op){ return op.apply(this, result); } . . . } //Function interfaces behavior - If output array is immutable, operations return a copy, else result is applied in place on the passed in output and returned //3 Developer API usage of complex operations on complex arrays ComplexDoubleArray a = init(); a = a.asImmutable(); ComplexDoubleArray r1 = a.apply(ComplexFunctions::exp); ComplexDoubleArray r2 = a.apply(ComplexParrallelStreamFunctions::exp); ComplexDoubleArray r3 = a.apply(ComplexVectorFunctions::exp); ComplexDoubleArray r4 = a.apply(ComplexJTransformFunctions::forward_fft); // 4 ComplexFunctions can provide commons C99 reference implementations for supported complex operations as static methods that match the functional interfaces. ( Refactored methods from existing Complex class). Third parties can provide alternate implementations such as parallel stream or vector based implementations and used by developers as above. //5 the above functions also reused with existing Complex class? Complex implements ComplexDouble, ComplexDoubleArray { @Override public int size() {return 1;} @Override double [] toDoubleArray() { . . . } . . . //Refactored instance methods of Complex class public Complex exp() { return apply(ComplexFunctions::exp); } . . . } Complex c = Complex.ofCartesian(r,i); //Backward compatible C99 implementation Complex expc = c.exp(); Complex thirdparty_exp = c.apply(SomeThirdPartyFunctions::exp); //Require Java16+ Complex vectorexp = c.apply(Complex VectorFunctions::exp) Thanks, Sumanth