Jon Harrop wrote:
> Filip Wasilewski wrote:
> > Jon Harrop wrote:
> >> Filip Wasilewski wrote:
> >> > Jon, both Python and Matlab implementations discussed here use the
> >> > lifting scheme, while yours is a classic convolution based approach.
> >>
> >> I've done both in OCaml. The results are basically the same.
> >
> > Have you tried taking advantage of the 50% reduction of arithmetic
> > operations for the lifting scheme?
>
> Yes. The time taken is dominated by memory accesses. The amount of
> arithmetic is almost irrelevant.
I can say from my experience that this depends very much on the input size and the CPU cache size. For arrays of about 2^18 elements or less and a 2 MB cache, the speedup of the straightforward C lifting implementation is around 50%, with all operations done in-place. For bigger arrays, traversing memory several times becomes too expensive, at least until the input is so big that it gets swapped out and the in-place method becomes faster again. I suppose such behavior should be quite similar for other low-level implementations as well.

Actually I have started to wonder if I could automate the conversion of multi-loop lifting schemes [1] into a one-pass loop. This could be done by delaying computation of the "coming next" step by just a few samples, until the values from the preceding step are ready to use (see the sketch further down in this message). This should reduce the random memory access effect in several cases and make the timings quite stable.

[1] http://en.wikipedia.org/wiki/Lifting_scheme

> >> It makes sense because they solve the same problem. When you're comparing
> >> non-trivial problems between disparate languages you are not likely to
> >> use the same algorithms. Using built-in hash functions is an obvious
> >> example.
> >
> > I don't even ask if you have checked the output result coefficients
> > because I'm pretty sure you have not.
>
> I checked the coefficients against other DWTs from the web.
>
> > They solve similar problem but in
> > different ways and give a bit different results, for example the
> > layout of coefficients in the memory is different as well as border
> > distortion handling. Please don't tell me it's better to do something
> > else instead of presenting a relevant solution.
>
> I used a more efficient approach because I could. Using a compiled language
> gave me that choice. We should compare both approaches in both languages.

I am getting a bit tired of this discussion. I am not arguing about which approach is faster, because that very much depends on the use case. I am just asking you to do a fair comparison using the original algorithm presented here, instead of backing your arguments with a not-so-relevant variant that happens to be convenient for you. Every language (or at least the vast majority of them) that has loops and basic arithmetic gives you that choice, compiled or not.

These are pretty simple algorithms when limited to the single D4 case, but D4 is not the only wavelet transform known to mankind, and there are situations where you do not have such a choice, because some problems just can't be expressed using the classic approach. If OCaml is as superior as you claim, please show me how to do the lifting scheme (preferably in the form of http://groups.google.com/group/comp.lang.python/msg/a71a5bf00a88ad57, with the features described in my previous post) so I can actually learn something new and maybe even start to consider it as a possible alternative to the C-ish language family.

As far as the DWT alone is concerned, well, if I were to choose the most efficient approach (in terms of both CPU time and development time) for Python, I would use a ready extension instead of reinventing the wheel once again. Let me think, that should go something like this:

import pywt
coeffs = pywt.wavedec(x, 'db2', 'per')
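To make the one-pass idea from the beginning of this message more concrete, here is a minimal sketch. It uses plain Python lists and the simple LeGall 5/3 (CDF 2,2) lifting steps rather than D4, with naive border handling, just to show the loop structure; the point is that the update step can run as soon as its last dependency from the predict step is available, so the two passes collapse into a single traversal of the data:

def cdf53_two_pass(x):
    """Forward LeGall 5/3 lifting in two separate passes over the data.
    `x` is an even-length sequence; borders repeat the nearest value."""
    even, odd = x[0::2], x[1::2]
    n = len(odd)
    d = [0.0] * n
    s = [0.0] * n
    for i in range(n):                                  # pass 1: predict
        right = even[i + 1] if i + 1 < n else even[i]
        d[i] = odd[i] - 0.5 * (even[i] + right)
    for i in range(n):                                  # pass 2: update
        left = d[i - 1] if i > 0 else d[0]
        s[i] = even[i] + 0.25 * (left + d[i])
    return s, d

def cdf53_one_pass(x):
    """Same transform, same results, but predict and update are
    interleaved so the data is traversed only once."""
    even, odd = x[0::2], x[1::2]
    n = len(odd)
    d = [0.0] * n
    s = [0.0] * n
    for i in range(n):
        right = even[i + 1] if i + 1 < n else even[i]
        d[i] = odd[i] - 0.5 * (even[i] + right)         # predict at i
        left = d[i - 1] if i > 0 else d[0]
        s[i] = even[i] + 0.25 * (left + d[i])           # update at i
    return s, d

For a scheme like D4, where the update at position i also needs the predict result at i+1, the fused update would simply trail the predict by one iteration instead of running in lock-step.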
> >> No. The effects you are talking about are swamped by the interpreted vs
> >> compiled effect.
> >
> > Have I already mentioned it was observed with low-level C
> > implementation on x86? I would really appreciate you took some time on
> > exploring the unfortunate topic of this benchmark before posting
> > another comment like that.
>
> We are talking about this Python implementation, not some "low-level C
> implementation on x86". With Python, it is the overhead of the interpreter
> that makes it slow and encourages you to obfuscate your code in order to
> make it run in a reasonable amount of time.

I was talking about why you should not limit your benchmarks to a single case, and gave an example of how the choice of parameters can influence the results. It has nothing to do with the interpreted vs. compiled harangue. Would you argue that some O(n^2) algorithm is always better than its O(n) alternative just because it has lower call overhead and performs better under some limited circumstances?

> > What I would really like to see is a lifting scheme
> > implementation that can take 1- to 10-dimensional arrays
>
> I believe that is just a case of abstracting get/set.
>
> > (single and double precision floats,
>
> That is also a case of abstracting get/set.
>
> > convert on the fly from integer types when necessary
>
> That is dynamic typing.
>
> > and handle strided, non-contiguous arrays),
>
> That is also an abstraction of get/set.

Oh my, do all of these abstractions have to be built from scratch every single time?

> > axis and maximum decomposition level as input
>
> I don't know what that is.

When doing 1D operations on a multidimensional array it is often convenient to be able to specify along which axis these operations should be performed, for example:

>>> x = numpy.ones((2,4))
>>> x.sum(axis=0)
array([ 2., 2., 2., 2.])
>>> x.sum(axis=1)
array([ 4., 4.])

The maximum level denotes how many decomposition steps should be performed for a multilevel transform, just like described at
http://www.mathworks.com/access/helpdesk/help/toolbox/wavelet/wavedec.html
(a short NumPy sketch a bit further down shows both parameters in action).

> > and return list of views (or arrays if necessary)
>
> Probably just a closure.
>
> > with transform coefficients for every level of
> > decomposition. Can do?
>
> Sure. Sounds like a lot of work though.

See, it is certainly quite a lot of work to create a general solution in C. It is a lot easier to focus on the real problem when using Python (and NumPy or another package from the wide range of available extensions) and to save yourself from getting too deep into implementation details, while still maintaining reasonable performance. That is one of the most important reasons I prefer to use a high-level language for most of my tasks. I also try to avoid premature optimizations that might spare some CPU cycles, only to find a moment later that the performance gain is completely swallowed by I/O overhead at some remote data access point. For the rest of the intensive number crunching, if there is no specialized extension available yet, I still prefer to use the Python/Pyrex/C (or similar) combo to get a solution that can be easily integrated with other parts of a bigger system (networking, databases, reports, the web, etc. - wow, how many things the transition from C++ and Java a few years ago made so accessible) or simply played with in interactive mode.
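As promised above, here is a minimal sketch of the axis and level parameters in plain NumPy. It uses an unnormalised Haar lifting step (not D4) and assumes the length of the transformed axis is divisible by 2**level; pywt takes care of the border handling and corner cases that this toy version ignores:

import numpy

def haar_step(a, axis=-1):
    # One unnormalised Haar lifting step along `axis`:
    # predict d = odd - even, update s = even + d/2.
    a = numpy.swapaxes(a, axis, -1)
    even, odd = a[..., 0::2], a[..., 1::2]
    d = odd - even
    s = even + d / 2.0
    return numpy.swapaxes(s, axis, -1), numpy.swapaxes(d, axis, -1)

def haar_wavedec(a, level, axis=-1):
    # Multilevel decomposition along `axis`, coarsest level first:
    # [approx_level, detail_level, ..., detail_1]
    details = []
    for _ in range(level):
        a, d = haar_step(a, axis)
        details.append(d)
    return [a] + details[::-1]

x = numpy.arange(16.0).reshape(2, 8)
coeffs = haar_wavedec(x, level=2, axis=1)   # two steps along the second axis
shapes = [c.shape for c in coeffs]          # [(2, 2), (2, 2), (2, 4)]

Strided and non-contiguous views, dtype upcasting and so on come for free from the ndarray itself, which was the point of the feature list above.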
> This would be best done in F#
> because you can make your own IList derivatives with get_Item and set_Item
> that provide views but look like arrays.
>
> > Are you telling I have to define a different operator for every
> > arithmetic operation for every two types?
>
> In OCaml, yes. Not in F#.
>
> > That definitely does not look
> > like a quick, flexible and clear solution to start with. Any working
> > example I could type into OCamlWinPlus of the following would be very
> > appreciated. I'm fairly new to this and would really like to see how to
> > handle differently shaped arrays and data types in the interactive mode
> > without much hassle:
> >
> > from numpy import ones, arange, reshape, int32
> > a = ones((2,3,4,5))
> > b = ones(a.shape, dtype=int32)
> > c = ones((3,4,5))
> > d = 2*a + 2.5*(3*b + 3.3)
> > d[0] = d[0] + 5
> > d[1] *= c * (2+3*1.2)
> > d[:,2,...] = reshape(arange(d[:,2,...].size), d[:,2,...].shape)
> > print d[...,1]
>
> I can't get that to work in Python. What do I need to do?

Type it exactly as presented above, starting from the very first line. The `from module import ...` statement imports the chosen identifiers into the current namespace, so you don't have to prefix them with the module name.

> >>> import numpy
> >>> a = ones((2,3,4,5))
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
> NameError: name 'ones' is not defined
>
> > The ray tracer example is really nice, however the sudoku solver is not
> > so impressive when compared to the K thingy ;-) -
> > http://markbyers.com/moinmoin/moin.cgi/ShortestSudokuSolver.
>
> Well, OCaml does ok on the sudoku but K does awfully on the ray tracer.

That's pretty natural. Different languages are designed to solve different tasks. I hope your point is not to convince people to use OCaml to solve (or complicate) every possible problem.

> > Where can I find
> > documentation or tutorial on using Bigarrays?
>
> http://caml.inria.fr/pub/docs/manual-ocaml/libref/Bigarray.html

Yes, I found that before, but it is not very helpful to start with. Any doc with basic use cases and examples would be much better. Think of it as a possible obstacle for people trying to learn the language.

> You probably want to use F# though.

On the other hand, I wonder how this integrates with other .NET languages like C# or IronPython.

cheers,
fw
--
http://mail.python.org/mailman/listinfo/python-list