Hi all,

thanks for your thoughts so far.

I expect we can maintain a C interface for e.g. FFI (Foreign Function Interface?).

The VOLK library itself is written in C. However, we provide all the necessary tools to use VOLK in C++. Unlike with other libraries, that is not as straightforward as one might think.
1. C complex.h and C++ std::complex are incompatible. We need to convert.
2. C complex support is optional (a C11 implementation may define `__STDC_NO_COMPLEX__` to say it is missing). MSVC doesn't support it. One might wonder: How do we compile VOLK with MSVC then? Well, we pretend everything is C++ and things compile. Unfortunately that approach leads to our C vs C++ interface mess where we rely on undefined behavior.
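
To illustrate point 1 from the C++ side: the C++ standard guarantees array-oriented access for `std::complex<T>`, so an array of `std::complex<float>` may be read as interleaved (real, imaginary) floats. A minimal sketch, assuming a hypothetical C-style kernel `c_style_scale` that is not part of VOLK; it sidesteps C `complex.h` entirely and says nothing about the C ABI of `float complex`:

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <vector>

// Hypothetical C-style kernel working on interleaved (re, im) floats.
// It stands in for a C function; it is not part of VOLK.
extern "C" void c_style_scale(float* interleaved, std::size_t num_complex, float factor)
{
    assert(interleaved != nullptr || num_complex == 0);
    for (std::size_t i = 0; i < 2 * num_complex; ++i) {
        interleaved[i] *= factor;
    }
}

// C++ wrapper: std::complex<float> guarantees array-oriented access, so
// N complex values may be accessed as 2*N interleaved floats.
inline void scale(std::vector<std::complex<float>>& values, float factor)
{
    static_assert(sizeof(std::complex<float>) == 2 * sizeof(float),
                  "std::complex<float> must be two packed floats");
    c_style_scale(reinterpret_cast<float*>(values.data()), values.size(), factor);
}
```

Calling `scale(values, 2.0f)` doubles both the real and the imaginary part of every element.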

I mentioned `std::span` because it seems like the better alternative to `float *values, const size_t length`. We can still add a wrapper for C that looks like the current API.

C++20 adds quite a few interesting features like `std::span` but also concepts. Depending on our timeline, it might be too early for C++20 though.

Also, I already received feedback suggesting that we do one step at a time, i.e. finish our VOLK LGPL re-licensing business and then work on the next step.

I arrived at the point where I wrote this email after I tried to improve the current VOLK API. I'd like to add something like `multiply(std::span result, std::span in0, std::span in1)` and just unpack that to call a `multiply(float* result, ... , unsigned length)` function. However, the current complex.h support makes that almost impossible. We can add the API, but we are still stuck with compiling things in C or C++ mode depending on the compiler.

Cheers
Johannes

On 22.12.21 01:21, Marcus Müller wrote:
Hey Nick,

 > Yes, that's kind of my fault.

"Fault" is kind of a hard word when it's your achievement that a C-API VOLK got us as far as it did!

 > C++20 finally makes C++ a much less Lovecraftian nightmare

We're going to have template metaprogramming! SFinaE fhtagn!

Seriously, though. Operations as base classes, kernels / hardware specializations as subclasses, and the actual call being a virtual operator() call...

Think about this: Instead of the dispatcher bending the addresses that symbols from shared libraries point to (as we do now), there could be a dispatcher object (possibly, but not necessarily, a singleton) with members of the operation class type, which get assigned instances of the optimal (according to prior volk_profile and/or heuristics) kernel implementation subclass. I.e., instead of using our current nice trick to save the address of the correct &volk_sig_in_kernel_sig_out_arch_alignment in volk_sig_in_kernel_sig_out_alignment, we just use the standard vtable/C++ polymorphism. Same performance at runtime - one CALL.
Immediate benefit: all these things suddenly become self-aware.
Let's add a self-documenting call that returns a const char* describing what this thing does. That makes people very happy when they wrap things for Python, because now the type comes with documentation in your IDE through little effort. We can tell the user that this kernel prefers but doesn't need aligned memory. Or, much more bug-relevant, we could communicate the acceptable input multiples, and stop doing the cute "for the rest, we do the _general approach after the main loop is through in every single _arch_alignment" implementation.
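
A rough sketch of how that could look, with every name (`multiply_operation`, `multiply_generic`, `multiply_unrolled4`, `dispatcher`) made up for illustration. It uses pointer-plus-length arguments so it does not depend on the C++20 question, and the `bool` constructor argument merely stands in for whatever volk_profile or heuristics would decide:

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Operation base class: one abstract type per operation.
struct multiply_operation {
    virtual ~multiply_operation() = default;
    virtual void operator()(float* result, const float* in0, const float* in1,
                            std::size_t length) const = 0;
    // Self-description, e.g. for tooltips in a Python wrapper.
    virtual const char* description() const = 0;
    // The length passed to operator() must be a multiple of this value.
    virtual std::size_t input_multiple() const = 0;
};

// Portable fallback implementation.
struct multiply_generic final : multiply_operation {
    void operator()(float* result, const float* in0, const float* in1,
                    std::size_t length) const override
    {
        for (std::size_t i = 0; i < length; ++i) result[i] = in0[i] * in1[i];
    }
    const char* description() const override { return "multiply (generic)"; }
    std::size_t input_multiple() const override { return 1; }
};

// Stand-in for an architecture-specific kernel: four items per step,
// and no tail loop because the accepted multiple is communicated instead.
struct multiply_unrolled4 final : multiply_operation {
    void operator()(float* result, const float* in0, const float* in1,
                    std::size_t length) const override
    {
        assert(length % 4 == 0);
        for (std::size_t i = 0; i < length; i += 4) {
            result[i] = in0[i] * in1[i];
            result[i + 1] = in0[i + 1] * in1[i + 1];
            result[i + 2] = in0[i + 2] * in1[i + 2];
            result[i + 3] = in0[i + 3] * in1[i + 3];
        }
    }
    const char* description() const override { return "multiply (unrolled x4)"; }
    std::size_t input_multiple() const override { return 4; }
};

// Dispatcher object holding the selected implementation per operation.
struct dispatcher {
    std::unique_ptr<multiply_operation> multiply;

    explicit dispatcher(bool prefer_unrolled)
    {
        if (prefer_unrolled) {
            multiply = std::make_unique<multiply_unrolled4>();
        } else {
            multiply = std::make_unique<multiply_generic>();
        }
    }
};
```

A call then reads `(*d.multiply)(out, a, b, n)`, and `d.multiply->description()` or `d.multiply->input_multiple()` can be queried before choosing buffer sizes.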

(of course, I have far more somber dreams, don't assume the Lovecraftian horrors are too far from here. If each kernel implementation is a type, we can make these types have the capability (optional trait) to give us the type encapsulating what the VOLK kernel does in its "inner loop" (if applicable). Because C++ allows us to pass things like __m256& and const __m256&, we can then simply compose new inner loops. And of course, instead of implementing the same loop skeleton 200 times, we could just, for those kernels where the inner loop is "simple", have one templated

template<typename nucleus_operation>
class loop_kernel {
public:
    using op = nucleus_operation;
    void operator()(std::span<typename op::in_type> in, ...) {
        for (auto ptr = in.begin(); ptr < in.end(); ptr += op::simd_width)
            op::operate(*ptr, ...);
    }
};

or so.)

Cheers!
Marcus


On 21.12.21 20:25, Nick Foster wrote:

On Tue, Dec 21, 2021 at 3:29 AM Marcus Müller <mmuel...@gnuradio.org> wrote:

    Hi Johannes,

    I, for one, like it :) Especially since I honestly find void
    volk_32fc_x2_s32fc_multiply_conjugate_add_32fc to be a teeny tiny bit clunky and would
    rather call a type-safe, overloaded function in a volk namespace called
    multiply_conjugate_add.


Yes, that's kind of my fault. It was the best option we could come up with to be rigorously type-specific in C, kind of a bespoke implementation of name mangling. The original motivation, of course, was the VOLK dispatcher. C was a hard requirement at the time, and I confess I don't remember why. I think it came down from namccart's original donation of vectorized code.

I would be very happy to see VOLK move to C++ (or at least provide wrappers). I strongly advocate for using C++20 -- std::span, variadic arguments, lambdas etc. seem tailor-made for VOLK. Runtime dispatching could be positively elegant, compared to how it must be done in C. And C++20 finally makes C++ a much less Lovecraftian nightmare of a language than the one I learned from Stroustrup.

Nick

    Re: RFC: can we have something like a wiki page (maybe on the VOLK repo?) to collect
    these comments?

    You mention spans, so C++-VOLK would be >= C++20?

    Cheers,
    Marcus

    On 21.12.21 10:55, Johannes Demel wrote:
     > Hi everyone,
     >
     > today I'd like to propose an idea for the future of VOLK. Currently, VOLK is a C library
     > with a C++ interface and tooling that is written in C++.
     >
     > I propose to make VOLK a C++ library. Similar to e.g. UHD, we can add a C interface if
     > the need arises.
     >
     > This email serves as a request for comments. So go ahead.
     >
     > Benefits:
     > - sane std::complex interface.
     > - same compilation mode on all platforms.
     > - Better dynamic kernel load management.
     > - Option to use std::simd in the future
     > - Less manual memory management (think vector, ...).
     >
     > Drawbacks:
     > - It is a major effort.
     > - VOLK won't be a C project anymore.
     >
     > Why do I propose this shift?
     > VOLK segfaults on PowerPC architectures. This issue requires a breaking API change to be
     > fixable. I tried to update the API to fix this issue.
     > https://github.com/gnuradio/volk/pull/488
     > It works with GCC and Clang but fails on MSVC.
     > One might argue that PowerPC is an obscure architecture at this point but new
     > architectures might cause the same issue in the future. Also, VOLK tries to be portable
     > and that kind of issue is a serious roadblock.
     >
     > How did we get into this mess?
     > The current API is a workaround to make things work for a specific compiler: MSVC. MSVC
     > does not support C `complex.h` at all. The trick to make things work with MSVC is:
     > compile VOLK in C++ mode and pretend it is a C++ library anyways.
     > In turn `volk_complex.h` defines complex data types differently depending on whether VOLK is
     > included in C or C++. Finally, we just hope that the target platform provides the same
     > ABI for C complex and C++ complex. C complex and C++ complex are not compatible.
     > However, passing pointers around is.
     > Thus, the proposed change does not affect Windows/MSVC users because they were excluded
     > from our C API anyways. The bullet point: "same compilation mode on all platforms"
     > refers to this issue.
     >
     > Proposed timeline:
     > Together with our re-licensing effort, we aim for a VOLK 3.0 release. VOLK 3.0 is a good
     > target for breaking API changes.
     >
     > Effects:
     > I'd like to make the transition to VOLK 3.0 as easy as possible. Thus, I'd like to keep
     > an interface that hopefully doesn't require any code changes for VOLK 2.x users. A
     > rebuild of your application should be sufficient. However, we'd be able to adopt a
     > C++-ic API as well, e.g. use vectors, spans etc.
     >
     > The current implementation to detect and load the preferred implementation at runtime is
     > hard to understand and easy to break. C++ should offer more accessible tools to make
     > this part easier.
     >
     > What about all the current kernels?
     > We'd start with a new API and hide the old kernel code behind that interface. We come up
     > with a new implementation structure and how to load it. Thus, we can progressively
     > convert to "new-style" implementations.
     >
     > Another bonus: std::simd
     > Currently, std::simd is a proposal for C++23. Making VOLK a C++ lib would allow us to
     > eventually use std::simd in VOLK and thus make Comms DSP algorithms more optimized on
     > more platforms.
     >
     > Cheers
     > Johannes
     >


