On 8/11/21 06:04, Bill Somerville via wsjt-devel wrote:
> another option that requires no code changes is as follows:
> 1) Create wisdom as above but without the '-T 1' option,
> 2) Rename the newly created wisdom to wspr_wisdom.dat in the expected
> directory,
> 3) Make the wspr_wisdom.dat file non-writable,
> 4) Symlink or hardlink the file into every location needed by multiple
> instances of wsprd.
> The existing plans will be "upgraded" from their default of
> FFTW_ESTIMATE to FFTW_PATIENT for the specified FFT types and sizes,
> and there will be no file locking requirement as attempts by wsprd to
> update the wisdom file on exit will silently fail.
Hi Bill, that's certainly one good and simple way to get wsprd to use
"wiser" wisdom. But I use FFTW a lot in my own applications (mainly for
fast convolution, which I use for frequency shifting, filtering and
downsampling) and it just seems nice and elegant to have a common wisdom
file for everything I do. I don't see any downside in having wsprd
import system wisdom, as it's a harmless no-op on Windows and wsprd will
still import "local" wisdom if you want.
The only complication, as I mentioned, is that non-threaded "wisdom" is
incompatible with threaded wisdom, even with a single thread. I haven't dug
into FFTW to find out why, but it was easy enough to just invoke
threading in all my applications and set the number of threads to 1 if I
don't really want more. To be honest, adding threads isn't a huge win;
speed scales much less quickly than linearly with additional threads.
But it can make a difference when you have a realtime deadline on a
slower CPU (e.g., a Raspberry Pi).
I've begun to look at the wspr_timer.out file. The 69% of the time that
my system now spends in my Fano decoder really jumped out. Whoa! But I shouldn't
be terribly surprised. One of the reasons sequential decoding (including
Fano) isn't used much anymore is that it's a poor match to modern CPUs.
Sequential decoding does a lot of data-dependent program branching, and
this defeats the branch prediction that modern CPUs rely on to keep
their deep pipelines full. Every time a prediction is wrong, the CPU
comes to a screeching halt as the pipelines are flushed and reloaded
from the correct execution point. I've gotten much better results with
my Viterbi decoder, which I hand-optimized to make full use of the
vector hardware. And there's no data-dependent branching.
Hmm. I did a k=24 Viterbi decoder for the ISEE-3 recovery project in
2014. (The complexity of Viterbi decoding increases exponentially with
k, and k=7 is a more typical number, so k=24 is huge.) IIRC, I had it
running at about 230 b/s on the computer I had then. Maybe, just for
fun, I could try one for the k=32 code in WSPR. It has two full minutes
to decode only 50 bits...!
There are still ways to speed up Fano decoding, though. The most obvious
would be multithreading to take advantage of multicore CPUs. Create a
set of "server" threads, each running a Fano decoder, and have them work
in parallel on various frequency/time hypotheses. To keep a really deep
search with a high limit from bogging things down, a decoder thread
could begin each attempt at normal CPU priority and then progressively
reduce it (make itself less important) as its effort climbs well past
one decoder move per bit. Transmissions with good SNR would continue to decode
quickly while the system would still try to dig out the really weak ones
on an as-available basis, limited only by your tolerance for false decodes.
73, Phil
_______________________________________________
wsjt-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/wsjt-devel