Re: [ot][spam][random][crazy][random][crazy]

Undescribed Horrific Abuse, One Victim & Survivor of Many Wed, 16 Nov 2022 02:15:23 -0800

    shortspace_freqs = fftfreq(len(randvec), complex = False, dc_offset = True)
    longspace_freqs = fftfreq(len(randvec), complex = False, dc_offset
= True, repetition_samples = short_duration)
    assert np.allclose(longspace_freqs, shortspace_freqs *
short_duration / len(randvec))


0455

shortspace_freqs does this:
(Pdb) list
 54             elif repetition_time:
 55                 min_freq = 1 / repetition_time
 56             elif repetition_samples:
 57                 min_freq = freq_sample_rate / repetition_samples
 58             else:
 59  ->             min_freq = freq_sample_rate / freq_count
 60         if max_freq is None:
 61             #max_freq = freq_sample_rate / 2
 62             max_freq = freq_count * min_freq / 2
 63             if freq_count % 2 != 0:
 64                 #max_freq -= freq_sample_rate / (2 * freq_count)
(Pdb) p freq_sample_rate
1
(Pdb) p freq_count
30
(Pdb) p freq_sample_rate / freq_count
0.03333333333333333

(Pdb) p shortspace_freqs
array([0.        , 0.03333333, 0.06666667, 0.1       , 0.13333333,
       0.16666667, 0.2       , 0.23333333, 0.26666667, 0.3       ,
       0.33333333, 0.36666667, 0.4       , 0.43333333, 0.46666667,
       0.5       ])

longspace_freqs does this:
(Pdb) list
 52             if repetition_rate:
 53                 min_freq = repetition_rate
 54             elif repetition_time:
 55                 min_freq = 1 / repetition_time
 56             elif repetition_samples:
 57  ->             min_freq = freq_sample_rate / repetition_samples
 58             else:
 59                 min_freq = freq_sample_rate / freq_count
 60         if max_freq is None:
 61             #max_freq = freq_sample_rate / 2
 62             max_freq = freq_count * min_freq / 2
(Pdb) p freq_sample_rate
1
(Pdb) p repetition_samples
369.47410639563054
(Pdb) p freq_sample_rate / repetition_samples
0.0027065496138698455

(Pdb) p longspace_freqs
array([0.        , 0.00270655, 0.0054131 , 0.00811965, 0.0108262 ,
       0.01353275, 0.0162393 , 0.01894585, 0.0216524 , 0.02435895,
       0.0270655 , 0.02977205, 0.0324786 , 0.03518514, 0.03789169,
       0.04059824])

0459
OK. So, the error is because I am scaling by freq_count, which when
complex = False is scaled up from 16 to 30. The scaling up is simply
so it can represent the number of frequencies returned, but compare
clearly against np.fft.fftfreq and np.fft.rfftfreq .

I'm using freq_count here as the same thing as the output sample
count, which it is not. The bug should only be encountered when
complex = False.

Thinking ...

The frequencies here kind of assume a concept of how many samples are
output to. Multiplying each one by an increment of 1, advances the
sample offset of the signal generated, and the signal should loop
after the total samples.

So, fftfreq does assume a time-domain sample count that is equal to one period.

Once could then interpret the output when repetition_samples is set,
as more correct than when it is not.

The immediate question, then, is what time-domain sample length should
the function assume when complex = False and complex = True?

Assuming we keep freq_count as meaning the number of frequency bins,
then it would be sample_count=freq_count for complex = True and
sample_count=(freq_count-1)*2 for complex=False . This is near what it
already does.

Something I've been doing is using complex = False and imagining the
same number of frequency bins and sample bins. The idea is that I get
more frequencies and it feels more accurate or precise. This is silly
because there isn't actually that much data there. I'm likely
imagining having a set buffer size of say 1KB, and wanting to use as
much of it as I can for frequencies, while I parse data that is maybe
1MB in 1KB chunks. So, it's kind of a silly beginning of a possibly
larger scenario.

In the 1KB/1MB scenario, it could represent data that is 2KBs long,
using 1KB of frequencies. That seems reasonable to provide for.

Meanwhile, here I am foolishly using 16 frequency bins when I only
have 16 samples of data. It would be nice to provide for that if the
user wants to try it, using the pseudoinverse, but it doesn't quite
seem like the right thing to do.

It should indeed be assuming 30 samples of interesting data in such a situation.

This could be a change to create_freq2time, where it sets time_count
depending on whether or not freqs contains negative frequencies.

I'm trying changing create_freq2time:
    if freqs is None:
        freqs = fftfreq(time_count)
    elif time_count is None:
        if freqs[-1] < 0: # complex data
            time_count = len(freqs)
        else:
            time_count = (len(freqs) - 1) * 2

I'm stepping away from the system and could re-engage my issues that
make it hard to do, so I'm attaching the file to make it easier to
find again.

# AGPL-3 Karl Semich 2022
import numpy as np

# TODO: shift max_freq up with min_freq, when max_freq not specified
def fftfreq(freq_count, sample_rate = None, min_freq = None, max_freq = None,
            dc_offset = True, complex = True, sample_time = None,
            repetition_rate = None, repetition_time = None, repetition_samples = None,
            freq_sample_rate = None, freq_sample_time = None):
    '''
    Calculates and returns the frequency bins used to convert between the time
    and frequency domains with a discrete Fourier transform.

    With no optional arguments, this function should be equivalent to numpy.fft.fftfreq .

    To specify frequencies in terms of repetitions / sample_count, set
    sample_rate to sample_count. If frequencies were in Hz (cycles/sec), this
    implies that sample_count should be considered to last 1 second, so the
    frequency becomes equal to the cycle count.

    Parameters:
        - freq_count: the number of frequency bins to generate
        - sample_rate: the time-domain sample rate (default: 1 sample/time_unit)
        - min_freq: the minimum frequency the signal contains
        - max_fraq: the maximum frequency the signal contains
        - dc_offset: whether to include a DC offset component (0 Hz)
        - complex: whether to generate negative frequencies
        - sample_time: sample_rate as the duration of a single sample
        - repetition_rate: min_freq as the repetition rate of a subsignal
        - repetition_time: min_freq as the period time of a subsignal
        - repetition_samples: min_freq as the period size of a subsignal in samples
        - freq_sample_rate: convert to or from a different frequency-domain sample rate
        - freq_sample_time: freq_sample_rate as the duration of a single sample

    Returns a vector of sinusoid time scalings that can be used to perform or
    analyse a discrete Fourier transform.  Multiply this vector by the time-domain
    sample rate to get the frequencies.
    '''
    assert not sample_time or not sample_rate
    assert not freq_sample_time or not freq_sample_rate
    assert (min_freq, repetition_rate, repetition_time, repetition_samples).count(None) >= 3
    if sample_time is not None:
        sample_rate = 1 / sample_time
    if freq_sample_time is not None:
        freq_sample_rate = 1 / freq_sample_time
    sample_rate = sample_rate or freq_sample_rate or 1
    freq_sample_rate = freq_sample_rate or sample_rate or 1
    if not dc_offset:
        freq_count += 1
    if not complex:
        freq_count = int((freq_count - 1) * 2)
    if not min_freq:
        if repetition_rate:
            min_freq = repetition_rate
        elif repetition_time:
            min_freq = 1 / repetition_time
        elif repetition_samples:
            min_freq = freq_sample_rate / repetition_samples
        else:
            min_freq = freq_sample_rate / freq_count
    if max_freq is None:
        #max_freq = freq_sample_rate / 2
        max_freq = freq_count * min_freq / 2
        if freq_count % 2 != 0:
            #max_freq -= freq_sample_rate / (2 * freq_count)
            max_freq -= min_freq / 2
    min_freq /= sample_rate
    max_freq /= sample_rate
    if freq_count % 2 == 0:
        if complex:
            neg_freqs = np.linspace(-max_freq, -min_freq, num=freq_count // 2, endpoint=True)
            pos_freqs = -neg_freqs[:0:-1]
        else:
            pos_freqs = np.linspace(min_freq, max_freq, num=freq_count // 2, endpoint=True)
            neg_freqs = pos_freqs[:0]
    else:
        pos_freqs = np.linspace(min_freq, max_freq, num=freq_count // 2, endpoint=True)
        neg_freqs = -pos_freqs[::-1] if complex else pos_freqs[:0]
    return np.concatenate([
        np.array([0] if dc_offset else []),
        pos_freqs,
        neg_freqs
    ])

def create_freq2time(time_count = None, freqs = None, wavelet = lambda x: np.exp(2j * np.pi * x)):
    '''
    Creates a matrix that will perform an inverse discrete Fourier transform
    when it post-multiplies a vector of complex frequency magnitudes.

    Example
        time_data = spectrum @ create_freq2time(len(spectrum))

    Parameters:
        - time_count: size of the output vector, defaults to the frequency bincount
        - freqs: frequency bins to convert, defaults to traditional FFT frequencies

    Returns:
        - an inverse discrete Fourier matrix of shape (len(freqs), time_count)
    '''
    assert (time_count is not None) or (freqs is not None)
    if freqs is None:
        freqs = fftfreq(time_count)
    elif time_count is None:
        if freqs[-1] < 0: # complex data
            time_count = len(freqs)
        else:
            time_count = (len(freqs) - 1) * 2
    offsets = np.arange(time_count)
    #mat = np.exp(2j * np.pi * np.outer(freqs, offsets))
    mat = wavelet(np.outer(freqs, offsets))
    return mat / len(freqs) # scaled to match numpy convention

def create_time2freq(time_count = None, freqs = None, wavelet = lambda x: np.exp(2j * np.pi * x)):
    '''
    Creates a matrix that will perform a forward discrete Fourier transform
    when it post-multiplies a vector of time series data.

    If time_count is too small or large, the minimal least squares solution
    over all the data passed will be produced.

    This function is equivalent to calling .pinv() on the return value of
    create_freq2time. If the return value is single-use, it is more efficient and
    accurate to use numpy.linalg.lstsq .

    Example
        spectrum = time_data @ create_time2freq(len(time_data))

    Parameters:
        - time_count: size of the input vector, defaults to the frequency bincount
        - freqs: frequency bins to produce, defaults to traditional FFT frequencies

    Returns:
        - a discrete Fourier matrix of shape (time_count, len(freqs))
    '''
    forward_mat = create_freq2time(time_count, freqs, wavelet)
    reverse_mat = np.linalg.pinv(forward_mat)
    return reverse_mat

def peak_pair_idcs(freq_data, dc_offset=True, sum=True):
    dc_offset = int(dc_offset)
    freq_heights = abs(freq_data) # squares and sums the components
    if sum:
        while len(freq_heights).shape > 1:
            freq_heights = freq_height.sum(axis=0)
    paired_heights = freq_heights[...,dc_offset:-1] + freq_heights[...,dc_offset+1:]
    peak_idx = paired_heights.argmax(axis=-1, keepdims=True) + dc_offset
    return np.concatenate(peak_idx, peak_idx + 1, axis=-1)

def improve_peak(time_data, min_freq, max_freq):
    freqs = fftfreq(time_data.shape[-1], min_freq = min_freq, max_freq = max_freq)
    freq_data = time_data @ np.linalg.inv(create_freq2time(freqs))
    left, right = peak_pair_idcs(freq_data)
    return freq_data[left], freq_data[right]
    

def test():
    np.random.seed(0)

    randvec = np.random.random(16)
    freqs16 = fftfreq(16)
    ift16 = create_freq2time(16, freqs=freqs16)
    ft16 = create_time2freq(16)
    randvec2time = randvec @ ift16
    randvec2freq = randvec @ ft16
    randvec2ifft = np.fft.ifft(randvec)
    randvec2fft = np.fft.fft(randvec)
    assert np.allclose(freqs16, np.fft.fftfreq(16))
    assert np.allclose(randvec2ifft, randvec2time)
    assert np.allclose(randvec2fft, randvec2freq)
    assert np.allclose(randvec2ifft, np.linalg.solve(ft16.T, randvec))
    assert np.allclose(randvec2fft, np.linalg.solve(ift16.T, randvec))
    freqs15 = fftfreq(15)
    ift15 = create_freq2time(15, freqs=freqs15)
    ft15 = create_time2freq(15)
    assert np.allclose(freqs15, np.fft.fftfreq(15))
    assert np.allclose(randvec[:15] @ ft15 @ ift15, randvec[:15])

    # [ 0, 1, 2, 3, 4, 3, 2, 1]
    #irft9_16 = create_freq2time(16, fftfreq(9, complex = False))
    #rft16_30 = create_time2freq(16, fftfreq(30, complex = False))
    #randvec2
    #assert np.allclose((randvec @ rft16) @ irft16, randvec)
    
    
    # sample data at a differing rate
    time_rate = np.random.random() * 2
    freq_rate = 1.0
    freqs = np.fft.fftfreq(len(randvec))
    rescaling_freqs = fftfreq(len(randvec), freq_sample_rate = freq_rate, sample_rate = time_rate)
    rescaling_ift = create_freq2time(freqs = rescaling_freqs)
    rescaling_ft = create_time2freq(freqs = rescaling_freqs)
    rescaled_time_data = np.array([
        np.mean([
            randvec[freqidx] * np.exp(2j * np.pi * freqs[freqidx] * sampleidx / time_rate)
            for freqidx in range(len(randvec))
        ])
        for sampleidx in range(len(randvec))
    ])
    assert np.allclose(rescaled_time_data, randvec @ rescaling_ift)
    assert np.allclose(rescaled_time_data, np.linalg.solve(rescaling_ft.T, randvec))
    unscaled_freq_data = rescaled_time_data @ rescaling_ft
    unscaled_time_data = unscaled_freq_data @ ift16
    assert np.allclose(unscaled_freq_data, randvec)
    assert np.allclose(unscaled_time_data, randvec2time)
    assert np.allclose(np.linalg.solve(rescaling_ift.T, rescaled_time_data), randvec)

    # extract a repeating wave
    longvec = np.empty(len(randvec)*100)
    short_count = np.random.random() * 4 + 1
    short_duration = len(longvec) / short_count
    for idx in range(len(longvec)):
        longvec[idx] = randvec[int((idx / len(longvec) * short_duration) % len(randvec))]
    wavelet = lambda x: (np.floor(x * 2) % 2) * 2 - 1
    import pdb; pdb.set_trace()
    shortspace_freqs = fftfreq(len(randvec), complex = False, dc_offset = True)
    longspace_freqs = fftfreq(len(randvec), complex = False, dc_offset = True, repetition_samples = short_duration)
    assert np.allclose(longspace_freqs, shortspace_freqs * short_duration / len(randvec))
    inserting_ift = create_freq2time(len(longvec), longspace_freqs, wavelet = wavelet)
    extracting_ft = create_time2freq(len(longvec), longspace_freqs, wavelet = wavelet)
    extracting_ift = create_freq2time(freqs = shortspace_freqs, wavelet = wavelet)
    unextracting_ft = create_time2freq(freqs = shortspace_freqs, wavelet = wavelet)
    inserting_spectrum = randvec @ unextracting_ft
    assert np.allclose(inserting_spectrum @ extracting_ift, randvec)
    assert np.allclose(longvec, inserting_spectrum @ inserting_ift)
    extracted_freqs = longvec @ extracting_ft
    extracted_randvec = extracted_freqs @ extracting_ift
    assert np.allclose(extracted_randvec, randvec)

if __name__ == '__main__':
    test()

Re: [ot][spam][random][crazy][random][crazy]

Reply via email to