On 2022-05-29 16:17, Benjamin Schollnick wrote:
Okay, you are capturing the audio stream as a digital file somewhere, correct?
Why not just use a 3rd party package to normalize the audio levels in the
digital file? It’ll be faster, and probably easier than trying to do it in
real time…
e.g.
https://campus.datacamp.com/courses/spoken-language-processing-in-python/manipulating-audio-files-with-pydub?ex=8
Normalizing an audio file with PyDub
Sometimes you'll have audio files where the speech is loud in some portions and
quiet in others. Having this variance in volume can hinder transcription.
Luckily, PyDub's effects module has a function called normalize() which finds
the maximum volume of an AudioSegment, then adjusts the rest of the
AudioSegment to be in proportion. This means the quiet parts will get a volume
boost.
You can listen to an example of an audio file which starts as loud then goes quiet,
loud_then_quiet.wav, here
<https://assets.datacamp.com/production/repositories/4637/datasets/9251c751d3efccf781f3e189d68b37c8d22be9ca/ex3_datacamp_loud_then_quiet.wav>.
In this exercise, you'll use normalize() to normalize the volume of our file, making
it sound more like this
<https://assets.datacamp.com/production/repositories/4637/datasets/f0c1ba35ff99f07df8cfeee810c7b12118d9cd0f/ex3_datamcamp_normalized_loud_quiet.wav>.
or
https://stackoverflow.com/questions/57925304/how-to-normalize-a-raw-audio-file-with-python
[snip]
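For the offline route, peak normalization boils down to "find the loudest sample, then scale everything by one constant factor". The following is only a minimal sketch of that idea using numpy, not PyDub's actual implementation; peak_normalize and its headroom_db parameter are names I've made up for illustration, and it assumes 16-bit samples.

```python
import numpy as np


def peak_normalize(samples: np.ndarray, headroom_db: float = 0.1) -> np.ndarray:
    """Scale 16-bit samples so the loudest one sits headroom_db below full scale.

    Hypothetical helper for illustration only.
    """
    as_float = samples.astype(np.double)
    peak = np.abs(as_float).max()
    if peak == 0:
        # Pure silence: there is nothing to scale.
        return samples.copy()
    # Target peak, slightly below the int16 maximum.
    target = 32767 * 10 ** (-headroom_db / 20)
    scaled = as_float * (target / peak)
    # Clip before converting back so rounding can never wrap around.
    return np.clip(np.round(scaled), -32768, 32767).astype(np.int16)


quiet = np.array([0, 100, -200, 50], dtype=np.int16)
loud = peak_normalize(quiet)
```

Note that this applies a single gain to the whole file, so the ratio between the loud and quiet parts stays the same. Evening out a recording that is loud in places and quiet in others needs a gain that changes over time, which is what the script below attempts.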
Here's a sample script that uses pyaudio instead of Audacity.
You can check whether the podcast is playing by checking the volume soon
after it should've started.
PyAudio itself only handles streams; to read and write files, pair it with
the standard library's wave module.
import time

import numpy as np
import pyaudio

WIDTH = 2        # bytes per sample (16-bit)
CHANNELS = 2
RATE = 44100     # samples per second

MAX_VOL = 1024   # target peak amplitude
GAIN_STEP = 0.2
LOUDER = 1 + GAIN_STEP
QUIETER = 1 - GAIN_STEP
MAX_GAIN = 64    # assumed cap so that silence can't drive the gain to infinity

gain = 1

p = pyaudio.PyAudio()


def callback(data, frame_count, time_info, status):
    global gain

    # Decode the bytestream.
    chunk = np.frombuffer(data, dtype=np.int16)

    # Adjust the volume, clipping to the int16 range to avoid wrap-around.
    scaled = chunk.astype(np.double) * gain
    chunk = np.clip(scaled, -32768, 32767).astype(np.int16)

    # Adjust the gain according to the current peak volume
    # (absolute value, so that negative peaks count too).
    max_vol = np.abs(scaled).max()

    if max_vol < MAX_VOL:
        gain = min(gain * LOUDER, MAX_GAIN)
    elif max_vol > MAX_VOL:
        gain *= QUIETER

    return (chunk.tobytes(), pyaudio.paContinue)


stream = p.open(format=p.get_format_from_width(WIDTH), channels=CHANNELS,
                rate=RATE, input=True, output=True, stream_callback=callback)

stream.start_stream()

while stream.is_active():
    time.sleep(0.1)

stream.stop_stream()
stream.close()

p.terminate()
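
As for checking whether the podcast is actually playing, one way is to look at the peak volume in a short window of the recording soon after the scheduled start. This is a rough sketch that reads a 16-bit WAV file with the standard wave module; peak_volume, is_playing and the threshold of 256 are my own placeholders, not part of any library, and a sensible threshold would have to be found by experiment.

```python
import wave

import numpy as np


def peak_volume(path: str, start_s: float = 0.0, duration_s: float = 1.0) -> int:
    """Return the peak absolute sample value in a window of a 16-bit WAV file."""
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit samples")
        rate = wav.getframerate()
        # Seek to the start of the window, but never past the end of the file.
        wav.setpos(min(int(start_s * rate), wav.getnframes()))
        data = wav.readframes(int(duration_s * rate))
    # WAV samples are little-endian; all channels are treated alike here.
    samples = np.frombuffer(data, dtype="<i2")
    if samples.size == 0:
        return 0
    # Widen before abs() so that -32768 doesn't overflow.
    return int(np.abs(samples.astype(np.int32)).max())


def is_playing(path: str, threshold: int = 256, **window) -> bool:
    """Guess whether there is audible content in the given window.

    The default threshold is an arbitrary assumption.
    """
    return peak_volume(path, **window) > threshold
```

For example, is_playing("recording.wav", start_s=30.0) would check the second of audio starting 30 seconds in, where recording.wav is a made-up file name.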
--
https://mail.python.org/mailman/listinfo/python-list