ObjDocumentAudio is a document delegate class for generating waveform visualization icons from audio files. It loads audio samples and renders Spotify-style waveform visualizations using matplotlib, supporting multiple audio formats including WAV, MP3, OGG, FLAC, and AAC.
Module: factory.core/ObjDocumentAudio.py
Inherits from: ObjDocumentDelegate.ObjDocumentDelegate
Test file: resource.test/pytests/factory.core/test_ObjDocumentAudio.py
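This module enables developers to:
- Load audio samples from WAV, MP3, OGG, FLAC, and AAC files
- Render waveform preview images from those samples
- Generate square thumbnail icons for audio files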
__init__(DB: int = 0) -> None
Initializes the ObjDocumentAudio instance.
Parameters:
- DB (int, optional): Database connection identifier (default: 0)
Initialization:
- Sets _IsA to "ObjDocumentAudio"
_read_wav(file_path: str) -> tuple
Reads a WAV file and returns normalized audio samples together with the sample rate.
Parameters:
- file_path (str): Path to the WAV file
Returns:
(samples, sample_rate):
- samples (numpy.ndarray): Normalized audio samples (float64, range -1.0 to 1.0)
- sample_rate (int): Sample rate in Hz (e.g., 44100)
Processing:
Sample Width Support:
- 8-bit, 16-bit, and 32-bit samples; any other width raises ValueError: Unsupported sample width
Example:
audio = ObjDocumentAudio()
# Read WAV file
samples, sample_rate = audio._read_wav("/audio/song.wav")
print(f"Sample rate: {sample_rate} Hz")
print(f"Sample count: {len(samples)}")
print(f"Duration: {len(samples) / sample_rate:.2f} seconds")
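
The decoding that _read_wav() performs can be approximated with the standard library alone. The following is a minimal sketch, not the module's actual implementation: the helper name read_wav_normalized is hypothetical, it returns a plain list rather than a numpy.ndarray, and it assumes 8-bit samples are unsigned while 16-bit and 32-bit samples are signed little-endian integers, as is conventional for PCM WAV files. Multi-channel frames are averaged into a single mono channel.

```python
import struct
import wave

# Hypothetical helper, not part of ObjDocumentAudio.
# Maps sample width in bytes -> (struct format char, offset, scale).
_FORMATS = {
    1: ("B", 128, 128.0),        # 8-bit PCM is unsigned, centered at 128
    2: ("h", 0, 32768.0),        # 16-bit signed
    4: ("i", 0, 2147483648.0),   # 32-bit signed
}


def read_wav_normalized(file_path):
    """Return (samples, sample_rate) with samples as floats in [-1.0, 1.0]."""
    with wave.open(file_path, "rb") as wav:
        channels = wav.getnchannels()
        width = wav.getsampwidth()
        rate = wav.getframerate()
        raw = wav.readframes(wav.getnframes())

    if width not in _FORMATS:
        raise ValueError(f"Unsupported sample width: {width} bytes")

    code, offset, scale = _FORMATS[width]
    count = len(raw) // width
    values = struct.unpack(f"<{count}{code}", raw[:count * width])
    normalized = [(value - offset) / scale for value in values]

    if channels > 1:
        # Average the interleaved channels frame by frame to get mono.
        normalized = [
            sum(normalized[i:i + channels]) / channels
            for i in range(0, len(normalized) - channels + 1, channels)
        ]
    return normalized, rate
```

The width table is the only place that would need to change if other sample formats had to be accepted.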
_convert_to_wav(audio_path: str) -> str
Converts any audio file to a temporary WAV file using ffmpy/FFmpeg.
Parameters:
- audio_path (str): Path to the input audio file (any format)
Returns:
- str: Path to the temporary WAV file
Raises:
- RuntimeError: If FFmpeg conversion fails
FFmpeg Options:
- -y: Overwrite the output file if it exists
- -loglevel quiet: Suppress FFmpeg output
Cleanup:
Example:
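import os

audio = ObjDocumentAudio()
wav_path = None  # stays None if the conversion raises
try:
    wav_path = audio._convert_to_wav("/audio/song.mp3")
    # Use wav_path...
finally:
    # Remove the temporary file once it is no longer needed
    if wav_path and os.path.exists(wav_path):
        os.remove(wav_path)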
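
The conversion and cleanup steps can also be sketched without ffmpy by calling the ffmpeg binary through subprocess. This is an illustrative stand-in under stated assumptions, not the module's code: build_ffmpeg_command, temporary_wav_path, and convert_to_wav are hypothetical names, and the sketch assumes ffmpeg is on PATH. The -y and -loglevel quiet options are the ones listed above.

```python
import contextlib
import os
import subprocess
import tempfile


def build_ffmpeg_command(src, dst):
    """Build an ffmpeg argument list using the options documented above."""
    return ["ffmpeg", "-y", "-loglevel", "quiet", "-i", src, dst]


@contextlib.contextmanager
def temporary_wav_path():
    """Yield a temporary .wav path and delete the file on exit."""
    fd, path = tempfile.mkstemp(suffix=".wav")
    os.close(fd)  # ffmpeg reopens the path itself
    try:
        yield path
    finally:
        if os.path.exists(path):
            os.remove(path)


def convert_to_wav(src, dst):
    """Run ffmpeg and raise RuntimeError if the conversion fails."""
    try:
        subprocess.run(build_ffmpeg_command(src, dst), check=True)
    except (OSError, subprocess.CalledProcessError) as exc:
        raise RuntimeError("Failed to convert audio to WAV") from exc
```

Wrapping the temporary path in a context manager ties the file's lifetime to a with block, so the cleanup cannot be forgotten when an exception interrupts processing.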
load_audio(file_path: str) -> tuple
Unified audio loading method that handles all supported formats.
Parameters:
- file_path (str): Path to the audio file
Returns:
(samples, sample_rate):
- samples (numpy.ndarray): Normalized audio samples
- sample_rate (int): Sample rate in Hz
Process Flow:
- .wav: Reads directly with _read_wav()
- Other formats: Converts with _convert_to_wav(), then reads the resulting WAV
Example:
audio = ObjDocumentAudio()
# Load any supported format
samples, rate = audio.load_audio("/audio/song.mp3")
samples, rate = audio.load_audio("/audio/podcast.ogg")
samples, rate = audio.load_audio("/audio/music.flac")
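
The dispatch described in the process flow can be sketched as follows. The names needs_conversion and load_audio_sketch are hypothetical, the reader and converter are passed in as callables so the example stays self-contained, and deleting the temporary WAV inside the loader is an assumption about where cleanup happens.

```python
import os


def needs_conversion(file_path):
    """Return True when the file is not a .wav and must be converted first."""
    extension = os.path.splitext(file_path)[1].lower()
    return extension != ".wav"


def load_audio_sketch(file_path, read_wav, convert_to_wav):
    """Read .wav files directly; convert everything else, then read."""
    if not needs_conversion(file_path):
        return read_wav(file_path)
    wav_path = convert_to_wav(file_path)
    try:
        return read_wav(wav_path)
    finally:
        # Assumed cleanup step: remove the temporary WAV after reading.
        if os.path.exists(wav_path):
            os.remove(wav_path)
```

Comparing the lowercased extension means files named SONG.WAV and song.wav take the same direct path.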
generate_preview(samples: np.ndarray, sample_rate: int, output_path: str, figure_size: int = 8) -> bool
Renders an audio waveform visualization using matplotlib.
Parameters:
- samples (numpy.ndarray): Audio samples array
- sample_rate (int): Sample rate in Hz
- output_path (str): Path where preview image will be saved
- figure_size (int, optional): Figure width in inches (default: 8)
Returns:
- True if preview generation succeeded
- False if an exception occurred
Visualization Style:
Figure Settings:
Example:
audio = ObjDocumentAudio()
# Load audio
samples, rate = audio.load_audio("/audio/song.mp3")
# Generate waveform preview
success = audio.generate_preview(
    samples,
    rate,
    "/output/waveform.png",
    figure_size=10
)
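
A rendering routine in this style might look like the sketch below. It is an approximation built from the details this document gives (Agg backend, #191414 background, #1DB954 waveform, fill_between() plus plot(), hidden axes), not the module's source. The function name render_waveform, the figure height, the alpha value, and the line width are assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; must precede the pyplot import
import matplotlib.pyplot as plt
import numpy as np


def render_waveform(samples, sample_rate, output_path, figure_size=8):
    """Save a filled green waveform on a near-black background."""
    background, green = "#191414", "#1DB954"
    times = np.arange(len(samples)) / sample_rate
    # The height is an assumption; the document only specifies the width.
    fig, ax = plt.subplots(figsize=(figure_size, figure_size / 2))
    try:
        fig.patch.set_facecolor(background)
        ax.set_facecolor(background)
        ax.fill_between(times, samples, 0, color=green, alpha=0.6)
        ax.plot(times, samples, color=green, linewidth=0.5)
        ax.axis("off")
        fig.savefig(output_path, facecolor=background, bbox_inches="tight")
    finally:
        plt.close(fig)  # release the figure even if saving fails
```

Closing the figure in a finally block matters in batch jobs, where figures left open accumulate in memory.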
_do_generate_icon(input_path: str, output_path: str, icon_size: int) -> None
Internal method that loads an audio file and generates a square thumbnail icon.
Parameters:
- input_path (str): Path to the input audio file
- output_path (str): Path where the icon will be saved
- icon_size (int): Desired size of the square icon in pixels
Process Flow:
1. Calls load_audio() to load audio samples
2. Calls generate_preview() to create the waveform visualization
3. Calls DocumentTools.resize_to_square() to resize it to a square thumbnail
Example:
audio = ObjDocumentAudio()
# Generate 128x128 thumbnail with waveform
audio._do_generate_icon(
    "/audio/podcast.mp3",
    "/thumbnails/podcast_thumb.png",
    128
)
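
One common way to turn a wide preview into a square thumbnail is to scale it to fit and center it on a square canvas. Whether DocumentTools.resize_to_square() pads, crops, or stretches is not stated in this document, so the geometry below is only an assumption, and fit_in_square is a hypothetical helper.

```python
def fit_in_square(width, height, icon_size):
    """Return (new_width, new_height, offset_x, offset_y) for a centered fit."""
    if width <= 0 or height <= 0 or icon_size <= 0:
        raise ValueError("dimensions must be positive")
    # Scale so that the longer side exactly fills the square.
    scale = icon_size / max(width, height)
    new_width = max(1, round(width * scale))
    new_height = max(1, round(height * scale))
    # Center the scaled image; the remainder becomes padding.
    offset_x = (icon_size - new_width) // 2
    offset_y = (icon_size - new_height) // 2
    return new_width, new_height, offset_x, offset_y


print(fit_in_square(800, 400, 128))  # (128, 64, 0, 32)
```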
from ObjDocumentAudio import ObjDocumentAudio
# Create instance
audio = ObjDocumentAudio()
# Load audio and generate waveform
samples, rate = audio.load_audio("/audio/song.mp3")
audio.generate_preview(samples, rate, "/output/waveform.png")
from ObjDocumentAudio import ObjDocumentAudio
audio = ObjDocumentAudio()
# Generate 256x256 thumbnail
audio._do_generate_icon(
    "/audio/podcast.ogg",
    "/thumbnails/podcast.png",
    256
)
from ObjDocumentAudio import ObjDocumentAudio
import os

audio = ObjDocumentAudio()
audio_dir = "/audio_library"
output_dir = "/thumbnails"
audio_formats = (".mp3", ".wav", ".ogg", ".flac", ".aac")

# Make sure the output directory exists before writing thumbnails
os.makedirs(output_dir, exist_ok=True)

for filename in os.listdir(audio_dir):
    if filename.lower().endswith(audio_formats):
        input_path = os.path.join(audio_dir, filename)
        output_name = os.path.splitext(filename)[0] + "_thumb.png"
        output_path = os.path.join(output_dir, output_name)
        try:
            audio._do_generate_icon(input_path, output_path, 128)
            print(f"Generated thumbnail for {filename}")
        except Exception as e:
            # Report the failure and continue with the next file
            print(f"Failed for {filename}: {e}")
from ObjDocumentAudio import ObjDocumentAudio
audio = ObjDocumentAudio()
# Load audio
samples, sample_rate = audio.load_audio("/audio/recording.wav")
# Calculate duration
duration = len(samples) / sample_rate
print(f"Audio duration: {duration:.2f} seconds")
print(f"Sample count: {len(samples)}")
print(f"Sample rate: {sample_rate} Hz")
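
For display purposes the duration in seconds is often shown as a clock-style string. The helper below is a small sketch of that arithmetic; format_duration is a hypothetical name and is not part of ObjDocumentAudio.

```python
def format_duration(sample_count, sample_rate):
    """Format a sample count as M:SS, or H:MM:SS for long recordings."""
    if sample_rate <= 0:
        raise ValueError("sample_rate must be positive")
    total_seconds = int(sample_count // sample_rate)
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    if hours:
        return f"{hours}:{minutes:02d}:{seconds:02d}"
    return f"{minutes}:{seconds:02d}"


print(format_duration(44100 * 90, 44100))  # 1:30
```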
from ObjDocumentAudio import ObjDocumentAudio
audio = ObjDocumentAudio()
# Load audio
samples, rate = audio.load_audio("/audio/music.mp3")
# Generate large waveform preview
audio.generate_preview(
    samples,
    rate,
    "/output/large_waveform.png",
    figure_size=16  # Wider figure
)
The module has comprehensive test coverage including:
| Test Case | Description | Status |
|---|---|---|
| test_isa_set | Validates _IsA attribute initialization | ✓ |
| test_reads_wav_16bit_mono | Tests 16-bit mono WAV reading | ✓ |
| test_reads_wav_stereo | Tests stereo-to-mono downmixing | ✓ |
| test_reads_wav_8bit | Tests 8-bit WAV reading | ✓ |
| test_calls_ffmpeg | Verifies FFmpeg conversion call | ✓ |
| test_raises_on_ffmpeg_failure | Tests FFmpeg failure handling | ✓ |
| test_load_wav_directly | Tests direct WAV loading | ✓ |
| test_load_mp3_converts_first | Tests MP3 conversion pipeline | ✓ |
| test_generates_waveform | Tests waveform rendering | ✓ |
| test_dark_background | Verifies Spotify-style dark theme | ✓ |
| test_axes_off | Tests axes hiding | ✓ |
| test_returns_false_on_error | Tests error handling | ✓ |
| test_delegates_to_load_and_preview | Tests icon generation pipeline | ✓ |
| test_handles_load_error | Tests error logging | ✓ |
Test file location: resource.test/pytests/factory.core/test_ObjDocumentAudio.py
Run tests:
pytest resource.test/pytests/factory.core/test_ObjDocumentAudio.py -v
- numpy - Array operations for audio samples
- matplotlib - Waveform visualization
- wave - WAV file reading (Python standard library)
- struct - Binary data unpacking (Python standard library)
- ffmpy - Python wrapper for FFmpeg
- tempfile - Temporary file creation
- ObjDocumentDelegate - Base delegate class
- ObjDocumentTools.DocumentTools - Thumbnail resizing utilities
This module requires FFmpeg for non-WAV audio formats:
Ubuntu/Debian:
sudo apt-get update
sudo apt-get install ffmpeg
macOS:
brew install ffmpeg
Windows:
Download from https://ffmpeg.org/download.html and add to PATH
Verify the installation:
ffmpeg -version
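
The same check can be made from Python before any non-WAV file is processed. This is a minimal sketch using the standard library; ffmpeg_available is a hypothetical helper, not something the module provides.

```python
import shutil


def ffmpeg_available(executable="ffmpeg"):
    """Return True if the given executable can be found on PATH."""
    return shutil.which(executable) is not None


if not ffmpeg_available():
    print("FFmpeg not found on PATH; only WAV files can be loaded.")
```

Failing early with a clear message is usually easier to diagnose than a RuntimeError raised partway through a batch.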
The visualization uses Spotify's brand colors:
- Background: #191414 (near-black)
- Waveform: #1DB954 (Spotify green)
- fill_between() creates the filled area under the waveform
- plot() draws the waveform line on top

from ObjDocumentAudio import ObjDocumentAudio

class PodcastEpisode:
    def __init__(self, audio_path):
        self.audio_path = audio_path
        self.audio_handler = ObjDocumentAudio()

    def get_waveform(self, output_path):
        """Generate waveform for podcast episode."""
        samples, rate = self.audio_handler.load_audio(self.audio_path)
        return self.audio_handler.generate_preview(
            samples, rate, output_path, figure_size=12
        )

    def get_duration(self):
        """Get episode duration in seconds."""
        samples, rate = self.audio_handler.load_audio(self.audio_path)
        return len(samples) / rate
from ObjDocumentAudio import ObjDocumentAudio
import concurrent.futures
import os

audio = ObjDocumentAudio()

def generate_thumbnail(song_path, output_dir):
    """Generate thumbnail for a song."""
    filename = os.path.basename(song_path)
    output = os.path.join(output_dir, f"{filename}.png")
    audio._do_generate_icon(song_path, output, 200)
    return output

# Parallel thumbnail generation
# Note: Matplotlib is not thread-safe; if rendering misbehaves under
# threads, consider a process pool instead
song_paths = [...]  # List of song paths
with concurrent.futures.ThreadPoolExecutor(max_workers=4) as executor:
    futures = [
        executor.submit(generate_thumbnail, path, "/thumbnails")
        for path in song_paths
    ]
    for future in concurrent.futures.as_completed(futures):
        result = future.result()
        print(f"Generated: {result}")
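
In the loop above, the first exception re-raised by future.result() ends the whole run. A variant that records failures per file and keeps going might look like the sketch below; run_in_parallel is a hypothetical helper that accepts any worker callable, so it can be tried without audio files.

```python
import concurrent.futures


def run_in_parallel(worker, paths, max_workers=4):
    """Run worker(path) for each path; return (results, failures) dicts."""
    results, failures = {}, {}
    with concurrent.futures.ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Remember which path each future belongs to.
        future_to_path = {pool.submit(worker, path): path for path in paths}
        for future in concurrent.futures.as_completed(future_to_path):
            path = future_to_path[future]
            try:
                results[path] = future.result()
            except Exception as exc:  # keep going when one file fails
                failures[path] = exc
    return results, failures
```

A call such as run_in_parallel(lambda p: generate_thumbnail(p, "/thumbnails"), song_paths) would then return the generated outputs alongside a dictionary of the files that failed.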
Error: RuntimeError: Failed to convert audio to WAV
Solution:
# Check FFmpeg installation
which ffmpeg
# Install if missing
sudo apt-get install ffmpeg
Issue: High memory usage with very long audio files
Solution:
# Downsample long files before visualization
def load_audio_downsampled(path, target_samples=44100 * 30):
    audio = ObjDocumentAudio()
    samples, rate = audio.load_audio(path)
    if len(samples) > target_samples:
        # Keep every nth sample and scale the rate to match,
        # so that len(samples) / rate still approximates the duration
        step = len(samples) // target_samples
        samples = samples[::step]
        rate = max(1, rate // step)
    return samples, rate
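
Keeping every nth sample can skip short peaks, which changes how the waveform looks. An alternative for visualization is to keep the minimum and maximum of each bin. The sketch below illustrates that idea with a hypothetical helper, peak_envelope; it is not something the module is documented to do.

```python
def peak_envelope(samples, bins):
    """Reduce samples to at most `bins` (minimum, maximum) pairs."""
    if bins <= 0:
        raise ValueError("bins must be positive")
    n = len(samples)
    if n == 0:
        return []
    size = -(-n // bins)  # ceiling division: samples per bin
    return [
        (min(samples[i:i + size]), max(samples[i:i + size]))
        for i in range(0, n, size)
    ]
```

The lower and upper values of each pair can be passed to fill_between() as the two curves, so a brief spike remains visible even after heavy reduction.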
Error: RuntimeError: Invalid DISPLAY variable
Solution:
The module already calls matplotlib.use("Agg"), which selects a non-interactive backend. If issues persist, set the backend explicitly before importing pyplot:
import matplotlib
matplotlib.use("Agg") # Must be before pyplot import
import matplotlib.pyplot as plt
Error: ValueError: Unsupported sample width
Cause: WAV file has unusual sample width (not 8, 16, or 32 bit)
Solution:
Convert the file first:
ffmpeg -i input.wav -sample_fmt s16 -ar 44100 output.wav
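
To confirm that the sample width is the cause before converting, the WAV header can be inspected with the standard wave module. This is a minimal sketch; describe_wav and is_supported_width are hypothetical helpers, and the supported widths are taken from the cause stated above.

```python
import wave


def describe_wav(file_path):
    """Return channel count, bit depth, and sample rate from the WAV header."""
    with wave.open(file_path, "rb") as wav:
        return {
            "channels": wav.getnchannels(),
            "bits_per_sample": wav.getsampwidth() * 8,
            "sample_rate": wav.getframerate(),
        }


def is_supported_width(info, supported_bits=(8, 16, 32)):
    """Check the bit depth against the widths listed in this document."""
    return info["bits_per_sample"] in supported_bits
```

A 24-bit file, for example, reports bits_per_sample as 24 and fails the check, which indicates that the ffmpeg conversion above is needed.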
- ObjDocumentDelegate.py - Base class for document delegates
- ObjDocumentTools.py - Shared thumbnail utilities
- ObjDocumentVideo.py - Video file thumbnail generation
- ObjDocument.py - Main document handler that delegates to format-specific handlers