ObjDocumentSpreadsheet is a document delegate class for generating table preview icons from spreadsheet files. It loads CSV and XLSX data using pandas and renders visual table previews using matplotlib, providing thumbnail representations of spreadsheet content.
Module: factory.core/ObjDocumentSpreadsheet.py
Inherits from: ObjDocumentDelegate.ObjDocumentDelegate
Test file: resource.test/pytests/factory.core/test_ObjDocumentSpreadsheet.py
This module enables developers to:
| Format | Extension | Library | Support |
|---|---|---|---|
| CSV | .csv | pandas | Full |
| Excel | .xlsx | pandas + openpyxl | Full |
| Excel | .xls | pandas + xlrd | Full |
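The format table can be read as a lookup from file extension to the libraries that must be installed. The sketch below illustrates that lookup; `REQUIRED_LIBRARIES` and `required_libraries` are hypothetical names introduced here and are not part of ObjDocumentSpreadsheet.

```python
import os

# Hypothetical mapping that mirrors the supported-formats table.
REQUIRED_LIBRARIES = {
    ".csv": ("pandas",),
    ".xlsx": ("pandas", "openpyxl"),
    ".xls": ("pandas", "xlrd"),
}


def required_libraries(file_path):
    """Return the libraries needed for file_path, or None if unsupported."""
    ext = os.path.splitext(file_path)[1].lower()
    return REQUIRED_LIBRARIES.get(ext)


print(required_libraries("/data/sales.csv"))
print(required_libraries("/data/REPORT.XLSX"))
print(required_libraries("/data/notes.txt"))
```

Lower-casing the extension keeps the lookup case-insensitive, so `REPORT.XLSX` resolves the same way as `report.xlsx`.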
__init__(DB: int = 0) -> None
Initializes the ObjDocumentSpreadsheet instance.
Parameters:
- DB (int, optional): Database connection identifier (default: 0)
Initialization:
- Sets _IsA to "ObjDocumentSpreadsheet"
load_data(file_path: str, max_rows: int = 10, max_cols: int = 8) -> pd.DataFrame
Reads a CSV or XLSX file and returns a truncated DataFrame.
Parameters:
- file_path (str): Path to the spreadsheet file
- max_rows (int, optional): Maximum number of rows to read (default: 10)
- max_cols (int, optional): Maximum number of columns to include (default: 8)
Returns:
- pd.DataFrame: Pandas DataFrame with truncated data
Behavior:
- CSV files: pd.read_csv() with the nrows parameter
- Excel files: pd.read_excel() with the nrows parameter
- Column truncation: df.iloc[:, :max_cols] if the column count exceeds the limit
Example:
spreadsheet = ObjDocumentSpreadsheet()
# Load first 10 rows, 8 columns (default)
df = spreadsheet.load_data("/data/sales.csv")
# Load first 5 rows only
df = spreadsheet.load_data("/data/sales.csv", max_rows=5)
# Load first 20 rows, 15 columns
df = spreadsheet.load_data("/data/large.xlsx", max_rows=20, max_cols=15)
print(f"Loaded {len(df)} rows, {len(df.columns)} columns")
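The behavior listed for load_data (limit rows at read time, then slice columns) can be approximated with plain pandas calls. This is a minimal sketch for the CSV path only, not the module's actual implementation; `load_truncated` is a hypothetical helper, and an in-memory buffer stands in for a file on disk.

```python
import io

import pandas as pd


def load_truncated(source, max_rows=10, max_cols=8):
    """Read at most max_rows rows of CSV input, then keep at most max_cols columns."""
    # nrows stops parsing early, so the rest of a large file is never loaded.
    df = pd.read_csv(source, nrows=max_rows)
    if len(df.columns) > max_cols:
        df = df.iloc[:, :max_cols]
    return df


csv_text = "a,b,c,d\n1,2,3,4\n5,6,7,8\n9,10,11,12\n"
df = load_truncated(io.StringIO(csv_text), max_rows=2, max_cols=3)
print(df.shape)
print(list(df.columns))
```

Limiting rows inside the reader rather than slicing afterwards is what keeps memory use low for large inputs; the column slice only runs on the rows that were already kept.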
generate_preview(df: pd.DataFrame, output_path: str, figure_size: int = 8) -> bool
Renders a DataFrame as a table image using matplotlib.
Parameters:
- df (pd.DataFrame): DataFrame to render
- output_path (str): Path where the preview image will be saved
- figure_size (int, optional): Figure width in inches (default: 8)
Returns:
- True if preview generation succeeded
- False if an exception occurred
Figure Dimensions:
- Height: max(2, 0.4 * (rows + 1)) for proportional sizing
Table Styling:
Table Configuration:
Example:
import pandas as pd
spreadsheet = ObjDocumentSpreadsheet()
# Build a small DataFrame to render
df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Carol"],
    "Score": [90, 85, 78],
    "Grade": ["A", "B", "C"]
})
# Generate preview
success = spreadsheet.generate_preview(
    df,
    "/output/table_preview.png",
    figure_size=10
)
if success:
    print("Preview generated successfully")
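The sizing rule given under Figure Dimensions can be worked through in isolation. The sketch below assumes that the width equals figure_size (as the parameter description states) and that the `+ 1` accounts for the header row; `figure_dimensions` is a hypothetical helper, not a method of the class.

```python
def figure_dimensions(row_count, figure_size=8):
    """Return (width, height) in inches for a table with row_count data rows."""
    # Assumption: the extra 1 is the header row; 2 inches is the minimum height.
    height = max(2, 0.4 * (row_count + 1))
    return figure_size, height


for rows in (3, 10, 20):
    print(rows, figure_dimensions(rows))
```

With the default limit of 10 rows the formula yields a height of about 4.4 inches, while very short tables are held at the 2 inch floor so they do not collapse into a thin strip.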
_do_generate_icon(input_path: str, output_path: str, icon_size: int) -> None
Internal method that loads a spreadsheet and generates a square thumbnail icon.
Parameters:
- input_path (str): Path to the input spreadsheet file
- output_path (str): Path where the icon will be saved
- icon_size (int): Desired size of the square icon in pixels
Process Flow:
1. Calls load_data() to load spreadsheet data
2. Calls generate_preview() to create the table visualization
3. Calls DocumentTools.resize_to_square() to resize to a square thumbnail
Example:
spreadsheet = ObjDocumentSpreadsheet()
# Generate 128x128 thumbnail
spreadsheet._do_generate_icon(
    "/data/sales.csv",
    "/thumbnails/sales_thumb.png",
    128
)
# Generate 256x256 thumbnail
spreadsheet._do_generate_icon(
    "/data/report.xlsx",
    "/thumbnails/report_thumb.png",
    256
)
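The three-step process flow of _do_generate_icon (load, render, resize) can be sketched with the steps passed in as callables. This is an illustration under assumptions, not the module's code: `generate_icon` is hypothetical, it returns a bool where the real method returns None, and the argument order of the resize step is a guess since the signature of DocumentTools.resize_to_square() is not documented here.

```python
import logging

logger = logging.getLogger(__name__)


def generate_icon(input_path, output_path, icon_size, load, render, resize):
    """Run load -> render -> resize in order; log and stop on the first failure."""
    try:
        df = load(input_path)
        if not render(df, output_path):
            logger.error("Preview rendering failed for %s", input_path)
            return False
        # Assumption: the resize step takes the image path and the target size.
        resize(output_path, icon_size)
        return True
    except Exception:
        logger.exception("Icon generation failed for %s", input_path)
        return False


# Recording stand-ins make the call order visible without touching any files.
calls = []
ok = generate_icon(
    "/data/sales.csv",
    "/thumbnails/sales_thumb.png",
    128,
    load=lambda path: calls.append("load") or {"rows": []},
    render=lambda df, out: calls.append("render") or True,
    resize=lambda out, size: calls.append("resize"),
)
print(ok, calls)
```

Injecting the steps keeps the ordering logic testable without real spreadsheets, which is similar in spirit to the delegation and load-error cases listed in the test table.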
from ObjDocumentSpreadsheet import ObjDocumentSpreadsheet
import pandas as pd
# Create instance
spreadsheet = ObjDocumentSpreadsheet()
# Create sample data
df = pd.DataFrame({
    "Product": ["Widget A", "Widget B", "Widget C"],
    "Price": [19.99, 29.99, 39.99],
    "Stock": [100, 50, 25]
})
# Generate preview
spreadsheet.generate_preview(df, "/output/products.png")
from ObjDocumentSpreadsheet import ObjDocumentSpreadsheet
spreadsheet = ObjDocumentSpreadsheet()
# Generate 200x200 thumbnail for CSV
spreadsheet._do_generate_icon(
    "/data/customers.csv",
    "/thumbnails/customers.png",
    200
)
# Generate thumbnail for Excel file
spreadsheet._do_generate_icon(
    "/data/sales_report.xlsx",
    "/thumbnails/sales.png",
    200
)
from ObjDocumentSpreadsheet import ObjDocumentSpreadsheet
import os
spreadsheet = ObjDocumentSpreadsheet()
data_dir = "/spreadsheets"
output_dir = "/thumbnails"
spreadsheet_formats = (".csv", ".xlsx", ".xls")
for filename in os.listdir(data_dir):
    if filename.lower().endswith(spreadsheet_formats):
        input_path = os.path.join(data_dir, filename)
        output_name = os.path.splitext(filename)[0] + "_thumb.png"
        output_path = os.path.join(output_dir, output_name)
        try:
            spreadsheet._do_generate_icon(input_path, output_path, 128)
            print(f"Generated thumbnail for {filename}")
        except Exception as e:
            print(f"Failed for {filename}: {e}")
from ObjDocumentSpreadsheet import ObjDocumentSpreadsheet
spreadsheet = ObjDocumentSpreadsheet()
# Load only first 5 rows and 4 columns
df = spreadsheet.load_data(
    "/data/large_dataset.csv",
    max_rows=5,
    max_cols=4
)
# Generate preview with larger figure
spreadsheet.generate_preview(df, "/output/preview.png", figure_size=12)
import os
from ObjDocumentSpreadsheet import ObjDocumentSpreadsheet
spreadsheet = ObjDocumentSpreadsheet()
files = [
    "/data/january_sales.csv",
    "/data/february_sales.csv",
    "/data/march_sales.csv",
]
# Number the output files from 1 to match the month order
for i, file_path in enumerate(files, start=1):
    df = spreadsheet.load_data(file_path)
    output = f"/previews/sales_month_{i}.png"
    spreadsheet.generate_preview(df, output)
    print(f"Generated preview for {os.path.basename(file_path)}")
The test suite covers the following cases:
| Test Case | Description | Status |
|---|---|---|
| test_isa_set | Validates _IsA attribute initialization | ✓ |
| test_load_csv | Tests CSV file loading | ✓ |
| test_load_csv_max_rows | Tests row limit enforcement | ✓ |
| test_load_csv_max_cols | Tests column truncation | ✓ |
| test_load_xlsx | Tests XLSX file loading | ✓ |
| test_xlsx_max_rows | Tests XLSX row limiting | ✓ |
| test_generates_image | Tests table rendering | ✓ |
| test_axes_turned_off | Tests axes hiding | ✓ |
| test_header_row_styled_blue | Verifies header styling | ✓ |
| test_returns_false_on_error | Tests error handling | ✓ |
| test_delegates_to_load_and_preview | Tests icon generation pipeline | ✓ |
| test_handles_load_error | Tests error logging | ✓ |
Test file location: resource.test/pytests/factory.core/test_ObjDocumentSpreadsheet.py
Run tests:
pytest resource.test/pytests/factory.core/test_ObjDocumentSpreadsheet.py -v
- pandas - Data loading and manipulation
- matplotlib - Table visualization
- openpyxl - Excel XLSX file support (pandas dependency)
- xlrd - Excel XLS file support (pandas dependency, optional)
- ObjDocumentDelegate - Base delegate class
- ObjDocumentTools.DocumentTools - Thumbnail resizing utilities
Installation:
pip install pandas matplotlib openpyxl
# For XLS support:
pip install xlrd
Professional blue theme inspired by Microsoft Excel:
- Header: #4472C4 (vibrant blue)
- Even data rows: #D9E2F3 (light blue)
- Odd data rows: #FFFFFF (white)
# Header cells (row 0)
cell.set_facecolor("#4472C4")
cell.set_text_props(color="white", weight="bold")
# Even data rows
cell.set_facecolor("#D9E2F3")
# Odd data rows
cell.set_facecolor("#FFFFFF")
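The color rules above reduce to a small function of the row index. The sketch below assumes parity is taken on the table row index with the header at row 0; the constant names and `row_color` are hypothetical and only illustrate the selection logic.

```python
HEADER_COLOR = "#4472C4"
EVEN_ROW_COLOR = "#D9E2F3"
ODD_ROW_COLOR = "#FFFFFF"


def row_color(row_index):
    """Return the fill color for a table row, where row 0 is the header."""
    if row_index == 0:
        return HEADER_COLOR
    # Assumption: "even" and "odd" refer to the table row index itself.
    return EVEN_ROW_COLOR if row_index % 2 == 0 else ODD_ROW_COLOR


print([row_color(i) for i in range(4)])
```

A function like this could feed `cell.set_facecolor()` while iterating over the cells of a matplotlib table, keeping the palette in one place.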
Optimization Tips:
# For very large files, use smaller limits
df = spreadsheet.load_data("/huge_file.csv", max_rows=5, max_cols=5)
# Or use pandas chunk reading for massive files
for i, chunk in enumerate(pd.read_csv("/huge.csv", chunksize=10)):
    spreadsheet.generate_preview(chunk, f"/previews/chunk_{i}.png")
    break  # Only preview first chunk
import os
from ObjDocumentSpreadsheet import ObjDocumentSpreadsheet
import pandas as pd

class SpreadsheetPreviewGenerator:
    def __init__(self):
        self.handler = ObjDocumentSpreadsheet()
        self.supported_formats = {".csv", ".xlsx", ".xls"}

    def is_supported(self, file_path):
        """Check if file format is supported."""
        ext = os.path.splitext(file_path)[1].lower()
        return ext in self.supported_formats

    def get_info(self, file_path):
        """Get spreadsheet information (reads the whole file)."""
        if file_path.lower().endswith(".csv"):
            df = pd.read_csv(file_path)
        else:
            df = pd.read_excel(file_path)
        return {
            "rows": len(df),
            "columns": len(df.columns),
            "column_names": list(df.columns),
            "size_bytes": os.path.getsize(file_path)
        }

    def generate_preview(self, file_path, output_path,
                         rows=10, cols=8, figure_size=8):
        """Generate table preview with custom settings."""
        df = self.handler.load_data(file_path, rows, cols)
        return self.handler.generate_preview(df, output_path, figure_size)

    def generate_thumbnail(self, file_path, output_path, size=128):
        """Generate square thumbnail."""
        self.handler._do_generate_icon(file_path, output_path, size)
        return output_path
import os
from ObjDocumentSpreadsheet import ObjDocumentSpreadsheet

def create_data_report(csv_files, output_dir):
    """Generate previews for multiple datasets."""
    spreadsheet = ObjDocumentSpreadsheet()
    for csv_file in csv_files:
        # Load and preview
        df = spreadsheet.load_data(csv_file, max_rows=15)
        filename = os.path.basename(csv_file)
        output = os.path.join(output_dir, f"{filename}_preview.png")
        spreadsheet.generate_preview(df, output, figure_size=10)
        # Print summary
        print(f"{filename}: {len(df)} rows x {len(df.columns)} cols")
from ObjDocumentSpreadsheet import ObjDocumentSpreadsheet
import pandas as pd
def compare_spreadsheets(file1, file2, output_dir):
    """Generate side-by-side previews for comparison."""
    spreadsheet = ObjDocumentSpreadsheet()
    # Load both files
    df1 = spreadsheet.load_data(file1)
    df2 = spreadsheet.load_data(file2)
    # Generate previews
    spreadsheet.generate_preview(df1, f"{output_dir}/file1_preview.png")
    spreadsheet.generate_preview(df2, f"{output_dir}/file2_preview.png")
    # Compare structure
    print(f"File 1: {len(df1)} rows, {len(df1.columns)} cols")
    print(f"File 2: {len(df2)} rows, {len(df2.columns)} cols")
    if list(df1.columns) == list(df2.columns):
        print("Column names match")
    else:
        print("Column names differ")
Error: ModuleNotFoundError: No module named 'pandas'
Solution:
pip install pandas matplotlib
Error: ImportError: Missing optional dependency 'openpyxl'
Solution:
pip install openpyxl # For .xlsx files
pip install xlrd # For .xls files (legacy)
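Before running a batch job it can help to check which of the required packages are importable, rather than waiting for the first ModuleNotFoundError or ImportError. The following sketch uses only the standard library; `missing_modules` is a hypothetical helper and not part of the module.

```python
import importlib.util


def missing_modules(names):
    """Return the module names from names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# The output depends on what is installed in the current environment.
print(missing_modules(["pandas", "matplotlib", "openpyxl", "xlrd"]))
```

An empty list means all four packages were found; any names that are printed can be passed to `pip install`.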
Issue: MemoryError when loading large CSV/XLSX files
Solution:
# Use smaller row limits
df = spreadsheet.load_data("/large_file.csv", max_rows=5)
# Or use pandas chunking
for chunk in pd.read_csv("/large.csv", chunksize=10):
    spreadsheet.generate_preview(chunk, "/preview.png")
    break  # Only first chunk
Error: UnicodeDecodeError when loading CSV
Solution:
# Specify encoding explicitly
df = pd.read_csv("/file.csv", encoding='latin-1', nrows=10)
spreadsheet.generate_preview(df, "/output.png")
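When the encoding of a CSV file is not known in advance, one option is to try a short list of candidates and pass the first one that decodes cleanly to `pd.read_csv(..., encoding=...)`. The sketch below is standard-library only; `first_working_encoding` is a hypothetical helper, and the candidate list is an assumption to adjust for the data at hand.

```python
import os
import tempfile


def first_working_encoding(file_path, candidates=("utf-8", "latin-1")):
    """Return the first candidate encoding that decodes the whole file, else None."""
    with open(file_path, "rb") as handle:
        raw = handle.read()
    for encoding in candidates:
        try:
            raw.decode(encoding)
            return encoding
        except UnicodeDecodeError:
            continue
    return None


# Demonstration with a temporary file containing Latin-1 bytes.
with tempfile.NamedTemporaryFile("wb", suffix=".csv", delete=False) as tmp:
    tmp.write("name,city\nJosé,Málaga\n".encode("latin-1"))
    sample_path = tmp.name

detected = first_working_encoding(sample_path)
print(detected)
os.remove(sample_path)
```

Because latin-1 maps every byte to a character, it never raises and works best as the last fallback. This version reads the whole file, so for very large inputs it may be preferable to test only a leading slice of the bytes.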
Error: RuntimeError: Invalid DISPLAY variable
Solution:
The module already uses matplotlib.use("Agg") for headless rendering. If issues persist:
import matplotlib
matplotlib.use("Agg") # Must be before pyplot import
Issue: Column headers are cut off or overlapping
Solution:
# Reduce column count
df = spreadsheet.load_data("/file.csv", max_cols=6)
# Or increase figure size
spreadsheet.generate_preview(df, "/output.png", figure_size=12)
- ObjDocumentDelegate.py - Base class for document delegates
- ObjDocumentTools.py - Shared thumbnail utilities
- ObjDocumentCode.py - Code file syntax highlighting
- ObjDocument.py - Main document handler that delegates to format-specific handlers