Data Structures
The FuzzyDataFrame is the cornerstone data structure for fuzzy data analysis in AxisFuzzy, providing a pandas-like interface specifically designed for handling fuzzy numbers efficiently. This document introduces you to the FuzzyDataFrame’s design philosophy, core capabilities, and practical usage patterns that make fuzzy data analysis both intuitive and powerful.
Think of FuzzyDataFrame as your familiar pandas DataFrame, but enhanced with native support for fuzzy numbers. Just as pandas revolutionized data analysis by providing labeled, heterogeneous data structures, FuzzyDataFrame brings the same level of convenience and power to the world of fuzzy data analysis.
Understanding FuzzyDataFrame
What is FuzzyDataFrame
FuzzyDataFrame is a specialized two-dimensional data structure for fuzzy data analysis. It keeps the tabular structure of a pandas DataFrame, but each cell holds a fuzzy number instead of a crisp value.
The Fundamental Concept
In traditional data analysis, a DataFrame cell might contain a value like 0.75. In a
FuzzyDataFrame, that same cell contains a fuzzy number that might represent “approximately 0.75”
with associated membership and non-membership degrees. This allows you to capture and work with
uncertainty, imprecision, and subjective judgments that are inherent in real-world data.
# Traditional pandas DataFrame
crisp_df = pd.DataFrame({
'score': [0.75, 0.82, 0.68],
'rating': [4.2, 4.7, 3.9]
})
# FuzzyDataFrame equivalent
fuzzy_df = FuzzyDataFrame({
'score': fuzzarray_scores, # Each element is a fuzzy number
'rating': fuzzarray_ratings # Preserving uncertainty information
})
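To make the idea of a fuzzy cell concrete, here is a minimal standalone sketch of a q-rung orthopair fuzzy number, the kind of value the 'qrofn' type used in this document refers to. The OrthopairCell class is a hypothetical stand-in written for illustration and is not an AxisFuzzy class. It only encodes the standard constraint that md^q + nmd^q must not exceed 1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OrthopairCell:
    """Hypothetical stand-in for one fuzzy cell (not an AxisFuzzy class)."""
    md: float   # membership degree
    nmd: float  # non-membership degree
    q: int = 2  # rung parameter

    def __post_init__(self):
        if not (0.0 <= self.md <= 1.0 and 0.0 <= self.nmd <= 1.0):
            raise ValueError("md and nmd must lie in [0, 1]")
        # q-rung orthopair constraint: md**q + nmd**q <= 1
        if self.md ** self.q + self.nmd ** self.q > 1.0 + 1e-12:
            raise ValueError("md**q + nmd**q must not exceed 1")


crisp_cell = 0.75                              # a pandas cell: one number
fuzzy_cell = OrthopairCell(md=0.75, nmd=0.2)   # a fuzzy cell: a pair plus q
print(crisp_cell, fuzzy_cell)
```

The crisp cell carries one number, while the fuzzy cell also records how much evidence speaks against the value.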
Core Design Principles
FuzzyDataFrame follows several key design principles that make it both powerful and accessible:
Pandas-Inspired Interface: If you know how to use pandas DataFrame, you already understand
most of FuzzyDataFrame’s interface. Properties such as shape, columns, and index, along with
column indexing, behave as they do in pandas.
Fuzzarray Foundation: Each column is a Fuzzarray - AxisFuzzy’s high-performance fuzzy array structure. This ensures efficient storage and computation while maintaining the full richness of fuzzy information.
Type Consistency: All columns in a FuzzyDataFrame share the same fuzzy type (mtype), ensuring mathematical operations between columns are well-defined and meaningful.
Future-Ready Architecture: FuzzyDataFrame is currently built on pandas infrastructure, and it is designed so that it can later migrate to a Polars backend for better performance.
Key Structural Characteristics
Understanding FuzzyDataFrame’s structure helps you work with it effectively:
Column-Oriented Storage: Each column is an independent Fuzzarray containing fuzzy numbers
Labeled Axes: Both rows and columns have labels, just like pandas DataFrame
Homogeneous Fuzzy Type: All fuzzy numbers in the DataFrame share the same mtype (e.g., ‘qrofn’)
Index Alignment: Row and column operations respect pandas-style index alignment
Memory Efficiency: Leverages Fuzzarray’s backend system for optimized memory usage
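The structural rules above can be sketched with a small dictionary-of-columns container. ColumnStore is a hypothetical illustration, not how FuzzyDataFrame is implemented. It enforces only equal column lengths, a single shared mtype, and an index that matches the row count.

```python
class ColumnStore:
    """Hypothetical dict-of-columns table (illustration only)."""

    def __init__(self, columns, index=None):
        # columns maps a column name to a (mtype, values) pair
        lengths = {len(values) for _, values in columns.values()}
        if len(lengths) > 1:
            raise ValueError("all columns must have the same length")
        mtypes = {mtype for mtype, _ in columns.values()}
        if len(mtypes) > 1:
            raise TypeError(f"mixed fuzzy types: {sorted(mtypes)}")
        self.columns = list(columns)
        self.mtype = mtypes.pop() if mtypes else None
        n_rows = lengths.pop() if lengths else 0
        self.index = list(index) if index is not None else list(range(n_rows))
        if len(self.index) != n_rows:
            raise ValueError("index length must match the number of rows")
        self._data = {name: list(values) for name, (_, values) in columns.items()}

    @property
    def shape(self):
        return (len(self.index), len(self.columns))

    def __getitem__(self, name):
        return self._data[name]


table = ColumnStore(
    {
        "score": ("qrofn", [(0.8, 0.1), (0.7, 0.2)]),
        "rating": ("qrofn", [(0.6, 0.3), (0.9, 0.1)]),
    },
    index=["product_a", "product_b"],
)
print(table.shape, table.mtype)  # (2, 2) qrofn
```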
Relationship to AxisFuzzy Ecosystem
FuzzyDataFrame isn’t an isolated component - it’s deeply integrated with AxisFuzzy’s broader ecosystem:
Components: Analysis components can consume and produce FuzzyDataFrame objects
Pipelines: FuzzyDataFrame flows seamlessly through analysis pipelines
Models: High-level models can work directly with FuzzyDataFrame inputs and outputs
Contracts: Type contracts ensure FuzzyDataFrame compatibility across the system
Why FuzzyDataFrame Matters
Traditional data analysis assumes your data is precise and certain. But real-world scenarios often involve uncertainty, subjective judgments, and imprecise measurements. FuzzyDataFrame addresses these limitations in several crucial ways.
Preserving Information Richness
When you convert fuzzy data to crisp numbers (like taking just the membership degree), you lose valuable information about uncertainty and confidence. FuzzyDataFrame preserves the complete fuzzy representation throughout your entire analysis workflow.
Consider a customer satisfaction survey where responses like “somewhat satisfied” contain
inherent ambiguity. Traditional approaches might convert this to a single number like 3.5.
FuzzyDataFrame preserves the uncertainty, allowing your analysis to account for the fact that
this rating could reasonably range from 3.0 to 4.0 with varying degrees of confidence.
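A short numeric sketch shows what is lost when only the membership degree is kept. It uses the standard q-rung orthopair indeterminacy (1 - md^q - nmd^q)^(1/q). The two sample responses are made-up values chosen for illustration.

```python
def hesitancy(md, nmd, q=2):
    """Indeterminacy left over after membership and non-membership."""
    remainder = 1.0 - md ** q - nmd ** q
    return max(remainder, 0.0) ** (1.0 / q)


contested = (0.7, 0.7)   # strong evidence both for and against
open_ended = (0.7, 0.1)  # same membership, little evidence against

# Keeping only the membership degree makes the two responses identical
print(contested[0] == open_ended[0])        # True

# The full pair still separates them
print(round(hesitancy(*contested), 4))      # 0.1414
print(round(hesitancy(*open_ended), 4))     # 0.7071
```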
Familiar Yet Powerful Interface
FuzzyDataFrame leverages pandas conventions, dramatically reducing the learning curve. If you can work with pandas DataFrame, you can work with FuzzyDataFrame. This familiarity accelerates adoption while providing access to sophisticated fuzzy analysis capabilities.
# Familiar pandas-style operations
print(fuzzy_df.shape) # (100, 5)
print(fuzzy_df.columns) # ['feature_1', 'feature_2', ...]
column_data = fuzzy_df['score'] # Returns a Fuzzarray
# But with fuzzy-aware semantics
fuzzy_subset = fuzzy_df[fuzzy_df.columns[:3]] # Maintains fuzzy properties
Performance at Scale
FuzzyDataFrame is built on Fuzzarray’s efficient backend system, which optimizes memory usage and computational performance. This means you can work with large fuzzy datasets without sacrificing speed or consuming excessive memory.
The backend system automatically selects the most efficient representation for your specific fuzzy number type and operations, ensuring that fuzzy computations scale to real-world datasets.
Seamless Ecosystem Integration
Perhaps most importantly, FuzzyDataFrame integrates seamlessly with AxisFuzzy’s analysis ecosystem. You can:
Feed FuzzyDataFrame directly into analysis components
Use it as input/output for fuzzy pipelines
Apply high-level models that expect fuzzy tabular data
Leverage the contract system for type-safe data flow
This integration means you can build sophisticated fuzzy analysis workflows without worrying about data format conversions or compatibility issues.
Real-World Applications
FuzzyDataFrame excels in scenarios where uncertainty and imprecision are inherent:
Decision Support Systems: Where criteria have subjective weights and uncertain outcomes
Risk Assessment: Where probabilities and impacts contain inherent uncertainty
Quality Evaluation: Where ratings and scores reflect subjective judgments
Sensor Data Analysis: Where measurements contain noise and calibration uncertainty
Expert Systems: Where domain knowledge involves linguistic variables and approximate reasoning
By preserving and working with uncertainty rather than discarding it, FuzzyDataFrame enables more robust and realistic analysis of complex real-world problems.
Creating and Initializing FuzzyDataFrame
FuzzyDataFrame provides flexible construction patterns to accommodate different data sources and use cases. Whether you’re starting with crisp data, existing fuzzy arrays, or building from scratch, there’s an appropriate construction approach.
Basic Construction Patterns
Direct Construction from Fuzzarray Dictionary
Create a FuzzyDataFrame directly from a dictionary mapping column names to Fuzzarray objects:
from axisfuzzy.analysis.dataframe import FuzzyDataFrame
from axisfuzzy import fuzzyarray, fuzzynum
# Create fuzzy arrays
scores = fuzzyarray([
fuzzynum((0.8,0.1), q=2),
fuzzynum((0.7,0.2), q=2)
])
# Construct FuzzyDataFrame
fuzzy_df = FuzzyDataFrame({'performance': scores})
print(fuzzy_df.shape) # (2, 1)
print(fuzzy_df)
output:
performance
0 <0.8,0.1>
1 <0.7,0.2>
Construction with Custom Index and Columns
Specify custom index and column labels for meaningful data organization:
import pandas as pd
fuzzy_df = FuzzyDataFrame(
data={'q1_performance': scores}, # Key names must match the columns labels
index=pd.Index(['product_a', 'product_b'], name='products'),
columns=pd.Index(['q1_performance'], name='quarters')
)
print(fuzzy_df)
output:
quarters q1_performance
products
product_a <0.8,0.1>
product_b <0.7,0.2>
Converting from Pandas DataFrame
The most common scenario involves converting crisp data into fuzzy representations using
the from_pandas() class method.
Basic Conversion Process
import pandas as pd
from axisfuzzy.fuzzifier import Fuzzifier
# Existing crisp data
sensor_data = pd.DataFrame({
'temperature': [20.5, 25.3, 18.7],
'humidity': [65.2, 70.1, 58.9]
})
# Configure fuzzification
fuzzifier = Fuzzifier(
mf='gaussmf',
mtype='qrofn',
q=2,
mf_params=[{'sigma': 10, 'c': 30}]
)
# Convert to FuzzyDataFrame
fuzzy_data = FuzzyDataFrame.from_pandas(sensor_data, fuzzifier)
print(f"Fuzzy type: {fuzzy_data.mtype}")
What Happens During Conversion
The from_pandas() method performs these operations:
Column-wise Fuzzification: Each column is processed by the fuzzifier
Structure Preservation: Original index and column labels are maintained
Type Consistency: All fuzzy numbers share the same mtype
Validation: Ensures proper fuzzifier configuration
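The four steps above can be sketched in plain Python. The helper fuzzify_table and its dictionary return format are hypothetical and exist only to show the order of operations. The real from_pandas() returns a FuzzyDataFrame, and its internals are not shown in this document.

```python
import math


def gauss_md(x, sigma, c):
    """Gaussian membership degree of x."""
    return math.exp(-((x - c) ** 2) / (2.0 * sigma ** 2))


def fuzzify_table(columns, index, membership, mtype="qrofn"):
    """Hypothetical helper mirroring the conversion steps."""
    # Validation: the membership function must be callable
    if not callable(membership):
        raise TypeError("membership must be callable")
    # Column-wise fuzzification: each column is processed independently
    fuzzy_columns = {
        name: [membership(value) for value in values]
        for name, values in columns.items()
    }
    # Structure preservation: labels are carried over unchanged.
    # Type consistency: one mtype tag covers the whole table.
    return {"index": list(index), "columns": fuzzy_columns, "mtype": mtype}


result = fuzzify_table(
    {"temperature": [20.5, 25.3, 18.7], "humidity": [65.2, 70.1, 58.9]},
    index=[0, 1, 2],
    membership=lambda x: gauss_md(x, sigma=10, c=30),
)
print(list(result["columns"]), result["index"], result["mtype"])
```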
Using the Pandas Accessor
The pandas accessor provides seamless integration with existing pandas workflows through
the .fuzzy accessor.
Basic Accessor Usage
# Existing pandas workflow
data = pd.DataFrame({
'feature_1': [1.2, 2.3, 1.8],
'feature_2': [0.8, 1.5, 1.1]
})
# Configure and convert
fuzzifier = Fuzzifier(
mf='gaussmf',
mtype='qrofn',
q=2,
mf_params=[{'sigma': 10, 'c': 30}]
)
fuzzy_data = data.fuzzy.to_fuzz_dataframe(fuzzifier)
Integration with Analysis Workflows
The accessor integrates with AxisFuzzy’s analysis ecosystem:
from axisfuzzy.analysis.pipeline import FuzzyPipeline
# Execute pipeline directly from pandas DataFrame
# pipeline = FuzzyPipeline()
# result = data.fuzzy.run(pipeline, fuzzifier=fuzzifier)
Construction Best Practices
When creating FuzzyDataFrame objects, follow these guidelines:
Choose the Right Method:
Use from_pandas() for converting crisp data
Use direct construction for existing Fuzzarray objects
Use the accessor for pandas workflow integration
Ensure Consistency:
All Fuzzarray columns must have the same length
All fuzzy numbers should share the same mtype
Maintain proper index alignment
Memory Considerations:
Process large datasets in chunks when necessary
Choose appropriate membership function parameters
Consider backend implications of your mtype choice
Working with FuzzyDataFrame
Creating Your First FuzzyDataFrame
Before exploring FuzzyDataFrame operations, let’s create a sample dataset that we’ll use throughout this section. This example demonstrates the typical workflow of converting crisp data into fuzzy representations.
import pandas as pd
from axisfuzzy.analysis.dataframe import FuzzyDataFrame
from axisfuzzy.fuzzifier import Fuzzifier
# Create sample crisp data
crisp_data = pd.DataFrame({
'temperature': [20.5, 25.3, 18.7, 22.1, 19.8],
'humidity': [65.2, 70.1, 58.9, 67.5, 62.3],
'pressure': [78.2, 46.8, 55.5, 57.1, 79.7]
}, index=['sensor_1', 'sensor_2', 'sensor_3', 'sensor_4', 'sensor_5'])
# Configure fuzzifier for converting crisp values to fuzzy numbers
fuzzifier = Fuzzifier(
mf='gaussmf', # Gaussian membership function
mtype='qrofn', # q-rung orthopair fuzzy numbers
q=2, # q-rung parameter
mf_params=[{'sigma': 40, 'c': 50}] # Gaussian parameters
)
# Create FuzzyDataFrame from crisp data
fdf = FuzzyDataFrame.from_pandas(crisp_data, fuzzifier)
print(fdf)
output:
temperature humidity pressure
sensor_1 <0.7619,0.6399> <0.9303,0.3528> <0.78,0.6178>
sensor_2 <0.8264,0.5541> <0.8814,0.4617> <0.9968,0>
sensor_3 <0.7363,0.6693> <0.9756,0.1957> <0.9906,0.0934>
sensor_4 <0.7841,0.6126> <0.9087,0.4052> <0.9844,0.145>
sensor_5 <0.752,0.6515> <0.9538,0.2832> <0.7591,0.6433>
Now that we have our FuzzyDataFrame fdf, let’s explore its capabilities and operations.
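The numbers in the table can be checked by hand. The sketch below recomputes the sensor_1 row with a Gaussian membership function (sigma=40, c=50). The non-membership rule, nmd = (1 - md^q - pi^q)^(1/q) with pi = 0.1, is an assumption inferred from the printed values. It is not a documented description of how Fuzzifier derives non-membership.

```python
import math


def gauss_md(x, sigma, c):
    """Gaussian membership degree of x."""
    return math.exp(-((x - c) ** 2) / (2.0 * sigma ** 2))


def inferred_nmd(md, q=2, pi=0.1):
    """Non-membership rule inferred from the printed table (assumption)."""
    remainder = 1.0 - md ** q - pi ** q
    return max(remainder, 0.0) ** (1.0 / q)


# Crisp sensor_1 readings from the sample data
for label, x in [("temperature", 20.5), ("humidity", 65.2), ("pressure", 78.2)]:
    md = gauss_md(x, sigma=40, c=50)
    # Compare each printed pair with the sensor_1 row of the table
    print(label, round(md, 4), round(inferred_nmd(md), 4))
```

Under the same assumption, a reading close to the center c (such as 46.8) gives a membership near 1 and a non-membership clamped to 0, which is consistent with the <0.9968,0> entry.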
Understanding FuzzyDataFrame Fundamentals
FuzzyDataFrame serves as your primary tool for organizing and manipulating fuzzy data in a structured, tabular format. Think of it as a specialized version of pandas DataFrame, but designed specifically to handle the complexities of fuzzy numbers while maintaining familiar, intuitive operations.
Unlike traditional data structures that work with crisp values, FuzzyDataFrame manages collections of fuzzy numbers (Fuzzarray objects) as columns, ensuring that all fuzzy operations preserve uncertainty information throughout your analysis workflow.
Core Architecture
FuzzyDataFrame organizes data in a column-oriented structure where:
Each column contains a Fuzzarray (a collection of fuzzy numbers)
Each row represents a data record with fuzzy values across different attributes
All columns must share the same mtype (fuzzy number type) for consistency
Index and column labels follow pandas conventions for familiar navigation
Essential Properties and Information
FuzzyDataFrame provides comprehensive properties to understand your data structure and content. These properties help you quickly assess data dimensions, types, and organization patterns.
Dimensional Information
Understand the size and structure of your fuzzy dataset:
# Get shape as (rows, columns) tuple
rows, cols = fdf.shape
print(f"Dataset contains {rows} records with {cols} fuzzy attributes")
# Alternative: get row count directly
num_records = len(fdf)
print(f"Total records: {num_records}")
Index and Column Management
Access and examine the organizational structure:
# Examine row labels (index)
print("Row labels:", fdf.index.tolist())
# Examine column names
print("Fuzzy attributes:", fdf.columns.tolist())
# Check if index has names
if fdf.index.name:
    print(f"Index represents: {fdf.index.name}")
Fuzzy Type Information
Verify the consistency of fuzzy number types across your dataset:
# Check the fuzzy number type
print(f"Fuzzy type: {fdf.mtype}")
# A single mtype means every column uses the same fuzzy representation
# (e.g., all q-rung orthopair fuzzy numbers, 'qrofn')
Column Operations and Data Access
FuzzyDataFrame provides intuitive methods for accessing and manipulating individual columns and data elements, maintaining the fuzzy nature of your data throughout all operations.
Column Retrieval and Inspection
Access individual columns as Fuzzarray objects for detailed analysis:
# Retrieve a specific fuzzy attribute
temperature_data = fdf['temperature']
print(f"Temperature column type: {type(temperature_data)}") # Fuzzarray
# Examine column properties
print(f"Column length: {len(temperature_data)}")
print(f"Column fuzzy type: {temperature_data.mtype}")
Adding and Modifying Columns
Extend your dataset with new fuzzy attributes:
# Create new fuzzy data
from axisfuzzy import fuzzynum, fuzzyarray
# Prepare new fuzzy values
pressure_values = [fuzzynum((0.7,0.3), q=2) for _ in range(len(fdf))]
new_pressure_column = fuzzyarray(pressure_values)
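# Assign the column. Note that fdf already has a 'pressure' column, so this
# replaces its values; assign to a new name to add an attribute instead
fdf['pressure'] = new_pressure_column
# Verify the column list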
print(f"Updated columns: {fdf.columns.tolist()}")
Element-Level Access
Retrieve and examine individual fuzzy numbers:
# Access specific fuzzy values
first_temperature = fdf['temperature'][0]
print(f"First temperature reading: {first_temperature}")
# Access by column label, then by row position
specific_value = fdf['humidity'][2] # Third humidity reading
print(f"Specific humidity value: {specific_value}")
Data Inspection and Visualization
Effective fuzzy data analysis requires understanding the content and characteristics of your dataset. FuzzyDataFrame provides multiple approaches for inspecting and visualizing fuzzy information.
Dataset Overview and Display
Get a comprehensive view of your fuzzy dataset:
# Display the complete FuzzyDataFrame
print(fdf)
# This shows:
# - All fuzzy values in readable format
# - Row and column labels
# - Automatic formatting for large datasets
Detailed Fuzzy Number Examination
Inspect the internal structure of individual fuzzy numbers:
# Select a specific fuzzy value for detailed analysis
sample_value = fdf['temperature'][0]
# Examine fuzzy number components
print(f"Fuzzy value: {sample_value}")
print(f"membership and non-membership degree: [{sample_value.md}, {sample_value.nmd}]")
print(f"Score value: {sample_value.score}")
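For orientation, a score function commonly used for q-rung orthopair fuzzy numbers is md^q - nmd^q, with md^q + nmd^q as a companion accuracy value. The sketch below implements those two textbook formulas. Whether the .score attribute uses exactly this definition is an assumption that should be checked against the library.

```python
def qrofn_score(md, nmd, q=2):
    """Common q-rung orthopair score: md**q - nmd**q, in [-1, 1]."""
    return md ** q - nmd ** q


def qrofn_accuracy(md, nmd, q=2):
    """Companion accuracy value, often used to break ties between scores."""
    return md ** q + nmd ** q


print(round(qrofn_score(0.8, 0.1), 4))     # 0.63
print(round(qrofn_accuracy(0.8, 0.1), 4))  # 0.65
```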
Data Quality and Consistency Checks
Verify the integrity and consistency of your fuzzy dataset:
# Check for empty or invalid data
if fdf.shape[0] == 0:
    print("Warning: Dataset is empty")
# Verify column consistency
print(f"All columns have same mtype: {fdf.mtype}")
# Check for proper column lengths
column_lengths = [len(fdf[col]) for col in fdf.columns]
if len(set(column_lengths)) == 1:
    print("All columns have consistent length")
else:
    print("Warning: Column length mismatch detected")
Working with Subsets and Selections
Extract and work with portions of your fuzzy dataset:
# Work with specific columns (individual column access)
temperature_data = fdf['temperature']
humidity_data = fdf['humidity']
# Create a subset FuzzyDataFrame with selected columns
environmental_data = FuzzyDataFrame({
'temperature': fdf['temperature'],
'humidity': fdf['humidity']
}, index=fdf.index)
# Access multiple values from a column
first_three_temps = [fdf['temperature'][i] for i in range(3)]
print(f"First three temperature readings: {first_three_temps}")
# Examine data patterns
for col_name in fdf.columns:
    sample_val = fdf[col_name][0]
    print(f"{col_name}: {sample_val}")
This comprehensive approach to working with FuzzyDataFrame ensures you can effectively manage, inspect, and understand your fuzzy data while maintaining the mathematical rigor required for accurate fuzzy analysis.
Integration with Analysis Ecosystem
FuzzyDataFrame serves as the central data structure that connects different parts of AxisFuzzy’s analysis ecosystem. Think of it as the “common language” that allows various analysis tools to work together seamlessly. This section shows you how FuzzyDataFrame integrates with the three main parts of the ecosystem: components, contracts, and models.
Pandas Accessor Integration
The most user-friendly way to work with FuzzyDataFrame is through pandas’ .fuzzy
accessor, which extends any pandas DataFrame with fuzzy analysis capabilities.
Converting Pandas to FuzzyDataFrame
Transform your regular pandas data into fuzzy representation:
import pandas as pd
from axisfuzzy.fuzzifier import Fuzzifier
from axisfuzzy.membership import TriangularMF
# Your regular pandas DataFrame
df = pd.DataFrame({
'temperature': [18.5, 22.3, 25.1, 19.8],
'humidity': [17.2, 26.8, 27.9, 18.3]
})
# Create a fuzzifier with triangular membership function
fuzzifier = Fuzzifier(
mf='trimf',
mtype='qrofn',
q=2,
mf_params={'a': 15.0, 'b': 22.0, 'c': 30.0}
)
# Convert to FuzzyDataFrame using the .fuzzy accessor
fuzzy_df = df.fuzzy.to_fuzz_dataframe(fuzzifier=fuzzifier)
# Now you have a FuzzyDataFrame ready for analysis
print(fuzzy_df) # Prints the fuzzy table shown below
output:
temperature humidity
0 <0.5,0.8602> <0.3143,0.944>
1 <0.9625,0.2522> <0.4,0.911>
2 <0.6125,0.7841> <0.2625,0.9597>
3 <0.6857,0.721> <0.4714,0.8762>
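The membership degrees in the temperature column follow directly from the triangular parameters a=15, b=22, c=30. This standalone sketch uses the standard triangular membership formula and does not call AxisFuzzy.

```python
def trimf(x, a, b, c):
    """Triangular membership: rises from a to the peak b, falls to c."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


temperature = [18.5, 22.3, 25.1, 19.8]
print([round(trimf(x, 15.0, 22.0, 30.0), 4) for x in temperature])
# [0.5, 0.9625, 0.6125, 0.6857]
```

Values at or outside the interval [a, c] receive a membership of 0, so the parameters should cover the range of the data being fuzzified.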
Running Analysis Models
Execute complex analysis workflows directly from pandas:
# 'my_analysis_model' is assumed to be a pre-built analysis model (a Model subclass)
from axisfuzzy.analysis.app.model import Model
# Run the model through the pandas accessor;
# the accessor automatically handles data conversion and injection
results = df.fuzzy.run(my_analysis_model, weights=[0.6, 0.4])
Component System Integration
Components are the building blocks of fuzzy analysis. FuzzyDataFrame flows through
these components, getting transformed at each step.
Basic Component Workflow
Here’s how components work with FuzzyDataFrame:
from axisfuzzy.analysis.component.basic import (
ToolFuzzification, ToolNormalization
)
from axisfuzzy.fuzzifier import Fuzzifier
import pandas as pd
# Start with crisp data
crisp_data = pd.DataFrame({'score1': [85, 92, 78], 'score2': [88, 85, 90]})
# Step 1: Normalize the crisp data first
normalizer = ToolNormalization(method='min_max')
normalized_data = normalizer.run(crisp_data) # DataFrame → DataFrame
# Step 2: Convert normalized data to fuzzy data
# Create fuzzifier with triangular membership function
fuzzifier_config = Fuzzifier(
mf='trimf',
mtype='qrofn',
q=2,
mf_params={'a': 0.0, 'b': 0.5, 'c': 1.0} # Scaled to the normalized range [0, 1]
)
fuzzifier = ToolFuzzification(fuzzifier=fuzzifier_config)
fuzzy_data = fuzzifier.run(normalized_data) # Returns FuzzyDataFrame
# Step 3: Access and work with fuzzy data
# FuzzyDataFrame provides access to underlying Fuzzarray objects
print(f"Fuzzy data shape: {fuzzy_data.shape}")
print(f"Columns: {fuzzy_data.columns}")
# Access individual columns as Fuzzarray for further processing
score1_fuzzy = fuzzy_data['score1'] # Returns Fuzzarray
score2_fuzzy = fuzzy_data['score2'] # Returns Fuzzarray
# Now you can use Fuzzarray's built-in aggregation methods
score1_mean = score1_fuzzy.mean() # Fuzzy mean using extension system
score2_mean = score2_fuzzy.mean() # Fuzzy mean using extension system
print(f"Score1 fuzzy mean: {score1_mean}")
print(f"Score2 fuzzy mean: {score2_mean}")
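Because normalization runs before fuzzification, the membership function sees rescaled values rather than the raw scores. The sketch below assumes that 'min_max' means the usual (x - min) / (max - min) rescaling per column. ToolNormalization's exact behavior is not shown in this document.

```python
def min_max(values):
    """Rescale values to [0, 1]; assumes the usual (x - min) / (max - min)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


score1 = [85, 92, 78]
print(min_max(score1))  # [0.5, 1.0, 0.0]
```

Under that assumption every normalized value lies between 0 and 1, so membership function parameters need to be expressed on that scale rather than on the original 0 to 100 scale.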
Component Chaining
Components can be chained together for complex workflows. The key is to ensure contract compatibility between components:
from axisfuzzy.analysis.component.basic import (
ToolFuzzification, ToolNormalization, ToolSimpleAggregation
)
from axisfuzzy.fuzzifier import Fuzzifier
import pandas as pd
# Sample data
crisp_data = pd.DataFrame({'score1': [85, 92, 78], 'score2': [88, 85, 90]})
# Create components
normalizer = ToolNormalization(method='min_max')
fuzzifier_config = Fuzzifier(
mf='trimf',
mtype='qrofn',
q=2,
mf_params={'a': 0.0, 'b': 0.5, 'c': 1.0} # Scaled to the min-max normalized range [0, 1]
)
fuzzifier = ToolFuzzification(fuzzifier=fuzzifier_config)
# ✅ Correct chaining: normalize → fuzzify → access individual arrays
normalized_data = normalizer.run(crisp_data) # DataFrame → DataFrame
fuzzy_data = fuzzifier.run(normalized_data) # DataFrame → FuzzyDataFrame
# For aggregation, extract Fuzzarray from FuzzyDataFrame
score1_fuzzy = fuzzy_data['score1'] # Extract Fuzzarray
score2_fuzzy = fuzzy_data['score2'] # Extract Fuzzarray
# Use Fuzzarray's built-in aggregation methods
score1_mean = score1_fuzzy.mean() # Fuzzy aggregation
score2_mean = score2_fuzzy.mean() # Fuzzy aggregation
print(f"Final scores: {score1_mean}, {score2_mean}")
# Alternative: If you need crisp aggregation, convert back to DataFrame first
# This approach loses fuzzy information but enables ToolSimpleAggregation
crisp_aggregator = ToolSimpleAggregation(operation='mean')
crisp_result = crisp_aggregator.run(normalized_data) # Works on crisp data
Contract System and Type Safety
The contract system ensures that FuzzyDataFrame is used correctly throughout your analysis pipeline. It’s like having a safety net that catches data type errors before they cause problems.
Understanding Contracts
Contracts define what type of data a function expects and returns:
from axisfuzzy.analysis.contracts.decorator import contract
from axisfuzzy.analysis.build_in import ContractCrispTable, ContractFuzzyTable
from axisfuzzy.analysis.component.basic import ToolFuzzification
from axisfuzzy.fuzzifier import Fuzzifier
@contract
def my_analysis_function(data: ContractCrispTable) -> ContractFuzzyTable:
    """
    This function expects crisp data and returns fuzzy data.
    The contract decorator automatically validates inputs and outputs.
    """
    # Convert crisp data to FuzzyDataFrame
    fuzzifier_engine = Fuzzifier(mf='trimf', mtype='qrofn',
                                 mf_params={'a': 0, 'b': 0.5, 'c': 1})
    fuzzifier = ToolFuzzification(fuzzifier=fuzzifier_engine)
    return fuzzifier.run(data)
# The contract system automatically validates:
# - Input: Must be a pandas DataFrame with numeric data
# - Output: Must be a FuzzyDataFrame
result = my_analysis_function(crisp_data)
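The general idea of validating inputs and outputs from type annotations can be sketched with the standard library alone. The checked decorator below is a hypothetical simplification and is not AxisFuzzy's @contract implementation. It only performs isinstance checks against plain Python types.

```python
import functools
import inspect


def checked(func):
    """Hypothetical annotation-checking decorator (not AxisFuzzy's @contract)."""
    signature = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = signature.bind(*args, **kwargs)
        # Validate each annotated argument before the call
        for name, value in bound.arguments.items():
            expected = signature.parameters[name].annotation
            if expected is not inspect.Parameter.empty and not isinstance(value, expected):
                raise TypeError(f"{name} must be {expected.__name__}")
        result = func(*args, **kwargs)
        # Validate the return value after the call
        expected = signature.return_annotation
        if expected is not inspect.Signature.empty and not isinstance(result, expected):
            raise TypeError(f"return value must be {expected.__name__}")
        return result

    return wrapper


@checked
def mean_of(values: list) -> float:
    return sum(values) / len(values)


print(mean_of([1.0, 2.0, 3.0]))  # 2.0
```

Passing a tuple instead of a list raises a TypeError before the function body runs, which is the kind of early failure a contract system aims for.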
Built-in Contracts for FuzzyDataFrame
AxisFuzzy provides several contracts specifically for FuzzyDataFrame:
from axisfuzzy.analysis.build_in import (
ContractFuzzyTable, # For FuzzyDataFrame
ContractCrispTable, # For pandas DataFrame with numeric data
ContractWeightVector # For weight arrays
)
@contract
def weighted_fuzzy_analysis(
    fuzzy_data: ContractFuzzyTable,
    weights: ContractWeightVector
) -> ContractFuzzyTable:
    # Your analysis logic here
    # Apply weights to fuzzy data and return processed result
    processed_fuzzy_data = fuzzy_data # Placeholder for actual processing
    return processed_fuzzy_data
Model API Integration
The Model API provides the highest level of abstraction, allowing you to build complex analysis workflows that feel like writing regular Python classes.
Creating Analysis Models
Build reusable models that work with FuzzyDataFrame:
from axisfuzzy.analysis.app.model import Model
from axisfuzzy.analysis.build_in import ContractCrispTable, ContractFuzzyTable
from axisfuzzy.analysis.component.basic import ToolFuzzification, ToolNormalization, ToolSimpleAggregation
from axisfuzzy.fuzzifier import Fuzzifier
class EnvironmentalAnalysisModel(Model):
    def __init__(self, fuzzifier_type='triangular'):
        super().__init__()
        # Keep the constructor argument so get_config() can report it
        self.fuzzifier_type = fuzzifier_type
        # Define your analysis components
        fuzzifier_engine = Fuzzifier(mf='trimf', mtype='qrofn',
                                     mf_params={'a': 0, 'b': 0.5, 'c': 1})
        self.fuzzifier = ToolFuzzification(fuzzifier=fuzzifier_engine)
        self.normalizer = ToolNormalization(method='min_max')
        self.aggregator = ToolSimpleAggregation(operation='mean')
    def forward(self, environmental_data: ContractCrispTable) -> ContractFuzzyTable:
        # Define your analysis workflow
        # Step 1: Normalize the crisp data first
        normalized_data = self.normalizer(environmental_data)
        # Step 2: Convert normalized crisp data to fuzzy representation
        fuzzy_data = self.fuzzifier(normalized_data)
        # Step 3: Return the fuzzy table directly. ToolSimpleAggregation expects
        # a ContractCrispTable, so it is not applied here; callers can extract
        # individual columns as Fuzzarray objects for fuzzy aggregation
        return fuzzy_data
    def get_config(self):
        return {'fuzzifier_type': self.fuzzifier_type}
Using Models
Once built, models are easy to use:
# Create and build the model
model = EnvironmentalAnalysisModel()
model.build() # This creates the internal pipeline
# Use the model
environmental_data = pd.DataFrame({
'temperature': [20.5, 23.1, 18.9],
'humidity': [65.2, 58.7, 72.1]
})
result = model.run(environmental_data=environmental_data)
# Or use with pandas accessor for convenience
result = environmental_data.fuzzy.run(model)
This integration ecosystem makes FuzzyDataFrame a powerful bridge between different analysis approaches, from simple component-based processing to sophisticated model-driven workflows, all while maintaining type safety and ease of use.
Advanced Usage and Best Practices
This section explores advanced techniques for maximizing FuzzyDataFrame’s capabilities in production environments. Understanding these patterns helps you build robust, scalable fuzzy analysis workflows that leverage the full power of AxisFuzzy’s architecture.
Performance Optimization Strategies
FuzzyDataFrame’s performance characteristics are fundamentally shaped by its column-oriented architecture and integration with Fuzzarray’s backend system. Understanding these design decisions helps you write efficient fuzzy analysis code.
Memory Architecture and Optimization
FuzzyDataFrame employs a Structure-of-Arrays (SoA) design where each column stores fuzzy numbers as separate Fuzzarray objects. This architecture provides significant performance advantages for analytical workloads:
# Column-wise operations are highly optimized
temperature_data = fdf['temperature'] # Direct Fuzzarray access
humidity_data = fdf['humidity'] # No data copying
# Vectorized operations across entire columns
comfort_index = temperature_data * 0.6 + humidity_data * 0.4
# Memory-efficient column selection - create subset with individual column access
subset_data = {
'temperature': fdf['temperature'],
'humidity': fdf['humidity'],
'pressure': fdf['pressure']
}
subset = FuzzyDataFrame(subset_data, index=fdf.index)
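The difference between an array-of-structures and a structure-of-arrays layout can be illustrated with the standard library. This is a conceptual sketch only. How Fuzzarray's backend actually lays out membership and non-membership values is not described in this document, so the two-buffer layout is an assumption.

```python
from array import array

# Array-of-Structures: one Python object per fuzzy number
aos = [(0.8, 0.1), (0.7, 0.2), (0.6, 0.3)]

# Structure-of-Arrays: one contiguous buffer per component
md = array("d", [pair[0] for pair in aos])
nmd = array("d", [pair[1] for pair in aos])

# A column-wide statistic reads one buffer sequentially
mean_md = sum(md) / len(md)
print(round(mean_md, 2))  # 0.7

# Selecting a component is a reference to its buffer, not a copy of every record
print(md.itemsize * len(md), "bytes of membership data")
```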
Backend-Aware Performance Patterns
FuzzyDataFrame automatically leverages Fuzzarray’s optimized backends for computational efficiency. Understanding these patterns helps you write performance-conscious code:
# Efficient: extract membership degrees column by column, then process the
# whole table at once ('normalizer' is a ToolNormalization instance as above)
crisp_data = pd.DataFrame({
    col: [float(fuzz_val.md) for fuzz_val in fdf[col]]
    for col in fdf.columns
}, index=fdf.index)
normalized_scores = normalizer.run(crisp_data) # Vectorized processing
# Less efficient: Row-by-row processing
# Avoid this pattern for large datasets
# ('process_single_row' stands for any per-record function of your own)
results = []
for i in range(len(fdf)):
    row_data = {col: fdf[col][i] for col in fdf.columns}
    results.append(process_single_row(row_data))
Memory Management for Large Datasets
When working with large fuzzy datasets, consider memory usage patterns:
# Memory-efficient data loading
import pandas as pd
from axisfuzzy.analysis.dataframe import FuzzyDataFrame
from axisfuzzy.fuzzifier import Fuzzifier
from axisfuzzy.analysis.pipeline import FuzzyPipeline
def load_large_fuzzy_dataset(file_path, fuzzifier, chunk_size=10000):
    """Yield fuzzified chunks one at a time to keep memory usage bounded."""
    for chunk in pd.read_csv(file_path, chunksize=chunk_size):
        yield FuzzyDataFrame.from_pandas(chunk, fuzzifier)
# Initialize required components
fuzzifier = Fuzzifier(
    mf='gaussmf',
    mtype='qrofn',
    q=2,
    mf_params=[{'sigma': 10, 'c': 30}]
)
analysis_pipeline = FuzzyPipeline() # Configure as needed
# Load and process data one chunk at a time
results = []
for chunk in load_large_fuzzy_dataset('large_dataset.csv', fuzzifier):
    chunk_result = analysis_pipeline.run(chunk)
    results.append(chunk_result)
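The chunking idea itself does not depend on pandas or AxisFuzzy. The following standard-library sketch reads rows lazily and hands them out in fixed-size groups. The helper name iter_chunks and the in-memory CSV are illustrative assumptions.

```python
import csv
import io
from itertools import islice


def iter_chunks(rows, chunk_size):
    """Yield lists of at most chunk_size rows without loading everything."""
    iterator = iter(rows)
    while True:
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            return
        yield chunk


# In-memory CSV standing in for a large file on disk
source = io.StringIO("temperature,humidity\n20.5,65.2\n25.3,70.1\n18.7,58.9\n")
reader = csv.DictReader(source)

chunk_sizes = []
for chunk in iter_chunks(reader, chunk_size=2):
    # Each chunk would be fuzzified and analysed here, then discarded
    chunk_sizes.append(len(chunk))

print(chunk_sizes)  # [2, 1]
```

Only one chunk is held in memory at a time, and the final partial chunk is still delivered.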
Production-Ready Best Practices
Building robust fuzzy analysis systems requires attention to data consistency, error handling, and integration patterns. These practices ensure your FuzzyDataFrame workflows are reliable and maintainable.
Data Type Consistency and Validation
Maintaining consistent fuzzy data types across your analysis workflow prevents subtle bugs and ensures predictable behavior:
# Establish consistent fuzzy types early
def create_standardized_fuzzy_dataframe(crisp_data, analysis_config):
    """Create FuzzyDataFrame with consistent mtype across all columns."""
    fuzzifier = Fuzzifier(
        mtype=analysis_config['fuzzy_type'], # e.g., 'qrofn'
        **analysis_config['fuzzifier_params']
    )
    # Validate input data before conversion
    if not all(pd.api.types.is_numeric_dtype(dtype) for dtype in crisp_data.dtypes):
        raise ValueError("All columns must contain numeric data for fuzzification")
    return FuzzyDataFrame.from_pandas(crisp_data, fuzzifier)
# Verify mtype consistency in analysis pipelines
def validate_fuzzy_compatibility(fdf1, fdf2):
    """Ensure two FuzzyDataFrames have compatible fuzzy types."""
    if fdf1.mtype != fdf2.mtype:
        raise TypeError(f"Incompatible fuzzy types: {fdf1.mtype} vs {fdf2.mtype}")
Efficient Data Conversion Patterns
Minimize computational overhead by optimizing data conversion workflows:
# Pattern 1: Batch conversion for multiple analyses
class FuzzyAnalysisWorkflow:
    def __init__(self, fuzzifier):
        self.fuzzifier = fuzzifier
        self._fuzzy_cache = {}
    def get_fuzzy_data(self, data_key, crisp_data):
        """Cache fuzzy conversions to avoid repeated computation."""
        if data_key not in self._fuzzy_cache:
            self._fuzzy_cache[data_key] = FuzzyDataFrame.from_pandas(
                crisp_data, self.fuzzifier
            )
        return self._fuzzy_cache[data_key]
# Pattern 2: Incremental data processing
def process_streaming_data(data_stream, fuzzifier, batch_size=1000):
    """Process streaming data in batches for memory efficiency."""
    batch = []
    for record in data_stream:
        batch.append(record)
        if len(batch) >= batch_size:
            batch_df = pd.DataFrame(batch)
            fuzzy_batch = FuzzyDataFrame.from_pandas(batch_df, fuzzifier)
            yield fuzzy_batch
            batch = []
    # Flush the final partial batch so trailing records are not dropped
    if batch:
        yield FuzzyDataFrame.from_pandas(pd.DataFrame(batch), fuzzifier)
Seamless Ecosystem Integration
Leverage FuzzyDataFrame’s integration with AxisFuzzy’s broader ecosystem for powerful analysis workflows:
# Integration with pandas accessor
def enhanced_data_pipeline(crisp_data, normalizer, fuzzifier):
    """Demonstrate seamless integration patterns."""
    # Traditional pandas preprocessing
    cleaned_data = crisp_data.dropna().reset_index(drop=True)
    # Component-based preprocessing on crisp data (DataFrame → DataFrame)
    normalized_data = normalizer.run(cleaned_data)
    # Smooth transition to fuzzy analysis
    fuzzy_data = normalized_data.fuzzy.to_fuzz_dataframe(fuzzifier)
    return fuzzy_data
# Integration with Model API
from axisfuzzy.analysis.app.model import Model
from axisfuzzy.analysis.component.basic import ToolNormalization, ToolFuzzification, ToolSimpleAggregation
from axisfuzzy.analysis.build_in import ContractCrispTable
class ProductionAnalysisModel(Model):
def __init__(self):
super().__init__()
self.preprocessor = ToolNormalization()
self.analyzer = ToolFuzzification(fuzzifier=production_fuzzifier)
self.aggregator = ToolSimpleAggregation()
def forward(self, input_data: ContractCrispTable):
# Automatic FuzzyDataFrame handling
normalized = self.preprocessor(input_data)
fuzzy_data = self.analyzer(normalized)
return self.aggregator(fuzzy_data)
Error Handling and Robustness
Validate inputs before conversion and decide explicitly how each failure is handled:
import logging

logger = logging.getLogger(__name__)

def robust_fuzzy_analysis(crisp_data, fuzzifier, fallback_strategy='skip'):
    """Robust fuzzy analysis with comprehensive error handling.

    With 'skip', non-numeric columns are dropped. With 'raise', errors
    propagate to the caller. Otherwise failures are logged and None is returned.
    """
    try:
        # Validate input data
        if crisp_data.empty:
            raise ValueError("Input data is empty")
        # Check for required numeric types
        non_numeric_cols = [col for col in crisp_data.columns
                            if not pd.api.types.is_numeric_dtype(crisp_data[col])]
        if non_numeric_cols:
            if fallback_strategy == 'skip':
                crisp_data = crisp_data.drop(columns=non_numeric_cols)
            else:
                raise TypeError(f"Non-numeric columns found: {non_numeric_cols}")
        # Create FuzzyDataFrame with validation
        fuzzy_data = FuzzyDataFrame.from_pandas(crisp_data, fuzzifier)
        return fuzzy_data
    except Exception as e:
        logger.error("Fuzzy analysis failed: %s", e)
        if fallback_strategy == 'raise':
            raise
        return None
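One possible refinement is to separate the validation step from the conversion step, so that each failure mode can be tested on its own. The sketch below is a simplified illustration in plain Python, not AxisFuzzy code: a dict of column lists stands in for a pandas DataFrame, and `select_numeric_columns` is a hypothetical helper that mirrors the `'skip'` and `'raise'` branches above.

```python
from numbers import Real


def select_numeric_columns(table, fallback_strategy="skip"):
    """Return only the numeric columns of `table`.

    `table` maps column names to lists of values (a stand-in for a DataFrame).
    With 'skip', non-numeric columns are dropped; with any other strategy,
    a TypeError is raised.
    """
    if not table:
        raise ValueError("Input data is empty")
    non_numeric = [
        name
        for name, values in table.items()
        # bool is a subclass of int, so it is excluded explicitly
        if not all(isinstance(v, Real) and not isinstance(v, bool) for v in values)
    ]
    if non_numeric and fallback_strategy != "skip":
        raise TypeError(f"Non-numeric columns found: {non_numeric}")
    return {name: values for name, values in table.items() if name not in non_numeric}


table = {"score": [0.7, 0.8], "label": ["a", "b"]}
kept = select_numeric_columns(table, fallback_strategy="skip")

try:
    select_numeric_columns(table, fallback_strategy="raise")
    error_message = None
except TypeError as exc:
    error_message = str(exc)
```

Keeping validation in a small pure function makes it straightforward to unit-test the fallback behaviour without constructing a fuzzifier.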
Future Evolution and Roadmap
FuzzyDataFrame is designed as an evolving platform that adapts to emerging computational paradigms and user needs. Understanding the planned evolution helps you prepare for future capabilities.
Strategic Backend Migration to Polars
Note
Polars Integration Roadmap: AxisFuzzy is planning a strategic migration from pandas to Polars as the underlying computational engine. Polars (https://pola.rs/) is a high-performance DataFrame library written in Rust with Python bindings, designed specifically for large-scale data processing and analytical workloads.
The transition to Polars represents a fundamental architectural advancement that addresses the computational demands of large-scale fuzzy data analysis. This migration embodies AxisFuzzy’s commitment to performance optimization while maintaining complete API compatibility.
Core Performance Advantages
Polars is expected to improve performance in several ways:
Lazy Evaluation Engine: Query optimization and computational graph analysis reduce overhead for complex multi-step fuzzy operations
Native Parallelization: Multi-threading capabilities leverage modern multi-core architectures for fuzzy number computations
Memory Efficiency: Columnar processing model aligns with FuzzyDataFrame’s architecture, optimizing memory utilization patterns
Rust-Based Performance: Zero-copy operations and optimized algorithms deliver substantial speed improvements
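To make the idea of lazy evaluation concrete, the following is a conceptual sketch in plain Python. It is not Polars code and not AxisFuzzy's implementation: `LazyPipeline` and `clip` are hypothetical names. The pipeline only records operations when they are declared, and runs all of them in a single fused pass when `collect()` is called. A real engine such as Polars goes further by reordering and optimizing the recorded plan.

```python
from typing import Any, Callable, Iterable, List, Tuple


class LazyPipeline:
    """Records map/filter steps and executes them only on collect()."""

    def __init__(self, source: Iterable[Any], steps: Tuple[Tuple[str, Callable], ...] = ()):
        self._source = source
        self._steps = steps

    def map(self, fn: Callable) -> "LazyPipeline":
        # Nothing is computed here; the step is only appended to the plan
        return LazyPipeline(self._source, self._steps + (("map", fn),))

    def filter(self, pred: Callable) -> "LazyPipeline":
        return LazyPipeline(self._source, self._steps + (("filter", pred),))

    def collect(self) -> List[Any]:
        """Run every recorded step in one pass over the source."""
        out = []
        for item in self._source:
            keep = True
            for kind, fn in self._steps:
                if kind == "map":
                    item = fn(item)
                elif not fn(item):
                    keep = False
                    break
            if keep:
                out.append(item)
        return out


calls = []


def clip(x):
    """Clamp a value to [0, 1] and record that the function ran."""
    calls.append(x)
    return min(max(x, 0.0), 1.0)


plan = LazyPipeline([-0.2, 0.4, 1.3]).map(clip).filter(lambda x: x > 0.0)
calls_before_collect = list(calls)  # still empty: no work has been done yet
result = plan.collect()
```

Deferring execution this way avoids materializing an intermediate collection after each step, which is one source of the overhead reduction described above.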
API Compatibility Guarantee
The Polars migration is planned to maintain complete backward compatibility, so the same calls are intended to work unchanged before and after:
# Current pandas-based implementation
fuzzy_df = FuzzyDataFrame.from_pandas(crisp_data, fuzzifier)
result = fuzzy_df['temperature'].apply(analysis_function)
# Future Polars-enhanced implementation (identical API)
fuzzy_df = FuzzyDataFrame.from_pandas(crisp_data, fuzzifier)
result = fuzzy_df['temperature'].apply(analysis_function) # Faster execution
Performance Projections
Preliminary benchmarking indicates the following projected improvements:
Fuzzification Operations: 3-5x performance gain for large datasets
Aggregation Functions: 2-4x speedup for complex operations
Memory Footprint: 30-50% reduction in memory usage
Query Optimization: Automatic pipeline optimization
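Projected figures like these are best confirmed on your own workload. The sketch below shows one simple way to measure a speedup ratio with the standard library. `median_runtime` and `speedup` are hypothetical helpers, and the two sorting callables are placeholders for whatever baseline and candidate implementations you want to compare.

```python
import time
from statistics import median


def median_runtime(fn, *args, repeats=5):
    """Return the median wall-clock time of fn(*args) over several runs."""
    if repeats < 1:
        raise ValueError("repeats must be at least 1")
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args)
        timings.append(time.perf_counter() - start)
    return median(timings)


def speedup(baseline, candidate, *args, repeats=5):
    """Return baseline time divided by candidate time (values above 1 mean faster)."""
    base_time = median_runtime(baseline, *args, repeats=repeats)
    cand_time = median_runtime(candidate, *args, repeats=repeats)
    if cand_time == 0:
        return float("inf")
    return base_time / cand_time


# Placeholder workloads: both callables do the same work in this demo
data = list(range(10_000, 0, -1))
ratio = speedup(lambda xs: sorted(xs), lambda xs: sorted(xs), data, repeats=3)
```

Using the median rather than the mean reduces the influence of occasional slow runs caused by other activity on the machine.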
Extended Analytical Capabilities
Future Polars-enhanced versions are planned to introduce advanced fuzzy operations:
Fuzzy Joins: Similarity-based join operations with fuzzy matching
Temporal Fuzzy Analysis: Time-series operations with fuzzy reasoning
Distributed Processing: Cluster-based fuzzy analysis capabilities
Streaming Integration: Real-time fuzzy data processing support
Note
The Polars migration is planned as a seamless transition with no breaking changes. Existing FuzzyDataFrame code is expected to benefit from the performance improvements without modification.
Conclusion
The data structures in axisfuzzy.analysis establish a comprehensive foundation for
fuzzy data manipulation and analysis, bridging the gap between traditional data
processing paradigms and fuzzy logic requirements. Through the FuzzyDataFrame
and its supporting ecosystem, developers gain access to powerful tools that maintain
both computational efficiency and analytical precision.
Core Architectural Achievements:
Seamless Integration: Native compatibility with pandas workflows while extending functionality for fuzzy data types and operations
Type Safety: Contract-driven validation ensuring data integrity throughout complex analytical pipelines
Performance Optimization: Memory-efficient storage and vectorized operations designed for large-scale fuzzy analysis workloads
Extensible Design: Modular architecture supporting custom fuzzy number types and specialized analytical operations
Practical Impact:
The unified data structure approach eliminates the traditional friction between data preparation and fuzzy analysis, enabling researchers and practitioners to focus on analytical insights rather than data transformation complexities. The framework’s emphasis on familiar pandas-like interfaces reduces learning curves while providing the specialized capabilities required for sophisticated fuzzy logic applications.
Future-Ready Foundation:
This data structure ecosystem positions AxisFuzzy as a scalable platform for emerging fuzzy analysis methodologies, with built-in support for streaming data, cloud-native deployments, and advanced visualization integration. The commitment to API stability ensures long-term viability for research and production systems.
The axisfuzzy.analysis data structures transform fuzzy data analysis from a specialized, tool-specific domain into an accessible, integrated component of modern data science workflows, maintaining scientific rigor while embracing practical usability.