.. _data_structures:

===============
Data Structures
===============

The FuzzyDataFrame is the cornerstone data structure for fuzzy data analysis in AxisFuzzy, 
providing a pandas-like interface specifically designed for handling fuzzy numbers efficiently. 
This document introduces you to the FuzzyDataFrame's design philosophy, core capabilities, 
and practical usage patterns that make fuzzy data analysis both intuitive and powerful.

Think of FuzzyDataFrame as your familiar pandas DataFrame, but enhanced with native support 
for fuzzy numbers. Just as pandas revolutionized data analysis by providing labeled, 
heterogeneous data structures, FuzzyDataFrame brings the same level of convenience and 
power to the world of fuzzy data analysis.

.. contents::
   :local:

Understanding FuzzyDataFrame
----------------------------

What is FuzzyDataFrame
~~~~~~~~~~~~~~~~~~~~~~

FuzzyDataFrame is a specialized two-dimensional data structure designed for fuzzy data analysis. 
It keeps the familiar tabular structure of a pandas DataFrame, but each cell contains a fuzzy 
number instead of a crisp value.

**The Fundamental Concept**

In traditional data analysis, a DataFrame cell might contain a value like ``0.75``. In a 
FuzzyDataFrame, that same cell contains a fuzzy number that might represent "approximately 0.75" 
with associated membership and non-membership degrees. This allows you to capture and work with 
uncertainty, imprecision, and subjective judgments that are inherent in real-world data.

.. code-block:: python

    # Traditional pandas DataFrame
    crisp_df = pd.DataFrame({
        'score': [0.75, 0.82, 0.68],
        'rating': [4.2, 4.7, 3.9]
    })
    
    # FuzzyDataFrame equivalent
    fuzzy_df = FuzzyDataFrame({
        'score': fuzzarray_scores,    # Each element is a fuzzy number
        'rating': fuzzarray_ratings   # Preserving uncertainty information
    })
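
To make "membership and non-membership degrees" concrete: for a q-rung orthopair fuzzy 
number (the ``qrofn`` type used throughout this document), both degrees lie in ``[0, 1]`` 
and satisfy ``md**q + nmd**q <= 1``. The following is a minimal plain-Python sketch of that 
constraint. ``OrthopairSketch`` is a hypothetical illustration, not an AxisFuzzy class.

.. code-block:: python

    from dataclasses import dataclass


    @dataclass(frozen=True)
    class OrthopairSketch:
        """Hypothetical stand-in for one q-rung orthopair fuzzy number."""
        md: float   # membership degree
        nmd: float  # non-membership degree
        q: int = 2  # rung parameter

        def __post_init__(self):
            if not (0.0 <= self.md <= 1.0 and 0.0 <= self.nmd <= 1.0):
                raise ValueError("degrees must lie in [0, 1]")
            # Small tolerance guards against floating-point rounding.
            if self.md ** self.q + self.nmd ** self.q > 1.0 + 1e-12:
                raise ValueError("md**q + nmd**q must not exceed 1")


    # "Approximately 0.75": valid because 0.5625 + 0.04 <= 1
    approx_075 = OrthopairSketch(md=0.75, nmd=0.2)
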

**Core Design Principles**

FuzzyDataFrame follows several key design principles that make it both powerful and accessible:

**Pandas-Inspired Interface**: If you know how to use pandas DataFrame, you already understand 
most of FuzzyDataFrame's interface. Attributes such as ``shape``, ``columns``, and ``index``, 
and basic indexing operations, work as you'd expect.

**Fuzzarray Foundation**: Each column is a Fuzzarray, AxisFuzzy's high-performance fuzzy array 
structure. This ensures efficient storage and computation while maintaining the full richness 
of fuzzy information.

**Type Consistency**: All columns in a FuzzyDataFrame share the same fuzzy type (mtype), ensuring 
mathematical operations between columns are well-defined and meaningful.

**Future-Ready Architecture**: While currently built on pandas infrastructure, FuzzyDataFrame 
is designed so that it could later migrate to a polars backend for better performance.

**Key Structural Characteristics**

Understanding FuzzyDataFrame's structure helps you work with it effectively:

- **Column-Oriented Storage**: Each column is an independent Fuzzarray containing fuzzy numbers
- **Labeled Axes**: Both rows and columns have labels, just like pandas DataFrame
- **Homogeneous Fuzzy Type**: All fuzzy numbers in the DataFrame share the same mtype (e.g., 'qrofn')
- **Index Alignment**: Row and column operations respect pandas-style index alignment
- **Memory Efficiency**: Leverages Fuzzarray's backend system for optimized memory usage
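
The characteristics above can be sketched with a toy container in plain Python. This is 
only an illustration of the column-oriented idea (one object per column, shared row labels, 
one fuzzy type for the whole table). ``ColumnTableSketch`` is hypothetical and says nothing 
about how FuzzyDataFrame is implemented internally.

.. code-block:: python

    class ColumnTableSketch:
        """Hypothetical toy container: one list per column, one shared mtype."""

        def __init__(self, columns, mtype, index=None):
            lengths = {len(values) for values in columns.values()}
            if len(lengths) > 1:
                raise ValueError("all columns must have the same length")
            n_rows = lengths.pop() if lengths else 0
            self.columns = list(columns)
            self.index = list(index) if index is not None else list(range(n_rows))
            if len(self.index) != n_rows:
                raise ValueError("index length must match the number of rows")
            self.mtype = mtype
            self._data = dict(columns)

        @property
        def shape(self):
            return (len(self.index), len(self.columns))

        def __getitem__(self, name):
            # Column access returns the stored column object itself, not a copy.
            return self._data[name]


    table = ColumnTableSketch(
        {"score": [(0.8, 0.1), (0.7, 0.2)], "rating": [(0.6, 0.3), (0.9, 0.1)]},
        mtype="qrofn",
        index=["a", "b"],
    )
    print(table.shape)  # (2, 2)
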

**Relationship to AxisFuzzy Ecosystem**

FuzzyDataFrame is not an isolated component; it is deeply integrated with AxisFuzzy's broader 
ecosystem:

- **Components**: Analysis components can consume and produce FuzzyDataFrame objects
- **Pipelines**: FuzzyDataFrame flows seamlessly through analysis pipelines
- **Models**: High-level models can work directly with FuzzyDataFrame inputs and outputs
- **Contracts**: Type contracts ensure FuzzyDataFrame compatibility across the system

Why FuzzyDataFrame Matters
~~~~~~~~~~~~~~~~~~~~~~~~~~

Traditional data analysis assumes your data is precise and certain. But real-world scenarios 
often involve uncertainty, subjective judgments, and imprecise measurements. FuzzyDataFrame 
addresses these limitations in several crucial ways.

**Preserving Information Richness**

When you convert fuzzy data to crisp numbers (like taking just the membership degree), you lose 
valuable information about uncertainty and confidence. FuzzyDataFrame preserves the complete 
fuzzy representation throughout your entire analysis workflow.

Consider a customer satisfaction survey where responses like "somewhat satisfied" contain 
inherent ambiguity. Traditional approaches might convert this to a single number like ``3.5``. 
FuzzyDataFrame preserves the uncertainty, allowing your analysis to account for the fact that 
this rating could reasonably range from ``3.0`` to ``4.0`` with varying degrees of confidence.
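
A small numeric illustration of that loss, in plain Python: the two hypothetical responses 
below share the same membership degree, so they become indistinguishable once only that 
degree is kept. The hesitation formula is the usual one for q-rung orthopair fuzzy numbers; 
the values are made up for illustration.

.. code-block:: python

    q = 2
    confident = (0.7, 0.7)   # (md, nmd): almost nothing left undecided
    hesitant = (0.7, 0.1)    # same md, but much weaker evidence against


    def hesitation(md, nmd, q):
        """Indeterminacy degree: (1 - md**q - nmd**q) ** (1/q)."""
        return max(0.0, 1.0 - md ** q - nmd ** q) ** (1.0 / q)


    # Keeping only the membership degree collapses both responses to 0.7 ...
    crisp_values = [confident[0], hesitant[0]]

    # ... while the full pairs still differ in how much remains undecided.
    print(round(hesitation(*confident, q), 4))  # 0.1414
    print(round(hesitation(*hesitant, q), 4))   # 0.7071
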

**Familiar Yet Powerful Interface**

FuzzyDataFrame leverages pandas conventions, dramatically reducing the learning curve. If you 
can work with pandas DataFrame, you can work with FuzzyDataFrame. This familiarity accelerates 
adoption while providing access to sophisticated fuzzy analysis capabilities.

.. code-block:: python

    # Familiar pandas-style operations
    print(fuzzy_df.shape)           # (100, 5)
    print(fuzzy_df.columns)         # ['feature_1', 'feature_2', ...]
    column_data = fuzzy_df['score'] # Returns a Fuzzarray
    
    # But with fuzzy-aware semantics
    fuzzy_subset = fuzzy_df[fuzzy_df.columns[:3]]  # Maintains fuzzy properties

**Performance at Scale**

FuzzyDataFrame is built on Fuzzarray's efficient backend system, which optimizes memory usage 
and computational performance. This means you can work with large fuzzy datasets without 
sacrificing speed or consuming excessive memory.

The backend system automatically selects the most efficient representation for your specific 
fuzzy number type and operations, ensuring that fuzzy computations scale to real-world datasets.

**Seamless Ecosystem Integration**

Perhaps most importantly, FuzzyDataFrame integrates seamlessly with AxisFuzzy's analysis 
ecosystem. You can:

- Feed FuzzyDataFrame directly into analysis components
- Use it as input/output for fuzzy pipelines
- Apply high-level models that expect fuzzy tabular data
- Leverage the contract system for type-safe data flow

This integration means you can build sophisticated fuzzy analysis workflows without worrying 
about data format conversions or compatibility issues.

**Real-World Applications**

FuzzyDataFrame excels in scenarios where uncertainty and imprecision are inherent:

- **Decision Support Systems**: Where criteria have subjective weights and uncertain outcomes
- **Risk Assessment**: Where probabilities and impacts contain inherent uncertainty
- **Quality Evaluation**: Where ratings and scores reflect subjective judgments
- **Sensor Data Analysis**: Where measurements contain noise and calibration uncertainty
- **Expert Systems**: Where domain knowledge involves linguistic variables and approximate reasoning

By preserving and working with uncertainty rather than discarding it, FuzzyDataFrame enables 
more robust and realistic analysis of complex real-world problems.

Creating and Initializing FuzzyDataFrame
-----------------------------------------

FuzzyDataFrame provides flexible construction patterns to accommodate different data sources 
and use cases. Whether you're starting with crisp data, existing fuzzy arrays, or building 
from scratch, there's an appropriate construction approach.

Basic Construction Patterns
~~~~~~~~~~~~~~~~~~~~~~~~~~~

**Direct Construction from Fuzzarray Dictionary**

Create a FuzzyDataFrame directly from a dictionary mapping column names to Fuzzarray objects:

.. code-block:: python

    from axisfuzzy.analysis.dataframe import FuzzyDataFrame
    from axisfuzzy import fuzzyarray, fuzzynum

    # Create fuzzy arrays
    scores = fuzzyarray([
        fuzzynum((0.8,0.1), q=2),
        fuzzynum((0.7,0.2), q=2)
    ])

    # Construct FuzzyDataFrame
    fuzzy_df = FuzzyDataFrame({'performance': scores})
    print(fuzzy_df.shape)    # (2, 1)
    print(fuzzy_df)

output::

      performance
    0   <0.8,0.1>
    1   <0.7,0.2>

**Construction with Custom Index and Columns**

Specify custom index and column labels for meaningful data organization:

.. code-block:: python

    import pandas as pd

    fuzzy_df = FuzzyDataFrame(
        data={'q1_performance': scores},  # Key names must match the column labels
        index=pd.Index(['product_a', 'product_b'], name='products'),
        columns=pd.Index(['q1_performance'], name='quarters')
    )
    print(fuzzy_df)

output::

    quarters  q1_performance
    products                
    product_a      <0.8,0.1>
    product_b      <0.7,0.2>

Converting from Pandas DataFrame
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The most common scenario involves converting crisp data into fuzzy representations using 
the ``from_pandas()`` class method.

**Basic Conversion Process**

.. code-block:: python

    import pandas as pd
    from axisfuzzy.fuzzifier import Fuzzifier

    # Existing crisp data
    sensor_data = pd.DataFrame({
        'temperature': [20.5, 25.3, 18.7],
        'humidity': [65.2, 70.1, 58.9]
    })

    # Configure fuzzification
    fuzzifier = Fuzzifier(
        mf='gaussmf',
        mtype='qrofn',
        q=2,
        mf_params=[{'sigma': 10, 'c': 30}]
    )

    # Convert to FuzzyDataFrame
    fuzzy_data = FuzzyDataFrame.from_pandas(sensor_data, fuzzifier)
    print(f"Fuzzy type: {fuzzy_data.mtype}")

**What Happens During Conversion**

The ``from_pandas()`` method performs these operations:

1. **Column-wise Fuzzification**: Each column is processed by the fuzzifier
2. **Structure Preservation**: Original index and column labels are maintained
3. **Type Consistency**: All fuzzy numbers share the same mtype
4. **Validation**: Ensures proper fuzzifier configuration
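
The four steps can be sketched on plain dictionaries and lists. This is not the 
``from_pandas()`` implementation; ``fuzzify_table_sketch`` and ``toy_rule`` are hypothetical 
names, and the toy rule merely stands in for a real fuzzifier.

.. code-block:: python

    def fuzzify_table_sketch(table, index, fuzzify_value, mtype):
        """Hypothetical helper mirroring the four steps on dicts and lists."""
        # Step 4 (validation) runs first: reject unusable input early.
        if not callable(fuzzify_value):
            raise TypeError("fuzzify_value must be callable")
        if any(len(values) != len(index) for values in table.values()):
            raise ValueError("every column needs one value per index label")

        # Step 1: each column is converted independently of the others.
        fuzzy_columns = {
            name: [fuzzify_value(x) for x in values]
            for name, values in table.items()
        }
        # Steps 2 and 3: labels are carried over, one mtype tags the result.
        return {"columns": fuzzy_columns, "index": list(index), "mtype": mtype}


    def toy_rule(x):
        """Toy stand-in for a fuzzifier: clamp x/100 into [0, 1]."""
        md = min(max(x / 100.0, 0.0), 1.0)
        return (md, 1.0 - md)


    result = fuzzify_table_sketch(
        {"temperature": [20.0, 25.0], "humidity": [65.0, 70.0]},
        index=["s1", "s2"],
        fuzzify_value=toy_rule,
        mtype="qrofn",
    )
    print(result["index"])                   # ['s1', 's2']
    print(result["columns"]["temperature"])  # [(0.2, 0.8), (0.25, 0.75)]
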

Using the Pandas Accessor
~~~~~~~~~~~~~~~~~~~~~~~~~

The pandas accessor provides seamless integration with existing pandas workflows through 
the ``.fuzzy`` accessor.

**Basic Accessor Usage**

.. code-block:: python

   # Existing pandas workflow
   data = pd.DataFrame({
       'feature_1': [1.2, 2.3, 1.8],
       'feature_2': [0.8, 1.5, 1.1]
   })

   # Configure and convert
   fuzzifier = Fuzzifier(
       mf='gaussmf',
       mtype='qrofn',
       q=2,
       mf_params=[{'sigma': 10, 'c': 30}]
   )

   fuzzy_data = data.fuzzy.to_fuzz_dataframe(fuzzifier)

**Integration with Analysis Workflows**

The accessor integrates with AxisFuzzy's analysis ecosystem:

.. code-block:: python

   from axisfuzzy.analysis.pipeline import FuzzyPipeline
   
   # Execute pipeline directly from pandas DataFrame
   # pipeline = FuzzyPipeline()
   # result = data.fuzzy.run(pipeline, fuzzifier=fuzzifier)

**Construction Best Practices**

When creating FuzzyDataFrame objects, follow these guidelines:

**Choose the Right Method**:

- Use ``from_pandas()`` for converting crisp data
- Use direct construction for existing Fuzzarray objects
- Use the accessor for pandas workflow integration

**Ensure Consistency**:

- All Fuzzarray columns must have the same length
- All fuzzy numbers should share the same mtype
- Maintain proper index alignment

**Memory Considerations**:

- Process large datasets in chunks when necessary
- Choose appropriate membership function parameters
- Consider backend implications of your mtype choice


Working with FuzzyDataFrame
---------------------------

Creating Your First FuzzyDataFrame
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Before exploring FuzzyDataFrame operations, let's create a sample dataset that we'll 
use throughout this section. This example demonstrates the typical workflow of 
converting crisp data into fuzzy representations.

.. code-block:: python

    import pandas as pd
    from axisfuzzy.analysis.dataframe import FuzzyDataFrame
    from axisfuzzy.fuzzifier import Fuzzifier

    # Create sample crisp data
    crisp_data = pd.DataFrame({
        'temperature': [20.5, 25.3, 18.7, 22.1, 19.8],
        'humidity': [65.2, 70.1, 58.9, 67.5, 62.3],
        'pressure': [78.2, 46.8, 55.5, 57.1, 79.7]
    }, index=['sensor_1', 'sensor_2', 'sensor_3', 'sensor_4', 'sensor_5'])

    # Configure fuzzifier for converting crisp values to fuzzy numbers
    fuzzifier = Fuzzifier(
        mf='gaussmf',           # Gaussian membership function
        mtype='qrofn',          # q-rung orthopair fuzzy numbers
        q=2,                    # q-rung parameter
        mf_params=[{'sigma': 40, 'c': 50}]  # Gaussian parameters
    )

    # Create FuzzyDataFrame from crisp data
    fdf = FuzzyDataFrame.from_pandas(crisp_data, fuzzifier)
    print(fdf)

output:: 

                  temperature         humidity         pressure
    sensor_1  <0.7619,0.6399>  <0.9303,0.3528>    <0.78,0.6178>
    sensor_2  <0.8264,0.5541>  <0.8814,0.4617>       <0.9968,0>
    sensor_3  <0.7363,0.6693>  <0.9756,0.1957>  <0.9906,0.0934>
    sensor_4  <0.7841,0.6126>  <0.9087,0.4052>   <0.9844,0.145>
    sensor_5   <0.752,0.6515>  <0.9538,0.2832>  <0.7591,0.6433>
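
The printed values can be checked by hand. The sketch below evaluates the Gaussian 
membership function ``exp(-(x - c)**2 / (2 * sigma**2))`` with the parameters above and 
derives the non-membership degree as ``(1 - md**q - pi**q) ** (1/q)``. The hesitation value 
``pi = 0.1`` is an assumption inferred from the printed output, not taken from the AxisFuzzy 
source, and ``gauss_pair`` is a hypothetical helper.

.. code-block:: python

    import math


    def gauss_pair(x, sigma=40.0, c=50.0, q=2, pi=0.1):
        """Return (md, nmd) for one crisp value; pi is an assumed hesitation."""
        md = math.exp(-((x - c) ** 2) / (2.0 * sigma ** 2))
        # Clip at zero when md**q + pi**q would exceed 1.
        nmd = max(0.0, 1.0 - md ** q - pi ** q) ** (1.0 / q)
        return round(md, 4), round(nmd, 4)


    print(gauss_pair(20.5))  # (0.7619, 0.6399), temperature of sensor_1
    print(gauss_pair(46.8))  # (0.9968, 0.0), pressure of sensor_2
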

Now that we have our FuzzyDataFrame ``fdf``, let's explore its capabilities and operations.

Understanding FuzzyDataFrame Fundamentals
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

FuzzyDataFrame serves as your primary tool for organizing and manipulating fuzzy data 
in a structured, tabular format. Think of it as a specialized version of pandas 
DataFrame, but designed specifically to handle the complexities of fuzzy numbers 
while maintaining familiar, intuitive operations.

Unlike traditional data structures that work with crisp values, FuzzyDataFrame 
manages collections of fuzzy numbers (Fuzzarray objects) as columns, ensuring 
that all fuzzy operations preserve uncertainty information throughout your analysis 
workflow.

**Core Architecture**

FuzzyDataFrame organizes data in a column-oriented structure where:

- Each **column** contains a Fuzzarray (a collection of fuzzy numbers)
- Each **row** represents a data record with fuzzy values across different attributes
- All columns must share the same **mtype** (fuzzy number type) for consistency
- Index and column labels follow pandas conventions for familiar navigation

Essential Properties and Information
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

FuzzyDataFrame provides comprehensive properties to understand your data structure 
and content. These properties help you quickly assess data dimensions, types, and 
organization patterns.

**Dimensional Information**

Understand the size and structure of your fuzzy dataset:

.. code-block:: python

   # Get shape as (rows, columns) tuple
   rows, cols = fdf.shape
   print(f"Dataset contains {rows} records with {cols} fuzzy attributes")
   
   # Alternative: get row count directly
   num_records = len(fdf)
   print(f"Total records: {num_records}")

**Index and Column Management**

Access and examine the organizational structure:

.. code-block:: python

   # Examine row labels (index)
   print("Row labels:", fdf.index.tolist())
   
   # Examine column names
   print("Fuzzy attributes:", fdf.columns.tolist())
   
   # Check if index has names
   if fdf.index.name:
       print(f"Index represents: {fdf.index.name}")

**Fuzzy Type Information**

Verify the consistency of fuzzy number types across your dataset:

.. code-block:: python

   # Check the fuzzy number type
   print(f"Fuzzy type: {fdf.mtype}")
   
   # All columns share this fuzzy number type
   # (e.g., every column holds 'qrofn' values)

Column Operations and Data Access
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

FuzzyDataFrame provides intuitive methods for accessing and manipulating individual 
columns and data elements, maintaining the fuzzy nature of your data throughout 
all operations.

**Column Retrieval and Inspection**

Access individual columns as Fuzzarray objects for detailed analysis:

.. code-block:: python

   # Retrieve a specific fuzzy attribute
   temperature_data = fdf['temperature']
   print(f"Temperature column type: {type(temperature_data)}")  # Fuzzarray
   
   # Examine column properties
   print(f"Column length: {len(temperature_data)}")
   print(f"Column fuzzy type: {temperature_data.mtype}")

**Adding and Modifying Columns**

Extend your dataset with new fuzzy attributes:

.. code-block:: python

    from axisfuzzy import fuzzynum, fuzzyarray

    # Prepare one fuzzy value per row
    pressure_values = [fuzzynum((0.7,0.3), q=2) for _ in range(len(fdf))]
    new_pressure_column = fuzzyarray(pressure_values)

    # Assign the column. fdf already has a 'pressure' column, so this
    # replaces it; assigning to a new label would add a column instead.
    fdf['pressure'] = new_pressure_column

    # Verify the result
    print(f"Updated columns: {fdf.columns.tolist()}")


**Element-Level Access**

Retrieve and examine individual fuzzy numbers:

.. code-block:: python

   # Access specific fuzzy values
   first_temperature = fdf['temperature'][0]
   print(f"First temperature reading: {first_temperature}")
   
   # Access by column label, then by row position
   specific_value = fdf['humidity'][2]  # Third humidity reading
   print(f"Specific humidity value: {specific_value}")

Data Inspection and Visualization
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Effective fuzzy data analysis requires understanding the content and characteristics 
of your dataset. FuzzyDataFrame provides multiple approaches for inspecting and 
visualizing fuzzy information.

**Dataset Overview and Display**

Get a comprehensive view of your fuzzy dataset:

.. code-block:: python

   # Display the complete FuzzyDataFrame
   print(fdf)
   
   # This shows:
   # - All fuzzy values in readable format
   # - Row and column labels
   # - Automatic formatting for large datasets

**Detailed Fuzzy Number Examination**

Inspect the internal structure of individual fuzzy numbers:

.. code-block:: python

    # Select a specific fuzzy value for detailed analysis
    sample_value = fdf['temperature'][0]

    # Examine fuzzy number components
    print(f"Fuzzy value: {sample_value}")
    print(f"membership and non-membership degree: [{sample_value.md}, {sample_value.nmd}]")
    print(f"Score value: {sample_value.score}")

**Data Quality and Consistency Checks**

Verify the integrity and consistency of your fuzzy dataset:

.. code-block:: python

   # Check for empty or invalid data
   if fdf.shape[0] == 0:
       print("Warning: Dataset is empty")
   
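   # Verify that every column reports the DataFrame's mtype
   same_mtype = all(fdf[col].mtype == fdf.mtype for col in fdf.columns)
   print(f"All columns have same mtype ({fdf.mtype}): {same_mtype}")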
   
   # Check for proper column lengths
   column_lengths = [len(fdf[col]) for col in fdf.columns]
   if len(set(column_lengths)) == 1:
       print("All columns have consistent length")
   else:
       print("Warning: Column length mismatch detected")

**Working with Subsets and Selections**

Extract and work with portions of your fuzzy dataset:

.. code-block:: python

   # Work with specific columns (individual column access)
   temperature_data = fdf['temperature']
   humidity_data = fdf['humidity']
   
   # Create a subset FuzzyDataFrame with selected columns
   environmental_data = FuzzyDataFrame({
       'temperature': fdf['temperature'],
       'humidity': fdf['humidity']
   }, index=fdf.index)
   
   # Access multiple values from a column
   first_three_temps = [fdf['temperature'][i] for i in range(3)]
   print(f"First three temperature readings: {first_three_temps}")
   
   # Examine data patterns
   for col_name in fdf.columns:
       sample_val = fdf[col_name][0]
       print(f"{col_name}: {sample_val}")

This comprehensive approach to working with FuzzyDataFrame ensures you can effectively 
manage, inspect, and understand your fuzzy data while maintaining the mathematical 
rigor required for accurate fuzzy analysis.


Integration with Analysis Ecosystem
------------------------------------

FuzzyDataFrame serves as the central data structure that connects different parts 
of AxisFuzzy's analysis ecosystem. Think of it as the "common language" that allows 
various analysis tools to work together seamlessly. This section shows you how 
FuzzyDataFrame integrates with the pandas accessor and with the three main parts 
of the ecosystem: components, contracts, and models.

Pandas Accessor Integration
~~~~~~~~~~~~~~~~~~~~~~~~~~~

The most user-friendly way to work with FuzzyDataFrame is through the ``.fuzzy`` 
accessor, which extends any pandas DataFrame with fuzzy analysis capabilities.

**Converting Pandas to FuzzyDataFrame**

Transform your regular pandas data into fuzzy representation:

.. code-block:: python

    import pandas as pd
    from axisfuzzy.fuzzifier import Fuzzifier

    # Your regular pandas DataFrame
    df = pd.DataFrame({
        'temperature': [18.5, 22.3, 25.1, 19.8],
        'humidity': [17.2, 26.8, 27.9, 18.3]
    })

    # Create a fuzzifier with triangular membership function
    fuzzifier = Fuzzifier(
        mf='trimf',
        mtype='qrofn',
        q=2,
        mf_params={'a': 15.0, 'b': 22.0, 'c': 30.0}
    )

    # Convert to FuzzyDataFrame using the .fuzzy accessor
    fuzzy_df = df.fuzzy.to_fuzz_dataframe(fuzzifier=fuzzifier)

    # Now you have a FuzzyDataFrame ready for analysis
    print(fuzzy_df)

output::

           temperature         humidity
    0     <0.5,0.8602>   <0.3143,0.944>
    1  <0.9625,0.2522>      <0.4,0.911>
    2  <0.6125,0.7841>  <0.2625,0.9597>
    3   <0.6857,0.721>  <0.4714,0.8762>
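
The membership degrees in this output follow directly from the triangular membership 
function with ``a=15``, ``b=22``, ``c=30``. A minimal sketch that reproduces the first 
component of the ``temperature`` column (``trimf_sketch`` is a hypothetical helper, not the 
library's ``trimf``):

.. code-block:: python

    def trimf_sketch(x, a, b, c):
        """Triangular membership: rises from a to the peak b, falls to c."""
        if x <= a or x >= c:
            return 0.0
        if x <= b:
            return (x - a) / (b - a)
        return (c - x) / (c - b)


    temperatures = [18.5, 22.3, 25.1, 19.8]
    degrees = [round(trimf_sketch(t, 15.0, 22.0, 30.0), 4) for t in temperatures]
    print(degrees)  # [0.5, 0.9625, 0.6125, 0.6857]
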

**Running Analysis Models**

Execute complex analysis workflows directly from pandas:

.. code-block:: python

   from axisfuzzy.analysis.app.model import Model
   
   # 'my_analysis_model' is assumed to be a pre-built Model instance
   results = df.fuzzy.run(my_analysis_model, weights=[0.6, 0.4])
   
   # The accessor automatically handles data conversion and injection

Component System Integration
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Components are the building blocks of fuzzy analysis. ``FuzzyDataFrame`` flows through 
these components, getting transformed at each step.

**Basic Component Workflow**

Here's how components work with ``FuzzyDataFrame``:

.. code-block:: python

   import pandas as pd
   from axisfuzzy.analysis.component.basic import (
       ToolFuzzification, ToolNormalization
   )
   from axisfuzzy.fuzzifier import Fuzzifier
   
   # Start with crisp data
   crisp_data = pd.DataFrame({'score1': [85, 92, 78], 'score2': [88, 85, 90]})
   
   # Step 1: Normalize the crisp data first
   normalizer = ToolNormalization(method='min_max')
   normalized_data = normalizer.run(crisp_data)  # DataFrame → DataFrame
   
   # Step 2: Convert normalized data to fuzzy data
   # Create fuzzifier with triangular membership function
   fuzzifier_config = Fuzzifier(
       mf='trimf',
       mtype='qrofn',
       q=2,
       mf_params={'a': 0.0, 'b': 0.5, 'c': 1.0}  # Spans the normalized range [0, 1]
   )
   fuzzifier = ToolFuzzification(fuzzifier=fuzzifier_config)
   fuzzy_data = fuzzifier.run(normalized_data)  # Returns FuzzyDataFrame
   
   # Step 3: Access and work with fuzzy data
   # FuzzyDataFrame provides access to underlying Fuzzarray objects
   print(f"Fuzzy data shape: {fuzzy_data.shape}")
   print(f"Columns: {fuzzy_data.columns}")
   
   # Access individual columns as Fuzzarray for further processing
   score1_fuzzy = fuzzy_data['score1']  # Returns Fuzzarray
   score2_fuzzy = fuzzy_data['score2']  # Returns Fuzzarray
   
   # Now you can use Fuzzarray's built-in aggregation methods
   score1_mean = score1_fuzzy.mean()  # Fuzzy mean using extension system
   score2_mean = score2_fuzzy.mean()  # Fuzzy mean using extension system
   
   print(f"Score1 fuzzy mean: {score1_mean}")
   print(f"Score2 fuzzy mean: {score2_mean}")

**Component Chaining**

Components can be chained together for complex workflows. The key is to ensure 
contract compatibility between components:

.. code-block:: python

   from axisfuzzy.analysis.component.basic import (
       ToolFuzzification, ToolNormalization, ToolSimpleAggregation
   )
   from axisfuzzy.fuzzifier import Fuzzifier
   import pandas as pd
   
   # Sample data
   crisp_data = pd.DataFrame({'score1': [85, 92, 78], 'score2': [88, 85, 90]})
   
   # Create components
   normalizer = ToolNormalization(method='min_max')
   fuzzifier_config = Fuzzifier(
       mf='trimf',
       mtype='qrofn',
       q=2,
       mf_params={'a': 0.0, 'b': 0.5, 'c': 1.0}  # Spans the normalized range [0, 1]
   )
   fuzzifier = ToolFuzzification(fuzzifier=fuzzifier_config)
   
   # ✅ Correct chaining: normalize → fuzzify → access individual arrays
   normalized_data = normalizer.run(crisp_data)      # DataFrame → DataFrame
   fuzzy_data = fuzzifier.run(normalized_data)       # DataFrame → FuzzyDataFrame
   
   # For aggregation, extract Fuzzarray from FuzzyDataFrame
   score1_fuzzy = fuzzy_data['score1']  # Extract Fuzzarray
   score2_fuzzy = fuzzy_data['score2']  # Extract Fuzzarray
   
   # Use Fuzzarray's built-in aggregation methods
   score1_mean = score1_fuzzy.mean()    # Fuzzy aggregation
   score2_mean = score2_fuzzy.mean()    # Fuzzy aggregation
   
   print(f"Final scores: {score1_mean}, {score2_mean}")
   
   # Alternative: for crisp aggregation, run ToolSimpleAggregation on the
   # crisp (normalized) DataFrame; this path carries no fuzzy information
   crisp_aggregator = ToolSimpleAggregation(operation='mean')
   crisp_result = crisp_aggregator.run(normalized_data)  # Works on crisp data

Contract System and Type Safety
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The contract system ensures that FuzzyDataFrame is used correctly throughout your 
analysis pipeline. It's like having a safety net that catches data type errors 
before they cause problems.

**Understanding Contracts**

Contracts define what type of data a function expects and returns:

.. code-block:: python

   from axisfuzzy.analysis.contracts.decorator import contract
   from axisfuzzy.analysis.build_in import ContractCrispTable, ContractFuzzyTable
   from axisfuzzy.analysis.component.basic import ToolFuzzification
   from axisfuzzy.fuzzifier import Fuzzifier
   
   @contract
   def my_analysis_function(data: ContractCrispTable) -> ContractFuzzyTable:
       """
       This function expects crisp data and returns fuzzy data.
       The contract decorator automatically validates inputs and outputs.
       """
       # Convert crisp data to FuzzyDataFrame
       fuzzifier_engine = Fuzzifier(mf='trimf', mtype='qrofn', 
                                   mf_params={'a': 0, 'b': 0.5, 'c': 1})
       fuzzifier = ToolFuzzification(fuzzifier=fuzzifier_engine)
       return fuzzifier.run(data)
   
   # The contract system automatically validates:
   # - Input: Must be a pandas DataFrame with numeric data
   # - Output: Must be a FuzzyDataFrame
   result = my_analysis_function(crisp_data)

**Built-in Contracts for FuzzyDataFrame**

AxisFuzzy provides several contracts specifically for FuzzyDataFrame:

.. code-block:: python

   from axisfuzzy.analysis.build_in import (
       ContractFuzzyTable,    # For FuzzyDataFrame
       ContractCrispTable,    # For pandas DataFrame with numeric data
       ContractWeightVector   # For weight arrays
   )
   
   @contract
   def weighted_fuzzy_analysis(
       fuzzy_data: ContractFuzzyTable, 
       weights: ContractWeightVector
   ) -> ContractFuzzyTable:
       # Your analysis logic here
       # Apply weights to fuzzy data and return processed result
       processed_fuzzy_data = fuzzy_data  # Placeholder for actual processing
       return processed_fuzzy_data

Model API Integration
~~~~~~~~~~~~~~~~~~~~~

The Model API provides the highest level of abstraction, allowing you to build 
complex analysis workflows that feel like writing regular Python classes.

**Creating Analysis Models**

Build reusable models that work with FuzzyDataFrame:

.. code-block:: python

   from axisfuzzy.analysis.app.model import Model
   from axisfuzzy.analysis.build_in import ContractCrispTable, ContractFuzzyTable
   from axisfuzzy.analysis.component.basic import (
       ToolFuzzification, ToolNormalization, ToolSimpleAggregation
   )
   from axisfuzzy.fuzzifier import Fuzzifier
   
   class EnvironmentalAnalysisModel(Model):
       def __init__(self, fuzzifier_type='triangular'):
           super().__init__()
           # Keep the constructor argument so get_config() can report it
           self.fuzzifier_type = fuzzifier_type
           # Define your analysis components
           fuzzifier_engine = Fuzzifier(mf='trimf', mtype='qrofn',
                                        mf_params={'a': 0, 'b': 0.5, 'c': 1})
           self.fuzzifier = ToolFuzzification(fuzzifier=fuzzifier_engine)
           self.normalizer = ToolNormalization(method='min_max')
           self.aggregator = ToolSimpleAggregation(operation='mean')
       
       def forward(self, environmental_data: ContractCrispTable) -> ContractFuzzyTable:
           # Step 1: Normalize the crisp data first
           normalized_data = self.normalizer(environmental_data)
           # Step 2: Convert normalized crisp data to fuzzy representation
           fuzzy_data = self.fuzzifier(normalized_data)
           # Step 3: Return the fuzzy table as-is. self.aggregator
           # (ToolSimpleAggregation) expects a ContractCrispTable, so it is not
           # applied here; extract columns as Fuzzarray for fuzzy aggregation
           return fuzzy_data
       
       def get_config(self):
           return {'fuzzifier_type': self.fuzzifier_type}

**Using Models**

Once built, models are easy to use:

.. code-block:: python

   # Create and build the model
   model = EnvironmentalAnalysisModel()
   model.build()  # This creates the internal pipeline
   
   # Use the model
   environmental_data = pd.DataFrame({
       'temperature': [20.5, 23.1, 18.9],
       'humidity': [65.2, 58.7, 72.1]
   })
   
   result = model.run(environmental_data=environmental_data)
   
   # Or use with pandas accessor for convenience
   result = environmental_data.fuzzy.run(model)

This integration ecosystem makes FuzzyDataFrame a powerful bridge between different 
analysis approaches, from simple component-based processing to sophisticated 
model-driven workflows, all while maintaining type safety and ease of use.


Advanced Usage and Best Practices
----------------------------------

This section explores advanced techniques for maximizing FuzzyDataFrame's capabilities 
in production environments. Understanding these patterns helps you build robust, 
scalable fuzzy analysis workflows that leverage the full power of AxisFuzzy's 
architecture.

Performance Optimization Strategies
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

FuzzyDataFrame's performance characteristics are fundamentally shaped by its 
column-oriented architecture and integration with Fuzzarray's backend system. 
Understanding these design decisions helps you write efficient fuzzy analysis code.

**Memory Architecture and Optimization**

FuzzyDataFrame employs a Structure-of-Arrays (SoA) design where each column stores 
fuzzy numbers as separate Fuzzarray objects. This architecture provides significant 
performance advantages for analytical workloads:

.. code-block:: python

   # Column-wise operations are highly optimized
   temperature_data = fdf['temperature']  # Direct Fuzzarray access
   humidity_data = fdf['humidity']        # No data copying
   
   # Vectorized operations across entire columns
   comfort_index = temperature_data * 0.6 + humidity_data * 0.4
   
   # Memory-efficient column selection - create subset with individual column access
   subset_data = {
       'temperature': fdf['temperature'],
       'humidity': fdf['humidity'], 
       'pressure': fdf['pressure']
   }
   subset = FuzzyDataFrame(subset_data, index=fdf.index)
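
The general difference between an Array-of-Structures and a Structure-of-Arrays layout can 
be shown with the standard library alone. This sketch illustrates the layout idea only; how 
Fuzzarray's backends actually store their components is not shown here.

.. code-block:: python

    from array import array

    # Array-of-Structures: one (md, nmd) pair object per fuzzy number.
    aos = [(0.8, 0.1), (0.7, 0.2), (0.6, 0.3)]

    # Structure-of-Arrays: one contiguous buffer per component.
    soa_md = array("d", (md for md, _ in aos))
    soa_nmd = array("d", (nmd for _, nmd in aos))

    # A column-wide statistic touches a single buffer in the SoA layout ...
    mean_md_soa = sum(soa_md) / len(soa_md)

    # ... but has to unpack every pair in the AoS layout.
    mean_md_aos = sum(pair[0] for pair in aos) / len(aos)
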

**Backend-Aware Performance Patterns**

FuzzyDataFrame automatically leverages Fuzzarray's optimized backends for computational 
efficiency. Understanding these patterns helps you write performance-conscious code:

.. code-block:: python

   # Efficient: build a crisp view once, then process it in one vectorized call.
   # Here the membership degree of every cell is extracted for normalization;
   # `normalizer` stands for an already configured normalization component.
   crisp_data = pd.DataFrame({
       col: [float(fuzz_val.membership) for fuzz_val in fdf[col]]
       for col in fdf.columns
   }, index=fdf.index)
   normalized_scores = normalizer.run(crisp_data)  # Vectorized processing
   
   # Less efficient: Row-by-row processing
   # Avoid this pattern for large datasets
   results = []
   for i in range(len(fdf)):
       row_data = {col: fdf[col][i] for col in fdf.columns}
       results.append(process_single_row(row_data))

**Memory Management for Large Datasets**

When working with large fuzzy datasets, consider memory usage patterns:

.. code-block:: python

   import pandas as pd
   from axisfuzzy.analysis.dataframe import FuzzyDataFrame
   from axisfuzzy.analysis.pipeline import FuzzyPipeline
   from axisfuzzy.fuzzifier import Fuzzifier

   # Memory-efficient data loading
   def load_large_fuzzy_dataset(file_path, fuzzifier, chunk_size=10000):
       """Yield FuzzyDataFrame chunks one at a time to bound memory usage."""
       # With chunksize set, pandas returns an iterator of DataFrames
       for chunk in pd.read_csv(file_path, chunksize=chunk_size):
           # Yielding (instead of collecting into a list) keeps only the
           # current chunk alive while the caller processes it
           yield FuzzyDataFrame.from_pandas(chunk, fuzzifier)

   # Initialize required components
   fuzzifier = Fuzzifier(mtype='qrofn', q=2)
   analysis_pipeline = FuzzyPipeline()  # Configure as needed

   # Load and process data; only one input chunk is held in memory at a time
   results = []
   for chunk in load_large_fuzzy_dataset('large_dataset.csv', fuzzifier):
       results.append(analysis_pipeline.run(chunk))
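
The chunking idea itself does not depend on pandas or AxisFuzzy. The sketch below 
uses only the standard library and an in-memory CSV so it runs as-is; the helper 
name ``iter_csv_chunks`` and the sample columns are assumptions made for 
illustration. It shows a running aggregate computed while only one chunk is alive.

```python
import csv
import io
from itertools import islice


def iter_csv_chunks(handle, chunk_size):
    """Yield lists of at most `chunk_size` parsed rows from an open CSV file."""
    reader = csv.DictReader(handle)
    while True:
        # islice pulls the next `chunk_size` rows without reading the rest.
        chunk = list(islice(reader, chunk_size))
        if not chunk:
            return
        yield chunk


# Stand-in for a large file on disk (hypothetical data).
sample = io.StringIO(
    "temperature,humidity\n"
    "21.5,0.40\n"
    "23.0,0.55\n"
    "19.8,0.35\n"
    "25.1,0.60\n"
    "22.4,0.50\n"
)

chunk_sizes = []
running_total = 0.0
for chunk in iter_csv_chunks(sample, chunk_size=2):
    chunk_sizes.append(len(chunk))
    # Reduce each chunk immediately so it can be garbage-collected.
    running_total += sum(float(row["temperature"]) for row in chunk)
```

Reducing each chunk as it arrives, rather than storing every chunk, is what keeps 
peak memory proportional to the chunk size instead of the file size.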


Production-Ready Best Practices
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Building robust fuzzy analysis systems requires attention to data consistency, 
error handling, and integration patterns. These practices ensure your FuzzyDataFrame 
workflows are reliable and maintainable.

**Data Type Consistency and Validation**

Maintaining consistent fuzzy data types across your analysis workflow prevents 
subtle bugs and ensures predictable behavior:

.. code-block:: python

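   import pandas as pd
   from axisfuzzy.analysis.dataframe import FuzzyDataFrame
   from axisfuzzy.fuzzifier import Fuzzifier

   # Establish consistent fuzzy types early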
   def create_standardized_fuzzy_dataframe(crisp_data, analysis_config):
       """Create FuzzyDataFrame with consistent mtype across all columns."""
       fuzzifier = Fuzzifier(
           mtype=analysis_config['fuzzy_type'],  # e.g., 'qrofn'
           **analysis_config['fuzzifier_params']
       )
       
       # Validate input data before conversion
       if not all(pd.api.types.is_numeric_dtype(dtype) for dtype in crisp_data.dtypes):
           raise ValueError("All columns must contain numeric data for fuzzification")
       
       return FuzzyDataFrame.from_pandas(crisp_data, fuzzifier)
   
   # Verify mtype consistency in analysis pipelines
   def validate_fuzzy_compatibility(fdf1, fdf2):
       """Ensure two FuzzyDataFrames have compatible fuzzy types."""
       if fdf1.mtype != fdf2.mtype:
           raise TypeError(f"Incompatible fuzzy types: {fdf1.mtype} vs {fdf2.mtype}")
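
One reason type consistency matters can be shown with a small worked check. Assuming 
``qrofn`` denotes q-rung orthopair fuzzy numbers, a membership/non-membership pair is 
valid only if the sum of their q-th powers is at most 1, so the same pair can be valid 
under one ``q`` and invalid under another. The helper name below is hypothetical and 
not part of AxisFuzzy.

```python
def satisfies_qrofn_constraint(md, nmd, q, tol=1e-12):
    """Check the q-rung orthopair condition md**q + nmd**q <= 1."""
    if q < 1:
        raise ValueError("q must be at least 1")
    if not (0.0 <= md <= 1.0 and 0.0 <= nmd <= 1.0):
        return False
    # A small tolerance absorbs floating-point rounding at the boundary.
    return md ** q + nmd ** q <= 1.0 + tol


# The same pair is rejected for q = 1 and q = 2 but accepted for q = 3.
pair = (0.9, 0.6)
valid_for = [q for q in (1, 2, 3) if satisfies_qrofn_constraint(*pair, q)]
```

Data fuzzified with different parameters may therefore not be interchangeable, which 
is why checking compatibility before combining two frames is worthwhile.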

**Efficient Data Conversion Patterns**

Minimize computational overhead by optimizing data conversion workflows:

.. code-block:: python

   # Pattern 1: Batch conversion for multiple analyses
   class FuzzyAnalysisWorkflow:
       def __init__(self, fuzzifier):
           self.fuzzifier = fuzzifier
           self._fuzzy_cache = {}
       
       def get_fuzzy_data(self, data_key, crisp_data):
           """Cache fuzzy conversions to avoid repeated computation."""
           if data_key not in self._fuzzy_cache:
               self._fuzzy_cache[data_key] = FuzzyDataFrame.from_pandas(
                   crisp_data, self.fuzzifier
               )
           return self._fuzzy_cache[data_key]
   
   # Pattern 2: Incremental data processing
   def process_streaming_data(data_stream, fuzzifier, batch_size=1000):
       """Process streaming data in batches for memory efficiency."""
       batch = []
       
       for record in data_stream:
           batch.append(record)
           
           if len(batch) >= batch_size:
               yield FuzzyDataFrame.from_pandas(pd.DataFrame(batch), fuzzifier)
               batch = []
       
       # Flush the final partial batch so trailing records are not dropped
       if batch:
           yield FuzzyDataFrame.from_pandas(pd.DataFrame(batch), fuzzifier)
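
The batching logic in Pattern 2 can be exercised in isolation. The sketch below is a 
standard-library-only version with a hypothetical helper name, ``iter_batches``; it 
makes visible the detail that is easy to miss in streaming code, namely that the last 
batch is usually smaller than ``batch_size`` and still has to be emitted.

```python
def iter_batches(stream, batch_size):
    """Group items from any iterable into lists of at most `batch_size`."""
    if batch_size < 1:
        raise ValueError("batch_size must be positive")
    batch = []
    for item in stream:
        batch.append(item)
        if len(batch) >= batch_size:
            yield batch
            batch = []
    # Emit the remainder; without this the trailing records are lost.
    if batch:
        yield batch


batches = list(iter_batches(range(5), batch_size=2))
empty = list(iter_batches([], batch_size=2))
```

An empty stream yields no batches at all, so downstream code never receives an empty 
frame.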

**Seamless Ecosystem Integration**

Leverage FuzzyDataFrame's integration with AxisFuzzy's broader ecosystem for 
powerful analysis workflows:

.. code-block:: python

   # Integration with pandas accessor
   def enhanced_data_pipeline(crisp_data, fuzzifier, normalizer, aggregator):
       """Demonstrate seamless integration patterns.

       The fuzzifier, normalizer and aggregator are pre-configured
       components supplied by the caller.
       """
       # Traditional pandas preprocessing
       cleaned_data = crisp_data.dropna().reset_index(drop=True)
       
       # Smooth transition to fuzzy analysis
       fuzzy_data = cleaned_data.fuzzy.to_fuzz_dataframe(fuzzifier)
       
       # Component-based analysis
       normalized_data = normalizer.run(fuzzy_data)
       analysis_result = aggregator.run(normalized_data)
       
       return analysis_result
   
   # Integration with Model API
   from axisfuzzy.analysis.app.model import Model
   from axisfuzzy.analysis.component.basic import (
       ToolNormalization,
       ToolFuzzification,
       ToolSimpleAggregation,
   )
   from axisfuzzy.analysis.build_in import ContractCrispTable
   
   class ProductionAnalysisModel(Model):
       def __init__(self, fuzzifier):
           super().__init__()
           self.preprocessor = ToolNormalization()
           # The fuzzifier is injected rather than read from a global variable
           self.analyzer = ToolFuzzification(fuzzifier=fuzzifier)
           self.aggregator = ToolSimpleAggregation()
       
       def forward(self, input_data: ContractCrispTable):
           # Automatic FuzzyDataFrame handling
           normalized = self.preprocessor(input_data)
           fuzzy_data = self.analyzer(normalized)
           return self.aggregator(fuzzy_data)

**Error Handling and Robustness**

Implement comprehensive error handling for production reliability:

.. code-block:: python

   import logging

   import pandas as pd
   from axisfuzzy.analysis.dataframe import FuzzyDataFrame

   logger = logging.getLogger(__name__)

   def robust_fuzzy_analysis(crisp_data, fuzzifier, fallback_strategy='skip'):
       """Convert crisp data to a FuzzyDataFrame with input validation.

       Returns None on failure unless fallback_strategy is 'raise'.
       """
       try:
           # Validate input data
           if crisp_data.empty:
               raise ValueError("Input data is empty")
           
           # Check for required numeric types
           non_numeric_cols = [col for col in crisp_data.columns 
                              if not pd.api.types.is_numeric_dtype(crisp_data[col])]
           if non_numeric_cols:
               if fallback_strategy == 'skip':
                   # Drop offending columns and continue with the rest
                   crisp_data = crisp_data.drop(columns=non_numeric_cols)
               else:
                   raise TypeError(f"Non-numeric columns found: {non_numeric_cols}")
           
           # Create FuzzyDataFrame with validation
           return FuzzyDataFrame.from_pandas(crisp_data, fuzzifier)
           
       except Exception as e:
           logger.error("Fuzzy analysis failed: %s", e)
           if fallback_strategy == 'raise':
               raise
           return None
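
The column-filtering step of the fallback strategy can be tried without pandas. The 
following is a minimal sketch on a dict-of-lists table; the function name 
``select_numeric_columns`` and the sample table are hypothetical, and the ``'skip'`` 
and ``'raise'`` values mirror the strategies used above.

```python
from numbers import Real


def select_numeric_columns(table, fallback_strategy="skip"):
    """Return only the numeric columns of a dict-of-lists table.

    With 'skip', non-numeric columns are dropped; with any other
    strategy a TypeError names the offending columns.
    """
    def is_numeric(values):
        # bool is a Real subclass in Python, so exclude it explicitly.
        return all(isinstance(v, Real) and not isinstance(v, bool) for v in values)

    non_numeric = [name for name, values in table.items() if not is_numeric(values)]
    if non_numeric and fallback_strategy != "skip":
        raise TypeError(f"Non-numeric columns found: {non_numeric}")
    return {name: values for name, values in table.items() if name not in non_numeric}


table = {"score": [0.75, 0.82], "label": ["a", "b"], "rating": [4.2, 4.7]}

# 'skip' silently drops the text column.
kept = select_numeric_columns(table, fallback_strategy="skip")

# 'raise' surfaces the problem to the caller instead.
try:
    select_numeric_columns(table, fallback_strategy="raise")
    raised = False
except TypeError:
    raised = True
```

Dropping columns silently is convenient for exploratory work; in production it is 
often safer to log which columns were removed so the loss is visible.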

Future Evolution and Roadmap
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

FuzzyDataFrame is designed as an evolving platform that adapts to emerging 
computational paradigms and user needs. Understanding the planned evolution 
helps you prepare for future capabilities.

**Strategic Backend Migration to Polars**

.. note::
   **Polars Integration Roadmap**: AxisFuzzy is planning a strategic migration 
   from pandas to **Polars** as the underlying computational engine. Polars 
   (https://pola.rs/) is a high-performance DataFrame library written in Rust 
   with Python bindings, designed specifically for large-scale data processing 
   and analytical workloads.

The transition to **Polars** represents a fundamental architectural advancement 
that addresses the computational demands of large-scale fuzzy data analysis. 
This migration embodies AxisFuzzy's commitment to performance optimization 
while maintaining complete API compatibility.

**Core Performance Advantages**

**Polars** delivers transformative computational improvements through several 
key technological innovations:

- **Lazy Evaluation Engine**: Query optimization and computational graph analysis 
  reduce overhead for complex multi-step fuzzy operations
- **Native Parallelization**: Multi-threading capabilities leverage modern 
  multi-core architectures for fuzzy number computations
- **Memory Efficiency**: Columnar processing model aligns with FuzzyDataFrame's 
  architecture, optimizing memory utilization patterns
- **Rust-Based Performance**: Zero-copy operations and optimized algorithms 
  deliver substantial speed improvements
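
The lazy-evaluation point can be illustrated without Polars. The sketch below is a 
toy model in plain Python, not Polars code and not AxisFuzzy's planned implementation; 
``LazyColumn`` is a hypothetical name. It records element-wise steps and runs them in 
a single pass when ``collect()`` is called, which is the basic idea behind deferring 
and fusing a multi-step pipeline.

```python
class LazyColumn:
    """Records element-wise steps and runs them in one pass on collect()."""

    def __init__(self, values, steps=()):
        self._values = values
        self._steps = tuple(steps)

    def map(self, fn):
        # No work happens here; the step is only appended to the plan.
        return LazyColumn(self._values, self._steps + (fn,))

    def collect(self):
        # Fused execution: each element flows through every step once,
        # so no intermediate list is materialised between steps.
        out = []
        for value in self._values:
            for fn in self._steps:
                value = fn(value)
            out.append(value)
        return out


readings = [10.0, 20.0, 30.0]
# Two steps are planned: scale to [0, 1], then invert.
plan = LazyColumn(readings).map(lambda x: x / 40.0).map(lambda x: 1.0 - x)
result = plan.collect()
```

A real query engine goes further by reordering and pruning steps before execution, 
but the separation between building a plan and running it is the same.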

**API Compatibility Guarantee**

The **Polars** migration maintains complete backward compatibility:

.. code-block:: python

   # Current pandas-based implementation
   fuzzy_df = FuzzyDataFrame.from_pandas(crisp_data, fuzzifier)
   result = fuzzy_df['temperature'].apply(analysis_function)
   
   # Future Polars-enhanced implementation (identical API)
   fuzzy_df = FuzzyDataFrame.from_pandas(crisp_data, fuzzifier)
   result = fuzzy_df['temperature'].apply(analysis_function)  # Faster execution

**Performance Projections**

Preliminary benchmarking projects the following improvements over the current 
pandas-based implementation:

- **Fuzzification Operations**: 3-5x performance gain for large datasets
- **Aggregation Functions**: 2-4x speedup for complex operations
- **Memory Footprint**: 30-50% reduction in memory usage
- **Query Optimization**: Automatic pipeline optimization

**Extended Analytical Capabilities**

Future **Polars**-enhanced versions will introduce advanced fuzzy operations:

- **Fuzzy Joins**: Similarity-based join operations with fuzzy matching
- **Temporal Fuzzy Analysis**: Time-series operations with fuzzy reasoning
- **Distributed Processing**: Cluster-based fuzzy analysis capabilities
- **Streaming Integration**: Real-time fuzzy data processing support

.. note::
   The **Polars** migration timeline ensures seamless transition with zero 
   breaking changes. Existing FuzzyDataFrame code will automatically benefit 
   from performance improvements without modification.

Conclusion
----------

The data structures in ``axisfuzzy.analysis`` establish a comprehensive foundation for 
fuzzy data manipulation and analysis, bridging the gap between traditional data 
processing paradigms and fuzzy logic requirements. Through the :class:`FuzzyDataFrame` 
and its supporting ecosystem, developers gain access to powerful tools that maintain 
both computational efficiency and analytical precision.

**Core Architectural Achievements**:

- **Seamless Integration**: Native compatibility with pandas workflows while extending 
  functionality for fuzzy data types and operations
- **Type Safety**: Contract-driven validation ensuring data integrity throughout 
  complex analytical pipelines  
- **Performance Optimization**: Memory-efficient storage and vectorized operations 
  designed for large-scale fuzzy analysis workloads
- **Extensible Design**: Modular architecture supporting custom fuzzy number types 
  and specialized analytical operations

**Practical Impact**:

The unified data structure approach eliminates the traditional friction between 
data preparation and fuzzy analysis, enabling researchers and practitioners to 
focus on analytical insights rather than data transformation complexities. The 
framework's emphasis on familiar pandas-like interfaces reduces learning curves 
while providing the specialized capabilities required for sophisticated fuzzy 
logic applications.

**Future-Ready Foundation**:

This data structure ecosystem positions AxisFuzzy as a scalable platform for 
emerging fuzzy analysis methodologies, with an architecture intended to accommodate 
streaming data, cloud-native deployments, and advanced visualization integration. 
The commitment to API stability ensures long-term viability for research and 
production systems.

The ``axisfuzzy.analysis`` data structures transform fuzzy data analysis from a 
specialized, tool-specific domain into an accessible, integrated component of 
modern data science workflows, maintaining scientific rigor while embracing 
practical usability.