oumi.quantize#
Quantization module for Oumi.
This module provides comprehensive model quantization capabilities including AWQ, BitsAndBytes, and GGUF quantization methods.
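As a quick orientation, here is a minimal sketch of driving the module through its top-level quantize() function. It assumes QuantizationConfig and ModelParams are importable from oumi.core.configs and that QuantizationConfig exposes model, method, and output_path fields; verify the exact names against your installed Oumi version.

```python
from oumi.core.configs import ModelParams, QuantizationConfig  # assumed import path
from oumi.quantize import quantize

config = QuantizationConfig(
    model=ModelParams(model_name="meta-llama/Llama-3.1-8B-Instruct"),  # hypothetical model choice
    method="awq_q4_0",
    output_path="llama3-awq-q4",
)
result = quantize(config)
print(f"Wrote {result.output_path} ({result.quantized_size_bytes / 1e9:.2f} GB)")
```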
- class oumi.quantize.AwqQuantization[source]#
Bases:
BaseQuantization
AWQ (Activation-aware Weight Quantization) implementation.
This class handles AWQ quantization with support for simulation mode when AWQ libraries are not available.
- quantize(config: QuantizationConfig) QuantizationResult [source]#
Main quantization method for AWQ.
- Parameters:
config – Quantization configuration
- Returns:
QuantizationResult containing quantization results
- supported_formats: list[str] = ['safetensors']#
- supported_methods: list[str] = ['awq_q4_0', 'awq_q4_1', 'awq_q8_0', 'awq_f16']#
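A hedged sketch of probing an AWQ quantizer's capabilities before running it, using the supports_method and supports_format checks documented on the base class; the no-argument constructor is an assumption.

```python
from oumi.quantize import AwqQuantization

quantizer = AwqQuantization()  # assumed no-argument constructor
assert quantizer.supports_method("awq_q4_0")
assert quantizer.supports_format("safetensors")
# result = quantizer.quantize(config)  # config built as in the module-level sketch above
```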
- class oumi.quantize.BaseQuantization[source]#
Bases:
ABC
Abstract base class for all quantization methods.
This class defines the common interface that all quantization implementations must follow, ensuring consistency across different quantization approaches.
- get_supported_formats() list[str] [source]#
Return list of output formats supported by this quantizer.
- Returns:
List of format names (e.g., ["gguf", "pytorch"])
- get_supported_methods() list[str] [source]#
Return list of quantization methods supported by this quantizer.
- Returns:
List of method names (e.g., ["awq_q4_0", "awq_q8_0"])
- abstractmethod quantize(config: QuantizationConfig) QuantizationResult [source]#
Main quantization method - must be implemented by subclasses.
- Parameters:
config – Quantization configuration containing model parameters, method, output path, and other settings.
- Returns:
QuantizationResult containing:
quantized_size_bytes – Size of the quantized model in bytes
output_path – Path to the quantized model
quantization_method – Quantization method used
format_type – Format type of the quantized model
additional_info – Additional method-specific information
- Return type:
QuantizationResult
- Raises:
RuntimeError – If quantization fails for any reason
ValueError – If configuration is invalid for this quantizer
- abstractmethod raise_if_requirements_not_met() None [source]#
Raise an error if the requirements are not met.
- supported_formats: list[str] = []#
- supported_methods: list[str] = []#
- supports_format(format_name: str) bool [source]#
Check if this quantizer supports the given output format.
- Parameters:
format_name – Output format name to check
- Returns:
True if format is supported, False otherwise
- supports_method(method: str) bool [source]#
Check if this quantizer supports the given method.
- Parameters:
method – Quantization method name to check
- Returns:
True if method is supported, False otherwise
- validate_config(config: QuantizationConfig) None [source]#
Validate configuration for this quantizer.
- Parameters:
config – Quantization configuration to validate
- Raises:
ValueError – If configuration is invalid for this quantizer
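To ground the interface, here is a minimal sketch of a custom quantizer subclass. The NoOpQuantization name is hypothetical, and the config.output_path field access is an assumption; the QuantizationResult constructor follows the signature documented below.

```python
from oumi.quantize import BaseQuantization, QuantizationResult


class NoOpQuantization(BaseQuantization):
    """Illustrative quantizer that records a model without compressing it."""

    supported_methods: list[str] = ["noop"]
    supported_formats: list[str] = ["safetensors"]

    def raise_if_requirements_not_met(self) -> None:
        # This sketch has no extra dependencies, so there is nothing to check.
        pass

    def quantize(self, config) -> QuantizationResult:
        self.validate_config(config)  # inherited: raises ValueError if unsupported
        # A real implementation would load, quantize, and save the model here.
        return QuantizationResult(
            quantized_size_bytes=0,
            output_path=config.output_path,  # assumed field name on QuantizationConfig
            quantization_method="noop",
            format_type="safetensors",
        )
```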
- class oumi.quantize.BitsAndBytesQuantization[source]#
Bases:
BaseQuantization
BitsAndBytes quantization implementation.
This class handles quantization using the BitsAndBytes library, supporting both 4-bit and 8-bit quantization methods.
- quantize(config: QuantizationConfig) QuantizationResult [source]#
Main quantization method for BitsAndBytes.
- Parameters:
config – Quantization configuration
- Returns:
QuantizationResult containing quantization results
- raise_if_requirements_not_met() None [source]#
Check if BitsAndBytes dependencies are available.
- Raises:
RuntimeError – If BitsAndBytes dependencies are not available.
- supported_formats: list[str] = ['safetensors']#
- supported_methods: list[str] = ['bnb_4bit', 'bnb_8bit']#
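A small usage sketch: probing for the BitsAndBytes dependency before quantizing, using only the methods documented above.

```python
from oumi.quantize import BitsAndBytesQuantization

quantizer = BitsAndBytesQuantization()
try:
    quantizer.raise_if_requirements_not_met()
except RuntimeError as err:
    print(f"BitsAndBytes dependencies unavailable: {err}")
else:
    print(quantizer.get_supported_methods())  # ['bnb_4bit', 'bnb_8bit']
```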
- class oumi.quantize.QuantizationResult(quantized_size_bytes: int, output_path: str, quantization_method: str, format_type: str, additional_info: dict[str, Any] = <factory>)[source]#
Bases:
object
Result of quantization.
- additional_info: dict[str, Any]#
Additional information about the quantization process.
- format_type: str#
Format type of the quantized model.
- output_path: str#
Path to the quantized model.
- quantization_method: str#
Quantization method used.
- quantized_size_bytes: int#
Size of the quantized model in bytes.
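An illustrative helper that formats the documented fields into a one-line report; summarize is a hypothetical name introduced here, not part of the Oumi API.

```python
from oumi.quantize import QuantizationResult


def summarize(result: QuantizationResult) -> str:
    """Render the documented QuantizationResult fields as a one-line report."""
    size_mib = result.quantized_size_bytes / (1024 ** 2)
    return (
        f"{result.quantization_method} -> {result.output_path} "
        f"({result.format_type}, {size_mib:.1f} MiB)"
    )
```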
- oumi.quantize.quantize(config: QuantizationConfig) QuantizationResult [source]#
Main quantization function that routes to appropriate quantizer.
- Parameters:
config – Quantization configuration containing method, model parameters, and other settings.
- Returns:
QuantizationResult containing quantization results, including the quantized model size, output path, and any method-specific details in additional_info.
- Raises:
ValueError – If quantization method is not supported
RuntimeError – If quantization fails
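A hedged sketch of calling quantize() with the error handling documented above; config is assumed to be built as in the module-level example.

```python
from oumi.quantize import quantize

try:
    result = quantize(config)  # config built as in the module-level sketch
except ValueError as err:
    print(f"Unsupported method or invalid configuration: {err}")
except RuntimeError as err:
    print(f"Quantization failed: {err}")
else:
    print(f"Quantized model written to {result.output_path}")
```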