dqm.diversity package

Submodules

dqm.diversity.main module

dqm.diversity.metric module

Diversity Index Calculator

This module defines the DiversityIndexCalculator class, which offers methods to calculate various diversity indices for categorical data. These indices are useful in statistical analysis and data science to understand the distribution and diversity of categorical data.

Authors:

Faouzi ADJED, Anani DJATO

Dependencies:

pandas

Classes:

DiversityIndexCalculator: Provides methods for calculating diversity indices in a dataset.

Functions: None

Usage:

    import pandas
    from metric import DiversityIndexCalculator

    calculator = DiversityIndexCalculator()
    dataset = pandas.Series([…])  # Replace with your data
    simpson_index = calculator.simpson(dataset)
    gini_index = calculator.gini(dataset)

These methods are useful for ecological, sociological, and various other types of categorical data analysis.

class dqm.diversity.metric.DiversityIndexCalculator[source]

Bases: object

This class provides methods to calculate various diversity indices for a given dataset.

num()[source]

Counts the number of each category in a dataset.

simpson()[source]

Calculates the Simpson diversity index.

prob()[source]

Calculates the frequencies of each category in a dataset.

gini()[source]

Calculates the Gini-Simpson index.

RD(variable)[source]
Return type:

float

gini(variable)[source]

Compute the Gini-Simpson index, a metric for assessing diversity that takes into consideration both the quantity of distinct categories and the uniformity of their distribution.

Parameters:

variable (Series) – The data series for which to calculate the Gini-Simpson index.

Returns:

The Gini-Simpson index.

Return type:

g (float)
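As an illustration of the quantity described above, the following is a minimal plain-Python sketch of the Gini-Simpson index under its usual textbook definition, one minus the sum of squared category frequencies. The function name `gini_simpson` is hypothetical and is not part of `dqm.diversity.metric`; the library method operates on a pandas Series and its exact implementation is not shown in this page.

```python
from collections import Counter


def gini_simpson(values):
    """Return 1 - sum(p_i ** 2) over the categories found in `values`.

    Hypothetical helper for illustration only; not the library's code.
    """
    counts = Counter(values)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("values must not be empty")
    # p_i is the relative frequency of category i.
    return 1.0 - sum((n / total) ** 2 for n in counts.values())


# Two equally frequent categories: p = (0.5, 0.5).
balanced = gini_simpson(["a", "a", "b", "b"])
# A single category has no diversity under this definition.
uniform = gini_simpson(["a", "a", "a"])
```

Under this definition the value is 0 when every observation belongs to one category and approaches 1 as the observations spread evenly over many categories.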

num(variable)[source]

Calculate the number of each category of a variable.

Parameters:

variable (Series) – The data series for which to count categories.

Returns:

The count of each category.

Return type:

n (Series)

prob(variable)[source]

Calculate the frequencies of each category in a variable.

Parameters:

variable (Series) – The data series for which to calculate frequencies.

Returns:

The frequency of each category.

Return type:

p (Series)
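The difference between the per-category counts returned by num and the frequencies returned by prob can be sketched in plain Python as follows. The helper names `category_counts` and `category_frequencies` are hypothetical, and the sketch returns dictionaries rather than the pandas Series the documented methods return.

```python
from collections import Counter


def category_counts(values):
    """Count how many times each category occurs (illustrative only)."""
    return dict(Counter(values))


def category_frequencies(values):
    """Divide each count by the total so the frequencies sum to 1."""
    counts = Counter(values)
    total = sum(counts.values())
    return {category: n / total for category, n in counts.items()}


data = ["x", "y", "x"]
counts = category_counts(data)      # absolute counts per category
freqs = category_frequencies(data)  # relative frequencies per category
```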

simpson(variable)[source]

Calculate Simpson’s index, which is a measure of diversity.

Parameters:

variable (Series) – The data series for which to calculate the Simpson index.

Returns:

The Simpson diversity index.

Return type:

s (float)
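Several conventions share the name "Simpson index", and this page does not state which one simpson returns. The sketch below shows two common ones, assuming frequencies computed from raw category labels: the sum of squared frequencies, and its reciprocal (often called the inverse Simpson index). Both function names are hypothetical and are not part of the library.

```python
from collections import Counter


def simpson_dominance(values):
    """Sum of squared category frequencies (illustrative only)."""
    counts = Counter(values)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("values must not be empty")
    return sum((n / total) ** 2 for n in counts.values())


def inverse_simpson(values):
    """Reciprocal of the sum of squared frequencies (illustrative only)."""
    return 1.0 / simpson_dominance(values)


sample = ["a", "a", "b", "b"]
dominance = simpson_dominance(sample)  # p = (0.5, 0.5)
inverse = inverse_simpson(sample)
```

The two conventions move in opposite directions as diversity grows, so it is worth checking the source of simpson before comparing its output with values from another tool.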

dqm.diversity.twe_logger module

The twe_logger module provides a preconfigured logger with configurable formatting and output control. It can log messages to standard output, to a specified file, or to both.

Usage: Import the module and get the default logger:

    import twe_logger
    logger = twe_logger.get_logger()

If you need a logger with different parameters, call get_logger with the desired parameters:

    logger = twe_logger.get_logger(filename="my_logs.log")
    logger = twe_logger.get_logger(name="my_logger", level='debug', filename='my_logs.log', output="both")

Then, use the logger within your code:

    logger.info("This is an info message")
    logger.error("This is an error message")

dqm.diversity.twe_logger.get_logger(name='twe_logger', level='debug', filename=None, output=None)[source]

Creates and returns a logger.

Parameters:
  • name (str, optional) – The name of the logger.

  • level (int or str, optional) – The logging level.

  • filename (str, optional) – The name of the file where the logger should write.

  • output (str, optional) – Where the logger should write. Can be 'stdout', 'file', or 'both'.

Returns:

The logger.

Return type:

logging.Logger
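For readers who want to see how a logger with these parameters can be assembled, here is a minimal sketch built only on the standard `logging` module. It is not the implementation of get_logger: the function name `make_logger`, the format string, and the choice to default to standard output are all assumptions made for illustration.

```python
import logging
import sys


def make_logger(name="example_logger", level=logging.DEBUG,
                filename=None, output="stdout"):
    """Build a logger writing to stdout, a file, or both (illustrative)."""
    logger = logging.getLogger(name)
    logger.setLevel(level)
    # Drop handlers left by earlier calls so messages are not duplicated.
    logger.handlers.clear()
    # Assumed format; the real module's format is not documented here.
    formatter = logging.Formatter(
        "%(asctime)s - %(name)s - %(levelname)s - %(message)s"
    )
    if output in ("stdout", "both"):
        stream_handler = logging.StreamHandler(sys.stdout)
        stream_handler.setFormatter(formatter)
        logger.addHandler(stream_handler)
    if output in ("file", "both"):
        if filename is None:
            raise ValueError("filename is required when writing to a file")
        file_handler = logging.FileHandler(filename)
        file_handler.setFormatter(formatter)
        logger.addHandler(file_handler)
    return logger


example = make_logger()
example.info("This is an info message")
```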

dqm.diversity.twe_logger.log_str_to_level(str_level)[source]

Converts a string to a corresponding logging level.

Parameters:

str_level (str) – The logging level as a string.

Returns:

The corresponding logging level.

Return type:

int
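A string-to-level conversion of this kind can be sketched with a small lookup table over the standard `logging` constants. The function name `str_to_level`, the set of accepted strings, and the case-insensitive matching are assumptions for illustration; the behavior of log_str_to_level on unknown input is not documented on this page.

```python
import logging

# Assumed set of accepted names, mapped to the standard logging constants.
_LEVELS = {
    "debug": logging.DEBUG,
    "info": logging.INFO,
    "warning": logging.WARNING,
    "error": logging.ERROR,
    "critical": logging.CRITICAL,
}


def str_to_level(str_level):
    """Map a level name such as 'debug' to its integer value (illustrative)."""
    key = str_level.strip().lower()
    if key not in _LEVELS:
        raise ValueError(f"unknown logging level: {str_level!r}")
    return _LEVELS[key]


debug_level = str_to_level("debug")
```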

Module contents