dqm.diversity package
Submodules
dqm.diversity.diversity module
This module defines the DiversityCalculator class, which calculates several types of diversity in datasets. It covers both lexical and visual diversity, applying statistical indices to metrics such as richness, variety, color, and shape. It is useful in linguistics, image processing, and data analysis for understanding how diverse the elements of a dataset are.
- Authors:
Faouzi ADJED Anani DJATO
- Dependencies:
numpy collections.Counter
- Classes:
- DiversityCalculator: A class that provides methods for calculating
different types of diversity in datasets.
Functions: None
- Usage:
To use this module, create an instance of the DiversityCalculator class and call its compute_diversity method with appropriate arguments. Example:
    calculator = DiversityCalculator()
    diversity_score = calculator.compute_diversity(data, 'lexical', 'richness')
- class dqm.diversity.diversity.DiversityCalculator
Bases:
object
A class to compute various types of diversity within data.
This class offers methods to calculate lexical and visual diversities in datasets using different statistical measures. It can measure lexical diversity in terms of richness and variety, and visual diversity in terms of color and shape using indices like Shannon, Simpson, and Gini-Simpson.
- compute_diversity(data, diversity_type, need)
Compute diversity of given data based on type and need.
- Parameters:
data (Iterable) – Dataset for diversity computation.
diversity_type (str) – Type of diversity ('lexical' or 'visual').
need (str) – Specific need for calculation ('richness', 'variety', 'color', 'shape').
- Returns:
Calculated diversity value.
- Return type:
diversity (float)
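The class description mentions indices such as Shannon and Simpson without showing how they are computed. As a minimal sketch, assuming the standard textbook definitions and a plain list of tokens as input, the two indices could be written as follows. The function names shannon_index and simpson_index are hypothetical and are not part of dqm.diversity; how compute_diversity maps 'richness' or 'variety' onto such indices is not stated here.

```python
import math
from collections import Counter


def shannon_index(items):
    """Shannon index H = -sum(p_i * ln(p_i)) over category proportions.

    Hypothetical helper for illustration; not the library's own code.
    """
    counts = Counter(items)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("items must not be empty")
    return -sum((c / total) * math.log(c / total) for c in counts.values())


def simpson_index(items):
    """Simpson index sum(p_i ** 2): the chance that two draws with
    replacement fall in the same category (higher means less diverse).
    """
    counts = Counter(items)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("items must not be empty")
    return sum((c / total) ** 2 for c in counts.values())


tokens = ["the", "cat", "sat", "on", "the", "mat"]
print(shannon_index(tokens), simpson_index(tokens))
```

A dataset with a single repeated token gives a Shannon index of zero and a Simpson index of one, which is the least diverse case under both definitions.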
- validate_inputs(diversity_type, need)
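This method validates the inputs for the compute_diversity method.
It was added mainly so that the class exposes at least two public methods, a minimum that some linters enforce (for example, pylint's too-few-public-methods check); the Python language itself has no such requirement.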
- Parameters:
diversity_type (str) – Type of diversity to be computed.
need (str) – Specific need for diversity calculation.
- Return type:
None
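The exact checks performed by validate_inputs are not documented. A minimal sketch of what such validation might look like is given below, assuming that 'richness' and 'variety' belong to lexical diversity and 'color' and 'shape' to visual diversity, as the class description suggests. The ALLOWED_NEEDS table and the choice of ValueError are assumptions made for illustration.

```python
# Hypothetical table of valid combinations, inferred from the class
# description; the real implementation may organise this differently.
ALLOWED_NEEDS = {
    "lexical": {"richness", "variety"},
    "visual": {"color", "shape"},
}


def validate_inputs(diversity_type, need):
    """Raise ValueError for an unknown type or a need that does not fit it.

    Returns None when the combination is accepted.
    """
    if diversity_type not in ALLOWED_NEEDS:
        raise ValueError(
            f"diversity_type must be one of {sorted(ALLOWED_NEEDS)}, "
            f"got {diversity_type!r}"
        )
    if need not in ALLOWED_NEEDS[diversity_type]:
        raise ValueError(
            f"need for {diversity_type!r} must be one of "
            f"{sorted(ALLOWED_NEEDS[diversity_type])}, got {need!r}"
        )


validate_inputs("visual", "color")  # accepted, returns None
```

Keeping the valid combinations in a single table means a new diversity type or need can be added in one place.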
dqm.diversity.metric module
Diversity Index Calculator
This module defines the DiversityIndexCalculator class, which offers methods to calculate various diversity indices for categorical data. These indices are useful in statistical analysis and data science to understand the distribution and diversity of categorical data.
- Authors:
Faouzi ADJED Anani DJATO
- Dependencies:
pandas
- Classes:
DiversityIndexCalculator: Provides methods for calculating diversity indices in a dataset.
Functions: None
- Usage:
    import pandas
    from metric import DiversityIndexCalculator
    calculator = DiversityIndexCalculator()
    dataset = pandas.Series([...])  # Replace with your data
    simpson_index = calculator.simpson(dataset)
    gini_index = calculator.gini(dataset)
These methods are useful for ecological, sociological, and various other types of categorical data analysis.
- class dqm.diversity.metric.DiversityIndexCalculator
Bases:
object
This class provides methods to calculate various diversity indices for a given dataset.
- gini(variable)
Compute the Gini-Simpson index, a metric for assessing diversity that takes into consideration both the quantity of distinct categories and the uniformity of their distribution.
- Parameters:
variable (Series) – The data series for which to calculate the Gini-Simpson index.
- Returns:
The Gini-Simpson index.
- Return type:
g (float)
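For reference, the Gini-Simpson index is commonly defined as one minus the sum of squared category proportions. The sketch below assumes that standard definition and uses only the standard library. The function name gini_simpson is hypothetical, and whether gini uses this exact estimator is not stated in this documentation.

```python
from collections import Counter


def gini_simpson(values):
    """Gini-Simpson index: 1 - sum(p_i ** 2) over category proportions.

    Illustrative stand-in operating on any iterable of hashable values.
    """
    counts = Counter(values)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("values must not be empty")
    return 1.0 - sum((c / total) ** 2 for c in counts.values())


print(gini_simpson(["cat", "cat", "dog", "dog"]))  # 0.5
```

The value is zero when every observation falls in one category and approaches one as observations spread evenly over many categories.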
- num(variable)
Calculate the number of each category of a variable.
- Parameters:
variable (Series) – The data series for which to count categories.
- Returns:
The count of each category.
- Return type:
n (Series)
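Counting the occurrences of each category is the first step behind the indices above. With pandas this kind of tally is typically obtained with Series.value_counts(); the sketch below is a dependency-free illustration using collections.Counter. The function name category_counts is hypothetical, and it returns a dict rather than a Series.

```python
from collections import Counter


def category_counts(values):
    """Return a mapping from each distinct category to its count."""
    return dict(Counter(values))


print(category_counts(["red", "blue", "red", "green", "red"]))
```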