Measures module
Classes for computing cluster measures. Note: measures are implemented as classes rather than nested functions because nested functions cannot be pickled (source: https://stackoverflow.com/a/12022055/17333120).
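The pickling constraint can be illustrated with a minimal sketch that uses only the standard library. `make_nested_measure` and `MeanMeasure` are hypothetical names introduced for this example and are not part of bipartitepandas; the point is only that a function defined inside another function cannot be pickled, while an instance of a module-level class can.

```python
import pickle


def make_nested_measure(outcome_col):
    # Hypothetical factory: returns a nested function (closure).
    # pickle stores functions by qualified name, and a local function
    # cannot be looked up that way, so pickling it fails.
    def measure(rows):
        return sum(r[outcome_col] for r in rows) / len(rows)
    return measure


class MeanMeasure:
    """Hypothetical stand-in for a measure class (not the library's code)."""

    def __init__(self, outcome_col='y'):
        self.outcome_col = outcome_col

    def __call__(self, rows):
        return sum(r[self.outcome_col] for r in rows) / len(rows)


rows = [{'y': 1.0}, {'y': 3.0}]

# Attempt to pickle the nested function; the exception type varies
# across Python versions, so both are caught.
try:
    pickle.dumps(make_nested_measure('y'))
    nested_pickles = True
except (pickle.PicklingError, AttributeError):
    nested_pickles = False

# An instance of a module-level class round-trips through pickle,
# keeping its configuration (here, outcome_col).
restored = pickle.loads(pickle.dumps(MeanMeasure('y')))
```

This matters whenever measure objects must be serialized, for example when they are sent to worker processes.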
- class bipartitepandas.measures.measures.CDFs(cdf_resolution=10, measure='quantile_all', outcome_col='y')
Bases: object
Generate CDFs of compensation for firms. Used for clustering.
- Parameters
cdf_resolution (int) – number of values used to approximate the CDFs
measure (str) – how to compute the CDFs: ‘quantile_all’ computes quantiles from the entire dataset, then gives firm-level values between 0 and 1; ‘quantile_firm’ computes quantiles at the firm level, so values are compensations
outcome_col (str) – outcome column to use for data
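The difference between the two `measure` options can be sketched in plain Python. This is an illustration under stated assumptions, not the library's implementation: `firm_cdfs` and `empirical_quantile` are hypothetical helpers, the quantile levels are assumed to be evenly spaced at 1/cdf_resolution, 2/cdf_resolution, ..., 1, and the input is assumed to be a dict mapping each firm to its list of compensations.

```python
import math
from bisect import bisect_right


def empirical_quantile(sorted_values, q):
    # Inverse empirical CDF: the smallest value such that at least
    # a share q of the observations lie at or below it.
    idx = max(math.ceil(q * len(sorted_values)) - 1, 0)
    return sorted_values[idx]


def firm_cdfs(comp_by_firm, cdf_resolution=10, measure='quantile_all'):
    # Assumption: evenly spaced quantile levels ending at 1.
    levels = [(i + 1) / cdf_resolution for i in range(cdf_resolution)]
    result = {}
    if measure == 'quantile_all':
        # Cutoffs come from the pooled data; each firm's row is the share
        # of its observations at or below each cutoff (values in [0, 1]).
        pooled = sorted(y for ys in comp_by_firm.values() for y in ys)
        cutoffs = [empirical_quantile(pooled, q) for q in levels]
        for firm, ys in comp_by_firm.items():
            ys_sorted = sorted(ys)
            result[firm] = [bisect_right(ys_sorted, c) / len(ys_sorted)
                            for c in cutoffs]
    elif measure == 'quantile_firm':
        # Quantiles are taken within each firm, so each row holds
        # compensation values rather than shares.
        for firm, ys in comp_by_firm.items():
            ys_sorted = sorted(ys)
            result[firm] = [empirical_quantile(ys_sorted, q) for q in levels]
    else:
        raise ValueError(f'unknown measure: {measure!r}')
    return result


comp_by_firm = {'A': [1, 2, 3, 4], 'B': [5, 6, 7, 8]}
cdfs_all = firm_cdfs(comp_by_firm, cdf_resolution=4, measure='quantile_all')
cdfs_firm = firm_cdfs(comp_by_firm, cdf_resolution=4, measure='quantile_firm')
# cdfs_all  -> {'A': [0.5, 1.0, 1.0, 1.0], 'B': [0.0, 0.0, 0.5, 1.0]}
# cdfs_firm -> {'A': [1, 2, 3, 4], 'B': [5, 6, 7, 8]}
```

With ‘quantile_all’, firms are compared on a common scale of pooled cutoffs, so a low-paying firm reaches 1.0 early. With ‘quantile_firm’, each row describes the firm's own compensation levels.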
- class bipartitepandas.measures.measures.Moments(measures='mean', outcome_col='y')
Bases: object
Generate compensation moments for firms. Used for clustering.
- Parameters
measures (str or list of str) – which moments to compute (‘mean’ for average income within each firm; ‘var’ for variance of income within each firm; ‘max’ for maximum income within each firm; ‘min’ for minimum income within each firm)
outcome_col (str) – outcome column to use for data
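As a rough illustration of the four options for `measures`, the sketch below computes firm-level moments with the standard library. `firm_moments` and `MOMENT_FUNCS` are hypothetical names, the input is assumed to be a dict mapping each firm to its list of incomes, and population variance is assumed for ‘var’ (the library may use a different or weighted definition).

```python
import statistics

# Map each measure name to a function of one firm's incomes.
MOMENT_FUNCS = {
    'mean': statistics.fmean,
    'var': statistics.pvariance,  # assumption: population variance
    'max': max,
    'min': min,
}


def firm_moments(income_by_firm, measures='mean'):
    # Accept a single measure name or a list of names.
    if isinstance(measures, str):
        measures = [measures]
    return {
        firm: [MOMENT_FUNCS[m](incomes) for m in measures]
        for firm, incomes in income_by_firm.items()
    }


income_by_firm = {'A': [1.0, 2.0, 3.0, 4.0], 'B': [5.0, 7.0]}
means = firm_moments(income_by_firm)
moments = firm_moments(income_by_firm, ['mean', 'var', 'max', 'min'])
# means   -> {'A': [2.5], 'B': [6.0]}
# moments -> {'A': [2.5, 1.25, 4.0, 1.0], 'B': [6.0, 1.0, 7.0, 5.0]}
```

Each firm ends up with one value per requested measure, which is the kind of per-firm feature vector a clustering step can consume.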