Attrition class
- class pytwoway.attrition.Attrition(min_movers_threshold=15, attrition_how=None, fe_params=None, cre_params=None, estimate_bs=False, cluster_params=None, clean_params=None)
Bases: object
Generates attrition plots from bipartite labor data.
- Parameters
min_movers_threshold (int) – minimum number of movers required to keep a firm
attrition_how (tw.attrition_utils.AttritionIncreasing() or tw.attrition_utils.AttritionDecreasing()) – instance of AttritionIncreasing() or AttritionDecreasing(), used to specify if attrition should use increasing (building up from a fixed set of firms) or decreasing (with varying sets of firms) fractions of movers; None is equivalent to AttritionIncreasing()
fe_params (ParamsDict or None) – dictionary of parameters for FE estimation. Run tw.fe_params().describe_all() for descriptions of all valid parameters. None is equivalent to tw.fe_params().
cre_params (ParamsDict or None) – dictionary of parameters for CRE estimation. Run tw.cre_params().describe_all() for descriptions of all valid parameters. None is equivalent to tw.cre_params().
estimate_bs (bool) – if True, estimate Borovickova-Shimer model
cluster_params (ParamsDict or None) – dictionary of parameters for clustering in CRE estimation. Run bpd.cluster_params().describe_all() for descriptions of all valid parameters. None is equivalent to bpd.cluster_params().
clean_params (ParamsDict or None) – dictionary of parameters for cleaning. Run bpd.clean_params().describe_all() for descriptions of all valid parameters. None is equivalent to bpd.clean_params().
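The effect of min_movers_threshold can be pictured with a toy filter. This is a minimal sketch, not pytwoway's implementation. It assumes that a mover is a worker observed at two or more firms, and that a firm's mover count is the number of distinct movers ever observed there. The function name firms_meeting_threshold and the (worker_id, firm_id) tuple layout are hypothetical; the library itself operates on bipartite dataframes.

```python
from collections import defaultdict

def firms_meeting_threshold(records, min_movers_threshold=15):
    """Return the set of firms with at least `min_movers_threshold` movers.

    `records` is an iterable of (worker_id, firm_id) pairs. Illustrative
    only: this is not pytwoway's implementation.
    """
    # Collect the set of firms each worker is observed at
    firms_by_worker = defaultdict(set)
    for worker, firm in records:
        firms_by_worker[worker].add(firm)

    # Assumption: a mover is a worker observed at two or more firms
    movers_by_firm = defaultdict(set)
    for worker, firms in firms_by_worker.items():
        if len(firms) >= 2:
            for firm in firms:
                movers_by_firm[firm].add(worker)

    return {firm for firm, movers in movers_by_firm.items()
            if len(movers) >= min_movers_threshold}

records = [
    (1, "A"), (1, "B"),   # worker 1 moves between A and B
    (2, "A"), (2, "B"),   # worker 2 moves between A and B
    (3, "B"), (3, "C"),   # worker 3 moves between B and C
    (4, "C"),             # worker 4 is a stayer
]
kept = firms_meeting_threshold(records, min_movers_threshold=2)
print(sorted(kept))
```

In this toy data, firm C has only one mover, so it is dropped when the threshold is 2.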
- attrition(bdf, N=10, ncore=1, copy=False, rng=None)
Run Monte Carlo simulations of the TwoWay attrition estimations to estimate the variance of the parameter estimates as a function of the fraction of movers remaining.
Note that this overwrites the stored dataframe. To run attrition with different mover thresholds, either create multiple Attrition objects or run this method with an increasing threshold for each iteration.
Results are saved in the class attribute .attrition_res as a dict of dicts of lists of lists: the first key is 'non_he' or 'he'; the second key is 'fe' or 'cre'; the value under those keys is a list with one entry per Monte Carlo simulation; and each entry is itself a list with one result per specified fraction of movers.
- Parameters
bdf (BipartiteBase) – bipartite dataframe (NOTE: we need to avoid saving bdf as a class attribute, otherwise multiprocessing will create a separate copy of it for each core used)
N (int) – number of simulations
ncore (int) – number of cores to use
copy (bool) – if False, work on bdf directly rather than on a copy
rng (np.random.Generator or None) – NumPy random number generator. This overrides the random number generators for FE and CRE. None is equivalent to np.random.default_rng(None).
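The nested layout of .attrition_res described above can be mocked up with plain Python. This is a minimal sketch under stated assumptions: the source does not say what type each per-fraction result has, so each one is taken to be a single float here, and all numbers are made-up placeholders. The names mock_res and variance_by_fraction are hypothetical and not part of pytwoway.

```python
from statistics import pvariance

# Placeholder data shaped like the description:
# res[he_key][estimator][simulation][fraction_index]
# Assumption: each innermost result is a single float (made-up values).
mock_res = {
    "non_he": {
        "fe":  [[1.0, 2.0, 3.0], [3.0, 2.0, 5.0]],   # 2 simulations, 3 fractions
        "cre": [[1.5, 2.5, 3.5], [2.5, 2.5, 4.5]],
    },
    "he": {
        "fe":  [[0.5, 1.0, 1.5], [1.5, 1.0, 2.5]],
        "cre": [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]],
    },
}

def variance_by_fraction(sim_results):
    """Population variance across simulations, one value per fraction."""
    # zip(*...) regroups [simulation][fraction] into per-fraction tuples
    return [pvariance(values) for values in zip(*sim_results)]

first_sim = mock_res["non_he"]["fe"][0]       # all fractions, simulation 0
one_result = mock_res["non_he"]["fe"][1][2]   # simulation 1, third fraction
print(first_sim, one_result)
print(variance_by_fraction(mock_res["non_he"]["fe"]))
```

Regrouping by fraction with zip gives the spread of estimates across simulations at each fraction of movers. That spread is the quantity the Monte Carlo run is meant to expose.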
- boxplots(fe=True, ho=True, he=True, cre=True, bs1=True, bs2=True, xticks_round=1)
Generate attrition result boxplots.
- Parameters
fe (bool) – if True, plot FE results
ho (bool) – if True, plot FE-HO results
he (bool) – if True, plot FE-HE results
cre (bool) – if True, plot CRE results
bs1 (bool) – if True, plot Borovickova-Shimer results for the standard estimator
bs2 (bool) – if True, plot Borovickova-Shimer results for the alternative estimator
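line_at_movers_per_firm (bool) – if True, plot a dashed line where movers per firm in the subsample is approximately the number of movers per firm in the full sample (this parameter is listed here but is absent from the boxplots signature; it does appear in the plots signature)
xticks_round (int) – number of digits to which x-tick labels are rounded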
- plots(fe=True, ho=True, he=True, cre=True, bs1=True, bs2=True, line_at_movers_per_firm=True, xticks_round=1)
Generate attrition result plots.
- Parameters
fe (bool) – if True, plot FE results
ho (bool) – if True, plot FE-HO results
he (bool) – if True, plot FE-HE results
cre (bool) – if True, plot CRE results
bs1 (bool) – if True, plot Borovickova-Shimer results for the standard estimator
bs2 (bool) – if True, plot Borovickova-Shimer results for the alternative estimator
line_at_movers_per_firm (bool) – if True, plot a dashed line where movers per firm in the subsample is approximately the number of movers per firm in the full sample
xticks_round (int) – number of digits to which x-tick labels are rounded