BLMBootstrap class

class pytwoway.blm.BLMBootstrap(params)

Bases: object

Class for estimating BLM using bootstrapping.

Parameters

params (ParamsDict) – dictionary of parameters for BLM estimation. Run tw.blm_params().describe_all() for descriptions of all valid parameters.

fit(jdata, sdata, blm_model=None, n_samples=5, n_init_estimator=20, n_best=5, frac_movers=0.1, frac_stayers=0.1, method='parametric', cluster_params=None, reallocate=False, reallocate_jointly=True, reallocate_period='first', ncore=1, verbose=True, rng=None)

Estimate the BLM model on bootstrap samples and construct bootstrapped errors.

Parameters
  • jdata (BipartitePandas DataFrame) – event study or collapsed event study format labor data for movers

  • sdata (BipartitePandas DataFrame) – event study or collapsed event study format labor data for stayers

  • blm_model (BLMModel or None) – BLM model estimated using true data; if None, estimate model inside the method. For use with parametric bootstrap.

  • n_samples (int) – number of bootstrap samples to estimate

  • n_init_estimator (int) – number of starting guesses to estimate for each bootstrap sample

  • n_best (int) – for each bootstrap sample, keep the n_best estimates with the highest likelihoods, then select the one with the highest connectedness

  • frac_movers (float) – fraction of movers to draw (with replacement) for each bootstrap sample. For use with standard bootstrap.

  • frac_stayers (float) – fraction of stayers to draw (with replacement) for each bootstrap sample. For use with standard bootstrap.

  • method (str) – if ‘parametric’, estimate the BLM model on the full data, simulate worker types and wages using the estimated parameters, estimate the BLM model on each set of simulated data, and construct bootstrapped errors; if ‘standard’, sample from the original data, estimate the BLM model on each sample, and construct bootstrapped errors

  • cluster_params (ParamsDict or None) – dictionary of parameters for clustering firms. Run bpd.cluster_params().describe_all() for descriptions of all valid parameters. None is equivalent to bpd.cluster_params().

  • reallocate (bool) – if True and method is ‘parametric’, draw worker type proportions independently of firm type; if False, use worker type proportions that are conditional on firm type

  • reallocate_jointly (bool) – if True, worker type proportions take the average over movers and stayers (i.e. all workers use the same type proportions); if False, consider movers and stayers separately

  • reallocate_period (str) – if ‘first’, compute type proportions based on first period parameters; if ‘second’, compute type proportions based on second period parameters; if ‘all’, compute type proportions based on average over first and second period parameters

  • ncore (int) – number of cores for multiprocessing

  • verbose (bool) – if True, print progress during data cleaning for each sample

  • rng (np.random.Generator or None) – NumPy random number generator; None is equivalent to np.random.default_rng(None)
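
The frac_movers, frac_stayers, and n_best descriptions above can be illustrated with a minimal standard-library sketch. It is not pytwoway's implementation: the helper names draw_bootstrap_sample and pick_estimate, the dictionary keys, and the rule that a sample contains round(frac * n) rows are all assumptions made for illustration.

```python
import random


def draw_bootstrap_sample(rows, frac, rng):
    """Draw a fraction of rows with replacement (hypothetical helper).

    Assumption: the sample holds round(frac * len(rows)) rows, at least one.
    """
    n_draws = max(1, round(frac * len(rows)))
    return rng.choices(rows, k=n_draws)


def pick_estimate(candidates, n_best):
    """Keep the n_best highest-likelihood candidates, then return the one
    with the highest connectedness (hypothetical helper)."""
    top = sorted(candidates, key=lambda c: c["likelihood"], reverse=True)[:n_best]
    return max(top, key=lambda c: c["connectedness"])


rng = random.Random(0)       # seeded generator, playing the role of rng
movers = list(range(100))    # stand-in for 100 mover observations
sample = draw_bootstrap_sample(movers, frac=0.1, rng=rng)

# Stand-ins for the fits produced from different starting guesses
candidates = [
    {"likelihood": -10.0, "connectedness": 0.2},
    {"likelihood": -12.0, "connectedness": 0.9},
    {"likelihood": -50.0, "connectedness": 1.5},
]
best = pick_estimate(candidates, n_best=2)
```

With n_best=2, the third candidate is discarded despite having the highest connectedness, because its likelihood is not among the top two.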

plot_log_earnings(period='first', grid=True, dpi=None)

Plot log-earnings by worker-firm type pairs.

Parameters
  • period (str) – ‘first’ plots log-earnings in the first period; ‘second’ plots log-earnings in the second period; ‘all’ plots the average over log-earnings in the first and second periods

  • grid (bool) – if True, plot grid

  • dpi (float or None) – dpi for plot
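
As a rough illustration of the period option, the sketch below selects or averages two small matrices of log-earnings. The function name log_earnings_for_period and the inputs A1 and A2 (one row per worker type, one column per firm type) are hypothetical; pytwoway's internal data layout may differ.

```python
def log_earnings_for_period(A1, A2, period="first"):
    """Return the log-earnings matrix for the requested period.

    A1 and A2 are hypothetical first- and second-period matrices with the
    same shape; 'all' returns their element-wise average.
    """
    if period == "first":
        return A1
    if period == "second":
        return A2
    if period == "all":
        return [
            [(a + b) / 2 for a, b in zip(row1, row2)]
            for row1, row2 in zip(A1, A2)
        ]
    raise ValueError(f"period must be 'first', 'second', or 'all', got {period!r}")


A1 = [[1.0, 2.0], [3.0, 4.0]]
A2 = [[2.0, 2.0], [5.0, 8.0]]
avg = log_earnings_for_period(A1, A2, period="all")
```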

plot_type_proportions(period='first', subset='all', dpi=None)

Plot proportions of worker types at each firm class.

Parameters
  • period (str) – ‘first’ plots type proportions in the first period; ‘second’ plots type proportions in the second period; ‘all’ plots the average over type proportions in the first and second periods

  • subset (str) – ‘all’ plots a weighted average over movers and stayers; ‘movers’ plots movers; ‘stayers’ plots stayers

  • dpi (float or None) – dpi for plot
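
The subset option can be sketched in the same spirit. The example assumes that the ‘all’ average weights movers and stayers by their observation counts; the documentation above does not state the weights, so treat that rule, the function name type_proportions_for_subset, and all inputs as hypothetical.

```python
def type_proportions_for_subset(pk_movers, pk_stayers, n_movers, n_stayers, subset="all"):
    """Return worker type proportions per firm class for the chosen subset.

    Each input matrix has one row per firm class and one column per worker
    type. Assumption: 'all' weights movers and stayers by n_movers and
    n_stayers, the (hypothetical) numbers of observations in each group.
    """
    if subset == "movers":
        return pk_movers
    if subset == "stayers":
        return pk_stayers
    if subset == "all":
        total = n_movers + n_stayers
        return [
            [(m * n_movers + s * n_stayers) / total for m, s in zip(row_m, row_s)]
            for row_m, row_s in zip(pk_movers, pk_stayers)
        ]
    raise ValueError(f"subset must be 'all', 'movers', or 'stayers', got {subset!r}")


pk_movers = [[0.5, 0.5]]     # one firm class, two worker types
pk_stayers = [[0.25, 0.75]]
combined = type_proportions_for_subset(pk_movers, pk_stayers, n_movers=1, n_stayers=3)
```

Because each input row sums to one, the weighted average also sums to one within each firm class.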