Borovickova-Shimer example

[1]:
# Add PyTwoWay to system path (do not run this)
# import sys
# sys.path.append('../../..')

Import the PyTwoWay package

Make sure to install it using pip install pytwoway.

[2]:
from pandas import Series
import pytwoway as tw
import bipartitepandas as bpd

First, check out parameter options

Do this by running:

  • Cleaning - bpd.clean_params().describe_all()

  • Simulating - tw.sim_bs_params().describe_all()

Alternatively, for any parameter function x_params (for example, bpd.clean_params), run x_params().keys() to list all keys in the parameter dictionary, then x_params().describe(key) to get a description of a single key.

Second, set parameter choices

Note

We leave connectedness at its default in clean_params. Instead, we set drop_single_stayers=True and drop_returns='returns', and in the data preparation step we require every worker and firm to have at least 2 observations.

Note

We set copy=False in clean_params to avoid unnecessary copies (although this may modify the original dataframe).

[3]:
# Cleaning
clean_params_1 = bpd.clean_params(
    {
        'drop_single_stayers': True,
        'drop_returns': 'returns',
        'copy': False,
        'verbose': False
    }
)
clean_params_2 = bpd.clean_params(
    {
        'is_sorted': True,
        'copy': False,
        'verbose': False
    }
)
# Simulating
sim_params = tw.sim_bs_params(
    {
        'n_workers': 10000,
        'n_firms': 100,
        'sigma_lambda_sq': 1.25,
        'sigma_mu_sq': 0.75,
        'sigma_wages': 2.5,
        'rho': -0.5
    }
)

Third, extract data (we simulate for the example)

PyTwoWay contains the class SimBS, which we use here to simulate from the Borovickova-Shimer data-generating process (DGP). If you have your own data, import it during this step: load it as a Pandas DataFrame, then convert it into a BipartitePandas DataFrame in the next step.

[4]:
sim_data = tw.SimBS(sim_params).simulate()

Fourth, prepare data

This is exactly how you should prepare real data prior to running the Borovickova-Shimer estimator.

  • First, we convert the data into a BipartitePandas DataFrame

  • Second, we clean the data (e.g. drop NaN observations, drop returns, and make sure firm and worker ids are contiguous)

  • Third, we collapse the data at the worker-firm spell level (take mean wage over the spell)

  • Fourth, we ensure all firms and workers have at least 2 observations

  • Fifth, we clean up firm and worker ids

Further details on BipartitePandas can be found in the package documentation.
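To make the collapse step concrete, here is a minimal pure-Python sketch of what "collapsing at the worker-firm spell level" means. It is an illustration only, not the BipartitePandas implementation: the toy records, the helper name collapse_spells, and the output keys t1 and t2 are assumptions made for this example.

```python
from itertools import groupby
from statistics import mean

# Toy long-format records: (worker id, firm id, wage, period),
# already sorted by worker and then by period.
records = [
    (0, 0, 1.0, 1),
    (0, 0, 3.0, 2),
    (0, 1, 2.0, 3),
    (1, 1, 4.0, 1),
    (1, 1, 6.0, 2),
]


def collapse_spells(rows):
    """Merge consecutive rows that share a (worker, firm) pair into one spell."""
    spells = []
    for (i, j), group in groupby(rows, key=lambda r: (r[0], r[1])):
        group = list(group)
        spells.append({
            'i': i,
            'j': j,
            'y': mean(r[2] for r in group),  # mean wage over the spell
            't1': group[0][3],               # first period of the spell
            't2': group[-1][3],              # last period of the spell
        })
    return spells


spells = collapse_spells(records)
for spell in spells:
    print(spell)
```

Worker 0's two periods at firm 0 become a single spell with the mean wage, while the later move to firm 1 stays a separate spell.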

[5]:
# Convert into BipartitePandas DataFrame
bdf = bpd.BipartiteDataFrame(sim_data)
# Clean
bdf = bdf.clean(clean_params_1)
# Collapse
bdf = bdf.collapse(is_sorted=True, copy=False)
# Make sure all workers and firms have at least 2 observations
bdf = bdf.min_joint_obs_frame(2, 2, 'j', 'i', is_sorted=True, copy=False)
# Clean up worker and firm ids
bdf = bdf.clean(clean_params_2)
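The minimum-observation step has to hold for workers and firms jointly: dropping a firm with too few observations can push a worker below the threshold, and vice versa. The sketch below illustrates that idea in plain Python by repeating the filter until nothing changes. It is a conceptual illustration under that assumption, not the code behind min_joint_obs_frame, and the helper name min_joint_obs is hypothetical.

```python
from collections import Counter


def min_joint_obs(pairs, min_firm_obs=2, min_worker_obs=2):
    """Drop rows until every remaining firm and worker meets its threshold."""
    rows = list(pairs)
    while True:
        firm_counts = Counter(j for _, j in rows)
        worker_counts = Counter(i for i, _ in rows)
        kept = [
            (i, j) for i, j in rows
            if firm_counts[j] >= min_firm_obs
            and worker_counts[i] >= min_worker_obs
        ]
        if len(kept) == len(rows):
            # Nothing was dropped in this pass, so both conditions hold
            return kept
        rows = kept


# (worker id, firm id) pairs, one per observation
obs = [(0, 0), (0, 1), (1, 0), (1, 1), (2, 1), (2, 2)]
filtered = min_joint_obs(obs)
print(filtered)
```

In this toy data, firm 2 has a single observation and is dropped first. That leaves worker 2 with only one observation, so a second pass removes worker 2 as well.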

Fifth, initialize and run the estimator

Note

We can also fit the alternative estimator by specifying alternative_estimator=True.

[6]:
# Initialize Borovickova-Shimer estimator
bs_estimator = tw.BSEstimator()
# Fit Borovickova-Shimer estimator
bs_estimator.fit(bdf, alternative_estimator=False)

Finally, investigate the results

Results correspond to:

  • y: income (outcome) column

  • lambda: worker effects

  • mu: firm effects

[7]:
bs_estimator.res
[7]:
{'mean(y)': -0.07862680349885089,
 'var(lambda)': 1.2422121496618905,
 'var(mu)': 0.8050475806005325,
 'cov(lambda, mu)': -0.47932450909729013,
 'corr(lambda, mu)': -0.4793149502917461}
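As a quick sanity check, the estimates can be compared with the simulation parameters. The sketch below copies the numbers from the output above and assumes that sigma_lambda_sq and sigma_mu_sq are the variances of the worker and firm effects and that rho is their correlation; that reading of the parameters is an assumption, so confirm it with tw.sim_bs_params().describe_all().

```python
from math import sqrt

# Estimates copied from the output above
res = {
    'var(lambda)': 1.2422121496618905,
    'var(mu)': 0.8050475806005325,
    'cov(lambda, mu)': -0.47932450909729013,
    'corr(lambda, mu)': -0.4793149502917461,
}

# Simulation parameters chosen earlier (interpretation assumed, see above)
sigma_lambda_sq, sigma_mu_sq, rho = 1.25, 0.75, -0.5

# A correlation is the covariance scaled by both standard deviations
corr_check = res['cov(lambda, mu)'] / sqrt(res['var(lambda)'] * res['var(mu)'])

# Covariance implied by the simulation parameters
implied_cov = rho * sqrt(sigma_lambda_sq * sigma_mu_sq)

print(f"corr(lambda, mu): reported {res['corr(lambda, mu)']:.4f}, recomputed {corr_check:.4f}")
print(f"var(lambda):      estimated {res['var(lambda)']:.4f}, simulated {sigma_lambda_sq}")
print(f"var(mu):          estimated {res['var(mu)']:.4f}, simulated {sigma_mu_sq}")
print(f"cov(lambda, mu):  estimated {res['cov(lambda, mu)']:.4f}, implied {implied_cov:.4f}")
```

The estimates will not match the simulation parameters exactly, since they come from a single finite simulated sample.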