Borovickova-Shimer example
[1]:
# Add PyTwoWay to system path (do not run this)
# import sys
# sys.path.append('../../..')
Import the PyTwoWay package
Make sure to install it using pip install pytwoway.
[2]:
from pandas import Series
import pytwoway as tw
import bipartitepandas as bpd
First, check out parameter options
Do this by running:
Cleaning - bpd.clean_params().describe_all()
Simulating - tw.sim_bs_params().describe_all()
Alternatively, run x_params().keys() to view all the keys for a parameter dictionary, then x_params().describe(key) to get a description for a single key.
Second, set parameter choices
Note
We specify connectedness=strongly_connected in clean_params because the Borovickova-Shimer estimator must be computed on the strongly connected set of firms.
Note
We set copy=False in clean_params to avoid unnecessary copies (although this may modify the original dataframe).
[3]:
# Cleaning
clean_params_1 = bpd.clean_params(
{
'drop_single_stayers': True,
'drop_returns': 'returns',
'copy': False,
'verbose': False
}
)
clean_params_2 = bpd.clean_params(
{
'is_sorted': True,
'copy': False,
'verbose': False
}
)
# Simulating
sim_params = tw.sim_bs_params(
{
'n_workers': 10000,
'n_firms': 100,
'sigma_lambda_sq': 1.25,
'sigma_mu_sq': 0.75,
'sigma_wages': 2.5,
'rho': -0.5
}
)
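To make the idea of a strongly connected set of firms concrete, here is a minimal standard-library sketch. It assumes that firms are nodes and that each worker move from one firm to the next is a directed edge. The moves list and the helper function are hypothetical illustrations, and this is not how BipartitePandas computes the set internally.

```python
from collections import defaultdict

def strongly_connected_components(edges):
    """Return the strongly connected components of a directed graph (Kosaraju's algorithm)."""
    graph, reverse = defaultdict(set), defaultdict(set)
    nodes = set()
    for src, dst in edges:
        graph[src].add(dst)
        reverse[dst].add(src)
        nodes.update((src, dst))

    def dfs_postorder(start, adjacency, visited, out):
        # Iterative depth-first search that appends nodes to `out` in post-order
        visited.add(start)
        stack = [(start, iter(sorted(adjacency[start])))]
        while stack:
            node, neighbours = stack[-1]
            for nxt in neighbours:
                if nxt not in visited:
                    visited.add(nxt)
                    stack.append((nxt, iter(sorted(adjacency[nxt]))))
                    break
            else:
                # All neighbours explored, so this node is finished
                stack.pop()
                out.append(node)

    # First pass: record finish order on the original graph
    visited, order = set(), []
    for node in sorted(nodes):
        if node not in visited:
            dfs_postorder(node, graph, visited, order)

    # Second pass: explore the reversed graph in decreasing finish order
    visited, components = set(), []
    for node in reversed(order):
        if node not in visited:
            component = []
            dfs_postorder(node, reverse, visited, component)
            components.append(set(component))
    return components

# Hypothetical worker moves (origin firm, destination firm)
moves = [(0, 1), (1, 2), (2, 0), (2, 3)]
components = strongly_connected_components(moves)
largest = max(components, key=len)
print(largest)
```

In this toy example, firms 0, 1 and 2 send workers around a cycle, so they form one strongly connected set. Firm 3 only receives workers and is left out.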
Third, extract data (we simulate for the example)
PyTwoWay contains the class SimBS, which we use here to simulate from the Borovickova-Shimer data-generating process (DGP). If you have your own data, you can import it during this step. Load it as a Pandas DataFrame and then convert it into a BipartitePandas DataFrame in the next step.
[4]:
sim_data = tw.SimBS(sim_params).simulate()
Fourth, prepare data
This is exactly how you should prepare real data prior to running the Borovickova-Shimer estimator.
First, we convert the data into a BipartitePandas DataFrame
Second, we clean the data (e.g. drop NaN observations, drop returns, make sure firm and worker ids are contiguous, etc.)
Third, we collapse the data at the worker-firm spell level (take mean wage over the spell)
Fourth, we ensure all firms and workers have at least 2 observations
Fifth, we clean up firm and worker ids
Further details on BipartitePandas can be found in the package documentation.
[5]:
# Convert into BipartitePandas DataFrame
bdf = bpd.BipartiteDataFrame(sim_data)
# Clean
bdf = bdf.clean(clean_params_1)
# Collapse
bdf = bdf.collapse(is_sorted=True, copy=False)
# Make sure all workers and firms have at least 2 observations
bdf = bdf.min_joint_obs_frame(2, 2, 'j', 'i', is_sorted=True, copy=False)
# Clean up worker and firm ids
bdf = bdf.clean(clean_params_2)
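As a rough illustration of what collapsing at the worker-firm spell level means, the sketch below groups consecutive observations of the same worker at the same firm and takes the mean wage over each spell. The records, the collapse_spells helper and the output field names are hypothetical. It uses only the standard library and is not the BipartitePandas implementation.

```python
from itertools import groupby
from statistics import mean

# Hypothetical long-format records: (worker id, firm id, wage, period),
# already sorted by worker and then by period
records = [
    (0, 0, 1.0, 1),
    (0, 0, 3.0, 2),
    (0, 1, 2.0, 3),
    (1, 1, 4.0, 1),
    (1, 1, 6.0, 2),
]

def collapse_spells(rows):
    """Collapse consecutive rows with the same worker and firm into one spell."""
    spells = []
    for (worker, firm), group in groupby(rows, key=lambda r: (r[0], r[1])):
        group = list(group)
        spells.append({
            'i': worker,
            'j': firm,
            'y': mean(r[2] for r in group),  # mean wage over the spell
            't1': group[0][3],               # first period of the spell
            't2': group[-1][3],              # last period of the spell
            'w': len(group),                 # number of observations in the spell
        })
    return spells

collapsed = collapse_spells(records)
for spell in collapsed:
    print(spell)
```

Because groupby only merges adjacent rows, the input must be sorted by worker and period, which mirrors the is_sorted=True assumption in the cell above.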
Fifth, initialize and run the estimator
Note
We can also fit the alternative estimator by specifying alternative_estimator=True.
[6]:
# Initialize Borovickova-Shimer estimator
bs_estimator = tw.BSEstimator()
# Fit Borovickova-Shimer estimator
bs_estimator.fit(bdf, alternative_estimator=False)
Finally, investigate the results
Results correspond to:
y: income (outcome) column
lambda: worker effects
mu: firm effects
[7]:
bs_estimator.res
[7]:
{'mean(y)': -0.07862680349885089,
'var(lambda)': 1.2422121496618905,
'var(mu)': 0.8050475806005325,
'cov(lambda, mu)': -0.47932450909729013,
'corr(lambda, mu)': -0.4793149502917461}
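One way to sanity-check these estimates is to compare them with the values used in the simulation. The sketch below assumes that sigma_lambda_sq and sigma_mu_sq are the variances of the worker and firm effects and that rho is their correlation. That is an interpretation of the parameter names, so confirm it with tw.sim_bs_params().describe_all(). The estimates are copied by hand from the output above.

```python
from math import sqrt

# Simulation parameters copied from sim_params
sigma_lambda_sq, sigma_mu_sq, rho = 1.25, 0.75, -0.5

# Estimates copied from bs_estimator.res
res = {
    'var(lambda)': 1.2422121496618905,
    'var(mu)': 0.8050475806005325,
    'cov(lambda, mu)': -0.47932450909729013,
    'corr(lambda, mu)': -0.4793149502917461,
}

# Covariance implied by the simulation parameters (assumed interpretation):
# cov = rho * sqrt(var(lambda) * var(mu))
implied_cov = rho * sqrt(sigma_lambda_sq * sigma_mu_sq)

# Correlation recomputed from the estimated variances and covariance
recomputed_corr = res['cov(lambda, mu)'] / sqrt(res['var(lambda)'] * res['var(mu)'])

simulated = {
    'var(lambda)': sigma_lambda_sq,
    'var(mu)': sigma_mu_sq,
    'cov(lambda, mu)': implied_cov,
    'corr(lambda, mu)': rho,
}
for name, truth in simulated.items():
    print(f'{name}: estimate={res[name]:.4f}, simulated={truth:.4f}')
```

Under that interpretation, the estimated variances, covariance and correlation all land close to the simulated values, and the reported correlation matches the covariance divided by the product of the estimated standard deviations.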