BipartiteLongCollapsed class

class bipartitepandas.bipartitelongcollapsed.BipartiteLongCollapsed(*args, col_reference_dict=None, col_collapse_dict=None, **kwargs)

Bases: BipartiteLongBase

Class for bipartite networks of firms and workers in collapsed long form (i.e. each employment spell is collapsed into a single observation). Inherits from BipartiteLongBase.

Parameters

*args – arguments for BipartiteLongBase
col_reference_dict (dict or None) – maps each general column name to the column(s) associated with it, e.g. {'i': 'i', 'j': ['j1', 'j2']}; None is equivalent to {}
col_collapse_dict (dict or None) – how to collapse each column (None indicates the column should be dropped), e.g. {'y': 'mean'}; None is equivalent to {}
**kwargs – keyword arguments for BipartiteLongBase
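
To make the role of col_collapse_dict concrete, the following is a minimal plain-Python sketch of collapsing consecutive observations at the same firm into one spell, with y aggregated by its mean as in the {'y': 'mean'} example above. It does not call bipartitepandas; the helper collapse, the aggregators lookup, the sample rows, and the output keys t1 and t2 are assumptions made for illustration.

```python
from itertools import groupby
from statistics import mean

# Hypothetical long-form rows: (worker i, firm j, period t, income y)
long_rows = [
    (0, "A", 1, 10.0),
    (0, "A", 2, 14.0),
    (0, "B", 3, 20.0),
    (1, "A", 1, 8.0),
]

# Mirrors the documented example {'y': 'mean'}
col_collapse_dict = {"y": "mean"}

# Hypothetical lookup from aggregation name to function (only 'mean' is sketched)
aggregators = {"mean": mean}


def collapse(rows, collapse_dict):
    """Collapse consecutive rows sharing (i, j) into a single spell.

    Assumes rows are already sorted by worker and period.
    """
    agg_y = aggregators[collapse_dict["y"]]
    spells = []
    for (i, j), grp in groupby(rows, key=lambda r: (r[0], r[1])):
        grp = list(grp)
        spells.append({
            "i": i,
            "j": j,
            "t1": grp[0][2],   # first period of the spell
            "t2": grp[-1][2],  # last period of the spell
            "y": agg_y([r[3] for r in grp]),
        })
    return spells


collapsed = collapse(long_rows, col_collapse_dict)
for spell in collapsed:
    print(spell)
```

Worker 0's two periods at firm A become a single observation, which is the sense in which a spell is "collapsed".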

get_worker_m(is_sorted=False)

Get NumPy array indicating whether the worker associated with each observation is a mover.

Parameters: is_sorted (bool) – not used for collapsed long format
Returns: indicates whether the worker associated with each observation is a mover
Return type: (NumPy Array)
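
As an illustration of what a per-observation mover indicator looks like, here is a small plain-Python sketch. It assumes that a mover is a worker observed at more than one firm; that definition, the sample data, and the variable names are assumptions, and the sketch does not reproduce how the library computes its result.

```python
from collections import defaultdict

# Hypothetical collapsed observations: (worker i, firm j)
obs = [(0, "A"), (0, "B"), (0, "A"), (1, "A"), (2, "C")]

# Collect the set of firms each worker is observed at
firms_by_worker = defaultdict(set)
for i, j in obs:
    firms_by_worker[i].add(j)

# One flag per observation: True if that observation's worker is a mover
worker_m = [len(firms_by_worker[i]) > 1 for i, _ in obs]
print(worker_m)
```

The result has one entry per observation rather than one per worker, matching the description of the return value above.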

recollapse(drop_returns_to_stays=False, is_sorted=False, copy=True)

Recollapse data by job spells (so each spell for a particular worker at a particular firm is one observation). This method is necessary for biconnected data: a worker may work at firms A and B in the order A B A, but extracting the biconnected components may remove firm B. The data then reads A A, and must be recollapsed so that the worker is marked as a stayer.

Parameters

drop_returns_to_stays (bool) – if True, when recollapsing collapsed data, drop observations that need to be recollapsed instead of collapsing them (this improves computational efficiency when recollapsing data for leave-one-out connected components, where dropping intermediate observations can turn a worker who returns to a firm into a stayer)
is_sorted (bool) – if False, dataframe will be sorted by i (and t, if included). Returned dataframe will be sorted. Sorting may alter original dataframe if copy is set to False. Set is_sorted to True if dataframe is already sorted.
copy (bool) – if False, avoid copy

Returns

recollapsed dataframe

Return type

(BipartiteLongCollapsed)
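
The A B A case described above can be sketched in plain Python. This is an illustration of the idea only, not the library's implementation: the helper merge_adjacent_spells, the keys t1 and t2, and the unweighted mean used for y are assumptions, and the drop_returns_to_stays option is not modelled.

```python
from itertools import groupby

# Hypothetical collapsed spells for one worker, in the order A, B, A
spells = [
    {"i": 0, "j": "A", "t1": 1, "t2": 2, "y": 10.0},
    {"i": 0, "j": "B", "t1": 3, "t2": 3, "y": 12.0},
    {"i": 0, "j": "A", "t1": 4, "t2": 5, "y": 14.0},
]


def merge_adjacent_spells(rows):
    """Merge consecutive spells of the same worker at the same firm.

    Assumes rows are sorted by worker and start period.
    """
    merged = []
    for (i, j), grp in groupby(rows, key=lambda r: (r["i"], r["j"])):
        grp = list(grp)
        merged.append({
            "i": i,
            "j": j,
            "t1": grp[0]["t1"],
            "t2": grp[-1]["t2"],
            # Unweighted mean across merged spells (an assumption)
            "y": sum(r["y"] for r in grp) / len(grp),
        })
    return merged


# Suppose firm B is removed, e.g. by a connectedness restriction
remaining = [r for r in spells if r["j"] != "B"]

# The data is now A, A: two adjacent spells at the same firm
recollapsed = merge_adjacent_spells(remaining)
print(recollapsed)
```

After the merge the worker has a single spell at firm A, so the worker would be treated as a stayer.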

to_permutedeventstudy(order='sequential', move_to_worker=False, is_sorted=False, copy=True, rng=None)

Return collapsed long form data reformatted into collapsed permuted event study data. In this method, permuting the data means combining each pair of observations from a single worker into one event study observation (e.g. if a worker works at firms A, B, and C, this will create data with rows A-B, B-C, and A-C).

Parameters

order (str) – if 'sequential', each observation will be in sequential order; if 'income', order will be set based on the average income of the worker
move_to_worker (bool) – if True, each move is treated as a new worker
is_sorted (bool) – if False, dataframe will be sorted by i (and t, if included). Returned dataframe will be sorted. Sorting may alter original dataframe if copy is set to False. Set is_sorted to True if dataframe is already sorted.
copy (bool) – if False, avoid copy
rng (np.random.Generator or None) – NumPy random number generator; None is equivalent to np.random.default_rng(None)

Returns

permuted collapsed event study dataframe

Return type

(Pandas DataFrame)
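
The pairing described above can be sketched with itertools.combinations from the standard library. This is a conceptual illustration only, roughly in the spirit of order='sequential': the row layout with keys j1, j2, y1, and y2 is an assumption, and the 'income' ordering, move_to_worker, and rng are not modelled.

```python
from itertools import combinations

# Hypothetical collapsed spells for one worker, in time order
spells = [
    {"i": 0, "j": "A", "y": 10.0},
    {"i": 0, "j": "B", "y": 12.0},
    {"i": 0, "j": "C", "y": 15.0},
]

# Every pair of spells, earlier spell first, becomes one event study row
event_study_rows = [
    {"i": first["i"], "j1": first["j"], "j2": second["j"],
     "y1": first["y"], "y2": second["y"]}
    for first, second in combinations(spells, 2)
]

for row in event_study_rows:
    print(row["j1"], "-", row["j2"])
```

A worker with n spells yields n(n-1)/2 rows under this pairing, so three firms give the three rows A-B, A-C, and B-C.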

uncollapse(drop_no_collapse_columns=True, is_sorted=False, copy=True)

Return collapsed long data reformatted into long data, assuming that variables are constant within each spell.

Parameters

drop_no_collapse_columns (bool) – if True, columns marked by self.col_collapse_dict as None (i.e. they should be dropped) will be dropped
is_sorted (bool) – if False, dataframe will be sorted by i (and t, if included). Returned dataframe will be sorted. Sorting may alter original dataframe if copy is set to False. Set is_sorted to True if dataframe is already sorted.
copy (bool) – if False, avoid copy

Returns

collapsed long data reformatted as long data

Return type

(BipartiteLong)
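
The "constant within each spell" assumption can be illustrated with a short plain-Python sketch. It assumes integer, consecutive periods and spell boundaries stored under the keys t1 and t2; those names and the helper expand_spells are assumptions for illustration and do not reproduce the library's implementation.

```python
# Hypothetical collapsed spells: worker 0 is at firm A in periods 1-3, then at firm B in period 4
spells = [
    {"i": 0, "j": "A", "t1": 1, "t2": 3, "y": 12.0},
    {"i": 0, "j": "B", "t1": 4, "t2": 4, "y": 20.0},
]


def expand_spells(rows):
    """Expand each spell into one row per period, repeating its values."""
    long_rows = []
    for s in rows:
        for t in range(s["t1"], s["t2"] + 1):
            # y is copied unchanged: variables are assumed constant within the spell
            long_rows.append({"i": s["i"], "j": s["j"], "t": t, "y": s["y"]})
    return long_rows


long_rows = expand_spells(spells)
for row in long_rows:
    print(row)
```

Because the original within-spell variation was aggregated away when collapsing, the expanded rows carry the spell-level value in every period.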