BipartiteLongCollapsed class

class bipartitepandas.bipartitelongcollapsed.BipartiteLongCollapsed(*args, col_reference_dict=None, col_collapse_dict=None, **kwargs)

Bases: BipartiteLongBase

Class for bipartite networks of firms and workers in collapsed long form (i.e. employment spells are collapsed into a single observation). Inherits from BipartiteLongBase.

Parameters
  • *args – arguments for BipartiteLongBase

  • col_reference_dict (dict or None) – clarify which columns are associated with a general column name, e.g. {'i': 'i', 'j': ['j1', 'j2']}; None is equivalent to {}

  • col_collapse_dict (dict or None) – how to collapse each column (None indicates the column should be dropped), e.g. {'y': 'mean'}; None is equivalent to {}

  • **kwargs – keyword arguments for BipartiteLongBase
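To illustrate what a collapse dictionary such as {'y': 'mean'} expresses, the following is a minimal pure-Python sketch, not the library's implementation. The helper collapse_spell and the AGGREGATORS table are hypothetical names introduced for illustration; the only behavior taken from the description above is that each column maps to an aggregation and that None means the column is dropped.

```python
from statistics import mean

# Hypothetical lookup table from aggregation name to function (an assumption
# made for this sketch; the library may support a different set of names).
AGGREGATORS = {
    "mean": mean,
    "sum": sum,
    "first": lambda values: values[0],
    "last": lambda values: values[-1],
}


def collapse_spell(rows, col_collapse_dict):
    """Collapse the rows of one spell (a list of dicts) into a single dict."""
    collapsed = {}
    for col, how in col_collapse_dict.items():
        if how is None:
            continue  # None indicates the column should be dropped
        collapsed[col] = AGGREGATORS[how]([row[col] for row in rows])
    return collapsed


# Two observations that belong to the same worker-firm spell.
spell = [
    {"y": 1.0, "t": 1, "note": "a"},
    {"y": 3.0, "t": 2, "note": "b"},
]
result = collapse_spell(spell, {"y": "mean", "t": "first", "note": None})
```

Here result keeps the mean of y and the first value of t, and the note column is dropped because its entry is None.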

get_worker_m(is_sorted=False)

Get NumPy array indicating whether the worker associated with each observation is a mover.

Parameters

is_sorted (bool) – not used for collapsed long format

Returns

indicates whether the worker associated with each observation is a mover

Return type

(NumPy Array)
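As a rough illustration of the kind of indicator this method returns, the sketch below marks every observation whose worker appears at more than one firm. It is plain Python rather than the library's code, and it assumes that "mover" means a worker observed at two or more distinct firms; worker_is_mover is a hypothetical helper name.

```python
from collections import defaultdict


def worker_is_mover(worker_ids, firm_ids):
    """Return one boolean per observation: True if that worker is a mover.

    Assumption for this sketch: a mover is a worker observed at more than
    one distinct firm.
    """
    firms_by_worker = defaultdict(set)
    for i, j in zip(worker_ids, firm_ids):
        firms_by_worker[i].add(j)
    return [len(firms_by_worker[i]) > 1 for i in worker_ids]


# Worker 0 moves from A to B; workers 1 and 2 stay at a single firm.
i = [0, 0, 1, 2, 2]
j = ["A", "B", "A", "C", "C"]
m = worker_is_mover(i, j)
```

The result has the same length as the data, so it can be used directly as a row mask.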

recollapse(drop_returns_to_stays=False, is_sorted=False, copy=True)

Recollapse data by job spells (so each spell for a particular worker at a particular firm is one observation). This method is necessary for biconnected data: a worker may work at firms A and B in the order A B A, but computing the biconnected components may remove firm B. The data is then A A, and needs to be recollapsed so that the two spells at firm A become a single observation and the worker is marked as a stayer.

Parameters
  • drop_returns_to_stays (bool) – if True, when recollapsing collapsed data, drop observations that need to be recollapsed instead of collapsing (this is for computational efficiency when re-collapsing data for leave-one-out connected components, where intermediate observations can be dropped, causing a worker who returns to a firm to become a stayer)

  • is_sorted (bool) – if False, dataframe will be sorted by i (and t, if included). Returned dataframe will be sorted. Sorting may alter original dataframe if copy is set to False. Set is_sorted to True if dataframe is already sorted.

  • copy (bool) – if False, avoid copy

Returns

recollapsed dataframe

Return type

(BipartiteLongCollapsed)
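The A B A case described above can be sketched in plain Python. This is an illustration of the idea, not the library's implementation: recollapse_spells is a hypothetical helper, the column names i, j, t1, t2, and y are assumptions, and the merged y is taken as an unweighted mean purely for simplicity.

```python
from itertools import groupby
from statistics import mean


def recollapse_spells(spells):
    """Merge consecutive spells that share the same worker and firm.

    `spells` is a list of dicts already sorted by worker and then by time.
    """
    merged = []
    # groupby only groups adjacent items, which is exactly what is needed:
    # spells are merged only when they are consecutive.
    for _, group in groupby(spells, key=lambda s: (s["i"], s["j"])):
        group = list(group)
        merged.append({
            "i": group[0]["i"],
            "j": group[0]["j"],
            "t1": group[0]["t1"],    # start of the first merged spell
            "t2": group[-1]["t2"],   # end of the last merged spell
            "y": mean([s["y"] for s in group]),  # unweighted mean (assumption)
        })
    return merged


# Worker 0 works at firms A, B, A.
spells = [
    {"i": 0, "j": "A", "t1": 1, "t2": 2, "y": 10.0},
    {"i": 0, "j": "B", "t1": 3, "t2": 3, "y": 12.0},
    {"i": 0, "j": "A", "t1": 4, "t2": 5, "y": 14.0},
]

# Suppose firm B is removed (e.g. it is not in the retained component).
without_b = [s for s in spells if s["j"] != "B"]
recollapsed = recollapse_spells(without_b)
```

After firm B is removed, the two spells at firm A are adjacent and merge into one observation, so the worker now has a single spell. Applied to the original A B A data, nothing is merged, because the two A spells are not consecutive.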

to_permutedeventstudy(order='sequential', move_to_worker=False, is_sorted=False, copy=True, rng=None)

Return collapsed long form data reformatted into collapsed permuted event study data. In this method, permuting the data means combining each set of two observations drawn from a single worker into an event study observation (e.g. if a worker works at firms A, B, and C, this will create data with rows A-B; B-C; and A-C).

Parameters
  • order (str) – if 'sequential', each observation will be in sequential order; if 'income', order will be set based on the average income of the worker

  • move_to_worker (bool) – if True, each move is treated as a new worker

  • is_sorted (bool) – if False, dataframe will be sorted by i (and t, if included). Returned dataframe will be sorted. Sorting may alter original dataframe if copy is set to False. Set is_sorted to True if dataframe is already sorted.

  • copy (bool) – if False, avoid copy

  • rng (np.random.Generator or None) – NumPy random number generator; None is equivalent to np.random.default_rng(None)

Returns

permuted collapsed event study dataframe

Return type

(Pandas DataFrame)
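The pairing described above (firms A, B, and C giving rows A-B, B-C, and A-C) corresponds to taking every two-element combination of a worker's observations. A minimal sketch using the standard library, with permuted_pairs as a hypothetical helper name and only the sequential ordering considered:

```python
from itertools import combinations


def permuted_pairs(firms):
    """Return every pair of observations for one worker.

    `firms` lists the worker's firms in sequential order; each pair keeps
    that order, so the earlier observation always comes first.
    """
    return list(combinations(firms, 2))


pairs = permuted_pairs(["A", "B", "C"])
```

A worker with n observations yields n * (n - 1) / 2 pairs, so the permuted data can be much larger than the collapsed long data for workers with many spells. The row order produced here (A-B, A-C, B-C) is a property of itertools.combinations and is not meant to reflect the row order of the returned dataframe.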

uncollapse(drop_no_collapse_columns=True, is_sorted=False, copy=True)

Return collapsed long data reformatted into long data, by assuming variables are constant over each spell.

Parameters
  • drop_no_collapse_columns (bool) – if True, columns marked by self.col_collapse_dict as None (i.e. they should be dropped) will be dropped

  • is_sorted (bool) – if False, dataframe will be sorted by i (and t, if included). Returned dataframe will be sorted. Sorting may alter original dataframe if copy is set to False. Set is_sorted to True if dataframe is already sorted.

  • copy (bool) – if False, avoid copy

Returns

collapsed long data reformatted as long data

Return type

(BipartiteLong)
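The idea of holding variables constant over a spell can be sketched as follows. This is an illustration only: uncollapse_spell is a hypothetical helper, and the sketch assumes that a spell records integer start and end periods in columns named t1 and t2.

```python
def uncollapse_spell(spell):
    """Expand one collapsed spell into one row per period.

    Assumes integer periods, with `t1` and `t2` as the inclusive start and
    end of the spell. Every other variable is copied unchanged to each row.
    """
    constant = {k: v for k, v in spell.items() if k not in ("t1", "t2")}
    return [
        {**constant, "t": t}
        for t in range(spell["t1"], spell["t2"] + 1)
    ]


collapsed = {"i": 0, "j": "A", "t1": 3, "t2": 5, "y": 12.0}
long_rows = uncollapse_spell(collapsed)
```

The spell covering periods 3 through 5 becomes three rows, each with the same i, j, and y. Any within-spell variation that existed before collapsing is not recovered.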