Python API

Overview

The main BipartitePandas API is split into twelve classes, four of which are base classes and one of which is for simulating bipartite data. It also has two modules for clustering: one for computing measures and one for grouping on measures. BipartitePandas is canonically imported using:

import bipartitepandas as bpd

Main classes

  • bipartitepandas.BipartiteDataFrame: Class to easily construct bipartite dataframes without explicitly specifying a format

  • bipartitepandas.BipartiteLong: Class for bipartite networks in long format

  • bipartitepandas.BipartiteLongCollapsed: Class for bipartite networks in collapsed long format (i.e. employment spells are collapsed into a single observation)

  • bipartitepandas.BipartiteEventStudy: Class for bipartite networks in event study format

  • bipartitepandas.BipartiteEventStudyCollapsed: Class for bipartite networks in collapsed event study format (i.e. employment spells are collapsed into a single observation)

  • bipartitepandas.BipartiteExtendedEventStudy: Class for bipartite networks in extended event study format

  • bipartitepandas.BipartiteExtendedEventStudyCollapsed: Class for bipartite networks in collapsed extended event study format (i.e. employment spells are collapsed into a single observation)

  • bipartitepandas.SimBipartite: Class for simulating bipartite networks

Base classes

  • bipartitepandas.BipartiteBase: Base class for BipartiteLongBase and BipartiteEventStudyBase. All methods are usable by any class that inherits from BipartiteBase.

  • bipartitepandas.BipartiteLongBase: Base class for BipartiteLong and BipartiteLongCollapsed. All methods are usable by any class that inherits from BipartiteLongBase.

  • bipartitepandas.BipartiteEventStudyBase: Base class for BipartiteEventStudy and BipartiteEventStudyCollapsed. All methods are usable by any class that inherits from BipartiteEventStudyBase.

  • bipartitepandas.BipartiteExtendedEventStudyBase: Base class for BipartiteExtendedEventStudy and BipartiteExtendedEventStudyCollapsed. All methods are usable by any class that inherits from BipartiteExtendedEventStudyBase.

Clustering modules

  • bipartitepandas.measures: Module for computing measures

  • bipartitepandas.grouping: Module for grouping on measures

Classes and Methods

bipartitepandas.BipartiteDataFrame

BipartiteDataFrame(i[, j, j1, j2, y, y1, ...])

Class for constructing BipartitePandas dataframes without explicitly specifying a format.

bipartitepandas.BipartiteBase

BipartiteBase(*args[, columns_req, ...])

Base class for BipartitePandas, where BipartitePandas gives a bipartite network of firms and workers.

add_column(col_name[, col_data, ...])

Safe method for adding custom columns.

cluster([params, rng])

Cluster data and assign a new column giving the cluster for each firm.

copy([deep])

Return copy of self.

diagnostic()

Run diagnostic and print diagnostic report.

drop(labels[, axis, inplace, ...])

Drop labels along axis.

drop_rows(rows[, drop_returns_to_stays, ...])

Drop particular rows.

get_column_properties(col_name)

Return dictionary linking properties to their value for a particular column.

log(message[, level])

Log a message at the specified level.

log_on([on])

Toggle logger on or off.

merge(*args, **kwargs)

Merge two BipartiteBase objects.

min_movers_firms([threshold, is_sorted, copy])

List firms with at least threshold many movers.
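To illustrate what "at least threshold many movers" means, here is a minimal pure-Python sketch that does not call BipartitePandas. It assumes a mover is a worker observed at more than one firm, and that a firm's mover count is the number of distinct movers ever observed there. The helper name `firms_with_min_movers` and the `(worker_id, firm_id)` tuple layout are hypothetical and chosen only for this example.

```python
from collections import defaultdict

def firms_with_min_movers(observations, threshold):
    """Return sorted firm ids with at least `threshold` distinct movers.

    `observations` is an iterable of (worker_id, firm_id) pairs.
    Hypothetical helper for illustration; not the library's implementation.
    """
    firms_by_worker = defaultdict(set)
    for i, j in observations:
        firms_by_worker[i].add(j)

    # Assumption: a mover is a worker observed at more than one firm
    movers_by_firm = defaultdict(set)
    for i, firms in firms_by_worker.items():
        if len(firms) > 1:
            for j in firms:
                movers_by_firm[j].add(i)

    return sorted(j for j, movers in movers_by_firm.items() if len(movers) >= threshold)

# Workers 0 and 1 are movers; workers 2 and 3 are stayers
obs = [(0, 'a'), (0, 'b'), (1, 'a'), (1, 'c'), (2, 'a'), (3, 'b'), (3, 'b')]
valid_firms = firms_with_min_movers(obs, threshold=2)
```

Only firm `'a'` is visited by two distinct movers, so it is the only firm kept at `threshold=2`.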

n_clusters()

Get the number of unique clusters.

n_firms()

Get the number of unique firms.

n_unique_ids(id_col)

Number of unique ids in column.

n_workers()

Get the number of unique workers.

original_ids([copy])

Return self merged with original column ids.

print_column_properties(col_name)

Print properties associated with a particular column.

rename(rename_dict[, axis, inplace, ...])

Rename a column.

set_column_properties(col_name[, ...])

Safe method for setting the properties of pre-existing custom columns.

sort_cols([copy])

Sort frame columns (not in-place).

sort_rows([j_if_no_t, is_sorted, copy])

Sort rows by i and t.

summary()

Print summary statistics.

unique_ids(id_col)

Unique ids in column.

bipartitepandas.BipartiteLongBase

BipartiteLongBase(*args[, col_reference_dict])

Base class for BipartiteLong and BipartiteLongCollapsed, where BipartiteLong and BipartiteLongCollapsed give a bipartite network of firms and workers in long and collapsed long form, respectively.

clean([params])

Clean data to make sure there are no NaN or duplicate observations, observations where workers leave a firm then return to it are removed, firms are connected by movers, and categorical ids are contiguous.
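The requirement that firms are connected by movers can be sketched with a union-find structure. The example below is a pure-Python illustration under stated assumptions, not the library's cleaning routine: two firms are treated as linked when the same worker is observed at both, and the "largest" connected set is the one containing the most firms. The helper name `largest_connected_firms` is hypothetical.

```python
from collections import defaultdict

def largest_connected_firms(observations):
    """Return the largest set of firms linked to each other through movers.

    `observations` is an iterable of (worker_id, firm_id) pairs.
    Hypothetical helper for illustration only.
    """
    parent = {}

    def find(x):
        # Follow parent pointers to the root, halving the path on the way
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    first_firm = {}
    for i, j in observations:
        parent.setdefault(j, j)
        if i in first_firm:
            # Linking every firm of a worker to that worker's first firm
            # is enough to place all of them in one component
            union(first_firm[i], j)
        else:
            first_firm[i] = j

    components = defaultdict(set)
    for j in parent:
        components[find(j)].add(j)
    return max(components.values(), key=len)

obs = [(0, 'a'), (0, 'b'), (1, 'b'), (1, 'c'), (2, 'd'), (3, 'd'), (3, 'e')]
connected = largest_connected_firms(obs)
```

Here firms `a`, `b`, and `c` are linked through workers 0 and 1, while `d` and `e` form a smaller, separate component.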

construct_artificial_time([time_per_worker, ...])

Construct artificial time column(s) to enable conversion to (collapsed) event study format.

drop_ids(id_col, drop_ids_list[, ...])

Drop ids belonging to a given set of ids.

gen_m([force, copy])

Generate m column for data (m == 0 if stayer, m == 1 or 2 if mover).

keep_ids(id_col, keep_ids_list[, ...])

Only keep ids belonging to a given set of ids.

keep_rows(rows_list[, ...])

Only keep particular rows.

min_joint_obs_frame([threshold_1, ...])

Return dataframe where column 1 ids have at least threshold_1 many observations and column 2 ids have at least threshold_2 many observations.

min_movers_frame([threshold, ...])

Return dataframe where all firms have at least threshold many movers.

min_moves_firms([threshold])

List firms with at least threshold many moves.

min_moves_frame([threshold, ...])

Return dataframe where all firms have at least threshold many moves.

min_obs_ids([threshold, id_col, is_sorted, copy])

List column ids with at least threshold many observations.

min_obs_frame([threshold, id_col, ...])

Return dataframe of column ids with at least threshold many observations.

min_workers_firms([threshold, is_sorted, copy])

List firms with at least threshold many workers.

min_workers_frame([threshold, ...])

Return dataframe of firms with at least threshold many workers.

to_eventstudy([move_to_worker, is_sorted, copy])

Return (collapsed) long form data reformatted into (collapsed) event study data.
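As a rough picture of the reshaping involved, the sketch below pairs consecutive observations of each worker into a single row with `1`/`2` suffixed fields. It is a simplified pure-Python illustration, not the library's conversion: the tuple layouts are hypothetical, and duplicating a lone observation into both halves of the row is an assumption made for this example.

```python
from itertools import groupby
from operator import itemgetter

def long_to_event_study(rows):
    """Pair consecutive observations of each worker.

    `rows` holds (i, j, y, t) tuples. Returns (i, j1, j2, y1, y2, t1, t2) tuples.
    Hypothetical helper for illustration only.
    """
    rows = sorted(rows, key=itemgetter(0, 3))  # sort by worker, then time
    events = []
    for i, group in groupby(rows, key=itemgetter(0)):
        obs = list(group)
        if len(obs) == 1:
            # Assumption: a single observation fills both halves of the row
            pairs = [(obs[0], obs[0])]
        else:
            pairs = list(zip(obs[:-1], obs[1:]))
        for (_, j1, y1, t1), (_, j2, y2, t2) in pairs:
            events.append((i, j1, j2, y1, y2, t1, t2))
    return events

long_rows = [(0, 'a', 1.0, 1), (0, 'b', 2.0, 2), (1, 'c', 3.0, 1)]
events = long_to_event_study(long_rows)
```

Worker 0 moves from firm `a` to firm `b`, which yields one row with `j1 != j2`; worker 1 is observed once and yields a row with `j1 == j2`.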

to_extendedeventstudy([periods_pre, ...])

Return (collapsed) long form data reformatted into (collapsed) extended event study data.

bipartitepandas.BipartiteLong

BipartiteLong(*args[, col_reference_dict, ...])

Class for bipartite networks of firms and workers in long form.

collapse([level, is_sorted, copy])

Collapse long data at the worker-firm spell/match level (so each spell/match for a particular worker at a particular firm becomes one observation).
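The idea of collapsing at the spell level can be sketched in pure Python as follows. This is an illustration under assumptions rather than the library's method: rows are hypothetical `(i, j, y, t)` tuples, a spell is a run of consecutive observations at the same firm, and the outcome `y` is aggregated by its mean.

```python
from itertools import groupby
from operator import itemgetter
from statistics import fmean

def collapse_spells(rows):
    """Collapse consecutive same-firm observations into (i, j, y, t1, t2) spells.

    Hypothetical helper for illustration only.
    """
    rows = sorted(rows, key=itemgetter(0, 3))  # sort by worker, then time
    spells = []
    # groupby only merges adjacent rows, so a return to an earlier firm
    # starts a new spell instead of extending the old one
    for (i, j), group in groupby(rows, key=itemgetter(0, 1)):
        obs = list(group)
        y_mean = fmean(y for _, _, y, _ in obs)  # assumption: aggregate y by mean
        spells.append((i, j, y_mean, obs[0][3], obs[-1][3]))
    return spells

long_rows = [(0, 'a', 1.0, 1), (0, 'a', 3.0, 2), (0, 'b', 5.0, 3), (1, 'c', 2.0, 1)]
spells = collapse_spells(long_rows)
```

Worker 0's two periods at firm `a` become a single observation spanning periods 1 to 2.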

fill_missing_periods([fill_dict, is_sorted, ...])

Return Pandas dataframe of long format data with missing periods filled in as unemployed.
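A minimal pure-Python sketch of filling missing periods is given below. It assumes integer time periods and hypothetical `(i, j, y, t)` tuples; the placeholder values `fill_j=-1` and `fill_y=None` are arbitrary choices for this example and are not taken from the library.

```python
from itertools import groupby
from operator import itemgetter

def fill_gaps(rows, fill_j=-1, fill_y=None):
    """Insert a placeholder row for every period a worker is not observed.

    Only gaps between a worker's first and last observed period are filled.
    Hypothetical helper for illustration only.
    """
    rows = sorted(rows, key=itemgetter(0, 3))  # sort by worker, then time
    filled = []
    for i, group in groupby(rows, key=itemgetter(0)):
        observed = {t: (j, y) for _, j, y, t in group}
        for t in range(min(observed), max(observed) + 1):
            # Missing periods get the unemployment placeholder
            j, y = observed.get(t, (fill_j, fill_y))
            filled.append((i, j, y, t))
    return filled

long_rows = [(0, 'a', 1.0, 1), (0, 'b', 2.0, 4)]
filled_rows = fill_gaps(long_rows)
```

Periods 2 and 3 are missing for worker 0, so two placeholder rows are added between the observed ones.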

get_worker_m([is_sorted])

Get NumPy array indicating whether the worker associated with each observation is a mover.

bipartitepandas.BipartiteLongCollapsed

BipartiteLongCollapsed(*args[, ...])

Class for bipartite networks of firms and workers in collapsed long form (i.e. employment spells are collapsed into a single observation).

get_worker_m([is_sorted])

Get NumPy array indicating whether the worker associated with each observation is a mover.

recollapse([drop_returns_to_stays, ...])

Recollapse data by job spells (so each spell for a particular worker at a particular firm is one observation).

to_permutedeventstudy([order, ...])

Return collapsed long form data reformatted into collapsed permuted event study data.

uncollapse([drop_no_collapse_columns, ...])

Return collapsed long data reformatted into long data, by assuming variables constant over spells.
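The phrase "assuming variables constant over spells" can be made concrete with a short pure-Python sketch. It assumes integer periods and hypothetical `(i, j, y, t1, t2)` spell tuples, and it is not the library's implementation.

```python
def uncollapse_spells(spells):
    """Expand (i, j, y, t1, t2) spells into one (i, j, y, t) row per period.

    Every period in a spell receives the spell's single value of y.
    Hypothetical helper for illustration only.
    """
    return [
        (i, j, y, t)
        for i, j, y, t1, t2 in spells
        for t in range(t1, t2 + 1)
    ]

spells = [(0, 'a', 2.0, 1, 2), (0, 'b', 5.0, 3, 3)]
long_rows = uncollapse_spells(spells)
```

Any within-spell variation in `y` that existed before collapsing is not recovered, since only the spell-level value is stored.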

bipartitepandas.BipartiteEventStudyBase

BipartiteEventStudyBase(*args[, ...])

Base class for BipartiteEventStudy and BipartiteEventStudyCollapsed, which give bipartite networks of firms and workers in event study and collapsed event study form, respectively.

clean([params])

Clean data to make sure there are no NaN or duplicate observations, observations where workers leave a firm then return to it are removed, firms are connected by movers, and categorical ids are contiguous.

construct_artificial_time([time_per_worker, ...])

Construct artificial time columns to enable conversion to (collapsed) long format.

diagnostic()

Run diagnostic and print diagnostic report.

drop_ids(id_col, drop_ids_list[, ...])

Drop ids belonging to a given set of ids.

gen_m([force, copy])

Generate m column for data (m == 0 if stayer, m == 1 if mover).

get_cs([copy])

Return (collapsed) event study data reformatted into cross section data.

keep_ids(id_col, keep_ids_list[, ...])

Only keep ids belonging to a given set of ids.

keep_rows(rows[, drop_returns_to_stays, ...])

Only keep particular rows.

min_joint_obs_frame([threshold_1, ...])

Return dataframe where column 1 ids have at least threshold_1 many observations and column 2 ids have at least threshold_2 many observations.

min_movers_frame([threshold, ...])

Return dataframe where all firms have at least threshold many movers.

min_moves_firms([threshold, is_sorted, copy])

List firms with at least threshold many moves.

min_moves_frame([threshold, ...])

Return dataframe where all firms have at least threshold many moves.

min_obs_ids([threshold, id_col, is_sorted, copy])

List column ids with at least threshold many observations.

min_obs_frame([threshold, id_col, ...])

Return dataframe of column ids with at least threshold many observations.

min_workers_firms([threshold, is_sorted, copy])

List firms with at least threshold many workers.

min_workers_frame([threshold, ...])

Return dataframe of firms with at least threshold many workers.

to_long([is_clean, drop_no_split_columns, ...])

Return (collapsed) event study data reformatted into (collapsed) long form.

bipartitepandas.BipartiteEventStudy

BipartiteEventStudy(*args[, ...])

Class for bipartite networks of firms and workers in event study form.

collapse([level, is_sorted, copy])

Collapse event study data at the worker-firm spell level (so each spell for a particular worker at a particular firm becomes one observation).

get_worker_m([is_sorted])

Get NumPy array indicating whether the worker associated with each observation is a mover.

bipartitepandas.BipartiteEventStudyCollapsed

BipartiteEventStudyCollapsed(*args[, ...])

Class for bipartite networks of firms and workers in collapsed event study form (i.e. employment spells are collapsed into a single observation).

get_worker_m([is_sorted])

Get NumPy array indicating whether the worker associated with each observation is a mover.

uncollapse([drop_no_collapse_columns, ...])

Return collapsed event study data reformatted into event study data, by assuming variables constant over spells.

bipartitepandas.BipartiteExtendedEventStudyBase

BipartiteExtendedEventStudyBase(*args[, ...])

Base class for BipartiteExtendedEventStudy and BipartiteExtendedEventStudyCollapsed, which give bipartite networks of firms and workers in extended event study and collapsed extended event study form, respectively.

clean([params])

Clean data to make sure there are no NaN or duplicate observations, observations where workers leave a firm then return to it are removed, firms are connected by movers, and categorical ids are contiguous.

construct_artificial_time([time_per_worker, ...])

Construct artificial time columns to enable conversion to (collapsed) long format.

diagnostic()

Run diagnostic and print diagnostic report.

drop_ids(id_col, drop_ids_list[, ...])

Drop ids belonging to a given set of ids.

gen_m([force, copy])

Generate m column for data (m == 0 if stayer, m == 1 if mover).

keep_ids(id_col, keep_ids_list[, ...])

Only keep ids belonging to a given set of ids.

keep_rows(rows[, drop_returns_to_stays, ...])

Only keep particular rows.

min_joint_obs_frame([threshold_1, ...])

Return dataframe where column 1 ids have at least threshold_1 many observations and column 2 ids have at least threshold_2 many observations.

min_movers_frame([threshold, ...])

Return dataframe where all firms have at least threshold many movers.

min_moves_firms([threshold, is_sorted, copy])

List firms with at least threshold many moves.

min_moves_frame([threshold, ...])

Return dataframe where all firms have at least threshold many moves.

min_obs_ids([threshold, id_col, is_sorted, copy])

List column ids with at least threshold many observations.

min_obs_frame([threshold, id_col, ...])

Return dataframe of column ids with at least threshold many observations.

min_workers_firms([threshold, is_sorted, copy])

List firms with at least threshold many workers.

min_workers_frame([threshold, ...])

Return dataframe of firms with at least threshold many workers.

to_long([drop_no_split_columns, is_sorted, copy])

Return (collapsed) extended event study data reformatted into (collapsed) long form.

bipartitepandas.BipartiteExtendedEventStudy

BipartiteExtendedEventStudy(*args[, ...])

Class for bipartite networks of firms and workers in extended event study form.

collapse([level, is_sorted, copy])

Collapse extended event study data at the worker-firm spell level (so each spell for a particular worker at a particular firm becomes one observation).

get_worker_m([is_sorted])

Get NumPy array indicating whether the worker associated with each observation is a mover.

bipartitepandas.BipartiteExtendedEventStudyCollapsed

bipartitepandas.SimBipartite

SimBipartite([params])

Class for simulating a bipartite network of firms and workers.

simulate([rng])

Simulate panel data corresponding to the calibrated model.

Modules and Methods

bipartitepandas.measures

CDFs([cdf_resolution, measure, outcome_col])

Generate cdfs of compensation for firms.
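One way to picture a per-firm CDF measure is to evaluate each firm's empirical CDF of compensation on a shared grid of points. The pure-Python sketch below does that under assumptions that are not taken from the library: the grid is evenly spaced between the pooled minimum and maximum outcome, the input is hypothetical `(firm_id, y)` pairs, and the default value of `cdf_resolution` is illustrative.

```python
from bisect import bisect_right
from collections import defaultdict

def firm_cdfs(observations, cdf_resolution=10):
    """Evaluate each firm's empirical CDF at `cdf_resolution` shared grid points.

    `observations` is an iterable of (firm_id, y) pairs.
    Hypothetical helper for illustration only.
    """
    by_firm = defaultdict(list)
    for j, y in observations:
        by_firm[j].append(y)

    # Assumption: an evenly spaced grid over the pooled range of outcomes
    pooled = [y for ys in by_firm.values() for y in ys]
    low, high = min(pooled), max(pooled)
    step = (high - low) / cdf_resolution
    grid = [low + (k + 1) * step for k in range(cdf_resolution)]

    cdfs = {}
    for j, ys in by_firm.items():
        ys.sort()
        # Share of the firm's observations at or below each grid point
        cdfs[j] = [bisect_right(ys, point) / len(ys) for point in grid]
    return cdfs

obs = [('a', 0.0), ('a', 1.0), ('a', 4.0), ('b', 2.0), ('b', 3.0)]
cdfs = firm_cdfs(obs, cdf_resolution=4)
```

Each firm is then described by a vector of length `cdf_resolution`, which is a convenient input for a grouping step.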

Moments([measures, outcome_col])

Generate compensation moments for firms.

bipartitepandas.grouping

KMeans(**kwargs)

Compute KMeans groups for data.

Quantiles([n_quantiles])

Compute quantile groups for data.
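The two clustering modules split the work into computing a measure per firm and then grouping firms on that measure. A minimal pure-Python sketch of that two-step pattern is shown below. It uses the firm mean as the measure and rank-based, equal-sized quantile groups; both choices, along with the helper names `firm_means` and `quantile_groups`, are assumptions made for illustration and do not reproduce the library's `Moments` or `Quantiles` classes.

```python
from collections import defaultdict
from statistics import fmean

def firm_means(observations):
    """Step 1 (measure): mean outcome per firm from (firm_id, y) pairs."""
    by_firm = defaultdict(list)
    for j, y in observations:
        by_firm[j].append(y)
    return {j: fmean(ys) for j, ys in by_firm.items()}

def quantile_groups(measure, n_quantiles=2):
    """Step 2 (grouping): assign each firm a group index in 0..n_quantiles-1.

    Firms are ranked by their measure and split into groups of roughly equal
    size. Hypothetical helper; firm weighting is ignored in this sketch.
    """
    ranked = sorted(measure, key=measure.get)
    return {j: rank * n_quantiles // len(ranked) for rank, j in enumerate(ranked)}

obs = [('a', 1.0), ('a', 3.0), ('b', 10.0), ('c', 4.0), ('d', 8.0)]
means = firm_means(obs)
groups = quantile_groups(means, n_quantiles=2)
```

With two groups, the two lowest-paying firms (`a` and `c`) land in group 0 and the two highest-paying firms (`d` and `b`) land in group 1.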