{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Formats\n", "\n", "## Import the BipartitePandas package\n", "\n", "Make sure to install it using `pip install bipartitepandas`." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import bipartitepandas as bpd" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Get your data ready\n", "\n", "For this notebook, we simulate data." ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " i j y t l k alpha psi\n", "0 0 31 -0.066354 0 2 1 0.000000 -0.908458\n", "1 0 24 -1.319229 1 2 1 0.000000 -0.908458\n", "2 0 24 -2.141775 2 2 1 0.000000 -0.908458\n", "3 0 24 -1.831186 3 2 1 0.000000 -0.908458\n", "4 0 24 0.086822 4 2 1 0.000000 -0.908458\n", "... ... ... ... .. .. .. ... ...\n", "49995 9999 188 2.877201 0 4 9 0.967422 1.335178\n", "49996 9999 126 1.415829 1 4 6 0.967422 0.348756\n", "49997 9999 126 0.772108 2 4 6 0.967422 0.348756\n", "49998 9999 176 2.263156 3 4 8 0.967422 0.908458\n", "49999 9999 176 1.796405 4 4 8 0.967422 0.908458\n", "\n", "[50000 rows x 8 columns]" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "df = bpd.SimBipartite().simulate()\n", "display(df)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Columns\n", "\n", "BipartitePandas includes seven pre-defined general columns:\n", "\n", "#### Required\n", "- `i`: worker id (any type)\n", "- `j`: firm id (any type)\n", "- `y`: income (float or int)\n", "\n", "#### Optional\n", "- `t`: time (int)\n", "- `g`: firm type (any type)\n", "- `w`: weight (float or int)\n", "- `m`: move indicator (int)\n", "\n", "## Formats\n", "\n", "BipartitePandas includes six formats:\n", "\n", "- *Long* - each row gives a single observation\n", "- *Collapsed Long* - like *Long*, but employment spells at the same firm, or entire worker-firm matches, are collapsed into a single observation (these will differ if there are workers who leave, then return to, a particular firm)\n", "- *Event Study* - each row gives two consecutive observations\n", "- *Collapsed Event Study* - like *Event Study*, but employment spells at the same firm, or entire worker-firm matches, are collapsed into a single observation (these will differ if there are workers who leave, then return to, a particular firm)\n", "- *Extended Event Study* - each row gives arbitrarily many consecutive observations\n", "- *Collapsed Extended Event Study* - like *Extended Event Study*, 
but employment spells at the same firm, or entire worker-firm matches, are collapsed into a single observation (these will differ if there are workers who leave, then return to, a particular firm)\n", "\n", "These formats divide general columns differently:\n", "\n", "- *Long* - `i`, `j`, `y`, `t`, `g`, `w`, `m`\n", "- *Collapsed Long* - `i`, `j`, `y`, `t1`, `t2`, `g`, `w`, `m`\n", "- *Event Study* - `i`, `j1`, `j2`, `y1`, `y2`, `t1`, `t2`, `g1`, `g2`, `w1`, `w2`, `m`\n", "- *Collapsed Event Study* - `i`, `j1`, `j2`, `y1`, `y2`, `t11`, `t12`, `t21`, `t22`, `g1`, `g2`, `w1`, `w2`, `m`\n", "- *Extended Event Study* - `i`, `j1`, ..., `jp`, `y1`, ..., `yp`, `t1`, ..., `tp`, `g1`, ..., `gp`, `w1`, ..., `wp`, `m`\n", "- *Collapsed Extended Event Study* - `i`, `j1`, ..., `jp`, `y1`, ..., `yp`, `t11`, `t12`, ..., `tp1`, `tp2`, `g1`, ..., `gp`, `w1`, ..., `wp`, `m`\n", "\n", "
\n", "\n", "Note\n", "\n", "*Event Study* and *Extended Event Study* differ even if *Extended Event Study* has 2 periods. This is because *Event Study* treats each stayer observation as a new event study, while *Extended Event Study* treats stayers the same as movers: event studies are based on consecutive observations.\n", "\n", "In addition to the fact that stayers are treated differently for non-collapsed data, *Collapsed Event Study* will contain stayers, but *Collapsed Extended Event Study* will not.\n", "\n", "
\n", "\n", "## Constructing DataFrames\n", "\n", "Our simulated data is in *Long* format. How do we construct a *Long* dataframe?" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "scrolled": false }, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " i j y t\n", "0 0 31 -0.066354 0\n", "1 0 24 -1.319229 1\n", "2 0 24 -2.141775 2\n", "3 0 24 -1.831186 3\n", "4 0 24 0.086822 4\n", "... ... ... ... ..\n", "49995 9999 188 2.877201 0\n", "49996 9999 126 1.415829 1\n", "49997 9999 126 0.772108 2\n", "49998 9999 176 2.263156 3\n", "49999 9999 176 1.796405 4\n", "\n", "[50000 rows x 4 columns]" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "bdf_long = bpd.BipartiteDataFrame(\n", " i=df['i'], j=df['j'], y=df['y'], t=df['t']\n", ")\n", "display(bdf_long)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Are we sure this is long? Let's check the datatype:" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "bipartitepandas.bipartitelong.BipartiteLong" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "type(bdf_long)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This method works to construct any format! Just make sure not to mix up columns between formats." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Converting between formats\n", "\n", "Converting between formats is meant to be easy. Methods exist to go from:\n", "\n", "- *Long* to *Collapsed Long* (`.collapse()`)\n", "- *Long* to *Event Study* (`.to_eventstudy()`)\n", "- *Long* to *Extended Event Study* (`.to_extendedeventstudy()`)\n", "- *Collapsed Long* to *Long* (`.uncollapse()`)\n", "- *Collapsed Long* to *Collapsed Event Study* (`.to_eventstudy()`)\n", "- *Collapsed Long* to *Collapsed Extended Event Study* (`.to_extendedeventstudy()`)\n", "- *Event Study* to *Long* (`.to_long()`)\n", "- *Collapsed Event Study* to *Collapsed Long* (`.to_long()`)\n", "\n", "Let's experiment with these and see what happens. Before we start, we just need to clean our data to make sure the conversions work properly (notice the new `m` column)." 
] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "checking required columns and datatypes\n", "sorting rows\n", "dropping NaN observations\n", "generating 'm' column\n", "keeping highest paying job for i-t (worker-year) duplicates (how='max')\n", "dropping workers who leave a firm then return to it (how=False)\n", "making 'i' ids contiguous\n", "making 'j' ids contiguous\n", "computing largest connected set (how=None)\n", "sorting columns\n", "resetting index\n" ] }, { "data": { "text/html": [ "
" ], "text/plain": [ " i j y t m\n", "0 0 31 -0.066354 0 1\n", "1 0 24 -1.319229 1 1\n", "2 0 24 -2.141775 2 0\n", "3 0 24 -1.831186 3 0\n", "4 0 24 0.086822 4 0\n", "... ... ... ... .. ..\n", "49995 9999 188 2.877201 0 1\n", "49996 9999 126 1.415829 1 1\n", "49997 9999 126 0.772108 2 1\n", "49998 9999 176 2.263156 3 1\n", "49999 9999 176 1.796405 4 0\n", "\n", "[50000 rows x 5 columns]" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "bdf_long = bdf_long.clean()\n", "display(bdf_long)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### *Long* to *Collapsed Long*\n", "\n", "Notice that:\n", "\n", "- We specify `level='spell'` to collapse employment spells at the same firm into single observations\n", "- `t` splits into `t1` and `t2`, which indicate the start and the end of the spell, respectively\n", "- `w` is new - it gives the number of observations in the spell" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " i j y t1 t2 w m\n", "0 0 31 -0.066354 0 0 1 1\n", "1 0 24 -1.301342 1 4 4 1\n", "2 1 140 0.237748 0 1 2 1\n", "3 1 81 -1.262375 2 2 1 2\n", "4 1 9 -1.098072 3 4 2 1\n", "... ... ... ... .. .. .. ..\n", "29684 9998 139 0.361819 0 3 4 1\n", "29685 9998 30 -0.136980 4 4 1 1\n", "29686 9999 188 2.877201 0 0 1 1\n", "29687 9999 126 1.093969 1 2 2 2\n", "29688 9999 176 2.029781 3 4 2 1\n", "\n", "[29689 rows x 7 columns]" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "bdf_collapsedlong = bdf_long.collapse(level='spell')\n", "display(bdf_collapsedlong)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### *Long* to *Event Study*\n", "\n", "Notice that:\n", "\n", "- `j` splits into `j1` and `j2`, which indicate the first and second firm id in the event study, respectively\n", "- `y` splits into `y1` and `y2`, which indicate the first and second income in the event study, respectively\n", "- `t` splits into `t1` and `t2`, which indicate the first and second period in the event study, respectively\n", "\n", "
\n", "\n", "Note\n", "\n", "For stayers (individuals who stay at the same firm for all their observations), each row in the event study represents a single observation, since they never move firms.\n", "\n", "
" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " i j1 j2 y1 y2 t1 t2 m\n", "0 0 31 24 -0.066354 -1.319229 0 1 1\n", "1 0 24 24 -1.319229 -2.141775 1 2 0\n", "2 0 24 24 -2.141775 -1.831186 2 3 0\n", "3 0 24 24 -1.831186 0.086822 3 4 0\n", "4 1 140 140 1.157531 -0.682035 0 1 0\n", "... ... ... ... ... ... .. .. ..\n", "40637 9998 139 30 -0.256372 -0.136980 3 4 1\n", "40638 9999 188 126 2.877201 1.415829 0 1 1\n", "40639 9999 126 126 1.415829 0.772108 1 2 0\n", "40640 9999 126 176 0.772108 2.263156 2 3 1\n", "40641 9999 176 176 2.263156 1.796405 3 4 0\n", "\n", "[40642 rows x 8 columns]" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "bdf_eventstudy = bdf_long.to_eventstudy()\n", "display(bdf_eventstudy)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### *Collapsed Long* to *Collapsed Event Study*\n", "\n", "Notice that:\n", "\n", "- `j` splits into `j1` and `j2`, which indicate the first and second firm id in the event study, respectively\n", "- `y` splits into `y1` and `y2`, which indicate the first and second income in the event study, respectively\n", "- `t1` splits into `t11` and `t12`, which indicate the start and the end of the spell for the first observation in the event study, respectively\n", "- `t2` splits into `t21` and `t22`, which indicate the start and the end of the spell for the second observation in the event study, respectively\n", "- `w` splits into `w1` and `w2`, which indicate the number of observations in the first and second spell in the event study, respectively" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " i j1 j2 y1 y2 t11 t12 t21 t22 w1 w2 m\n", "0 0 31 24 -0.066354 -1.301342 0 0 1 4 1 4 1\n", "1 1 140 81 0.237748 -1.262375 0 1 2 2 2 1 1\n", "2 1 81 9 -1.262375 -1.098072 2 2 3 4 1 2 1\n", "3 2 126 131 0.538577 0.628338 0 0 1 2 1 2 1\n", "4 2 131 33 0.628338 -0.646154 1 2 3 4 2 2 1\n", "... ... ... ... ... ... ... ... ... ... .. .. ..\n", "20326 9996 105 156 0.493585 0.097240 0 1 2 4 2 3 1\n", "20327 9997 147 98 -0.331835 -1.363738 0 1 2 4 2 3 1\n", "20328 9998 139 30 0.361819 -0.136980 0 3 4 4 4 1 1\n", "20329 9999 188 126 2.877201 1.093969 0 0 1 2 1 2 1\n", "20330 9999 126 176 1.093969 2.029781 1 2 3 4 2 2 1\n", "\n", "[20331 rows x 12 columns]" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "bdf_collapsedeventstudy = bdf_collapsedlong.to_eventstudy()\n", "display(bdf_collapsedeventstudy)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We showed how to get from *Long* to *Collapsed Long*, *Event Study*, and *Collapsed Event Study*, but feel free to experiment and see what happens when you convert in other directions!\n", "\n", "## Initializing from different formats\n", "\n", "If your data is saved in a format other than *Long*, it's simple to construct a BipartiteDataFrame.\n", "\n", "#### Initializing from *Collapsed Long* format" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " i j y t1 t2\n", "0 0 31 -0.066354 0 0\n", "1 0 24 -1.301342 1 4\n", "2 1 140 0.237748 0 1\n", "3 1 81 -1.262375 2 2\n", "4 1 9 -1.098072 3 4\n", "... ... ... ... .. ..\n", "29684 9998 139 0.361819 0 3\n", "29685 9998 30 -0.136980 4 4\n", "29686 9999 188 2.877201 0 0\n", "29687 9999 126 1.093969 1 2\n", "29688 9999 176 2.029781 3 4\n", "\n", "[29689 rows x 5 columns]" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "i = bdf_collapsedlong['i']\n", "j = bdf_collapsedlong['j']\n", "y = bdf_collapsedlong['y']\n", "t1 = bdf_collapsedlong['t1']\n", "t2 = bdf_collapsedlong['t2']\n", "bdf_collapsedlong = bpd.BipartiteDataFrame(\n", " i=i, j=j, y=y, t1=t1, t2=t2\n", ")\n", "display(bdf_collapsedlong)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's check the datatype:" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "bipartitepandas.bipartitelongcollapsed.BipartiteLongCollapsed" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "type(bdf_collapsedlong)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Initializing from *Event Study* format" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " i j1 j2 y1 y2 t1 t2\n", "0 0 31 24 -0.066354 -1.319229 0 1\n", "1 0 24 24 -1.319229 -2.141775 1 2\n", "2 0 24 24 -2.141775 -1.831186 2 3\n", "3 0 24 24 -1.831186 0.086822 3 4\n", "4 1 140 140 1.157531 -0.682035 0 1\n", "... ... ... ... ... ... .. ..\n", "40637 9998 139 30 -0.256372 -0.136980 3 4\n", "40638 9999 188 126 2.877201 1.415829 0 1\n", "40639 9999 126 126 1.415829 0.772108 1 2\n", "40640 9999 126 176 0.772108 2.263156 2 3\n", "40641 9999 176 176 2.263156 1.796405 3 4\n", "\n", "[40642 rows x 7 columns]" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "i = bdf_eventstudy['i']\n", "j1 = bdf_eventstudy['j1']\n", "j2 = bdf_eventstudy['j2']\n", "y1 = bdf_eventstudy['y1']\n", "y2 = bdf_eventstudy['y2']\n", "t1 = bdf_eventstudy['t1']\n", "t2 = bdf_eventstudy['t2']\n", "bdf_eventstudy = bpd.BipartiteDataFrame(\n", " i=i, j1=j1, j2=j2, y1=y1, y2=y2, t1=t1, t2=t2\n", ")\n", "display(bdf_eventstudy)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's check the datatype:" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "bipartitepandas.bipartiteeventstudy.BipartiteEventStudy" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "type(bdf_eventstudy)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Initializing from *Collapsed Event Study* format" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " i j1 j2 y1 y2 t11 t12 t21 t22\n", "0 0 31 24 -0.066354 -1.301342 0 0 1 4\n", "1 1 140 81 0.237748 -1.262375 0 1 2 2\n", "2 1 81 9 -1.262375 -1.098072 2 2 3 4\n", "3 2 126 131 0.538577 0.628338 0 0 1 2\n", "4 2 131 33 0.628338 -0.646154 1 2 3 4\n", "... ... ... ... ... ... ... ... ... ...\n", "20326 9996 105 156 0.493585 0.097240 0 1 2 4\n", "20327 9997 147 98 -0.331835 -1.363738 0 1 2 4\n", "20328 9998 139 30 0.361819 -0.136980 0 3 4 4\n", "20329 9999 188 126 2.877201 1.093969 0 0 1 2\n", "20330 9999 126 176 1.093969 2.029781 1 2 3 4\n", "\n", "[20331 rows x 9 columns]" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "i = bdf_collapsedeventstudy['i']\n", "j1 = bdf_collapsedeventstudy['j1']\n", "j2 = bdf_collapsedeventstudy['j2']\n", "y1 = bdf_collapsedeventstudy['y1']\n", "y2 = bdf_collapsedeventstudy['y2']\n", "t11 = bdf_collapsedeventstudy['t11']\n", "t12 = bdf_collapsedeventstudy['t12']\n", "t21 = bdf_collapsedeventstudy['t21']\n", "t22 = bdf_collapsedeventstudy['t22']\n", "bdf_collapsedeventstudy = bpd.BipartiteDataFrame(\n", " i=i, j1=j1, j2=j2, y1=y1, y2=y2,\n", " t11=t11, t12=t12, t21=t21, t22=t22\n", ")\n", "display(bdf_collapsedeventstudy)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's check the datatype:" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "bipartitepandas.bipartiteeventstudycollapsed.BipartiteEventStudyCollapsed" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "type(bdf_collapsedeventstudy)" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.4" } }, "nbformat": 4, "nbformat_minor": 4 }