
Building declarative preprocessing routines

Introduction

Declarative programming is an approach to software engineering in which, unlike imperative programming, you describe what you want done rather than how to do it; a separate process then turns that description into code that can be executed.
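
To make the contrast concrete, here is a minimal sketch in plain Python (no Kedro involved). The step names, the OPERATIONS registry, and the run_spec helper are hypothetical and exist only for illustration. The imperative version spells out how to loop over the values, while the declarative version only lists which transformations to apply and leaves execution to a small interpreter.

```python
from typing import Callable


# Imperative style: we spell out *how* to process each value, step by step.
def clean_imperative(values: list[str]) -> list[str]:
    result = []
    for value in values:
        value = value.strip()
        value = value.lower()
        result.append(value)
    return result


# Declarative style: we only describe *what* should happen...
SPEC = ["strip", "lower"]

# ...and a separate process (this tiny interpreter) turns the description
# into executable steps. The registry is a hypothetical example.
OPERATIONS: dict[str, Callable[[str], str]] = {
    "strip": str.strip,
    "lower": str.lower,
}


def run_spec(spec: list[str], values: list[str]) -> list[str]:
    for step in spec:
        operation = OPERATIONS[step]
        values = [operation(value) for value in values]
    return values


raw = [" State-gov ", "Private", " Self-emp-inc"]
print(clean_imperative(raw))
print(run_spec(SPEC, raw))
```

Both calls produce the same cleaned list; the difference is that the declarative description can be stored, inspected, or reordered without touching the code that executes it.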

Kedro is an open-source Python framework designed for writing data science code that is declarative, reproducible, maintainable, and modular.

Kedro applies software engineering practices to help users build production-ready data science and data engineering pipelines.

Installation

We need to install the following libraries:

[1]:
%pip install kedro kedro-viz --quiet

About the UCI census dataset

The UCI census dataset is a dataset in which each record represents one person. Each record contains 14 columns describing a single person, taken from the 1994 United States census database. These include information such as age, marital status, and education level. The task is to determine whether a person has a high income (defined as earning more than $50,000 a year). Because of the kind of data involved, this task is often used in fairness research, partly because the dataset's attributes are easy to understand, including some sensitive ones such as age and gender, and partly because it is clearly a real-world task.
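
As a small illustration of the task, the sketch below turns the income column into a binary label and computes the share of high-income records per gender. The four rows are copied from the preview of the training data in this notebook; the helper name to_label is hypothetical, and rates computed on four rows say nothing about the full dataset.

```python
from collections import defaultdict

# A few (income, age, gender) values copied from the training data preview.
sample = [
    {"income": "<=50K", "age": 39, "gender": "Male"},
    {"income": "<=50K", "age": 28, "gender": "Female"},
    {"income": ">50K", "age": 40, "gender": "Male"},
    {"income": ">50K", "age": 52, "gender": "Female"},
]


def to_label(income: str) -> int:
    """Return 1 for a high income (">50K") and 0 otherwise."""
    return 1 if income.strip() == ">50K" else 0


labels = [to_label(row["income"]) for row in sample]

# Share of positive labels per value of a sensitive attribute (gender).
totals = defaultdict(int)
positives = defaultdict(int)
for row, label in zip(sample, labels):
    totals[row["gender"]] += 1
    positives[row["gender"]] += label

rates = {gender: positives[gender] / totals[gender] for gender in totals}
print(labels)
print(rates)
```

With pandas, the same label could be derived from a loaded DataFrame with (train["income"] == ">50K").astype(int).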

We download the dataset

[4]:
!wget https://santiagxf.blob.core.windows.net/public/datasets/uci_census.zip \
    --quiet --no-clobber
!mkdir -p datasets/uci_census
!unzip -qq uci_census.zip -d datasets/uci_census
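
The shell commands above depend on wget and unzip being available. As a hedged alternative, the sketch below does the same with the Python standard library. The helper names download_if_missing and extract_zip are hypothetical; skipping an existing file mirrors --no-clobber, and creating the destination folder mirrors mkdir -p.

```python
import urllib.request
import zipfile
from pathlib import Path


def download_if_missing(url: str, target: Path) -> Path:
    """Download `url` to `target` unless the file already exists."""
    if not target.exists():
        urllib.request.urlretrieve(url, target)
    return target


def extract_zip(archive: Path, destination: Path) -> list[str]:
    """Extract `archive` into `destination` and return the member names."""
    destination.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(destination)
        return zf.namelist()


# Usage with the same URL and folders as the shell version:
# archive = download_if_missing(
#     "https://santiagxf.blob.core.windows.net/public/datasets/uci_census.zip",
#     Path("uci_census.zip"),
# )
# extract_zip(archive, Path("datasets/uci_census"))
```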

Then we load it

[5]:
import pandas as pd
import numpy as np

train = pd.read_csv('datasets/uci_census/data/adult-train.csv')
test = pd.read_csv('datasets/uci_census/data/adult-test.csv')
[6]:
train
[6]:
income age workclass fnlwgt education education-num marital-status occupation relationship race gender capital-gain capital-loss hours-per-week native-country
0 <=50K 39 State-gov 77516 Bachelors 13 Never-married Adm-clerical Not-in-family White Male 2174 0 40 United-States
1 <=50K 50 Self-emp-not-inc 83311 Bachelors 13 Married-civ-spouse Exec-managerial Husband White Male 0 0 13 United-States
2 <=50K 38 Private 215646 HS-grad 9 Divorced Handlers-cleaners Not-in-family White Male 0 0 40 United-States
3 <=50K 53 Private 234721 11th 7 Married-civ-spouse Handlers-cleaners Husband Black Male 0 0 40 United-States
4 <=50K 28 Private 338409 Bachelors 13 Married-civ-spouse Prof-specialty Wife Black Female 0 0 40 Cuba
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
32556 <=50K 27 Private 257302 Assoc-acdm 12 Married-civ-spouse Tech-support Wife White Female 0 0 38 United-States
32557 >50K 40 Private 154374 HS-grad 9 Married-civ-spouse Machine-op-inspct Husband White Male 0 0 40 United-States
32558 <=50K 58 Private 151910 HS-grad 9 Widowed Adm-clerical Unmarried White Female 0 0 40 United-States
32559 <=50K 22 Private 201490 HS-grad 9 Never-married Adm-clerical Own-child White Male 0 0 20 United-States
32560 >50K 52 Self-emp-inc 287927 HS-grad 9 Married-civ-spouse Exec-managerial Wife White Female 15024 0 40 United-States

32561 rows × 15 columns

Projects

Kedro uses the concept of a project to organize the elements needed for a given problem.
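
A project is ultimately a folder on disk. As a minimal sketch, the hypothetical helper describe_project below lists the top-level entries of a project folder, directories first. It assumes the project lives in a folder named spaceflights (the name entered when the project is created in this notebook) and prints nothing if that folder does not exist yet. The exact contents depend on the starter template, so the helper only reports what it finds.

```python
from pathlib import Path


def describe_project(root: Path) -> list[str]:
    """Return the top-level entries of `root`, directories first."""
    entries = sorted(root.iterdir(), key=lambda p: (p.is_file(), p.name))
    return [f"{p.name}/" if p.is_dir() else p.name for p in entries]


# Assumed folder name; adjust it if the project was created elsewhere.
project_root = Path("spaceflights")
if project_root.is_dir():
    print("\n".join(describe_project(project_root)))
```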

[2]:
!kedro new --starter=spaceflights-pandas
[06/29/25 16:47:10] INFO     Using '/usr/local/lib/python3.11/dist-packages/kedro/framework/project/rich_logging.yml' as logging configuration.    __init__.py:272

Project Name
============
Please enter a human readable name for your new project.
Spaces, hyphens, and underscores are allowed.
 [Spaceflights Pandas]: spaceflights

Congratulations!
Your project 'spaceflights' has been created in the directory
/content/spaceflights

[06/29/25 16:47:22] INFO     Kedro is sending anonymous usage data with the sole purpose of improving the product. No personal data or IP addresses are stored on our side. If you want to opt out, set the `KEDRO_DISABLE_TELEMETRY` or `DO_NOT_TRACK` environment variables, or create a `.telemetry` file in the current working directory with the contents `consent: false`. Read more at https://docs.kedro.org/en/stable/configuration/telemetry.html    plugin.py:233
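
The log message above names two ways to opt out of telemetry. The sketch below shows both from Python. The helper names are hypothetical, and the value "1" for the environment variable is an assumption, since the message only says the variable has to be set. The file name and its contents come directly from the message.

```python
import os
from pathlib import Path


def opt_out_with_env() -> None:
    """Opt out for this process and the commands it launches afterwards."""
    # Assumption: any non-empty value works; the log only says "set".
    os.environ["KEDRO_DISABLE_TELEMETRY"] = "1"


def opt_out_with_file(directory: Path) -> Path:
    """Write a `.telemetry` file with `consent: false` into `directory`."""
    path = directory / ".telemetry"
    path.write_text("consent: false\n", encoding="utf-8")
    return path


# Usage (run before invoking kedro from the notebook):
# opt_out_with_env()
# opt_out_with_file(Path.cwd())
```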

Install project dependencies

Kedro projects have a requirements.txt file that specifies their dependencies and makes projects shareable by ensuring consistent Python packages and versions.

[1]:
!cat spaceflights/requirements.txt
ipython>=8.10
jupyterlab>=3.0
notebook
kedro[jupyter]~=0.19.14
kedro-datasets[pandas-csvdataset, pandas-exceldataset, pandas-parquetdataset, plotly-plotlydataset, plotly-jsondataset, matplotlib-matplotlibwriter]>=3.0
kedro-viz>=6.7.0
scikit-learn~=1.5.1
seaborn~=0.12.1
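
Several entries in the listing above use the ~= operator, the compatible release specifier: ~=0.19.14 means the same as >=0.19.14 together with ==0.19.*. The sketch below implements that rule for purely numeric versions only. It is a simplified illustration with hypothetical helper names, not the resolver pip uses, and it ignores pre-releases and other suffixes.

```python
def parse_version(text: str) -> tuple[int, ...]:
    """Turn a purely numeric version such as "0.19.14" into (0, 19, 14)."""
    return tuple(int(part) for part in text.split("."))


def is_compatible_release(candidate: str, minimum: str) -> bool:
    """Check `candidate` against `~=minimum` (numeric versions only)."""
    wanted = parse_version(minimum)
    if len(wanted) < 2:
        raise ValueError("~= needs at least two version segments")
    found = parse_version(candidate)

    # Pad with zeros so that "1.5" and "1.5.0" compare as equal.
    width = max(len(wanted), len(found))
    wanted_padded = wanted + (0,) * (width - len(wanted))
    found_padded = found + (0,) * (width - len(found))

    # Every segment of the minimum except the last must match exactly.
    prefix = wanted[:-1]
    return found_padded >= wanted_padded and found_padded[: len(prefix)] == prefix


print(is_compatible_release("1.5.2", "1.5.1"))  # a 1.5.x release at or above 1.5.1
print(is_compatible_release("1.6.0", "1.5.1"))  # outside the 1.5.* series
```

Entries that use >= instead, such as kedro-viz>=6.7.0, only set a lower bound and therefore accept any later release.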

To install all project-specific dependencies, run the following. The path is relative to the directory that contains the project; from the project's root directory the file would be referenced simply as requirements.txt:

[5]:
%pip install -r spaceflights/requirements.txt
Collecting ipython>=8.10 (from -r spaceflights/requirements.txt (line 1))
  Downloading ipython-9.3.0-py3-none-any.whl.metadata (4.4 kB)
Collecting jupyterlab>=3.0 (from -r spaceflights/requirements.txt (line 2))
  Downloading jupyterlab-4.4.4-py3-none-any.whl.metadata (16 kB)
Requirement already satisfied: notebook in /usr/local/lib/python3.11/dist-packages (from -r spaceflights/requirements.txt (line 3)) (6.5.7)
Requirement already satisfied: kedro~=0.19.14 in /usr/local/lib/python3.11/dist-packages (from kedro[jupyter]~=0.19.14->-r spaceflights/requirements.txt (line 4)) (0.19.14)
Collecting kedro-datasets>=3.0 (from kedro-datasets[matplotlib-matplotlibwriter,pandas-csvdataset,pandas-exceldataset,pandas-parquetdataset,plotly-jsondataset,plotly-plotlydataset]>=3.0->-r spaceflights/requirements.txt (line 5))
  Downloading kedro_datasets-7.0.0-py3-none-any.whl.metadata (24 kB)
Collecting kedro-viz>=6.7.0 (from -r spaceflights/requirements.txt (line 6))
  Downloading kedro_viz-11.0.2-py3-none-any.whl.metadata (12 kB)
Collecting scikit-learn~=1.5.1 (from -r spaceflights/requirements.txt (line 7))
  Downloading scikit_learn-1.5.2-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (13 kB)
Collecting seaborn~=0.12.1 (from -r spaceflights/requirements.txt (line 8))
  Downloading seaborn-0.12.2-py3-none-any.whl.metadata (5.4 kB)
Requirement already satisfied: jsonschema-specifications>=2023.03.6 in /usr/local/lib/python3.11/dist-packages (from jsonschema>=4.18.0->jupyterlab-server<3,>=2.27.1->jupyterlab>=3.0->-r spaceflights/requirements.txt (line 2)) (2025.4.1)
Requirement already satisfied: referencing>=0.28.4 in /usr/local/lib/python3.11/dist-packages (from jsonschema>=4.18.0->jupyterlab-server<3,>=2.27.1->jupyterlab>=3.0->-r spaceflights/requirements.txt (line 2)) (0.36.2)
Requirement already satisfied: rpds-py>=0.7.1 in /usr/local/lib/python3.11/dist-packages (from jsonschema>=4.18.0->jupyterlab-server<3,>=2.27.1->jupyterlab>=3.0->-r spaceflights/requirements.txt (line 2)) (0.25.1)
Collecting python-json-logger>=2.0.4 (from jupyter-events>=0.11.0->jupyter-server<3,>=2.4.0->jupyterlab>=3.0->-r spaceflights/requirements.txt (line 2))
  Downloading python_json_logger-3.3.0-py3-none-any.whl.metadata (4.0 kB)
Collecting rfc3339-validator (from jupyter-events>=0.11.0->jupyter-server<3,>=2.4.0->jupyterlab>=3.0->-r spaceflights/requirements.txt (line 2))
  Downloading rfc3339_validator-0.1.4-py2.py3-none-any.whl.metadata (1.5 kB)
Collecting rfc3986-validator>=0.1.1 (from jupyter-events>=0.11.0->jupyter-server<3,>=2.4.0->jupyterlab>=3.0->-r spaceflights/requirements.txt (line 2))
  Downloading rfc3986_validator-0.1.1-py2.py3-none-any.whl.metadata (1.7 kB)
Requirement already satisfied: mdurl~=0.1 in /usr/local/lib/python3.11/dist-packages (from markdown-it-py>=2.2.0->rich<15.0,>=12.0->kedro~=0.19.14->kedro[jupyter]~=0.19.14->-r spaceflights/requirements.txt (line 4)) (0.1.2)
Requirement already satisfied: beautifulsoup4 in /usr/local/lib/python3.11/dist-packages (from nbconvert>=5->notebook->-r spaceflights/requirements.txt (line 3)) (4.13.4)
Requirement already satisfied: bleach!=5.0.0 in /usr/local/lib/python3.11/dist-packages (from bleach[css]!=5.0.0->nbconvert>=5->notebook->-r spaceflights/requirements.txt (line 3)) (6.2.0)
Requirement already satisfied: defusedxml in /usr/local/lib/python3.11/dist-packages (from nbconvert>=5->notebook->-r spaceflights/requirements.txt (line 3)) (0.7.1)
Requirement already satisfied: jupyterlab-pygments in /usr/local/lib/python3.11/dist-packages (from nbconvert>=5->notebook->-r spaceflights/requirements.txt (line 3)) (0.3.0)
Requirement already satisfied: mistune<4,>=2.0.3 in /usr/local/lib/python3.11/dist-packages (from nbconvert>=5->notebook->-r spaceflights/requirements.txt (line 3)) (3.1.3)
Requirement already satisfied: nbclient>=0.5.0 in /usr/local/lib/python3.11/dist-packages (from nbconvert>=5->notebook->-r spaceflights/requirements.txt (line 3)) (0.10.2)
Requirement already satisfied: pandocfilters>=1.4.1 in /usr/local/lib/python3.11/dist-packages (from nbconvert>=5->notebook->-r spaceflights/requirements.txt (line 3)) (1.5.1)
Requirement already satisfied: fastjsonschema>=2.15 in /usr/local/lib/python3.11/dist-packages (from nbformat->notebook->-r spaceflights/requirements.txt (line 3)) (2.21.1)
Requirement already satisfied: six>=1.5 in /usr/local/lib/python3.11/dist-packages (from python-dateutil>=2.7->matplotlib!=3.6.1,>=3.1->seaborn~=0.12.1->-r spaceflights/requirements.txt (line 8)) (1.17.0)
Requirement already satisfied: text-unidecode>=1.3 in /usr/local/lib/python3.11/dist-packages (from python-slugify>=4.0.0->cookiecutter<3.0,>=2.1.1->kedro~=0.19.14->kedro[jupyter]~=0.19.14->-r spaceflights/requirements.txt (line 4)) (1.3)
Requirement already satisfied: charset-normalizer<4,>=2 in /usr/local/lib/python3.11/dist-packages (from requests>=2.23.0->cookiecutter<3.0,>=2.1.1->kedro~=0.19.14->kedro[jupyter]~=0.19.14->-r spaceflights/requirements.txt (line 4)) (3.4.2)
Requirement already satisfied: urllib3<3,>=1.21.1 in /usr/local/lib/python3.11/dist-packages (from requests>=2.23.0->cookiecutter<3.0,>=2.1.1->kedro~=0.19.14->kedro[jupyter]~=0.19.14->-r spaceflights/requirements.txt (line 4)) (2.4.0)
Requirement already satisfied: ruamel.yaml.clib>=0.2.7 in /usr/local/lib/python3.11/dist-packages (from ruamel.yaml>=0.15->pre-commit-hooks->kedro~=0.19.14->kedro[jupyter]~=0.19.14->-r spaceflights/requirements.txt (line 4)) (0.2.12)
Requirement already satisfied: types-python-dateutil>=2.8.10 in /usr/local/lib/python3.11/dist-packages (from arrow->cookiecutter<3.0,>=2.1.1->kedro~=0.19.14->kedro[jupyter]~=0.19.14->-r spaceflights/requirements.txt (line 4)) (2.9.0.20250516)
Requirement already satisfied: webencodings in /usr/local/lib/python3.11/dist-packages (from bleach!=5.0.0->bleach[css]!=5.0.0->nbconvert>=5->notebook->-r spaceflights/requirements.txt (line 3)) (0.5.1)
Requirement already satisfied: tinycss2<1.5,>=1.1.0 in /usr/local/lib/python3.11/dist-packages (from bleach[css]!=5.0.0->nbconvert>=5->notebook->-r spaceflights/requirements.txt (line 3)) (1.4.0)
Collecting fqdn (from jsonschema[format-nongpl]>=4.18.0->jupyter-events>=0.11.0->jupyter-server<3,>=2.4.0->jupyterlab>=3.0->-r spaceflights/requirements.txt (line 2))
  Downloading fqdn-1.5.1-py3-none-any.whl.metadata (1.4 kB)
Collecting isoduration (from jsonschema[format-nongpl]>=4.18.0->jupyter-events>=0.11.0->jupyter-server<3,>=2.4.0->jupyterlab>=3.0->-r spaceflights/requirements.txt (line 2))
  Downloading isoduration-20.11.0-py3-none-any.whl.metadata (5.7 kB)
Requirement already satisfied: jsonpointer>1.13 in /usr/local/lib/python3.11/dist-packages (from jsonschema[format-nongpl]>=4.18.0->jupyter-events>=0.11.0->jupyter-server<3,>=2.4.0->jupyterlab>=3.0->-r spaceflights/requirements.txt (line 2)) (3.0.0)
Collecting uri-template (from jsonschema[format-nongpl]>=4.18.0->jupyter-events>=0.11.0->jupyter-server<3,>=2.4.0->jupyterlab>=3.0->-r spaceflights/requirements.txt (line 2))
  Downloading uri_template-1.3.0-py3-none-any.whl.metadata (8.8 kB)
Requirement already satisfied: webcolors>=24.6.0 in /usr/local/lib/python3.11/dist-packages (from jsonschema[format-nongpl]>=4.18.0->jupyter-events>=0.11.0->jupyter-server<3,>=2.4.0->jupyterlab>=3.0->-r spaceflights/requirements.txt (line 2)) (24.11.1)
Requirement already satisfied: cffi>=1.0.1 in /usr/local/lib/python3.11/dist-packages (from argon2-cffi-bindings->argon2-cffi->notebook->-r spaceflights/requirements.txt (line 3)) (1.17.1)
Requirement already satisfied: soupsieve>1.2 in /usr/local/lib/python3.11/dist-packages (from beautifulsoup4->nbconvert>=5->notebook->-r spaceflights/requirements.txt (line 3)) (2.7)
Requirement already satisfied: pycparser in /usr/local/lib/python3.11/dist-packages (from cffi>=1.0.1->argon2-cffi-bindings->argon2-cffi->notebook->-r spaceflights/requirements.txt (line 3)) (2.22)
Downloading jupyterlab-4.4.4-py3-none-any.whl (12.3 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 12.3/12.3 MB 66.7 MB/s eta 0:00:00
Downloading kedro_datasets-7.0.0-py3-none-any.whl (220 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 220.6/220.6 kB 19.8 MB/s eta 0:00:00
Downloading kedro_viz-11.0.2-py3-none-any.whl (3.3 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 3.3/3.3 MB 89.2 MB/s eta 0:00:00
Downloading ipython-8.37.0-py3-none-any.whl (831 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 831.9/831.9 kB 51.8 MB/s eta 0:00:00
Downloading scikit_learn-1.5.2-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (13.3 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 13.3/13.3 MB 96.6 MB/s eta 0:00:00
Downloading seaborn-0.12.2-py3-none-any.whl (293 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 293.3/293.3 kB 24.1 MB/s eta 0:00:00
Downloading notebook-7.4.3-py3-none-any.whl (14.3 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 14.3/14.3 MB 94.6 MB/s eta 0:00:00
Downloading async_lru-2.0.5-py3-none-any.whl (6.1 kB)
Downloading ipylab-1.1.0-py3-none-any.whl (101 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 101.6/101.6 kB 8.5 MB/s eta 0:00:00
Downloading jedi-0.19.2-py2.py3-none-any.whl (1.6 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.6/1.6 MB 65.4 MB/s eta 0:00:00
Downloading jupyter_lsp-2.2.5-py3-none-any.whl (69 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 69.1/69.1 kB 5.8 MB/s eta 0:00:00
Downloading jupyter_server-2.16.0-py3-none-any.whl (386 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 386.9/386.9 kB 30.3 MB/s eta 0:00:00
Downloading jupyterlab_server-2.27.3-py3-none-any.whl (59 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 59.7/59.7 kB 5.0 MB/s eta 0:00:00
Downloading pathspec-0.12.1-py3-none-any.whl (31 kB)
Downloading secure-1.0.1-py3-none-any.whl (26 kB)
Downloading strawberry_graphql-0.275.5-py3-none-any.whl (306 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 306.3/306.3 kB 24.3 MB/s eta 0:00:00
Downloading traitlets-5.14.3-py3-none-any.whl (85 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 85.4/85.4 kB 7.0 MB/s eta 0:00:00
Downloading watchfiles-1.1.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (453 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 453.1/453.1 kB 29.1 MB/s eta 0:00:00
Downloading click_default_group-1.2.4-py2.py3-none-any.whl (4.1 kB)
Downloading stack_data-0.6.3-py3-none-any.whl (24 kB)
Downloading asttokens-3.0.0-py3-none-any.whl (26 kB)
Downloading executing-2.2.0-py2.py3-none-any.whl (26 kB)
Downloading graphql_core-3.2.6-py3-none-any.whl (203 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 203.4/203.4 kB 17.5 MB/s eta 0:00:00
Downloading httptools-0.6.4-cp311-cp311-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (459 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 459.8/459.8 kB 32.7 MB/s eta 0:00:00
Downloading json5-0.12.0-py3-none-any.whl (36 kB)
Downloading jupyter_client-8.6.3-py3-none-any.whl (106 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 106.1/106.1 kB 8.4 MB/s eta 0:00:00
Downloading jupyter_events-0.12.0-py3-none-any.whl (19 kB)
Downloading jupyter_server_terminals-0.5.3-py3-none-any.whl (13 kB)
Downloading overrides-7.7.0-py3-none-any.whl (17 kB)
Downloading python_dotenv-1.1.1-py3-none-any.whl (20 kB)
Downloading uvloop-0.21.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (4.0 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 4.0/4.0 MB 94.3 MB/s eta 0:00:00
Downloading pure_eval-0.2.3-py3-none-any.whl (11 kB)
Downloading python_json_logger-3.3.0-py3-none-any.whl (15 kB)
Downloading rfc3986_validator-0.1.1-py2.py3-none-any.whl (4.2 kB)
Downloading rfc3339_validator-0.1.4-py2.py3-none-any.whl (3.5 kB)
Downloading fqdn-1.5.1-py3-none-any.whl (9.1 kB)
Downloading isoduration-20.11.0-py3-none-any.whl (11 kB)
Downloading uri_template-1.3.0-py3-none-any.whl (11 kB)
Installing collected packages: pure-eval, uvloop, uri-template, traitlets, secure, rfc3986-validator, rfc3339-validator, python-json-logger, python-dotenv, pathspec, overrides, json5, jedi, httptools, graphql-core, fqdn, executing, click-default-group, async-lru, asttokens, watchfiles, strawberry-graphql, stack_data, scikit-learn, jupyter-server-terminals, seaborn, jupyter-client, isoduration, ipython, jupyter-events, jupyter-server, jupyterlab-server, jupyter-lsp, jupyterlab, notebook, ipylab, kedro-datasets, kedro-viz
  Attempting uninstall: traitlets
    Found existing installation: traitlets 5.7.1
    Uninstalling traitlets-5.7.1:
      Successfully uninstalled traitlets-5.7.1
  Attempting uninstall: scikit-learn
    Found existing installation: scikit-learn 1.6.1
    Uninstalling scikit-learn-1.6.1:
      Successfully uninstalled scikit-learn-1.6.1
  Attempting uninstall: seaborn
    Found existing installation: seaborn 0.13.2
    Uninstalling seaborn-0.13.2:
      Successfully uninstalled seaborn-0.13.2
  Attempting uninstall: jupyter-client
    Found existing installation: jupyter-client 6.1.12
    Uninstalling jupyter-client-6.1.12:
      Successfully uninstalled jupyter-client-6.1.12
  Attempting uninstall: ipython
    Found existing installation: ipython 7.34.0
    Uninstalling ipython-7.34.0:
      Successfully uninstalled ipython-7.34.0
  Attempting uninstall: jupyter-server
    Found existing installation: jupyter-server 1.16.0
    Uninstalling jupyter-server-1.16.0:
      Successfully uninstalled jupyter-server-1.16.0
  Attempting uninstall: notebook
    Found existing installation: notebook 6.5.7
    Uninstalling notebook-6.5.7:
      Successfully uninstalled notebook-6.5.7
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
google-colab 1.0.0 requires ipython==7.34.0, but you have ipython 8.37.0 which is incompatible.
google-colab 1.0.0 requires notebook==6.5.7, but you have notebook 7.4.3 which is incompatible.
jupyter-kernel-gateway 2.5.2 requires jupyter-client<8.0,>=5.2.0, but you have jupyter-client 8.6.3 which is incompatible.
jupyter-kernel-gateway 2.5.2 requires notebook<7.0,>=5.7.6, but you have notebook 7.4.3 which is incompatible.
Successfully installed asttokens-3.0.0 async-lru-2.0.5 click-default-group-1.2.4 executing-2.2.0 fqdn-1.5.1 graphql-core-3.2.6 httptools-0.6.4 ipylab-1.1.0 ipython-8.37.0 isoduration-20.11.0 jedi-0.19.2 json5-0.12.0 jupyter-client-8.6.3 jupyter-events-0.12.0 jupyter-lsp-2.2.5 jupyter-server-2.16.0 jupyter-server-terminals-0.5.3 jupyterlab-4.4.4 jupyterlab-server-2.27.3 kedro-datasets-7.0.0 kedro-viz-11.0.2 notebook-7.4.3 overrides-7.7.0 pathspec-0.12.1 pure-eval-0.2.3 python-dotenv-1.1.1 python-json-logger-3.3.0 rfc3339-validator-0.1.4 rfc3986-validator-0.1.1 scikit-learn-1.5.2 seaborn-0.12.2 secure-1.0.1 stack_data-0.6.3 strawberry-graphql-0.275.5 traitlets-5.14.3 uri-template-1.3.0 uvloop-0.21.0 watchfiles-1.1.0

Project datasets

The spaceflights example project uses three fictional datasets from companies that shuttle customers to the Moon and back. The data comes in two formats, .csv and .xlsx:

  • companies.csv: data about the space travel companies, such as their location, fleet count, and rating

  • reviews.csv: customer reviews for categories such as comfort and price

  • shuttles.xlsx: attributes of the spacecraft across the fleet, such as engine type and passenger capacity

These files live in the data/01_raw directory, which simulates the data source.

Registering datasets

Tools such as Kedro can maintain a data catalog that describes the data sources available for processing. The following information about a dataset must be registered before Kedro can load it:

  • File location (path)

  • Parameters for the given dataset

  • Type of data

  • Versioning

Kedro uses declarative files written in YAML, a format widely used in industry for configuration tasks like this one.

catalog.yml

companies:
  type: pandas.CSVDataset
  filepath: data/01_raw/companies.csv

reviews:
  type: pandas.CSVDataset
  filepath: data/01_raw/reviews.csv

shuttles:
  type: pandas.ExcelDataset
  filepath: data/01_raw/shuttles.xlsx
  load_args:
    engine: openpyxl
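To make the idea of a declarative catalog more concrete, the following is a toy sketch that uses only the Python standard library. It is not how Kedro implements its catalog: the names LOADERS, load_csv, load and toy_catalog are hypothetical, and the sample file is created on the fly so the sketch runs anywhere. The point it illustrates is that the catalog is plain data (a name, a type, a path, optional load arguments) and a generic function decides how to read each entry.

```python
import csv
import tempfile
from pathlib import Path


def load_csv(filepath, **load_args):
    """Read a CSV file into a list of dictionaries, one per row."""
    with open(filepath, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle, **load_args))


# Hypothetical registry: maps a declared "type" to the function that loads it.
LOADERS = {"csv": load_csv}


def load(catalog, name):
    """Look up an entry by name and dispatch to the loader its type declares."""
    entry = catalog[name]
    loader = LOADERS[entry["type"]]
    return loader(entry["filepath"], **entry.get("load_args", {}))


# Build a tiny stand-in file so the sketch does not depend on the project data.
workdir = Path(tempfile.mkdtemp())
sample = workdir / "companies.csv"
sample.write_text(
    "id,company_location\n3888,Isle of Man\n8240,Chile\n", encoding="utf-8"
)

# The declarative part: data that describes where things live and how to read them.
toy_catalog = {
    "companies": {"type": "csv", "filepath": str(sample)},
}

rows = load(toy_catalog, "companies")
print(rows[0]["company_location"])
```

Because callers only ever refer to the entry name, moving the file or changing its format means editing the catalog entry rather than the code that consumes the data.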

To show how Kedro can configure our environment automatically, we use the reload_kedro function, which loads the context of the project we are working on into Colaboratory.

In production environments you would not need to run the following code, because the environment is already preconfigured. In this example, we have Kedro load the project context inside Colaboratory.

[6]:
from kedro.ipython import reload_kedro

reload_kedro(path='spaceflights')
[06/29/25 17:04:48] INFO     Kedro is sending anonymous usage data plugin.py:233
                             with the sole purpose of improving
                             the product. No personal data or IP
                             addresses are stored on our side. If
                             you want to opt out, set the
                             `KEDRO_DISABLE_TELEMETRY` or
                             `DO_NOT_TRACK` environment variables,
                             or create a `.telemetry` file in the
                             current working directory with the
                             contents `consent: false`. Read more
                             at
                             https://docs.kedro.org/en/stable/conf
                             iguration/telemetry.html
                    INFO     Kedro project spaceflights          __init__.py:146
                    INFO     Defined global variable 'context',  __init__.py:147
                             'session', 'catalog' and
                             'pipelines'
[06/29/25 17:04:49] INFO     Registered line magic 'run_viz'     __init__.py:153

The following command creates a variable (companies) of type pandas.DataFrame and loads the dataset (also named companies in catalog.yml) from the underlying file path data/01_raw/companies.csv.

[9]:
companies = catalog.load("companies")
[06/29/25 17:05:55] INFO     Loading data from companies     data_catalog.py:403
                             (CSVDataset)...
[10]:
companies.head()
[10]:
      id company_rating        company_location  total_fleet_count iata_approved
0   3888           100%             Isle of Man                1.0             f
1  46728           100%                     NaN                1.0             f
2  34618            38%             Isle of Man                1.0             f
3  28619           100%  Bosnia and Herzegovina                1.0             f
4   8240            NaN                   Chile                1.0             t
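The preview shows ratings stored as text such as 100% and approval flags stored as t or f, which a model cannot use directly. The sketch below shows one plausible way to convert such values in plain Python. The helpers parse_percentage and is_true are hypothetical names, the rows are copied from the preview (with the missing rating written as None instead of NaN), and this is not necessarily what the project's own preprocessing does.

```python
import math


def parse_percentage(value):
    """Turn a string such as '38%' into 38.0; missing values become NaN."""
    if value is None or (isinstance(value, float) and math.isnan(value)):
        return float("nan")
    return float(value.rstrip("%"))


def is_true(value):
    """Interpret the single-letter flag 't' as True and anything else as False."""
    return value == "t"


# Rows copied from the preview; None stands in for the missing rating.
preview = [
    {"id": 3888, "company_rating": "100%", "iata_approved": "f"},
    {"id": 34618, "company_rating": "38%", "iata_approved": "f"},
    {"id": 8240, "company_rating": None, "iata_approved": "t"},
]

cleaned = [
    {
        **row,
        "company_rating": parse_percentage(row["company_rating"]),
        "iata_approved": is_true(row["iata_approved"]),
    }
    for row in preview
]

for row in cleaned:
    print(row)
```

The rating is kept on a 0 to 100 scale here; dividing by 100 would be an equally reasonable choice if a fraction is preferred.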

We can also list all the datasets available to us:

[15]:
catalog.list()
[15]:

[
    'companies',
    'reviews',
    'shuttles',
    'preprocessed_companies',
    'preprocessed_shuttles',
    'model_input_table',
    'regressor',
    'shuttle_passenger_capacity_plot_exp',
    'shuttle_passenger_capacity_plot_go',
    'dummy_confusion_matrix',
    'parameters',
    'params:model_options',
    'params:model_options.test_size',
    'params:model_options.random_state',
    'params:model_options.features'
]

Note that the project has more datasets than the ones we configured at the start. You can ignore the others.
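The listing mixes dataset names with parameter entries (parameters and the names prefixed with params:). As a small sketch, assuming that naming convention is a reliable way to tell them apart, the names from the output above can be split with plain list comprehensions. The variable names are hypothetical and the list is copied by hand rather than read from the catalog.

```python
# Names copied from the catalog listing shown above.
names = [
    "companies",
    "reviews",
    "shuttles",
    "preprocessed_companies",
    "preprocessed_shuttles",
    "model_input_table",
    "regressor",
    "shuttle_passenger_capacity_plot_exp",
    "shuttle_passenger_capacity_plot_go",
    "dummy_confusion_matrix",
    "parameters",
    "params:model_options",
    "params:model_options.test_size",
    "params:model_options.random_state",
    "params:model_options.features",
]


def is_parameter(name):
    """Treat 'parameters' and anything starting with 'params:' as a parameter entry."""
    return name == "parameters" or name.startswith("params:")


dataset_names = [name for name in names if not is_parameter(name)]
parameter_names = [name for name in names if is_parameter(name)]

print(len(dataset_names), "datasets:", dataset_names)
print(len(parameter_names), "parameter entries:", parameter_names)
```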

Creating a data processing pipeline

The data processing pipeline prepares the data for model building by combining the datasets into a single table of model predictors.
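As a rough illustration of what combining the datasets means, the sketch below joins three miniature tables in plain Python. Everything about the table layout is an assumption made for the example: the key names company_id and shuttle_id, the comfort column, and the row values for shuttles and reviews are hypothetical, and the real project may join its data differently.

```python
# Hypothetical miniature tables; only the company ids and locations come from
# the preview shown earlier.
companies = [
    {"id": 3888, "company_location": "Isle of Man"},
    {"id": 8240, "company_location": "Chile"},
]
shuttles = [
    {"id": 1, "company_id": 3888, "passenger_capacity": 4},
    {"id": 2, "company_id": 8240, "passenger_capacity": 2},
]
reviews = [
    {"shuttle_id": 1, "comfort": 4.5},
    {"shuttle_id": 2, "comfort": 3.0},
]

# Index the lookup tables by their key so each join step is a dictionary access.
companies_by_id = {company["id"]: company for company in companies}
shuttles_by_id = {shuttle["id"]: shuttle for shuttle in shuttles}

model_input = []
for review in reviews:
    shuttle = shuttles_by_id.get(review["shuttle_id"])
    if shuttle is None:
        continue  # skip reviews that point at an unknown shuttle
    company = companies_by_id.get(shuttle["company_id"])
    if company is None:
        continue  # skip shuttles that point at an unknown company
    model_input.append(
        {
            "shuttle_id": shuttle["id"],
            "passenger_capacity": shuttle["passenger_capacity"],
            "company_location": company["company_location"],
            "comfort": review["comfort"],
        }
    )

for row in model_input:
    print(row)
```

Each output row carries attributes from all three sources, which is the shape a table of model predictors needs.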

In this example, the data processing pipeline consists of the following:

[ ]:
##