From 0bd4f424fb764e68f0d946ffc18dca314a67e9f0 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Diego=20Crist=C3=B3bal=20Herreros?= Date: Sun, 22 Jan 2023 08:34:57 +0100 Subject: [PATCH 1/2] Doc: document the story and data source --- README.md | 6 +- entrega.ipynb | 2740 ++++--------------------------------------------- 2 files changed, 195 insertions(+), 2551 deletions(-) diff --git a/README.md b/README.md index 6b32ea8..4f9c725 100644 --- a/README.md +++ b/README.md @@ -1 +1,5 @@ -# ml-entrega2 \ No newline at end of file +# Segunda entrega de machine learning + +Una empresa de apuestas solicita conocer quién gana en un partido de NBA. Para ello, se usan los datos actualizados hasta la temporada 2022 https://www.kaggle.com/datasets/nathanlauga/nba-games?resource=download + +Para conseguir el objetivo, se decide usar varios algoritmos de machine learning para poder escoger el que mejor resultado pueda satisfacer las necesidades de la empresa. \ No newline at end of file diff --git a/entrega.ipynb b/entrega.ipynb index 967a12a..5132acf 100644 --- a/entrega.ipynb +++ b/entrega.ipynb @@ -1,22 +1,5 @@ { "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "En este ejercicio, el alumno tendrá que plantear y resolver un problema utilizando aprendizaje\n", - "automático. Se tendrán que llevar a cabo todos los pasos necesarios para la resolución del\n", - "problema: planteamiento, diseño, adquisición de los datos, análisis, entrenamiento de\n", - "algoritmos, evaluación, discusión de resultados, etc. Todo este proceso se desarrollará en un\n", - "cuaderno de Jupyter que conformará la entrega final.\n", - "El problema planteado podrá ser inventado o real, y puede estar basado en un problema de\n", - "negocio, un artículo científico, una competición, etc. Existen únicamente dos requisitos en\n", - "cuanto al caso de uso:\n", - "1. Tiene que utilizarse un conjunto de datos para su resolución.\n", - "2. 
La solución debe estar basada en aprendizaje automático, es decir, se debe haber\n", - "entrenado al menos un algoritmo." - ] - }, { "cell_type": "markdown", "metadata": {}, @@ -28,12 +11,12 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Una empresa dedicada a las apuestas, necesita un modelo de entrenamiento fiable en el cual poder saber si gana un equipo u otro dependiendo de unos datos de entrada. En este ejercicio se haran pruebas de distintos modelos y se seleccionaran distintas variables para probar su eficacia." + "**Una empresa dedicada a las apuestas, necesita un modelo de entrenamiento fiable en el cual poder saber si gana un equipo u otro dependiendo de unos datos de entrada. En este ejercicio se haran pruebas de distintos modelos y se seleccionaran distintas variables para probar su eficacia.**" ] }, { "cell_type": "code", - "execution_count": 137, + "execution_count": 1, "metadata": {}, "outputs": [], "source": [ @@ -62,14 +45,14 @@ }, { "cell_type": "code", - "execution_count": 138, + "execution_count": 2, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ - "/tmp/ipykernel_14939/1756342927.py:3: DtypeWarning: Columns (6) have mixed types. Specify dtype option on import or set low_memory=False.\n", + "/tmp/ipykernel_5567/1756342927.py:3: DtypeWarning: Columns (6) have mixed types. 
Specify dtype option on import or set low_memory=False.\n", " df_games_details = pd.read_csv(zf.open('games_details.csv'))\n" ] } @@ -92,7 +75,7 @@ }, { "cell_type": "code", - "execution_count": 139, + "execution_count": 3, "metadata": {}, "outputs": [ { @@ -938,7 +921,7 @@ }, { "cell_type": "code", - "execution_count": 140, + "execution_count": 4, "metadata": {}, "outputs": [ { @@ -1054,7 +1037,7 @@ }, { "cell_type": "code", - "execution_count": 141, + "execution_count": 5, "metadata": {}, "outputs": [ { @@ -1251,7 +1234,7 @@ }, { "cell_type": "code", - "execution_count": 142, + "execution_count": 6, "metadata": {}, "outputs": [ { @@ -1675,7 +1658,7 @@ }, { "cell_type": "code", - "execution_count": 143, + "execution_count": null, "metadata": {}, "outputs": [ { @@ -1723,2207 +1706,216 @@ }, "metadata": {}, "output_type": "display_data" - }, - { - "data": { - "text/html": [ - "
\n", - "\n", - "\n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " 
\n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - "
GAME_IDTEAM_IDTEAM_ABBREVIATIONTEAM_CITYPLAYER_IDPLAYER_NAMENICKNAMESTART_POSITIONCOMMENTMIN...OREBDREBREBASTSTLBLKTOPFPTSPLUS_MINUS
0222004771610612759SASSan Antonio1629641Romeo LangfordRomeoFNaN18:06...1.01.02.00.01.00.02.05.02.0-2.0
1222004771610612759SASSan Antonio1631110Jeremy SochanJeremyFNaN31:01...6.03.09.06.01.00.02.01.023.0-14.0
2222004771610612759SASSan Antonio1627751Jakob PoeltlJakobCNaN21:42...1.03.04.01.01.00.02.04.013.0-4.0
3222004771610612759SASSan Antonio1630170Devin VassellDevinGNaN30:20...0.09.09.05.03.00.02.01.010.0-18.0
4222004771610612759SASSan Antonio1630200Tre JonesTreGNaN27:44...0.02.02.03.00.00.02.02.019.00.0
..................................................................
668623112000051610612743DENDenver202706Jordan HamiltonNaNNaNNaN19...0.02.02.00.02.00.01.03.017.0NaN
668624112000051610612743DENDenver202702Kenneth FariedNaNNaNNaN23...1.00.01.01.01.00.03.03.018.0NaN
668625112000051610612743DENDenver201585Kosta KoufosNaNNaNNaN15...3.05.08.00.01.00.00.03.06.0NaN
668626112000051610612743DENDenver202389Timofey MozgovNaNNaNNaN19...1.02.03.01.00.00.04.02.02.0NaN
668627112000051610612743DENDenver201951Ty LawsonNaNNaNNaN27...0.02.02.06.02.00.06.01.08.0NaN
\n", - "

668628 rows × 29 columns

\n", - "
" - ], - "text/plain": [ - " GAME_ID TEAM_ID TEAM_ABBREVIATION TEAM_CITY PLAYER_ID \\\n", - "0 22200477 1610612759 SAS San Antonio 1629641 \n", - "1 22200477 1610612759 SAS San Antonio 1631110 \n", - "2 22200477 1610612759 SAS San Antonio 1627751 \n", - "3 22200477 1610612759 SAS San Antonio 1630170 \n", - "4 22200477 1610612759 SAS San Antonio 1630200 \n", - "... ... ... ... ... ... \n", - "668623 11200005 1610612743 DEN Denver 202706 \n", - "668624 11200005 1610612743 DEN Denver 202702 \n", - "668625 11200005 1610612743 DEN Denver 201585 \n", - "668626 11200005 1610612743 DEN Denver 202389 \n", - "668627 11200005 1610612743 DEN Denver 201951 \n", - "\n", - " PLAYER_NAME NICKNAME START_POSITION COMMENT MIN ... OREB \\\n", - "0 Romeo Langford Romeo F NaN 18:06 ... 1.0 \n", - "1 Jeremy Sochan Jeremy F NaN 31:01 ... 6.0 \n", - "2 Jakob Poeltl Jakob C NaN 21:42 ... 1.0 \n", - "3 Devin Vassell Devin G NaN 30:20 ... 0.0 \n", - "4 Tre Jones Tre G NaN 27:44 ... 0.0 \n", - "... ... ... ... ... ... ... ... \n", - "668623 Jordan Hamilton NaN NaN NaN 19 ... 0.0 \n", - "668624 Kenneth Faried NaN NaN NaN 23 ... 1.0 \n", - "668625 Kosta Koufos NaN NaN NaN 15 ... 3.0 \n", - "668626 Timofey Mozgov NaN NaN NaN 19 ... 1.0 \n", - "668627 Ty Lawson NaN NaN NaN 27 ... 0.0 \n", - "\n", - " DREB REB AST STL BLK TO PF PTS PLUS_MINUS \n", - "0 1.0 2.0 0.0 1.0 0.0 2.0 5.0 2.0 -2.0 \n", - "1 3.0 9.0 6.0 1.0 0.0 2.0 1.0 23.0 -14.0 \n", - "2 3.0 4.0 1.0 1.0 0.0 2.0 4.0 13.0 -4.0 \n", - "3 9.0 9.0 5.0 3.0 0.0 2.0 1.0 10.0 -18.0 \n", - "4 2.0 2.0 3.0 0.0 0.0 2.0 2.0 19.0 0.0 \n", - "... ... ... ... ... ... ... ... ... ... 
\n", - "668623 2.0 2.0 0.0 2.0 0.0 1.0 3.0 17.0 NaN \n", - "668624 0.0 1.0 1.0 1.0 0.0 3.0 3.0 18.0 NaN \n", - "668625 5.0 8.0 0.0 1.0 0.0 0.0 3.0 6.0 NaN \n", - "668626 2.0 3.0 1.0 0.0 0.0 4.0 2.0 2.0 NaN \n", - "668627 2.0 2.0 6.0 2.0 0.0 6.0 1.0 8.0 NaN \n", - "\n", - "[668628 rows x 29 columns]" - ] - }, - "metadata": {}, - "output_type": "display_data" - } - ], - "source": [ - "print(\"--------- DETALLES PARTIDOS ----------\")\n", - "print(\"--------------------------------------\")\n", - "display(df_games_details.isnull().sum())\n", - "display(df_games_details[df_games_details.isna().any(axis=1)])" - ] - }, - { - "cell_type": "code", - "execution_count": 144, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "--------- JUGADORES ----------\n", - "------------------------------\n" - ] - }, - { - "data": { - "text/plain": [ - "PLAYER_NAME 0\n", - "TEAM_ID 0\n", - "PLAYER_ID 0\n", - "SEASON 0\n", - "dtype: int64" - ] - }, - "metadata": {}, - "output_type": "display_data" - }, - { - "data": { - "text/html": [ - "
\n", - "\n", - "\n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - "
PLAYER_NAMETEAM_IDPLAYER_IDSEASON
\n", - "
" - ], - "text/plain": [ - "Empty DataFrame\n", - "Columns: [PLAYER_NAME, TEAM_ID, PLAYER_ID, SEASON]\n", - "Index: []" - ] - }, - "metadata": {}, - "output_type": "display_data" - } - ], - "source": [ - "print(\"--------- JUGADORES ----------\")\n", - "print(\"------------------------------\")\n", - "display(df_players.isnull().sum())\n", - "display(df_players[df_players.isna().any(axis=1)])" - ] - }, - { - "cell_type": "code", - "execution_count": 145, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "--------- RANKING LIGA ----------\n", - "---------------------------------\n" - ] - }, - { - "data": { - "text/plain": [ - "TEAM_ID 0\n", - "LEAGUE_ID 0\n", - "SEASON_ID 0\n", - "STANDINGSDATE 0\n", - "CONFERENCE 0\n", - "TEAM 0\n", - "G 0\n", - "W 0\n", - "L 0\n", - "W_PCT 0\n", - "HOME_RECORD 0\n", - "ROAD_RECORD 0\n", - "RETURNTOPLAY 206352\n", - "dtype: int64" - ] - }, - "metadata": {}, - "output_type": "display_data" - }, - { - "data": { - "text/html": [ - "
\n", - "\n", - "\n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - "
TEAM_IDLEAGUE_IDSEASON_IDSTANDINGSDATECONFERENCETEAMGWLW_PCTHOME_RECORDROAD_RECORDRETURNTOPLAY
016106127430220222022-12-22WestDenver3019110.63310-39-8NaN
116106127630220222022-12-22WestMemphis3019110.63313-26-9NaN
216106127400220222022-12-22WestNew Orleans3119120.61313-46-8NaN
316106127560220222022-12-22WestPhoenix3219130.59414-45-9NaN
416106127460220222022-12-22WestLA Clippers3319140.57611-78-7NaN
..........................................
21033716106127650220132014-09-01EastDetroit8229530.35417-2412-29NaN
21033816106127380220132014-09-01EastBoston8225570.30516-259-32NaN
21033916106127530220132014-09-01EastOrlando8223590.28019-224-37NaN
21034016106127550220132014-09-01EastPhiladelphia8219630.23210-319-32NaN
21034116106127490220132014-09-01EastMilwaukee8215670.18310-315-36NaN
\n", - "

206352 rows × 13 columns

\n", - "
" - ], - "text/plain": [ - " TEAM_ID LEAGUE_ID SEASON_ID STANDINGSDATE CONFERENCE \\\n", - "0 1610612743 0 22022 2022-12-22 West \n", - "1 1610612763 0 22022 2022-12-22 West \n", - "2 1610612740 0 22022 2022-12-22 West \n", - "3 1610612756 0 22022 2022-12-22 West \n", - "4 1610612746 0 22022 2022-12-22 West \n", - "... ... ... ... ... ... \n", - "210337 1610612765 0 22013 2014-09-01 East \n", - "210338 1610612738 0 22013 2014-09-01 East \n", - "210339 1610612753 0 22013 2014-09-01 East \n", - "210340 1610612755 0 22013 2014-09-01 East \n", - "210341 1610612749 0 22013 2014-09-01 East \n", - "\n", - " TEAM G W L W_PCT HOME_RECORD ROAD_RECORD RETURNTOPLAY \n", - "0 Denver 30 19 11 0.633 10-3 9-8 NaN \n", - "1 Memphis 30 19 11 0.633 13-2 6-9 NaN \n", - "2 New Orleans 31 19 12 0.613 13-4 6-8 NaN \n", - "3 Phoenix 32 19 13 0.594 14-4 5-9 NaN \n", - "4 LA Clippers 33 19 14 0.576 11-7 8-7 NaN \n", - "... ... .. .. .. ... ... ... ... \n", - "210337 Detroit 82 29 53 0.354 17-24 12-29 NaN \n", - "210338 Boston 82 25 57 0.305 16-25 9-32 NaN \n", - "210339 Orlando 82 23 59 0.280 19-22 4-37 NaN \n", - "210340 Philadelphia 82 19 63 0.232 10-31 9-32 NaN \n", - "210341 Milwaukee 82 15 67 0.183 10-31 5-36 NaN \n", - "\n", - "[206352 rows x 13 columns]" - ] - }, - "metadata": {}, - "output_type": "display_data" - } - ], - "source": [ - "print(\"--------- RANKING LIGA ----------\")\n", - "print(\"---------------------------------\")\n", - "display(df_ranking.isnull().sum())\n", - "display(df_ranking[df_ranking.isna().any(axis=1)])" - ] - }, - { - "cell_type": "code", - "execution_count": 146, - "metadata": { - "scrolled": true - }, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "--------- EQUIPOS ----------\n", - "----------------------------\n" - ] - }, - { - "data": { - "text/plain": [ - "LEAGUE_ID 0\n", - "TEAM_ID 0\n", - "MIN_YEAR 0\n", - "MAX_YEAR 0\n", - "ABBREVIATION 0\n", - "NICKNAME 0\n", - "YEARFOUNDED 0\n", - "CITY 0\n", - "ARENA 0\n", - 
"ARENACAPACITY 4\n", - "OWNER 0\n", - "GENERALMANAGER 0\n", - "HEADCOACH 0\n", - "DLEAGUEAFFILIATION 0\n", - "dtype: int64" - ] - }, - "metadata": {}, - "output_type": "display_data" - }, - { - "data": { - "text/html": [ - "
\n", - "\n", - "\n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - "
LEAGUE_IDTEAM_IDMIN_YEARMAX_YEARABBREVIATIONNICKNAMEYEARFOUNDEDCITYARENAARENACAPACITYOWNERGENERALMANAGERHEADCOACHDLEAGUEAFFILIATION
20161061274020022019NOPPelicans2002New OrleansSmoothie King CenterNaNTom BensonTrajan LangdonAlvin GentryNo Affiliate
120161061275119762019BKNNets1976BrooklynBarclays CenterNaNJoe TsaiSean MarksKenny AtkinsonLong Island Nets
160161061275519492019PHI76ers1949PhiladelphiaWells Fargo CenterNaNJoshua HarrisElton BrandBrett BrownDelaware Blue Coats
170161061275619682019PHXSuns1968PhoenixTalking Stick Resort ArenaNaNRobert SarverJames JonesMonty WilliamsNorthern Arizona Suns
\n", - "
" - ], - "text/plain": [ - " LEAGUE_ID TEAM_ID MIN_YEAR MAX_YEAR ABBREVIATION NICKNAME \\\n", - "2 0 1610612740 2002 2019 NOP Pelicans \n", - "12 0 1610612751 1976 2019 BKN Nets \n", - "16 0 1610612755 1949 2019 PHI 76ers \n", - "17 0 1610612756 1968 2019 PHX Suns \n", - "\n", - " YEARFOUNDED CITY ARENA ARENACAPACITY \\\n", - "2 2002 New Orleans Smoothie King Center NaN \n", - "12 1976 Brooklyn Barclays Center NaN \n", - "16 1949 Philadelphia Wells Fargo Center NaN \n", - "17 1968 Phoenix Talking Stick Resort Arena NaN \n", - "\n", - " OWNER GENERALMANAGER HEADCOACH DLEAGUEAFFILIATION \n", - "2 Tom Benson Trajan Langdon Alvin Gentry No Affiliate \n", - "12 Joe Tsai Sean Marks Kenny Atkinson Long Island Nets \n", - "16 Joshua Harris Elton Brand Brett Brown Delaware Blue Coats \n", - "17 Robert Sarver James Jones Monty Williams Northern Arizona Suns " - ] - }, - "metadata": {}, - "output_type": "display_data" - } - ], - "source": [ - "print(\"--------- EQUIPOS ----------\")\n", - "print(\"----------------------------\")\n", - "display(df_teams.isnull().sum())\n", - "display(df_teams[df_teams.isna().any(axis=1)])" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "En el df de partidos podemos ver que hay NaN en el 2003, borramos todas las filas con partidos anteriores al 2010 y comprobamos si se han quedado NaN" - ] - }, - { - "cell_type": "code", - "execution_count": 147, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "" - ] - }, - "execution_count": 147, - "metadata": {}, - "output_type": "execute_result" - }, - { - "data": { - "image/png": "\n", - "text/plain": [ - "
" - ] - }, - "metadata": {}, - "output_type": "display_data" - } - ], - "source": [ - "df_games.isna().sum()[df_games.isna().sum()>0].plot(kind='bar')" - ] - }, - { - "cell_type": "code", - "execution_count": 148, - "metadata": {}, - "outputs": [], - "source": [ - "df_games = df_games.loc[df_games['GAME_DATE_EST'] >= \"2004-01-01\"].reset_index(drop=True)" - ] - }, - { - "cell_type": "code", - "execution_count": 149, - "metadata": { - "scrolled": true - }, - "outputs": [ - { - "data": { - "text/plain": [ - "GAME_DATE_EST 0\n", - "GAME_ID 0\n", - "GAME_STATUS_TEXT 0\n", - "HOME_TEAM_ID 0\n", - "VISITOR_TEAM_ID 0\n", - "SEASON 0\n", - "TEAM_ID_home 0\n", - "PTS_home 0\n", - "FG_PCT_home 0\n", - "FT_PCT_home 0\n", - "FG3_PCT_home 0\n", - "AST_home 0\n", - "REB_home 0\n", - "TEAM_ID_away 0\n", - "PTS_away 0\n", - "FG_PCT_away 0\n", - "FT_PCT_away 0\n", - "FG3_PCT_away 0\n", - "AST_away 0\n", - "REB_away 0\n", - "HOME_TEAM_WINS 0\n", - "dtype: int64" - ] - }, - "execution_count": 149, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "df_games.isnull().sum()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Para saber que equipo es, podríamos fusionar algunos datos del dataframe de partidos con el de equipos de alguna manera" - ] - }, - { - "cell_type": "code", - "execution_count": 150, - "metadata": { - "scrolled": false - }, - "outputs": [ - { - "data": { - "text/plain": [ - "Index(['GAME_DATE_EST', 'GAME_ID', 'GAME_STATUS_TEXT', 'HOME_TEAM_ID',\n", - " 'VISITOR_TEAM_ID', 'SEASON', 'TEAM_ID_home', 'PTS_home', 'FG_PCT_home',\n", - " 'FT_PCT_home', 'FG3_PCT_home', 'AST_home', 'REB_home', 'TEAM_ID_away',\n", - " 'PTS_away', 'FG_PCT_away', 'FT_PCT_away', 'FG3_PCT_away', 'AST_away',\n", - " 'REB_away', 'HOME_TEAM_WINS'],\n", - " dtype='object')" - ] - }, - "metadata": {}, - "output_type": "display_data" - }, - { - "data": { - "text/plain": [ - "Index(['LEAGUE_ID', 'TEAM_ID', 'MIN_YEAR', 'MAX_YEAR', 'ABBREVIATION',\n", - 
" 'NICKNAME', 'YEARFOUNDED', 'CITY', 'ARENA', 'ARENACAPACITY', 'OWNER',\n", - " 'GENERALMANAGER', 'HEADCOACH', 'DLEAGUEAFFILIATION'],\n", - " dtype='object')" - ] - }, - "metadata": {}, - "output_type": "display_data" - } - ], - "source": [ - "display(df_games.columns)\n", - "display(df_teams.columns)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Vemos como podemos sustituir los IDS por los nombres de los equipos" - ] - }, - { - "cell_type": "code", - "execution_count": 151, - "metadata": {}, - "outputs": [], - "source": [ - "df_teams = df_teams[['TEAM_ID', 'NICKNAME']]\n", - "\n", - "# Reemplaza HOME_TEAM_ID por los nombres del dataframe teams\n", - "nombres_local = df_teams.copy()\n", - "nombres_local.columns = ['HOME_TEAM_ID', 'NICKNAME']\n", - "# Se unen el ID de lequipo por el nickname\n", - "result_1 = pd.merge(df_games['HOME_TEAM_ID'], nombres_local, how =\"left\", on=\"HOME_TEAM_ID\") \n", - "df_games['HOME_TEAM_ID'] = result_1['NICKNAME']\n" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Vemos como se ha cambiado el HOME_TEAM_ID por el nombre del equipo, haremos lo mismo con el visitante" - ] - }, - { - "cell_type": "code", - "execution_count": 152, - "metadata": { - "scrolled": true - }, - "outputs": [ - { - "data": { - "text/html": [ - "
\n", - "\n", - "\n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " 
\n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " 
\n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - "
GAME_DATE_ESTGAME_IDGAME_STATUS_TEXTHOME_TEAM_IDVISITOR_TEAM_IDSEASONTEAM_ID_homePTS_homeFG_PCT_homeFT_PCT_home...AST_homeREB_homeTEAM_ID_awayPTS_awayFG_PCT_awayFT_PCT_awayFG3_PCT_awayAST_awayREB_awayHOME_TEAM_WINS
02022-12-2222200477FinalPelicans161061275920221610612740126.00.4840.926...25.046.01610612759117.00.4780.8150.32123.044.01
12022-12-2222200478FinalJazz161061276420221610612762120.00.4880.952...16.040.01610612764112.00.5610.7650.33320.037.01
22022-12-2122200466FinalCavaliers161061274920221610612739114.00.4820.786...22.037.01610612749106.00.4700.6820.43320.046.01
32022-12-2122200467Final76ers161061276520221610612755113.00.4410.909...27.049.0161061276593.00.3920.7350.26115.046.01
42022-12-2122200468FinalHawks161061274120221610612737108.00.4291.000...22.047.01610612741110.00.5000.7730.29220.047.00
52022-12-2122200469FinalCeltics161061275420221610612738112.00.3860.840...26.062.01610612754117.00.4690.7780.46227.047.00
62022-12-2122200470FinalNets161061274420221610612751143.00.6430.875...42.032.01610612744113.00.4940.7600.36432.036.01
72022-12-2122200471FinalKnicks161061276120221610612752106.00.5530.611...25.038.01610612761113.00.4470.9090.26517.038.00
82022-12-2122200472FinalRockets161061275320221610612745110.00.4660.647...22.049.01610612753116.00.4510.6970.29719.045.00
92022-12-2122200473FinalTimberwolves16106127422022161061275099.00.4940.700...23.039.01610612742104.00.4530.8520.33317.039.00
102022-12-2122200474FinalThunder161061275720221610612760101.00.4680.840...19.037.0161061275798.00.4940.6670.38929.036.01
112022-12-2122200475FinalKings161061274720221610612758134.00.5050.750...29.046.01610612747120.00.5000.8330.45825.039.01
122022-12-2122200476FinalClippers161061276620221610612746126.00.5060.913...29.048.01610612766105.00.4020.7590.29025.040.01
132022-12-2022200461FinalPistons161061276220221610612765111.00.5060.741...22.043.01610612762126.00.5050.6320.43527.043.00
142022-12-2022200462FinalHeat161061274120221610612748103.00.4690.706...26.035.01610612741113.00.5480.8330.41924.039.00
152022-12-2022200463FinalKnicks161061274420221610612752132.00.5170.781...27.047.0161061274494.00.4730.9230.34323.029.01
162022-12-2022200464FinalSuns161061276420221610612756110.00.4610.789...26.044.01610612764113.00.4750.7030.40722.041.00
172022-12-2022200465FinalNuggets161061276320221610612743105.00.4490.600...28.048.0161061276391.00.4440.6670.19225.042.01
182022-12-1922200452FinalCavaliers161061276220221610612739122.00.6140.808...24.045.0161061276299.00.3870.7390.29419.035.01
192022-12-1922200453Final76ers161061276120221610612755104.00.4000.926...22.041.01610612761101.00.4200.8000.27524.050.01
\n", - "

20 rows × 21 columns

\n", - "
" - ], - "text/plain": [ - " GAME_DATE_EST GAME_ID GAME_STATUS_TEXT HOME_TEAM_ID VISITOR_TEAM_ID \\\n", - "0 2022-12-22 22200477 Final Pelicans 1610612759 \n", - "1 2022-12-22 22200478 Final Jazz 1610612764 \n", - "2 2022-12-21 22200466 Final Cavaliers 1610612749 \n", - "3 2022-12-21 22200467 Final 76ers 1610612765 \n", - "4 2022-12-21 22200468 Final Hawks 1610612741 \n", - "5 2022-12-21 22200469 Final Celtics 1610612754 \n", - "6 2022-12-21 22200470 Final Nets 1610612744 \n", - "7 2022-12-21 22200471 Final Knicks 1610612761 \n", - "8 2022-12-21 22200472 Final Rockets 1610612753 \n", - "9 2022-12-21 22200473 Final Timberwolves 1610612742 \n", - "10 2022-12-21 22200474 Final Thunder 1610612757 \n", - "11 2022-12-21 22200475 Final Kings 1610612747 \n", - "12 2022-12-21 22200476 Final Clippers 1610612766 \n", - "13 2022-12-20 22200461 Final Pistons 1610612762 \n", - "14 2022-12-20 22200462 Final Heat 1610612741 \n", - "15 2022-12-20 22200463 Final Knicks 1610612744 \n", - "16 2022-12-20 22200464 Final Suns 1610612764 \n", - "17 2022-12-20 22200465 Final Nuggets 1610612763 \n", - "18 2022-12-19 22200452 Final Cavaliers 1610612762 \n", - "19 2022-12-19 22200453 Final 76ers 1610612761 \n", - "\n", - " SEASON TEAM_ID_home PTS_home FG_PCT_home FT_PCT_home ... AST_home \\\n", - "0 2022 1610612740 126.0 0.484 0.926 ... 25.0 \n", - "1 2022 1610612762 120.0 0.488 0.952 ... 16.0 \n", - "2 2022 1610612739 114.0 0.482 0.786 ... 22.0 \n", - "3 2022 1610612755 113.0 0.441 0.909 ... 27.0 \n", - "4 2022 1610612737 108.0 0.429 1.000 ... 22.0 \n", - "5 2022 1610612738 112.0 0.386 0.840 ... 26.0 \n", - "6 2022 1610612751 143.0 0.643 0.875 ... 42.0 \n", - "7 2022 1610612752 106.0 0.553 0.611 ... 25.0 \n", - "8 2022 1610612745 110.0 0.466 0.647 ... 22.0 \n", - "9 2022 1610612750 99.0 0.494 0.700 ... 23.0 \n", - "10 2022 1610612760 101.0 0.468 0.840 ... 19.0 \n", - "11 2022 1610612758 134.0 0.505 0.750 ... 29.0 \n", - "12 2022 1610612746 126.0 0.506 0.913 ... 
29.0 \n", - "13 2022 1610612765 111.0 0.506 0.741 ... 22.0 \n", - "14 2022 1610612748 103.0 0.469 0.706 ... 26.0 \n", - "15 2022 1610612752 132.0 0.517 0.781 ... 27.0 \n", - "16 2022 1610612756 110.0 0.461 0.789 ... 26.0 \n", - "17 2022 1610612743 105.0 0.449 0.600 ... 28.0 \n", - "18 2022 1610612739 122.0 0.614 0.808 ... 24.0 \n", - "19 2022 1610612755 104.0 0.400 0.926 ... 22.0 \n", - "\n", - " REB_home TEAM_ID_away PTS_away FG_PCT_away FT_PCT_away FG3_PCT_away \\\n", - "0 46.0 1610612759 117.0 0.478 0.815 0.321 \n", - "1 40.0 1610612764 112.0 0.561 0.765 0.333 \n", - "2 37.0 1610612749 106.0 0.470 0.682 0.433 \n", - "3 49.0 1610612765 93.0 0.392 0.735 0.261 \n", - "4 47.0 1610612741 110.0 0.500 0.773 0.292 \n", - "5 62.0 1610612754 117.0 0.469 0.778 0.462 \n", - "6 32.0 1610612744 113.0 0.494 0.760 0.364 \n", - "7 38.0 1610612761 113.0 0.447 0.909 0.265 \n", - "8 49.0 1610612753 116.0 0.451 0.697 0.297 \n", - "9 39.0 1610612742 104.0 0.453 0.852 0.333 \n", - "10 37.0 1610612757 98.0 0.494 0.667 0.389 \n", - "11 46.0 1610612747 120.0 0.500 0.833 0.458 \n", - "12 48.0 1610612766 105.0 0.402 0.759 0.290 \n", - "13 43.0 1610612762 126.0 0.505 0.632 0.435 \n", - "14 35.0 1610612741 113.0 0.548 0.833 0.419 \n", - "15 47.0 1610612744 94.0 0.473 0.923 0.343 \n", - "16 44.0 1610612764 113.0 0.475 0.703 0.407 \n", - "17 48.0 1610612763 91.0 0.444 0.667 0.192 \n", - "18 45.0 1610612762 99.0 0.387 0.739 0.294 \n", - "19 41.0 1610612761 101.0 0.420 0.800 0.275 \n", - "\n", - " AST_away REB_away HOME_TEAM_WINS \n", - "0 23.0 44.0 1 \n", - "1 20.0 37.0 1 \n", - "2 20.0 46.0 1 \n", - "3 15.0 46.0 1 \n", - "4 20.0 47.0 0 \n", - "5 27.0 47.0 0 \n", - "6 32.0 36.0 1 \n", - "7 17.0 38.0 0 \n", - "8 19.0 45.0 0 \n", - "9 17.0 39.0 0 \n", - "10 29.0 36.0 1 \n", - "11 25.0 39.0 1 \n", - "12 25.0 40.0 1 \n", - "13 27.0 43.0 0 \n", - "14 24.0 39.0 0 \n", - "15 23.0 29.0 1 \n", - "16 22.0 41.0 0 \n", - "17 25.0 42.0 1 \n", - "18 19.0 35.0 1 \n", - "19 24.0 50.0 1 \n", - "\n", - "[20 rows 
x 21 columns]" - ] - }, - "execution_count": 152, - "metadata": {}, - "output_type": "execute_result" } ], "source": [ - "df_games.head(20)" + "print(\"--------- DETALLES PARTIDOS ----------\")\n", + "print(\"--------------------------------------\")\n", + "display(df_games_details.isnull().sum())\n", + "display(df_games_details[df_games_details.isna().any(axis=1)])" ] }, { - "cell_type": "markdown", + "cell_type": "code", + "execution_count": null, "metadata": {}, + "outputs": [], "source": [ - "Haremos lo mismo con los equipos visitantes" + "print(\"--------- JUGADORES ----------\")\n", + "print(\"------------------------------\")\n", + "display(df_players.isnull().sum())\n", + "display(df_players[df_players.isna().any(axis=1)])" ] }, { "cell_type": "code", - "execution_count": 153, + "execution_count": null, "metadata": {}, "outputs": [], "source": [ - "# Reemplaza VISITOR_TEAM_ID por los nombres del dataframe teams\n", - "nombres_visitante = df_teams.copy()\n", - "nombres_visitante.columns = ['VISITOR_TEAM_ID', 'NICKNAME']\n", - "# Se unen el ID del equipo por el nickname\n", - "result_2 = pd.merge(df_games['VISITOR_TEAM_ID'], nombres_visitante, how =\"left\", on=\"VISITOR_TEAM_ID\") \n", - "df_games['VISITOR_TEAM_ID'] = result_2['NICKNAME']" + "print(\"--------- RANKING LIGA ----------\")\n", + "print(\"---------------------------------\")\n", + "display(df_ranking.isnull().sum())\n", + "display(df_ranking[df_ranking.isna().any(axis=1)])" ] }, { "cell_type": "code", - "execution_count": 154, + "execution_count": null, "metadata": { "scrolled": true }, - "outputs": [ - { - "data": { - "text/html": [ - "
" - ], - "text/plain": [ - " GAME_DATE_EST GAME_ID GAME_STATUS_TEXT HOME_TEAM_ID VISITOR_TEAM_ID \\\n", - "0 2022-12-22 22200477 Final Pelicans Spurs \n", - "1 2022-12-22 22200478 Final Jazz Wizards \n", - "2 2022-12-21 22200466 Final Cavaliers Bucks \n", - "3 2022-12-21 22200467 Final 76ers Pistons \n", - "4 2022-12-21 22200468 Final Hawks Bulls \n", - "\n", - " SEASON TEAM_ID_home PTS_home FG_PCT_home FT_PCT_home ... AST_home \\\n", - "0 2022 1610612740 126.0 0.484 0.926 ... 25.0 \n", - "1 2022 1610612762 120.0 0.488 0.952 ... 16.0 \n", - "2 2022 1610612739 114.0 0.482 0.786 ... 22.0 \n", - "3 2022 1610612755 113.0 0.441 0.909 ... 27.0 \n", - "4 2022 1610612737 108.0 0.429 1.000 ... 22.0 \n", - "\n", - " REB_home TEAM_ID_away PTS_away FG_PCT_away FT_PCT_away FG3_PCT_away \\\n", - "0 46.0 1610612759 117.0 0.478 0.815 0.321 \n", - "1 40.0 1610612764 112.0 0.561 0.765 0.333 \n", - "2 37.0 1610612749 106.0 0.470 0.682 0.433 \n", - "3 49.0 1610612765 93.0 0.392 0.735 0.261 \n", - "4 47.0 1610612741 110.0 0.500 0.773 0.292 \n", - "\n", - " AST_away REB_away HOME_TEAM_WINS \n", - "0 23.0 44.0 1 \n", - "1 20.0 37.0 1 \n", - "2 20.0 46.0 1 \n", - "3 15.0 46.0 1 \n", - "4 20.0 47.0 0 \n", - "\n", - "[5 rows x 21 columns]" - ] - }, - "execution_count": 154, - "metadata": {}, - "output_type": "execute_result" - } - ], + "outputs": [], + "source": [ + "print(\"--------- EQUIPOS ----------\")\n", + "print(\"----------------------------\")\n", + "display(df_teams.isnull().sum())\n", + "display(df_teams[df_teams.isna().any(axis=1)])" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "En el df de partidos podemos ver que hay NaN en el 2003, borramos todas las filas con partidos anteriores al 2010 y comprobamos si se han quedado NaN" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "df_games.isna().sum()[df_games.isna().sum()>0].plot(kind='bar')" + ] + }, + { + "cell_type": "code", + 
"execution_count": null, + "metadata": {}, + "outputs": [], "source": [ - "df_games.head()" + "df_games = df_games.loc[df_games['GAME_DATE_EST'] >= \"2004-01-01\"].reset_index(drop=True)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "scrolled": true + }, + "outputs": [], + "source": [ + "df_games.isnull().sum()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "Exploración de los valores únicos de las variables del dataframe de partidos" + "Para saber que equipo es, podríamos fusionar algunos datos del dataframe de partidos con el de equipos de alguna manera" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "scrolled": false + }, + "outputs": [], + "source": [ + "display(df_games.columns)\n", + "display(df_teams.columns)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Vemos como podemos sustituir los IDS por los nombres de los equipos" ] }, { "cell_type": "code", - "execution_count": 155, + "execution_count": null, "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "GAME_STATUS_TEXT ['Final']\n", - "HOME_TEAM_WINS [1 0]\n" - ] - } - ], + "outputs": [], "source": [ + "df_teams = df_teams[['TEAM_ID', 'NICKNAME']]\n", "\n", - "for column in df_games.columns:\n", - " if len(df_games[column].unique()) < 10:\n", - " print(column, df_games[column].unique())\n", - " else:\n", - " continue" + "# Reemplaza HOME_TEAM_ID por los nombres del dataframe teams\n", + "nombres_local = df_teams.copy()\n", + "nombres_local.columns = ['HOME_TEAM_ID', 'NICKNAME']\n", + "# Se unen el ID de lequipo por el nickname\n", + "result_1 = pd.merge(df_games['HOME_TEAM_ID'], nombres_local, how =\"left\", on=\"HOME_TEAM_ID\") \n", + "df_games['HOME_TEAM_ID'] = result_1['NICKNAME']\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "Al menos GAME_STATUS_TEXT nos sobra ya que solo tiene un valor unico y no tiene relevancia" + "Vemos 
como se ha cambiado el HOME_TEAM_ID por el nombre del equipo, haremos lo mismo con el visitante" ] }, { "cell_type": "code", - "execution_count": 156, + "execution_count": null, "metadata": { "scrolled": true }, - "outputs": [ - { - "data": { - "text/html": [ - "
" - ], - "text/plain": [ - " GAME_DATE_EST GAME_ID HOME_TEAM_ID VISITOR_TEAM_ID SEASON TEAM_ID_home \\\n", - "0 2022-12-22 22200477 Pelicans Spurs 2022 1610612740 \n", - "1 2022-12-22 22200478 Jazz Wizards 2022 1610612762 \n", - "2 2022-12-21 22200466 Cavaliers Bucks 2022 1610612739 \n", - "3 2022-12-21 22200467 76ers Pistons 2022 1610612755 \n", - "4 2022-12-21 22200468 Hawks Bulls 2022 1610612737 \n", - "\n", - " PTS_home FG_PCT_home FT_PCT_home FG3_PCT_home AST_home REB_home \\\n", - "0 126.0 0.484 0.926 0.382 25.0 46.0 \n", - "1 120.0 0.488 0.952 0.457 16.0 40.0 \n", - "2 114.0 0.482 0.786 0.313 22.0 37.0 \n", - "3 113.0 0.441 0.909 0.297 27.0 49.0 \n", - "4 108.0 0.429 1.000 0.378 22.0 47.0 \n", - "\n", - " TEAM_ID_away PTS_away FG_PCT_away FT_PCT_away FG3_PCT_away AST_away \\\n", - "0 1610612759 117.0 0.478 0.815 0.321 23.0 \n", - "1 1610612764 112.0 0.561 0.765 0.333 20.0 \n", - "2 1610612749 106.0 0.470 0.682 0.433 20.0 \n", - "3 1610612765 93.0 0.392 0.735 0.261 15.0 \n", - "4 1610612741 110.0 0.500 0.773 0.292 20.0 \n", - "\n", - " REB_away HOME_TEAM_WINS \n", - "0 44.0 1 \n", - "1 37.0 1 \n", - "2 46.0 1 \n", - "3 46.0 1 \n", - "4 47.0 0 " - ] - }, - "execution_count": 156, - "metadata": {}, - "output_type": "execute_result" - } - ], + "outputs": [], + "source": [ + "df_games.head(20)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Haremos lo mismo con los equipos visitantes" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Reemplaza VISITOR_TEAM_ID por los nombres del dataframe teams\n", + "nombres_visitante = df_teams.copy()\n", + "nombres_visitante.columns = ['VISITOR_TEAM_ID', 'NICKNAME']\n", + "# Se unen el ID del equipo por el nickname\n", + "result_2 = pd.merge(df_games['VISITOR_TEAM_ID'], nombres_visitante, how =\"left\", on=\"VISITOR_TEAM_ID\") \n", + "df_games['VISITOR_TEAM_ID'] = result_2['NICKNAME']" + ] + }, + { + "cell_type": "code", + 
"execution_count": null, + "metadata": { + "scrolled": true + }, + "outputs": [], + "source": [ + "df_games.head()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Exploración de los valores únicos de las variables del dataframe de partidos" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "\n", + "for column in df_games.columns:\n", + " if len(df_games[column].unique()) < 10:\n", + " print(column, df_games[column].unique())\n", + " else:\n", + " continue" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Al menos GAME_STATUS_TEXT nos sobra ya que solo tiene un valor unico y no tiene relevancia" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "scrolled": true + }, + "outputs": [], "source": [ "df_games = df_games.drop(columns=['GAME_STATUS_TEXT'])\n", "df_games.head()" @@ -3938,7 +1930,7 @@ }, { "cell_type": "code", - "execution_count": 157, + "execution_count": null, "metadata": {}, "outputs": [], "source": [ @@ -3954,20 +1946,9 @@ }, { "cell_type": "code", - "execution_count": 158, + "execution_count": null, "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "(542, 20)" - ] - }, - "execution_count": 158, - "metadata": {}, - "output_type": "execute_result" - } - ], + "outputs": [], "source": [ "df_games_2022.shape" ] @@ -3981,19 +1962,11 @@ }, { "cell_type": "code", - "execution_count": 159, + "execution_count": null, "metadata": { "scrolled": false }, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "['GAME_DATE_EST', 'GAME_ID', 'HOME_TEAM_ID', 'VISITOR_TEAM_ID', 'SEASON', 'TEAM_ID_home', 'PTS_home', 'FG_PCT_home', 'FT_PCT_home', 'FG3_PCT_home', 'AST_home', 'REB_home', 'TEAM_ID_away', 'PTS_away', 'FG_PCT_away', 'FT_PCT_away', 'FG3_PCT_away', 'AST_away', 'REB_away', 'HOME_TEAM_WINS']\n" - ] - } - ], + "outputs": [], "source": [ "variables = 
list(df_games_2022.columns)\n", "print(variables)" @@ -4008,7 +1981,7 @@ }, { "cell_type": "code", - "execution_count": 160, + "execution_count": null, "metadata": {}, "outputs": [], "source": [ @@ -4025,7 +1998,7 @@ }, { "cell_type": "code", - "execution_count": 161, + "execution_count": null, "metadata": {}, "outputs": [], "source": [ @@ -4042,160 +2015,11 @@ }, { "cell_type": "code", - "execution_count": 162, + "execution_count": null, "metadata": { "scrolled": true }, - "outputs": [ - { - "data": { - "text/html": [ - "
" - ], - "text/plain": [ - " PTS_home FG_PCT_home FT_PCT_home FG3_PCT_home AST_home REB_home \\\n", - "0 126.0 0.484 0.926 0.382 25.0 46.0 \n", - "1 120.0 0.488 0.952 0.457 16.0 40.0 \n", - "2 114.0 0.482 0.786 0.313 22.0 37.0 \n", - "3 113.0 0.441 0.909 0.297 27.0 49.0 \n", - "4 108.0 0.429 1.000 0.378 22.0 47.0 \n", - "\n", - " PTS_away FG_PCT_away FT_PCT_away FG3_PCT_away AST_away REB_away \n", - "0 117.0 0.478 0.815 0.321 23.0 44.0 \n", - "1 112.0 0.561 0.765 0.333 20.0 37.0 \n", - "2 106.0 0.470 0.682 0.433 20.0 46.0 \n", - "3 93.0 0.392 0.735 0.261 15.0 46.0 \n", - "4 110.0 0.500 0.773 0.292 20.0 47.0 " - ] - }, - "metadata": {}, - "output_type": "display_data" - }, - { - "data": { - "text/plain": [ - "0 1\n", - "1 1\n", - "2 1\n", - "3 1\n", - "4 0\n", - "Name: HOME_TEAM_WINS, dtype: int64" - ] - }, - "metadata": {}, - "output_type": "display_data" - } - ], + "outputs": [], "source": [ "display(X.head())\n", "display(y.head())" @@ -4210,7 +2034,7 @@ }, { "cell_type": "code", - "execution_count": 163, + "execution_count": null, "metadata": {}, "outputs": [], "source": [ @@ -4229,7 +2053,7 @@ }, { "cell_type": "code", - "execution_count": 164, + "execution_count": null, "metadata": {}, "outputs": [], "source": [ @@ -4244,154 +2068,11 @@ }, { "cell_type": "code", - "execution_count": 165, + "execution_count": null, "metadata": { "scrolled": true }, - "outputs": [ - { - "data": { - "text/html": [ - "
" - ], - "text/plain": [ - " PTS_home_norm FG_PCT_home_norm FT_PCT_home_norm FG3_PCT_home_norm \\\n", - "440 0.205479 0.390173 0.400 0.367010 \n", - "247 0.150685 0.095376 0.688 0.228866 \n", - "434 0.328767 0.410405 0.600 0.162887 \n", - "57 0.753425 0.580925 0.666 0.703093 \n", - "234 0.643836 0.861272 0.466 0.758763 \n", - "\n", - " AST_home_norm REB_home_norm PTS_away_norm FG_PCT_away_norm \\\n", - "440 0.08 0.441860 0.476190 0.431611 \n", - "247 0.04 0.581395 0.269841 0.249240 \n", - "434 0.48 0.372093 0.380952 0.364742 \n", - "57 0.56 0.511628 0.666667 0.547112 \n", - "234 0.76 0.488372 0.523810 0.346505 \n", - "\n", - " FT_PCT_away_norm FG3_PCT_away_norm AST_away_norm REB_away_norm \n", - "440 0.881429 0.457471 0.629630 0.684211 \n", - "247 0.682857 0.310345 0.481481 0.578947 \n", - "434 0.735714 0.487356 0.259259 0.763158 \n", - "57 0.841429 0.425287 0.666667 0.421053 \n", - "234 0.354286 0.381609 0.592593 0.552632 " - ] - }, - "execution_count": 165, - "metadata": {}, - "output_type": "execute_result" - } - ], + "outputs": [], "source": [ "X_train.head()" ] @@ -4405,7 +2086,7 @@ }, { "cell_type": "code", - "execution_count": 166, + "execution_count": null, "metadata": {}, "outputs": [], "source": [ @@ -4426,30 +2107,9 @@ }, { "cell_type": "code", - "execution_count": 167, + "execution_count": null, "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "['PTS_home_norm' 'FG_PCT_home_norm' 'AST_home_norm' 'PTS_away_norm'\n", - " 'FG_PCT_away_norm' 'FG3_PCT_away_norm']\n", - "Variable PTS_home_norm: 4.3868\n", - "Variable FG_PCT_home_norm: 3.7384\n", - "Variable FT_PCT_home_norm: 0.3319\n", - "Variable FG3_PCT_home_norm: 3.0046\n", - "Variable AST_home_norm: 3.0950\n", - "Variable REB_home_norm: 0.5436\n", - "Variable PTS_away_norm: 5.7462\n", - "Variable FG_PCT_away_norm: 3.8130\n", - "Variable FT_PCT_away_norm: 0.1706\n", - "Variable FG3_PCT_away_norm: 4.5008\n", - "Variable AST_away_norm: 2.6714\n", - "Variable 
REB_away_norm: 1.6384\n" - ] - } - ], + "outputs": [], "source": [ "selector = SelectKBest(chi2, k=6)\n", "\n", @@ -4475,18 +2135,9 @@ }, { "cell_type": "code", - "execution_count": 168, + "execution_count": null, "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Accuracy en train: 0.9287598944591029\n", - "Accuracy en test: 0.9141104294478528\n" - ] - } - ], + "outputs": [], "source": [ "# Creamos el objeto del modelo con parámetros por defecto, fijando la semilla para evitar aleatoriedad\n", "logreg = LogisticRegression(random_state=42)\n", @@ -4506,20 +2157,9 @@ }, { "cell_type": "code", - "execution_count": 169, + "execution_count": null, "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "0.9447852760736196" - ] - }, - "execution_count": 169, - "metadata": {}, - "output_type": "execute_result" - } - ], + "outputs": [], "source": [ "pipe = Pipeline([('scaler', StandardScaler()), ('svc', SVC())])\n", "pipe.fit(X_train, y_train)\n", From 510cb89feaa5c32cc406269587511d4f448f0b29 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Diego=20Crist=C3=B3bal=20Herreros?= Date: Sun, 22 Jan 2023 08:38:46 +0100 Subject: [PATCH 2/2] Fix: remove typo --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index 4f9c725..c85d51b 100644 --- a/README.md +++ b/README.md @@ -1,5 +1,5 @@ # Segunda entrega de machine learning -Una empresa de apuestas solicita conocer quien gana en un partido de NBA. Para ello, se usan los datos actualizados hasta la tempporada 2022 https://www.kaggle.com/datasets/nathanlauga/nba-games?resource=download +Una empresa de apuestas solicita conocer quien gana en un partido de NBA. 
Para ello, se usan los datos actualizados hasta la temporada 2022 https://www.kaggle.com/datasets/nathanlauga/nba-games?resource=download Para conseguir el objetivo, se decide usar varios algoritmos de machine learning para poder escoger el que mejor resultado pueda satisfacer las necesidades de la empresa. \ No newline at end of file