Skip to content

Latest commit

 

History

History
229 lines (186 loc) · 6.41 KB

File metadata and controls

229 lines (186 loc) · 6.41 KB
comments difficulty edit_url tags
true
困难
数据库

English Version

题目描述

表:Teams

+-------------+---------+
| Column Name | Type    |
+-------------+---------+
| player_id   | int     |
| team_name   | varchar | 
+-------------+---------+
player_id 是这张表的唯一主键。
每一行包含队员的唯一标识以及在该场比赛中参赛的某支队伍的名称。

表:Passes

+-------------+---------+
| Column Name | Type    |
+-------------+---------+
| pass_from   | int     |
| time_stamp  | varchar |
| pass_to     | int     |
+-------------+---------+
(pass_from, time_stamp) 是这张表的主键。
pass_from 是指向 Teams 表 player_id 字段的外键。
每一行代表比赛期间的一次传球,time_stamp 表示传球发生的分钟时间(00:00-90:00),
pass_to 表示 player_id 对应队员接球。

编写一个解决方案来计算每支球队 在上半场的优势得分。规则如下:

  • 一场比赛分为两个半场:上半场00:00-45:00 分钟)和 下半场45:01-90:00 分钟)
  • 优势得分是根据成功和截获的传球来计算的:
    • 当 pass_to 是 同球队 的队员:+1
    • 当 pass_to 是 对方球队 的队员(截获):-1
  • 更高的优势得分表明传球表现更好

返回结果表以 team_name 和 half_number 升序 排序。

结果格式如下所示。

 

示例:

输入:

Teams 表:

+------------+-----------+
| player_id  | team_name |
+------------+-----------+
| 1          | Arsenal   |
| 2          | Arsenal   |
| 3          | Arsenal   |
| 4          | Chelsea   |
| 5          | Chelsea   |
| 6          | Chelsea   |
+------------+-----------+

Passes 表:

+-----------+------------+---------+
| pass_from | time_stamp | pass_to |
+-----------+------------+---------+
| 1         | 00:15      | 2       |
| 2         | 00:45      | 3       |
| 3         | 01:15      | 1       |
| 4         | 00:30      | 1       |
| 2         | 46:00      | 3       |
| 3         | 46:15      | 4       |
| 1         | 46:45      | 2       |
| 5         | 46:30      | 6       |
+-----------+------------+---------+

输出:

+-----------+-------------+-----------+
| team_name | half_number | dominance |
+-----------+-------------+-----------+
| Arsenal   | 1           | 3         |
| Arsenal   | 2           | 1         |
| Chelsea   | 1           | -1        |
| Chelsea   | 2           | 1         |
+-----------+-------------+-----------+

解释:

  • 前半场(00:00-45:00):
    <ul>
    	<li>阿森纳的传球:
    	<ul>
    		<li>1 → 2 (00:15):成功传球(+1)</li>
    		<li>2 → 3 (00:45):成功传球(+1)</li>
    		<li>3 → 1 (01:15):成功传球(+1)</li>
    	</ul>
    	</li>
    	<li>切尔西的传球:
    	<ul>
    		<li>4 → 1 (00:30): 被阿森纳截获(-1)</li>
    	</ul>
    	</li>
    </ul>
    </li>
    <li><strong>下半场(45:01-90:00):</strong>
    <ul>
    	<li>阿森纳的传球:
    	<ul>
    		<li>2 → 3 (46:00):成功传球(+1)</li>
    		<li>3 → 4 (46:15):被切尔西截获&nbsp;(-1)</li>
    		<li>1 → 2 (46:45):成功传球(+1)</li>
    	</ul>
    	</li>
    	<li>切尔西的传球:
    	<ul>
    		<li>5 → 6 (46:30):成功传球(+1)</li>
    	</ul>
    	</li>
    </ul>
    </li>
    <li>结果以 team_name 和 half_number 升序排序</li>
    

解法

方法一:等值连接 + 分组求和

我们可以通过等值连接,找到每次传球的发起方和接收方所在的球队,然后根据传球的时间戳判断传球发生在上半场还是下半场,根据传球的发起方和接收方所在的球队是否相同,计算出每次传球的优势值,最后按照球队名称和半场编号进行分组求和,得到每支球队在上半场和下半场的优势值。

MySQL

# Write your MySQL query statement below
WITH
    T AS (
        SELECT
            t1.team_name,
            IF(time_stamp <= '45:00', 1, 2) half_number,
            IF(t1.team_name = t2.team_name, 1, -1) dominance
        FROM
            Passes p
            JOIN Teams t1 ON p.pass_from = t1.player_id
            JOIN Teams t2 ON p.pass_to = t2.player_id
    )
SELECT team_name, half_number, SUM(dominance) dominance
FROM T
GROUP BY 1, 2
ORDER BY 1, 2;

Pandas

import pandas as pd


def calculate_team_dominance(teams: pd.DataFrame, passes: pd.DataFrame) -> pd.DataFrame:
    passes_with_teams = passes.merge(
        teams, left_on="pass_from", right_on="player_id", suffixes=("", "_team_from")
    ).merge(
        teams,
        left_on="pass_to",
        right_on="player_id",
        suffixes=("_team_from", "_team_to"),
    )
    passes_with_teams["half_number"] = passes_with_teams["time_stamp"].apply(
        lambda x: 1 if x <= "45:00" else 2
    )
    passes_with_teams["dominance"] = passes_with_teams.apply(
        lambda row: 1 if row["team_name_team_from"] == row["team_name_team_to"] else -1,
        axis=1,
    )
    result = (
        passes_with_teams.groupby(["team_name_team_from", "half_number"])["dominance"]
        .sum()
        .reset_index()
    )
    result.columns = ["team_name", "half_number", "dominance"]
    result = result.sort_values(by=["team_name", "half_number"])
    return result