comments | difficulty | edit_url | tags | |
---|---|---|---|---|
true |
Hard |
|
Table: Teams
+-------------+---------+ | Column Name | Type | +-------------+---------+ | player_id | int | | team_name | varchar | +-------------+---------+ player_id is the unique key for this table. Each row contains the unique identifier for player and the name of one of the teams participating in that match.
Table: Passes
+-------------+---------+ | Column Name | Type | +-------------+---------+ | pass_from | int | | time_stamp | varchar | | pass_to | int | +-------------+---------+ (pass_from, time_stamp) is the primary key for this table. pass_from is a foreign key to player_id from Teams table. Each row represents a pass made during a match, time_stamp represents the time in minutes (00:00-90:00) when the pass was made, pass_to is the player_id of the player receiving the pass.
Write a solution to calculate the dominance score for each team in both halves of the match. The rules are as follows:
- A match is divided into two halves: first half (
00:00
-45:00
minutes) and second half (45:01
-90:00
minutes) - The dominance score is calculated based on successful and intercepted passes:
- When pass_to is a player from the same team: +
1
point - When pass_to is a player from the opposing team (interception):
-1
point
- When pass_to is a player from the same team: +
- A higher dominance score indicates better passing performance
Return the result table ordered by team_name
and half_number
in ascending order.
The result format is in the following example.
Example:
Input:
Teams table:
+------------+-----------+ | player_id | team_name | +------------+-----------+ | 1 | Arsenal | | 2 | Arsenal | | 3 | Arsenal | | 4 | Chelsea | | 5 | Chelsea | | 6 | Chelsea | +------------+-----------+
Passes table:
+-----------+------------+---------+ | pass_from | time_stamp | pass_to | +-----------+------------+---------+ | 1 | 00:15 | 2 | | 2 | 00:45 | 3 | | 3 | 01:15 | 1 | | 4 | 00:30 | 1 | | 2 | 46:00 | 3 | | 3 | 46:15 | 4 | | 1 | 46:45 | 2 | | 5 | 46:30 | 6 | +-----------+------------+---------+
Output:
+-----------+-------------+-----------+ | team_name | half_number | dominance | +-----------+-------------+-----------+ | Arsenal | 1 | 3 | | Arsenal | 2 | 1 | | Chelsea | 1 | -1 | | Chelsea | 2 | 1 | +-----------+-------------+-----------+
Explanation:
- First Half (00:00-45:00):
<ul> <li>Arsenal's passes: <ul> <li>1 → 2 (00:15): Successful pass (+1)</li> <li>2 → 3 (00:45): Successful pass (+1)</li> <li>3 → 1 (01:15): Successful pass (+1)</li> </ul> </li> <li>Chelsea's passes: <ul> <li>4 → 1 (00:30): Intercepted by Arsenal (-1)</li> </ul> </li> </ul> </li> <li><strong>Second Half (45:01-90:00):</strong> <ul> <li>Arsenal's passes: <ul> <li>2 → 3 (46:00): Successful pass (+1)</li> <li>3 → 4 (46:15): Intercepted by Chelsea (-1)</li> <li>1 → 2 (46:45): Successful pass (+1)</li> </ul> </li> <li>Chelsea's passes: <ul> <li>5 → 6 (46:30): Successful pass (+1)</li> </ul> </li> </ul> </li> <li>The results are ordered by team_name and then half_number</li>
We can use an equi-join to find the teams of both the passer and the receiver for each pass. Then, based on the timestamp, we determine whether the pass occurred in the first half or the second half. By checking if the passer and receiver belong to the same team, we calculate the advantage value for each pass. Finally, we group by team name and half number, and sum the advantage values to get the advantage value for each team in the first and second halves.
# Write your MySQL query statement below
WITH
T AS (
SELECT
t1.team_name,
IF(time_stamp <= '45:00', 1, 2) half_number,
IF(t1.team_name = t2.team_name, 1, -1) dominance
FROM
Passes p
JOIN Teams t1 ON p.pass_from = t1.player_id
JOIN Teams t2 ON p.pass_to = t2.player_id
)
SELECT team_name, half_number, SUM(dominance) dominance
FROM T
GROUP BY 1, 2
ORDER BY 1, 2;
import pandas as pd
def calculate_team_dominance(teams: pd.DataFrame, passes: pd.DataFrame) -> pd.DataFrame:
passes_with_teams = passes.merge(
teams, left_on="pass_from", right_on="player_id", suffixes=("", "_team_from")
).merge(
teams,
left_on="pass_to",
right_on="player_id",
suffixes=("_team_from", "_team_to"),
)
passes_with_teams["half_number"] = passes_with_teams["time_stamp"].apply(
lambda x: 1 if x <= "45:00" else 2
)
passes_with_teams["dominance"] = passes_with_teams.apply(
lambda row: 1 if row["team_name_team_from"] == row["team_name_team_to"] else -1,
axis=1,
)
result = (
passes_with_teams.groupby(["team_name_team_from", "half_number"])["dominance"]
.sum()
.reset_index()
)
result.columns = ["team_name", "half_number", "dominance"]
result = result.sort_values(by=["team_name", "half_number"])
return result