登录
OAmaster

Hi there! At the moment, FP doesn’t yet support multi-function challenges—like the ones you see on LeetCode—but we’re actively working on adding that capability. Thanks for your patience! In this problem, you'll implement a method of approximating metric percentiles. A metric is a numeric measurement that might reflect the performance, health, or behavior of a system over time. For example, a metric of page_load_ms might measure the milliseconds it takes to load your website. In your code, every time your website is visited, you'd measure how long it took and report it to the page_load_ms metric. A percentile is a common aggregate statistic used by engineers to summarize certain metrics. The n-th percentile is the value where n percent of data points are less than or equal to that value. For example, if the 90th percentile (denoted p90) of page_load_ms is 800, that means 90% of page loads are less than or equal to 800ms. Looking at a few percentiles (e.g. p50, p90, and p99) helps understand how your system is performing. While there are several accepted ways to calculate percentiles, for this problem we'll use the following definition: Let data be a list of all values reported for a metric so far, sorted in ascending order. Calculate rank = p * n / 100, where p is the percentile you are trying to calculate, ceil is the ceiling function, and n is the length of data. Finally, percentile p is equal to data[rank-1]. A common challenge of percentiles is calculating them efficiently for a large number of values. You can calculate percentile p through brute force - just sort the data and follow the procedure described above. While this would work, it's not efficient because you need to resort the list every time as values are added. In this problem, you'll implement a system to efficiently approximate percentiles using a histogram data structure. Note: we typically attach timestamps to metrics and calculate percentiles over a specific time period, e.g. the last 24 hours. For this problem, we'll ignore time and consider all values reported. Approximating percentiles To approximate percentiles, we can use histograms, which group values into fixed-range buckets and track counts in each bucket. For example, with the inclusive range [1, 100] split into 10 buckets, the first bucket counts values in 1, 10, the second bucket 11, 20, and so on. Given values: [75, 9, 5, 15, 39, 32, 8, 58, 45, 12, 23] The histogram buckets would be:

  • To estimate p60, we:
  • Calculate the target position: 60% of 11 values is about the 7th value
  • Count values in buckets until we reach p
  • Find this lands in bucket 31, 40
  • Estimate result as bucket midpoint: 35.5 We're using the bucket midpoint as a simple heuristic - other heuristics are possible. The actual p60, the 7th value, is 32 - our answer of 35.5 is reasonably close. The benefit of this strategy is that even if we have millions of datapoints, our system will still run quickly. Important note on approximation This is an "approximate solution" problem. Your implementation will be required to get close to the real answer, but it's not expected to calculate the exact answer. We recommend not over-optimizing for accuracy. Start simple and pass the early test cases. You can work on improving accuracy if you're still failing tests. Your task Implement the Percentiles class with the following methods:
  • constructor(limit: int): Initialize the system with the guarantee that metrics will have values in the inclusive range [0, limit].
  • report_data(metric: string, value: int) → void: Report the metric metric with the value value.
  • percentile(p: int, metric: string) → float: Calculate the p percentile for the metric metric. You can assume that the metric will have some values reported.

Example 1

Input:

limit = 100
metric = "page_load_ms"
value = 75
p = 60

Output:

35.5

Explanation: Given the values [75, 9, 5, 15, 39, 32, 8, 58, 45, 12, 23], the histogram buckets would be:

  • To estimate p60, we calculate the target position as 60% of 11 values, which is about the 7th value. Counting values in buckets until we reach p, we find this lands in bucket 31, 40. We estimate the result as the bucket midpoint, which is 35.5.

解法

每个 metric 维护一个固定的直方图:把 [1, limit] 划成 10 个等宽 bucket(或可配置数),report 时把 value 归入对应 bucket。percentile 时计算 rank = ceil(p * n / 100),从第一个 bucket 开始累加直到累积计数 ≥ rank,返回该 bucket 的中点。report O(1)percentile O(B),B 为 bucket 数。

import math

class Percentiles:
    BUCKETS = 10
    def __init__(self, limit: int):
        self.limit = limit
        self.bucket_size = max(1, math.ceil(limit / self.BUCKETS))
        self.metrics = {}

    def report_data(self, metric: str, value: int) -> None:
        if metric not in self.metrics:
            self.metrics[metric] = [0] * self.BUCKETS
        idx = min((max(value, 1) - 1) // self.bucket_size, self.BUCKETS - 1)
        self.metrics[metric][idx] += 1

    def percentile(self, p: int, metric: str) -> float:
        buckets = self.metrics[metric]
        n = sum(buckets)
        rank = math.ceil(p * n / 100)
        rank = max(rank, 1)
        cum = 0
        for i, c in enumerate(buckets):
            cum += c
            if cum >= rank:
                lo = i * self.bucket_size + 1
                hi = lo + self.bucket_size - 1
                return (lo + hi) / 2.0
        return float(self.limit)

The engineering organization is having an offsite, and everyone has been tasked with coming up with a piece of AIrtable trivia they can share with the team. You've decided to find out which engineering team is the most hydrated, based on the average cans of sparkling water consumed by all employees on that team in the last year. There are n employees, numbered from 0 to n-1. You are given two integer arrays cansConsumed and manager of length n, where: cansConsumed[i] represents employee i's cans consumed manager[i] represents employee i's direct manager (-1 indicates no manager) A "team" consists of all employees who report to a common manager (directly or indirectly through other managers). The common manager is not part of the team. The hydration score of a team is the average cans consumed of all employees on that team. Return the hydration score of the team with the highest hydration score. Only consider teams with 2 or more employees. Function Description Complete the function getMostHydratedTeam in the editor. getMostHydratedTeam has the following parameters:

    1. int[] cansConsumed: an array of integers representing the cans consumed by each employee
    1. int[] manager: an array of integers representing each employee's direct manager Returns int: the highest hydration score of any team

Example 1

Input:

cansConsumed = [70, 78, 53, 65, 100, 80, 92, 85, 43]
manager = [8, 4, 3, -1, 3, 8, 4, 4, 3]

Output:

85

Explanation: The team under employee 4 has three employees. Their hydration score is (78 + 85 + 92)/3 = 255/3 = 85. This is the highest hydration score of any team.

解法

构建反向树(manager -> children),对每个非叶子的 manager 节点收集其下整个子树的所有员工 cans,求平均;这里 "team" 指某 manager 下的所有员工。DFS 后序聚合 (sum, count),对每个 manager 节点(实际作为团队的根)计算其聚合 (sum, count),若 count ≥ 2 则更新答案。复杂度 O(n)

from typing import List
from collections import defaultdict
import sys

def get_most_hydrated_team(cansConsumed: List[int], manager: List[int]) -> float:
    sys.setrecursionlimit(200000)
    n = len(cansConsumed)
    children = defaultdict(list)
    for i, m in enumerate(manager):
        if m != -1:
            children[m].append(i)
    best = 0.0
    def dfs(u):
        nonlocal best
        sub_sum = cansConsumed[u]
        sub_cnt = 1
        for c in children[u]:
            s, k = dfs(c)
            sub_sum += s
            sub_cnt += k
        team_sum = sub_sum - cansConsumed[u]
        team_cnt = sub_cnt - 1
        if team_cnt >= 2:
            avg = team_sum / team_cnt
            if avg > best:
                best = avg
        return sub_sum, sub_cnt
    for i, m in enumerate(manager):
        if m == -1:
            dfs(i)
    return best
import java.util.*;

class Solution {
    double best = 0.0;
    List<List<Integer>> children;
    int[] cans;

    double getMostHydratedTeam(int[] cansConsumed, int[] manager) {
        int n = cansConsumed.length;
        cans = cansConsumed;
        children = new ArrayList<>();
        for (int i = 0; i < n; i++) children.add(new ArrayList<>());
        for (int i = 0; i < n; i++) if (manager[i] != -1) children.get(manager[i]).add(i);
        for (int i = 0; i < n; i++) if (manager[i] == -1) dfs(i);
        return best;
    }

    private long[] dfs(int u) {
        long sum = cans[u]; int cnt = 1;
        for (int c : children.get(u)) {
            long[] sub = dfs(c);
            sum += sub[0];
            cnt += (int) sub[1];
        }
        long teamSum = sum - cans[u];
        int teamCnt = cnt - 1;
        if (teamCnt >= 2) {
            double avg = (double) teamSum / teamCnt;
            if (avg > best) best = avg;
        }
        return new long[]{sum, cnt};
    }
}
class Solution {
public:
    double best = 0.0;
    vector<vector<int>> children;
    vector<int> cans;

    double getMostHydratedTeam(vector<int>& cansConsumed, vector<int>& manager) {
        int n = cansConsumed.size();
        cans = cansConsumed;
        children.assign(n, {});
        for (int i = 0; i < n; i++) if (manager[i] != -1) children[manager[i]].push_back(i);
        for (int i = 0; i < n; i++) if (manager[i] == -1) dfs(i);
        return best;
    }

private:
    pair<long long, int> dfs(int u) {
        long long sum = cans[u]; int cnt = 1;
        for (int c : children[u]) {
            auto [s, k] = dfs(c);
            sum += s; cnt += k;
        }
        long long teamSum = sum - cans[u];
        int teamCnt = cnt - 1;
        if (teamCnt >= 2) {
            double avg = (double) teamSum / teamCnt;
            if (avg > best) best = avg;
        }
        return {sum, cnt};
    }
};

Hi there! At the moment, FP doesn’t yet support multi-function challenges—like the ones you see on LeetCode—but we’re actively working on adding that capability. Thanks for your patience! In this problem, you'll implement a simplified version of Table data structure. A table consists of records (rows) with a number of defined fields (columns). Records are numbered from 1 to n and fields have string names. A specific field in a specific record is called a cell. For our purposes, we will only consider string values. Your implementation will need to support reads, writes, and references (detailed in the next section). Cell references Each cell can either store a string value or reference another cell. When a cell references another cell, it inherits that cell's value. References can chain - if cell A references cell B which references cell C, then cell A will have cell C's value. Consider this table tracking event attendees (assume Age is a string): If we make Carol's favorite food reference Erin's favorite food, Carol's food would change to "None". If Erin's food is later changed to "Hamburger", Carol's would automatically update to "Hamburger" as well. References can span across fields - a cell in the "Name" field could reference a cell in the "Age" field. You do not need to support circular references. Your task With this in mind, implement the Table class with the following methods: constructor(int fields: List[string]) Initialize the table with empty records. The fields array represents the names of the fields. get_cell(record_number: int, field: string) -> string Get the value of the cell in the field field for the record_number record. If the cell has no value, return an empty string. You can assume record_number and field are valid. set_cell(record_number: int, field: string, value: string) -> void Set the value of the cell in the field field for record_number to value. You can assume record_number and field are valid. reference_cell(record_number: int, field: string, target_record: int, target_field: string) -> void Set the cell in the field field for the record_number record to reference the cell in the field target_field for the target_record record. You can assume that both cells exist. delete_cell(record_number: int, field: string) -> void Delete the contents of the cell in field field for record_number. After this operation, the cell will either be referencing another cell or have a value (this method will not be called on an empty cell). As get_cell is the only method that returns anything, your solution will be judged on the outputs of this method. Implement the Percentiles class with the following methods: constructor(int: Int) Initialize the system with the guarantee that metrics will have values in the inclusive range [0, limit] report_data(metric: string, value: Int) -> void Report the metric metric with the value value. percentile(int, metric: string) -> float Return the p percentile for the metric metric. You can assume that the metric will have some data reported already. As percentile is the only method that returns anything, your solution will be judged on the outputs of this method. Example: limit = 10 Then, the following calls are made, in order:

    • report_data("x", 3)
    • report_data("x", 8)
    • report_data("y", 1)
    • percentile(25, "x") Returns 2
    • report_data("x", 4)
    • report_data("x", 6)
    • report_data("x", 7)
    • percentile(50, "x") Returns 5
    • percentile(0, "x") Returns 5 Constraints • 1 ≤ limit ≤ 2,147,483,647 • 1 ≤ metric.length ≤ 25 • 0 ≤ p ≤ 100 • 1 ≤ n ≤ 100,000 All input metrics will be made in the memory of Percentiles per test case.

Example 1

Input:

fields = ["Name", "City"]
record_number = 1
field = "Name"
value = "Alice"

Output:

void

Explanation: The set_cell method is called to set the value "Alice" for the "Name" field of record number 1.

Example 2

Input:

record_number = 3
field = "City"
target_record = 1
target_field = "City"

Output:

void

Explanation: The reference_cell method is called to set a reference from the "City" field of record number 3 to the "City" field of record number 1.

Example 3

Input:

record_number = 3
field = "City"

Output:

"New York"

Explanation: The get_cell method is called to get the value of the "City" field for record number 3, which returns "New York" because it references the "City" field of record number 1.

解法

每个 cell 存两种状态之一:value 字符串或 (target_record, target_field) 引用。get_cell 时一路顺着 reference 链找终点的 value。delete_cell 直接移除。无需处理循环引用。复杂度引用链长度。

class Table:
    def __init__(self, fields):
        self.fields = fields
        self.cells = {}  # (record, field) -> value or ("ref", record, field)

    def get_cell(self, record_number: int, field: str) -> str:
        key = (record_number, field)
        seen = set()
        while key in self.cells:
            entry = self.cells[key]
            if isinstance(entry, tuple) and entry[0] == "ref":
                if key in seen:
                    return ""
                seen.add(key)
                key = (entry[1], entry[2])
            else:
                return entry
        return ""

    def set_cell(self, record_number: int, field: str, value: str) -> None:
        self.cells[(record_number, field)] = value

    def reference_cell(self, record_number: int, field: str,
                       target_record: int, target_field: str) -> None:
        self.cells[(record_number, field)] = ("ref", target_record, target_field)

    def delete_cell(self, record_number: int, field: str) -> None:
        self.cells.pop((record_number, field), None)