TL;DR: The optimal solution maintains a sliding window of size m partitioned into three ordered sets (bottom k, middle, top k) to dynamically track the sum of the middle elements in logarithmic time.
In LeetCode 1825, titled Finding MK Average, we are asked to process a stream of integers and calculate a specific metric called the MKAverage. This metric is defined only when we have at least m elements. We consider the last m elements received, discard the smallest k and the largest k values, and calculate the average of the remaining m - 2k elements (rounded down). For example, with m = 3 and k = 1, if the last three elements are [3, 1, 10], we discard 1 and 10 and the MKAverage is 3.
This is a classic "sliding window" problem combined with "order statistics." We need a way to efficiently add new numbers, remove old numbers as the window slides, and query the sum of the central elements.
The naive approach involves strictly following the problem statement's simulation steps every time we need to calculate the average.
Every time `calculateMKAverage` is called:

1. Verify the stream has at least m elements; otherwise return -1.
2. Copy the last m elements.
3. Sort the copy, then discard the first k (smallest) and last k (largest) values.
4. Average the remaining m - 2k elements (integer division).

Pseudo-code:
```python
def calculateMKAverage():
    if len(stream) < m:
        return -1
    window = stream[-m:]        # Copy the last m elements
    window.sort()               # O(m log m)
    middle = window[k : m - k]  # Drop the smallest k and largest k
    return sum(middle) // len(middle)
```
Why this fails:
The time complexity of sorting is O(m log m). The problem constraints allow up to 10^5 calls to `calculateMKAverage`, and m can be up to 10^5.
In the worst case, the total complexity would be roughly O(10^5 × 10^5 × log 10^5), which is on the order of 10^11 operations. This far exceeds the typical limit of about 10^8 operations per second, resulting in a Time Limit Exceeded (TLE) error.
The core insight derived from the Heap - Two Heaps for Median Finding pattern is that we can maintain order statistics dynamically without re-sorting the entire array.
Instead of a single list, imagine maintaining three separate "buckets" or containers that hold the elements currently in the sliding window:
- **Left**: holds the smallest k elements.
- **Mid**: holds the middle m - 2k elements.
- **Right**: holds the largest k elements.

We also need a First-In-First-Out (FIFO) queue to track the arrival order of elements, so we know which element "expires" (leaves the window) when a new one arrives.
The Key Invariants:
- size(Left) is always exactly k.
- size(Right) is always exactly k.

Together with the ordering guarantee (every element of Left ≤ every element of Mid ≤ every element of Right), these invariants mean the answer to `calculateMKAverage` is simply sum(Mid) / size(Mid).
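To make the invariants concrete, here is a debug-only check one could run after every operation (a hypothetical helper; `L`, `M`, `R` stand for the three containers as sorted sequences):

```python
def check_invariants(L, M, R, k, mid_sum):
    """Assert the three-container invariants. L, M, R are sorted
    sequences holding the current window's elements."""
    assert len(L) == k and len(R) == k        # size invariants
    assert not L or not M or L[-1] <= M[0]    # Left's max <= Mid's min
    assert not M or not R or M[-1] <= R[0]    # Mid's max <= Right's min
    assert mid_sum == sum(M)                  # cached sum matches Mid
```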
Visual Description:
Imagine the data flowing into a pipeline. As a new number enters, it falls into one of the three buckets based on its value. Simultaneously, the oldest number leaves one of the buckets. This might disrupt the size constraints (e.g., Left might now have k+1 elements). To fix this, we "spill over" elements between buckets—moving the largest of Left to Mid, or the smallest of Right to Mid—until the sizes are restored to exactly k.
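As a sketch of that spill-over step (assuming, as in the Python solution further below, that the containers are `sortedcontainers.SortedList` instances and `mid_sum` is the running sum of Mid), restoring Left to exactly k elements looks like this; `rebalance_left` is a hypothetical helper for illustration:

```python
from sortedcontainers import SortedList

def rebalance_left(L: SortedList, M: SortedList, k: int, mid_sum: int) -> int:
    """Restore len(L) == k by moving boundary elements between L and M.
    Returns the updated running sum of M."""
    while len(L) > k:          # Left overflowed: spill its largest into Mid
        val = L.pop(-1)
        M.add(val)
        mid_sum += val
    while len(L) < k and M:    # Left underflowed: steal Mid's smallest
        val = M.pop(0)
        mid_sum -= val
        L.add(val)
    return mid_sum
```

The symmetric helper for Right moves Right's smallest down to Mid, or Mid's largest up to Right.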

Let's trace addElement(num) assuming the window is full (size m):
1. **Add New Element:**
   - Push `num` into the global FIFO queue.
   - Insert `num` into Left, Mid, or Right. A simple rule: if `num` is no larger than Left's maximum, put it in Left; if it is no smaller than Right's minimum, put it in Right; otherwise put it in Mid.
2. **Remove Old Element:**
   - Pop the oldest element, `expired`, from the global FIFO queue.
   - Find `expired` in Left, Mid, or Right and remove it.
3. **Rebalance (The "Heap" Logic):**
   - If size(Left) > k, remove the largest element from Left and move it to Mid. If size(Left) < k, remove the smallest element from Mid and move it to Left.
   - If size(Right) > k, remove the smallest element from Right and move it to Mid. If size(Right) < k, remove the largest element from Mid and move it to Right.
   - Whenever an element enters or leaves Mid, update `current_mid_sum`.
4. **Calculate Average:**
   - Since `current_mid_sum` is maintained live, simply return `current_mid_sum / (m - 2k)`.

The algorithm correctly calculates the MK Average because it maintains the invariant that Left contains the smallest k elements and Right contains the largest k elements of the current window.
By the properties of ordered sets (or heaps), every element in Left is ≤ every element in Mid, and every element in Mid is ≤ every element in Right.
When we calculate the average, we are therefore guaranteed to be summing exactly the elements that are not in the bottom k or top k. Since the sum is updated incrementally, the query operation is strictly consistent with the current state of the sliding window.
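Because this invariant argument is easy to get subtly wrong, a randomized cross-check against the brute-force definition is a cheap safety net before submitting. This harness is hypothetical and assumes the `MKAverage` class implemented in the sections below:

```python
import random

def brute_force(stream, m, k):
    """MKAverage computed exactly as the problem statement describes."""
    window = sorted(stream[-m:])
    middle = window[k : m - k]
    return sum(middle) // len(middle)

def cross_check(trials=1000, m=5, k=1):
    obj = MKAverage(m, k)
    stream = []
    for _ in range(trials):
        num = random.randint(1, 100)
        stream.append(num)
        obj.addElement(num)
        if len(stream) >= m:
            assert obj.calculateMKAverage() == brute_force(stream, m, k)
```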
C++ std::multiset is the ideal candidate here as it keeps elements sorted and allows finding/removing elements in logarithmic time.
```cpp
#include <iterator>
#include <queue>
#include <set>

class MKAverage {
private:
    int m, k;
    long long midSum;                     // running sum of the middle container
    std::queue<int> q;                    // FIFO order of the window
    std::multiset<int> left, mid, right;  // bottom k / middle / top k

public:
    MKAverage(int m, int k) : m(m), k(k), midSum(0) {}

    void addElement(int num) {
        // 1. Add the new element to the appropriate container
        q.push(num);
        if (left.empty() || num <= *left.rbegin()) {
            left.insert(num);
        } else if (right.empty() || num >= *right.begin()) {
            right.insert(num);
        } else {
            mid.insert(num);
            midSum += num;
        }

        // 2. Evict the expired element once the window exceeds m
        if ((int)q.size() > m) {
            int expired = q.front();
            q.pop();
            auto it = left.find(expired);
            if (it != left.end()) {
                left.erase(it);
            } else {
                it = mid.find(expired);
                if (it != mid.end()) {
                    midSum -= expired;
                    mid.erase(it);
                } else {
                    right.erase(right.find(expired));
                }
            }
        }

        // 3. Rebalance so left and right each hold exactly k elements
        while ((int)left.size() > k) {                   // left too big: spill max into mid
            int val = *left.rbegin();
            left.erase(std::prev(left.end()));
            mid.insert(val);
            midSum += val;
        }
        while ((int)left.size() < k && !mid.empty()) {   // left too small: steal mid's min
            int val = *mid.begin();
            mid.erase(mid.begin());
            midSum -= val;
            left.insert(val);
        }
        while ((int)right.size() > k) {                  // right too big: spill min into mid
            int val = *right.begin();
            right.erase(right.begin());
            mid.insert(val);
            midSum += val;
        }
        while ((int)right.size() < k && !mid.empty()) {  // right too small: steal mid's max
            int val = *mid.rbegin();
            mid.erase(std::prev(mid.end()));
            midSum -= val;
            right.insert(val);
        }
    }

    int calculateMKAverage() {
        if ((int)q.size() < m) return -1;
        return (int)(midSum / (m - 2 * k));
    }
};
```
Java lacks a direct multiset equivalent that supports efficient arbitrary removal of duplicates. TreeMap can be used to map value -> count, essentially simulating a multiset.
```java
import java.util.*;

class MKAverage {
    private int m, k;
    private long midSum;
    private Deque<Integer> queue;
    // Each TreeMap acts as a multiset: key = number, value = count
    private TreeMap<Integer, Integer> left, mid, right;
    private int leftSize, midSize, rightSize;

    public MKAverage(int m, int k) {
        this.m = m;
        this.k = k;
        this.queue = new ArrayDeque<>();
        this.left = new TreeMap<>();
        this.mid = new TreeMap<>();
        this.right = new TreeMap<>();
        this.midSum = 0;
        this.leftSize = 0;
        this.midSize = 0;
        this.rightSize = 0;
    }

    private void add(TreeMap<Integer, Integer> map, int num) {
        map.put(num, map.getOrDefault(num, 0) + 1);
    }

    private void remove(TreeMap<Integer, Integer> map, int num) {
        if (map.get(num) == 1) map.remove(num);
        else map.put(num, map.get(num) - 1);
    }

    public void addElement(int num) {
        queue.offer(num);

        // 1. Naive insert: always add to left, then let rebalancing sort it out
        add(left, num);
        leftSize++;

        // 2. Evict the expired element once the window exceeds m
        if (queue.size() > m) {
            int expired = queue.poll();
            if (left.containsKey(expired)) {
                remove(left, expired);
                leftSize--;
            } else if (mid.containsKey(expired)) {
                remove(mid, expired);
                midSize--;
                midSum -= expired;
            } else {
                remove(right, expired);
                rightSize--;
            }
        }

        // 3a. Left too big: spill its largest values into mid
        while (leftSize > k) {
            int val = left.lastKey();
            remove(left, val); leftSize--;
            add(mid, val); midSize++; midSum += val;
        }
        // 3b. Left too small (possible after eviction): steal mid's smallest
        while (leftSize < k && midSize > 0) {
            int val = mid.firstKey();
            remove(mid, val); midSize--; midSum -= val;
            add(left, val); leftSize++;
        }
        // 3c. Right too small: fill it with mid's largest values
        while (midSize > 0 && rightSize < k) {
            int val = mid.lastKey();
            remove(mid, val); midSize--; midSum -= val;
            add(right, val); rightSize++;
        }
        // 3d. Fix inversions: mid must hold nothing larger than right's minimum
        //     (the swap leaves both sizes unchanged)
        while (midSize > 0 && rightSize > 0 && mid.lastKey() > right.firstKey()) {
            int midVal = mid.lastKey();
            int rightVal = right.firstKey();
            remove(mid, midVal); midSum -= midVal;
            remove(right, rightVal);
            add(right, midVal);
            add(mid, rightVal); midSum += rightVal;
        }
        // 3e. Right too big (defensive): return its smallest values to mid
        while (rightSize > k) {
            int val = right.firstKey();
            remove(right, val); rightSize--;
            add(mid, val); midSize++; midSum += val;
        }
    }

    public int calculateMKAverage() {
        if (queue.size() < m) return -1;
        return (int) (midSum / (m - 2 * k));
    }
}
```
Python does not have a built-in ordered multiset (like C++) or TreeMap (like Java). In an interview, using SortedList from the sortedcontainers library is the standard accepted approach for Hard problems requiring order-statistic trees. If external libraries are strictly forbidden, one would need to implement two heaps with "lazy deletion," which is significantly more code. Below is the SortedList approach, which is clean and interview-ready for Python environments that support it (like LeetCode).
(One could also track the bottom-k and top-k sums inside a single `SortedList` with index arithmetic, but the boundary bookkeeping is error-prone; the three-container layout below follows the pattern directly.)

```python
from collections import deque
from sortedcontainers import SortedList

class MKAverage:
    def __init__(self, m: int, k: int):
        self.m, self.k = m, k
        self.q = deque()
        self.L = SortedList()  # Bottom k
        self.M = SortedList()  # Middle
        self.R = SortedList()  # Top k
        self.sum_mid = 0

    def addElement(self, num: int) -> None:
        self.q.append(num)

        # 1. Insert into the appropriate list
        if not self.L or num <= self.L[-1]:
            self.L.add(num)
        elif not self.R or num >= self.R[0]:
            self.R.add(num)
        else:
            self.M.add(num)
            self.sum_mid += num

        # 2. Remove the expired element
        if len(self.q) > self.m:
            expired = self.q.popleft()
            if expired in self.L:
                self.L.remove(expired)
            elif expired in self.R:
                self.R.remove(expired)
            else:
                self.M.remove(expired)
                self.sum_mid -= expired

        # 3. Rebalance L (must hold exactly k elements)
        while len(self.L) > self.k:
            val = self.L.pop(-1)
            self.M.add(val)
            self.sum_mid += val
        while len(self.L) < self.k and self.M:
            val = self.M.pop(0)
            self.sum_mid -= val
            self.L.add(val)

        # 4. Rebalance R (must hold exactly k elements)
        while len(self.R) > self.k:
            val = self.R.pop(0)
            self.M.add(val)
            self.sum_mid += val
        while len(self.R) < self.k and self.M:
            val = self.M.pop(-1)
            self.sum_mid -= val
            self.R.add(val)

    def calculateMKAverage(self) -> int:
        if len(self.q) < self.m:
            return -1
        return self.sum_mid // (self.m - 2 * self.k)
```
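To see the class in action, here is the example from the problem statement (m = 3, k = 1):

```python
obj = MKAverage(3, 1)
obj.addElement(3)                 # window: [3]
obj.addElement(1)                 # window: [3, 1]
print(obj.calculateMKAverage())   # -1: fewer than m elements so far
obj.addElement(10)                # window: [3, 1, 10]
print(obj.calculateMKAverage())   # 3: drop 1 and 10, average of [3]
obj.addElement(5)
obj.addElement(5)
obj.addElement(5)                 # window: [5, 5, 5]
print(obj.calculateMKAverage())   # 5
```

If `sortedcontainers` is genuinely unavailable, the lazy-deletion fallback mentioned above wraps `heapq` so that arbitrary removals are deferred until the victim surfaces at the top. A minimal sketch of that wrapper (hypothetical, not a standard-library class):

```python
import heapq

class LazyHeap:
    """Min-heap with deferred ("lazy") deletion: discard() only records
    the removal; stale entries are purged when they reach the top."""
    def __init__(self):
        self._heap = []
        self._deleted = {}   # value -> number of pending deletions
        self.size = 0        # logical size, excluding pending deletions

    def push(self, x: int) -> None:
        heapq.heappush(self._heap, x)
        self.size += 1

    def discard(self, x: int) -> None:
        # Assumes x is logically present in the heap.
        self._deleted[x] = self._deleted.get(x, 0) + 1
        self.size -= 1

    def top(self) -> int:
        # Purge entries discarded while buried in the heap (assumes non-empty).
        while self._heap and self._deleted.get(self._heap[0], 0) > 0:
            self._deleted[self._heap[0]] -= 1
            heapq.heappop(self._heap)
        return self._heap[0]
```

Each container would then need a min- and a max-oriented pair of these to expose both boundaries, which is why the text above calls this route "significantly more code".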
Let N be the number of operations and m be the window size.
Time Complexity: O(log m) per call to addElement, since each insertion, removal, and rebalancing move touches only a constant number of elements at O(log m) cost apiece, and O(1) per call to calculateMKAverage. Total: O(N log m).
The Heap - Two Heaps for Median Finding pattern and its generalization (Three Heaps/Containers) are powerful for problems involving sliding window order statistics.
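For reference, the two-heap version of the pattern (the k = 0 relative of our three containers) is shown below as a minimal median-finder sketch; `MedianFinder` is illustrative, not part of the solution above:

```python
import heapq

class MedianFinder:
    """Two-heap median: a max-heap (via negation) for the lower half,
    a min-heap for the upper half, sizes kept within one of each other."""
    def __init__(self):
        self.lo = []  # max-heap of the lower half (negated values)
        self.hi = []  # min-heap of the upper half

    def add(self, num: int) -> None:
        heapq.heappush(self.lo, -num)
        # Move the lower half's maximum up, then restore the size balance
        heapq.heappush(self.hi, -heapq.heappop(self.lo))
        if len(self.hi) > len(self.lo):
            heapq.heappush(self.lo, -heapq.heappop(self.hi))

    def median(self) -> float:
        if len(self.lo) > len(self.hi):
            return -self.lo[0]
        return (-self.lo[0] + self.hi[0]) / 2
```

MKAverage generalizes this by fixing the outer containers at size k and keeping a running sum of the middle one.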
**Why does the naive solution TLE?**
Because it copies and re-sorts the window, paying O(m log m) on every call to `calculateMKAverage`.

**Why can't I just use a standard Heap?**
A standard heap only exposes its minimum (or maximum); it cannot efficiently locate and remove an arbitrary expired element, and it cannot hand boundary elements across containers from both ends. The reason an O(1) `calculateMKAverage` is possible is because we maintain the sum incrementally.

**Space Complexity: O(m).**
We store at most m elements across the three containers and the FIFO queue.

**Common Pitfalls:**
- Relying on arbitrary heap removal: `PriorityQueue.remove(Object)` is O(n) in Java, and Python's `heapq` has no removal API at all.
- Forgetting to update `midSum` during rebalancing, not just on insertion and eviction.
- Incorrect integer division: the average must be rounded down, and the middle sum can reach roughly 10^10 (up to 10^5 elements with values up to 10^5), so store it in a 64-bit type (`long long` / `long`).