Top K Frequent Elements - How To Implement

Top K Frequent Elements - How To Implement

Introduction

In this blog post, we will discuss the problem of finding the top K frequent elements in an array. We will explore an algorithmic approach to efficiently solve this problem.

Problem Statement

Given an array of integers nums and an integer k, we need to find the top K frequent elements in the array.

Data Structure used

To solve this problem, we will use a map to count the frequency of each element in the array. We will also use a vector to store the unique elements.

Algorithmic approach

We will use a combination of counting frequencies and the quickselect algorithm to find the top K frequent elements. The algorithm is as follows:

  1. Initialize a map count_map to count the frequency of each element in the array.

  2. Iterate through each element in the array and update its frequency in the count_map.

  3. Initialize a vector unique to store the unique elements from the array.

  4. Implement the quickselect algorithm to find the element at the (n - k)th position in the unique vector, where n is the size of the unique vector.

    • Select a pivot element randomly.

    • Partition the unique vector around the pivot index, so that elements smaller than the pivot are on the left and elements larger than the pivot are on the right.

    • If the pivot index is equal to (n - k), we have found the top K frequent elements. Return from the quickselect algorithm.

    • If the pivot index is less than (n - k), recursively call quickselect on the right side of the pivot.

    • If the pivot index is greater than (n - k), recursively call quickselect on the left side of the pivot.

  5. Create a new vector top_k_frequent with size K and copy the top K frequent elements from the unique vector into it.

  6. Return the top_k_frequent vector.

Implementation

void quickselect(int left, int right, int k_smallest)
{

    if (left == right)
        return;

    int pivot_index = left + rand() % (right - left + 1);

    pivot_index = partition(left, right, pivot_index);

    if (k_smallest == pivot_index)
    {
        return;
    }
    else if (k_smallest < pivot_index)
    {
        quickselect(left, pivot_index - 1, k_smallest);
    }
    else
    {
        quickselect(pivot_index + 1, right, k_smallest);
    }
}

vector<int> topKFrequent(vector<int> &nums, int k)
{
    for (int n : nums)
    {
        count_map[n] += 1;
    }

    int n = count_map.size();
    for (pair<int, int> p : count_map)
    {
        unique.push_back(p.first);
    }

    quickselect(0, n - 1, n - k);
    vector<int> top_k_frequent(k);
    copy(unique.begin() + n - k, unique.end(), top_k_frequent.begin());
    return top_k_frequent;
}

Time Complexity

The time complexity of this algorithm depends on the quickselect algorithm, which has an average time complexity of O(N), where N is the number of elements in the array. Additionally, counting the frequencies of elements in the array takes O(N) time. Therefore, the overall time complexity is O(N).

Space Complexity

The space complexity of this algorithm is O(N), where N is the number of elements in the array. The space is used to store the count_map, unique, and top_k_frequent vectors.

Conclusion

In this blog post, we discussed the problem of finding the top K frequent elements in an array. We explored an algorithmic approach that combines counting frequencies with the quickselect algorithm to efficiently solve the problem. By understanding the problem requirements and implementing an efficient algorithm, we can find the top K frequent elements in an array.

Links: