C++ 用于在给定的字符串流中分组变位词

一个 变位词 是由另一个单词的字母重新排列形成的单词，例如 “listen” 和 “silent” 。要在每个字符串流中分组变位词，我们需要将所有彼此是变位词的字符串分组在一起。

示例1

C++中使用哈希表分组变位词的代码片段：

#include <iostream>
#include <algorithm>
#include <unordered_map>
#include <vector>
#include <string>
// Function to group anagrams
void groupAnagrams(std::vector<std::string>& strs) {
std::unordered_map<std::string, std::vector<std::string>>hashTable;
    for (const auto&str : strs) {
std::string temp = str;
        // Sort the string to form a key
std::sort(temp.begin(), temp.end());
hashTable[temp].push_back(str);
    }
strs.clear();
    for (const auto& [key, value] :hashTable) {
        for (const auto&str : value) {
strs.push_back(str);
        }
    }
}

// Main function to test the code
int main() {
std::vector<std::string> strs = {"eat", "tea", "tan", "ate", "nat", "bat"};
groupAnagrams(strs);
    for (const auto&str : strs) {
std::cout<< str << " ";
    }
    return 0;
}

输出

bat tan nat eat tea ate

在这个实现中，首先创建一个无序映射 hashTable ，其中键是排序后的字符串，值是彼此为变位词的字符串的向量。然后，我们遍历输入向量 strs 中的每个字符串，将字符串排序形成一个键，并将其添加到哈希表中。最后，我们清空输入向量 “strs” ，并遍历哈希表将变位词分组追加到strs中。

在 C++ 中，有其他方法可以对字符串流中的变位词进行分组。其中一种方法是使用一对进行存储，将排序后的字符串作为一对中的第一个元素，原始字符串作为第二个元素。然后，可以根据排序后的字符串对此对的向量进行排序，这样变位词将相邻。最后，可以提取原始字符串并将其存储在一个单独的向量中。

示例2

以下是这种方法的一种实现：

#include <iostream>
#include <algorithm>
#include <vector>
#include <string>
// Function to group anagrams
void groupAnagrams(std::vector<std::string>& strs) {
std::vector<std::pair<std::string, std::string>>sortedStrs;
    for (const auto&str : strs) {
std::string sortedStr = str;
std::sort(sortedStr.begin(), sortedStr.end());
sortedStrs.push_back(std::make_pair(sortedStr, str));
    }
std::sort(sortedStrs.begin(), sortedStrs.end());
strs.clear();
    for (const auto&pair :sortedStrs) {
strs.push_back(pair.second);
    }
}
// Main function to test the code
int main() {
std::vector<std::string> strs = {"eat", "tea", "tan", "ate", "nat", "bat"};
groupAnagrams(strs);
    for (const auto&str : strs) {
std::cout<< str << " ";
    }
    return 0;
}

输出

bat ate eat tea nat tan

在这个实现中，我们首先创建了一个成对向量 sortedStrs ，其中每对的第一个元素是 排序后的字符串 ，第二个元素是 原始字符串 。然后，我们遍历输入向量strs中的每个字符串，对该字符串进行排序以形成排序后的字符串，并将该对加入到 sortedStrs 中。我们使用 std::sort() 对 sortedStrs 基于排序后的字符串进行排序，该函数根据每对的第一个元素进行字典排序。最后，我们清空输入向量 “strs” ，并遍历 sortedStrs 提取每对的第二个元素，并将其追加到 “strs” 中。

这种方法的时间复杂度为 O(N * M * log M) ，其中 N 是输入向量中的字符串数量， M 是 最长字符串 的长度。空间复杂度为 O(N * M) ，因为我们需要为每个输入字符串存储排序后的字符串。

示例3

还有一种实现方法是使用计数排序的方法：

#include 
#include 
#include 
#include 
#include 

// Function to group anagrams
void groupAnagrams(std::vector& strs) {
    std::unordered_map> map;
    for (const auto&str : strs) {
        int count[26] = {0};
        for (const auto&c : str) {
count[c - 'a']++;
        }
std::string key;
        for (int i = 0; i< 26; i++) {
            key += std::to_string(count[i]) + "#";
        }
        map[key].push_back(str);
    }
strs.clear();
    for (const auto&pair : map) {
        for (const auto& str : pair.second) {
strs.push_back(str);
        }
    }
}

// Main function to test the code
int main() {
    std::vector strs = {"eat", "tea", "tan", "ate", "nat", "bat"};
groupAnagrams(strs);
    for (const auto& str : strs) {
std::cout<< str << " ";
    }
    return 0;
}

输出结果

bat tan nat eat tea ate

在这个实现中，我们首先创建一个无序映射 map ，其中键是基于字符串中每个字符的计数排序的字符串，值是具有相同排序字符串的原始字符串的向量。之后，我们遍历输入向量strs中的每个字符串，使用计数数组count计算每个字符的频率，并通过拼接每个字符的计数与 “#” 分隔符构建键。我们将原始字符串插入到map中与键对应的值向量中。最后，我们清除输入向量 strs 并遍历map以提取值向量并将其元素附加到strs中。

还有一些其他方法：

Trie方法： 我们可以使用Trie数据结构来根据其排序字符串分组异位词。我们首先对每个字符串进行排序，然后将其插入到Trie中，并将原始字符串存储在对应于其排序字符串的叶节点上。之后，我们可以遍历Trie来提取异位词组。这种方法的时间复杂度为 O(N * M * log M) ，其中N是输入向量中的字符串数量，M是最长字符串的长度。
桶排序方法： 我们可以使用桶排序算法根据其排序字符串来分组异位词。我们可以创建一个向量的数组，其中每个向量的索引对应于排序字符串，并将原始字符串附加到该索引处的向量中。之后，我们可以连接数组中的向量来提取异位词组。这种方法的时间复杂度为 O(N * M) ，其中N是输入向量中的字符串数量，M是最长字符串的长度。

所有这些方法都有各自的优点和缺点，最佳方法取决于输入向量的大小、字符串的长度和可用内存。