LeetCode 049 字母异位分组
有勇气的牛排
27
算法
2025-08-04 22:00:46
题目
给你一个字符串数组,请你将 字母异位词 组合在一起。可以按任意顺序返回列表。
示例1:
输入: strs = [“eat”, “tea”, “tan”, “ate”, “nat”, “bat”]
输出: [[“bat”],[“nat”,“tan”],[“ate”,“eat”,“tea”]]
解释:
- 在 strs 中没有字符串可以通过重新排列来形成
"bat"
。
- 字符串 “nat” 和 “tan” 是字母异位词,因为他们可以重新排列形成彼此。
- 字符串 “ate”,“eat”,“tea” 是字母异位词,因为他们可以重新排列以形成彼此。
示例 2:
输入: strs = [""]
输出: [[""]]
示例 3:
输入: strs = [“a”]
输出: [[“a”]]
提示:
1 <= strs.length <= 104
0 <= strs[i].length <= 100
strs[i]
仅包含小写字母
排序法(常用,易实现)
核心:字母异位词排序后想同。例如:
"eat" → "aet"
"tea" → "aet"
这样可以用排序后的字符串作为分组的key。
复杂度:
- 时间:O(N × K logK)
- 空间:O(N × K)
优点:
缺点:
- 排序多了个 logK 因子
- 如果字符串很长,排序会耗时
from collections import defaultdict
def groupAnagrams(strs):
groups = defaultdict(list)
for str in strs:
str_sort = "".join(sorted(str))
groups[str_sort].append(str)
return list(groups.values())
strs = ["eat", "tea", "tan", "ate", "nat", "bat"]
print(groupAnagrams(strs))
strs = ["a"]
print(groupAnagrams(strs))
strs = [""]
print(groupAnagrams(strs))

计数法(高效)
核心:26个小写字母,用一个长度为26的元组作为唯一标识
步骤:
- 对每个字符串,用数组计数26个字母出现次数。
- 把该计数数组转成元组作为key。
- 存储哈希表分组。
复杂度:
- 时间O(N × K)(比排序法少了个 logK 因子)
- 空间:O(N × 26)
优点:
缺点:
from collections import defaultdict
def groupAnagrams(strs):
groups = defaultdict(list)
for str in strs:
count = [0] * 26
for char in str:
count[ord(char) - ord('a')] += 1
groups[tuple(count)].append(str)
return list(groups.values())
strs = ["eat", "tea", "tan", "ate", "nat", "bat"]
print(groupAnagrams(strs))
strs = ["a"]
print(groupAnagrams(strs))
strs = [""]
print(groupAnagrams(strs))
错误思路
假设取每个字母的 ASCII 码作为key。
这种方案不能正确区分所有字母异位词,会出现哈希冲突:
反例:
"ad" → ord('a') + ord('d') = 97 + 100 = 197
"bc" → ord('b') + ord('c') = 98 + 99 = 197
"ad"
和 "bc"
不是字母异位词,但它们的 ASCII 码和一样,会被错误分到同一组。
所以这个算法 不是一个可靠的分组依据。
<h2><a id="_0"></a>题目</h2>
<p>给你一个字符串数组,请你将 <strong>字母异位词</strong> 组合在一起。可以按任意顺序返回列表。</p>
<p><strong>示例1:</strong></p>
<blockquote>
<p><strong>输入:</strong> strs = [“eat”, “tea”, “tan”, “ate”, “nat”, “bat”]</p>
<p><strong>输出:</strong> [[“bat”],[“nat”,“tan”],[“ate”,“eat”,“tea”]]</p>
<p><strong>解释:</strong></p>
<ul>
<li>在 strs 中没有字符串可以通过重新排列来形成 <code>"bat"</code>。</li>
<li>字符串 “nat” 和 “tan” 是字母异位词,因为他们可以重新排列形成彼此。</li>
<li>字符串 “ate”,“eat”,“tea” 是字母异位词,因为他们可以重新排列以形成彼此。</li>
</ul>
</blockquote>
<p><strong>示例 2:</strong></p>
<blockquote>
<p><strong>输入:</strong> strs = [""]</p>
<p><strong>输出:</strong> [[""]]</p>
</blockquote>
<p><strong>示例 3:</strong></p>
<blockquote>
<p><strong>输入:</strong> strs = [“a”]</p>
<p><strong>输出:</strong> [[“a”]]</p>
</blockquote>
<p><strong>提示:</strong></p>
<ul>
<li><code>1 <= strs.length <= 104</code></li>
<li><code>0 <= strs[i].length <= 100</code></li>
<li><code>strs[i]</code> 仅包含小写字母</li>
</ul>
<h2><a id="_36"></a>排序法(常用,易实现)</h2>
<p>核心:字母异位词排序后想同。例如:</p>
<pre><div class="hljs"><code class="lang-shell">"eat" → "aet"
"tea" → "aet"
</code></div></pre>
<p>这样可以用排序后的字符串作为<strong>分组的key</strong>。</p>
<p><strong>复杂度</strong>:</p>
<ul>
<li>时间:O(N × K logK)</li>
<li>空间:O(N × K)</li>
</ul>
<p><strong>优点</strong>:</p>
<ul>
<li>思路直观,代码简单</li>
<li>排序后作为key非常容易</li>
</ul>
<p><strong>缺点</strong>:</p>
<ul>
<li>排序多了个 logK 因子</li>
<li>如果字符串很长,排序会耗时</li>
</ul>
<pre><div class="hljs"><code class="lang-python"><span class="hljs-comment"># -*- coding: utf-8 -*-</span>
<span class="hljs-keyword">from</span> collections <span class="hljs-keyword">import</span> defaultdict
<span class="hljs-keyword">def</span> <span class="hljs-title function_">groupAnagrams</span>(<span class="hljs-params">strs</span>):
groups = defaultdict(<span class="hljs-built_in">list</span>)
<span class="hljs-keyword">for</span> <span class="hljs-built_in">str</span> <span class="hljs-keyword">in</span> strs:
str_sort = <span class="hljs-string">""</span>.join(<span class="hljs-built_in">sorted</span>(<span class="hljs-built_in">str</span>))
groups[str_sort].append(<span class="hljs-built_in">str</span>)
<span class="hljs-keyword">return</span> <span class="hljs-built_in">list</span>(groups.values())
strs = [<span class="hljs-string">"eat"</span>, <span class="hljs-string">"tea"</span>, <span class="hljs-string">"tan"</span>, <span class="hljs-string">"ate"</span>, <span class="hljs-string">"nat"</span>, <span class="hljs-string">"bat"</span>]
<span class="hljs-built_in">print</span>(groupAnagrams(strs))
strs = [<span class="hljs-string">"a"</span>]
<span class="hljs-built_in">print</span>(groupAnagrams(strs))
strs = [<span class="hljs-string">""</span>]
<span class="hljs-built_in">print</span>(groupAnagrams(strs))
</code></div></pre>
<p><img src="https://static.couragesteak.com/article/89db152ec2d69b704050977906448581.png" alt="image.png" /></p>
<h2><a id="_88"></a>计数法(高效)</h2>
<p>核心:26个小写字母,用一个长度为26的元组作为唯一标识</p>
<p><strong>步骤</strong>:</p>
<ol>
<li>对每个字符串,用数组计数26个字母出现次数。</li>
<li>把该计数数组转成元组作为key。</li>
<li>存储哈希表分组。</li>
</ol>
<p><strong>复杂度</strong>:</p>
<ul>
<li>时间O(N × K)(比排序法少了个 logK 因子)</li>
<li>空间:O(N × 26)</li>
</ul>
<p><strong>优点</strong>:</p>
<ul>
<li>更快,尤其是 K 大时</li>
<li>不依赖排序</li>
</ul>
<p><strong>缺点</strong>:</p>
<ul>
<li>代码比排序法稍长</li>
</ul>
<pre><div class="hljs"><code class="lang-python"><span class="hljs-comment"># -*- coding: utf-8 -*-</span>
<span class="hljs-keyword">from</span> collections <span class="hljs-keyword">import</span> defaultdict
<span class="hljs-keyword">def</span> <span class="hljs-title function_">groupAnagrams</span>(<span class="hljs-params">strs</span>):
groups = defaultdict(<span class="hljs-built_in">list</span>)
<span class="hljs-keyword">for</span> <span class="hljs-built_in">str</span> <span class="hljs-keyword">in</span> strs: <span class="hljs-comment"># O(n)</span>
count = [<span class="hljs-number">0</span>] * <span class="hljs-number">26</span>
<span class="hljs-keyword">for</span> char <span class="hljs-keyword">in</span> <span class="hljs-built_in">str</span>: <span class="hljs-comment"># O(k)</span>
count[<span class="hljs-built_in">ord</span>(char) - <span class="hljs-built_in">ord</span>(<span class="hljs-string">'a'</span>)] += <span class="hljs-number">1</span>
groups[<span class="hljs-built_in">tuple</span>(count)].append(<span class="hljs-built_in">str</span>)
<span class="hljs-keyword">return</span> <span class="hljs-built_in">list</span>(groups.values())
strs = [<span class="hljs-string">"eat"</span>, <span class="hljs-string">"tea"</span>, <span class="hljs-string">"tan"</span>, <span class="hljs-string">"ate"</span>, <span class="hljs-string">"nat"</span>, <span class="hljs-string">"bat"</span>]
<span class="hljs-built_in">print</span>(groupAnagrams(strs))
strs = [<span class="hljs-string">"a"</span>]
<span class="hljs-built_in">print</span>(groupAnagrams(strs))
strs = [<span class="hljs-string">""</span>]
<span class="hljs-built_in">print</span>(groupAnagrams(strs))
</code></div></pre>
<h2><a id="_138"></a>错误思路</h2>
<p>假设取每个字母的 ASCII 码作为key。</p>
<p>这种方案<strong>不能正确区分所有字母异位词</strong>,会出现<strong>哈希冲突</strong>:</p>
<p><strong>反例</strong>:</p>
<pre><div class="hljs"><code class="lang-python"><span class="hljs-string">"ad"</span> → <span class="hljs-built_in">ord</span>(<span class="hljs-string">'a'</span>) + <span class="hljs-built_in">ord</span>(<span class="hljs-string">'d'</span>) = <span class="hljs-number">97</span> + <span class="hljs-number">100</span> = <span class="hljs-number">197</span>
<span class="hljs-string">"bc"</span> → <span class="hljs-built_in">ord</span>(<span class="hljs-string">'b'</span>) + <span class="hljs-built_in">ord</span>(<span class="hljs-string">'c'</span>) = <span class="hljs-number">98</span> + <span class="hljs-number">99</span> = <span class="hljs-number">197</span>
</code></div></pre>
<p><code>"ad"</code> 和 <code>"bc"</code> 不是字母异位词,但它们的 ASCII 码和一样,会被错误分到同一组。</p>
<p>所以这个算法 <strong>不是一个可靠的分组依据</strong>。</p>
评论区