This rule raises an issue when a group iterator returned by itertools.groupby() is stored in a container that persists beyond the
current loop iteration, instead of being consumed immediately.
The itertools.groupby() function groups consecutive items from an iterable based on a key function. It returns pairs of (key,
group_iterator) where each group_iterator is a sub-iterator that yields items for that specific group.
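The consecutive-grouping behavior can be illustrated with a minimal sketch. The sample string and the `runs` variable are assumptions chosen for illustration; each group is consumed inside the loop, while it is still valid.

```python
from itertools import groupby

# Illustrative sample data: "A" occurs in two separate runs.
data = "AAABBA"

runs = []
for key, group in groupby(data):
    # Consume the group right away and keep only plain values.
    runs.append((key, len(list(group))))

print(runs)  # [('A', 3), ('B', 2), ('A', 1)]
```

Because only consecutive items are grouped, the key "A" appears twice here. Sorting the input by the same key first is the usual way to get one group per key.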
A critical characteristic of these group iterators is that they share the same underlying data source as the main groupby iterator.
Every time the outer loop advances to the next group, the previous group iterator is invalidated, whether or not it has been
consumed. A raw group iterator that is stored and consumed only after the loop has moved on yields no items.
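The invalidation can be observed directly by stepping the outer iterator by hand. This is a small sketch with made-up sample data; the variable names are illustrative only.

```python
from itertools import groupby

# Illustrative sample data with two groups.
outer = groupby([1, 1, 2, 2])

key1, group1 = next(outer)
first_item = next(group1)   # group1 is only partially consumed at this point

key2, group2 = next(outer)  # advancing the outer iterator invalidates group1

leftover = list(group1)     # the rest of group1 is no longer reachable
current = list(group2)      # group2 is still valid

print(first_item, leftover, current)  # 1 [] [2, 2]
```

The second item of the first group is skipped silently: no exception is raised when the stale iterator is read.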
Code that reads these stale iterators processes empty sequences instead of the actual grouped data, which leads to incorrect results and silently lost data. Such bugs are hard to track down because no exception is raised; the code simply produces wrong output.
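There are two correct approaches:
- Consume each group iterator immediately, inside the loop body, before the loop advances.
- Convert each group iterator to a collection (list, tuple, set, etc.) before the loop advances. This allows the data to be stored and accessed later.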
from itertools import groupby

data = [1, 1, 2, 2, 3]
groups = {}
for key, group in groupby(data):
    groups[key] = group  # Noncompliant

for key, group in groups.items():
    print(f"{key}: {list(group)}")  # Empty results

from itertools import groupby

data = [1, 1, 2, 2, 3]
groups = {}
for key, group in groupby(data):
    groups[key] = list(group)  # Convert to list immediately

for key, group in groups.items():
    print(f"{key}: {group}")  # Correct results
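The example above materializes each group into a list. When only an aggregate is needed, another option is to consume the group inside the loop body and store just the result. The following is a minimal sketch of that idea; the `sizes` dictionary is an illustrative name, not part of the rule.

```python
from itertools import groupby

data = [1, 1, 2, 2, 3]

sizes = {}
for key, group in groupby(data):
    # Reduce the group to a plain value inside the loop body;
    # no reference to the group iterator is kept.
    sizes[key] = sum(1 for _ in group)

print(sizes)  # {1: 2, 2: 2, 3: 1}
```

This avoids holding every item in memory, at the cost of not being able to revisit the items later.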
Reference: Python documentation for itertools.groupby