Python 3 has an odd oversight: there’s no function for finding the length of an arbitrary iterable. You can do len(aList) of course, and len(aTuple). But there’s no len(anIterator). What’s really surprising is that there’s no itertools.len(anIterator) either, or some equivalent function.
The naive solution is len(tuple(anIterator)). But that’s inefficient because it constructs a tuple in memory. This discussion suggests sum(1 for e in anIterator), which, if I understand generators correctly, should be pretty efficient and involve no allocation of new collections. I’m surprised this function isn’t in itertools.
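To make the two approaches concrete, here is a minimal sketch of both. The function names count_via_tuple and count_via_sum are my own labels for illustration, not anything from the standard library.

```python
from itertools import repeat


def count_via_tuple(iterable):
    """Count items by materializing them in a tuple (O(n) extra memory)."""
    return len(tuple(iterable))


def count_via_sum(iterable):
    """Count items by summing 1 per item, with no intermediate collection."""
    return sum(1 for _ in iterable)


# Both functions consume the iterator, so each call gets a fresh one.
print(count_via_tuple(repeat(None, 1000)))                   # 1000
print(count_via_sum(x for x in range(1000) if x % 2 == 0))   # 500
```

Either way the iterator is exhausted afterwards, so this only helps when you need the count and not the items themselves.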
Update: see also cardinality.count() for a clever faster implementation.
Update 2: I had some yak-shaving time, so I made an IPython notebook timing three different solutions for counting the length of an iterable. Relative running times (lower is faster):
- 1x: len of a tuple of the iterable
- 7x: a deque of an enumerate of the iterable
- 11x: sum of 1 over the iterable
The results are roughly the same for iterables of 1,000, 1,000,000, and 100,000,000 items, and they hold whether the iterable is a list or an ephemeral itertools iterable.
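The deque variant is the least obvious of the three, so here is one way it might look. This is my reading of "deque an enumerate of iterable", not necessarily the exact code in the notebook, and count_via_deque is a name made up for this sketch.

```python
from collections import deque


def count_via_deque(iterable):
    """Count items by keeping only the last (index, item) pair of enumerate."""
    # Starting enumerate at 1 makes the final index equal to the item count.
    # maxlen=1 discards every pair except the most recent, so memory stays O(1).
    last = deque(enumerate(iterable, 1), maxlen=1)
    # An empty deque means the iterable produced no items at all.
    return last[0][0] if last else 0


print(count_via_deque(iter("hello")))  # 5
print(count_via_deque(iter([])))       # 0
```

The appeal is that the loop over the iterable runs inside deque's C implementation, with no Python-level generator frame resumed once per item.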
What’s astonishing is how much faster the len(tuple(iterable)) method is than the others. Naively it should be slower, since it’s doing a lot more work building that tuple in memory. But perhaps this is a highly optimized code path in Python. The tradeoff is that it temporarily allocates memory to hold the whole iterable, so it uses O(n) memory compared to O(1) for the other solutions.
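If you want to check the comparison on your own machine, a rough timing harness along these lines should do. It is a sketch using timeit rather than the notebook itself; the size N, the repeat count, and the function names are arbitrary choices of mine, and the ratios will vary with Python version and hardware.

```python
import timeit
from collections import deque
from itertools import repeat


def count_via_tuple(iterable):
    return len(tuple(iterable))


def count_via_deque(iterable):
    last = deque(enumerate(iterable, 1), maxlen=1)
    return last[0][0] if last else 0


def count_via_sum(iterable):
    return sum(1 for _ in iterable)


N = 100_000  # items per iterable (arbitrary)
RUNS = 20    # timed calls per method (arbitrary)

for fn in (count_via_tuple, count_via_deque, count_via_sum):
    # Build a fresh itertools iterable for every call, since counting consumes it.
    seconds = timeit.timeit(lambda: fn(repeat(None, N)), number=RUNS)
    print(f"{fn.__name__}: {seconds:.4f}s for {RUNS} runs")
```

This measures time only. To see the O(n) versus O(1) memory difference you would need something like tracemalloc as well.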