Starting with a Clean Slate

Iterators provide a way to loop over a collection of items without having to load all the items into memory at once.

An iterator keeps track of where you are in the iterable object and knows how to give you the next element when you ask for it.

Let us see an example of how to use an iterator to traverse a list:

my_list = [1, 2, 3, 4, 5]
iterator = iter(my_list)
print(next(iterator))
print(next(iterator))
print(next(iterator))
#     1
#     2
#     3

In this example, we create an iterator from a list of numbers using the ‘iter’ function. We then use the ‘next’ function to retrieve the elements of the sequence one by one. When there are no more elements, the ‘next’ function raises a StopIteration exception.
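To make the end-of-sequence behaviour concrete, here is a minimal sketch that exhausts a short iterator. The variable names are only illustrative. It also shows that ‘next’ accepts an optional default value, which is returned instead of raising StopIteration.

```python
numbers = iter([1, 2])

first = next(numbers)   # 1
second = next(numbers)  # 2

# A third call has nothing left to return, so StopIteration is raised
try:
    next(numbers)
    exhausted = False
except StopIteration:
    exhausted = True

# Passing a default avoids the exception on an exhausted iterator
fallback = next(numbers, None)

print(first, second, exhausted, fallback)  # 1 2 True None
```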

Moving a Step Higher😎

Custom Iterators

In Python, all iterators implement these two methods:

  1. The __iter__ method returns the iterator object itself. This method is required to allow the iterator to be used in a for loop or any other construct that expects an iterable.
  2. The __next__ method is used to return the next element in the sequence. If there are no more elements to iterate over, it raises the StopIteration exception. This is the standard way to signal the end of the iteration process.
Iterators return the next element only when __next__ is called.
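As a rough sketch of how these two methods are used, the helper below mimics what a for loop does: it calls iter() once, then calls next() repeatedly until StopIteration is raised. The function name manual_for is hypothetical and exists only for illustration.

```python
def manual_for(iterable):
    """Collect items roughly the way a for loop would, using iter() and next()."""
    iterator = iter(iterable)      # calls iterable.__iter__()
    items = []
    while True:
        try:
            item = next(iterator)  # calls iterator.__next__()
        except StopIteration:
            break                  # the loop ends quietly, just like a for loop
        items.append(item)
    return items


print(manual_for([10, 20, 30]))  # [10, 20, 30]
```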

The previous example showed how to use a built-in iterator. Now let’s see how to make your own iterator and understand what’s going on behind the scenes.

class EvenNumbersIterator:
    def __init__(self, num):
        self.num = num
        self.current = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self.current >= 2 * self.num:
            raise StopIteration

        result = self.current
        self.current += 2

        return result

even_numbers = EvenNumbersIterator(5)
for num in even_numbers:
    print(num)

# ->  0
#     2
#     4
#     6
#     8

In this example, the iterator takes num as a parameter, which is the number of even numbers to return. The __iter__() method returns the iterator object itself. The __next__() method takes the current even number, advances the internal state by 2, and returns that number. The iterator stops after returning the first num even numbers, that is, once current reaches 2 * num.
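Because the class follows the iterator protocol, it can be handed to anything that consumes an iterable, not just a for loop. The sketch below repeats the class so it runs on its own, then passes it to list() and sum(). It also shows one thing to keep in mind: an iterator like this is single-use, so a second pass over the same object yields nothing.

```python
class EvenNumbersIterator:
    def __init__(self, num):
        self.num = num
        self.current = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self.current >= 2 * self.num:
            raise StopIteration
        result = self.current
        self.current += 2
        return result


evens = EvenNumbersIterator(5)
first_pass = list(evens)    # consumes the iterator
second_pass = list(evens)   # already exhausted, so this is empty
total = sum(EvenNumbersIterator(5))  # a fresh iterator works again

print(first_pass)   # [0, 2, 4, 6, 8]
print(second_pass)  # []
print(total)        # 20
```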

Why use the above approach if we can simply use a function and a for loop?

def generate_evens(num):
    for i in range(num * 2):
        if i % 2 == 0:
            print(i)
generate_evens(5)

This is the most straightforward way to do what the previous example did. The code is simple and prints its output immediately, but it lacks lazy evaluation and reusability.

It lacks reusability in the sense that it is specific to the task of printing even numbers within a given range. If you want to use the generated even numbers in different contexts or pass them as input to other functions, you would need to modify the function to return a list or use other mechanisms.
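For comparison, here is a sketch of the "return a list" modification mentioned above. The name collect_evens is hypothetical. The result can now be reused by other code, but the whole list is built up front, so this version is still not lazy.

```python
def collect_evens(num):
    """Return the first `num` even numbers as a list instead of printing them."""
    return [i for i in range(num * 2) if i % 2 == 0]


evens = collect_evens(5)
print(evens)       # [0, 2, 4, 6, 8]
print(sum(evens))  # 20
```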

Why use an iterator?

While we can use a ‘for’ loop to traverse a list directly, iterators provide some additional benefits and flexibility that can be useful in certain situations.

  • Memory efficiency: An iterator can produce one item at a time, while building a full list first keeps the entire sequence in memory. This can be important for large datasets or memory-limited systems.
  • Lazy evaluation: The iterator approach enables lazy evaluation, generating the next element on the fly as it is requested during iteration. This can be beneficial when working with large or infinite sequences, as it avoids the need to generate all numbers upfront and saves memory.
  • Flexibility and reusability: A custom iterator separates producing the values from using them, so the same iterator can feed a for loop, a function such as sum(), or any other code that accepts an iterable. You can define your own custom iterators to generate sequences that are tailored to your specific needs.
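The lazy evaluation point is easiest to see with a sequence that never ends. As a small sketch, the standard library's itertools.count produces even numbers on demand, and itertools.islice takes only as many as we ask for. No list of "all even numbers" is ever built.

```python
from itertools import count, islice

# An endless stream of even numbers: 0, 2, 4, ...
all_evens = count(start=0, step=2)

# Only the requested items are generated
first_five = list(islice(all_evens, 5))

# The iterator remembers its position, so the next request continues from there
next_three = list(islice(all_evens, 3))

print(first_five)  # [0, 2, 4, 6, 8]
print(next_three)  # [10, 12, 14]
```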

When to use an iterator?

Suppose you have a dataset that consists of millions of rows of data, and you need to perform some calculations on each row. One way to process this dataset would be to load the entire dataset into memory as a list or other sequence object, and then loop through the rows using a for loop. However, if the dataset is very large, this approach could quickly use up all available memory on your system.

Instead, you could use an iterator to read the data from the dataset file one row at a time, and perform your calculations on each row as you go. Here’s an example of how you might use an iterator to process a large dataset:

import csv

# Open the dataset file (the csv module docs recommend newline='')
with open('large_dataset.csv', 'r', newline='') as f:

    # Create an iterator that reads one row at a time
    rows = csv.reader(f)

    # Loop through the rows using the iterator
    for row in rows:
        # Process the row and perform calculations
        # ...
        pass  # placeholder so the loop body is valid Python

In this code, the csv.reader() function takes a file object f as an argument and returns an iterator rows. This iterator is then used in a for loop to iterate over each row of the CSV file, with the variable row taking on the value of each row as a list of values. The processing of each row happens inside the loop where the row variable is available as a list of values.
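Here is a small runnable variant of the same idea, under some assumptions: the data is a made-up two-column sample held in an io.StringIO object instead of a real file, and the "calculation" is simply summing the second column. Only one row is held in the row variable at any time.

```python
import csv
import io

# Made-up sample data standing in for a large CSV file
sample = io.StringIO("name,amount\nalice,10\nbob,32\n")

rows = csv.reader(sample)
header = next(rows)  # pull the header row off the iterator first

total = 0
for row in rows:
    # Each row arrives as a list of strings, e.g. ['alice', '10']
    total += int(row[1])

print(header)  # ['name', 'amount']
print(total)   # 42
```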

Summing Up

  • An iterator is an object that allows you to traverse a sequence one item at a time.
  • Iterators provide a more memory-efficient way to work with large datasets because they only load one item at a time into memory, rather than loading the entire sequence at once.
  • Lazy Evaluation: Iterators provide lazy evaluation of a sequence, meaning that you can generate items on the fly as they are needed, rather than generating all of them upfront.

If someone in your group or an interviewer brings up a discussion about iterators, remember the term Lazy Evaluation, because this is the main reason we use iterators instead of building the whole sequence upfront. 😁 If you are confused about Generators and Iterators, then read my blog on Generators and iterators.
