Tips and Tricks for Python Lists

Tips and Tricks for Python Lists

This article is for Python programmers who have gone beyond the basics of using lists, but would like to see some interesting concepts. It isn’t very advanced and should be easy for you to understand.

Ensure that a List Contains Unique Elements.


Sets contain unique elements but are unordered. On the other hand, lists can contain duplicate elements but retain their insertion order. This example shows one way to get the benefits of both sets and lists — an ordered sequence of unique elements.

We are subclassing the Python built-in collections.UserList object. We could subclass list , or use collections.abc to create a new abstract base class of type MutableSequence . But those two approaches entail significantly more work to correctly implement all of the __dunder__ methods.

Here, we’re only interested in ensuring that sequences passed to the constructor, or any new elements appended to the end of the list, are unique.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
from collections import UserList


class UniquesList(UserList):
    """
    A List Class which works just like a list, except
    that it only holds unique values - similar to a set.
    >>> ul = UniquesList("The Jolly Green Giant")
    >>> print("".join(ul))
    The JolyGrniat
    """
    def __init__(self, initlist=None):
        """__init__.
        Args:
            initlist:
        """

        self.data = []

        if initlist:
            if isinstance(initlist, UniquesList):
                self.data[:] = initlist.data[:]
            else:
                for k in initlist:
                    self.append(k)

    def append(self, item) -> None:
        """Append an item to the end of the list.
        Args:
            item: Only unique values are appended, duplicates are omitted
        Returns:
            None:
        """

        if not self.data.count(item):
            super(UniquesList, self).append(item)


dl = UniquesList()
dl.append("Text Value One")
dl.append("Text Value One")
dl.append("Text Value One")
dl.append("Text Value One")
dl.append("Text Value Two")
dl.append("Text Value Two")
dl.append("Text Value Two")
dl.append("Text Value Two")
assert len(dl) == 2

dl = UniquesList()
for i in range(1000):
    dl.append("a")
assert len(dl) == 1

Line #39 is doing all the work. If the count() method is false: greater than 1, then skip the append.

Find all the Index Values of a Matching a Test Condition


What do you do if you want to efficiently test a large list to find the indices of all matching elements? This can be quite easy to get wrong: either by being too slow or consuming too much memory.

This is one way, to use an iterator and a generator.

Line #13 coalesces all the iterator’s yielded results into a list.

import typing


def get_indices(the_list: list, test_value: object) -> typing.Iterable[int]:
    """
    Returns the indices of matching list items.
    Uses a generator to create an iterator.
    Args:
        the_list: the list containing search elements
        test_value: what we want to find
    Returns: the index of matching list items
    >>> print(list(get_indices("The jolly green giant", "e")))
    [2, 12, 13]
    """

    generator = (key for key, val in enumerate(the_list) if test_value == val)
    for key in generator:
        yield key

It is also possible to use the get_indices() method in a for loop:

for k in get_indices("The jolly green giant", "e"):
    print(k)

Flatten a List of Lists into one Super List


What do we do if we want to flatten a really long list, perhaps many thousands of items, and each one is itself a list? Again, I’ve opted to use the technique of not using return, and instead using yield.

The function flatten_nested_lists doesn’t really complete in a traditional sense. Every time a value is computed it is yielded for the calling programming to make use of. This is a really important trick, also known as lazy evaluation. If my code decided I only needed the first 10 values, I can stop there and not need to calculate everything.

from itertools import chain


def flatten_nested_lists(*input_list: list) -> list:
    """flatten_nested_lists.
    Args:
        input_list:
    Returns:
        list:
    >>> l1: list = []
    >>> l1.append([1, "2", 3])
    >>> l1.append([4, 5, 6, 7])
    >>> l1.append(["Eight", {"this one": 9}])
    >>> l1.append([10, 11, 12])
    >>> print(list(flatten_nested_lists(*l1)))
    [1, '2', 3, 4, 5, 6, 7, 'Eight', {'this one': 9}, 10, 11, 12]
    """
    for i in chain.from_iterable(input_list):
        yield i


l2: list = []
l2.append([1, 2, 3])
l2.append([4, 5, 6])
l2.append([10, 11, 12])

for list_item in flatten_nested_lists(*l2):
    print(list_item)

The chain.from_iterable() method is very useful, and it does what the name implies! According to the docs, this method is roughly equivalent to:

def from_iterable(iterables):
    # chain.from_iterable(['ABC', 'DEF']) --> A B C D E F
    for it in iterables:
        for element in it:
            yield element

How get first n item from generator


1 - itertools.islice (preferred method)

import itertools

list(itertools.islice(generator, n))

example:

import itertools

>>> generator = (i for i in range(10))
>>> list(itertools.islice(generator, 4))
[0, 1, 2, 3]
>>> list(itertools.islice(generator, 4))
[4, 5, 6, 7]
>>> list(itertools.islice(generator, 4))
[8, 9]
>>> list(itertools.islice(generator, 4))
[]

2 - use next function

>>> generator = (i for i in range(10))
>>> list(next(generator) for _ in range(4))
[0, 1, 2, 3]
>>> list(next(generator) for _ in range(4))
[4, 5, 6, 7]
>>> list(next(generator) for _ in range(4))
Traceback (most recent call last):
  File "<stdin>", line 1, in <genexpr>
StopIteration

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
RuntimeError: generator raised StopIteration

3 - zip function with list

>>> generator = (i for i in range(10))
>>> [x for _, x in zip(range(4), generator)]
[0, 1, 2, 3]
>>> [x for _, x in zip(range(4), generator)]
[4, 5, 6, 7]
>>> [x for _, x in zip(range(4), generator)]
[8, 9]
>>> [x for _, x in zip(range(4), generator)]
[]

4 - zip function with generator

>>> generator = (i for i in range(10))            
>>> gen = (x for _, x in zip(range(4), generator))
>>> list(gen)
[0, 1, 2, 3]
>>> gen = (x for _, x in zip(range(4), generator))
>>> list(gen)
[4, 5, 6, 7]
>>> gen = (x for _, x in zip(range(4), generator))
>>> list(gen)
[8, 9]
>>> gen = (x for _, x in zip(range(4), generator))
>>> list(gen)
[]

Implement a FrozenList


Python doesn’t contain a FrozenList object out of the box. The suggestion was rejected by the Python community, not least because it’s easy for us to implement ourselves.

I’ve implemented a subclass of the collections.UserList object, and overriden the methods which allow changes to the list’s elements.

from collections.abc import Iterable
from collections import UserList


def immutable_decorator(f):
    def wrapper(self, *args, **kwargs):
        raise TypeError("Object is frozen")
    return wrapper


class FrozenList(UserList):  # pylint: disable=too-many-ancestors
    """
    A List which is immutable.
    >>> fl: FrozenList = FrozenList("hello")
    >>> fl:FrozenList = FrozenList([1, 2, 4])
    >>> print(fl[1:2])
    [2]
    >>> print(fl)
    [1, 2, 4]
    >>> fl.append(1)
    Traceback (most recent call last):
     ...
    TypeError: Object is frozen
    >>> fl.extend(1)
    Traceback (most recent call last):
     ...
    TypeError: Object is frozen
    """
    @immutable_decorator
    def __setitem__(self, i: int, o) -> None:
        pass

    @immutable_decorator
    def __add__(self, other):
        pass

    @immutable_decorator
    def __iadd__(self, other):
        pass

    @immutable_decorator
    def __mul__(self, n: int):
        pass

    @immutable_decorator
    def __imul__(self, n: int):
        pass

    @immutable_decorator
    def append(self, item) -> None:
        pass

    @immutable_decorator
    def insert(self, i: int, item) -> None:
        pass

    @immutable_decorator
    def pop(self, i: int):
        pass

    @immutable_decorator
    def remove(self, item) -> None:
        pass

    @immutable_decorator
    def clear(self) -> None:
        pass

    @immutable_decorator
    def reverse(self) -> None:
        pass

    @immutable_decorator
    def extend(self, other) -> None:
        pass


l: list = [1, 2, 4]
fl: FrozenList = FrozenList(l)
assert fl[1:2] == [2]
fl: FrozenList = FrozenList("help")
assert fl[1::2] == ["e", "p"]

Lines 5–8 define a simple decorator, which raises an exception on whatever methods it decorates.

Create a List Which Only Accepts Specific Object Types


Of course, it is possible to create a List which only accepts certain object types or only certain values.

Again, subclassing UserList is an easy way to accomplish this: simply by selecting all the methods which accept input and validating it contains what we want.

from collections import UserList
from urllib.parse import urlparse
import json
import requests


def recursive_key_values(dictionary):
    for key, value in dictionary.items():
        i = 0
        if type(value) is str:
            yield (key, value)
        elif type(value) is dict:
            yield from recursive_key_values(value)
        elif type(value) in (list, tuple, set):
            for seq_item in value:
                yield from recursive_key_values({f"{key}_{str(i)}": seq_item})
                i = i + 1
        else:
            yield (key, str(value))


class URLFilteredList(UserList):
    """
    URLFilteredList. Will only accept URLs via append.
    """
    def __init__(self):
        self.data = []

    def append(self, item) -> None:
        if self._is_url(item):
            super().append(item)

    def __setitem__(self, i: int, item):
        if self._is_url(item):
            super().append(item)

    @staticmethod
    def _is_url(value: str) -> bool:
        if value and isinstance(value, str):
            validation = urlparse(value)
            if all([validation.scheme, validation.netloc]):
                return True

        return False


dict1 = dict(
    json.loads(
        requests.get("http://ergast.com/api/f1/2014/5/results.json").text))

ul: URLFilteredList = URLFilteredList()
for k, v in recursive_key_values(dict1):
    ul.append(v)

assert "http://en.wikipedia.org/wiki/2014_Spanish_Grand_Prix" in ul
assert "http://en.wikipedia.org/wiki/Daniel_Ricciardo" in ul
ul[0] = "definitely not a url"
assert ul[0] == 'http://ergast.com/mrd/1.4'

print(ul)

# output:
['http://ergast.com/mrd/1.4', 'http://ergast.com/api/f1/2014/5/results.json', 'http://en.wikipedia.org/wiki/2014_Spanish_Grand_Prix', 'http://en.wikipedia.org/wiki/Circuit_de_Barcelona-Catalunya', 'http://en.wikipedia.org/wiki/Lewis_Hamilton', 'http://en.wikipedia.org/wiki/Mercedes-Benz_in_Formula_One', 'http://en.wikipedia.org/wiki/Nico_Rosberg', 'http://en.wikipedia.org/wiki/Mercedes-Benz_in_Formula_One', 'http://en.wikipedia.org/wiki/Daniel_Ricciardo', 'http://en.wikipedia.org/wiki/Red_Bull_Racing', 'http://en.wikipedia.org/wiki/Sebastian_Vettel', 'http://en.wikipedia.org/wiki/Red_Bull_Racing', 'http://en.wikipedia.org/wiki/Valtteri_Bottas', 'http://en.wikipedia.org/wiki/Williams_Grand_Prix_Engineering', 'http://en.wikipedia.org/wiki/Fernando_Alonso', 'http://en.wikipedia.org/wiki/Scuderia_Ferrari', 'http://en.wikipedia.org/wiki/Kimi_R%C3%A4ikk%C3%B6nen', 'http://en.wikipedia.org/wiki/Scuderia_Ferrari', 'http://en.wikipedia.org/wiki/Romain_Grosjean', 'http://en.wikipedia.org/wiki/Lotus_F1', 'http://en.wikipedia.org/wiki/Sergio_P%C3%A9rez', 'http://en.wikipedia.org/wiki/Racing_Point_Force_India', 'http://en.wikipedia.org/wiki/Nico_H%C3%BClkenberg', 'http://en.wikipedia.org/wiki/Racing_Point_Force_India', 'http://en.wikipedia.org/wiki/Jenson_Button', 'http://en.wikipedia.org/wiki/McLaren', 'http://en.wikipedia.org/wiki/Kevin_Magnussen', 'http://en.wikipedia.org/wiki/McLaren', 'http://en.wikipedia.org/wiki/Felipe_Massa', 'http://en.wikipedia.org/wiki/Williams_Grand_Prix_Engineering', 'http://en.wikipedia.org/wiki/Daniil_Kvyat', 'http://en.wikipedia.org/wiki/Scuderia_Toro_Rosso', 'http://en.wikipedia.org/wiki/Pastor_Maldonado', 'http://en.wikipedia.org/wiki/Lotus_F1', 'http://en.wikipedia.org/wiki/Esteban_Guti%C3%A9rrez', 'http://en.wikipedia.org/wiki/Sauber', 'http://en.wikipedia.org/wiki/Adrian_Sutil', 'http://en.wikipedia.org/wiki/Sauber', 'http://en.wikipedia.org/wiki/Jules_Bianchi', 'http://en.wikipedia.org/wiki/Marussia_F1', 'http://en.wikipedia.org/wiki/Max_Chilton', 'http://en.wikipedia.org/wiki/Marussia_F1', 'http://en.wikipedia.org/wiki/Marcus_Ericsson', 'http://en.wikipedia.org/wiki/Caterham_F1', 'http://en.wikipedia.org/wiki/Kamui_Kobayashi', 'http://en.wikipedia.org/wiki/Caterham_F1', 'http://en.wikipedia.org/wiki/Jean-%C3%89ric_Vergne', 'http://en.wikipedia.org/wiki/Scuderia_Toro_Rosso']

SUBSCRIBE FOR NEW ARTICLES

@
comments powered by Disqus