
1.12 Unzipping Simple List-Like Objects

Credit: gyro funch

1.12.1 Problem

You have a sequence and need to pull it apart into a number of pieces.

1.12.2 Solution

There's no built-in unzip counterpart to zip, but it's not hard to code our own:

def unzip(p, n):
    """ Split a sequence p into a list of n tuples, repeatedly taking the
    next unused element of p and adding it to the next tuple.  Each of the
    resulting tuples is of the same length; if len(p) % n != 0, the shorter
    tuples are padded with None (closer to the behavior of Python 2's
    map(None, ...) than to that of zip).
        Example:
        >>> unzip(['a','b','c','d','e'], 3)
        [('a', 'd'), ('b', 'e'), ('c', None)]
    """
    # First, find the length for the longest sublist
    mlen, lft = divmod(len(p), n)
    if lft != 0: mlen += 1

    # Then, initialize a list of lists with suitable lengths
    lst = [[None]*mlen for i in range(n)]

    # Loop over all items of the input sequence (index-wise), and
    # Copy a reference to each into the appropriate place
    for i in range(len(p)):
        j, k = divmod(i, n)    # j: index within sublist, k: sublist index
        lst[k][j] = p[i]       # Copy a reference appropriately

    # Finally, turn each sublist into a tuple, since the unzip function
    # is specified to return a list of tuples, not a list of lists
    return list(map(tuple, lst))

1.12.3 Discussion

The function in this recipe takes a list and pulls it apart into a user-defined number of pieces. It acts like a sort of reverse zip function (although it deals with only the very simplest cases). This recipe was useful to me recently when I had to take a Python list and break it down into a number of different pieces, putting each consecutive item of the list into a separate sublist.
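To illustrate the "reverse zip" relationship, here is a minimal sketch that feeds the pieces back to zip. It assumes Python 3 and restates the recipe's function in compact form so that the snippet runs by itself. The helper name rezip is hypothetical and not part of the recipe.

```python
def unzip(p, n):
    """Compact restatement of the recipe's function (Python 3)."""
    mlen, lft = divmod(len(p), n)
    if lft != 0:
        mlen += 1
    lst = [[None] * mlen for _ in range(n)]
    for i in range(len(p)):
        j, k = divmod(i, n)   # j: position within piece, k: which piece
        lst[k][j] = p[i]
    return [tuple(sub) for sub in lst]


def rezip(pieces):
    """Hypothetical helper: interleave the pieces back into one flat list."""
    return [item for group in zip(*pieces) for item in group]


pieces = unzip(['a', 'b', 'c', 'd', 'e'], 3)
print(pieces)          # [('a', 'd'), ('b', 'e'), ('c', None)]
print(rezip(pieces))   # ['a', 'b', 'c', 'd', 'e', None]
```

The round trip restores the original order, but the None padding survives, so the result equals the input only when len(p) is a multiple of n.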

Preallocating the result as a list of lists of None is generally more efficient than building up each sublist by repeated calls to append. In this case, it also supplies the padding with None that we would need anyway (unless len(p) happens to be a multiple of n).
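For contrast, the append-based approach mentioned above might look like the following sketch. The name unzip_append is hypothetical, and Python 3 is assumed. The sketch makes no performance claim; which version is faster on a given input would have to be measured, for example with the timeit module.

```python
def unzip_append(p, n):
    """Hypothetical variant that grows each sublist with append."""
    lst = [[] for _ in range(n)]
    for i, item in enumerate(p):
        lst[i % n].append(item)

    # Padding is now a separate step: extend the short sublists
    # with None until they match the longest one
    mlen = max(len(sub) for sub in lst)
    for sub in lst:
        sub.extend([None] * (mlen - len(sub)))

    return [tuple(sub) for sub in lst]


print(unzip_append(['a', 'b', 'c', 'd', 'e'], 3))
# [('a', 'd'), ('b', 'e'), ('c', None)]
```

Here the padding needs its own loop, whereas the preallocated version gets it for free from the initial [None]*mlen.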

The algorithm that unzip uses is quite simple: a reference to each item of the input sequence is placed into the appropriate item of the appropriate sublist. The built-in function divmod computes the quotient and remainder of a division in a single call. For the item at index i, divmod(i, n) gives exactly the two indexes we need: the remainder selects the sublist, and the quotient is the item's position within that sublist.
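A short trace makes the index arithmetic concrete. For i and n, divmod(i, n) returns (i // n, i % n): the remainder picks the sublist and the quotient picks the slot inside it. The variable placements exists only for this illustration.

```python
n = 3
p = ['a', 'b', 'c', 'd', 'e']

placements = []   # (item, sublist index, index within sublist)
for i, item in enumerate(p):
    j, k = divmod(i, n)   # j == i // n, k == i % n
    placements.append((item, k, j))
    print("p[%d]=%r -> lst[%d][%d]" % (i, item, k, j))

# p[0]='a' -> lst[0][0]
# p[1]='b' -> lst[1][0]
# p[2]='c' -> lst[2][0]
# p[3]='d' -> lst[0][1]
# p[4]='e' -> lst[1][1]
```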

Although we specified that unzip must return a list of tuples, we actually build a list of sublists, and we turn each sublist into a tuple as late in the process as possible by applying the built-in function tuple over each sublist with a single call to map. It is much simpler to build sublists first. Lists are mutable, so we can bind specific items separately; tuples are immutable, so we would have a harder time working with them in our unzip function's main loop.
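The mutability difference can be seen in a few lines. This is only an illustrative sketch with made-up variable names: item assignment works on a preallocated list, while the same assignment on a tuple raises TypeError.

```python
row = [None, None]      # preallocated sublist
row[1] = 'd'            # lists accept item assignment
frozen = tuple(row)     # convert once the sublist is complete

rejected = False
try:
    frozen[0] = 'a'     # tuples do not support item assignment
except TypeError:
    rejected = True

print(frozen, rejected)   # (None, 'd') True
```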

1.12.4 See Also

Documentation for the zip and divmod built-ins in the Library Reference.
