I l@ve RuBoard Previous Section Next Section

2.2 Sorting a Dictionary

Credit: Alex Martelli, Raymond Hettinger

2.2.1 Problem

You want to sort a dictionary. Because sorting is a concept that only makes sense for sequences, this presumably means that you want a sequence of the values of the dictionary in the order obtained by sorting the keys.

2.2.2 Solution

The simplest approach is to sort the items (i.e., key/value pairs), then pick just the values:

def sortedDictValues1(adict):
    items = adict.items(  )
    items.sort(  )
    return [value for key, value in items]

However, an alternative implementation that sorts just the keys, then uses them to index into the dictionary to build the result, happens to run more than twice as fast (for a dictionary of a few thousand entries) on my system:

def sortedDictValues2(adict):
    keys = adict.keys(  )
    keys.sort(  )
    return [adict[key] for key in keys]

A further small speed-up (15% on my system) is to perform the last step by mapping a bound method. map is often marginally faster than a list comprehension when no lambda is involved:

def sortedDictValues3(adict):
    keys = adict.keys(  )
    keys.sort(  )
    return map(adict.get, keys)

A really tiny extra speed-up (about 3% on my system) is available in Python 2.2 by using adict._ _getitem_ _ rather than adict.get in this latest, bound-method version.

2.2.3 Discussion

The concept of sorting applies only to a collection that has order梚n other words, a sequence. A mapping, such as a dictionary, has no order, so it cannot be sorted. And yet, "How do I sort a dictionary?" is a frequent question on the Python lists. More often than not, the question is about sorting some sequence of keys and/or values from the dictionary.

A dictionary's keys can be extracted as a list, which can then be sorted. The functions in this recipe return the values in order of sorted keys, which corresponds to the most frequent actual need when it comes to sorting a dictionary. Another frequent need is sorting by the values in the dictionary, for which you should see Recipe 17.7.

The implementation choices are interesting. Because we are sorting key/value pairs by the key field and returning only the list of value fields, it seems conceptually simplest to use the first solution, which gets a list of the key/value pairs, sorts them, and then uses a list comprehension to pick the values. However, this is not the fastest solution. Instead, with Python 2.2, on dictionaries of a few thousand items, extracting just the keys, sorting them, and then accessing the dictionary for each key in the resulting list comprehension梩he second solution梐ppears to be over twice as fast.

This faster approach can be further optimized by extracting the bound method adict.get, which turns each key into its corresponding value, and then using the built-in function map to build the list by applying this callable to each item in the sorted list of keys. In Python 2.2, using adict._ _getitem_ _ rather than adict.get is even a little bit better (probably not enough to justify making your program version-dependent, but if you're already dependent on Python 2.2 for other reasons, you may as well use this approach).

Simplicity is one the greatest virtues for any program, but the second and third solutions aren't really more complicated than the first; they are just, perhaps, a little bit more subtle. Those solutions are probably worth using to sort any dictionary, even though their performance advantages are really measurable only for very large ones.

2.2.4 See Also

Recipe 17.7 for another application of sorting on dictionaries.

    I l@ve RuBoard Previous Section Next Section