### 4.24 Computing Directory Sizes in a Cross-Platform Way

Credit: Frank Fejes

#### 4.24.1 Problem

You need to compute the total size of a directory (or set of directories) in a way that works under both Windows and Unix-like platforms.

#### 4.24.2 Solution

There are easier platform-dependent solutions, such as Unix's du, but Python also makes it quite feasible to have a cross-platform solution:

```python
import os
from os.path import join, isdir, islink

class DirSizeError(Exception): pass

def dir_size(start, follow_links=0, start_depth=0, max_depth=0, skip_errs=0,
             units='b'):
    # Get a list of all names of files and subdirectories in directory start
    try:
        dir_list = os.listdir(start)
    except OSError:
        # If start is a directory, we probably have permission problems
        if isdir(start):
            raise DirSizeError('Cannot list directory %s' % start)
        else:  # otherwise, just re-raise the error so that it propagates
            raise

    total = 0
    for item in dir_list:
        # Get statistics on each item--file and subdirectory--of start
        path = join(start, item)
        try:
            stats = os.stat(path)
        except OSError:
            if not skip_errs:
                raise DirSizeError('Cannot stat %s' % path)
            continue  # skipping errors: there are no statistics to add
        # The size in bytes is the st_size field (item 6 of the stats tuple)
        total += stats.st_size
        # Recursive descent if warranted: only into real subdirectories,
        # and into symbolic links only when follow_links is set
        if isdir(path) and (follow_links or not islink(path)):
            nbytes = dir_size(path, follow_links, start_depth+1, max_depth,
                              skip_errs, units)
            total += nbytes
            if max_depth and (start_depth < max_depth):
                print_path(path, nbytes, units)
    return total

def print_path(path, nbytes, units='b'):
    if units == 'k':
        print('%-8d%s' % (nbytes // 1024, path))
    elif units == 'm':
        print('%-5d%s' % (nbytes // 1024 // 1024, path))
    else:
        print('%-11d%s' % (nbytes, path))

def usage(name):
    print("usage: %s [-bkLm] [-d depth] directory [directory...]" % name)
    print('\t-b\t\tDisplay in Bytes (default)')
    print('\t-k\t\tDisplay in Kilobytes')
    print('\t-L\t\tFollow symbolic links')
    print('\t-m\t\tDisplay in Megabytes')
    print('\t-d, --depth\t# of directories down to print (default = 0)')

if __name__ == '__main__':
    # When used as a script:
    import sys, getopt

    units = 'b'
    depth = 0
    follow_links = 0  # must be set even when -L is not given

    try:
        opts, args = getopt.getopt(sys.argv[1:], "bkLmd:", ["depth="])
    except getopt.GetoptError:
        usage(sys.argv[0])
        sys.exit(1)

    for o, a in opts:
        if o == '-b': units = 'b'
        elif o == '-k': units = 'k'
        elif o == '-L': follow_links = 1
        elif o == '-m': units = 'm'
        elif o in ('-d', '--depth'):
            try: depth = int(a)
            except ValueError:
                print("Not a valid integer: (%s)" % a)
                usage(sys.argv[0])
                sys.exit(1)

    if len(args) < 1:
        print("No directories specified")
        usage(sys.argv[0])
        sys.exit(1)
    else:
        paths = args

    for path in paths:
        try: nbytes = dir_size(path, follow_links, 0, depth, units=units)
        except DirSizeError as x: print("Error: %s" % x)
        else: print_path(path, nbytes, units)
```

#### 4.24.3 Discussion

Unix-like platforms have the du command, but that doesn't help when you need to get information about disk-space usage in a cross-platform way. This recipe has been tested under both Windows and Unix, although it is most useful under Windows, where the normal way of getting this information requires using a GUI. In any case, the recipe's code can be used either as a module (in which case you'll normally call only the dir_size function) or as a command-line script. Typical use as a script is:

`C:\> python dir_size.py "c:\Program Files"`

This will give you some idea of where all your disk space has gone. To help you narrow the search, you can, for example, display each subdirectory:

`C:\> python dir_size.py --depth=1 "c:\Program Files"`

The recipe's operation is based on recursive descent. os.listdir provides a list of names of all the files and subdirectories of a given directory. If dir_size finds a subdirectory, it calls itself recursively. An alternative architecture might be based on os.path.walk, which handles the recursion on our behalf and just does callbacks to a function we specify, for each subdirectory it visits. However, here we need to be able to control the depth of descent (e.g., to allow the useful --depth command-line option, which turns into the max_depth argument of the dir_size function). This control is easier to attain when we administer the recursion directly, rather than letting os.path.walk handle it on our behalf.
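For comparison, here is a minimal sketch of the callback-free alternative, using os.walk (the generator-based successor to os.path.walk) rather than the recipe's own recursion. The function name walk_dir_sizes and its return value are hypothetical, chosen only for illustration. The sketch also makes two simplifying assumptions: it counts only file sizes (not the sizes of the directory entries themselves, which the recipe includes), and it silently skips files it cannot stat. Depth reporting is possible here too, but it needs the extra bookkeeping the paragraph above alludes to:

```python
import os

def walk_dir_sizes(start, max_depth=0, follow_links=False):
    """Hypothetical helper: return (total, report).

    total is the number of bytes in all files under start; report is a
    list of (path, size) pairs for every subdirectory that lies between
    1 and max_depth levels below start.
    """
    subtotals = {}  # cumulative sizes of directories not yet added to a parent
    report = []
    # topdown=False visits children before their parent, so each
    # subdirectory's subtotal is ready when the parent is processed.
    for dirpath, dirnames, filenames in os.walk(
            start, topdown=False, followlinks=follow_links):
        total = 0
        for name in filenames:
            try:
                total += os.stat(os.path.join(dirpath, name)).st_size
            except OSError:
                pass  # assumption: skip unreadable files and broken links
        for name in dirnames:
            # Directories that were not walked (such as unfollowed
            # symbolic links) have no subtotal and contribute 0.
            total += subtotals.pop(os.path.join(dirpath, name), 0)
        subtotals[dirpath] = total
        # Depth of dirpath below start: 0 for start itself.
        rel = os.path.relpath(dirpath, start)
        depth = 0 if rel == os.curdir else rel.count(os.sep) + 1
        if 0 < depth <= max_depth:
            report.append((dirpath, total))
    return subtotals.get(start, 0), report
```

Calling walk_dir_sizes(r"c:\Program Files", max_depth=1) would return the grand total together with one (path, size) pair per immediate subdirectory, roughly what the --depth=1 invocation prints. Note how the depth has to be recomputed from each path, whereas the recursive version simply passes start_depth+1 down the call chain.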