4.24 Computing Directory Sizes in a Cross-Platform Way
Credit: Frank Fejes
There are easier platform-dependent solutions, such as Unix's du, but Python also makes it quite feasible to have a cross-platform solution:
import os
from os.path import isdir, islink, join

class DirSizeError(Exception):
    pass

def dir_size(start, follow_links=0, start_depth=0, max_depth=0, skip_errs=0):
    # Get a list of all names of files and subdirectories in directory start
    try:
        dir_list = os.listdir(start)
    except OSError:
        # If start is a directory, we probably have permission problems
        if isdir(start):
            raise DirSizeError('Cannot list directory %s' % start)
        else:
            # otherwise, just re-raise the error so that it propagates
            raise

    total = 0
    for item in dir_list:
        # Get statistics on each item--file and subdirectory--of start
        path = join(start, item)
        try:
            stats = os.stat(path)
        except OSError:
            if not skip_errs:
                raise DirSizeError('Cannot stat %s' % path)
            # skip_errs is set: ignore this item and move on to the next
            continue
        # The size in bytes is in the seventh item of the stats tuple, so:
        total += stats[6]
        # recursive descent if warranted
        if isdir(path) and (follow_links or not islink(path)):
            nbytes = dir_size(path, follow_links, start_depth+1, max_depth,
                              skip_errs)
            total += nbytes
            if max_depth and (start_depth < max_depth):
                # per-subdirectory lines are always shown in bytes
                print_path(path, nbytes)
    return total

def print_path(path, nbytes, units='b'):
    if units == 'k':
        print('%-8d%s' % (nbytes // 1024, path))
    elif units == 'm':
        print('%-5d%s' % (nbytes // 1024 // 1024, path))
    else:
        print('%-11d%s' % (nbytes, path))

def usage(name):
    print("usage: %s [-bkLm] [-d depth] directory [directory...]" % name)
    print('\t-b\t\tDisplay in Bytes (default)')
    print('\t-k\t\tDisplay in Kilobytes')
    print('\t-m\t\tDisplay in Megabytes')
    print('\t-L\t\tFollow symbolic links (meaningful on Unix only)')
    print('\t-d, --depth\t# of directories down to print (default = 0)')

if __name__ == '__main__':
    # When used as a script:
    import sys, getopt

    units = 'b'
    follow_links = 0
    depth = 0

    try:
        opts, args = getopt.getopt(sys.argv[1:], "bkLmd:", ["depth="])
    except getopt.GetoptError:
        usage(sys.argv[0])
        sys.exit(1)

    for o, a in opts:
        if o == '-b': units = 'b'
        elif o == '-k': units = 'k'
        elif o == '-L': follow_links = 1
        elif o == '-m': units = 'm'
        elif o in ('-d', '--depth'):
            try:
                depth = int(a)
            except ValueError:
                print("Not a valid integer: (%s)" % a)
                usage(sys.argv[0])
                sys.exit(1)

    if len(args) < 1:
        print("No directories specified")
        usage(sys.argv[0])
        sys.exit(1)
    else:
        paths = args

    for path in paths:
        try:
            nbytes = dir_size(path, follow_links, 0, depth)
        except DirSizeError as x:
            print("Error: %s" % x)
        else:
            # pass the selected units so that -k and -m take effect
            print_path(path, nbytes, units)
Unix-like platforms have the du command, but that doesn't help when you need to get information about disk-space usage in a cross-platform way. This recipe has been tested under both Windows and Unix, although it is most useful under Windows, where the normal way of getting this information requires using a GUI. In any case, the recipe's code can be used either as a module (in which case you'll normally call only the dir_size function) or as a command-line script. Typical use as a script is:
C:\> python dir_size.py "c:\Program Files"
This will give you some idea of where all your disk space has gone. To help you narrow the search, you can, for example, display the size of each immediate subdirectory:
C:\> python dir_size.py --depth=1 "c:\Program Files"
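To make the effect of --depth concrete, here is a compact sketch of the same recursive idea that collects (path, size) pairs instead of printing them. It is an illustration rather than the recipe's code: the name collect_sizes and its returned report list are hypothetical, it is written in Python 3 syntax, and as a simplification it counts only regular files and never follows symbolic links.

```python
import os

def collect_sizes(start, max_depth=0, depth=0, report=None):
    """Return (total, report) for the directory start.

    total is the summed size in bytes of all regular files below start.
    report lists (path, size) for every subdirectory that lies at most
    max_depth levels below the directory of the outermost call.
    """
    if report is None:
        report = []
    total = 0
    for name in os.listdir(start):
        path = os.path.join(start, name)
        if os.path.islink(path):
            continue  # simplification: never follow or count symbolic links
        if os.path.isdir(path):
            # Always descend fully, so the total is complete ...
            sub_total, _ = collect_sizes(path, max_depth, depth + 1, report)
            total += sub_total
            # ... but only report subdirectories within the requested depth.
            if depth < max_depth:
                report.append((path, sub_total))
        elif os.path.isfile(path):
            total += os.path.getsize(path)
    return total, report
```

With max_depth=1, the report holds one entry per immediate subdirectory, which corresponds to the lines the script prints for --depth=1; with the default of 0, only the grand total is returned and the report stays empty.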
The recipe's operation is based on recursive descent. os.listdir provides a list of names of all the files and subdirectories of a given directory. If dir_size finds a subdirectory, it calls itself recursively. An alternative architecture might be based on os.path.walk, which handles the recursion on our behalf and just does callbacks to a function we specify, for each subdirectory it visits. However, here we need to be able to control the depth of descent (e.g., to allow the useful --depth command-line option, which turns into the max_depth argument of the dir_size function). This control is easier to attain when we administer the recursion directly, rather than letting os.path.walk handle it on our behalf.
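For comparison, here is a rough sketch of what a walker-driven version might look like. It rests on several assumptions: it uses os.walk, the generator-based successor to os.path.walk (which was removed in Python 3), the name walk_sizes and its return value are hypothetical, and it counts only files that are not symbolic links. Because the traversal is no longer ours to steer, the function has to keep its own table of per-directory totals and work out each directory's depth from its path, which is the kind of extra bookkeeping that direct recursion avoids.

```python
import os

def walk_sizes(start, max_depth=0):
    """Return (total, report) for the directory start, using os.walk.

    report lists (path, size) for each subdirectory at most max_depth
    levels below start.
    """
    start = os.path.normpath(start)
    totals = {}   # directory path -> cumulative size of the files beneath it
    report = []
    # topdown=False yields every directory after its subdirectories, so a
    # child's total is already known when its parent is processed.
    for dirpath, dirnames, filenames in os.walk(start, topdown=False):
        size = 0
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.path.islink(path):
                size += os.path.getsize(path)
        for name in dirnames:
            # Subdirectories that os.walk did not visit (for example,
            # symbolic links to directories) contribute nothing.
            size += totals.get(os.path.join(dirpath, name), 0)
        totals[dirpath] = size
        # Depth must be reconstructed from the path; os.walk does not track it.
        rel = os.path.relpath(dirpath, start)
        depth = 0 if rel == os.curdir else rel.count(os.sep) + 1
        if 0 < depth <= max_depth:
            report.append((dirpath, size))
    return totals.get(start, 0), report
```

Note that os.walk silently skips directories it cannot list unless an onerror callback is supplied, so this sketch does not reproduce the recipe's DirSizeError behavior.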
4.24.4 See Also