MASS GAFFITTER
I'm very fond of gaffitter, a smart little console program that scans a list of files and/or directories and fumbles through them until it comes up with a subset of that list that will fit in a given space. It's perfect for taking large directories of stuff and segmenting them into archivable collections.
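To make "a subset that will fit in a given space" concrete, here is a minimal Python 3 sketch of the same kind of problem. It is not gaffitter's own algorithm, just a plain largest-first greedy heuristic, and the function names, the `sizes` dictionary and the unit-less numbers are all made up for illustration.

```python
def fit_subset(sizes, capacity):
    """Greedily pick items, largest first, whose total stays within capacity.

    sizes: dict mapping a path to its size (hypothetical input, any unit).
    Returns (chosen_paths, total_size).
    """
    chosen, total = [], 0
    for path, size in sorted(sizes.items(), key=lambda kv: kv[1], reverse=True):
        if total + size <= capacity:
            chosen.append(path)
            total += size
    return chosen, total


def split_into_collections(sizes, capacity):
    """Call fit_subset repeatedly until nothing that fits is left over."""
    remaining = dict(sizes)
    collections = []
    while remaining:
        chosen, _ = fit_subset(remaining, capacity)
        if not chosen:  # everything left is larger than the capacity
            break
        collections.append(chosen)
        for path in chosen:
            del remaining[path]
    return collections, sorted(remaining)


# Hypothetical sizes, with a capacity of 5 units per collection.
sizes = {"a": 3, "b": 2, "c": 2, "d": 5}
collections, leftover = split_into_collections(sizes, 5)
print(collections)  # [['d'], ['a', 'b'], ['c']]
print(leftover)     # []
```

A greedy pass like this is quick but can leave space unused in each collection; doing that fitting well is the job gaffitter takes on.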
I recently ran out of disk space on my desktop and realized I had hundreds, nay thousands, of music directories that I needed to put somewhere else. I suppose I could have bought more hard drive space, but more than that, I just wanted a lot of it put away. I ran gaffitter on the collection with a limit of 4.2 GB, a size that reliably fits on a data DVD, and it said I had about 35 collections' worth. Great, I thought, but how to organize the output of gaffitter into subdirectories that I could then burn onto DVDs?
For that, I wrote mass_gaffiter.py. It's a very simple little script that takes gaffitter's regular output as its input and spits out another script (a shell script this time) that, for each collection gaffitter has identified, creates a subdirectory and moves everything in that collection into it. Once you've run that generated script, your cluttered directory is organized into a collection of subdirs named "gaf_disk_001", "gaf_disk_002", etc., all ready for growisofs or whatever other DVD burning software you like.
I'm trying to get into the habit of sharing the little utilities that I can't work without. I think of them as throwaways, but some of them I've kept for years, so I figure someone else might have good use for them. Here's mass_gaffiter.py:
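#!/usr/bin/env python3
"""Read gaffitter output from the file in argv[1]; print a shell script on stdout."""
import re
import shlex
import sys

# A line starting with "[N] Sum" closes collection number N.
re_m = re.compile(r'^\[(\d+)\] Sum')

accum = []  # paths seen since the last summary line
with open(sys.argv[1], "r") as f:
    for l in f:
        g = re_m.match(l)
        if not g:
            accum.append(l.rstrip('\n'))
            continue
        # Summary line: emit the commands for the collection just read.
        disk = 'gaf_disk_%03d' % int(g.group(1))
        files = ' '.join(shlex.quote(i) for i in accum if i)
        print('mkdir %s' % disk)
        if files:  # skip the mv when the collection turned out empty
            print('mv %s %s' % (files, disk))
        accum = []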
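
To see what the generated shell script looks like, here is a small Python 3 sketch that wraps the same logic in a function and feeds it a made-up listing. The function name `gaffitter_to_shell`, the directory names and the text after "Sum" are all hypothetical; the sample is shaped only to match the regex above, and real gaffitter output may carry more detail. Names are quoted with `shlex.quote` so that spaces, quotes and `$` survive the shell.

```python
import re
import shlex

SUMMARY = re.compile(r'^\[(\d+)\] Sum')


def gaffitter_to_shell(lines):
    """Return the shell commands for a gaffitter listing.

    Assumes, as the script does, that each collection's paths come first
    and a "[N] Sum" line closes the collection.
    """
    commands, accum = [], []
    for line in lines:
        match = SUMMARY.match(line)
        if not match:
            accum.append(line.rstrip('\n'))
            continue
        disk = 'gaf_disk_%03d' % int(match.group(1))
        files = ' '.join(shlex.quote(p) for p in accum if p)
        commands.append('mkdir %s' % disk)
        if files:
            commands.append('mv %s %s' % (files, disk))
        accum = []
    return commands


# Made-up listing: two collections with hypothetical directory names.
sample = [
    "Abbey Road\n",
    "Kind of Blue\n",
    "[1] Sum: (hypothetical summary text)\n",
    "\n",
    "Blue Train\n",
    "[2] Sum: (hypothetical summary text)\n",
]
for command in gaffitter_to_shell(sample):
    print(command)
# mkdir gaf_disk_001
# mv 'Abbey Road' 'Kind of Blue' gaf_disk_001
# mkdir gaf_disk_002
# mv 'Blue Train' gaf_disk_002
```

Because the output is only text, it can be read over before anything is moved: redirect it to a file, check it, then run that file with sh from the cluttered directory.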
