Elf Sternberg: If you're a senior developer, you have to accept some WET code.

In some programming languages there is an essential, powerful tension between two common pieces of advice: Don’t Repeat Yourself and Meaningful Names over Code Comments. Both are really good pieces of advice.

“Don’t Repeat Yourself” (DRY) means that if you can find an abstraction that allows you to avoid repetition in your code, you can remove the need to debug multiple code blocks if you find an error, and you can test the abstraction more reliably and more efficiently. The opposite of DRY is, of course, WET, “Written-out Every Time.”

“Meaningful Names over Code Comments” means that if you have strong, descriptive names for classes, functions, and variables, then code comments are often not merely unnecessary but possibly harmful as they drift out-of-date with the actual content of the code.

At my ${DAY_JOB}, I ran into this conflict in a big way. This example is in Python, but it applies to any language with metaprogramming capabilities, which includes Ruby, Lisp, Rust, and even C++.

Let’s say you’re developing a CMS for lawyers, so you’ll have Courts, Cases, Lawyers, Tags, Outcomes, etc. Someone on the team has written an OpenAPI file describing the URLs that will allow authorized users to access these items, along with the expected data format of any objects that come out of it.

In test-driven development, your next step is to handle the following story: “Write a unit test that accesses the URL for Lawyers, verifies that at least one Lawyer exists in the test data, and validates that the body of the message matches the expected data format for Lawyer.” Another story would be “Fetch one Lawyer and verify…” Then, naturally, you write the code that does those things.
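A minimal sketch of that first story, using only the standard library. Everything named here is a hypothetical stand-in: `fetch_lawyers` replaces a real HTTP call to the Lawyers URL, `LAWYER_SCHEMA` is an invented fragment of the wire format, and `conforms` is a deliberately tiny required-keys-and-types check, not a full JSON Schema validator.

```python
import unittest

# Hypothetical fragment of the expected wire format for a Lawyer.
LAWYER_SCHEMA = {
    "type": "object",
    "required": ["id", "name"],
    "properties": {"id": {"type": "integer"}, "name": {"type": "string"}},
}

# Maps the schema type names used above to Python types.
PYTHON_TYPES = {"integer": int, "string": str}


def fetch_lawyers():
    """Stand-in for an HTTP GET against the Lawyers URL (assumed test data)."""
    return [{"id": 1, "name": "A. Advocate"}]


def conforms(item, schema):
    """Tiny check: required keys are present and declared types are respected."""
    if not isinstance(item, dict):
        return False
    if any(key not in item for key in schema.get("required", [])):
        return False
    for key, rule in schema.get("properties", {}).items():
        if key in item and not isinstance(item[key], PYTHON_TYPES[rule["type"]]):
            return False
    return True


class LawyerCollectionTest(unittest.TestCase):
    def test_at_least_one_lawyer_matches_schema(self):
        lawyers = fetch_lawyers()
        # The story requires at least one Lawyer in the test data.
        self.assertGreaterEqual(len(lawyers), 1)
        for lawyer in lawyers:
            self.assertTrue(conforms(lawyer, LAWYER_SCHEMA))
```

In a real project the schema would come from the OpenAPI file rather than being typed into the test, which is exactly why the team wants importable schema names.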

There are a few Python libraries that parse OpenAPI for you, and others that convert the OpenAPI specification into JSON. Since you’ve decided your CMS will speak mostly JSON, you’ll use one of those. Eventually you get down to the schema collection.

The thing is, other team members want to have access to those schema objects as individual library items, so they can be precise and meaningful about the objects they’re manipulating. The OpenAPI library produces an object whose spec["components"]["schemas"] entry contains your schema items, and maybe you don’t want to be passing around or importing the entire specification when all you really want to know is “What does a Lawyer look like on the wire?”

So you break the schema collection out into exportable names:

<code class="sourceCode python"><a title="1" class="sourceLine" id="cb1-1"><span class="co"># schemas.py</span></a>
<a title="2" class="sourceLine" id="cb1-2"><span class="co"># ...</span></a>
<a title="3" class="sourceLine" id="cb1-3">schemas <span class="op">=</span> spec[<span class="st">"components"</span>][<span class="st">"schemas"</span>]</a>
<a title="4" class="sourceLine" id="cb1-4">lawyerSchema <span class="op">=</span> schemas[<span class="st">"Lawyers"</span>]</a>
<a title="5" class="sourceLine" id="cb1-5">lawyerCollectionSchema <span class="op">=</span> { <span class="st">"type"</span>: <span class="st">"array"</span>, <span class="st">"items"</span>: lawyerSchema }</a>
<a title="6" class="sourceLine" id="cb1-6">caseSchema <span class="op">=</span> schemas[<span class="st">"Cases"</span>]</a>
<a title="7" class="sourceLine" id="cb1-7">caseCollectionSchema <span class="op">=</span> { <span class="st">"type"</span>: <span class="st">"array"</span>, <span class="st">"items"</span>: caseSchema }</a>
<a title="8" class="sourceLine" id="cb1-8">courtSchema <span class="op">=</span> schemas[<span class="st">"Courts"</span>]</a>
<a title="9" class="sourceLine" id="cb1-9">courtCollectionSchema <span class="op">=</span> { <span class="st">"type"</span>: <span class="st">"array"</span>, <span class="st">"items"</span>: courtSchema }</a>
<a title="10" class="sourceLine" id="cb1-10"><span class="co"># ... and so on ...</span></a></code>

That can get to be a lot of repetition. So I decided to DRY the code:

<code class="sourceCode python"><a title="1" class="sourceLine" id="cb2-1"><span class="co"># schemas.py</span></a>
<a title="2" class="sourceLine" id="cb2-2"><span class="co"># ...</span></a>
<a title="3" class="sourceLine" id="cb2-3">schemas <span class="op">=</span> spec[<span class="st">"components"</span>][<span class="st">"schemas"</span>]</a>
<a title="4" class="sourceLine" id="cb2-4">current_module <span class="op">=</span> sys.modules.get(<span class="va">__name__</span>)</a>
<a title="5" class="sourceLine" id="cb2-5"><span class="cf">for</span> schema_name <span class="kw">in</span> schemas:</a>
<a title="6" class="sourceLine" id="cb2-6">    lower_cased <span class="op">=</span> schema_name.lower()</a>
<a title="7" class="sourceLine" id="cb2-7">    a_schema <span class="op">=</span> schemas[schema_name]</a>
<a title="8" class="sourceLine" id="cb2-8">    <span class="bu">setattr</span>(current_module, <span class="st">"</span><span class="sc">{}</span><span class="st">Schema"</span>.<span class="bu">format</span>(lower_cased), a_schema)</a>
<a title="9" class="sourceLine" id="cb2-9">    <span class="bu">setattr</span>(current_module, <span class="st">"</span><span class="sc">{}</span><span class="st">CollectionSchema"</span>.<span class="bu">format</span>(lower_cased), {</a>
<a title="10" class="sourceLine" id="cb2-10">        <span class="st">"type"</span>: <span class="st">"array"</span>,</a>
<a title="11" class="sourceLine" id="cb2-11">        <span class="st">"items"</span>: a_schema</a>
<a title="12" class="sourceLine" id="cb2-12">    })</a></code>

There. Completely DRY. Hugely problematic.

First, unless you’re really comfortable with this kind of metaprogramming (reaching into the Python VM and modifying the in-memory representations of library objects at run time), this sort of thing looks like, and frankly is, pure shenanigans.

Secondly, this means that there now exist, in the running Python program’s actual, functioning namespace, several new object names that exist nowhere in the source code. You can’t search for where the object lawyerCollectionSchema is first constructed. It and its sibling objects are being built out of an external resource, a file written in YAML no less, read into the system the first time schemas.py is imported.
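To make that concrete, here is a small sketch that runs the loop inside a throwaway module and compares the names that exist at run time against the identifiers that appear in the source. The `SPEC` dictionary and the `demo_schemas` module name are assumptions made up for the demonstration; a real spec would be loaded from the OpenAPI file.

```python
import ast
import sys
import types

# Hypothetical stand-in for the parsed OpenAPI document.
SPEC = {"components": {"schemas": {"Lawyers": {"type": "object"},
                                   "Cases": {"type": "object"}}}}

# The DRY loop, held as source text so it can be both executed and inspected.
DRY_SOURCE = '''
import sys
schemas = spec["components"]["schemas"]
current_module = sys.modules[__name__]
for schema_name in schemas:
    lower_cased = schema_name.lower()
    a_schema = schemas[schema_name]
    setattr(current_module, "{}Schema".format(lower_cased), a_schema)
    setattr(current_module, "{}CollectionSchema".format(lower_cased),
            {"type": "array", "items": a_schema})
'''

# Build a throwaway module and register it so sys.modules[__name__] resolves.
module = types.ModuleType("demo_schemas")
sys.modules["demo_schemas"] = module
module.spec = SPEC
exec(DRY_SOURCE, module.__dict__)


def names_in_source(source):
    """Collect every identifier that appears as a Name node in the source."""
    tree = ast.parse(source)
    return {node.id for node in ast.walk(tree) if isinstance(node, ast.Name)}


# Names that exist in the running module but appear nowhere in its source.
runtime_names = {name for name in vars(module) if name.endswith("Schema")}
invisible = runtime_names - names_in_source(DRY_SOURCE)
print(sorted(invisible))
```

With the plural keys used here, the loop produces `lawyersSchema` and `lawyersCollectionSchema` rather than `lawyerSchema` and `lawyerCollectionSchema`. Nothing in the source would tell a reader that either, so a text search for any of these names comes up empty.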

As someone who’s spent a lot of time studying the internals of Python’s import statement, I’m both comfortable with and completely aware of how arbitrary the import mechanism really is. My team is not made up of people like me and, indeed, they couldn’t parse this code without a walk-through and an introductory paragraph. They’d never seen this kind of metaprogramming before.

I went back and just did the repetition. Okay, I wrote a Python script to write it for me, and committed the script to git so others could follow along. That is, itself, a kind of metaprogramming, but it’s one that leaves a permanent, visible trace of its output where not-so-senior devs can see and understand it.
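A generator along those lines might look like the sketch below. This is not the script from the day job: the `SCHEMA_STEMS` mapping and the `render_schema_module` function are hypothetical, and the generated text assumes that `spec` has already been loaded earlier in the output file.

```python
# Hypothetical mapping from schema key in the spec to the exported name stem.
SCHEMA_STEMS = {"Lawyers": "lawyer", "Cases": "case", "Courts": "court"}


def render_schema_module(stems):
    """Return the WET source text: one explicit pair of assignments per schema."""
    lines = [
        "# schemas.py",
        "# Generated file: edit the generator, not this output.",
        'schemas = spec["components"]["schemas"]',
    ]
    for key, stem in stems.items():
        lines.append('{0}Schema = schemas["{1}"]'.format(stem, key))
        lines.append(
            '{0}CollectionSchema = {{"type": "array", "items": {0}Schema}}'.format(stem)
        )
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Print the generated module so it can be reviewed and committed.
    print(render_schema_module(SCHEMA_STEMS), end="")
```

Because the output is ordinary source, every exported name can be found with a plain text search, and a reviewer sees each new name in the diff when a schema is added.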

Modern languages with this kind of accessibility to their internals allow developers to be succinct to the point of misleading obfuscation, even when not intended. Sometimes, to help less experienced devs out, you have to make your code explicit and WET.

Elf M. Sternberg
