Clever is better than explicit?
Then earlier today, somehow, I ended up reading a Python Cookbook recipe, written by the brilliant Alex Martelli, titled "Add an entry to a dictionary, unless the entry is already there". It offers an alternative to the common pattern when dealing with, for example, lists embedded in dictionaries. The usual, "obvious" way is to do this:
if d.has_key(key):
    d[key].append(value)
else:
    d[key] = [value]
But that's quite a few lines to say a fundamentally simple thing (a simple thing in Python programs, at least). So Alex proposes a more succinct way to express the same thing in Python code:
d.setdefault(key,[]).append(value)
It exploits the setdefault() dict method, which, as Python Library Reference puts it, works like this:
a.setdefault(k[, x])    a[k] if k in a, else x (also setting it)
(OK, maybe "exploit" is too strong a word, because the above usage is just what setdefault() is for.)
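To make the comparison concrete, here is a minimal sketch that runs both forms over the same made-up (key, value) pairs and checks that they build the same dictionary. The sample data and the names `pairs`, `verbose` and `compact` are only illustrative, and the verbose form uses `key in d` because has_key() is not available in Python 3.

```python
# Illustrative sample data (not from the recipe).
pairs = [("fruit", "apple"), ("veg", "leek"), ("fruit", "pear")]

# The verbose form: test for the key, then append or create the list.
verbose = {}
for key, value in pairs:
    if key in verbose:
        verbose[key].append(value)
    else:
        verbose[key] = [value]

# The compact form: setdefault() returns the existing list, or stores
# the new empty list under the key and returns that same list object.
compact = {}
for key, value in pairs:
    compact.setdefault(key, []).append(value)

print(verbose == compact)
```

One detail worth noticing is that the `[]` passed to setdefault() is created on every call, even when the key already exists and the new list is thrown away.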
Anyhow.
It got me thinking whether this is an improvement of the obvious way to do the same thing. (I am using "obvious" here in a probably non-obvious way: meaning that it's obvious to me.) I did not come to any decisive conclusion, but my initial reaction was that the latter way is not exactly explicit. The code doesn't say, directly (to me, again), that "I want this value to be added to the list of this key". But then again, neither does the former.
When working with immutable values, like counting the number of occurrences of different words, for example, you can express a similar pattern with the aid of the dict's get(), like this:
d[word] = d.get(word, 0) + 1
But this reads more clearly, I think. It says that "if we haven't found any occurrences of word yet, that equals 0 occurrences so far".
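As a small worked example of the counting pattern, the sketch below applies get() with a default of 0 to a made-up sentence. The sentence and the names `words` and `counts` are assumptions for illustration only.

```python
# Illustrative input (not from the original post).
words = "the cat and the hat and the bat".split()

counts = {}
for word in words:
    # get() returns 0 for a word not seen yet, without storing anything;
    # the assignment is what actually creates or updates the entry.
    counts[word] = counts.get(word, 0) + 1

print(counts)
```

Unlike setdefault(), get() never modifies the dictionary, which is why the result has to be assigned back explicitly.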
What I wanted to say with all this, I guess, is that if I were to read, for the first time, the source code of some program that I needed to understand, code like d.setdefault(key,[]).append(value) would make me a bit baffled. (Before I knew this particular idiom, that is.) While the more verbose way to say it, if d.has_key(key): ..., would be instantly obvious. But I am not sure about this whole thing. I know that I am not on solid ground with these arguments. I do not know, anymore, what really is more obvious and more explicit, in the good way. Even for myself.
30.01.2005, 20:48
- Comments:
Posted by Simon Brunning at 01.02.2005, 12:11
This is what you might call an idiom. You might trip over it the first time you see it - I know I did - but it's common enough in Python code these days that you get the hang of it pretty quickly.
I've also heard things like these called 'Textures'. Like Patterns, only smaller. ;-)
Posted by Mark Eichin at 02.02.2005, 07:01
I've been looking for an idiom like that - to the point that I have code that calls a safe_append helper function - helper for the reader, that is. The idiom is a *bit* dense (setdefault rather confused me when I was first learning Python) but exposure would probably help.
The other reason not to use the if-statement version in normal code is that it devotes far too much reader-attention-space to what is fundamentally a simple thing -- all you're trying to do is append to a named bucket, Python just doesn't have a good way to handle the base case.
Maybe a named-bucket class makes more sense :-)
Posted by Mark Eichin at 02.02.2005, 07:12
Oh, actually, looking through my notes I at least have a clearer version of the "if" statement:
if k not in n:
    n[k] = []
n[k].append(v)
This makes it a little clearer that the [] is an initialization, and that you only ever do *one* thing with the value - append it. The reading-optimal case might just be
n[k].append(v)
and I think you could make that work - if n were a hash derivative where __getitem__ returned [] instead of throwing an exception, but __contains__ didn't. Not sure that wouldn't be worse in *other* ways (that reminds me that that was what I thought setdefault "must" do - set the default value for the whole hash, to use instead of throwing KeyError. I have no idea where I *got* that idea from, though, it isn't like the documentation isn't clear) but it has some potential...
Posted by Mark Eichin at 07.03.2005, 06:03
Last week, Peter Norvig came up with a clear implementation of putting the smarts all the way back in the dictionary (using setdefault in __getitem__) that seems to provide for reasonably readable code in the end:
http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/389639
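For readers who don't follow the link, here is a rough sketch of the general idea of calling setdefault from __getitem__. It is not the code from the recipe: the class name `ListDict` is hypothetical, and the default is hard-wired to an empty list to keep the example short.

```python
class ListDict(dict):
    """Hypothetical dict subclass whose missing keys start out as []."""

    def __getitem__(self, key):
        # dict.setdefault does not go through the overridden __getitem__,
        # so this does not recurse. A missing key gets a fresh empty list
        # stored under it, and that same list is returned.
        return self.setdefault(key, [])


n = ListDict()
n["a"].append(1)   # no KeyError: "a" is created on first access
n["a"].append(2)

print(n)
print("b" in n)    # membership tests do not create entries
```

The trade-off is that merely reading n[k] for an unknown key inserts an empty list, so lookups have a side effect that a plain dict does not have.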