Bad Coder, Bad Coder!

My title here is in the sense of “Bad dog, bad dog!”, a scolding I sometimes see dog owners use to tame their pets, and is also an allusion to Bad Reporter, a sometimes hilarious and always irreverent political comic strip in the San Francisco Chronicle. And my title is intended to convey the point that I think that “good programming practice” rules are sometimes taken overly seriously.

I’ll comment on two aspects here:

  • Variable names, spacing etc.
  • “Side effects.”

The second is the more unsettling of the two, but the first is interesting because Hadley Wickham is involved, which always gets people’s attention. 🙂

What prompted this is that there is currently a discussion in the ASA Statistical Learning and Data Mining Section e-mail list on Hadley’s coding style guidelines.  Some of you may know that Google has also had its own R style guidelines for some years now.

Much as I admire Hadley and Google, I really think this is much ado about nothing. I do have my own personal style (if you are curious, just look at my code), but seriously, folks, I don’t think it really matters. As long as a programmer is consistent  and has given careful thought as to how to be clear — not only clear to others who read the code, but also clear to the programmer herself when she reads the code cold two months from now — it’s fine.

I would like to see better style in programmers’ English, though. Please STOP using the terms “curly braces” and “square brackets”! Braces ARE curly, by definition, and brackets ARE square, right? Or, at least be consistent in your redundancy, speaking of “round parentheses,” “dotty periods,” “long dashes,” “big capitals,” “elevated overbars” and so on. 🙂

And now, for the hard stuff, side effects. R is a functional language, meaning among other things that no variable is changed upon executing a function, other than assignment of the return value. In other words,

y <- f(x)

will change y but neither x nor any other variables, e.g. globals, will be changed.

For many in the R community, this restriction is followed with religious-like fervor. They consider it safe, clear and easy to debug.

Yes, yes, I understand the arguments, and the No Side Effecters have a point. But it is much too dogmatic, in my opinion, and often causes one to incur BIG performance penalties.

I am not here to praise Caeser, but I’m not here to bury him either. 🙂 If you want to subscribe to that anti-side effects philosophy, fine. My only point is that, like it or not, R is slowly moving away from the banning of side effects. I would cite a few examples:

  • Reference classes. R has always had some ways to create side effects for the antisocial 🙂 , one notable example being environments. Reference classes carry that loophole one step further.
  • The bigmemory package. Yes, it was developed as a workaround to R’s 32-bit memory constraints (somewhat but not completely resolved in more recent versions of R), but I’m told that one of the reasons for the package’s popularity is its ability change data in-place in memory. This is basically a side effects issue, as traditionally the assignment to y above has created a new, separate copy of y in memory, often a big performance killer. (Again, this too has been ameliorated somewhat in recent versions of R.)
  • The data.table package. Same comments here as for the “insidious” use of bigmemory above, but also consider the operation> setkey(mtc1,cyl).Ever notice that no assignment is made? If you rave about data.table, you should keep in mind what factors underly that lightning speed.

I won’t get into the global variables issue here, other than to say again that I think the arguments against them often tend to be more ideological than logical.

In summary, I believe that careful thought, rather than an obsession with rules, is your ticket to good — clear and efficient — code.

 

Advertisements

8 thoughts on “Bad Coder, Bad Coder!”

  1. I don’t see style guides as much ado about nothing. They are, after all, guidelines, not inflexible rules. Just like there are style suggestions in spoken languages, so there should be style guidelines in programming languages. And with R being rather permissive, lack of style lead to messy code. We should write code just like a writer would write a book: for others to understand. Internal consistency is good, external consistency is even better. Imagine if all authors of papers used a different citation style, perhaps internally consistent, but completely heterogeneous among authors.

    1. Are you sure that citation style is a good analogy? I don’t see it.

      I think that if you go to a set of style guidelines and look through them one-by-one, you’ll have a hard time finding many whose violation would necessarily produce unclear code. Most are rather petty.

      The one thing I do like about the Google guidelines is that the recommend sticking to S3 classes, not S4. 🙂

  2. — I do have my own personal style (if you are curious, just look at my code), but seriously, folks, I don’t think it really matters. As long as a programmer is consistent and has given careful thought as to how to be clear — not only clear to others who read the code, but also clear to the programmer herself when she reads the code cold two months from now — it’s fine.

    without getting into the “environments” and scoping issues (java’s make infinitely more sense), this one is easy to deal with. there are myriad rule-driven code re-formatters out there. and more mature coding shops not only make them available, but require their use on check-in. so each coder can make the text look the way it should for maximum comfort, and put it back in whatever the standard is. just as the proper desk and chair are necessary to make the IT folk forget that they exist, so too the appearance of the text at hand.

  3. Norm, I am not clear about:
    “If you rave about data.table, you should keep in mind what factors underly that lightning speed”…

    Can you elaborate what you mean?

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s