1 files changed, 47 insertions, 0 deletions
diff --git a/posts/2013-02-19-should-I-read-the-code.org b/posts/2013-02-19-should-I-read-the-code.org
new file mode 100644
index 0000000..909ed46
--- /dev/null
+++ b/posts/2013-02-19-should-I-read-the-code.org
@@ -0,0 +1,47 @@
+This conversation happened twice in the last few weeks at work, the
+first time during my 1:1 with my manager, and a second time with the
+whole team.
+
+We were investigating [[http://riemann.io/][Riemann]], and we started to
+discuss what it would mean to adopt this technology in our stack.
+Riemann is written in Clojure, and no one at work is really familiar
+with this language (except for me, and I don't consider myself
+proficient with it).
+
+The question is: how do you deal with a new tool when only one person
+on the team can read the code, and therefore contribute to the project
+by adding features, fixing bugs, etc.? Are we supposed to be
+familiar with the code of the things that we use?
+
+I've never read the code of MySQL, I've read parts of Apache's code, and
+I've never looked at the source of the Linux kernel. At the same time,
+I usually read the code of the Perl and Python libraries I use
+frequently. I've also read the source of statsd and Graphite (two other
+tools that we looked at) in order to understand what they do and hunt
+for issues (in the way we use them) or bugs.
+
+I see two ways to approach this question so far: as a developer and as a
+user. As a developer, I consider that I *have to* read and understand
+the code of the libraries my code depends on (we've found some serious
+issues in libraries we use daily because of this approach).
+
+For services we use in the infrastructure, it depends on the size of the
+tool and its community. For a new product, or when the documentation is
+too sparse, or when the community is rather small, it's a good thing to
+be able to look at the code and explain what it does. For bigger
+projects (MySQL, Apache, Riak), so far I've relied on the experience
+other people and the community have had with these tools.
+
+I'll conclude this post with an anecdote. Last Thursday we were trying
+to understand why the CPU load on the Graphite box went through the roof
+when we added about 25% more metrics to it. Abe, Hachi and I said
+"ok, let's dig into this problem". You could have guessed who the ops
+were just by looking at the scene. We were looking for the same things:
+reads and writes. Abe and Hachi started to do that with the help of =strace=,
+while I started to walk through the code. I think both approaches are
+valid: at the very least you can use one to corroborate the other, and
+they give you different information (=strace= helps you time the
+operations, while the code explains what you're writing and
+reading).
+
+I'm curious to hear how others approach this problem.