Software Engineering for Data Scientists: Journaling
Last week, we talked about the "Drafting" technique. Using a drafted essay or paper as an example, we talked about tools and techniques you might use to take your code from “rough draft” to productionized.
We’ll say two things about that: we’ll call it “bottom-up” in terms of design and productivity, and we’ll say that it’s applying good techniques in the small.
Drafting can be used to implement, tune, and finish code at the line-by-line level.
We’re going to jump up - way up - from there and start at the top. How do you plan your coding sessions from the top down? How do you think about a project from the very top in terms of the “biggest” things? We’ll go over that today with a technique we call Journaling.
Journaling is a project management technique for an individual, applied across one or more coding sessions.
It consists of three practices and the tools and principles to support them.
[Image credit: Alexander Graham Bell, Public domain, via Wikimedia Commons]
Just in Time Planning: the Mikado Technique
The Mikado Technique is based on a game not unlike pick up sticks (or the opposite of Jenga), whereby you try and plan how to pick up one of many wooden sticks without disturbing the others.
The idea is to write down your high-level goal for each change. Then, break that goal into multiple subgoals, and keep breaking things down until either: (1) there’s no more obvious breakdown or (2) each of the leaf tasks will take less than a timebox to complete in your estimate.
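As a minimal sketch of that breakdown, you can picture the plan as a tree whose leaves each carry a time estimate. The `Task` class, the 25-minute `TIMEBOX_MINUTES` value, and the example goal are all assumptions made up for illustration; the Mikado Technique itself doesn't prescribe any of them.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Assumed timebox length in minutes; pick whatever suits you.
TIMEBOX_MINUTES = 25


@dataclass
class Task:
    """One node in the plan: a goal with optional subgoals."""
    name: str
    estimate_minutes: Optional[int] = None  # only meaningful on leaves
    subtasks: List["Task"] = field(default_factory=list)

    def leaves(self) -> List["Task"]:
        """Return the leaf tasks, i.e. the work you would actually start on."""
        if not self.subtasks:
            return [self]
        found: List["Task"] = []
        for sub in self.subtasks:
            found.extend(sub.leaves())
        return found


def needs_breakdown(root: Task, timebox: int = TIMEBOX_MINUTES) -> List[str]:
    """Names of leaves that are unestimated or too big for one timebox."""
    return [
        leaf.name
        for leaf in root.leaves()
        if leaf.estimate_minutes is None or leaf.estimate_minutes > timebox
    ]


# Hypothetical plan, for illustration only.
plan = Task("Add outlier filter to pipeline", subtasks=[
    Task("Write failing test for filter", estimate_minutes=15),
    Task("Implement filter", subtasks=[
        Task("Compute rolling median", estimate_minutes=20),
        Task("Wire filter into loader", estimate_minutes=60),
    ]),
])

print(needs_breakdown(plan))  # ['Wire filter into loader']
```

Any name this prints is a leaf to split further before you start the clock on it.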
When you hit a timebox, you evaluate whether or not you met your goal. If you did, you commit and mark it off the list. If you didn’t - and you’re being strict - you roll back and re-evaluate your plan. The coding work discovered something about the problem. Rather than just pushing through, Mikado says: wait, stop, think. What’s the next best step?
You use the same plan to take notes on thoughts about what to do next. Ask yourself, “What is and is not working?” In the end, you’ll have a detailed document describing the steps you took and your comments on how and why.
You practice a lot of subskills here. Moreover, if you find this difficult, you can improve aspects of JIT planning using Mikado by training and practicing these subskills.
The ability to take an enormous task and break it into a smaller one is one of the most valuable skills you can build in your life.
There’s an illusion we’ve covered in our other writing that the inability to break a task down means it is “impressive.” That’s not at all the case. Going to the moon the first time was impressive; you can bet NASA was able to break that down.
The actual formal activities of Work Breakdown Structures aren’t what we’re after here. We’re just trying to split tasks into subtasks.
You can practice this activity by doing things like:
Plan a trip across the country
Plan how to build a backyard fort
Plan how you’d go to Mars
Plan a wedding
What’s a good task? A good task is going to have a few things:
Concrete exit conditions or acceptance criteria
You’re going to start from a known good state
You’re going to end in a known good state
This will sound a lot like a good user story or SMART goals. That’s because the same principles apply.
Here, our definition of ready is “the code is in a known good state,” and the definition of done is the same.
Each task is only complete when the exit condition is met, all tests pass, and lints are clean per our Drafting blog.
You can imagine yourself crossing a river. Each rock where you put your foot is a known safe place. Jumping between rocks is unsafe. You want to minimize that time, so you want your tasks to be as small as possible.
Or alternatively, you’re rock climbing. Your feet give you a platform to reach and explore with your hands. Your feet are the “known good state.”
This is especially important when using the Mikado technique, as it advocates that you roll back any failed changes. If your state started out with failing tests, you no longer have a safe place to revert to!
Suitable tasks aren’t necessarily small or big. But big tasks need to be broken down into small tasks that can fit within your timebox.
Again, figuring out how to break down extensive work into smaller work is very hard. You need each bit of smaller work to depend on good code when it starts and ensure good code when it finishes. It also needs to complete at least one concrete, value-added function. It will take your entire career to get good at it.
The best way to start getting good is to try; then see where your plan doesn’t work by using the Mikado technique as feedback. What parts of the plan need to be rewritten?
Learn from Your Mistakes
Doing many of these methods will make you better at many related tasks. For instance, being good at these small project plans will make you better at bigger ones. Being good at coming up with exit conditions will make you better at finding concrete user stories.
But seeing where your plans fail - and revising and experimenting with new strategies - helps you be a better planner overall. You provide your own feedback by sticking to a rollback plan and timebox. Each time a task fails to complete, you’re thinking, "Hrm, do I need to break this down more, or did I not make this concrete enough, or…"
Your plans are serving as a hypothesis of work. You create a plan, and you believe that plan will lead to success. By trying the plan as an experiment and seeing where it fails, you’ll be able to adapt the plan.
This ends up being true in a lot of these practices. We put Outlining last after Drafting and Journaling because Outlining relies on a lot of intuition you get by fixing your mistakes in Drafting and Journaling.
The better you get at Drafting, the more significant leaps you can reliably make between stones on the river. The better you get at Journaling, the more reliably you can guess where those stones should be.
Outlining helps leverage those longer leaps and better predictions into a kind of work that can speed up how fast you jump.
A Few Weeks of Coding Can Save You Hours of Planning
We’re not advocating that you plan out your work here for hours. Again, you may sit down with a plan daily or make one at larger intervals.
Why does planning work, though? Why does it save so much effort when we mix in a little? And why, ironically, does doing too much of it also waste time?
Planning works in three ways.
First, by reifying the work itself, we separate the action of working from thinking about working. This allows us to think about the work itself differently and spot patterns slowing us down. Plans become like code, capable of being refactored and reasoned about separate from the work they entail.
Second, because plans are hypotheses, we can far more quickly spot when things have gone south. If you don’t have a plan and just try to barrel through code, it may be weeks - or even months - before you realize you made a critical design or architectural mistake. You were so intent on white-knuckling through things that you didn’t stop to think if you should.
Plans, with their timeboxes, give us a cue to think, "Hrm, did something go wrong? Was the plan wrong? Do I need to re-work it given what I discovered, or is this the wrong approach (this design won’t work)?"
Third, and most importantly, by making planning work itself, we dedicate time, energy, and a cue to turning the abstract concrete.
A parable comes to mind. College students (all psychological research is done on college students) were tasked with one of two papers. One was to loosely describe their day and why they think it went the way it did. The other was to explain how the water cycle worked. They were given a cash incentive to turn in papers.
The paper with a concrete goal - describing the water cycle - got much higher completion rates. Why? The page requirements weren’t different. And the water cycle paper required some outside research. It was completed at a much higher rate because people tend to put off turning the abstract concrete. It’s scary to them; they don’t know what done looks like for a good paper about their day and values. They know what done looks like with a water cycle paper.
Having concrete goals keeps us motivated and comfortable. Having abstract, wishy-washy goals makes us uncomfortable. Most people who think they’re procrastinators aren’t; they just have a problem they don’t know how to solve. Simply trying to figure it out will yield fruit; but it’s uncomfortable to start because they don’t know what the future looks like.
By specifically planning, we’re setting apart some time and energy just to think about turning the wishy-washy into a concrete goal. Is this goal correct? Maybe not. Maybe our “wishy-washy” goal of “increase maintainability” isn’t actually done by the more tangible goal of “unit test this function.” But we’ll know most quickly by trying.
Why, then, does planning sometimes lead to too much work and get a nasty reputation? Generally, by that point, we’ve lost reification. The plan has melted back into the work. If folks think a plan is so detailed that the work falls out, they’re working, not planning.
And we get behind schedule because over planning has the same result as under planning. We don’t discover we’re on the wrong track. We haven’t actually set aside time to turn the abstract into the concrete; we can’t easily refactor or rethink the idea of the work separate from the work.
We’re going backward here. Rather than taking a random assortment of tasks or stories and estimating the number of story points, we’re saying we want to fit a task inside a known timebox. So we’ll keep breaking things down until it does.
You’re not going to necessarily have a team with you to play planning poker. Instead, your estimates are entirely your own, and you succeed or fail by them. This becomes easier because you’re just trying to get under a timebox limit.
Like any other estimation technique, you will draw from your experience on how long a thing will take. But, unlike other estimations, you’ll far more quickly learn if you’re right or wrong and correct them.
Any task that exceeds the timeboxed estimate needs to be rolled back (because it’s the wrong approach) or broken down more. This will help you realize that “Test Function” takes far more than 15 minutes; and when you’re finally done, you’ve learned testing an unknown function can take multiple days. You’ll remember that.
The minimum-viable-product technique is handy, if only anyone actually used it. MVPs are supposed to be experiments. We’re supposed to discover whether or not the market is there for our product. Often we lack the managerial courage to follow through and gut a project if it’s failing.
Making decisions daily - that a planned approach isn’t the right one after you’ve done some trial implementation - makes the whole ordeal easier. You’ll find it easier to cut your losses on many other things, including product decisions like MVPs.
You don’t want to get glued to your plans. You want them to change with what you’ve learned. As the saying goes, opinions should change when the facts change (and the facts should change based on your attempt at implementation). A plan that survives “contact with the enemy” isn’t a sign the planner was a genius; it’s more a sign that the planner is in denial.
There are many benefits to this kind of just-in-time planning. Let’s review the process:
Make a plan
Each task should be broken down into a timeboxed increment
At the end of each timebox, you decide go-no-go on the work. Did it accomplish the goal?
If it didn’t, and the design is flawed, you redo the plan. If it didn’t, and it just needs to be broken down, you break it down further in the plan. If it did, but you learned something, you still update your plan.
Go to step 2
Meanwhile, you’re keeping notes right with the plan itself as you go.
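The loop above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not a tool we use: the outcome labels and the `attempt`, `break_down`, and `replan` callables are hypothetical stand-ins for you doing a timebox of work and then revising the plan by hand.

```python
from collections import deque
from typing import Callable, Deque, List, Tuple

# Possible results of one timebox; the labels are illustrative assumptions.
DONE = "done"
TOO_BIG = "too_big"
WRONG_APPROACH = "wrong_approach"


def run_plan(
    tasks: List[str],
    attempt: Callable[[str], str],
    break_down: Callable[[str], List[str]],
    replan: Callable[[str], List[str]],
    max_timeboxes: int = 20,
) -> List[Tuple[str, str]]:
    """Work through a plan one timebox at a time and return a journal."""
    queue: Deque[str] = deque(tasks)
    journal: List[Tuple[str, str]] = []
    timeboxes_used = 0
    while queue and timeboxes_used < max_timeboxes:
        task = queue.popleft()
        outcome = attempt(task)          # one timebox of work
        timeboxes_used += 1
        journal.append((task, outcome))  # notes live right next to the plan
        if outcome == DONE:
            continue                     # commit and tick it off
        # Not done: roll back, then revise the plan before continuing.
        revised = break_down(task) if outcome == TOO_BIG else replan(task)
        queue.extendleft(reversed(revised))  # revised tasks go first, in order
    return journal


# Hypothetical stand-ins: "Test function" turns out not to fit one timebox.
def attempt(task: str) -> str:
    return TOO_BIG if task == "Test function" else DONE


def break_down(task: str) -> List[str]:
    return [f"{task}: happy path", f"{task}: edge cases"]


def replan(task: str) -> List[str]:
    return []  # in this toy run, a wrong approach is simply dropped


journal = run_plan(["Test function", "Update docs"], attempt, break_down, replan)
for task, outcome in journal:
    print(f"{outcome:>8}  {task}")
```

The returned journal plays the role of the notes: every attempt is recorded, including the ones that got rolled back.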
Your plan will serve as excellent implementation documentation when you finish this commit. This isn’t for a user of the code to use your functions. Instead, it’s for a maintainer of the code to use as a reference to find out why on earth you did things in such a crazy way. That maintainer may be you, by the way.
The plan also allows you to much more easily take a break and not lose context. Deep work is only possible when we can stay enmeshed in a context for long periods. By planning with detail, we can quite quickly figure out where our context was and get back into things, whether we got interrupted or needed a break to clear our heads.
Finally, you create momentum by having a checklist. It’s fun to complete things.
What Goes in the Plan and What Goes in Pseudocode?
Some might ask, "What goes in my plan, and what goes in the code itself?" The documentation is writing, and so is the pseudocode.
I’d argue if you’re breaking down your plan so far that you’re actually writing things that could be code - you’ve gone too far. You should not be copy-pasting from your plan into your IDE.
This will get worse, not better, with the introduction of Outlining. However, some actual tasks can show up in your plan: “Do pseudocode for function X” or “Outline model for Y.”
If what you’re writing will correspond to a few lines of code very descriptively, put it in pseudocode. If what you’re writing will have the state of “TODO” or “Done,” put it in your plan.
Speaking of which, should you use TODOs or put everything in your plan?
This one is even trickier. Honestly, when you’re faced with tricky questions like this, it’s a matter of taste. You won’t go wrong either way. If the TODO is more about the state of a project or an overall approach, you can move it to your plan. If it’s about a design idea or a code idea and the TODO really belongs on a line near some code it would affect, put it in the code.
Micro Timeboxing: The Pomodoro Technique
We’ve talked through the Mikado method and how it was inspired by a game of pick up sticks. Now we’re going to talk about another technique inspired by… a tomato.
A tomato timer, to be specific.
The Pomodoro Technique fills in all the details on the timebox.
You can play around with the exact times to see what works for you; overall, it generally calls for about 25 minutes for a timebox and then a five-minute break. After four Pomodoros, you get a more extended break.
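If you’d like to see what that looks like on a calendar, here’s a small sketch that lays out the blocks. The 25-minute and 5-minute values come from the description above; the 20-minute long break and the function name `pomodoro_schedule` are assumptions, since the exact length of the longer break is up to you.

```python
from datetime import datetime, timedelta
from typing import List, Tuple

Block = Tuple[str, datetime, datetime]


def pomodoro_schedule(
    start: datetime,
    pomodoros: int,
    work_min: int = 25,
    short_break_min: int = 5,
    long_break_min: int = 20,  # assumed length of the "more extended" break
    per_set: int = 4,
) -> List[Block]:
    """Lay out work and break blocks back to back, starting at `start`."""
    blocks: List[Block] = []
    cursor = start
    for i in range(1, pomodoros + 1):
        end = cursor + timedelta(minutes=work_min)
        blocks.append(("work", cursor, end))
        cursor = end
        if i == pomodoros:
            break  # no break is scheduled after the final Pomodoro
        is_long = i % per_set == 0
        length = long_break_min if is_long else short_break_min
        end = cursor + timedelta(minutes=length)
        blocks.append(("long break" if is_long else "break", cursor, end))
        cursor = end
    return blocks


schedule = pomodoro_schedule(datetime(2024, 1, 2, 9, 0), pomodoros=5)
for label, begin, finish in schedule:
    print(f"{begin:%H:%M}-{finish:%H:%M}  {label}")
```

With these defaults, five Pomodoros starting at 09:00 run until 11:40, which is a handy number to know when you’re blocking out your calendar.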
You Need Breaks
Breaks are a part of the Pomodoro technique and should not be dropped. You need breaks not just to stretch your legs but because a lot of thinking happens during breaks.
We talk about deep work. We’ve all had instances where we look up and realize we’ve been focusing on something for a few hours - whether that’s code, a paper that’s due, or just one more turn of Civilization.
Context switching is terrible; but avoiding context switching is not just about focusing on the task at hand. It’s avoiding being forced to load a new context. It’s basically like invalidating the entire cache and slowly waiting for the cache to fill again.
A break to clear your head does not invalidate the cache. A break also helps activate the default mode network, which you can think of as where insight is born. The default mode network is what’s active when you’re taking a shower or a walk. It’s just a little daydreamy lack of focus, and suddenly ideas will just occur to you out of the blue.
By ensuring we take breaks, we’re flipping back and forth between deep focus work and insight work, all on the same task, with the same context.
Waterfall vs. Adaptive
Initially, it may feel easiest to apply a mini-waterfall to this cycle. The first part of your 25 minutes you might spend updating your plan. The next part of your 25 minutes is implementing one task (that fits within 15-25 minutes). And the last part you might use to clean up lint issues, ensure tests run, and commit.
Once you feel like this rhythm is down, you can be a little less phased and a little more adaptive. Updating the plan as soon as you realize something won’t work can be good. Fixing a lint issue as soon as it appears can be good.
Sometimes it’s good to finish a thought, so we put off work. That’s fine, too. However, the initial impulse of “first plan, then do, then QA” will suffer from some of the same issues in the small as Waterfall does in the large. Namely, you’ll forget some things in your planning if you don’t capture them as soon as you learn. You’ll also lose some context if you let QA wait until the end.
The losses are minimal, though, compared to the massive losses of waiting 6 months to do QA.
You may want to use block scheduling to clear out parts of your schedule for tomatoes.
You can’t very well do a Pomodoro if you have a meeting in 5 minutes. And breaks aren’t for meetings or chats; though you can use them to, say, check Slack or something lightweight.
Meetings are work. In-depth discussions are work. You are choosing to code now, so you need to turn all that off and ensure you have the freedom to do so. Turn off the chats. Turn off the email notifications (you should be scheduling when notifications can be turned on anyway, and certainly silence them on your phone). Clear the deck of anything else that might interrupt you.
You can use an app like focusme to help. Whatever app you use, set up large flow blocks on your schedule multiple tomatoes wide (25 min + 5 min break, and remember there are more significant breaks).
Many focus apps can also support this schedule, which is excellent. You can also put links in the calendar event (and prevent others from scheduling meetings during that time) to your ‘context’ like your workona workspace, a JIRA query, or GitHub issue, and the Pomodoro site. This allows you to easily click-click-click to get your work environment set up. Similarly, IDEs like PyCharm have “tasks and contexts” to load specific files associated with tickets.
Pair Programming: Programmer, Rubberduck Thyself
Pair programming is when two programmers work together simultaneously on the same problem, possibly on the same computer. You can do it remotely and share screens.
It can be seen as a quick peer review. Closing a feedback loop as tight as possible can benefit overall latency on any particular task. This is because many tasks are “insight-driven,” as described above. Sheer effort won’t crack them; instead, you just have to sit around and wait for breakthroughs. Two folks working together tend to have more insights because they’re in conversation.
Still, this blog is about improving personal productivity. How will this help if it requires another person?
Well, you can pair with yourself. It’s just going to seem weird at first.
You can theoretically pair with anyone - even a non-programmer - and get some benefit. You often catch yourself making inevitable mistakes or assumptions by explaining what you’re doing to someone.
Some have even used this trick without another person. Instead, they just put a rubber duck on their desk and “explained” to the rubber duck what they were doing. They found it beneficial. Maybe not as helpful as an expert asking questions or even a novice asking questions. But more valuable than not talking to the rubber duck.
You can go further than this. Being productive is all about doing more with less - or doing a lot more with the same. How can you get more out of a conversation with yourself?
By recording it. By having an active conversation with yourself throughout your coding session, you’re rubber ducking as you go. But more importantly, you’ve hyper-journaled the whole thing. Your video (or, more usefully, your transcript) serves as another layer of documentation for a maintainer (or yourself).
This can seem like overkill, but at the same time, many people would recommend rubber ducking. The amount of effort is the same; you should still rubber duck. But by recording, you’re just capturing the conversation.
Between this and your plan, your documentation now includes two things. What you planned on doing and then what you actually did. Both are useful bits of information to any future maintainer.
Don’t get me wrong - transcripts and detailed plans are not adequate documentation for maintenance on their own. But they are incredibly thorough, and transcripts make them easily searchable. They are excellent reference material for a maintainer who is doing research, rather than an upfront tutorial or guided tour.
You can Coach Yourself
Using the video and the plan, you can compare and contrast what you did with what it feels like you did. This is similar to how a boxer might videotape themselves to watch what mistakes they made at the moment.
Now, coding is primarily a cognitive exercise rather than a physical one; so this isn’t as helpful as it is for athletic performance. However, by comparing the video to the plan, you can pretend to be an objective third party and analyze your work as another coder would. (Only do this occasionally. I wouldn’t do this all the time; otherwise, work would literally take twice as long. What’s next? Videotape yourself watching the video?)
This might point out errors in your workflow that you otherwise would have lost in the moment; it can give you something to target and improve. You can also send these videos to engineering coaches and have them critique your finished design and the process by which you got there.
Do you need one of them fancy engineering coaches to help you grow your software engineering or data science career? Contact Us!
This Isn’t Unlike Self-Therapy
Why does this work? It’s not unlike the trick of asking, “What advice would you give yourself?”
Despite what I said above, the physical act of coding still relies on affective (emotional) and physical skills. That’s why practicing things the way we want to do them will make us better coders. It’s also why you can know how to spell maintenance all you want - but unless you physically practice spelling it correctly, you’ll type maintainence repeatedly. (Or at least, I will.)
By having a conversation with yourself, you pull the affective and physical back into the cognitive. We can often give better advice to others than we can follow in the moment. This is one way to turn that problem on its head. We get better advice in the moment via Rubber Ducking and then again reviewing select videos later.
You can read about these concepts and principles, and they may not occur to you in the moment. But reviewing things later, detached, they may appear to you then. Or, by trying to intentionally explain to yourself what you’re doing, they may occur to you in the moment. This helps you train yourself.
When I started doing these practices - Mikado, Pomodoro, and Rubber Ducking - I noticed a few patterns crop up, and a few principles apply each time.
I mentioned everything about “you can coach yourself” because it happened to me. By trying to explain my project plan to myself, I asked myself in a detached way - is this a good project plan? And most of the time, it wasn’t.
As you’ll see, these principles all have a common root, but they’re good things to keep in mind when making and updating your plan. They’re also good questions for the “rubber duck” to ask as you go along.
You ain’t gonna need it.
If your task fails and you need to roll back, you have three options, not just two:
Break the task down more
Find another way to approach it
Ask: Is this really required?
When you fail a task, you’re saying, hey, this won’t fit into the 25-minute window. You’re saying it’s an expensive idea, or at least it’s more costly than you thought. Ultimately, we make choices because we believe the positives outweigh the negatives. When a task fails, it’s a cue to tell us - the negatives were more extensive than we thought.
You have to question - is it worth it? Is this a feature that I need? Otherwise, you risk more or less admitting to yourself that overall effort doesn’t matter, which isn’t an excellent way to work. Just recall how frustrated you might be when you estimated a massive effort to do the project your Product Manager prioritized and see them not flinch. The estimate of effort should matter to the decision.
Some may argue: this will help with future maintainability. Okay, that’s fine - but guess (knowing what you know now that you failed the task) how many hours will it save in the future, and how many hours will it take now? Remember, the future is discounted compared to now. Future hours are not worth the same as current hours!
It needs to be worth it.
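To put rough numbers on that, here’s a back-of-the-envelope sketch. The 5% monthly discount rate and the function names are assumptions picked for illustration; the point is only that hours saved later count for less than hours spent today.

```python
def discounted_hours(
    hours_saved_per_month: float,
    months: int,
    monthly_discount_rate: float,
) -> float:
    """Present value of future hours saved, discounted month by month."""
    return sum(
        hours_saved_per_month / (1 + monthly_discount_rate) ** month
        for month in range(1, months + 1)
    )


def worth_it(
    hours_now: float,
    hours_saved_per_month: float,
    months: int,
    monthly_discount_rate: float = 0.05,  # assumed rate; tune to taste
) -> bool:
    """True if the discounted future savings exceed the cost paid today."""
    future = discounted_hours(hours_saved_per_month, months, monthly_discount_rate)
    return future > hours_now


# Hypothetical task: 10 hours now to save 1 hour a month for a year.
print(round(discounted_hours(1, 12, 0.05), 2))  # 8.86
print(worth_it(hours_now=10, hours_saved_per_month=1, months=12))  # False
```

Undiscounted, 12 hours saved beats 10 hours spent. Discounted at the assumed rate, it doesn’t, which is exactly the kind of task the failed timebox is prompting you to question.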
Keep it simple, stupid.
KISS seems similar to YAGNI but is slightly different. YAGNI is about entire features or solutions or ideas that aren’t needed. KISS is about the implementation.
When a task fails, you can replan and break the task down further. Or you can decide on a different approach. There may be a few different ways to accomplish the same goal. You chose one because you thought (similar to YAGNI) that the benefits outweighed the costs. You’re learning that the costs are higher than you anticipated through a failed task.
Is there another approach that solves the same problem and is simpler? Again, similar to YAGNI, we are glued to specific approaches (or, at least, I am) because they feel elegant or long-term maintainable. You’ve got to estimate - knowing what you learned by trying to implement the design - the long-term costs of a different, more straightforward solution vs. the short-term costs of getting it exactly right.
Good Enough is Good Enough
Code is never done. There will always be another refactor, another iteration, another feature that can be added.
You will never work down your backlog. The size of a backlog is a useless metric - instead, we need a good prioritization scheme.
Good enough is good enough - that means that some tasks aren’t worth doing. I often have ideas for new, cleaner designs as I’m coding. But, I don’t have the energy to implement them now; so I put them in a TODO in the code or add them to the plan.
Later on, when I review these, I still don’t have the energy. And you know what? The current solution works just fine. Are these ideas even worth keeping around as TODOs?
Learn when to move on. It’s okay to call something finished if it hits some measurable, quality goals and your exit conditions. You’ll always feel like there’s more you could have done - one more change that would have made it prettier. But you’ve got to timebox these efforts, which includes eventually saying no to specific tasks and moving on.
Some nuts and bolts.
The Mikado technique talks about drawing a diagram. This can be useful. You can use Miro for that.
I do not use diagrams personally; instead, I use nested bullets.
Org-Mode is excellent as you can have bulleted headers and TODO markings similar to a personal Kanban. You can also have a lot of notes attached to that header, allowing you to write down your thoughts as you go. The levels of nesting allow breakdowns of large tasks into smaller ones.
Folding the text allows you to quickly zero in on the part of the plan you care about.
I also used a tool called ghi. This allowed me to manage GitHub issues from the command line, and it opened up Vim to do it. This meant I could use ghi to open the issue, then set the filetype to org and manage the entire comment as an org-mode file.
When I was done, ghi automatically pushed all the text up to GitHub issues, so I had a project outline attached to the issue I was working on and all the notes I took in the meantime.
This can be pretty fantastic.
I’ve since moved on to Obsidian. Obsidian misses out on the ghi hook. It also does not track with pretty colors TODO, DONE, etc., like Org-Mode.
However, it does one thing a lot better: Markdown is its native file type. GitHub issues use Markdown - they render org files pretty terribly. You can use the same project breakdown in Obsidian with hashtag headers # that you would use in Org Mode bullet lists.
To use Obsidian, you’d just keep your plan in an Obsidian note (probably named after the issue). Copy and paste the Markdown into a GitHub comment on the issue when done.
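If your plan lives in a script or notebook rather than a note, the same header-based breakdown is easy to generate. This is a minimal sketch under assumptions: the `PlanItem` class, the TODO/DONE prefixes in the headers, and the example tasks are all made up here, and nothing about it is specific to Obsidian or GitHub.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PlanItem:
    """One task in the plan, with free-form notes and nested subtasks."""
    title: str
    done: bool = False
    notes: List[str] = field(default_factory=list)
    children: List["PlanItem"] = field(default_factory=list)


def to_markdown(item: PlanItem, level: int = 1) -> str:
    """Render the plan as nested Markdown headers with TODO/DONE markers."""
    state = "DONE" if item.done else "TODO"
    hashes = "#" * min(level, 6)  # Markdown headers stop at six levels
    lines = [f"{hashes} {state} {item.title}"]
    lines.extend(f"- {note}" for note in item.notes)
    for child in item.children:
        lines.append(to_markdown(child, level + 1))
    return "\n".join(lines)


# Hypothetical plan, for illustration only.
plan = PlanItem("Add outlier filter", children=[
    PlanItem("Write failing test", done=True,
             notes=["Fixture uses a tiny sample frame"]),
    PlanItem("Implement filter"),
])

markdown = to_markdown(plan)
print(markdown)
```

The printed text can be pasted into an issue comment as-is.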
(For those who want to keep using Org-mode, a nice note-taking extension called Org-roam hits many of the Obsidian buttons. To fix the issue with Org files not rendering in a GitHub comment, you can use Pandoc to convert from Org to Markdown. - Thanks to Matthew for this.)
One feature that’s quite useful is to ensure you have easy ways to get timestamps. This turns your project plan into a journal, with timestamped notes between each task header.
This is easy enough to do in Obsidian or Vim that I’ll leave it as an exercise to the reader.
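For anyone keeping the plan in a plain text or Markdown file outside those editors, here is one hedged way to do the exercise in Python. The `append_note` helper, the bullet format, and the file name are assumptions, not features of Obsidian or Vim.

```python
import tempfile
from datetime import datetime
from pathlib import Path
from typing import Optional


def append_note(journal: Path, note: str, now: Optional[datetime] = None) -> str:
    """Append one timestamped bullet to the journal file and return it."""
    stamp = (now or datetime.now()).strftime("%Y-%m-%d %H:%M")
    line = f"- [{stamp}] {note}"
    with journal.open("a", encoding="utf-8") as handle:
        handle.write(line + "\n")
    return line


# Demo against a throwaway file; the file name and note text are made up.
with tempfile.TemporaryDirectory() as tmp:
    journal_path = Path(tmp) / "plan.md"
    append_note(journal_path, "Rolled back: fixture was too big for one timebox",
                now=datetime(2024, 1, 2, 9, 30))
    append_note(journal_path, "Split test into happy path and edge cases")
    contents = journal_path.read_text(encoding="utf-8")

print(contents)
```

Passing `now` explicitly is only there to make the helper easy to test; in day-to-day use you’d leave it out and get the current time.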
What about a TODO app?
You can use a todo app like JIRA or Todoist for your plan; however, I’ve found that to be a little heavier than I like for this. Journaling should be fast, local, and seamless. Using web-based tools with lots of clicks makes things feel less seamless. They’re absolutely fantastic at collaborating between people or handling longer-term todos.
But I don’t really need to store the TODO project breakdown for, say, using a Kalman filter to dynamically measure some parameters right next to my plan to take my cat to the vet.
It’s also a little clunkier to store notes in the task - not impossible, obviously. But Obsidian or Org-Mode is just so fast that the idea of adding comment after comment to a JIRA task each time I have a thought as I’m working can get tedious.
How to Combine This with Drafting?
Well, it combines pretty effortlessly with Drafting. Alt-tab into your editor. Draft. Alt-tab away when you need to think. Journal.
As we said initially, you might waterfall things a bit - dedicate some time to plan out your work, then at the end, QA things before you commit. So, plan, draft, then QA. As you get better at the techniques, you’ll be switching between Journaling and Drafting while recording and talking to yourself.
Some Drafting tasks make sense as Journaling task breakdowns. Pseudocode, maybe passing a final unit test as the highest level task, etc. So Drafting as a goal gives you some steps to break your high-level tasks into.
Takeaways and What’s Next
So we’ve talked about the bottom-up approach of Drafting and now covered the top-down approach of Journaling. Next time, we’ll finish with the most advanced: a set of middle-out techniques we call Outlining.
Journaling, like Drafting, has a dose-response effect. Even if you do just one of the techniques outlined above, and even if you don’t do it very well, you should still see some improvement.
Since you can break down both Drafting and Journaling into a set of techniques that you can practice, we’d suggest rolling it out precisely that way - one practice at a time, until you’re either bored with that practice or comfortable with it.
Just like Drafting, you’ll want to engage in deliberate practice. You need to set a goal of using the technique when you have the time and energy. Let that deliberate practice guide how you “play” and do work when you have tight deadlines and limited budgets (of energy).
As you practice, the techniques become more accessible and require less thought. Eventually, they’ll invade your everyday work without thinking about it, and you’ll wonder how you coded before.