Theodo apps

How Effective Debugging can save you several days of productivity

Or how to avoid getting lost in 2-day debugging tunnels

During my projects, I often collaborated with other teams, integrating their code into my work. This required me to debug code written in various languages, sometimes outside of my area of expertise. I’m a TS/JS dev and I had to work with Swift/Objective-C, Kotlin/Java, and even Golang code.

Given that I wasn't a native developer and wasn't familiar with the various codebases, I had one major problem: investigating bugs was really time-consuming. I faced two main challenges:

  • Limited knowledge of the codebases
  • An ineffective bug investigation methodology

For the first issue, there is no magic solution: you have to read the code and ask for explanations. However, the second issue is more manageable, and I’ll share an effective method to help you save time during your bug investigations.

Before that, it is interesting to understand the problem by answering the following question:

How to be sure to waste 2 days unnecessarily on a bug?

It's very simple. Just follow a short list of steps:

[Figure: How to take a tunnel]

While this procedure may work with simple bugs, you are almost guaranteed to lose a lot of time when facing a more complex problem.

Moreover, even after a stroke of genius (or luck) that fixes the problem, you still need to clean up all your tests, identify which of your changes are actually necessary for the fix, remember which tests you ran, and most importantly, understand why it works. Otherwise you end up with a shaky fix that only half works, and only by chance.

In other words, even after our brilliant insight, the problem is far from being truly resolved.

Let's take an example of a bug I recently encountered:

Context

I have an application where you can launch a video. The player used to play the video is an external library. When you press the escape key on your keyboard, you exit the video via an onExit function that calls the goBack function of React Navigation.

Here is the VideoPlayer component:

The Bug

When I exit the video page, memory usage does not decrease: the video remains in memory → I have a memory leak.

Investigation following our 4 (bad) steps

Step 1: My goal is to fix the memory leak. My plan is to fix the memory leak.

Step 2:

  • I think the problem comes from the destroyPlayer function of the external library. What luck! I have access to this code, and the environment needed to debug it only takes a day to set up!
  • I spend a day setting up the environment to be able to debug the library code.
  • I debug with lots of console.log and the debugger (I’m a pro 😎)
  • I modify things more or less at random to see whether my changes have any impact (~1/2 day)

Step 3:

  • I can't even note here what I tested because I obviously didn’t take notes of what I was doing.

Step 4:

  • No stroke of genius: I dive into a codebase I don’t know well enough, and I don’t ask for help.
  • No matter what, I still have the memory leak.
  • Despair.

Result

→ 1.5 days wasted with no result.

→ No usable result to prevent someone else from repeating the same tests I did.

How to avoid this?

When encountering a bug, it’s important to determine the scope of the bug. Which components are involved? Which pages? Under what conditions?

Reproduce the bug

The first step is to be able to consistently reproduce this bug by identifying all the parameters (environment, branch, commit, procedure to reproduce). For simple problems, this step is often immediate: for example, I have just added a function that causes side effects or simply does not work.

In our example, I quickly notice that the memory leak appears as soon as I exit with the escape key. This allows me to narrow the scope of my bug to the steps involved in closing the player. I don't waste time analyzing the creation of the player, how memory behaves during playback, etc. A well-defined scope already saves a lot of investigation time.
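The reproduction parameters listed above can be captured in a small record so that anyone can replay the bug. This is only a sketch: `ReproductionRecord` and `formatReproduction` are hypothetical names, and the environment, branch, and commit values are placeholders because the article does not give them.

```typescript
// Hypothetical shape for a reproduction record; the fields mirror the
// parameters named in the text (environment, branch, commit, procedure).
interface ReproductionRecord {
  environment: string;
  branch: string;
  commit: string;
  steps: string[];
  expected: string;
  observed: string;
}

// Render the record as plain text so it can be pasted into a ticket or notes file.
function formatReproduction(record: ReproductionRecord): string {
  const steps = record.steps.map((step, index) => `  ${index + 1}. ${step}`);
  return [
    `Environment: ${record.environment}`,
    `Branch: ${record.branch} (commit ${record.commit})`,
    "Steps:",
    ...steps,
    `Expected: ${record.expected}`,
    `Observed: ${record.observed}`,
  ].join("\n");
}

// Placeholder values: fill these in for your own project.
const memoryLeakRepro: ReproductionRecord = {
  environment: "<device / OS / build type>",
  branch: "<branch>",
  commit: "<commit hash>",
  steps: ["Launch a video", "Press the escape key to exit the player"],
  expected: "Memory decreases once the player is closed",
  observed: "Memory does not decrease; the video remains in memory",
};

const reproText = formatReproduction(memoryLeakRepro);
```

Writing the steps down as an ordered list also makes the scope explicit: nothing before "Launch a video" is part of the investigation.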

Understand the bug

Once the bug is identified, I do not dive into a deep reading of the code; I calmly read the logs first. Most of the time, the bug/error will be clear and can be fixed in 5 minutes without any problem.

Unfortunately for me, this time no logs 😓.

If the error persists, I can then start preparing. I take out paper and pens, Excalidraw or draw.io. I need to understand the bug. I have managed to identify how to reproduce it; now I seek to understand the flow that causes it. Complex bugs are often due to interactions between multiple components, states, hooks, etc. I could try to remember everything, but to keep my future self from going insane, I prefer to write it all down.

I draw the buggy flow as well as the “perfect” situation. The drawing does not need to be exhaustive at first. I can iterate on it to make it more complete until I have a good representation of both flows.

Perfect situation:

Buggy situation:

At this stage, I don’t know why the destroyPlayer method isn’t fully executed; I only know that it is called but the player isn’t destroyed.

Having both situations drawn out will not only help identify the problem but also think through a solution. Additionally, these drawings will be very useful for seeking help and explaining the issue to someone else clearly.

Do not hesitate to try several kinds of diagrams at first to find the one that best illustrates the problem. Here I used a sequence diagram, but you could use an activity diagram or just rectangles and arrows if you want. The important thing is to have a visual representation of what happens.

Find the precise location of the bug

Once the process is drawn, I can start thinking about why our bug is there. This is often when I dive into a tunnel and come out 2 days later feeling like I’ve been going in circles forever. Worse, after a 30-minute session trying to test something, I can simply forget why I was setting up this test. Here is my method to avoid these tunnels:

I think of several hypotheses and for each, a plan to confirm/refute them as simply as possible. The goal is to identify precisely where the bug is and to avoid wasting time on improvised implementations.

In my example, I know the error occurs during the call to destroyPlayer. I can then formulate two hypotheses covering all possibilities:

  • First hypothesis: The destroyPlayer function doesn’t work → the problem comes from the external library.

Test plan: Call destroyPlayer manually rather than in the useEffect cleanup.

  • Second hypothesis: My implementation is faulty → the problem comes from my code.

Test plan: The test plan for hypothesis 1 also settles this one. Since the two hypotheses are complementary, confirming one refutes the other.

Let’s document the results as exhaustively as possible. Some test results will invalidate other hypotheses without needing additional tests. This is the case for our test plan above.

For my memory leak, I manually tested the destruction functions of the player, without the useEffect cleanup, and confirmed that the player was properly destroyed. I concluded that the library's functions work correctly. The problem, therefore, lies in my implementation.
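One possible way to keep such a log is sketched below in TypeScript. The `Hypothesis` type and the `recordResult` and `untested` helpers are hypothetical names introduced for illustration; the statements and the test plan are the ones from the example above.

```typescript
type Status = "untested" | "confirmed" | "refuted";

interface Hypothesis {
  id: string;
  statement: string;
  testPlan: string;
  status: Status;
  notes: string[]; // what was tested and what was observed
}

// Return a copy of the log with one hypothesis updated; the input log is left untouched.
function recordResult(log: Hypothesis[], id: string, status: Status, note: string): Hypothesis[] {
  return log.map((h) =>
    h.id === id ? { ...h, status, notes: [...h.notes, note] } : h
  );
}

// Hypotheses that still need a test.
function untested(log: Hypothesis[]): Hypothesis[] {
  return log.filter((h) => h.status === "untested");
}

const initialLog: Hypothesis[] = [
  {
    id: "H1",
    statement: "destroyPlayer doesn't work (problem in the external library)",
    testPlan: "Call destroyPlayer manually rather than in the useEffect cleanup",
    status: "untested",
    notes: [],
  },
  {
    id: "H2",
    statement: "My implementation is faulty (problem in my code)",
    testPlan: "Settled by the test for H1",
    status: "untested",
    notes: [],
  },
];

// A single manual test settles both hypotheses.
const afterTest = recordResult(
  recordResult(initialLog, "H1", "refuted", "Manual call destroyed the player properly"),
  "H2",
  "confirmed",
  "Follows from H1 being refuted"
);
```

Because `recordResult` returns a new log instead of mutating the old one, every intermediate state of the investigation stays available if you need to explain later how you reached the conclusion.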

Find the cause

The destroy function called in the useEffect cleanup is asynchronous, and it was trying to update a component in the DOM that had already been unmounted. The destroy function therefore crashed before finishing, hence the memory leak. The solution was to destroy the player before navigating.
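The ordering problem can be illustrated with a minimal TypeScript sketch. `FakePlayer`, `exitBuggy`, and `exitFixed` are hypothetical stand-ins, not the real library or the React Navigation API; the sketch only assumes, as described above, that the asynchronous destroy fails when the view is already unmounted.

```typescript
// Hypothetical view state: `mounted` becomes false once we navigate away.
interface View {
  mounted: boolean;
}

// Hypothetical stand-in for the external player (not the real library API).
class FakePlayer {
  destroyed = false;

  // Async teardown that needs the view to still be mounted when it resumes.
  async destroy(view: View): Promise<void> {
    await Promise.resolve(); // yield once, like real asynchronous teardown
    if (!view.mounted) {
      throw new Error("view already unmounted");
    }
    this.destroyed = true;
  }
}

// Buggy order: navigate first, destroy afterwards (as in a cleanup that runs on unmount).
async function exitBuggy(player: FakePlayer, view: View): Promise<void> {
  view.mounted = false; // simulates goBack() unmounting the screen
  try {
    await player.destroy(view); // too late: the view is gone
  } catch {
    // The failure is swallowed here, so nothing shows up in the logs.
  }
}

// Fixed order: destroy the player, wait for it to finish, then navigate.
async function exitFixed(player: FakePlayer, view: View): Promise<void> {
  await player.destroy(view);
  view.mounted = false; // simulates goBack()
}
```

In this model, `exitBuggy` leaves `player.destroyed` at `false` (the leak), while `exitFixed` sets it to `true` before the view goes away.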

How much time was saved?

This bug investigation showed me how proper organisation can save hours, if not days. With a proper and simple plan I was able to identify the problem in less than an hour:

  • I didn’t have to set up a complex environment, as the problem was not in the lib but in my code. That saved me one day of setup.
  • I didn’t have to understand the lib and try to change things in it. That saved me half a day of debugging code that was already correct.

Pitfall to avoid:

When formulating hypotheses and debugging, it’s very easy to fall into a loop of increasingly complex hypotheses and waste a lot of time. During an investigation, we often discover auxiliary problems or possible causes that do not directly impact the main issue. For example, while trying to resolve a memory leak, we might encounter a performance error or unexpected behavior in another part of the code.

The trap is to "dive" into this auxiliary bug or problem, forgetting the main objective of the investigation. This leads to dispersed efforts and can significantly prolong the resolution time of the initial problem. Instead of solving the main bug, we end up debugging several unrelated issues, which can become frustrating and inefficient.

Things you need to remember to avoid tunnels:

  1. Understand your bug before trying to solve it: Read the logs and draw diagrams of what’s happening and what should happen. Don’t try to think about a solution before understanding the problem, or you will find yourself debugging the wrong part of the code… or an external lib, because obviously the error is not in your code… 😭
  2. Make a plan: Write down your hypotheses and a plan to validate / invalidate each of them.
  3. Only test one thing at a time: There should only be one modification at a time, one test.
  4. Timebox your efforts: Set specific time blocks for working on the main bug. At the end of each period, take a break and revisit the initial hypothesis to check if the tests you are conducting are still relevant and aligned with the main issue.
  5. Stay focused on the main hypothesis: Regularly reevaluate your investigation plan to ensure you are still working towards solving the main problem. If you start to stray, refocus your efforts on the main hypothesis.
  6. Document auxiliary findings: When you encounter problems or possible causes not directly related to the main bug, note them down but don’t dive into them immediately. Save them for a later investigation once the main bug is resolved.
  7. Document EVERYTHING: Don’t give yourself the chance to forget something. It will be helpful to avoid doing the same tests or to explain the problem to someone else.
  8. Ask for help: Just explaining the problem often gives you the solution. Asking for help will allow you to step back, gain perspective, and refocus on the problem.
  9. Take regular breaks: If you spend more than two hours straight on a bug, even if you’re well-organized, you risk finding yourself in a work tunnel. Taking breaks allows you to step back and reassess the situation with a fresh mind.

By adopting these practices, you will be able to solve bugs more efficiently and systematically.

If you want to know more about debugging, I recommend you read the debug guide, which really helped me create this methodology.

Kudos to Delphine for her amazing drawing of me losing my mind over this bug.
