comparison mrjunejune/src/blog/thoughts-on-tdd/index.md @ 100:65e5a5b89a4e

[Seobeo] Migrated everything to this page.
author June Park <parkjune1995@gmail.com>
date Sat, 03 Jan 2026 07:48:07 -0800
# Thoughts on TDD

Is testing important? Ask yourself that question. If you had to think about it for more than a few seconds, you’re either an inexperienced programmer or someone who has never had to release a product to a large group of users. Testing is not just important; it’s essential. It makes the software you release less buggy and more stable because it lets you catch issues before your customers do. That’s the "why" behind testing, and almost everyone agrees on it.

The real debates are about the *how*: specifically, approaches to testing, including methodologies like Test-Driven Development (TDD). Over my career, I’ve worked at multiple companies, all of which practiced TDD in some form. This often involved writing unit tests, integration tests, and end-to-end (e2e) tests. However, despite the commonality of TDD, every company seemed to implement it in its own convoluted way, making the code harder to write, debug, and maintain. I want to talk about this practice and my problem with it.

## Unit Tests

> "Unit testing is the process where you test the smallest functional unit of code. Software testing helps ensure code quality, and it's an integral part of software development. It's a software development best practice to write software as small, functional units then write a unit test for each code unit."
>
> Definition from AWS

You might agree with the above definition, disagree, or be unsure about what qualifies as a "unit," but most people are likely to agree with it overall. Now, imagine you’re a 21-year-old physics graduate with three months of self-taught coding experience (and a serious League of Legends addiction). Somehow, you land your first job as a software engineer and are tasked with writing a serializer for a `GET` API in Django and, you guessed it, testing it!

Here's what such a serializer might look like:

```python
from rest_framework import serializers

from app.models import Foo  # module path taken from the patch target used in the tests in this post

class FooSerializer(serializers.Serializer):
    name = serializers.CharField(max_length=100)
    unit_price = serializers.FloatField()
    quantity_on_hand = serializers.IntegerField(default=0)

    def create(self, validated_data):
        # Insert a new Foo row from the validated input
        return Foo.objects.create(**validated_data)

    def update(self, instance, validated_data):
        # Overwrite only the fields that were supplied, then persist
        instance.name = validated_data.get('name', instance.name)
        instance.unit_price = validated_data.get('unit_price', instance.unit_price)
        instance.quantity_on_hand = validated_data.get('quantity_on_hand', instance.quantity_on_hand)
        instance.save()
        return instance
```

If you’re unfamiliar with Django, the `create` and `update` methods save or update records in the database. It’s normal to serialize an object like this into a response for use in an API endpoint, often with a tool like `JSONRenderer().render(serializer.data)`.

Now that we understand the implementation as well as any senior INSERT_LIBRARY engineer, the question is: *how do you write a unit test for this?*

There are two main reasons why this serializer is hard to test as a true "unit":

**Dependency on Framework Classes**

The serializer inherits from Django REST Framework’s `Serializer` class, which comes with built-in behaviors for things like validation and field handling. If you test whether `name` exceeds 100 characters or whether `unit_price` is a float, you’re essentially testing whether the framework itself works, which is redundant. But you still need a test with a *correct* value and one with an *incorrect* value, because what you are really checking is the business rule rather than the code. So are we writing a unit test for the function or for the library?

**Database Dependency**

The `create` and `update` methods interact with the database directly. Testing them requires either mocking the database (which introduces complexity) or running against a real test database (which, by the strict definition, is no longer a unit test). If you try to write a test that checks whether the `create` method works, you might end up mocking the `Foo` model, overriding its methods, and verifying that the mock functions were called with the correct arguments. While this might make sense for complex logic, it often feels like overkill for a simple serializer.

Here’s an example of how you might write such a *unit* test:

```python
from unittest.mock import patch
from rest_framework.exceptions import ValidationError

# FooSerializer is the class defined earlier in this post

CORRECT_DATA = {
    'name': 'Test Item',
    'unit_price': 10.99,
    'quantity_on_hand': 5
}
BAD_DATA = {
    'name': 'Test Item' * 100,
    'unit_price': "yo",
    'quantity_on_hand': 5
}

def test_serializer_create():
    serializer = FooSerializer(data=CORRECT_DATA)

    # This is where you end up testing the framework itself
    assert serializer.is_valid()

    with patch('app.models.Foo.objects.create') as mock_create:
        serializer.save()
        mock_create.assert_called_once_with(
            name='Test Item',
            unit_price=10.99,
            quantity_on_hand=5
        )

# And you need 3 more tests: 2 for update (CORRECT_DATA and BAD_DATA) and 1 more for create (BAD_DATA)
```

This test ensures that the `create` method is called with the right data. But if the serializer logic gets more complex, the mocking can become cumbersome and annoying. Consider the case below, where `FooSerializer` needs a Redis singleton because it has to keep certain data in Redis for a few minutes for faster access.

```python
from functools import cached_property

class FooSerializer(serializers.Serializer):
    ...
    def create(self, validated_data):
        # Store the validated data through the Redis-backed Bar, with a TTL
        self.bar(**validated_data, ttl=1000)
        ...

    @cached_property
    def bar(self):
        return self.get_bar()

    def get_bar(self):
        # Kept as its own method so tests can override it
        return Bar()
```

In this case, the `get_bar` method exists only to make the singleton testable: you can override it during tests so they never touch the actual Redis singleton. However, this adds another layer of complexity to your tests.

```python
from unittest.mock import MagicMock, patch

def test_serializer_create():
    serializer = FooSerializer(data=CORRECT_DATA)

    assert serializer.is_valid()

    # Stand-in for the Redis-backed Bar instance
    mock_bar = MagicMock()

    # Mocking the database interaction
    with patch('app.models.Foo.objects.create') as mock_create:
        # Mocking the singleton accessor so no real Redis is touched
        with patch.object(serializer, 'get_bar', return_value=mock_bar):
            serializer.save()
            mock_bar.assert_called_once_with(
                name='Test Item',
                unit_price=10.99,
                quantity_on_hand=5,
                ttl=1000
            )
            mock_create.assert_called_once_with(
                name='Test Item',
                unit_price=10.99,
                quantity_on_hand=5
            )
```

You can imagine how the unit test would look as more cascading effects are introduced. All of this adds up to roughly 76 lines of code (19 for each case across 4 tests: 2 with correct data and 2 with incorrect data), and it essentially tests only two things:

1. Whether `create`, `update`, and the Redis call are invoked.
2. Whether the data has the correct types.

<div class="center"> <img src="https://media3.giphy.com/media/v1.Y2lkPTc5MGI3NjExbHpyY3A5ZjlzNzF5Nmp2Zm13M3kzZWU4Znl4djNmZWoxemRkaHVhbSZlcD12MV9pbnRlcm5hbF9naWZfYnlfaWQmY3Q9Zw/5sNxGQsE3RXap07jQ4/giphy.webp" /> </div>

### Different Approach: Unit Testing in Rails

In frameworks like Ruby on Rails, a different paradigm is used for handling tests, one that might directly conflict with the traditional definition of unit tests. Instead of relying heavily on mocking and isolating functions, frameworks like Rails encourage spinning up a test database or any external dependencies that are *essential*. Here’s how the same validation rules would look in Rails as an Active Record model:

```ruby
class Foo < ApplicationRecord
  validates :name, presence: true, length: { maximum: 100 }
  validates :unit_price, numericality: true
end
```

And here’s how a test might look:

```ruby
require 'test_helper'

class FooTest < ActiveSupport::TestCase
  test "validates name length and prevents invalid record from saving" do
    foo = Foo.new(name: "a" * 101, unit_price: 10.99, quantity_on_hand: 5)

    assert_not foo.valid?, "Foo should be invalid when name exceeds 100 characters"
    assert_equal ["is too long (maximum is 100 characters)"], foo.errors[:name]
    assert_not foo.save, "Foo with an invalid name should not be saved to the database"
    assert_nil Foo.find_by(name: "a" * 101), "No Foo record with an invalid name should exist in the database"
  end

  test "saves valid record to the database" do
    foo = Foo.new(name: "Valid Item", unit_price: 10.99, quantity_on_hand: 5)

    assert foo.valid?, "Foo should be valid with correct attributes"
    assert foo.save, "Foo with valid attributes should be saved to the database"

    # Look for the saved Foo and Bar
    # (Bar is assumed to be written alongside Foo by code not shown here)
    saved_foo = Foo.find_by(name: "Valid Item")
    saved_bar = Bar.find_by(name: "Valid Item")

    assert_not_nil saved_foo, "Foo record should exist in the database"
    assert_equal 10.99, saved_foo.unit_price, "Foo's unit_price should match the saved value"
    assert_equal 5, saved_foo.quantity_on_hand, "Foo's quantity_on_hand should match the saved value"
    assert_equal 10.99, saved_bar.unit_price, "Bar's unit_price should match the saved value"
  end
end
```

In this example, you’re not mocking anything. Instead, you’re leveraging the test database to directly validate business logic. This approach often feels cleaner and less convoluted than mocking, especially for frameworks with tightly integrated ORM layers like Active Record. If your application also involves a Redis instance or similar components, you would create those as part of your test setup and verify their existence. This approach, however, goes directly against AWS's definition of unit tests. At my previous job, I remember having a heated argument with an engineer over this: he was a strong believer in AWS's strict definition, while I preferred a looser interpretation based on my experience with Rails.
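
The same no-mock idea carries over to Python. Below is a minimal sketch that uses only the standard library: an in-memory SQLite database stands in for the test database. The `foo` table and the `make_test_db` and `save_foo` helpers are assumptions made up for illustration; they are not part of the Django or Rails code in this post.

```python
import sqlite3


def make_test_db():
    """Create a throwaway in-memory database with a hypothetical `foo` table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE foo ("
        " name TEXT NOT NULL CHECK (length(name) <= 100),"
        " unit_price REAL NOT NULL,"
        " quantity_on_hand INTEGER NOT NULL DEFAULT 0)"
    )
    return conn


def save_foo(conn, name, unit_price, quantity_on_hand=0):
    """Hypothetical helper: insert a row; return False if a constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO foo (name, unit_price, quantity_on_hand) VALUES (?, ?, ?)",
                (name, unit_price, quantity_on_hand),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def test_valid_foo_is_saved():
    conn = make_test_db()
    assert save_foo(conn, "Valid Item", 10.99, 5)
    row = conn.execute(
        "SELECT unit_price, quantity_on_hand FROM foo WHERE name = ?", ("Valid Item",)
    ).fetchone()
    assert row == (10.99, 5)


def test_too_long_name_is_rejected():
    conn = make_test_db()
    assert not save_foo(conn, "a" * 101, 10.99, 5)
    assert conn.execute("SELECT COUNT(*) FROM foo").fetchone()[0] == 0
```

Nothing is patched here: the assertions read back what actually landed in the database, which is the property the Rails test relies on.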

**(And no, starting up a Docker container for a Redis instance or database is not SOOO slow that it shouldn't be done. This is a widely accepted practice in many engineering companies, including Shopify and Airbnb.)**

If you ask me, I’d argue that Rails' approach is slightly better because you avoid the overhead of creating numerous mock objects. For example, in Jest or Python, mocking can get out of hand, especially if you need to mock React hooks that query or mutate global state instead of initializing the state directly within the test. It’s overwhelming to see ten different mock values inserted into a single object, and now imagine having to do that for every single function you write. But now we run into a conflict with...

## Integration Tests

> "A type of software test that verifies how different components of your application and services interact and work together as a system, ensuring data flows correctly between them and that the overall functionality is as expected."
>
> Definition from AWS

We’ve essentially already done this in our Ruby on Rails testing, so let’s revisit our `FooSerializer`. An integration test checks that the serializer works together with the real database. Here's an example of such a test:

```python
import pytest

@pytest.mark.django_db  # assumes pytest-django; gives the test access to the test database
def test_serializer_create():
    serializer = FooSerializer(data=CORRECT_DATA)

    assert serializer.is_valid()
    serializer.save()
    assert Foo.objects.filter(name=CORRECT_DATA['name']).exists()
```

At this point, we’re essentially rewriting the same tests but with less mocking. So, what’s the real value of this? There has to be a scenario where unit tests are useful and integration tests aren’t, or vice versa. In this particular case, though, it’s hard to draw that line. As applications grow more complex, these distinctions may become clearer. However, for most scenarios like this one, if either kind of test were never written, it would be hard to point to what is missing.

In this specific case, many would agree that writing both unit tests and integration tests offers little value. It’s almost redundant and doesn’t justify the time and effort required.

## E2E Tests

I couldn't find a definition, so I'm just going to ask Gemini for one.

> End-to-end (E2E) testing is a software testing method that verifies how a software product works from start to finish. It's also known as system testing or broad stack testing.
>
> Gemini, the version that doesn't create controversial images.

Finally, let’s talk about end-to-end (E2E) tests. The serializer function we wrote will likely live inside some API endpoint, and I don’t feel like writing the code for it, so I hope you guys can imagine it. In most cases, E2E tests are run through a virtual machine (VM) or Docker container that replicates a smaller version of the real server and database, seeded with factory data. These environments are spun up during CI/CD pipelines, where tests simulate real server requests.
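
For anyone who would rather not imagine it, here is a heavily simplified, standard-library-only sketch of the idea: start a real HTTP server, send it a real request, and assert on the response. The `/foo` route, its payload, and the `FooHandler` and `start_server` names are hypothetical; a real pipeline would boot the actual application and a seeded database in a VM or container instead.

```python
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class FooHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in for the real API endpoint."""

    def do_GET(self):
        if self.path == "/foo":
            body = json.dumps({"name": "Test Item", "unit_price": 10.99}).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)

    def log_message(self, format, *args):
        pass  # keep test output quiet


def start_server():
    """Start the server on a free port in a background thread."""
    server = HTTPServer(("127.0.0.1", 0), FooHandler)  # port 0: let the OS choose
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def test_get_foo_end_to_end():
    server = start_server()
    try:
        conn = http.client.HTTPConnection("127.0.0.1", server.server_port, timeout=5)
        conn.request("GET", "/foo")
        response = conn.getresponse()
        assert response.status == 200
        payload = json.loads(response.read())
        assert payload["name"] == "Test Item"
        conn.close()
    finally:
        server.shutdown()
        server.server_close()
```

The test only knows the URL and the response, not the code behind it, which is what separates it from the serializer tests earlier.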

E2E testing is extremely useful when your server is self-contained. However, if your application relies on third-party services, things can get tricky. For example, let’s consider a case where your application depends on an AWS service to check content safety. You can’t easily mimic that service locally, so you’d need to test against their live servers, which may incur costs for every request. As a result, you’ll likely want to separate those tests from your main CI/CD pipeline and use a dedicated staging environment to avoid mixing with production accounts.

That’s a *good* case. A *bad* case is when the third-party service doesn’t allow any testing at all. For instance, Google’s SSO or similar services might block testing to prevent potential DDoS attacks (because it is a DDoS attack). In these scenarios, your E2E tests often become glorified integration tests, where you simulate the service instead of interacting with the real thing.
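
A minimal sketch of what "simulating the service" can look like, using only the standard library. Everything here is hypothetical: `FakeContentSafetyClient` stands in for whatever third-party SDK you would call, and `PostService` for the code under test. The only point is that the dependency is passed in, so the test can hand over a fake.

```python
from dataclasses import dataclass, field


class FakeContentSafetyClient:
    """Hypothetical in-memory replacement for a third-party content-safety API."""

    def __init__(self, blocked_words):
        self.blocked_words = set(blocked_words)
        self.calls = []  # record every request so tests can inspect it

    def is_safe(self, text):
        self.calls.append(text)
        return not any(word in text.lower() for word in self.blocked_words)


@dataclass
class PostService:
    """Hypothetical code under test; the safety client is injected."""

    safety_client: object
    published: list = field(default_factory=list)

    def publish_post(self, text):
        if not self.safety_client.is_safe(text):
            return False
        self.published.append(text)
        return True


def test_unsafe_post_is_not_published():
    fake = FakeContentSafetyClient(blocked_words={"spam"})
    service = PostService(safety_client=fake)

    assert service.publish_post("hello world")
    assert not service.publish_post("buy SPAM now")
    assert service.published == ["hello world"]
    assert fake.calls == ["hello world", "buy SPAM now"]
```

A test like this says nothing about how the real service behaves, which is why it sits closer to an integration test than to a true E2E test.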

In short, TDD in its purest sense, using unit, integration, and E2E tests together, is often impractical or unachievable. I haven’t even touched on the time required to run E2E tests, the context-switching needed to write or debug them, flakiness, and other challenges. Still, we write tests because they are critical to delivering reliable software. After all, we’re not open-source engineers releasing products that randomly break because of dependencies like `is-number` in npm, right? Right?

## My Thoughts

We should primarily focus on integration and E2E tests that validate real user activity. It’s not always necessary to test the entire flow, especially when certain parts, like SSO, can only be mocked and not tested fully. E2E tests are often slow, flaky (the worst!), and sometimes impossible to implement comprehensively.

Unit tests should be reserved for complex functions that require isolation. My general approach to testing is like a binary search:

- If a unit test fails, the problem is within the business logic of that specific function.
- If an integration test fails, the issue is likely due to mismatched input/output between components.
- If an E2E test fails, it could indicate a third-party service outage, a race condition, or a broader system issue.
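
As a sketch of the kind of function where an isolated unit test does pay off, consider a made-up pricing rule with several branches. The `order_total` function and its discount tiers are hypothetical and exist only for illustration; note that the test needs no mocks, no database, and no framework.

```python
from decimal import Decimal


def order_total(unit_price, quantity):
    """Hypothetical pricing rule: 10% off for 10+ items, 20% off for 100+ items."""
    if quantity < 0:
        raise ValueError("quantity must not be negative")
    subtotal = Decimal(unit_price) * quantity
    if quantity >= 100:
        discount = Decimal("0.20")
    elif quantity >= 10:
        discount = Decimal("0.10")
    else:
        discount = Decimal("0")
    # Round to cents
    return (subtotal * (1 - discount)).quantize(Decimal("0.01"))


def test_order_total():
    assert order_total("10.99", 1) == Decimal("10.99")
    assert order_total("10.99", 10) == Decimal("98.91")    # 109.90 minus 10%
    assert order_total("10.99", 100) == Decimal("879.20")  # 1099.00 minus 20%


def test_negative_quantity_is_rejected():
    try:
        order_total("10.99", -1)
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError")
```

If `test_order_total` fails, the bug can only be in `order_total`, which is the first step of the binary search described above.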

For simple components like our `FooSerializer`, writing three separate tests (unit, integration, and E2E) is overkill at best and a waste of time at worst. Instead, focus on testing critical user flows and isolating tests to specific areas when necessary. This strikes a balance between test coverage and productivity, avoiding the trap of excessive and redundant testing.