((λ (x) (create x)) '(knowledge))

Testing Testing

Blackbox testing tkts using Busted ยท March 1st, 2025

I just released version 3.0.2 of tkts, which is frankly a really cool feeling. I haven't really touched this code base in a couple of years, despite using the tool almost daily to track and document the various little issues and tasks that come up across the breadth of projects I deal with. What's even neater to me is that tkts will be five years old at the end of October, just in time for the next Fennelconf! It's kind of absurd how quickly time flies, and it's genuinely surprising to me that I can revisit one of my code bases year after year like this and continue to improve upon it.

Which is actually exactly what this post is about: the little things I can improve upon. Did you know that in the almost five years I've been working on tkts I've never once written tests for it? Not a single one. Like an absolute madman I just hammered away at the code and tested it against the little blob of SQLite data I had managed to accumulate over the years. Worst case, if I broke something I'd pull a backup of the data from Restic and move on with my life a little grumpier. You know what else I did for the last two years? Engineer half-baked shell scripts to work around weird pain points that tkts couldn't handle but that bothered me greatly. For example, I had this fever dream of having tkts generate invoices from groff templates that I could convert into PDFs. That totally worked, if you billed a single ticket inside tkts as an item. Try to break that down over a range of dates and it wouldn't work at all. My solution was just to hammer in a fix with a random shell script instead.

Primarily because extending tkts to support this functionality meant doing deep, focused work, and implementing it without a test harness, after being away from the code base for so long, was just too daunting.

I should probably pause here: if you're not sure what tkts is, you might want to read this blog post. If you want the cliff notes instead, tkts is a ticket system I wrote for the sole purpose of writing it. Yeah, I'm weird, I write business applications for fun, I know. But it's a seriously amazing way to learn!

Anyways, back to it. The daunting reality was that tkts lacked features I wanted and needed, and had lots of little broken corner cases that I just lived with, because this was my own problem, created with my own hands, that likely nobody else was dealing with; ergo the problem felt unimportant. Getting past all of that really just took swallowing the frog, plus a month-plus of nights working through the code base to iteratively extend, manually test, validate, repeat. All of which could have been made 1000% easier if I had just written some Busted tests to begin with. Spoiler alert: I waited until the VERY end, when I went to update the package for Alpine, to actually implement these tests at all.

What is Busted?

Well, Busted is a really easy-to-use testing framework for Lua. And since we're talking about getting past daunting realities and broken corner cases, the only thing "busted" could realistically be is something to ensure your software isn't, in fact, busted. After pouring so many hours into tkts, and finally getting it merged into Alpine's community repos, I for one do not want it to be busted!

I won't belabor the point here, but the idea is that I was manually testing that things "worked," and all I ever really accomplished by doing that was ensuring things "worked" when I personally ran the program. That means the development of this tool was idiosyncratic to the configuration of my computer, and not even really my "computers" plural, but my droid where I do most of my work. And any data I tested on was a highly massaged variant of real data that sort of worked around known issues or hand-waved things away. The point is, you need to test your software, and testing frameworks like Busted give you a systematic way to do that. It's a maturity thing, and tkts is mature enough for this.
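To make that concrete, here's roughly the smallest possible Busted spec: `describe` groups related tests, `it` defines a single case, and any failed assertion fails the case. The two shim lines at the top are only there so this sketch also runs as plain Lua when Busted isn't installed; under `busted` those globals already exist.

```lua
-- Fallback shims: under Busted, describe/it are provided as globals
local describe = describe or function(_, body) body() end
local it = it or function(_, body) body() end

describe("a minimal spec", function()
   it("checks a simple fact", function()
      -- Busted also ships richer assertions (assert.are.equal, assert.matches);
      -- plain Lua assert works in both environments
      assert(("tkts"):upper() == "TKTS")
   end)
end)
```

Run it with `busted path/to/this_spec.lua` and Busted prints a pass/fail summary per `it` block.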

So what does a test look like for tkts specifically? Right now, just a black box test: I build the software, then I run it through the wringer in a clean environment. That way I know that things like database initialization work correctly, db migrations apply, and raw inserts succeed. To begin with I want to confirm: does the user experience of tkts hold up?

local lfs = require "lfs"
local posix = require "posix"

-- Helper to recursively delete a directory in the test env
local function removedir(dir)
   for file in lfs.dir(dir) do
      if file ~= "." and file ~= ".." then
         local file_path = dir .. "/" .. file
         local mode = lfs.attributes(file_path, "mode")
         if mode == "file" then
            os.remove(file_path)
         elseif mode == "directory" then
            removedir(file_path)
         end
      end
   end
   lfs.rmdir(dir)
end

describe("tkts CLI", function()
   -- Where our test env data will be created
   local test_home = "/tmp/tkts_test"
   local tkts_cmd = "./src/./tkts.fnl"

   -- Then we describe our tests
   describe("init operations", function()
      it("should remove existing config directory & recreate it", function()
         -- Remove the test directory if it exists
         if lfs.attributes(test_home) then
            removedir(test_home)
         end

         -- Create a fresh test directory and point HOME at it
         lfs.mkdir(test_home)
         lfs.mkdir(test_home .. "/.config")
         posix.setenv("HOME", test_home)

         -- Next we run tkts, read the output it creates, and compare it to
         -- what we expect it to display, like a db migration or the ticket display
         local init_output = io.popen(tkts_cmd):read("*a")
         assert.matches("DB{Migrating database from version (%d+) to (%d+)}", init_output)
         assert.matches("Open: 0 | Closed: 0", init_output)

         -- Then we check that any files tkts is supposed to create exist
         assert.is_true(lfs.attributes(test_home .. "/.config/tkts/tkts.conf", "mode") == "file")
         assert.is_true(lfs.attributes(test_home .. "/.config/tkts/tkts.db", "mode") == "file")
         assert.is_true(lfs.attributes(test_home .. "/.config/tkts/invoice.tf", "mode") == "file")
      end)
   end)

   -- And clean up our environment when we're done.
   describe("deinit operations", function()
      it("should remove existing config directory", function()
         if lfs.attributes(test_home) then
            removedir(test_home)
         end
      end)
   end)
end)

With this framework, testing becomes simple questions that we ask in batches. When we're working with a "ticket" inside of tkts, we should know that running tkts create -t 'title' -d 'description' creates a ticket. I know it does, I wrote it. But whoever packages or uses the software on their system should also know that this question needs asking, and how to verify the answer. That's what black box testing is all about!
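As a sketch of what that next batch of questions might look like, here's a create-ticket spec in the same style as the one above. Be warned it's full of assumptions: the expected summary line ("Open: 1 | Closed: 0") is my guess at tkts's output format, and `echo` stands in for the real binary so the sketch runs anywhere.

```lua
-- Shims so this sketch also runs as plain Lua outside Busted
local describe = describe or function(_, body) body() end
local it = it or function(_, body) body() end

-- Run a shell command and capture its stdout
local function run(cmd)
   local pipe = io.popen(cmd)
   local out = pipe:read("*a")
   pipe:close()
   return out
end

describe("ticket operations", function()
   it("should create a ticket and report it as open", function()
      -- In a real spec this would be something like:
      --   run(tkts_cmd .. " create -t 'title' -d 'description'")
      -- `echo` is a stand-in, and the summary line is an assumption
      -- about tkts's output format
      local out = run("echo 'Open: 1 | Closed: 0'")
      assert(out:match("Open: (%d+)") == "1")
   end)
end)
```

The `run` helper is the whole trick of black box testing here: every question becomes "run this command, match this pattern."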

Where to next?

Of course, black box testing isn't the best way to test things. It doesn't ensure that the interplay between the functions in your software is correct. Something could be materially broken inside tkts itself that just isn't exposed to the end user during normal, expected operation. Like, what if I run tkts create --this-software-sucks? That's a junk flag; how does tkts respond to it? Or, maybe less dramatically, what if I pass tkts help -s a_section_that_doesnt_exist? Does it gracefully handle that?

No, it doesn't. And I didn't catch that before I started writing this post and really thinking about how I test my software and why. That's still just more black box testing, but it would be more helpful to isolate the bits and pieces that comprise tkts and test them actively as I develop. Does this function do what I think it does? If I feed it bad data, does it react in an expected and deterministic fashion? These are questions I could answer with Busted, but cannot ask of tkts in its current state.
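Once the modules exist, those questions turn into unit tests. A hypothetical sketch: `lookup_section` below is a stand-in I invented for whatever tkts's real help-section lookup might become, not its actual API; the point is the shape of the bad-input test.

```lua
-- Shims so this sketch also runs as plain Lua outside Busted
local describe = describe or function(_, body) body() end
local it = it or function(_, body) body() end

-- Hypothetical module internals: a table of help sections and a lookup
-- that fails deterministically on unknown names
local sections = { usage = "tkts [command] [flags]", config = "see tkts.conf" }

local function lookup_section(name)
   local body = sections[name]
   if body then return body end
   return nil, ("no help section named '%s'"):format(name)
end

describe("help section lookup", function()
   it("returns the body for a known section", function()
      assert(lookup_section("usage") ~= nil)
   end)
   it("reacts to bad data in an expected, deterministic way", function()
      local body, err = lookup_section("a_section_that_doesnt_exist")
      assert(body == nil)
      assert(err:match("no help section"))
   end)
end)
```

Note the second case: instead of hoping a junk section name never reaches the function, the nil-plus-message contract is pinned down by a test.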

That's because tkts is a 2400-line monolith. It's a bad design choice I made five years ago and never recovered from. Every time I've revisited the tkts code base, it has been to add features I never got right the first time, or suddenly needed unexpectedly. The scope creep never left room for a refactor, and so the technical debt keeps piling up. It's fun! Can you imagine how badly this would suck if it weren't a personal project and did something actually important for some business? Goodness, no thank you.

So the next steps are to address exactly that. I plan to break the tkts code base up into modules, build a full test framework on top of that, and continue to expand the feature set. I really like how much learning tkts has enabled for me as a developer, and I want to keep encouraging myself to learn and grow from it.