Packaging Ruby Gems
Another victory in the Ruby Crusade · July 14th, 2024
Is this blog becoming a Ruby blog? Of all the languages I know and use on a regular basis, I seem to be reaching for Ruby more and more. Of course, we're right smack dab in the midst of the Old Computer Challenge, and I've complained in the past that Ruby really is not super optimal for the types of computers I use during that event, or really day to day. But despite that fact this little language has been my stalwart friend at work.
Just in the past few months I've written little tools to automate upgrading my rc-service scripts for some of my homelab services, to bulk convert data, and at work I've recently finished a Cradlepoint Netcloud & Zabbix API integration that brings a great level of detail into our monitoring stack. And in each of those situations Ruby has just been a breeze to work with.
I have an idea, I write what is essentially a guess at what I expect the Ruby syntax to be, et voila with minimal debugging I have a program. Python feels a lot like this for me too, sort of just guessing and boom something happens! Of course, Ruby doesn't fall over because I accidentally pressed tab instead of space. And the packaging ecosystem isn't abhorrent trash.
Oh wait, this is a post about packaging, not complaining about Python.
Why would we package a Gem?
I think, for a lot of developers, the idea of packaging is somewhat vague and contrary to how they think about their programs. They have a very specific version of a library they want to use, and test against. And they vendor that directly into their code base, using something like git submodules or a Gemfile. And then they develop against this very specific version.
I personally like this workflow, it makes a lot of sense. You work with a known thing that doesn't just change out from under you, so you can focus very specifically on what you want to work on and not swatting the bugs brought about by someone else changing something upstream. But what if you need to manage a whole bunch of these sorts of programs, and they all depend on roughly the same thing?
A lot of the time people just develop against whatever the latest revision is, and their code base is not so fragile that it breaks just because a library version is swapped out. Sometimes it is that fragile, and you're stuck patching away the issues to bring modernity to the codebase, or you just end up with a vendored lib.
For me, I think about it from the perspective of the distribution. If I package something, then there is less likelihood that the corpus of tools that depend on a specific library remains vulnerable to CVEs found in a specific version of that library. If we're using system packages, and patching our codebase to work with up-to-date libraries, then I don't have to worry about version 2.3.1 being vulnerable to an RCE in a specific tool where my Gemfile tells me I absolutely must use that version. I just need to apk upgrade and move on.
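To make the pinning problem concrete, here is a rough sketch using RubyGems' own version classes. Only the 2.3.1 pin comes from the paragraph above; the 2.3.2 "fixed" release and the relaxed constraint are hypothetical, chosen purely for illustration.

```ruby
# Sketch: why an exact pin keeps a vulnerable version around.
# Every version other than 2.3.1 is made up for illustration.
require "rubygems"

PINNED  = Gem::Requirement.new("= 2.3.1") # what a strict Gemfile might demand
RELAXED = Gem::Requirement.new("~> 2.3")  # any 2.x release from 2.3 upward

# Returns true when the requirement would accept the given version string.
def accepts?(requirement, version)
  requirement.satisfied_by?(Gem::Version.new(version))
end

puts accepts?(PINNED, "2.3.2")  # does the pin allow the hypothetical fix?
puts accepts?(RELAXED, "2.3.2") # does the relaxed constraint allow it?
```

With the exact pin, the hypothetical fixed release is rejected until someone edits the Gemfile. With a relaxed constraint, or a system package that simply tracks the distro, the upgrade is picked up without touching the tool.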
Further, if we consider the use case of Alpine in containers, by relying on system packages and not Ruby Gems/Bundler we can remove a whole corpus of tools and dependencies from our container images that really never need to exist in the first place. Not in a build layer, and not in the resulting product. Plus with a distro you have several hundred sets of eyes reviewing the packages as they flow through the ecosystem, from regular users just trying to use and test something, to package maintainers like myself aggressively packaging the world, to the core distribution teams that scrutinize each package change as it happens. These are contributors that would not exist in your project if all you did was bundle install and move on.
And as you'll see in a second, we're not really deviating that much from the typical Ruby workflow when we package a Gem. We still rely on tools like Bundler, Rake, and the various ruby test frameworks to package and validate our code. We're just adding more scrutiny and rigor around the process to make it sustainable/accessible to the distro at large.
APKBUILDs for Gems
Now admittedly, I'm not a master of packaging Ruby things, I've really only recently dipped my toes into these waters. But this process is so easy I rapidly added ~13 Ruby libraries to Alpine in just the course of two nights. In fact, this is how I spent the first two days of the Old Computer Challenge, bundling up all the dependencies I've used in all the dabbling I've done with Ruby thus far. And whatever other dependencies they might have.
Let's look at what I did for ruby-resolv. This is a really simple Gem that provides an alternative DNS resolution to the default socket method built into Ruby. Since this project is truly just a single Ruby file, we don't actually have to do much work.
# Maintainer: Will Sinatra <wpsinatra@gmail.com>
pkgname=ruby-resolv
_gemname=${pkgname#ruby-}
pkgver=0.4.0
pkgrel=0
pkgdesc="A thread-aware DNS resolver library written in Ruby"
url="https://rubygems.org/gems/resolv"
arch="noarch"
license="BSD-2-Clause"
checkdepends="ruby-rake ruby-bundler ruby-test-unit ruby-test-unit-ruby-core"
depends="ruby"
source="$pkgname-$pkgver.tar.gz::https://github.com/ruby/resolv/archive/refs/tags/v$pkgver.tar.gz
gemspec.patch"
builddir="$srcdir/$_gemname-$pkgver"
prepare() {
	default_prepare
	sed -i '/spec.signing_key/d' $_gemname.gemspec
}
build() {
	gem build $_gemname.gemspec
}
check() {
	rake
}
package() {
	local gemdir="$pkgdir/$(ruby -e 'puts Gem.default_dir')"
	gem install --local \
		--install-dir "$gemdir" \
		--ignore-dependencies \
		--no-document \
		--verbose \
		$_gemname
	rm -r "$gemdir"/cache \
		"$gemdir"/build_info \
		"$gemdir"/doc
}
sha512sums="
c1157d086a4d72cc48a6e264bea4e95217b4c4146a103143a7e4a0cea800b60eb7d2e32947449a93f616a9908ed76c0d2b2ae61745940641464089f0c58471a3 ruby-resolv-0.4.0.tar.gz
ed64dbce3e78f63f90ff6a49ec046448b406fa52de3d0c5932c474342868959169d8e353628648cbc4042ee55d7f0d4babf6f929b2f8d71ba7bb12eb9f9fb1ff gemspec.patch
"
Gem build and its gemspec files do a wonderful job abstracting away the complexity of our packaging concerns. It's extremely common to see Rakefiles default to running tests and nothing more, with whatever framework the author likes. And so we really only need to tell gem to be very particular about where it installs files and how it thinks about what it needs to install.
The one issue I ran into very consistently is the use of git ls-files inside of the gemspec files to figure out what kind of files the Gem actually installs. This is a neat trick, but a bit silly for a library that is literally one file, and even if it's several directories almost everything in a Ruby library gets dumped into a directory called lib.
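For reference, the kind of Rakefile that lets a bare rake in check() do the right thing tends to look roughly like this. It is a generic sketch, not the actual Rakefile shipped with resolv; the test directory and file pattern are assumptions.

```ruby
# Generic Rakefile sketch: `rake` with no arguments runs the test suite.
# The test/ directory and test_*.rb pattern are assumed, not taken from resolv.
require "rake/testtask"

Rake::TestTask.new(:test) do |t|
  t.libs << "test"                  # lib/ is already on the load path by default
  t.pattern = "test/**/test_*.rb"   # which files count as tests
end

# Make the test task the default, so a plain `rake` runs it.
task default: :test
```

Because the default task only depends on the test task, the APKBUILD needs nothing more than the test framework listed in checkdepends.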
Fortunately this little patch (while specific to the ruby-resolv package) is a quick fix for that one tiny issue. And it's really not a big deal to carry these sorts of "make the build system work" patches. At least I don't really mind.
--- a/resolv.gemspec
+++ b/resolv.gemspec
@@ -20,9 +20,7 @@
spec.metadata["homepage_uri"] = spec.homepage
spec.metadata["source_code_uri"] = spec.homepage
- spec.files = Dir.chdir(File.expand_path('..', __FILE__)) do
- `git ls-files -z`.split("\x0").reject { |f| f.match(%r{^(test|spec|features)/}) }
- end
+ spec.files = Dir["lib/**/*"]
spec.bindir = "exe"
spec.executables = []
spec.require_paths = ["lib"]
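To see what the two approaches select, here is a small sketch that runs both against a throwaway directory. The file names are made up, and the git ls-files output is simulated with a plain array so the sketch runs without a git checkout, which is presumably the situation when building from a release tarball.

```ruby
# Sketch: what the original reject-filter and the patched glob each select.
# All file names here are hypothetical.
require "tmpdir"
require "fileutils"

# Stand-in for `git ls-files -z`.split("\x0") in a real checkout.
TRACKED = ["Rakefile", "lib/resolv.rb", "test/resolv/test_dns.rb"].freeze

# The original gemspec logic: keep everything git tracks except tests.
def files_via_git_filter(tracked)
  tracked.reject { |f| f.match(%r{^(test|spec|features)/}) }
end

# The patched logic: glob lib/ relative to the given project root.
def files_via_glob(root)
  Dir.chdir(root) { Dir["lib/**/*"].sort }
end

# Build the hypothetical tree on disk and return the glob result.
def demo
  Dir.mktmpdir do |root|
    TRACKED.each do |path|
      FileUtils.mkdir_p(File.join(root, File.dirname(path)))
      FileUtils.touch(File.join(root, path))
    end
    files_via_glob(root)
  end
end

p files_via_git_filter(TRACKED)
p demo
```

The glob only picks up what lives under lib, which is all a single-file library needs to ship. A gem that also ships executables or extension sources would need a wider pattern.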
Now some Gems need to be compiled, because they're actually wrappers on top of C libraries. This is a pretty common design: Ruby interfaces with the lower level lib through a compiled native extension (FFI in the loose sense), just the same as would be done in Lua. When that happens the gem build system needs to compile the extension code, as well as bundle the Ruby code away.
# Contributor: Will Sinatra <wpsinatra@gmail.com>
# Maintainer: Will Sinatra <wpsinatra@gmail.com>
pkgname=ruby-sqlite3
_gemname=${pkgname#ruby-}
pkgver=2.0.2
pkgrel=0
pkgdesc="Ruby bindings for SQLite3"
url="https://rubygems.org/gems/sqlite3"
arch="all"
license="BSD-3-Clause"
makedepends="ruby-dev sqlite-dev"
depends="ruby ruby-mini_portile2"
checkdepends="ruby-rake ruby-bundler"
source="$pkgname-$pkgver.tar.gz::https://github.com/sparklemotion/sqlite3-ruby/archive/refs/tags/v$pkgver.tar.gz"
builddir="$srcdir/sqlite3-ruby-$pkgver"
options="!check" # requires rubocop
build() {
	gem build $_gemname.gemspec
}
check() {
	rake
}
package() {
	local gemdir="$pkgdir/$(ruby -e 'puts Gem.default_dir')"
	local geminstdir="$gemdir/gems/sqlite3-$pkgver"
	gem install \
		--local \
		--install-dir "$gemdir" \
		--bindir "$pkgdir/usr/bin" \
		--ignore-dependencies \
		--no-document \
		--verbose \
		"$builddir"/$_gemname-$pkgver.gem -- \
		--use-system-libraries
	rm -r "$gemdir"/cache \
		"$gemdir"/doc \
		"$gemdir"/build_info \
		"$geminstdir"/ext \
		"$geminstdir"/ports \
		"$geminstdir"/*.md \
		"$geminstdir"/*.yml \
		"$geminstdir"/.gemtest
	find "$gemdir"/extensions/ -name mkmf.log -delete
}
sha512sums="
987027fa5e6fc1b400e44a76cd382ae439df21a3af391698d638a7ac81e9dff09862345a9ba375f72286e980cdd3d08fa835268f90f263b93630ba660c4bfe5e ruby-sqlite3-2.0.2.tar.gz
"
But as you can see in the APKBUILD for ruby-sqlite3, that really isn't that much more effort. We just need to include our -dev dependencies to ensure we can actually compile against the correct distro libraries, and then we tell gem to build and install against those deps. There's extra work cleaning up the installation directory, but it's more or less the exact same process.
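As a sanity check on that cleanup, something like the following could be pointed at a staged package directory to list build leftovers that slipped through. It is a hypothetical helper, not part of abuild or RubyGems; the list of unwanted names simply mirrors the rm and find lines in the APKBUILD above, and the demo directory layout is illustrative only.

```ruby
# Hypothetical helper: report build leftovers under a staged gem directory.
# Not part of abuild or RubyGems; names mirror the APKBUILD cleanup steps.
require "rubygems"
require "tmpdir"
require "fileutils"

# Top-level directories the APKBUILD removes from the gem directory.
UNWANTED_DIRS = %w[cache doc build_info].freeze

# Same idea as "$pkgdir/$(ruby -e 'puts Gem.default_dir')" in package().
def staged_gemdir(pkgdir)
  File.join(pkgdir, Gem.default_dir)
end

# Returns a sorted list of paths, relative to gemdir, that should be gone.
def leftovers(gemdir)
  dirs = UNWANTED_DIRS.select { |d| File.exist?(File.join(gemdir, d)) }
  logs = Dir.glob("extensions/**/mkmf.log", base: gemdir)
  (dirs + logs).sort
end

# Demo on a throwaway tree with two deliberate leftovers.
Dir.mktmpdir do |pkgdir|
  gemdir = staged_gemdir(pkgdir)
  FileUtils.mkdir_p(File.join(gemdir, "cache"))
  FileUtils.mkdir_p(File.join(gemdir, "extensions/demo"))
  FileUtils.touch(File.join(gemdir, "extensions/demo/mkmf.log"))
  p leftovers(gemdir)
end
```

An empty list means the staged tree contains only what the package is meant to ship, at least as far as these particular leftovers go.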
Closing Thoughts
Honestly, this is a pretty delightful find from my perspective. It means that it's not only really easy to add additional packages to Alpine, but I also discovered in this process that there aren't really that many Ruby things packaged for Alpine in the first place. I've often heard that people just can't use Alpine for Ruby things because X dependency isn't packaged, or when they try and add something using bundler it fails to properly compile. This will in the long term help wave away a whole class of issues, and I'm really excited about that.
Now, the OCC stuff. This year's theme is DIY, whatever you want to do, do it! None of us could agree on what to do so we're just doing anything and everything. I've seen some really cool posts and ideas thrown around, but with so much of my time limited by commitments at work and with my family, the best thing I can think to do is just anything I would normally decide to do, just from my junky little Acer ZG5. All of the packages above were built, tested, and pushed from the terrible 5400rpm IDE drive after being lovingly toasted by the heat-spewing Atom N270 CPU. And while that process was slow at times with a repo as large as aports, it was still totally doable.
Long live old machines! We're doing real work out here thanks to them!