Dependency injection is one of my favorite design patterns for developing highly testable and modular code. To apply this pattern, all you have to do is follow two simple guidelines:
Separate object construction from usage. In practical terms: stop creating objects inside constructors and take those objects as input arguments.
Use interfaces instead of concrete types as constructor parameters. In this way, the receiver remains agnostic to the implementation of those types and thus it becomes possible to supply different implementations.
Dependency injection is indeed key to testability, but it is also a good design principle [1] because it keeps system pieces loosely coupled. The best part is that dependency injection is a simple concept: there is no need for fancy frameworks.
As for how to define the generic interfaces, the techniques depend on your language of choice. In C++, you would define pure abstract classes; in Java, Go and C#, you would define interfaces; and in Rust, you would use traits.
This post is about Rust though, so let’s talk about using traits for dependency injection, how they have a nasty side-effect, and what we can do about it.
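Before getting to the side effect, here is a minimal sketch of the two guidelines in Rust. None of the names below (Clock, SystemClock, FakeClock, Stamper) come from any real crate; they are made up for illustration only.

```rust
use std::sync::Arc;

/// Interface for a dependency that touches the real world (hypothetical).
trait Clock {
    fn now_secs(&self) -> u64;
}

/// Real implementation backed by the system clock.
struct SystemClock;

impl Clock for SystemClock {
    fn now_secs(&self) -> u64 {
        use std::time::{SystemTime, UNIX_EPOCH};
        SystemTime::now()
            .duration_since(UNIX_EPOCH)
            .map(|d| d.as_secs())
            .unwrap_or(0)
    }
}

/// Deterministic implementation for tests.
struct FakeClock(u64);

impl Clock for FakeClock {
    fn now_secs(&self) -> u64 {
        self.0
    }
}

/// Business logic: receives its dependency instead of constructing it
/// (guideline 1) and only knows about the trait (guideline 2).
struct Stamper {
    clock: Arc<dyn Clock>,
}

impl Stamper {
    fn new(clock: Arc<dyn Clock>) -> Self {
        Self { clock }
    }

    fn stamp(&self, msg: &str) -> String {
        format!("[{}] {}", self.clock.now_secs(), msg)
    }
}

fn main() {
    // Production wiring: the caller decides which implementation to inject.
    let real = Stamper::new(Arc::new(SystemClock));
    println!("{}", real.stamp("hello"));

    // Test wiring: same business logic, fully deterministic.
    let fake = Stamper::new(Arc::new(FakeClock(42)));
    assert_eq!(fake.stamp("hello"), "[42] hello");
}
```

The only place that knows about SystemClock is the caller that wires things together; Stamper itself never names a concrete clock.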
An example
When the dependency injection pattern is applied correctly, a library crate exposes:
- traits to represent heavyweight objects;
- a collection of objects that implement those traits; and
- functions that consume those objects via the generic traits.
Then, consumers of the library instantiate the specific objects they need, wire them together to create a dependency graph, and feed those into the generic business logic functions exposed by the library.
Is that clear? Yeah… I guess not. Let’s take a look at a concrete example so that we can have reference code for the explanations below. For that, I’ll use the code of the db_logger crate I recently published.
The db_logger crate provides a log facade implementation that records log messages into a database. The database in which messages are stored depends on your choice between PostgreSQL and SQLite (for now), and this selection is made via dependency injection.
You would think that Cargo features would be sufficient to choose a database backend, but they are a poor tool for this kind of configuration. Cargo features are well suited to controlling which dependencies are built and which are not, but you, the user, still have to choose which database to talk to in code. Each backend requires different settings, and you may even want to select a backend at runtime.
To support runtime configuration, we start by defining a trait to represent the generic features we need our database connection to expose. In this way, the logging (business) logic can record log entries and remain unaware of which specific database it is talking to. Then, we add a function to initialize the logger based on an abstract connection object and we are done:
/// Operations that an arbitrary database connection can perform.
#[async_trait]
pub trait Db {
    async fn put_log_entries(&self, es: Vec<LogEntry<'_, '_>>) -> Result<()>;
}

/// Initializes the logging subsystem to record entries in `db`.
pub fn init(db: Arc<dyn Db + Send + Sync + 'static>) {
    // ...
}
With this interface at hand, consumers of db_logger can pick which database to connect to by using alternate objects that implement the Db trait, of which there are two now: PostgresDb and SqliteDb.
On the consumer side (say, from src/main.rs), we can do something like this when we want to talk to PostgreSQL:
let db = Arc::from(db_logger::PostgresDb::connect_lazy(
    host, port, database, username, password));
db_logger::init(db); // Doesn't care about which specific `db`.
Or something like this when we want to talk to SQLite:
let db = Arc::from(db_logger::SqliteDb::connect(uri));
db_logger::init(db); // Doesn't care about which specific `db`.
Note that, thanks to this design, downstream consumers are not the only ones who benefit: the business logic unit tests can use this abstraction as well to remain stable and extremely fast. In particular, the logging logic’s unit tests do not need to reach a real database server, which keeps them fast and avoids flakiness due to misconfiguration or networking issues.
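To make the testing benefit concrete, here is a rough sketch of a unit test that injects an in-memory test double. It makes several assumptions: the Db trait is simplified to a synchronous version that takes plain strings, and RecordingDb and Logger are hypothetical names invented here, not types from db_logger.

```rust
use std::sync::{Arc, Mutex};

/// Simplified, synchronous stand-in for the `Db` trait (the real one is async).
trait Db {
    fn put_log_entries(&self, entries: Vec<String>) -> Result<(), String>;
}

/// Hypothetical test double that keeps entries in memory.
#[derive(Default)]
struct RecordingDb {
    entries: Mutex<Vec<String>>,
}

impl Db for RecordingDb {
    fn put_log_entries(&self, entries: Vec<String>) -> Result<(), String> {
        self.entries
            .lock()
            .map_err(|e| e.to_string())?
            .extend(entries);
        Ok(())
    }
}

/// Hypothetical business logic: formats messages and hands them to any `Db`.
struct Logger {
    db: Arc<dyn Db + Send + Sync + 'static>,
}

impl Logger {
    fn new(db: Arc<dyn Db + Send + Sync + 'static>) -> Self {
        Self { db }
    }

    fn log(&self, level: &str, msg: &str) -> Result<(), String> {
        self.db.put_log_entries(vec![format!("{}: {}", level, msg)])
    }
}

fn main() {
    // Keep a handle to the concrete fake so the test can inspect it later.
    let db = Arc::new(RecordingDb::default());
    let logger = Logger::new(db.clone());

    logger.log("INFO", "started").unwrap();

    // No network, no configuration: just check what reached the "database".
    assert_eq!(
        *db.entries.lock().unwrap(),
        vec!["INFO: started".to_string()]
    );
}
```

Because the logger only sees the trait, the same code path runs in the test and in production; only the injected object differs.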
Sounds great, right? Well, it does, but notice how the Db trait above references the LogEntry type in its put_log_entries() function. That type, which users of the db_logger crate should have no knowledge of, must now be public too because Db is public. And this is a big transitive problem.
The problem
The key issue with using traits in Rust for dependency injection is that any type referenced in a function signature must be at least as visible as the function itself. This means that if a trait is public (like the Db trait above), any type referenced by any of the trait’s functions (like the LogEntry struct above) has to be public as well.
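The rule can be reproduced in a few lines. The sketch below uses a simplified, synchronous Db trait and a trimmed-down LogEntry; the logger module, log_once() and CountingDb are hypothetical names for illustration. As far as I know, the exact diagnostic you get when dropping the pub on LogEntry depends on the compiler version (a hard error in older releases, a lint in newer ones), so treat the comment as approximate.

```rust
mod logger {
    /// Conceptually an internal detail, yet it has to be `pub` because the
    /// public `Db` trait below mentions it in a method signature. Dropping
    /// the `pub` makes rustc complain about a private type in a public
    /// interface (error or lint, depending on the compiler version).
    pub struct LogEntry {
        pub message: String,
    }

    /// Public trait used for dependency injection.
    pub trait Db {
        /// Stores the entries and returns how many were accepted.
        fn put_log_entries(&self, entries: Vec<LogEntry>) -> usize;
    }

    /// Public entry point that consumes any `Db`.
    pub fn log_once(db: &dyn Db, message: &str) -> usize {
        db.put_log_entries(vec![LogEntry {
            message: message.to_string(),
        }])
    }
}

/// Downstream code can now see, and depend on, `LogEntry`.
struct CountingDb;

impl logger::Db for CountingDb {
    fn put_log_entries(&self, entries: Vec<logger::LogEntry>) -> usize {
        // Accepts only non-empty messages, poking at the "internal" type.
        entries.iter().filter(|e| !e.message.is_empty()).count()
    }
}

fn main() {
    assert_eq!(logger::log_once(&CountingDb, "hello"), 1);
    assert_eq!(logger::log_once(&CountingDb, ""), 0);
}
```

Note how CountingDb, which lives outside the module, freely reads LogEntry fields: this is exactly the leak of implementation details described next.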
Too broad visibility is problematic for at least two reasons:
Broken encapsulation. Users of a library (crate) shouldn’t see APIs that are internal to the library. Otherwise, they can easily take dependencies on implementation details, which can break behavior and will make your life as a maintainer harder in the future.
Insufficient dead code detection. Once a type is marked public, the compiler cannot claim that it is unused even if nothing else uses the type within the crate. This is viral: an unused type may be referenced by unused functions, which in turn may reference other unused types, etc. Link-time optimizations can make this (almost) a non-issue at runtime, but any dead code is a liability during development because it gets in the way of maintenance.
This problem is not exclusive to library crates. Binary crates suffer as well if you follow the recommendation of placing most code into a private library and having src/main.rs be a simple facade over the library. Plus, any crate can suffer if it has integration tests because those can only interact with your public interface.
So, what can we do about this to keep our architecture sane?
Bad solution: “do-it-all functions”
A first solution is to try and hide the problematic traits behind what I’ll call “do-it-all functions” for lack of a better name.
This is what I first did in a couple of projects last year. In those projects, I used to have a serve_rest_api() public function that took the database connection and then started a REST server backed by it. To hide the traits, I renamed this function to serve_rest_api_internal() and then added a new serve_rest_api() function that took configuration parameters to decide which objects to instantiate, thus subsuming most of the logic that previously existed in src/main.rs.
Needless to say, this is ugly because we lose composability, and it looks bad because we are shoehorning the responsibilities of the main program into the library, all to appease some visibility issues. Not a good trade-off from an API design perspective.
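For reference, the shape of such a “do-it-all function” is roughly the following. This is only a sketch under assumptions: DbConfig, the describe() method and the string return values are invented here so that the example runs without a real REST server or database; only the serve_rest_api() and serve_rest_api_internal() names come from the description above.

```rust
mod server {
    /// Private trait: it no longer appears in any public signature.
    trait Db {
        fn describe(&self) -> String;
    }

    struct PostgresDb {
        host: String,
    }

    struct SqliteDb {
        path: String,
    }

    impl Db for PostgresDb {
        fn describe(&self) -> String {
            format!("postgres at {}", self.host)
        }
    }

    impl Db for SqliteDb {
        fn describe(&self) -> String {
            format!("sqlite at {}", self.path)
        }
    }

    /// Hypothetical public configuration that replaces the injected object.
    pub enum DbConfig {
        Postgres { host: String },
        Sqlite { path: String },
    }

    /// The original function, now private, still written against the trait.
    fn serve_rest_api_internal(db: Box<dyn Db>) -> String {
        format!("serving REST API backed by {}", db.describe())
    }

    /// The "do-it-all" wrapper: wiring that used to live in src/main.rs.
    pub fn serve_rest_api(config: DbConfig) -> String {
        let db: Box<dyn Db> = match config {
            DbConfig::Postgres { host } => Box::new(PostgresDb { host }),
            DbConfig::Sqlite { path } => Box::new(SqliteDb { path }),
        };
        serve_rest_api_internal(db)
    }
}

fn main() {
    let pg = server::DbConfig::Postgres { host: "db.example.com".to_string() };
    println!("{}", server::serve_rest_api(pg));

    let lite = server::DbConfig::Sqlite { path: ":memory:".to_string() };
    println!("{}", server::serve_rest_api(lite));
}
```

The trait is private again, but callers can no longer supply their own Db: every new backend means a new DbConfig variant inside the library.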
To make matters worse, this approach didn’t quite work for db_logger. In the case of this library, you can imagine having exposed init_postgres() and init_sqlite() public functions (again, bad for composability) that created the database objects within them. I tried doing that, but I ran into some nightmare-ish problems in my integration tests because I had to handle lifetimes and async tasks across async runtime boundaries (Drop being sync is… troublesome).
As a result of this trouble, I ended up having to dump this “solution” and spend a couple of head-scratching mornings finding an alternative—which is a good thing because these “do-it-all functions” really sucked from a design perspective.
Good solution: newtype
I don’t know why it took me so long to reach the conclusion of using the newtype idiom to hide the traits. I guess I was too bogged down in trying to make the specific solution above work, and that prevented me from seeing an alternative approach. In retrospect, it sounds trivial, but here it goes.
The idea to solve the visibility issues is to introduce a new concrete type that wraps the trait as its single member. Then, this concrete type is the one that’s made public and the trait (and all of its dependencies) can remain private.
For our db_logger case study, all we have to do is introduce a new type, like this:
#[derive(Clone)]
pub struct Connection(Arc<dyn Db + Send + Sync + 'static>);
Note how Connection is just wrapping the Db trait, but now, the trait is an implementation detail of the struct and does not have to be public. Note also how this hides the complexity of the Db instance representation: the Arc and all of the trait bounds are now hidden within the struct and don’t pollute the public API.
With this, we can update our concrete implementations of Db with some factory methods:
/// Factory to connect to a PostgreSQL database.
pub fn connect_lazy(opts: ConnectionOptions) -> Connection {
    Connection(Arc::from(PostgresDb::connect_lazy(opts, None)))
}

/// Factory to connect to a SQLite database.
pub async fn connect(opts: ConnectionOptions) -> Result<Connection> {
    SqliteDb::connect(opts).await.map(|db| Connection(Arc::from(db)))
}
And, finally, our caller code can do one of these to set up the logger with ease:
let conn = if use_real_db {
    postgres::connect_lazy(...)
} else {
    // The SQLite factory is async and fallible, unlike the PostgreSQL one.
    sqlite::connect(...).await?
};
db_logger::init(conn);
Voilà. By hiding the trait inside a struct with the newtype idiom, the trait and all of its internal dependent types can be private again. And, as expected, the compiler can now spot unused code.
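Putting the pieces together, a self-contained approximation of the idiom could look like the sketch below. It is not the db_logger code: the trait is synchronous for brevity, and MemoryDb, connect_memory() and log() are hypothetical stand-ins for the real backends, factories and logging entry point.

```rust
mod logger {
    use std::sync::{Arc, Mutex};

    /// Private: no longer part of the public API.
    struct LogEntry {
        message: String,
    }

    /// Private trait (synchronous here for brevity; the real `Db` is async).
    trait Db {
        /// Stores the entries and returns the total number stored so far.
        fn put_log_entries(&self, entries: Vec<LogEntry>) -> usize;
    }

    /// Hypothetical in-memory backend standing in for the real ones.
    #[derive(Default)]
    struct MemoryDb {
        messages: Mutex<Vec<String>>,
    }

    impl Db for MemoryDb {
        fn put_log_entries(&self, entries: Vec<LogEntry>) -> usize {
            let mut messages = self.messages.lock().unwrap();
            messages.extend(entries.into_iter().map(|e| e.message));
            messages.len()
        }
    }

    /// The only public handle: a newtype around the private trait object.
    #[derive(Clone)]
    pub struct Connection(Arc<dyn Db + Send + Sync + 'static>);

    /// Factory playing the role of `connect_lazy()` / `connect()`.
    pub fn connect_memory() -> Connection {
        Connection(Arc::new(MemoryDb::default()))
    }

    /// Business logic that only needs the opaque `Connection`.
    pub fn log(conn: &Connection, message: &str) -> usize {
        conn.0.put_log_entries(vec![LogEntry {
            message: message.to_string(),
        }])
    }
}

fn main() {
    let conn = logger::connect_memory();
    assert_eq!(logger::log(&conn, "first"), 1);

    // Clones share the same underlying backend through the inner `Arc`.
    let shared = conn.clone();
    assert_eq!(logger::log(&shared, "second"), 2);
}
```

Outside the logger module, only Connection, connect_memory() and log() can be named; Db, LogEntry and MemoryDb stay private, so the compiler is free to flag them if they ever become unused.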
Update (2022-04-23): Some people have brought up via other channels that using static dispatch might be a better approach to avoid the runtime overheads imposed by the solution above. Maybe. I hadn’t thought about that when writing this post or the referenced code. Would be interesting to explore that avenue.
[1] Dependency injection can also be a horrible way to make your code harder to work with when applied blindly. Some people have taken the meaning of “don’t construct objects inside other objects” to the limit and have hidden even classes that represent plain data types behind interfaces. Don’t be that person. Only define interfaces for objects that interact with the real world (file systems, databases, networks, graphics, etc.) or that implement functionality that is so complex that it makes other parts of the system hard to test.