balrogboogie

The write.as blog for https://ceilidh.space/@balrogboogie

Imagine you are writing an application that consumes an HTTP API. You make calls, receive JSON/XML/protobufs/etc., and use that data for something else. You like and use TDD to make sure that your app works properly, and you add tests for as much as you possibly can. However, when it comes to the code that actually makes those HTTP calls, you have to make some adjustments, because you don't actually want to be making those HTTP calls during your tests. In languages that have the ability to dynamically change things at runtime, this would be where you would add mocks. In Rust, it can be a little more challenging, and require a little more thought up-front [1, 2].

In this article we will make a small Rust library that uses the reqwest HTTP client library, and see what we can do to adequately test the business logic. We assume you have the Rust toolchain installed, and are at least passingly familiar with programming in Rust.

Project setup

First, let's make our new Rust library, and add some dependencies to it:

$ cargo new --lib foo-client
$ cd foo-client
$ $EDITOR Cargo.toml

Make sure your Cargo.toml dependencies section looks like this:

[dependencies]
# There is new stuff on `master` that we need for this
reqwest = { git = "https://github.com/seanmonstar/reqwest" }
url = "1.7"
serde_json = "1.0"

Creating the client

Now, let's get our new library actually doing something. First, we'll create the FooClient:

// src/lib.rs

pub struct FooClient {
   client: reqwest::Client,
}

impl FooClient {
   pub fn new() -> FooClient {
      FooClient {
         client: reqwest::Client::new(),
      }
   }
}

(I'm using edition = 2018 here, which is why you don't see extern crate reqwest anywhere)

Okay, so we have our client, now let's give it something to do.

// src/lib.rs
use std::error::Error;

use serde_json::Value;
use url::Url;

pub struct FooClient {
   client: reqwest::Client,
}

impl FooClient {
   pub fn new() -> FooClient {
      FooClient {
         client: reqwest::Client::new(),
      }
   }

   pub fn get_widget(&self, id: &str) 
      -> Result<Value, Box<Error>>
   {
      let url = Url::parse("https://example.com/widget/")?
         .join(id)?;
      let value: Value = self.client.get(url)
         .send()?
         .json()?;
      Ok(value)
   }
}

Great! Now we can retrieve a widget by its "id". Usually we'd want to have better error management, but hey, Box<Error> is better than .unwrap() everywhere, right?
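If it isn't obvious why Box<Error> plays so nicely with the ? operator: ? converts whatever error it sees into the function's error type, and the standard library can box up any error type (and even plain strings). Here is a tiny std-only sketch of that; parse_and_double is a made-up function that has nothing to do with our client, and I've written the type as Box<dyn Error>, which is the spelling newer compilers want.

```rust
use std::error::Error;

// Two different failure types flow into one boxed error type via `?`.
pub fn parse_and_double(input: &str) -> Result<i32, Box<dyn Error>> {
    // A `ParseIntError` is boxed automatically by `?`.
    let n: i32 = input.trim().parse()?;
    // A plain `&str` message can be boxed the same way.
    let doubled = n.checked_mul(2).ok_or("overflow while doubling")?;
    Ok(doubled)
}
```

The price is that callers can no longer match on a concrete error type without downcasting, which is why a real library would usually define its own error enum.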

Now that our client can retrieve the widget resources it needs, we need to test this (yes, I know that technically we should have written the tests first, just go with me here). In order to avoid actually making HTTP calls during our tests, we're going to restructure our code just a bit. Essentially, we need to abstract away the process of taking a reqwest::Request and turning it into a reqwest::Response. This way, we can do one thing in our production code, and something else in our tests.

So, let's add a trait for this:

// src/lib.rs

impl FooClient {
   // ...
}

pub trait HttpSend {
   fn send(&self, request: reqwest::RequestBuilder)
      -> Result<reqwest::Response, Box<Error>>;
}

and now we parameterize our client using this trait:

// src/lib.rs

pub struct FooClient<S: HttpSend> {
   client: reqwest::Client,
   sender: S,
}

impl<S: HttpSend> FooClient<S> {
   // we'll take care of `pub fn new` in a minute...

   pub fn get_widget(&self, id: &str) 
      -> Result<Value, Box<Error>> 
   {
      let url = Url::parse("https://example.com/widget/")?
         .join(id)?;
      let value: Value = self.sender
         .send(self.client.get(url))?
         .json()?;
      Ok(value)
   }
}

(Did you notice the slight change to the get_widget method?)

Before we can actually get this to compile & work correctly, we need two more pieces: first, the actual implementation of HttpSend for our client:

pub trait HttpSend {
   fn send(&self, request: reqwest::RequestBuilder) 
      -> Result<reqwest::Response, Box<Error>>;
}

pub struct Sender;
impl HttpSend for Sender {
   fn send(&self, request: reqwest::RequestBuilder) 
      -> Result<reqwest::Response, Box<Error>> 
   {
      Ok(request.send()?)
   }
}

Easy enough, yeah? Lastly, let's make our client use this implementation:

pub struct FooClient<S: HttpSend=Sender> {
   client: reqwest::Client,
   sender: S,
}

impl FooClient<Sender> {
   pub fn new() -> FooClient<Sender> {
      FooClient {
         client: reqwest::Client::new(),
         sender: Sender,
      }
   }
}

impl<S: HttpSend> FooClient<S> {
   pub fn with_sender(sender: S) -> FooClient<S> {
      FooClient {
         client: reqwest::Client::new(),
         sender: sender,
      }
   }

   pub fn get_widget(..) // etc
}

Nice! Now our client can send HTTP requests like normal, the user of the library doesn't have to deal with the HttpSend implementation at all (thanks to the default we set — <S: HttpSend=Sender>), and we can swap out the “Sender” at compile time with our own implementation!
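If the default type parameter trick is new to you, here it is boiled down to a std-only sketch. Everything in it (Fetch, RealFetch, CannedFetch, Client) is a made-up stand-in for the reqwest-based types above, and RealFetch doesn't actually talk to the network.

```rust
// Stand-in for the HttpSend trait: turns a URL into a response body.
pub trait Fetch {
    fn fetch(&self, url: &str) -> String;
}

// The "production" implementation (a placeholder; it performs no I/O).
pub struct RealFetch;
impl Fetch for RealFetch {
    fn fetch(&self, url: &str) -> String {
        format!("real response for {}", url)
    }
}

// `F = RealFetch` is the default type parameter.
pub struct Client<F: Fetch = RealFetch> {
    fetcher: F,
}

// `new` only exists for the default type, so `Client::new()` needs no
// type annotations at the call site.
impl Client<RealFetch> {
    pub fn new() -> Client<RealFetch> {
        Client { fetcher: RealFetch }
    }
}

impl<F: Fetch> Client<F> {
    pub fn with_fetcher(fetcher: F) -> Client<F> {
        Client { fetcher }
    }

    pub fn get(&self, url: &str) -> String {
        self.fetcher.fetch(url)
    }
}

// A test double that always returns the same canned body.
pub struct CannedFetch(pub &'static str);
impl Fetch for CannedFetch {
    fn fetch(&self, _url: &str) -> String {
        self.0.to_string()
    }
}

// Library users can spell the type as plain `Client`.
pub fn production_client() -> Client {
    Client::new()
}
```

Because the swap happens through a type parameter, it is resolved at compile time and there is no dynamic dispatch involved.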

On to testing!

Ok, so, now that we have all of that in place, we're finally ready to write some tests! Well, okay, we have to make something in our test module to build from first, but then we'll get to write some tests!

First, our test module:

#[cfg(test)]
mod tests {
   use std::error::Error;
   use super::{FooClient, HttpSend};
}

Now, we need to make an HttpSend implementation that can be used in our tests. In order to do this, we'll need to add one more dependency. In your Cargo.toml, add this section:

[dev-dependencies]
http = "0.1.13"

This pulls in the http crate, which is used by reqwest, hyper, and other crates, and contains many standard types needed to work with HTTP. We are going to be using it for its http::response::Builder and http::Response types. Why is that? Because for our mock HttpSend implementation, we will need to manually create reqwest::Response objects. The only problem is, there is no way to manually construct reqwest::Response objects! Well, at least, there didn't use to be. Recently, reqwest gained the ability to construct reqwest::Response objects from http::Response objects. And since http::Response objects can be manually constructed, that's what we'll do!

#[cfg(test)]
mod tests {
   use std::{
      cell::RefCell,
      error::Error
   };
   use super::{FooClient, HttpSend};
   use http::response;

   pub struct MockSender(
      RefCell<response::Builder>, 
      &'static str
   );
   impl HttpSend for MockSender {
      fn send(&self, _: reqwest::RequestBuilder) 
         -> Result<reqwest::Response, Box<Error>> 
      {
         let mut builder = self.0.borrow_mut();
         let response = builder.body(self.1)?;
         Ok(response.into())
      }
   }
}

Whew! Okay, so what are we doing here? Since our HttpSend trait takes &self, but our response::Builder methods take &mut self, we need to wrap the Builder in a RefCell so that we can get a mutable reference to it. Normally we might wrap this in a Mutex or something else so that we don't try to mutably borrow it more than once, but in this case we know we will only be using it for the single test case that the MockSender gets created for.
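Here is the RefCell part on its own, as a std-only sketch. Greeter and RecordingGreeter are invented for the illustration; the point is only that a method taking &self can still mutate state that lives inside a RefCell.

```rust
use std::cell::RefCell;

// A trait whose method only gets `&self`, like `HttpSend::send`.
pub trait Greeter {
    fn greet(&self, name: &str) -> String;
}

// The mock wants to record every call, which requires mutation.
pub struct RecordingGreeter {
    seen: RefCell<Vec<String>>,
}

impl RecordingGreeter {
    pub fn new() -> RecordingGreeter {
        RecordingGreeter { seen: RefCell::new(Vec::new()) }
    }

    // Returns a copy of the names seen so far.
    pub fn seen(&self) -> Vec<String> {
        self.seen.borrow().clone()
    }
}

impl Greeter for RecordingGreeter {
    fn greet(&self, name: &str) -> String {
        // The borrow is checked at runtime: this would panic if another
        // borrow of `seen` were still alive here.
        self.seen.borrow_mut().push(name.to_string());
        format!("hello, {}", name)
    }
}
```

A RefCell is not Sync, so a mock like this cannot be shared between threads; that is the situation where you would reach for a Mutex instead.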

But why even put a Builder in the MockSender? Why not just store the http::Response directly? Unfortunately, we cannot because response.into() consumes the response, which we couldn't do if the response was part of MockSender, since MockSender is borrowed inside .send() and we can't move out of it. So instead, we store both the builder and the response body, and then we can use those two pieces to create the http::Response within the .send() method, which means we can move it, consume it, or whatever else we might want to do with it.
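For completeness, there is another way around the "can't move out of borrowed self" problem, which is not what I did here: keep the finished value in a RefCell<Option<..>> and take() it out when it is needed. A std-only sketch, where Response and OneShot are simplified stand-ins rather than the real http or reqwest types:

```rust
use std::cell::RefCell;

// Stand-in for a response type that must be consumed by value.
pub struct Response {
    pub body: String,
}

impl Response {
    // Consumes the response, much like `response.into()` does.
    pub fn into_body(self) -> String {
        self.body
    }
}

pub struct OneShot {
    response: RefCell<Option<Response>>,
}

impl OneShot {
    pub fn new(response: Response) -> OneShot {
        OneShot { response: RefCell::new(Some(response)) }
    }

    // `take()` leaves `None` in the cell and hands back the owned value,
    // so nothing is moved out of the borrowed `self`.
    pub fn send(&self) -> Option<String> {
        self.response.borrow_mut().take().map(Response::into_body)
    }
}
```

The trade-off is right there in the name: this mock answers exactly once, and every later call gets None.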

With this mock sender, we can write a test for the get_widget method:

// src/lib.rs
#[macro_use] extern crate serde_json;

// ...


#[cfg(test)]
mod tests {
   use std::{
      cell::RefCell,
      error::Error
   };
   use super::{FooClient, HttpSend};
   use http::response;

   pub struct MockSender(RefCell<response::Builder>, &'static str);
   impl HttpSend for MockSender {
      fn send(&self, _: reqwest::RequestBuilder) 
         -> Result<reqwest::Response, Box<Error>> 
      {
         let mut builder = self.0.borrow_mut();
         let response = builder.body(self.1)?;
         Ok(response.into())
      }
   }

   #[test]
   fn get_widget() {
      let mut builder = response::Builder::new();
      builder.status(200);
      let body = r#"{
         "id": 42,
         "foo": "bar",
         "baz": "quux"
      }"#;
      let sender = MockSender(RefCell::new(builder), body);
      let client = FooClient::with_sender(sender);

      let result = client.get_widget("42")
         .expect("get_widget() call did not succeed");

      assert_eq!(
         result,
         json!({
            "id": 42,
            "foo": "bar",
            "baz": "quux"
         })
      );
   }
}

(Note that we had to add #[macro_use] extern crate serde_json to the top so we could get the json!() macro)

Nice! Though it seems like a lot of setup for a single call, maybe we could abstract some of that setup away?

   fn client_with_response(status: u16, body: &'static str) 
      -> FooClient<MockSender> 
   {
      let mut builder = response::Builder::new();
      builder.status(status);
      let sender = MockSender(RefCell::new(builder), body);
      FooClient::with_sender(sender)
   }

   #[test]
   fn get_widget() {
      let client = client_with_response(200, r#"{
         "id": 42,
         "foo": "bar",
         "baz": "quux"
      }"#);

      let result = client.get_widget("42")
         .expect("get_widget() call did not succeed");

      assert_eq!(
         result,
         json!({
            "id": 42,
            "foo": "bar",
            "baz": "quux"
         })
      );
   }
}

There we go, that looks a little better.

Now, obviously, we aren't testing much here. All we know is that serde_json is taking that response and correctly converting it to a serde_json::Value, which we don't really need to test ourselves. However, we could change get_widget to return one of our own data structures, which would then test that our deserialization is working correctly. We could use these mocked clients to test higher-level methods, whose business logic combines the use of multiple HTTP calls and transforms the outputs somehow.

We would also want to make a better MockSender for our tests. The one we have here is nice for a few simple tests here and there, but we would probably also want it doing some kind of validation of the RequestBuilder coming in, as well as making it possible to return different responses depending on the incoming request. This would be necessary to test methods that make multiple HTTP calls. There are a lot of improvements we could make, but for this post I just wanted to show a simple example. Expanding it is left as an exercise for the reader ;–) [3]
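As a starting point for that exercise, here is a rough std-only sketch of the shape such a mock could take. To keep it self-contained, Request and Response are simplified stand-ins for the reqwest types, this HttpSend takes the stand-in Request instead of a RequestBuilder, and RoutingMock is a name I made up.

```rust
use std::cell::RefCell;
use std::collections::HashMap;
use std::error::Error;

// Simplified stand-ins for reqwest's request and response types.
#[derive(Debug, Clone, PartialEq)]
pub struct Request {
    pub method: &'static str,
    pub url: String,
}

#[derive(Debug, Clone, PartialEq)]
pub struct Response {
    pub status: u16,
    pub body: String,
}

pub trait HttpSend {
    fn send(&self, request: Request) -> Result<Response, Box<dyn Error>>;
}

// Answers each URL with its own canned response and records what it saw.
pub struct RoutingMock {
    routes: HashMap<String, Response>,
    seen: RefCell<Vec<Request>>,
}

impl RoutingMock {
    pub fn new() -> RoutingMock {
        RoutingMock { routes: HashMap::new(), seen: RefCell::new(Vec::new()) }
    }

    // Registers the response to return for `url`.
    pub fn on(mut self, url: &str, status: u16, body: &str) -> RoutingMock {
        let response = Response { status, body: body.to_string() };
        self.routes.insert(url.to_string(), response);
        self
    }

    // Every request received so far, in order, for later assertions.
    pub fn requests(&self) -> Vec<Request> {
        self.seen.borrow().clone()
    }
}

impl HttpSend for RoutingMock {
    fn send(&self, request: Request) -> Result<Response, Box<dyn Error>> {
        self.seen.borrow_mut().push(request.clone());
        match self.routes.get(&request.url) {
            Some(response) => Ok(response.clone()),
            // An unregistered URL fails the call, flagging unexpected requests.
            None => Err(format!("unexpected request: {}", request.url).into()),
        }
    }
}
```

A test would register one route per expected call, run the method under test, and then assert on requests() to check that the right calls were made in the right order.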

Full source

Finally, here's the full source code to our little project:

// src/lib.rs
#[cfg_attr(test, macro_use)] extern crate serde_json;

use std::error::Error;

use serde_json::Value;
use url::Url;

pub struct FooClient<S: HttpSend=Sender> {
    client: reqwest::Client,
    sender: S,
}

impl FooClient<Sender> {
    pub fn new() -> FooClient<Sender> {
        FooClient {
            client: reqwest::Client::new(),
            sender: Sender,
        }
    }
}

impl<S: HttpSend> FooClient<S> {

    pub fn with_sender(sender: S) -> FooClient<S> {
        FooClient {
            client: reqwest::Client::new(),
            sender: sender,
        }
    }

    pub fn get_widget(&self, id: &str) 
         -> Result<Value, Box<Error>> 
    {
        let url = Url::parse("https://example.com/widget/")?
            .join(id)?;
        let value: Value = self.sender
            .send(self.client.get(url))?
            .json()?;
        Ok(value)
    }
}

pub trait HttpSend {
    fn send(&self, request: reqwest::RequestBuilder) 
        -> Result<reqwest::Response, Box<Error>>;
}

pub struct Sender;
impl HttpSend for Sender {
    fn send(&self, request: reqwest::RequestBuilder) 
         -> Result<reqwest::Response, Box<Error>>
    {
        Ok(request.send()?)
    }
}

#[cfg(test)]
mod tests {
    use super::{FooClient, HttpSend};
    use std::error::Error;
    use std::cell::RefCell;
    use http::response;

    pub struct MockSender(
        RefCell<response::Builder>, 
        &'static str
    );
    impl HttpSend for MockSender {
        fn send(&self, _: reqwest::RequestBuilder)
            -> Result<reqwest::Response, Box<Error>> 
        {
            let mut builder = self.0.borrow_mut();
            let response = builder.body(self.1)?;
            let response = response.into();
            Ok(response)
        }
    }

    fn client_with_response(status: u16, body: &'static str)
         -> FooClient<MockSender>
    {
        let mut builder = response::Builder::new();
        builder.status(status);
        let sender = MockSender(RefCell::new(builder), body);
        FooClient::with_sender(sender)
    }

    #[test]
    fn get_widget() {
        let id = "42";
        let client = client_with_response(200, r#"{
              "id":42,
              "foo":"bar",
              "baz":"quux"
            }"#
        );
        let result = client.get_widget(id).expect("Call failed");
        assert_eq!(
                result,
                json!({
                    "id": 42,
                    "foo": "bar",
                    "baz": "quux"
                })
        );
    }
}

Thanks to @seanmonstar for all his help!

1 One strategy has been https://github.com/leoschwarz/reqwest_mock, which is a great library that works by duplicating reqwest's API. Unfortunately, it has not been able to keep up with the API changes to reqwest and therefore I was unable to use it, which spurred my effort to get the technique in this article working properly.

2 You could also abstract all possible calls to the remote service into a Service layer, defined by a trait, and write tests using a mocked Service. I wanted to test a little closer to the HTTP communication without actually testing the HTTP communication itself, hence this article.

3 Sorry, I know this is the lazy way out

Yesterday, someone I follow on Mastodon talked about how they wrote a quick Python script to count how many people they followed (and how many followed them) from mastodon.social, to see how much their social graph would be affected if they were to block mastodon.social from their instance.

This made me want to do something similar, but I really wanted to see how easy it would be to do it with Rust and elefren. My first attempt failed because of a bug in elefren (which I posted about yesterday, and which prompted a new elefren release). With that fixed, I present the finished script:

extern crate elefren;
extern crate promptly;

use std::error::Error;
use std::io::{self, Write};

use promptly::prompt;
use elefren::prelude::*;

fn main() -> Result<(), Box<Error>> {
    let registration = Registration::new("https://ceilidh.space")
            .client_name("following-count")
            .build()?;
    let url = registration.authorize_url()?;
    println!("Go to this URL in your browser: {}", url);
    io::stdout().flush()?;
    let code: String = prompt("Paste code here");

    let client = registration.complete(&code)?;
    let me = client.verify_credentials()?;

    // this retrieves all the users I'm following
    let following = client.following(&me.id)?
        // and this turns the page of Accounts I get into an 
        // iterator that will take care of all the pagination logic
        // and turns it into a single iterator of Accounts
        .items_iter();

    let mut total = 0;
    let ms = following
        // first, let's go through each Account and see if the user
        // is on mastodon.social
        .filter(|account| {
            // first let's update the total counter
            total += 1;

            // account.acct might be just `username` for local
            // accounts, or `username@domain` for remote names.
            //
            // splitn() gives us an iterator, with 0, 1, or 2 items in it
            let mut parts = account.acct.splitn(2, '@');

            // probably won't happen, but let's guard against it
            // anyway
            if parts.next().is_none() {
                return false;
            }

            // if it's a remote user, then there will be a second item
            // in the iterator
            if let Some(domain) = parts.next() {
                domain == "mastodon.social"
            } else {
                false
            }
        })
        .count();

    println!(
        "You are following {} users on mastodon.social out of {} total",
        ms,
        total
    );

    Ok(())
}

This is ~36 lines of actual code, 34 if you don't count the extern crate declarations. Which is not bad, for a compiled language, right?

Still, we should be able to do better. There are a few aspects of elefren's API that could be shortened to make tasks like this easier.

The Authenticated User

First, let's look at this section:

let client = registration.complete(&code)?;
let me = client.verify_credentials()?;

// this retrieves all the users I'm following
let following = client.following(&me.id)?

Here we see that, in order to pull down the list of follows, we actually need to make 2 calls, one to retrieve the account information for the authenticated user, and another to actually pull down the follows. This “get something for the logged in user” is bound to be a pretty common use-case, so what if we added some methods to make it easier to do this? There are specifically 4 methods that could probably use a complementary method that performs the action using the authenticated user's account id:

fn followers(&self, id: &str) -> Result<Page<Account, H>>
fn following(&self, id: &str) -> Result<Page<Account, H>>
fn reblogged_by(&self, id: &str) -> Result<Page<Account, H>>
fn favourited_by(&self, id: &str) -> Result<Page<Account, H>>

So let's assume we're going to add these 4 methods:

fn follows_me(&self) -> Result<Page<Account, H>>
fn followed_by_me(&self) -> Result<Page<Account, H>>
fn reblogged_by_me(&self) -> Result<Page<Account, H>>
fn favourited_by_me(&self) -> Result<Page<Account, H>>

This gets rid of a line of code from the snippet above:

let client = registration.complete(&code)?;

// this retrieves all the users I'm following
let following = client.followed_by_me()?;
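To make the idea concrete, here is one way such a method could be layered on top of the existing calls. This is only a sketch with stand-in types: Api, FakeApi, ApiResult, and this cut-down Account are hypothetical, and a plain Vec takes the place of elefren's Page.

```rust
use std::error::Error;

pub type ApiResult<T> = Result<T, Box<dyn Error>>;

// Cut-down stand-in for elefren's Account entity.
#[derive(Debug, Clone, PartialEq)]
pub struct Account {
    pub id: String,
    pub acct: String,
}

// Only the two calls the convenience method needs.
pub trait Api {
    fn verify_credentials(&self) -> ApiResult<Account>;
    fn following(&self, id: &str) -> ApiResult<Vec<Account>>;

    // Provided method: look up the authenticated user's id, then reuse
    // the id-taking call. It still performs two requests under the hood.
    fn followed_by_me(&self) -> ApiResult<Vec<Account>> {
        let me = self.verify_credentials()?;
        self.following(&me.id)
    }
}

// A fake backend so the sketch can be exercised without a server.
pub struct FakeApi;
impl Api for FakeApi {
    fn verify_credentials(&self) -> ApiResult<Account> {
        Ok(Account { id: "1".to_string(), acct: "me".to_string() })
    }

    fn following(&self, id: &str) -> ApiResult<Vec<Account>> {
        if id != "1" {
            return Err("unknown account id".into());
        }
        Ok(vec![Account {
            id: "2".to_string(),
            acct: "friend@mastodon.social".to_string(),
        }])
    }
}
```

The saving is in what the caller has to write, not in the number of HTTP requests; caching the authenticated account would be a separate improvement.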

Retrieving OAuth Details

Next, let's take a look at this snippet:

let registration = Registration::new("https://ceilidh.space")
    .client_name("following-count")
    .build()?;
let url = registration.authorize_url()?;
println!("Go to this URL in your browser: {}", url);
io::stdout().flush()?;
let code: String = prompt("Paste code here");

let client = registration.complete(&code)?;

This is ugly. We have helpers for loading & saving OAuth information into various data formats, but we don't have anything to help with the initial task of interactively retrieving this information on the command line. Now, not every application is going to be using the command line for authentication, so anything we add for this probably shouldn't be front-and-center in the API, but we could at least add a helper for it. We could turn the above snippet into something more like the following:

use elefren::helpers::cli;

let registration = Registration::new("https://ceilidh.space")
    .client_name("following-count")
    .build()?;
let client = cli::authenticate(&registration)?;

Which looks much, much better, and saves us another 4 lines of code!
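What might live inside such a helper? Roughly the lines we just removed. Below is a std-only sketch under some loud assumptions: Registration and Client are stand-ins for the elefren types, and unlike the one-argument cli::authenticate imagined above, this version takes the reader and writer as parameters so that it can be tested without a terminal.

```rust
use std::error::Error;
use std::io::{BufRead, Write};

// Hypothetical stand-ins for elefren's registration and client types.
pub struct Registration {
    pub authorize_url: String,
}

#[derive(Debug, PartialEq)]
pub struct Client {
    pub code: String,
}

impl Registration {
    // Pretend OAuth exchange: any non-empty code is accepted.
    pub fn complete(&self, code: &str) -> Result<Client, Box<dyn Error>> {
        if code.is_empty() {
            return Err("empty authorization code".into());
        }
        Ok(Client { code: code.to_string() })
    }
}

// Prints the URL, reads one line, and completes the registration.
pub fn authenticate<R: BufRead, W: Write>(
    registration: &Registration,
    mut input: R,
    mut output: W,
) -> Result<Client, Box<dyn Error>> {
    writeln!(output, "Go to this URL in your browser: {}", registration.authorize_url)?;
    write!(output, "Paste code here: ")?;
    output.flush()?;

    let mut code = String::new();
    input.read_line(&mut code)?;
    // Strip the trailing newline before handing the code over.
    registration.complete(code.trim())
}
```

A command-line program would pass std::io::stdin().lock() and std::io::stdout(), while a test can pass an in-memory Cursor and a Vec<u8>.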

Helper methods on models

The last snippet I want to look at is in the closure that we pass to .filter:

// account.acct might be just `username` for local 
// accounts, or `username@domain` for remote names.
//
// splitn() gives us an iterator, with 0, 1, or 2 items in it
let mut parts = account.acct.splitn(2, '@');

// probably won't happen, but let's guard against it anyway
if parts.next().is_none() {
    return false;
}

// if it's a remote user, then there will be a second item
// in the iterator
if let Some(domain) = parts.next() {

This is 5 lines of code, plus a bunch of comments, just to take the Account::acct string and extract the domain from it. I'm counting the comments here because the code is not terribly self-documenting, and its purpose might not be immediately obvious without them. We can do better.

Right now the elefren::entities::account::Account struct is pretty much just used to hold the deserialized information that is returned from the Mastodon API. But that does not mean it has to stay an inert container of fields, right? Getting the domain for a specific account is bound to be something other users might want to do, so let's make it into a helper method:

impl Account {
    // it's `Option<String>` because a local account won't
    // have a domain as part of the `acct` string
    fn domain(&self) -> Option<String> {
        // pretty much the same logic from the above snippet
    }
}
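Filled in, the helper could look something like this. It is a sketch against a stand-in Account that has only the acct field (the real struct has many more), and count_on is a made-up function that shows the helper being used in a filter.

```rust
// Stand-in with only the field the helper needs.
pub struct Account {
    pub acct: String,
}

impl Account {
    // `None` for local accounts (`username`), `Some(domain)` for remote
    // ones (`username@domain`).
    pub fn domain(&self) -> Option<String> {
        let mut parts = self.acct.splitn(2, '@');
        parts.next()?; // skip the username part
        parts.next().map(|domain| domain.to_string())
    }
}

// Counts the accounts that live on the given domain.
pub fn count_on(accounts: &[Account], domain: &str) -> usize {
    accounts
        .iter()
        .filter(|account| account.domain().as_deref() == Some(domain))
        .count()
}
```

Returning Option<&str> instead would avoid allocating a String on every call; I kept Option<String> here to match the signature above.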

Now, in our filter() closure, we can just do this:

.filter(|account| {
    // first let's update the total counter
    total += 1;

    // account.acct might be just `username` for local
    // accounts, or `username@domain` for remote
    // names.
    if let Some(ref domain) = account.domain() {
        domain == "mastodon.social"
    } else {
        false
    }
})

Which eliminates another 4 lines of code!

Result

So with all these changes, what would the full example look like? Well, like this:

extern crate elefren;
extern crate promptly;

use std::error::Error;

use promptly::prompt;
use elefren::prelude::*;
use elefren::helpers::cli;

fn main() -> Result<(), Box<Error>> {
    let registration = Registration::new("https://ceilidh.space")
            .client_name("following-count")
            .build()?;
    let client = cli::authenticate(&registration)?;
    // this retrieves all the users I'm following
    let following = client.followed_by_me()?
        // and this turns the page of Accounts I get into an
        // iterator that will take care of all the pagination
        // logic and turns it into a single iterator of Accounts
        .items_iter();

    let mut total = 0;
    let ms = following
        // first, let's go through each Account and see if the
        // user is on mastodon.social
        .filter(|account| {
            // first let's update the total counter
            total += 1;

            // account.acct might be just `username` for local
            // accounts, or `username@domain` for remote
            // names.
            if let Some(ref domain) = account.domain() {
                domain == "mastodon.social"
            } else {
                false
            }
        })
        .count();

    println!(
        "You are following {} users on mastodon.social out of {} total",
        ms,
        total
    );

    Ok(())
}

Our final script is down to 26 lines! Still not as small as if we were using Ruby or Python, but for a compiled language I think that's pretty good.

My Social Graph

So what did this script compute for me?

You are following 216 users on mastodon.social out of 877 total

Yikes, that's a quarter of the people I follow!

Today, hot on the heels of v0.13.0, I've released elefren 0.14.0. I normally would have waited longer between releases, but today I discovered a bug in elefren & mammut that caused a runtime failure when 3 specific API calls were made.

The Bug

I noticed the bug when I went to pull down a list of the people I was following. I was doing something like:

let client = Mastodon::from(data);
client.following()?;

I saw following in the docs and just assumed it would pull down the following list for the authenticated user. So I was surprised when I got an error response from the server! Since there really isn't much to this call, or any way to configure the request, I figured maybe the API was out-of-date, or there was something wrong with my server. When I looked in the code, I found out that it was trying to call this API endpoint: /api/v1/accounts/{}/following. Did you notice the {} in there? This is a placeholder that normally would get filled in with a call to format!, so maybe the method is retrieving the account id of the authenticated user and using that to fill in the account id parameter?

Turns out, nope, it just takes that string, verbatim, and makes an HTTP call to the server. Of course, there is no account with ID {}, so the call fails.
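In miniature, the difference between what happened and what should have happened looks like this. Both functions are made up for the illustration and are not elefren code; ROUTE is just the template string from the endpoint above.

```rust
// The route template, with its unfilled placeholder.
const ROUTE: &str = "accounts/{}/following";

// What the buggy method effectively did: use the template verbatim.
// Braces inside an *argument* to format! are not treated as placeholders.
pub fn buggy_path() -> String {
    format!("/api/v1/{}", ROUTE)
}

// What it should do: substitute the account id for the placeholder.
pub fn fixed_path(id: &str) -> String {
    format!("/api/v1/accounts/{}/following", id)
}
```

Nothing about the first version is a compile-time error, which is why the problem only showed up as an error response from the server.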

Digging Deeper

As it turns out, there was a small bug in the macro that was generating these methods. The macro call is here:

paged_routes_with_id! {
    (get) followers: "accounts/{}/followers" => Account,
    (get) following: "accounts/{}/following" => Account,
    (get) reblogged_by: "statuses/{}/reblogged_by" => Account,
    (get) favourited_by: "statuses/{}/favourited_by" => Account,
}

As you can see, we are generating 4 methods with this macro call. The macro definition is supposed to generate methods that look like this:

pub fn followers(&self, id: &str) -> Result<Page<Account, H>> {
    // implementation here
}

In fact, for followers it did generate that. However, the other 3 methods ended up like this:

pub fn following(&self) -> Result<Account> {
    // implementation here
}

This is obviously incorrect, and is the reason that client.following() compiles fine, when the method should have been requiring an id: &str.

The Solution

The solution eventually presented itself as I was staring at the macro definition:

macro_rules! paged_routes_with_id {

    (($method:ident) $name:ident: $url:expr => $ret:ty, $($rest:tt)*) => {
        fn $name(&self, id: &str) -> Result<Page<$ret, H>> {
            // implementation
        }

        route!{$($rest)*}
    }
}

Do you see it? The match arm pulls out the first (get) followers: "accounts/{}/followers" => Account pattern, matching the rest of the endpoints with the $rest:tt pattern, then generates fn followers, then calls another macro to expand the $rest. However, it does not call paged_routes_with_id!{$($rest)*}, it calls route!{$($rest)*}. route is another macro that is used to generate methods, but it generates them as, you guessed it, fn method_name(&self) -> Result<Entity>, which is exactly what we were seeing with the following method.

So after changing route! to paged_routes_with_id! at the end of that macro expansion, and adding an empty match arm, the problem was fixed and the methods started working 👍.
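Here is the shape of the fixed macro as a self-contained sketch. It is not the real elefren macro: Client is a stand-in, the (get) and => Account parts of the syntax are dropped, and the generated methods just build the path as a String instead of making an HTTP call.

```rust
pub struct Client;

// Recursive macro: peel off one route, generate its method, then recurse
// on the remainder with the *same* macro.
macro_rules! routes_with_id {
    // Base case: nothing left to generate. Without this arm, the final
    // recursive call (with no tokens) would fail to match.
    () => {};

    ($name:ident: $url:expr, $($rest:tt)*) => {
        pub fn $name(&self, id: &str) -> String {
            // Stand-in for the real request: fill in the placeholder.
            format!("/api/v1/{}", $url.replace("{}", id))
        }

        routes_with_id! { $($rest)* }
    };
}

impl Client {
    routes_with_id! {
        followers: "accounts/{}/followers",
        following: "accounts/{}/following",
        reblogged_by: "statuses/{}/reblogged_by",
    }
}
```

With the recursion pointing back at the same macro, every generated method takes an id, not just the first one.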

So Why 0.14.0 and not 0.13.1?

After I fixed this bug and opened a PR for it, I realized I was going to have to release this as 0.14.0 and not 0.13.1. Rust crates usually follow strict semver when it comes to versioning, and elefren is no different. Even though this was a bugfix, it did result in a change to the public API. It may not matter too much, since anyone that was relying on the old API for these methods was relying on broken code, but still, the public API changed. As much as it pained me to do it, I released it as 0.14.0.

If you've gotten this far, thanks for reading, and be sure to follow me @balrogboogie@ceilidh.space!