Accurate test data in FastAPI

Do you write tests for your FastAPI server? Do you find it annoying to populate the database with test data? Follow along, and this article will give you some tips on loading test data from YAML files with type checking and cross-table references (i.e. foreign keys).

A full, working code example for this article is available at github.com/Davepar/example-for-article.

FastAPI and SQLModel

FastAPI is a wonderful Python framework for easily creating a fast, intuitive, robust API. And SQLModel is a handy add-on library that simplifies creating models and querying a SQL database.

So what's the problem?

The FastAPI documentation outlines how to write tests, and the SQLModel documentation shows how to use fixtures to create a fresh SQLite in-memory database for each test. But the example of loading test data is rather laborious:

  hero_deadpond = Hero(name="Deadpond", secret_name="Dive Wilson")
  hero_rusty_man = Hero(name="Rusty-Man", secret_name="Tommy Sharp", age=48)
  team_preventers = Team(name="Preventers", headquarters="Sharp Tower")
  team_z_force = Team(name="Z-Force", headquarters="Sister Margaret’s Bar")
  session.add(hero_deadpond)
  session.add(hero_rusty_man)
  session.add(team_preventers)
  session.add(team_z_force)
  session.commit()

There must be an easier way, and there is!

YAML files

The YAML format is a very easy way to specify test data. The test data above can be split into two files.

hero.yaml:

- name: Deadpond
  secret_name: Dive Wilson
- name: Rusty-Man
  secret_name: Tommy Sharp
  age: 48

And team.yaml:

- name: Preventers
  headquarters: Sharp Tower
- name: Z-Force
  headquarters: Sister Margaret’s Bar
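
As a quick check of what the loader will receive, the following sketch parses the hero data from an inline string rather than a file. It assumes the PyYAML package is installed; HERO_YAML and heroes are names made up for this illustration.

```python
import yaml  # PyYAML, assumed to be installed

# Same content as hero.yaml, inlined so the example is self-contained
HERO_YAML = """
- name: Deadpond
  secret_name: Dive Wilson
- name: Rusty-Man
  secret_name: Tommy Sharp
  age: 48
"""

# safe_load turns the YAML list into a list of plain dicts
heroes = yaml.safe_load(HERO_YAML)

for hero in heroes:
    print(hero)
```

Each list item becomes a dict, and the unquoted 48 is parsed as an integer, so the entries can be handed to a model as they are.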

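Now we create a couple of functions for reading the data:

import yaml
from sqlmodel import Session

from app import models  # import path is an assumption; adjust to your project

def _load_table(session: Session, model, filename: str):
    # Read the list of records from app/test_data/<filename>.yaml
    with open(f"app/test_data/{filename}.yaml") as f:
        data = yaml.safe_load(f)
    for entry in data:
        # validate() runs Pydantic's checks on the raw dict
        # (newer SQLModel releases call this model_validate)
        obj = model.validate(entry)
        session.add(obj)

def load_all(session: Session):
    _load_table(session, models.Hero, "hero")
    _load_table(session, models.Team, "team")
    session.commit()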

And call it at the top of each test:

def test_get_heroes(session: Session, client: TestClient):
    load_all(session)
    # The rest of the test code...

Let's double check the test data

One great part about Pydantic (which is used by SQLModel) is that it will complain if your test data is not the correct type, or is missing a required field.

For example, suppose one hero's entry has a string in the age field instead of an integer:

- name: Rusty-Man
  secret_name: Tommy Sharp
  age: forty-nine

We'll see an error during testing:

E  pydantic.error_wrappers.ValidationError: 1 validation error for Hero
E  age
E    value is not a valid integer (type=type_error.integer)

That will be very handy, because there's nothing worse than a test (or in this case test data) that's faulty. The error even tells you exactly which field is causing the problem, "age".

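By default, however, Pydantic silently ignores unknown fields, so a misspelled field name goes unnoticed. You can make it reject them by adding a simple Config class to each model:

from typing import Optional

from pydantic import Extra
from sqlmodel import Field, SQLModel

class Hero(SQLModel, table=True):
    id: Optional[int] = Field(default=None, primary_key=True)
    # other fields omitted

    class Config:
        # Raise a validation error for any key that is not a declared field
        extra = Extra.forbid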
Now if we accidentally misspell the name of an optional field, such as afe instead of age, we'll also get an error for data like this:

- name: Rusty-Man
  secret_name: Tommy Sharp
  afe: 49
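
The effect can be tried without a database. This is only a sketch: HeroData is a hypothetical plain Pydantic model standing in for the SQLModel table class, and bad_fields is a helper invented for the example. The string "forbid" is used in place of Extra.forbid, which Pydantic accepts as an equivalent setting.

```python
from typing import Any, Dict, List, Optional

from pydantic import BaseModel, ValidationError


class HeroData(BaseModel):
    """Hypothetical stand-in for the Hero table model."""

    name: str
    secret_name: str
    age: Optional[int] = None

    class Config:
        extra = "forbid"  # reject keys that are not declared above


def bad_fields(entry: Dict[str, Any]) -> List[str]:
    """Return the names of the fields that fail validation, if any."""
    try:
        HeroData(**entry)
    except ValidationError as err:
        return [str(error["loc"][0]) for error in err.errors()]
    return []


# A misspelled key, a wrongly typed value, and a valid entry
print(bad_fields({"name": "Rusty-Man", "secret_name": "Tommy Sharp", "afe": 49}))
print(bad_fields({"name": "Rusty-Man", "secret_name": "Tommy Sharp", "age": "forty-nine"}))
print(bad_fields({"name": "Rusty-Man", "secret_name": "Tommy Sharp", "age": 49}))
```

The first call reports afe, the second reports age, and the third reports nothing, which matches the two kinds of mistakes described in this section.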

Linking records

Now that we can write our test data easily, wouldn't it also be nice to easily link records between tables? Imagine we have a HeroTeamLink model that implements a many-to-many relationship as outlined in the SQLModel documentation. Specifying hero and team ids will work, but is really tedious and error-prone:

- team_id: 1
  hero_id: 1
- team_id: 2
  hero_id: 1
- team_id: 1
  hero_id: 2

With a little custom data loading logic, we can link records together using the "name" field:

- team__team__name: Z-Force
  hero__hero__name: Deadpond
- team__team__name: Preventers
  hero__hero__name: Deadpond
- team__team__name: Preventers
  hero__hero__name: Rusty-Man

The updated _load_table function does two things. First, for every key ending in "__name", it looks up the referenced record and assigns it to the relationship field. Second, after validation, each record that has a "name" field is stored in a dictionary keyed by its model name ("hero" or "team") and its name value (e.g. "Deadpond"), so that tables loaded later can refer to it. (The load_all function is the same.)

The scheme for the YAML keys is arbitrary: double underscores separate the relation key (the name of the "Relationship" field in the model) from the model name and the key field of the other table. For example, "team__team__name: Z-Force" means: set the team relationship to the team record whose name is Z-Force.

from typing import Any, Dict

import yaml
from sqlmodel import Session

LINK_KEY = "name"
# Loaded records, keyed by "<model_name>__<name value>"
link_data: Dict[str, Any] = {}

def _load_table(session: Session, model, model_name: str):
    with open(f"app/test_data/{model_name}.yaml") as f:
        data = yaml.safe_load(f)
    for entry in data:
        # Copy the keys, since the loop below modifies the entry
        entry_keys = list(entry.keys())
        for key in entry_keys:
            if key.endswith(f"__{LINK_KEY}"):
                (relation_key, model_name_key, _) = key.split("__")
                # Replace the link key with the previously loaded record
                entry[relation_key] = link_data[f"{model_name_key}__{entry[key]}"]
                # Delete the link key to avoid warnings about extra keys
                del entry[key]
        obj = model.validate(entry)
        # Remember named records so later tables can link to them
        if LINK_KEY in entry_keys:
            link_data[f"{model_name}__{entry[LINK_KEY]}"] = obj
        session.add(obj)
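
To make the key scheme concrete, the short sketch below traces one link key through the same string operations that _load_table uses. The function name split_link_key is invented for this illustration.

```python
from typing import Tuple

LINK_KEY = "name"


def split_link_key(key: str, value: str) -> Tuple[str, str]:
    """Return (relationship field, link_data lookup key) for one YAML entry."""
    relation_key, model_name_key, _ = key.split("__")
    return relation_key, f"{model_name_key}__{value}"


relation, lookup = split_link_key("team__team__name", "Z-Force")
print(relation)  # team
print(lookup)    # team__Z-Force
```

A relation key that itself contains a double underscore would split into more than three parts and fail to unpack, so the scheme assumes simple field names.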

One thing to note is that when linking records, the target records need to be loaded first, e.g. heroes and teams before the link records.
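
If a table is loaded out of order, the dictionary lookup in _load_table fails with a bare KeyError. One possible refinement, sketched here with plain dicts in place of model objects, is to raise a more descriptive error. The function resolve_links and its message are assumptions for illustration, not part of the original code.

```python
from typing import Any, Dict

LINK_KEY = "name"


def resolve_links(entry: Dict[str, Any], link_data: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of entry with each link key replaced by its target record."""
    resolved: Dict[str, Any] = {}
    for key, value in entry.items():
        if not key.endswith(f"__{LINK_KEY}"):
            # Ordinary field: copy it through unchanged
            resolved[key] = value
            continue
        relation_key, model_name_key, _ = key.split("__")
        lookup = f"{model_name_key}__{value}"
        if lookup not in link_data:
            raise LookupError(
                f"no {model_name_key} named {value!r} has been loaded yet; "
                f"load the {model_name_key} table first"
            )
        resolved[relation_key] = link_data[lookup]
    return resolved


# Plain dicts stand in for the Hero and Team objects
loaded = {"hero__Deadpond": {"id": 1}, "team__Z-Force": {"id": 2}}
link = {"team__team__name": "Z-Force", "hero__hero__name": "Deadpond"}
print(resolve_links(link, loaded))

try:
    resolve_links(link, {})  # nothing loaded yet
except LookupError as err:
    print(err)
```

Unlike the original loop, this version builds a new dict instead of modifying the entry in place, which keeps the raw YAML data intact for error messages.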

Conclusion

Now you can write test data easily in the YAML format, check that the data is typed correctly, and link records between tables with ease!

Check out the full, working example at: github.com/Davepar/example-for-article