Modern (Go) application design

When it comes to application design, I’ve formed a few opinions backed by experience. The most important one is: structure matters. In my first years of development, I built a CMS that was copied over more than 100 times for different websites. You don’t get there unless you repeat the same process over and over.

Application development is like that. If you’re writing one middleware, you want the process to be repeatable for each following middleware.

The more people work on a project, the more consistent you want the code base to be. Principles like SOLID or DDD give you a repeatable structuring model. Extending your application with a new service or component encourages composition and locality of behaviour, while adding new bounded scopes for testing.

Smaller package scopes lead to better tests, which lead to fewer bugs overall.

There are two ways to think about this, or rather two principles that apply to application development leaning into composability:

  1. Use cases for the application
  2. Data model first principles

Applications range from CLI tooling to services of various sizes providing APIs: REST, gRPC, or a package API. A web application may use templating to render a data model into other representations aimed at browsers. There’s variety, and the use cases largely dictate the top-level components.

Use cases dictate structure

Let’s take a look at an example familiar to most: git. The tool provides a set of subcommands that developers use daily.

  flowchart TD
    git --> status
    git --> clone
    git --> pull
    git --> commit
    git --> push

Git can also be extended: if you provide a git-st binary on the system, the git command will execute that binary when git st is invoked. I’ve used this in situations where multiple repositories are composed into a single application, and I had to work on multiple source trees at the same time.
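
To illustrate the mechanism, here’s a minimal Go sketch of that style of dispatch: resolve an external binary named after the subcommand and hand it the standard streams. This is not git’s own implementation, just an assumption of how a similar tool could do it, using only the os and os/exec packages.

// runExternal dispatches an unknown subcommand, e.g. "st", to an external
// binary named "git-st" found on PATH, wiring through the standard streams.
func runExternal(name string, args []string) error {
	bin, err := exec.LookPath("git-" + name)
	if err != nil {
		return fmt.Errorf("unknown subcommand %q: %w", name, err)
	}
	cmd := exec.Command(bin, args...)
	cmd.Stdin, cmd.Stdout, cmd.Stderr = os.Stdin, os.Stdout, os.Stderr
	return cmd.Run()
}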

Sometimes, the data model is more integrated into the CLI, giving you a hint at additional structures. An example of that would be the docker compose config command, which makes config a primary component of the tool. The command evaluates the dynamic parts of the configuration and prints the resolved result, including things like interpolated environment variables.

If someone comes along and sees a component like that, they are well equipped to integrate against it, and use it for new purposes, extending the original tooling without tight integrations.

The practice of introducing barriers between components makes software more reliable. Beyond the development benefits (TTV), well-defined components are even more valuable in terms of which changes can be made safely, relying on type safety to provide that security.

  • Reduction in on-call to practically zero, as whole classes of bugs disappear
  • Safer development of new features, APIs and components
  • A set of common patterns for development from end to end

Terminal CLI apps usually follow similar interface conventions, so you’d expect terraform apply, terraform sync and similar invocations of various CLI tools to structure their code according to that interface. Maybe I want to implement terraform stats, and there should be a clear-cut way to begin; the sketch below shows one option.
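
One way to get that clear-cut starting point is to register subcommands in a single place, so a new command is one map entry plus one function. This is a minimal sketch, not taken from terraform; the command names and run functions are hypothetical.

// commands maps a subcommand name to its implementation; adding something
// like "stats" means adding one entry and one function.
var commands = map[string]func(args []string) error{
	"apply": runApply,
	"stats": runStats,
}

func run(args []string) error {
	if len(args) == 0 {
		return errors.New("usage: tool <command> [args]")
	}
	cmd, ok := commands[args[0]]
	if !ok {
		return fmt.Errorf("unknown command %q", args[0])
	}
	return cmd(args[1:])
}

func runApply(args []string) error { return nil } // placeholder
func runStats(args []string) error { return nil } // placeholder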

Data model first principles

When it comes to application development with SQL, there’s a good chance that what people describe as domain driven design (DDD) is really just the work of a database architect.

The top level entities I’d expect to see in any database driven application are:

  • user - providing user registration, login
  • profile - providing public data for a user
  • session - authenticating user requests with a session ID
  • user_group - define user groups
  • user_group_member - link users to user groups
  • user_group_rule - link access controls and permissions to user groups

These are typical, common, repetitive business entities around which services can be componentized. Whether OAuth2 or some other authentication is in use is an implementation detail, another component of the composed system.
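
As a sketch, these entities map naturally onto Go model types; the field names below are assumptions, mirroring the table columns.

// Minimal model types for a few of the core entities; fields are illustrative only.
type User struct {
	ID    string
	Email string
}

type UserGroup struct {
	ID   string
	Name string
}

type UserGroupMember struct {
	GroupID string
	UserID  string
}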

If you wanted to build a customer-relationship management software, this gets extended again with domain specific tables and schema. You may have a customer table, mail_list table, mail_list_members. All of these things are additional data models which come from the business domain.

Personally, I use a pet project of mine to take this known data model and generate Go code for it, similar to how protobuf / gRPC generates the data model from the .proto service definitions. I’d reach for go-bridget/mig to map the SQL schema to code, and buf.build to generate the gRPC models.

This gives me:

  • A source of truth for the API definitions (.proto + buf)
  • A source of truth for the repository / storage layer (.sql + mig)

This schema-first approach keeps a clean data model, against which one or multiple services can be written. The components written usually form a repetitive CRUD (create-read-update-delete) pattern, which generally maps better to RPC than to REST.

The internal data access layer can be expressed with a generic interface that the repository for each table should meet:

type Repository[T any] interface {
	Create(ctx context.Context, v T) error
	Update(ctx context.Context, v T) error
	Get(ctx context.Context, id string) (T, error)
	Select(ctx context.Context, query string, args ...any) ([]T, error)
}
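
As a sketch of the read half of that interface, a generic repository backed by sqlx could look like the following; the table field and the use of select * are assumptions for illustration, and Create/Update would map struct fields to insert/update statements in the same spirit.

// sqlRepository is a partial sketch of Repository[T] backed by sqlx,
// covering only the read methods; the write methods would follow the same shape.
type sqlRepository[T any] struct {
	db    *sqlx.DB
	table string
}

func (r *sqlRepository[T]) Get(ctx context.Context, id string) (T, error) {
	var out T
	err := r.db.GetContext(ctx, &out, "select * from "+r.table+" where id=?", id)
	return out, err
}

func (r *sqlRepository[T]) Select(ctx context.Context, query string, args ...any) ([]T, error) {
	var out []T
	err := r.db.SelectContext(ctx, &out, "select * from "+r.table+" where "+query, args...)
	return out, err
}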

A more complex repository would compose multiple repositories or implement functions specific to that repository. This is generally called an aggregate:

type UserGroupAggregate struct {
	users   Repository[*model.User]
	groups  Repository[*model.UserGroup]
	members Repository[*model.UserGroupMember]
	rules   Repository[*model.UserGroupRule]
}

func (r *UserGroupAggregate) IsMember(ctx context.Context, userID string, groupID string) (bool, error) {
	if _, err := r.groups.Get(ctx, groupID); err != nil {
		return false, err
	}
	if _, err := r.users.Get(ctx, userID); err != nil {
		return false, err
	}
	m, err := r.GetMember(ctx, groupID, userID)
	return len(m) > 0, err
}

func (r *UserGroupAggregate) GetMember(ctx context.Context, groupID string, userID string) ([]*model.UserGroupMember, error) {
	return r.members.Select(ctx, "user_id=? and group_id=?", userID, groupID)
}

func (r *UserGroupAggregate) GetMembers(ctx context.Context, groupID string) ([]*model.UserGroupMember, error) {
	return r.members.Select(ctx, "group_id=?", groupID)
}

This handles the surface level operations that can be invoked on a table. Separating the implementation for write and read operations can be classified as following the CQRS pattern.
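
In code, that separation can be made explicit by splitting the generic interface into a query side and a command side; the interface names here are illustrative, Repository[T] above being the combination of the two.

// Reader covers the query side, Writer the command side of CQRS.
type Reader[T any] interface {
	Get(ctx context.Context, id string) (T, error)
	Select(ctx context.Context, query string, args ...any) ([]T, error)
}

type Writer[T any] interface {
	Create(ctx context.Context, v T) error
	Update(ctx context.Context, v T) error
}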

Reads can typically be scaled independently of writes with approaches like in-memory caches or Redis, while database writes are harder to scale due to consistency guarantees and indexing.

Using the Repository interface keeps the choice of the underlying storage driver behind each repository. If we wrote raw SQL and added an *sqlx.DB to the struct, the storage for all three areas would have to be available on the same connection. In practical use, the user groups, members and permissions may be cached by the application, avoiding a storage hit altogether.
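
A cached read path can be expressed as a decorator around the same interface; this is a minimal sketch and leaves out expiry and invalidation.

// cachedGroups serves Get from memory and falls back to the wrapped
// repository on a miss.
type cachedGroups struct {
	inner Repository[*model.UserGroup]
	mu    sync.RWMutex
	byID  map[string]*model.UserGroup
}

func (c *cachedGroups) Get(ctx context.Context, id string) (*model.UserGroup, error) {
	c.mu.RLock()
	g, ok := c.byID[id]
	c.mu.RUnlock()
	if ok {
		return g, nil
	}
	g, err := c.inner.Get(ctx, id)
	if err != nil {
		return nil, err
	}
	c.mu.Lock()
	c.byID[id] = g
	c.mu.Unlock()
	return g, nil
}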

The IsMember function also carries a certain amount of validation:

  1. Does the group exist,
  2. Does the user exist,
  3. Is the user a member of the group.

The smaller GetMember function can be considered the data access layer for this particular business logic call. It’s also relatively easy to replace the implementation of GetMember, since all it exposes on the surface are the known data model types in the response.

In practice there are some additional considerations for handling writes, like performing them in a database transaction. The important thing to note is that the business layer should not be aware of underlying storage driver types like db.Tx, db.Conn and so on. These are all implementation details of a repository, and should not be exposed as coupling through function arguments or otherwise.
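
A sketch of what that looks like in practice: the transaction begins and ends inside the storage method, and the business layer only ever sees the method itself. The table and column names are assumptions.

// userGroupStore hides the database handle and any transactions behind its methods.
type userGroupStore struct {
	db *sql.DB
}

// AddMember performs its writes inside a transaction; *sql.Tx never crosses
// the method boundary, so the business layer stays unaware of it.
func (s *userGroupStore) AddMember(ctx context.Context, groupID, userID string) error {
	tx, err := s.db.BeginTx(ctx, nil)
	if err != nil {
		return err
	}
	defer tx.Rollback() // a no-op once the transaction has been committed

	if _, err := tx.ExecContext(ctx,
		"insert into user_group_member (group_id, user_id) values (?, ?)",
		groupID, userID,
	); err != nil {
		return err
	}
	return tx.Commit()
}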

Data model separation benefits

Ok, let’s assume you’re further down the implementation journey. You’ve structured your services into packages, your data model is separated, and new code can be written that uses and extends it.

You decide to write a web application. A common pattern is to take the business-layer data model and write a template that renders it into the desired view. What you then need is a controller that moves data from the storage layer into the business layer, completing the familiar MVC pattern: model, view, controller.

In the case of a REST API or similar, the controller is an http.Handler that translates business logic into the required output format. To summarize, when it comes to the data model we always consider the same path:

  1. Read data from storage layer into the business layer,
  2. Read data from the business layer to a view

When you consider the flow of data, the REST API receives and uses the same components that the templated view would. The data gets translated on each layer, adding or omitting details as needed.
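
As a sketch, a controller for such an API is just an http.Handler that asks the business layer for data and renders it; the ArticleGetter interface, model.Article and the id query parameter are assumptions for illustration.

// ArticleGetter is the slice of the business layer this controller needs.
type ArticleGetter interface {
	Get(ctx context.Context, id string) (*model.Article, error)
}

type ArticleHandler struct {
	articles ArticleGetter
}

// ServeHTTP retrieves the article from the business layer and renders it as JSON.
func (h *ArticleHandler) ServeHTTP(w http.ResponseWriter, r *http.Request) {
	a, err := h.articles.Get(r.Context(), r.URL.Query().Get("id"))
	if err != nil {
		http.Error(w, err.Error(), http.StatusInternalServerError)
		return
	}
	w.Header().Set("Content-Type", "application/json")
	_ = json.NewEncoder(w).Encode(a)
}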

If we take a simple microblogging platform to the extreme, we want an article table that carries a user_id field. The data model for each layer can be unique, for example:

  • storage layer contains the user_id field,
  • business layer uses *model.User, going from id to a full object,
  • the view layer omits most detail and may just use user_id and user_name.

Since the user model may contain sensitive information like an email address, the view has to take only the data required and limit field usage to a separate view-specific model, as sketched below.
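
A sketch of those three representations side by side; the field names are assumptions following the example above.

// Storage layer: mirrors the table, carries only the foreign key.
type ArticleRecord struct {
	ID     string
	UserID string
	Body   string
}

// Business layer: the user_id is resolved into the full user object.
type Article struct {
	ID   string
	User *model.User
	Body string
}

// View layer: only what the template or API response is allowed to see.
type ArticleView struct {
	ID       string
	UserID   string
	UserName string
	Body     string
}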

Whether you use the MVC pattern or write gRPC services and REST APIs, these components of your system always follow the three-package rule:

  1. model (reusable, build things on top)
  2. service (replicable, new things of the same flavour)
  3. storage (interface for the model “at rest”)

These things are then integrated into a transport:

  • gRPC
  • REST APIs
  • Websocket APIs, SSE
  • CLI interfaces

I’m basically saying that as long as your database model is isolated and accessible, it’s always possible to go from it to a new component or to satisfy new transport requirements.

While the components can be laid out in a single package with conventions, it’s more appropriate to put them into standalone packages that follow the patterns explained above, reaching a higher level of stability, testability and consistency.

Following DDD principles on top of this allows you to maintain a glossary; it’s easier to discuss the business logic when this glossary is aligned between the business stakeholders and the engineering teams.

An example of such a glossary is published by Docker. It defines concepts that have meaningful mappings to their offering, providing a common interpretation of image, container, layer and other terms specific to adopting and using docker.

Conclusion

Good software isn’t fool’s gold. Just like work itself, where you’d break up a big task into smaller ones, you’d ideally have software that breaks itself up into sets of smaller components.

Keeping structures flat and simple is similar to factories: manufacturing has made incredible progress by adopting repeatable processes. Software engineering, when done right, takes a similar approach to dividing and structuring a code base. The http.Handler abstraction is able to provide new abstractions; for example, a middleware may be defined as:

type MiddlewareFunc func(http.Handler) http.Handler

I’ve lifted this definition from gorilla/mux. By following this abstraction, everyone is able to create a new middleware, put it into a package, and set up tests that cover the required behaviour. And thus, gorilla/handlers was born. The implementation of those handlers is completely separated from gorilla/mux, but seamlessly integrated using composition.
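
As a sketch, a logging middleware that satisfies this signature looks like the following; the logging itself is just an example behaviour.

// Logging wraps any http.Handler and logs the request before passing it on.
func Logging(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		log.Printf("%s %s", r.Method, r.URL.Path)
		next.ServeHTTP(w, r)
	})
}

With gorilla/mux it plugs in through Router.Use; with the standard library it’s just Logging(yourHandler).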

When it comes to the smallest applications (microservices), they generally follow the same pattern for building them, sketched below:

  1. get config/credentials from env,
  2. get storage connection,
  3. update or query data with a well known data model,
  4. render the response
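
A minimal sketch of those four steps in a single main; the environment variable name, the MySQL driver choice and the /healthz endpoint are assumptions, and a real service would register handlers that query and render the actual data model.

func main() {
	// 1. config/credentials from env
	dsn := os.Getenv("DATABASE_DSN")

	// 2. storage connection
	db, err := sql.Open("mysql", dsn)
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// 3. + 4. handlers query the data model and render the response;
	// a health check stands in for the real endpoints in this sketch.
	http.HandleFunc("/healthz", func(w http.ResponseWriter, r *http.Request) {
		if err := db.PingContext(r.Context()); err != nil {
			http.Error(w, err.Error(), http.StatusServiceUnavailable)
			return
		}
		w.WriteHeader(http.StatusOK)
	})
	log.Fatal(http.ListenAndServe(":8080", nil))
}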

In my experience there’s always some requirement that goes beyond the initial architecture. The consideration that should always be made is: how can I make a thousand of these, while keeping things neatly sorted away?

How can you make this process repeatable?