A straightforward guide to Replicache
Published on • 11 min read
Ever felt the magic of apps like Linear? Instant changes, real-time sync, and no loading spinners. These are just some of the benefits a local-first architecture provides. Not to mention the improved developer experience once the foundations are in place. One of the latest products I’m building presented the perfect opportunity to give this a shot - so I did a deep dive into this new approach using a platform called Replicache.
It took a bit to get my head around initially, and I found most of the examples provided hard to translate into a more traditional SaaS application. Chat rooms are great at showcasing the concept, but how does that apply to multi-tenant business workflow apps? So I’ve put together the guide I would’ve found useful if I were starting again.
What we’ll be building
For this example, we’ll keep it simple and build a project management app. The basic data structure will be:
- A workspace, which could represent a company or organisation.
- Users, who belong to a workspace.
- Projects, which also belong to a workspace.
- Tasks, which belong to a project and a workspace.
The key is for each workspace’s data to remain private from other workspaces. We don’t want to accidentally expose all these important projects and tasks to users who shouldn’t see them! We also want to ensure the users of a workspace always have the latest data on their local device, so syncing with a “partition” between each workspace will be required.
Our tech stack will be Next.js and React with TypeScript, Supabase for authentication, real-time broadcasts and the database, Drizzle for the ORM, and Legend for state management. But any auth provider, Postgres database, or similar tech stack should work. An in-depth guide to setting these up is beyond the scope of this article, so be sure to check out some of the great resources available if you have any specific issues with them.
It’s also worth mentioning that not all applications are best suited to a local-first approach. If you’re dealing with a lot of data, or user experience isn’t a competitive advantage (think internal business tools), then this might not be the right path to take. But if it is, then great!
How does Replicache work?
Before we get started, let’s take a second to understand the architecture at a high level. There are a couple of ways to set things up, but for our multi-tenant application, we’ll be focusing on the per-space version strategy. Once you understand this though, the same concept applies to other strategies as well.
High level
The server and client each have a goal. The server’s goal is to maintain the source of truth about all our data (done through Postgres), and the client’s goal is to keep its local data in sync with that source of truth (done through the Replicache client and IndexedDB). It’s important to understand that Replicache is entirely client-side. They’ll handle all the work required to keep that IndexedDB store in sync with your remote database, but it’s up to you to implement the correct logic via a push and a pull API endpoint. These are the only two endpoints the Replicache client will communicate with. Similarly, all client-side CRUD actions in your application will be performed through the Replicache client.
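To make those two endpoints concrete, here’s a simplified sketch of the payloads that flow through them. The real Replicache protocol has a few more fields (e.g. `clientGroupID` and timestamps), so treat these shapes as an illustration rather than the exact wire format.

```typescript
// The client POSTs something like this to the pull endpoint:
type PullRequest = {
  cookie: number | null; // last workspace version this client has seen
};

// The server replies with a patch to apply to the local store:
type PullResponse = {
  cookie: number; // the workspace version after this pull
  lastMutationIDChanges: Record<string, number>; // per-client confirmations
  patch: Array<
    | { op: "put"; key: string; value: unknown }
    | { op: "del"; key: string }
  >;
};

// The client POSTs something like this to the push endpoint:
type PushRequest = {
  clientID: string;
  mutations: Array<{
    id: number; // client-assigned, monotonically increasing
    name: string; // e.g. "createTask"
    args: unknown;
  }>;
};

// Example pull response for a workspace currently at version 2:
const example: PullResponse = {
  cookie: 2,
  lastMutationIDChanges: {},
  patch: [
    { op: "put", key: "project/A", value: { name: "Project A" } },
    { op: "put", key: "task/A", value: { name: "Task A" } },
  ],
};
```

Everything else in this guide is essentially about how the server computes that `patch` and processes those `mutations`.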
When it comes to keeping everything in sync, the basic idea is for each workspace to have its own `version` number. This number increments whenever data related to that workspace changes. The Replicache client also keeps a record of the workspace version number (they call it a `cookie`), which essentially says “This is the last version of the workspace I know about and have successfully synced”. So what about all our projects and tasks then? Each record also has a `last_modified_version`, which represents, as the name suggests, the `version` of the workspace when the record was last created, updated, or deleted.
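The relationship between the workspace `version`, the client’s `cookie`, and each record’s `last_modified_version` can be sketched in a few lines. The row shapes here are simplified assumptions, not the exact schema we’ll build later:

```typescript
// Simplified row shapes for the versioning scheme described above.
type Workspace = { id: string; version: number };
type Task = {
  id: string;
  workspaceId: string;
  lastModifiedVersion: number; // workspace version at last create/update/delete
  deleted: boolean; // soft-delete flag, so deletions can be synced too
};

// A record needs to be sent to a client exactly when it changed
// after the version that client last synced (its "cookie").
function needsSync(record: Task, cookie: number): boolean {
  return record.lastModifiedVersion > cookie;
}

const task: Task = {
  id: "A",
  workspaceId: "w1",
  lastModifiedVersion: 2,
  deleted: false,
};
needsSync(task, 0); // changed since version 0 → true
needsSync(task, 2); // client already saw version 2 → false
```

Note the `deleted` flag: if rows were hard-deleted, there would be nothing left to carry a `last_modified_version`, so clients could never learn about the deletion.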
First-time user
If you’re following along at this point, awesome. If not, don’t worry because I think walking through an example will demonstrate the concept more clearly. So let’s walk through what happens in a typical user journey of someone logging in for the first time. I’ll use the term “client” to represent an instance of a Replicache client for brevity. Here’s what we’re starting with:
| Item | User Version | Server Version |
| --- | --- | --- |
| Workspace | 0 | 2 |
| Project A | n/a | 1 |
| Task A | n/a | 2 |
- The user logs in to their workspace and we immediately create a new client. This client is unique to the workspace, user, browser, and sometimes tab, but we won’t worry about that right now. We provide our data schema to the client so it knows how to set up IndexedDB, and importantly its `cookie` is `0` because it has never synced with the database before.
- The client then sends a request to the “pull” endpoint to get up to speed with the source of truth. This isn’t a new workspace, so there are already projects and tasks. The client says to the server “Hey, the last version of the workspace I’ve seen was `0`. Please send me all the data that’s changed since then”.
- The server then gets any projects and tasks that have a `last_modified_version` greater than `0`, which in this case is everything. The workspace’s version is currently at `2` (meaning there have been 2 mutations to data related to this workspace since it was created), so the server includes that in its response to the client as well.
- The server returns its response to the client and says “Here’s all the data you need. You’re now up to date with the latest version of the workspace, which is `2`”.
- The client adds all the projects and tasks to its local database, as well as the workspace version, `2`.
The client is now in sync and can display all the projects and tasks to the user.
| Item | User Version | Server Version |
| --- | --- | --- |
| Workspace | 2 | 2 |
| Project A | 1 | 1 |
| Task A | 2 | 2 |
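The first-time pull above can be sketched as a pure function over the workspace’s rows. In a real implementation these rows would come from Postgres queries; the `Row` type and helper names here are hypothetical:

```typescript
// A pure sketch of the pull logic: given the workspace's current version,
// its rows, and the client's cookie, build the patch to send back.
type Row = {
  key: string;
  lastModifiedVersion: number;
  deleted: boolean;
  value: unknown;
};

function buildPullResponse(
  workspaceVersion: number,
  rows: Row[],
  cookie: number, // the version the client last synced
) {
  const patch = rows
    .filter((r) => r.lastModifiedVersion > cookie)
    .map((r) =>
      r.deleted
        ? { op: "del" as const, key: r.key }
        : { op: "put" as const, key: r.key, value: r.value },
    );
  return { cookie: workspaceVersion, patch };
}

// The first-time sync from the walkthrough: cookie 0, workspace at version 2.
const rows: Row[] = [
  { key: "project/A", lastModifiedVersion: 1, deleted: false, value: { name: "Project A" } },
  { key: "task/A", lastModifiedVersion: 2, deleted: false, value: { name: "Task A" } },
];
const res = buildPullResponse(2, rows, 0);
// res.cookie is 2, and the patch contains both records
```

A client that is already at cookie `2` would get an empty patch back, which is exactly the “nothing has changed” case.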
Great! Let’s run through what happens when this user creates a new task.
Handling mutations
- The user creates a new task using the local mutators. These are run through the client, and the task is immediately added to the local database. This is what’s called an “optimistic mutation” as we’re communicating to the user that the task has been added successfully, even though nothing has been done against our source of truth on the server. The user sees the new task and continues as normal.
| Item | User Version | Server Version |
| --- | --- | --- |
| Workspace | 2 | 2 |
| Project A | 1 | 1 |
| Task A | 2 | 2 |
| Task B | null | n/a |
- In the background, the client then sends a request to the “push” endpoint with this new task. It says “Hey, here’s a new thing I’ve created. Please add it to the source of truth”. In addition, the client creates an ID for this mutation of adding a task. It keeps a local record of it for reference, and also includes it in the request: “Here’s an ID for this mutation. I’ll leave it with you for now, but keep a record of it”.
- The server will increment the `version` of the workspace, create a task in the database with this new version as its `last_modified_version`, and save the mutation ID in a separate table against the ID of the client. This is all done in a database transaction, so every step must succeed for any of it to be applied. It’s important to note that this “last mutation ID” is separate from the “last modified version” and the workspace `version` we’ve been talking about. It’s unique to this instance of the Replicache client.
- The server then sends a simple `200` response back to the client. The data now looks like this:
| Item | User Version | Server Version |
| --- | --- | --- |
| Workspace | 2 | 3 |
| Project A | 1 | 1 |
| Task A | 2 | 2 |
| Task B | null | 3 |
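The server-side steps can be sketched as a pure function over the workspace state. In a real implementation this would all happen inside one database transaction; the state and mutation shapes here are illustrative assumptions:

```typescript
// A pure sketch of the push step: bump the workspace version, stamp the
// new record with it, and record the client's mutation ID.
type ServerState = {
  workspaceVersion: number;
  tasks: Array<{ id: string; name: string; lastModifiedVersion: number }>;
  lastMutationIDs: Record<string, number>; // per Replicache client
};

type Mutation = {
  id: number; // client-assigned mutation ID
  clientID: string;
  name: string; // e.g. "createTask"
  args: { id: string; name: string };
};

function applyPush(state: ServerState, mutation: Mutation): ServerState {
  const nextVersion = state.workspaceVersion + 1;
  return {
    workspaceVersion: nextVersion,
    tasks: [
      ...state.tasks,
      { ...mutation.args, lastModifiedVersion: nextVersion },
    ],
    lastMutationIDs: { ...state.lastMutationIDs, [mutation.clientID]: mutation.id },
  };
}

// The walkthrough's push: workspace at version 2, Task B arrives.
const before: ServerState = {
  workspaceVersion: 2,
  tasks: [{ id: "A", name: "Task A", lastModifiedVersion: 2 }],
  lastMutationIDs: {},
};
const after = applyPush(before, {
  id: 1,
  clientID: "client-1",
  name: "createTask",
  args: { id: "B", name: "Task B" },
});
// after.workspaceVersion is 3, and Task B is stamped with lastModifiedVersion 3
```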
Notice how the client is technically still out of sync? It has a new task, but it’s still optimistic. So, how do we fix that? Through a pull of course!
Syncing changes from mutations
- In the background, the client sends a request to the “pull” endpoint. This happens either on a polling interval, a real-time WebSocket “poke”, or when the user refreshes the page. This time, the client says “Hey, the last version of the workspace I’ve seen was `2` (from the initial sync). Please send me all the data that’s changed since then”.
- The server goes and gets all projects and tasks with a `last_modified_version` greater than `2`. In this case, it’s just “Task B” from earlier. This is exactly what we need, because this mutation is still unconfirmed on the client, while “Project A” and “Task A” haven’t changed since the initial sync.
- The server packages this up and sends Task B back to the client. It also includes the `last_mutation_id` and the latest workspace `version` in the response. It says “Here’s all the data you need to apply to your local copy of the database. You’re now up to date with the latest version of the workspace, which is `3`. Oh, and that last mutation you sent me for creating Task B was successfully processed”.
- As the mutation is confirmed, the client can safely forget about it. If for some reason there was an issue, the client would send that mutation back to the server the next time it pushed. A handy fallback feature. And while it may seem obvious the mutation was processed (as the new task was included in the response), other data, or even changes to Task B from another user, may have arrived since the push, so this confirmation is required to be sure.
- The client then applies the response from the server to its local database, and it’s once again in sync.
| Item | User Version | Server Version |
| --- | --- | --- |
| Workspace | 3 | 3 |
| Project A | 1 | 1 |
| Task A | 2 | 2 |
| Task B | 3 | 3 |
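The client-side bookkeeping from the last two steps can be sketched as a small filter: after a pull, any pending mutation whose ID is at or below the server’s confirmed `last_mutation_id` can be forgotten, and the rest will be replayed on the next push. The names here are illustrative, not Replicache’s internals:

```typescript
// After a pull, drop every pending mutation the server has confirmed.
type PendingMutation = { id: number; name: string; args: unknown };

function stillPending(
  pending: PendingMutation[],
  lastMutationID: number, // confirmed by the server in the pull response
): PendingMutation[] {
  return pending.filter((m) => m.id > lastMutationID);
}

const pending: PendingMutation[] = [
  { id: 1, name: "createTask", args: { id: "B", name: "Task B" } },
  { id: 2, name: "updateTask", args: { id: "B", name: "Task B!" } },
];

// The pull confirmed mutation 1 (creating Task B), so only 2 remains
// and will be included in the next push.
stillPending(pending, 1);
```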
🚧 Everything below is still a work in progress. I’m hoping to get some full code examples together for a detailed walkthrough.
- Walk through what happens the next time the user opens the app
- Walk through a poke with another user
What about other workspaces? Well, from the user’s point of view, it doesn’t really matter. They don’t need to know when data from other workspaces changes, which is why the version is partitioned by workspace. We’ve also skipped over a few details, but this covers the core sync loop.
Building our app
So, let’s get started building. The final repo can be found here, but I recommend following along step by step so it clicks.
Project setup
- init next app, install dependencies
- Get Replicache key
- Setup database schema.
- Setup login page and slug page for workspace
- Middleware and auth (link to supabase guide)
Initiate the Replicache client
- Client side replicache wrapper over slug layout
- Saving to state using legend
Pull endpoint
- Setup the pull sync
- Authentication
Client side mutators
- Replicache on rails
Push endpoint
- Duplicate the mutators on push endpoint
- Extra security using server side data about the user
- Transactional
Common client side patterns
- Showing lists
- Single item
- Filtering and combining
- Creating, updating, auto binding
Poke
- Supabase realtime channel per space to let users know they need to pull again
- Reason it’s a broadcast (no RLS required, and no data being sent around). It doesn’t matter if people see it
Handling changes not initiated by users
- Like cron jobs
Common issues
- Not saving the Replicache client and creating a new one each time
- client side schema versioning
- Adding the read items to the list of mutators
- The deleting data structure
- Validators
- Debounce on things like text fields.
- Needing to add a last_modified_version against the workspace, and set it to 1 so it gets synced as well
Areas for improvement
There are still a few things I feel could use some improvement, or that I’m still trying to figure out. These will become clearer over time as the project grows and I start to see common patterns arise.
- Default data for new thing
- Having a more uniform approach to using the mutators and observables. E.g. when to prop drill the entire object, or when to just pass around the ID and get a new observable each time. This is probably more of a React problem that you’d need to think about with other state management systems.
- Editing data in the database directly
- Better validation and type safety on the mutators. Or an easier way to abstract and be based off the drizzle schema.
- Handling storage and files. Will the same level of speed be expected here? Will they need to be synced as well? Or is a remote option okay?
- Seeing how it performs once a space has a lot more data, and whether there’s a way to “archive” things and show them using a more traditional approach. I particularly like this talk from Linear about the problems they ran into scaling their realtime sync.
Summary
Once you get the sync in place, the improved developer experience starts to become clear. It’s a different model from what I’ve usually been dealing with, but the benefits of not having to create API endpoints / server actions for everything, and not having to handle loading states, make the usually mundane CRUD tasks much easier. I can really start to focus on building a great user experience.
If you’re starting a new project and the use case makes sense for a local first approach, I’d definitely recommend giving it a try.