## where it started
Every platform starts with simple auth. Ours was no different — a user signs up, creates a project, and has full control. Other people can be added as members with one of three roles: owner, admin, or member. A hardcoded hierarchy decided who could do what.
Then someone needed read-only access to logs. Then an agent needed deploy permissions but nothing else. Then a contractor needed temporary access to a database — not the whole project, just the database. The three-role model had nothing to say about any of this.
We needed IAM. But we also needed it to stay simple.
## the permission language
The simplest way to express "this person can do this thing to this resource" is a triple:
`operations : resource-uri`

Operations are named actions: `read_file`, `deploy_service`, `query_db`. Resources are hierarchical paths that mirror the platform's structure:

- `/project/myapp/service/api`
- `/project/myapp/file/config/database.yaml`
- `/project/myapp/db`

Globs give you range. `**` matches any depth, `*` matches one segment:

`*:/project/myapp/**`: everything
We explicitly decided against deny rules. Allow-only, deny by default. More verbose when you want "everything except X," but it eliminates an entire class of bugs where interacting allow and deny rules produce surprising results.
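
To make the matching rules concrete, here is a minimal Python sketch of an allow-only evaluator. It is not the platform's actual code: the function names are ours, operation lists are assumed to be comma-separated, and `**` is assumed to match zero or more segments.

```python
def match_resource(pattern: str, resource: str) -> bool:
    """Match a resource path against a glob pattern, one path segment at a time."""
    pat = [s for s in pattern.split("/") if s]
    res = [s for s in resource.split("/") if s]

    def walk(i: int, j: int) -> bool:
        if i == len(pat):
            return j == len(res)  # pattern used up: the path must be used up too
        if pat[i] == "**":
            # Assumption: '**' may absorb zero or more segments.
            return any(walk(i + 1, k) for k in range(j, len(res) + 1))
        if j == len(res):
            return False  # pattern still expects a segment, path has none left
        if pat[i] == "*" or pat[i] == res[j]:
            return walk(i + 1, j + 1)  # '*' consumes exactly one segment
        return False

    return walk(0, 0)


def allows(permission: str, operation: str, resource: str) -> bool:
    """Check one 'operations:pattern' permission string."""
    ops, _, pattern = permission.partition(":")
    names = [name.strip() for name in ops.split(",")]
    return ("*" in names or operation in names) and match_resource(pattern.strip(), resource)


def is_allowed(permissions: list, operation: str, resource: str) -> bool:
    """Allow-only evaluation: any match grants, no match denies."""
    return any(allows(p, operation, resource) for p in permissions)
```

Because there are no deny rules, `is_allowed` never has to reason about rule order or precedence. An empty permission list simply denies everything.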
## named roles and groups
Policies get names so you don't paste operation lists repeatedly:

`project-admin : * : /project/myapp/**`

Users never bind directly to roles. Everything goes through groups:

`user/alice → group/backend-team`
This felt like unnecessary indirection at first. But team management is the common case — add someone to a group and they get the right access. Revocation is atomic: remove them from the group and everything disappears.
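
A rough in-memory sketch of that indirection, in Python. The `Directory` class and its method names are hypothetical. The point is only that a user's roles are always derived from group membership, so removing a membership revokes everything in one step.

```python
from dataclasses import dataclass, field


@dataclass
class Directory:
    # role name -> (operations, resource pattern)
    roles: dict = field(default_factory=dict)
    # group -> set of role names bound to it
    group_roles: dict = field(default_factory=dict)
    # user -> set of groups the user belongs to
    memberships: dict = field(default_factory=dict)

    def define_role(self, name, operations, pattern):
        self.roles[name] = (operations, pattern)

    def bind_role(self, group, role):
        if role not in self.roles:
            raise KeyError(f"unknown role: {role}")
        self.group_roles.setdefault(group, set()).add(role)

    def add_member(self, user, group):
        self.memberships.setdefault(user, set()).add(group)

    def remove_member(self, user, group):
        self.memberships.get(user, set()).discard(group)

    def roles_for(self, user):
        """Every role reachable through the user's groups."""
        found = set()
        for group in self.memberships.get(user, set()):
            found |= self.group_roles.get(group, set())
        return found


iam = Directory()
iam.define_role("project-admin", "*", "/project/myapp/**")
iam.bind_role("group/backend-team", "project-admin")
iam.add_member("user/alice", "group/backend-team")
```

Calling `iam.remove_member("user/alice", "group/backend-team")` leaves `iam.roles_for("user/alice")` empty, with no per-role cleanup.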
## the personal group trick
Strict "groups only" creates a problem: what about one-off permissions? Creating a group for temporary database access is absurd.
The fix: every user gets an auto-created personal group.
One-off permissions bind to the personal group.
The system's rule stays consistent — users only bind to groups. But it feels like giving Alice direct access. The personal group is invisible plumbing.
This was the key insight that made the whole design work. One rule, two use cases, zero exceptions.
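
Here is one way the personal group idea could look in Python. The naming scheme `group/personal/<user>` and the role names are made up for illustration, since the real identifiers are not shown here.

```python
memberships = {}   # user -> set of groups
group_roles = {}   # group -> set of role names


def personal_group(user):
    # Hypothetical naming scheme; the real identifier format is an assumption.
    return f"group/personal/{user}"


def create_user(user):
    """Create the user together with their personal group."""
    memberships[user] = {personal_group(user)}


def bind_role(group, role):
    group_roles.setdefault(group, set()).add(role)


def grant_one_off(user, role):
    """Looks like a direct grant, but is an ordinary group binding."""
    bind_role(personal_group(user), role)


def roles_for(user):
    found = set()
    for group in memberships.get(user, set()):
        found |= group_roles.get(group, set())
    return found


create_user("alice")
memberships["alice"].add("group/backend-team")
bind_role("group/backend-team", "project-developer")  # team access
grant_one_off("alice", "temp-db-access")               # one-off access
```

Both kinds of access flow through the same `roles_for` lookup, so the evaluator never needs a special case for direct grants.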
## delegation without complexity
How do you give someone permission to give permissions? You could invent meta-operations like `grant_read_file`, but that scales horribly.
Instead, `grant` and `revoke` are just standard operations.
The system enforces scope: you can only grant access to resources you have `grant` on. A file owner can delegate file access but can't grant service permissions. Two universal operations replaced what could have been dozens of meta-operations.
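
The scope check can be sketched in a few lines of Python. This is an illustration under assumptions, not the real implementation: permissions are modeled as `(operation, pattern)` tuples, a `*` operation is assumed to include `grant`, and whether the granter must also hold the operation being delegated is left out.

```python
def covers(pattern, resource):
    """Segment-wise glob match: '*' is one segment, '**' is any depth."""
    pat = [s for s in pattern.split("/") if s]
    res = [s for s in resource.split("/") if s]

    def walk(i, j):
        if i == len(pat):
            return j == len(res)
        if pat[i] == "**":
            return any(walk(i + 1, k) for k in range(j, len(res) + 1))
        if j < len(res) and pat[i] in ("*", res[j]):
            return walk(i + 1, j + 1)
        return False

    return walk(0, 0)


def can_grant(granter_perms, resource):
    """The granter needs the `grant` operation on the target resource."""
    return any(
        op in ("grant", "*") and covers(pattern, resource)
        for op, pattern in granter_perms
    )


def grant(granter_perms, grantee_perms, operation, resource):
    """Add a permission to the grantee only if it is inside the granter's scope."""
    if not can_grant(granter_perms, resource):
        raise PermissionError(f"no grant on {resource}")
    grantee_perms.append((operation, resource))


file_owner = [("grant", "/project/myapp/file/**"), ("read_file", "/project/myapp/file/**")]
contractor = []
grant(file_owner, contractor, "read_file", "/project/myapp/file/config/database.yaml")
```

Trying `grant(file_owner, contractor, "deploy_service", "/project/myapp/service/api")` raises `PermissionError`, because the file owner holds no `grant` on anything under `/service`.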
## how a request gets authorized
When `DELETE /projects/{id}/services/api` hits the API:

1. Authenticate: verify the Ed25519 signature.
2. Map: translate to `delete_service` on `/project/myapp/service/api`.
3. Resolve: one SQL query: user → groups → roles via JOINs.
4. Evaluate: for each role, check operation match and resource pattern match.
5. Decide: first match wins. No match means denied.

Platform admins bypass all of this: `is_admin` is an implicit `*:/**`.
The whole evaluation is a single indexed query plus in-memory pattern matching. Sub-millisecond on SQLite.
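
The resolve, evaluate, and decide steps might look roughly like this with Python's built-in `sqlite3` module. The schema, table names, and column names are assumptions for illustration, and the sketch looks up `is_admin` with its own query. Signature verification and request mapping are left out.

```python
import sqlite3

# Hypothetical schema; table and column names are assumptions.
SCHEMA = """
CREATE TABLE users (id TEXT PRIMARY KEY, is_admin INTEGER NOT NULL DEFAULT 0);
CREATE TABLE group_members (user_id TEXT, group_id TEXT, PRIMARY KEY (user_id, group_id));
CREATE TABLE roles (id TEXT PRIMARY KEY, operations TEXT NOT NULL, pattern TEXT NOT NULL);
CREATE TABLE role_bindings (group_id TEXT, role_id TEXT, PRIMARY KEY (group_id, role_id));
"""

# Resolve step: user -> groups -> roles in one query.
RESOLVE = """
SELECT r.operations, r.pattern
FROM group_members AS gm
JOIN role_bindings AS rb ON rb.group_id = gm.group_id
JOIN roles AS r ON r.id = rb.role_id
WHERE gm.user_id = ?
"""


def covers(pattern, resource):
    """Segment-wise glob match: '*' is one segment, '**' is any depth."""
    pat = [s for s in pattern.split("/") if s]
    res = [s for s in resource.split("/") if s]

    def walk(i, j):
        if i == len(pat):
            return j == len(res)
        if pat[i] == "**":
            return any(walk(i + 1, k) for k in range(j, len(res) + 1))
        if j < len(res) and pat[i] in ("*", res[j]):
            return walk(i + 1, j + 1)
        return False

    return walk(0, 0)


def authorize(db, user_id, operation, resource):
    row = db.execute("SELECT is_admin FROM users WHERE id = ?", (user_id,)).fetchone()
    if row is None:
        return False  # unknown user: deny
    if row[0]:
        return True   # platform admin: implicit *:/**
    for operations, pattern in db.execute(RESOLVE, (user_id,)):
        ops = operations.split(",")
        if ("*" in ops or operation in ops) and covers(pattern, resource):
            return True  # first match wins
    return False         # no match: denied


def demo_db():
    """Build a small in-memory database to exercise authorize()."""
    db = sqlite3.connect(":memory:")
    db.executescript(SCHEMA)
    db.executemany("INSERT INTO users VALUES (?, ?)", [("alice", 0), ("bob", 0), ("root", 1)])
    db.execute("INSERT INTO group_members VALUES ('alice', 'backend-team')")
    db.execute("INSERT INTO roles VALUES ('project-admin', '*', '/project/myapp/**')")
    db.execute("INSERT INTO role_bindings VALUES ('backend-team', 'project-admin')")
    return db
```

With `db = demo_db()`, `authorize(db, "alice", "delete_service", "/project/myapp/service/api")` is allowed through the group binding, while the same call for `"bob"` is denied because he belongs to no group.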
## what we kept in mind
- Explicit over abstract. No role inheritance, no deny rules. If something is allowed, there's a role that says so. You can always trace a permission back to a specific role bound to a specific group.
- Personal groups solve the 80/20 problem. Team access uses regular groups. One-off access uses personal groups. These two patterns cover nearly every case.
- Grant and revoke as operations, not meta-operations. Delegation works with the same primitives as everything else.
- Backwards compatibility through migration, not abstraction. We migrated the old `project_members` table into IAM roles and bindings, then removed it. One source of truth, no split brain.
- Start without caching. Two SQLite queries and pattern matching is fast enough on a single VPS. We'll add caching when we have evidence it's needed.
## what's next
- Time-bound permissions: `expires_at` on role bindings for temporary access without manual revocation.
- Policy caching: short-lived in-memory cache to eliminate redundant queries during request bursts.
- Condition-based access: IP ranges, time of day, without changing the core schema.
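
As a rough idea of how time-bound bindings could work, here is a Python sketch that filters expired bindings at query time. None of this exists yet: the table layout is hypothetical, and `expires_at` is assumed to be a Unix timestamp in seconds, with `NULL` meaning "never expires".

```python
import sqlite3
import time


def active_roles(db, user_id, now=None):
    """Return role ids whose binding has not expired at `now` (Unix seconds)."""
    now = int(time.time()) if now is None else now
    rows = db.execute(
        """
        SELECT rb.role_id
        FROM group_members AS gm
        JOIN role_bindings AS rb ON rb.group_id = gm.group_id
        WHERE gm.user_id = ?
          AND (rb.expires_at IS NULL OR rb.expires_at > ?)
        """,
        (user_id, now),
    )
    return {role_id for (role_id,) in rows}


db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE group_members (user_id TEXT, group_id TEXT);
CREATE TABLE role_bindings (group_id TEXT, role_id TEXT, expires_at INTEGER);
""")
db.execute("INSERT INTO group_members VALUES ('contractor', 'personal-contractor')")
db.execute("INSERT INTO role_bindings VALUES ('personal-contractor', 'db-read', 1000)")
db.execute("INSERT INTO role_bindings VALUES ('personal-contractor', 'log-read', NULL)")
```

Expiry becomes a filter in the resolve query, so an expired binding stops granting access without anyone having to revoke it.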
But for now, the system does what we need: fine-grained, delegatable, auditable access control that doesn't make you want to throw your laptop out a window. That's about all you can ask from IAM.