If you're building a [[SaaS]] product with [[GRANDStack]] and a [[Neo4j]] database, you'll need to consider how to isolate each customer's (tenant's) data from every other customer's.
The key considerations here are:
1. No customer should be able to access another's data
2. Related to 1, it should be difficult for a developer to accidentally leak another customer's data via a bug in a query
3. Per-tenant hosting costs should be kept low. Ideally costs are usage-based, rather than growing by a fixed amount with each new tenant
4. Avoiding configuration management sprawl as new tenants are added
## Neo4j multi-tenancy
There are a few options for implementing multi-tenancy in Neo4j (setting aside [[Apollo GraphQL]] Server for now):
1. Use separate databases (Neo4j v4.0+). See the [worked example](https://neo4j.com/developer/multi-tenancy-worked-example/) in the official docs
2. Use a single database with [node labels](https://grandstack.io/docs/neo4j-graphql-js-middleware-authorization/#additionallabels) containing a tenantId (see [these forum comments](https://community.neo4j.com/t/proper-way-to-implement-multi-tenancy-on-neo4j/625/10))
3. Use a single database with a separate, dynamically created user per tenant, each restricted by sub-graph access control.
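
As a rough illustration of option 2, the sketch below derives a tenant label from a tenant ID and adds it to a query. In the Neo4j 4.x line discussed here a label cannot be passed as a `$parameter`, so it has to be interpolated into the query text, which is why the ID is validated strictly first. The helper names (`tenantLabel`, `findByName`) and the `Tenant_` prefix are made up for this example and are not part of Neo4j or GRANDstack.

```typescript
// Sketch only: `tenantLabel`, `findByName`, and the `Tenant_` prefix are
// hypothetical names, not part of Neo4j or GRANDstack.

// The label is interpolated into the query text (it cannot be a $parameter
// in Neo4j 4.x), so only a strict whitelist of characters is accepted.
const TENANT_ID_PATTERN = /^[A-Za-z0-9]{1,40}$/;

function tenantLabel(tenantId: string): string {
  if (!TENANT_ID_PATTERN.test(tenantId)) {
    throw new Error(`Invalid tenant id: ${JSON.stringify(tenantId)}`);
  }
  return `Tenant_${tenantId}`;
}

interface ScopedQuery {
  cypher: string;
  params: Record<string, unknown>;
}

// Builds a lookup that only matches nodes carrying the tenant's label.
// `baseLabel` is assumed to come from application code, never user input.
function findByName(baseLabel: string, tenantId: string, name: string): ScopedQuery {
  return {
    cypher: `MATCH (n:${baseLabel}:${tenantLabel(tenantId)}) WHERE n.name = $name RETURN n`,
    params: { name },
  };
}

const q = findByName("Project", "acme42", "Website redesign");
console.log(q.cypher);
// MATCH (n:Project:Tenant_acme42) WHERE n.name = $name RETURN n
```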
## GRANDStack multi-tenancy
Options:
1. Separate GraphQL endpoint for each tenant, each accessing a separate database.
   - Pros:
     - Can provide each client with a unique customised URL
     - Maximum data isolation
   - Cons:
     - Very large provisioning overhead for each new tenant (a new API Gateway, Lambda and Neo4j database must be created)
     - Configuration management sprawl
     - Code changes and dependency patches need to be applied separately to all endpoints
     - Cost management and operational monitoring are much more difficult
2. Single GraphQL endpoint with the ability to switch between multiple databases
   - Pros:
     - Single URL is easiest to share with customers (e.g. in automated emails)
     - Strong data isolation due to separate databases
   - Cons:
     - Overhead of dynamically provisioning a new Neo4j database for each new tenant
     - Multiple databases to monitor (for performance, storage and cost)
     - Per-customer databases make it more difficult to replicate the setup in pre-production environments
3. Single GraphQL endpoint with access to a single database and the ability to apply sub-graph filters based on, say, a JWT claim
   - Pros:
     - Minimal overhead when creating a new tenant
     - Least configuration management (connection strings, database users/passwords, etc.)
     - Easiest to monitor
   - Cons:
     - Least data isolation. Relies more on good developer practices to ensure no data leaks across tenants.
     - Compute and storage limits of the single database could be hit sooner
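
Options 2 and 3 both depend on resolving the tenant once per request, before any query runs. The sketch below shows one way that could look, assuming the JWT has already been verified and carries a `tenantId` claim. The claim name, the `tenant-` database prefix, and the helpers `tenantScope` and `scopedMatch` are all assumptions made for this example. Deriving the database name and the filter parameter in a single place is meant to make it harder for an individual query to leave the tenant scope out by accident.

```typescript
// Sketch only: the `tenantId` claim, `TenantScope`, `tenantScope`, and
// `scopedMatch` are hypothetical names chosen for illustration.

// Payload of a JWT whose signature has already been verified upstream.
type Claims = Record<string, unknown>;

interface TenantScope {
  tenantId: string;
  database: string;             // option 2: database to open the session on
  params: { tenantId: string }; // option 3: parameter for sub-graph filters
}

// Lowercase letters and digits only, so the derived database name stays
// within Neo4j's naming rules (ASCII letter first, 3 to 63 characters).
const TENANT_ID_PATTERN = /^[a-z][a-z0-9]{2,30}$/;

function tenantScope(claims: Claims): TenantScope {
  const raw = claims["tenantId"];
  if (typeof raw !== "string" || !TENANT_ID_PATTERN.test(raw)) {
    throw new Error("Missing or malformed tenantId claim");
  }
  return { tenantId: raw, database: `tenant-${raw}`, params: { tenantId: raw } };
}

const LABEL_PATTERN = /^[A-Za-z_][A-Za-z0-9_]*$/;

// Option 3: the tenant filter is added in one place, so individual
// queries cannot forget it.
function scopedMatch(scope: TenantScope, label: string) {
  if (!LABEL_PATTERN.test(label)) throw new Error(`Invalid label: ${label}`);
  return {
    cypher: `MATCH (n:${label} {tenantId: $tenantId}) RETURN n`,
    params: scope.params,
  };
}

const scope = tenantScope({ sub: "user-1", tenantId: "acme" });
console.log(scope.database); // tenant-acme
console.log(scopedMatch(scope, "Invoice").cypher);
// MATCH (n:Invoice {tenantId: $tenantId}) RETURN n

// With the official neo4j-driver (4.0+), option 2 would then open
//   driver.session({ database: scope.database })
// while option 3 would run the scoped query against the shared database.
```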
---
tags: