# How to support multiple tenants in Neo4j and GRANDStack

If you're building a [[SaaS]] product with [[GRANDStack]] and a [[Neo4j]] database, you'll need to consider how to isolate each customer's (tenant's) data from the others. The key considerations here are:

1. No customer should be able to access another's data.
2. Related to 1, it should be difficult for a developer to accidentally leak another customer's data via a bug in a query.
3. Any per-tenant hosting costs that would be incurred. Ideally, costs will be usage-based and will not increase linearly with each tenant.
4. Avoiding configuration management sprawl as new tenants are added.

## Neo4j multi-tenancy

There are a few options for doing multi-tenancy in Neo4j (excluding [[Apollo GraphQL]] Server for now):

1. Use separate databases (Neo4j v4.0+). See the [worked example](https://neo4j.com/developer/multi-tenancy-worked-example/) from the official docs.
2. Use a single database with [node labels](https://grandstack.io/docs/neo4j-graphql-js-middleware-authorization/#additionallabels) containing a tenantId (see [these forum comments](https://community.neo4j.com/t/proper-way-to-implement-multi-tenancy-on-neo4j/625/10)).
3. Use a single database but different users (dynamically created, one per tenant), with restricted subgraph access control.

## GRANDStack multi-tenancy

Options:

1. Separate GraphQL endpoint for each tenant, each accessing a separate database.
    - Pros:
        - Can provide each client with a unique customised URL
        - Maximum data isolation
    - Cons:
        - Very large overhead in terms of code to provision a new tenant (create new API Gateway, Lambda and Neo4j database)
        - Configuration management
        - Code changes and dependency patches need to be applied separately to all endpoints
        - Cost management and operational monitoring are much more difficult
2. Single GraphQL endpoint with the ability to switch between multiple databases
    - Pros:
        - Single URL is easiest to share with customers (e.g. in automated emails)
        - Strong data isolation due to separate databases
    - Cons:
        - Overhead of dynamically provisioning a new Neo4j database for each new tenant
        - Multiple databases to monitor (for performance, storage and costings)
        - Per-customer databases make it more difficult to replicate the setup in pre-production environments
3. Single GraphQL endpoint with access to a single database and the ability to apply sub-graph filters based on, say, a JWT claim
    - Pros:
        - Minimal overhead when creating a new tenant
        - Least configuration management (connection strings, database users/passwords, etc.)
        - Easiest to monitor
    - Cons:
        - Least data isolation. Relies more on good developer practices to ensure no data leaks across tenants.
        - Database limits on compute and storage could be hit sooner
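
As a rough illustration of the third GRANDStack option (single endpoint, single database, filter derived from a JWT claim), the sketch below builds a tenant-scoped Cypher query from already-verified token claims. It is only a sketch under assumptions: the claim name `tenantId`, the `tenantId` node property (a property-based variant rather than the label-based approach linked above), and the helper names `requireTenantId` and `tenantScopedMatch` are all hypothetical and not part of any GRANDStack or Neo4j API.

```typescript
// Hypothetical shape of a decoded, already-verified JWT payload.
// The claim name `tenantId` is an assumption for this sketch.
interface TenantClaims {
  sub: string;
  tenantId?: string;
}

interface ScopedQuery {
  cypher: string;
  params: Record<string, unknown>;
}

// Conservative allow-lists so that nothing unexpected reaches the query text.
const TENANT_ID_PATTERN = /^[A-Za-z0-9_-]{1,64}$/;
const LABEL_PATTERN = /^[A-Za-z][A-Za-z0-9_]*$/;

// Fail closed: a request without a usable tenant claim never produces a query.
function requireTenantId(claims: TenantClaims): string {
  const tenantId = claims.tenantId;
  if (tenantId === undefined || !TENANT_ID_PATTERN.test(tenantId)) {
    throw new Error("Missing or malformed tenantId claim");
  }
  return tenantId;
}

// Builds a MATCH that is always filtered by the caller's tenant.
// The tenant value travels as a query parameter, never as interpolated text.
function tenantScopedMatch(
  claims: TenantClaims,
  label: string,
  extraParams: Record<string, unknown> = {}
): ScopedQuery {
  const tenantId = requireTenantId(claims);
  // Labels cannot be passed as parameters here, so the label is validated
  // before being placed into the query string.
  if (!LABEL_PATTERN.test(label)) {
    throw new Error(`Invalid label: ${label}`);
  }
  return {
    cypher: `MATCH (n:${label} {tenantId: $tenantId}) RETURN n`,
    // tenantId is spread last so caller-supplied params cannot override it.
    params: { ...extraParams, tenantId },
  };
}

// Example usage with a hypothetical tenant.
const example = tenantScopedMatch({ sub: "user-1", tenantId: "acme" }, "Invoice");
console.log(example.cypher, example.params);
```

The resulting `cypher` and `params` could then be handed to the Neo4j JavaScript driver's `session.run(cypher, params)`. Centralising the filter in one helper is one way to address the second consideration at the top of this note: individual resolvers no longer have to remember to add the tenant condition themselves.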