When removing an item from [[DynamoDB]], you can either delete it permanently with the `DeleteItem` API operation (a "hard" delete) or keep the item and set a `deleted` or `archived` flag attribute on it (a "soft" delete).
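
As a rough illustration of the difference, the sketch below builds the request arguments for each approach in the shape expected by the low-level boto3 client (`delete_item` and `update_item`). The function names, the `deleted` attribute name, and the condition on the partition key are assumptions made for this example, not a prescribed design.

```python
from typing import Any


def hard_delete_params(table_name: str, key: dict[str, Any]) -> dict[str, Any]:
    """Arguments for DeleteItem: the item is removed from the table."""
    return {"TableName": table_name, "Key": key}


def soft_delete_params(table_name: str, key: dict[str, Any]) -> dict[str, Any]:
    """Arguments for UpdateItem: the item stays and gets a `deleted` flag."""
    # Assumption: the first attribute in `key` is the partition key.
    pk_name = next(iter(key))
    return {
        "TableName": table_name,
        "Key": key,
        "UpdateExpression": "SET #deleted = :flag",
        # UpdateItem creates the item if it is missing, so require that it
        # already exists to avoid writing a stub that only holds the flag.
        "ConditionExpression": "attribute_exists(#pk)",
        # Attribute names are aliased to sidestep any reserved-word clash.
        "ExpressionAttributeNames": {"#deleted": "deleted", "#pk": pk_name},
        "ExpressionAttributeValues": {":flag": {"BOOL": True}},
    }
```

With boto3 these would be passed as `client.delete_item(**hard_delete_params(...))` or `client.update_item(**soft_delete_params(...))`.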
## Pros and cons of soft deletes
### Pros
- Allows recovery of data that was incorrectly deleted due to a bug
- Enables an "undo" operation for users
- Keeps an archive of data for audit purposes
### Cons
Soft deletes do introduce several risks and complexities:
- The `deleted` attribute needs to be built into the PK and SK designs of tables and GSIs, perhaps using a sparse index pattern.
- If `deleted` is not part of the PK/SK fields, deleted items can instead be filtered out with a `FilterExpression`. However, the filter is applied after the items are read, so the query still consumes read capacity for the deleted items, which affects both performance and cost.
- Every DynamoDB Query or GetItem request must remember to exclude deleted items. There's a high risk that a developer will forget to do this somewhere, so that deleted data is incorrectly returned to the user.
- Storage cost of keeping around deleted items
- Denormalized copies of an item also need to have the `deleted` flag set on them
- When an aggregate root entity is deleted, the `deleted` flag needs to be set on all of its child entities
- Keeping data that a user has asked to have deleted may violate GDPR
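
To make the filtering burden concrete, here is a minimal sketch of what each read path has to carry when `deleted` is not part of the key design. It assumes a string partition key and a boolean `deleted` attribute; `active_items_query` and `is_active` are hypothetical helper names. GetItem has no `FilterExpression` parameter, so the check for single-item reads has to happen in application code.

```python
from typing import Any, Optional


def active_items_query(table_name: str, pk_name: str, pk_value: str) -> dict[str, Any]:
    """Arguments for Query that hide soft-deleted items from the results."""
    return {
        "TableName": table_name,
        "KeyConditionExpression": "#pk = :pk",
        # Applied after the read: deleted items still consume read capacity.
        "FilterExpression": "attribute_not_exists(#deleted) OR #deleted = :notDeleted",
        "ExpressionAttributeNames": {"#pk": pk_name, "#deleted": "deleted"},
        "ExpressionAttributeValues": {
            ":pk": {"S": pk_value},
            ":notDeleted": {"BOOL": False},
        },
    }


def is_active(item: Optional[dict[str, Any]]) -> bool:
    """Check a GetItem result (low-level attribute-value format) in code."""
    if item is None:
        # GetItem returned no item at all.
        return False
    return not item.get("deleted", {}).get("BOOL", False)
```

Every query and every single-item read in the codebase needs an equivalent of these two helpers, which is where the risk of a forgotten check comes from.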
## Default to hard deletes
In terms of implementation and maintenance effort, hard deletes are the easier option, so they should be the default unless one of the benefits listed above is a well-defined user requirement. Keeping every type of data item in the database "just in case" goes against [[YAGNI]].
## Alternative to flag attribute-based soft deletes
The main benefits of soft deletes centre around retention of the deleted data in archive form. The complexities mostly centre around the deleted data living beside the non-deleted data.
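A simpler way of getting the retention benefits without those complexities is to use a [[DynamoDB streams|DynamoDB stream]]. Application code still performs "hard" deletes using the `DeleteItem` API operation, while a Lambda function watches the stream for `REMOVE` events and selectively stores the deleted items in another table or even an S3 bucket. The stream's view type must include the old image (`OLD_IMAGE` or `NEW_AND_OLD_IMAGES`) so that each record carries the deleted item's attributes.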
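
A minimal sketch of such a Lambda function follows. It relies on the standard stream record fields (`eventName`, `dynamodb.OldImage`). The `archive` callable is a hypothetical stand-in for whatever writes to the archive table or S3 bucket, and it is also where any selection logic (for example, archiving only certain entity types) could live. In a real deployment the handler would be built once at module load, for example `handler = make_handler(write_to_archive)`, where `write_to_archive` is your own function.

```python
from typing import Any, Callable, Iterator

Image = dict[str, Any]


def removed_images(event: dict[str, Any]) -> Iterator[Image]:
    """Yield the old image of every REMOVE record in a stream event."""
    for record in event.get("Records", []):
        if record.get("eventName") != "REMOVE":
            continue  # skip INSERT and MODIFY records
        old_image = record.get("dynamodb", {}).get("OldImage")
        if old_image is not None:
            # OldImage is only present if the stream view type includes it.
            yield old_image


def make_handler(archive: Callable[[Image], None]) -> Callable[[dict[str, Any], Any], dict[str, int]]:
    """Build a Lambda handler that passes each deleted item to `archive`."""

    def handler(event: dict[str, Any], context: Any) -> dict[str, int]:
        count = 0
        for image in removed_images(event):
            archive(image)
            count += 1
        return {"archived": count}

    return handler
```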
---
tags: #DynamoDB