Scaling Your Mastodon Instance: Caching Strategies and Object Storage Explained

Your Mastodon instance is growing. Federation traffic is increasing, media storage is ballooning, and Sidekiq queues are getting longer. This guide covers the practical scaling strategies that instance admins use in 2026 to keep their servers fast and reliable as usage grows.

What You’ll Know by the End

Where Mastodon instances typically bottleneck as they grow
How to implement effective caching strategies
Object storage setup and optimization for media
Database scaling approaches for PostgreSQL
Sidekiq tuning and background job management
CDN integration for static assets and media

Where Mastodon Bottlenecks

Before optimizing, understand where the pressure comes from:

Sidekiq (background jobs): Federation delivery, media processing, email, and scheduled tasks all flow through Sidekiq. This is usually the first bottleneck.

PostgreSQL: The database handles user data, posts, relationships, and timeline generation. Complex queries (especially timelines for accounts that follow many others) can become slow as tables grow.

Media storage: Every federated post with attachments means media stored on your server (or fetched on demand). This grows faster than most admins expect.

Web and streaming: The web process handles API requests and page rendering. The streaming process handles real-time WebSocket connections. Both scale with concurrent users.

Caching Strategies

Effective caching reduces load on your database and application servers.

Redis Caching

Mastodon uses Redis extensively for:

Session storage
Background job queues (Sidekiq)
Timeline caching
Rate limiting

Optimization tips:

Monitor Redis memory usage; set a reasonable maxmemory limit
If Redis is currently shared with other services, move Mastodon to a dedicated Redis instance
Consider separate Redis instances for cache vs. persistent data (Sidekiq queues)
Enable RDB snapshots for persistence but tune snapshot frequency to avoid performance spikes
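
As a rough illustration of the first tip, the sketch below parses the text printed by redis-cli INFO memory and reports how close Redis is to its limit. The function name and the 80% warning threshold are assumptions made for this example, not Mastodon or Redis defaults.

```python
from typing import Optional


def redis_memory_ratio(info_text: str) -> Optional[float]:
    """Return used_memory / maxmemory from `INFO memory` output.

    Returns None when maxmemory is 0, which Redis uses to mean "no limit".
    """
    fields = {}
    for line in info_text.splitlines():
        line = line.strip()
        # Skip blank lines and section headers such as "# Memory".
        if not line or line.startswith("#") or ":" not in line:
            continue
        key, value = line.split(":", 1)
        fields[key] = value
    used = int(fields["used_memory"])
    limit = int(fields["maxmemory"])
    if limit == 0:
        return None
    return used / limit


# Hypothetical sample: 768 MiB used out of a 1 GiB limit.
sample = "# Memory\r\nused_memory:805306368\r\nmaxmemory:1073741824\r\n"
ratio = redis_memory_ratio(sample)
if ratio is not None and ratio > 0.8:  # assumed warning threshold
    print(f"Redis is at {ratio:.0%} of maxmemory")
```

A None result is worth alerting on too, since it means no maxmemory limit has been set.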

HTTP Caching

Proper HTTP caching reduces server load significantly:

Static assets: Set long cache headers for CSS, JS, and images. Mastodon fingerprints assets, so you can cache them aggressively.
API responses: Some API responses include cache headers. Ensure your reverse proxy (Nginx) respects them.
Media proxy cache: Configure Nginx to cache proxied media from remote instances. This reduces repeated fetches.
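
The static-asset rule can be sketched as a small policy function. The path prefixes and the one-year lifetime are assumptions for illustration; confirm which paths your Mastodon version actually fingerprints before translating this into reverse proxy configuration.

```python
from typing import Optional

ONE_YEAR = 365 * 24 * 60 * 60  # seconds

# Hypothetical prefixes for fingerprinted build output.
FINGERPRINTED_PREFIXES = ("/packs/", "/assets/")


def cache_control_for(path: str) -> Optional[str]:
    """Pick a Cache-Control override for a request path.

    Returns None when the application's own headers should be passed through.
    """
    if path.startswith(FINGERPRINTED_PREFIXES):
        # The file name changes whenever the content changes,
        # so a cached copy never goes stale.
        return f"public, max-age={ONE_YEAR}, immutable"
    return None


print(cache_control_for("/packs/js/application-abc123.js"))
print(cache_control_for("/api/v1/timelines/home"))
```

Returning None for everything else mirrors the second tip: API responses keep whatever cache headers Mastodon sets.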

Application-Level Caching

Mastodon caches timelines, account relationships, and other frequently accessed data in Redis. Alongside that cache, these settings determine how much concurrent load the application can absorb:

MAX_THREADS for the web process to handle more concurrent requests
Connection pooling for PostgreSQL to reduce connection overhead
Streaming API connection limits based on your expected concurrent user count
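
These settings interact through the database: every extra thread can hold a PostgreSQL connection. The sketch below adds up a worst-case connection count, assuming one connection per web and Sidekiq thread and a fixed pool per streaming process. The process counts and pool sizes are hypothetical.

```python
from typing import Sequence


def postgres_connections_needed(
    web_processes: int,
    max_threads: int,
    sidekiq_threads: Sequence[int],
    streaming_pool_sizes: Sequence[int],
) -> int:
    """Worst-case PostgreSQL connections if every thread is busy at once."""
    web = web_processes * max_threads      # one connection per web thread
    sidekiq = sum(sidekiq_threads)         # one per thread, summed over processes
    streaming = sum(streaming_pool_sizes)  # each streaming process has its own pool
    return web + sidekiq + streaming


# Hypothetical layout: 4 web processes x 5 threads, two Sidekiq
# processes (25 and 10 threads), one streaming process with a pool of 10.
total = postgres_connections_needed(4, 5, [25, 10], [10])
print(total)  # 20 + 35 + 10 = 65
```

Compare the result with PostgreSQL's max_connections setting, leaving headroom for maintenance and backup sessions.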

Object Storage Deep Dive

Media storage is the most common scaling challenge. Moving from local disk to object storage is essential for any growing instance.

Why Object Storage

Separates compute from storage: Your server’s disk does not fill up with media files
Cost-effective: S3-compatible storage usually costs less per GB than VPS disk
CDN-friendly: Object storage integrates naturally with CDN distribution
Scalable: Capacity grows on demand, so there is no fixed volume to outgrow

Setup Process

Choose a provider (S3, Backblaze B2, Wasabi, MinIO for self-hosted)
Create a bucket with appropriate access policies
Configure Mastodon’s .env.production with storage credentials
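Copy existing local media (Mastodon's public/system directory) into the bucket with a sync tool such as rclone or aws s3 sync, since tootctl has no command that moves local files into object storage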
Verify federation and media serving work correctly
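
For the configuration step, the sketch below renders the kind of lines that end up in .env.production. The variable names follow Mastodon's documented S3 settings as best I know them; the bucket, region, and host names are placeholders, and you should verify the full list against the documentation for your version.

```python
from typing import Optional


def render_s3_env(
    bucket: str,
    region: str,
    hostname: str,
    endpoint: Optional[str] = None,
    alias_host: Optional[str] = None,
) -> str:
    """Render S3-related lines for .env.production (secrets left as placeholders)."""
    lines = [
        "S3_ENABLED=true",
        f"S3_BUCKET={bucket}",
        f"S3_REGION={region}",
        f"S3_HOSTNAME={hostname}",
        "S3_PROTOCOL=https",
        "AWS_ACCESS_KEY_ID=<access key>",
        "AWS_SECRET_ACCESS_KEY=<secret key>",
    ]
    if endpoint:
        # Non-AWS providers generally need an explicit endpoint URL.
        lines.append(f"S3_ENDPOINT={endpoint}")
    if alias_host:
        # Public host name (for example a CDN) used in media URLs.
        lines.append(f"S3_ALIAS_HOST={alias_host}")
    return "\n".join(lines)


# All values below are hypothetical.
env = render_s3_env(
    bucket="example-mastodon-media",
    region="us-east-1",
    hostname="s3.example.com",
    endpoint="https://s3.example.com",
    alias_host="media.example.com",
)
print(env)
```

Restart the web and Sidekiq processes after changing these values so that every process picks up the new storage backend.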

Cost Management

Media storage costs grow with your instance:

Remote media cache: Mastodon caches media from federated posts. Run tootctl media remove on a schedule (its --days option sets the age cutoff) to clean out old remote media.
Storage tiers: Some providers offer cold storage for rarely accessed media at lower cost.
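Lifecycle policies: Your storage provider can delete old objects automatically. Limit such rules to cached remote media and prefer tootctl where possible, because Mastodon's database does not learn about files that are deleted behind its back.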
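
A quick way to reason about the remote media cache: with regular cleanup, its size settles at roughly the daily inflow multiplied by the retention window. The sketch below works through that arithmetic. The inflow figure and the price per GB are made-up inputs, not measurements or provider quotes.

```python
def steady_state_cache_gb(daily_inflow_gb: float, retention_days: int) -> float:
    """Approximate cache size once cleanup removes as much as arrives."""
    return daily_inflow_gb * retention_days


def monthly_storage_cost(stored_gb: float, price_per_gb_month: float) -> float:
    """Storage-only cost; request and egress fees are not included."""
    return stored_gb * price_per_gb_month


# Hypothetical inputs: 5 GB of remote media per day, kept for 7 days,
# at an assumed price of 0.006 currency units per GB-month.
cache_gb = steady_state_cache_gb(5.0, 7)
cost = monthly_storage_cost(cache_gb, 0.006)
print(f"{cache_gb:.0f} GB cached, about {cost:.2f} per month")
```

Doubling the retention window doubles the steady-state size, which is why the cutoff passed to the cleanup job is the main cost lever for cached remote media.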

Database Scaling

PostgreSQL is Mastodon’s backbone. As your instance grows:

Connection Management

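Use PgBouncer as a connection pooler between Mastodon and PostgreSQL
Mastodon's documentation pairs PgBouncer's transaction pooling mode with PREPARED_STATEMENTS=false in .env.production
This reduces the number of direct database connections and improves performance under load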

Index Optimization

Mastodon includes database indexes for common queries, but as your data grows:

Run VACUUM ANALYZE regularly (or configure autovacuum properly)
Monitor slow queries using pg_stat_statements
Consider partial indexes for your specific usage patterns
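
To make the slow-query tip concrete, the sketch below ranks rows exported from pg_stat_statements by mean execution time per call. It assumes each row is a dict with the query, calls, and total_exec_time columns (the timing column is named total_time on PostgreSQL 12 and earlier); the sample rows are invented.

```python
from typing import Iterable, List, Mapping, Tuple


def slowest_queries(
    rows: Iterable[Mapping[str, object]], top_n: int = 5
) -> List[Tuple[str, float]]:
    """Return (query, mean milliseconds per call), slowest first."""
    scored = [
        (str(r["query"]), float(r["total_exec_time"]) / int(r["calls"]))
        for r in rows
        if int(r["calls"]) > 0  # avoid dividing by zero
    ]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_n]


# Invented sample rows; times are in milliseconds.
rows = [
    {"query": "SELECT ... FROM statuses ...", "calls": 10, "total_exec_time": 100.0},
    {"query": "SELECT ... FROM follows ...", "calls": 2, "total_exec_time": 400.0},
]
for query, mean_ms in slowest_queries(rows):
    print(f"{mean_ms:8.1f} ms  {query}")
```

Sorting by total_exec_time instead highlights queries that are individually cheap but run so often that they dominate database load.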

Read Replicas

For larger instances, PostgreSQL read replicas can offload read queries:

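Point Mastodon at a read replica so that replica-safe reads are served there (recent releases include replica connection settings; check the documentation for your version)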
Keep writes on the primary
This significantly reduces primary database load

Maintenance

Automate daily backups (pg_dump or WAL-based continuous backup)
Test restore procedures quarterly
Monitor disk usage and connection counts
Plan for major version upgrades (test thoroughly before upgrading)

Sidekiq Tuning

Sidekiq is Mastodon’s job processor. Tuning it is critical for federation performance.

Queue Priorities

Mastodon uses multiple Sidekiq queues with different priorities:

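default: Jobs that affect local users, such as timeline updates and notifications
push: Outbound federation delivery to other servers
ingress: Incoming activities from remote servers
pull: Lower-priority work such as imports, backups, and fetching remote content
mailers: Email delivery
scheduler: Periodic tasks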

Scaling Workers

Increase the number of Sidekiq threads for higher throughput, and raise DB_POOL to match, since each thread needs its own database connection
Run multiple Sidekiq processes with different queue assignments
Monitor queue latency: if jobs wait more than a few minutes, add capacity
Consider dedicated Sidekiq servers for large instances
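
Splitting queues across processes comes down to starting Sidekiq several times with different -q flags. The sketch below builds those command lines. The three-process layout and the thread counts are hypothetical, and ingress is Mastodon's queue for incoming remote activities.

```python
from typing import Sequence


def sidekiq_command(queues: Sequence[str], threads: int) -> str:
    """Build a Sidekiq command line: -c sets threads, each -q adds a queue."""
    args = ["bundle", "exec", "sidekiq", "-c", str(threads)]
    for queue in queues:
        args += ["-q", queue]
    return " ".join(args)


# Hypothetical split across three processes.
layout = [
    (["default", "mailers", "scheduler"], 15),
    (["ingress", "pull"], 20),
    (["push"], 25),
]
for queues, threads in layout:
    print(sidekiq_command(queues, threads))
```

Keep the scheduler queue in exactly one process so periodic jobs are not enqueued twice, and make sure every queue appears in at least one process.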

Queue Monitoring

Track these metrics:

Queue depth (how many jobs are waiting)
Processing rate (jobs per second)
Error rate (failed jobs)
Retry queue size (jobs that failed and are retrying)
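
A minimal alert check over these metrics might look like the sketch below. It assumes you already collect per-queue latency in seconds (Sidekiq exposes this per queue); the 180-second threshold is a stand-in for "a few minutes" and the sample numbers are invented.

```python
from typing import List, Mapping


def queues_needing_capacity(
    latency_seconds: Mapping[str, float], threshold_seconds: float = 180.0
) -> List[str]:
    """Return the names of queues whose oldest job has waited too long."""
    return sorted(
        name
        for name, seconds in latency_seconds.items()
        if seconds > threshold_seconds
    )


# Invented snapshot of queue latencies, in seconds.
snapshot = {"default": 2.5, "push": 640.0, "pull": 95.0, "mailers": 0.4}
for name in queues_needing_capacity(snapshot):
    print(f"queue {name} is backed up; consider adding workers")
```

Alerting on latency rather than raw queue depth avoids false alarms from large bursts of jobs that are nevertheless being drained quickly.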

Our developer notes discuss monitoring approaches in more detail.

CDN Integration

A CDN (Content Delivery Network) dramatically improves media delivery performance:

Static Assets

Serve Mastodon’s CSS, JavaScript, and images through a CDN:

Configure CDN_HOST in your environment
The CDN pulls from your server and caches globally
Users worldwide get faster page loads

Media CDN

For media (user uploads and federated media):

Point your object storage bucket through a CDN
Configure appropriate cache headers
Set up custom domain for cleaner URLs

Benefits

Reduced bandwidth on your origin server
Faster media loading for users worldwide
Protection against traffic spikes
Lower overall bandwidth costs (CDN bandwidth is often cheaper at scale)
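
The bandwidth benefit is easy to estimate: only cache misses reach your origin. The sketch below applies that relationship; the traffic volume and hit ratio are hypothetical inputs you would replace with figures from your CDN's analytics.

```python
def origin_egress_gb(total_delivered_gb: float, cache_hit_ratio: float) -> float:
    """Traffic the origin still serves when the CDN absorbs the cache hits."""
    if not 0.0 <= cache_hit_ratio <= 1.0:
        raise ValueError("cache_hit_ratio must be between 0 and 1")
    return total_delivered_gb * (1.0 - cache_hit_ratio)


# Hypothetical month: 2000 GB delivered to users at a 75% hit ratio.
print(origin_egress_gb(2000.0, 0.75))  # 2000 * 0.25 = 500.0
```

Long cache lifetimes on immutable media raise the hit ratio, which is where most of the origin savings come from.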

Monitoring and Alerting

Set up monitoring before you need it:

Server metrics: CPU, RAM, disk, network (Prometheus + Grafana is popular)
Application metrics: Request latency, error rates, queue depths
Database metrics: Connection count, query performance, disk usage
Alerts: Set thresholds for critical metrics (disk space, queue depth, error rate)
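
For disk space, a trend is often more useful than a fixed threshold. The sketch below estimates how many days remain before a volume fills at the current growth rate. The capacity, usage, and growth numbers are hypothetical, and linear growth is an assumption.

```python
from typing import Optional


def days_until_full(
    capacity_gb: float, used_gb: float, growth_gb_per_day: float
) -> Optional[float]:
    """Days of headroom left, or None if usage is flat or shrinking."""
    if growth_gb_per_day <= 0:
        return None
    free_gb = max(capacity_gb - used_gb, 0.0)
    return free_gb / growth_gb_per_day


# Hypothetical volume: 500 GB total, 380 GB used, growing 4 GB per day.
remaining = days_until_full(500.0, 380.0, 4.0)
print(remaining)  # (500 - 380) / 4 = 30.0
```

An alert that fires when the remaining days drop below your provisioning lead time leaves room to act before the disk is full.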

Common Mistakes

Waiting too long to move to object storage: Start with object storage; retrofitting is painful
Not monitoring Sidekiq queues: Federation delays are invisible until you look at the queue
Ignoring media cleanup: Remote media cache can consume terabytes if not managed
Over-optimizing prematurely: Profile first, then optimize the actual bottleneck
Skipping connection pooling: PgBouncer is almost always worth the setup effort

Frequently Asked Questions

How many users before I need to scale? It depends on activity patterns more than user count. An instance with 100 very active users may need more resources than one with 500 casual users. Monitor your metrics and scale based on actual load.

Is Docker or bare-metal better for scaling? Both work. Docker simplifies deployment and scaling of individual services. Bare-metal gives you more control. For large instances, Kubernetes deployments are becoming more common. See our tools page for infrastructure resources.

How much does media storage cost? Costs vary by provider and usage. A moderately active instance might use 50–200 GB per month in new media. With remote media cleanup, total storage stays manageable. Budget accordingly.

Can I scale horizontally? Yes. Mastodon supports running multiple web processes, streaming servers, and Sidekiq workers across multiple machines. PostgreSQL and Redis are the shared coordination points, so every machine must be able to reach both.

When should I consider managed hosting instead? If infrastructure management takes more time than you want to spend, managed hosting services handle scaling for you. This is a valid choice at any scale. Check our articles hub for hosting guides.

How do relays affect scaling? Relays increase federation traffic and media storage. They improve content diversity but add load. Subscribe to relays judiciously and monitor the impact on your instance's resources.