Your Mastodon instance is growing. Federation traffic is increasing, media storage is ballooning, and Sidekiq queues are getting longer. This guide covers the practical scaling strategies that instance admins use in 2026 to keep their servers fast and reliable as usage grows.
What You’ll Know by the End
- Where Mastodon instances typically bottleneck as they grow
- How to implement effective caching strategies
- Object storage setup and optimization for media
- Database scaling approaches for PostgreSQL
- Sidekiq tuning and background job management
- CDN integration for static assets and media
Where Mastodon Bottlenecks
Before optimizing, understand where the pressure comes from:
Sidekiq (background jobs): Federation delivery, media processing, email, and scheduled tasks all flow through Sidekiq. This is usually the first bottleneck.
PostgreSQL: The database handles user data, posts, relationships, and timeline generation. Complex queries (especially for timelines with many follows) can slow down.
Media storage: Every federated post with attachments means media stored on your server (or fetched on demand). This grows faster than most admins expect.
Web and streaming: The web process handles API requests and page rendering. The streaming process handles real-time WebSocket connections. Both scale with concurrent users.
Caching Strategies
Effective caching reduces load on your database and application servers.
Redis Caching
Mastodon uses Redis extensively for:
- Session storage
- Background job queues (Sidekiq)
- Timeline caching
- Rate limiting
Optimization tips:
- Monitor Redis memory usage; set a reasonable maxmemory limit
- Use a dedicated Redis instance if you share Redis with other services
- Consider separate Redis instances for cache vs. persistent data (Sidekiq queues)
- Enable RDB snapshots for persistence but tune snapshot frequency to avoid performance spikes
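As a rough illustration of the memory-monitoring tip, the sketch below parses the text returned by the Redis `INFO memory` command and reports how much of the configured limit is in use. The sample text and the function name are made up for the example; `used_memory` and `maxmemory` are the fields Redis reports.

```python
from typing import Optional


def redis_memory_utilization(info_text: str) -> Optional[float]:
    """Return used_memory / maxmemory, or None when no limit is configured."""
    fields = {}
    for line in info_text.splitlines():
        line = line.strip()
        # Skip blank lines and section headers such as "# Memory".
        if not line or line.startswith("#") or ":" not in line:
            continue
        key, _, value = line.partition(":")
        fields[key] = value
    used = int(fields["used_memory"])
    limit = int(fields.get("maxmemory", "0"))
    if limit == 0:  # Redis reports 0 when maxmemory is unset
        return None
    return used / limit


# Hypothetical sample: 768 MiB used out of a 1 GiB limit.
sample = "# Memory\r\nused_memory:805306368\r\nmaxmemory:1073741824\r\n"
print(redis_memory_utilization(sample))  # 0.75
```

A result of None is itself worth alerting on, since it means Redis can grow until the host runs out of memory.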
HTTP Caching
Proper HTTP caching reduces server load significantly:
- Static assets: Set long cache headers for CSS, JS, and images. Mastodon fingerprints assets, so you can cache them aggressively.
- API responses: Some API responses include cache headers. Ensure your reverse proxy (Nginx) respects them.
- Media proxy cache: Configure Nginx to cache proxied media from remote instances. This reduces repeated fetches.
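The caching rules above can be sketched as a small policy function. This is only an illustration of the decision logic, not Nginx configuration: the path prefixes and the one-day TTL for proxied media are assumptions you would adapt to your own setup.

```python
ONE_YEAR = 365 * 24 * 60 * 60  # seconds


def cache_control_for(path: str) -> str:
    """Pick a Cache-Control value for a request path (illustrative policy)."""
    # Assumption: fingerprinted build output is served under these prefixes,
    # so a changed file always gets a new URL and can be cached indefinitely.
    if path.startswith(("/packs/", "/assets/")):
        return f"public, max-age={ONE_YEAR}, immutable"
    # Assumption: proxied remote media rarely changes; a shorter TTL bounds staleness.
    if path.startswith("/media_proxy/"):
        return "public, max-age=86400"
    # Everything else: add nothing and respect the headers the application sends.
    return ""


print(cache_control_for("/packs/js/application-abc123.js"))
print(repr(cache_control_for("/api/v1/timelines/home")))
```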
Application-Level Caching
Mastodon caches timelines, account relationships, and other frequently accessed data in Redis. You can tune:
- MAX_THREADS for the web process to handle more concurrent requests
- Connection pooling for PostgreSQL to reduce connection overhead
- Streaming API connection limits based on your expected concurrent user count
Object Storage Deep Dive
Media storage is the most common scaling challenge. Moving from local disk to object storage is essential for any growing instance.
Why Object Storage
- Separates compute from storage: Your server’s disk does not fill up with media files
- Cost-effective: S3-compatible storage is cheap per GB compared to VPS disk
- CDN-friendly: Object storage integrates naturally with CDN distribution
- Scalable: No practical storage limits compared to local disk
Setup Process
- Choose a provider (S3, Backblaze B2, Wasabi, MinIO for self-hosted)
- Create a bucket with appropriate access policies
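- Configure Mastodon’s .env.production with storage credentials
- Migrate existing local media by copying it to the bucket with an S3 sync tool (for example aws s3 sync or rclone); the tootctl media commands manage and clean up stored media but do not copy files between storage backends
- Verify federation and media serving work correctly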
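Before restarting services after the configuration step, it can help to confirm that the storage settings are actually present. The sketch below parses the text of an env file and lists which settings are missing or empty. The variable names are the ones Mastodon reads for S3-compatible storage; treating these four as the minimum set is an assumption, and many providers need more (region, endpoint, hostname).

```python
# Assumed minimum set; your provider may require additional variables.
REQUIRED = ("S3_ENABLED", "S3_BUCKET", "AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")


def parse_env(text: str) -> dict:
    """Parse simple KEY=value lines, ignoring blanks and comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip().strip('"')
    return settings


def missing_storage_settings(text: str) -> list:
    """Return required settings that are absent or empty."""
    settings = parse_env(text)
    return [key for key in REQUIRED if not settings.get(key)]


# Hypothetical file contents with the credentials not yet filled in.
example = """
S3_ENABLED=true
S3_BUCKET=example-media
AWS_ACCESS_KEY_ID=
"""
print(missing_storage_settings(example))
# ['AWS_ACCESS_KEY_ID', 'AWS_SECRET_ACCESS_KEY']
```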
Cost Management
Media storage costs grow with your instance:
- Remote media cache: Mastodon caches media from federated posts. Use tootctl media remove to periodically clean old remote media.
- Storage tiers: Some providers offer cold storage for rarely accessed media at lower cost.
- Lifecycle policies: Configure automatic deletion of old cached media in your storage provider.
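To see why cleanup matters for cost, here is a back-of-the-envelope sketch. If remote media is removed after a fixed retention window, the cache levels off at roughly daily ingest multiplied by the retention period. All numbers below, including the price per GB, are hypothetical placeholders rather than quotes from any provider.

```python
def steady_state_cache_gb(ingest_gb_per_day: float, retention_days: int) -> float:
    """Approximate remote-cache size once cleanup runs on a fixed window."""
    return ingest_gb_per_day * retention_days


def monthly_cost(stored_gb: float, price_per_gb_month: float) -> float:
    """Storage-only cost; ignores request and egress fees."""
    return stored_gb * price_per_gb_month


# Hypothetical instance: 5 GB/day of remote media, 200 GB of local uploads.
cache_7d = steady_state_cache_gb(5, 7)    # 35 GB with a 7-day window
cache_90d = steady_state_cache_gb(5, 90)  # 450 GB with a 90-day window
print(cache_7d, cache_90d)
print(round(monthly_cost(200 + cache_7d, 0.006), 2))   # 1.41
print(round(monthly_cost(200 + cache_90d, 0.006), 2))  # 3.9
```

The retention window is the main lever here: the local uploads are fixed, while the cached share scales linearly with how long remote media is kept.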
Database Scaling
PostgreSQL is Mastodon’s backbone. As your instance grows:
Connection Management
- Use PgBouncer as a connection pooler between Mastodon and PostgreSQL
- This reduces the number of direct database connections and improves performance under load
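A quick way to judge whether a pooler is needed is to add up how many connections the application could open directly. The sketch below is a rough upper bound, assuming every web thread, Sidekiq thread, and streaming pool slot can hold one connection at a time. The process and thread counts are illustrative.

```python
def direct_connections(
    web_processes: int,
    web_threads: int,
    sidekiq_processes: int,
    sidekiq_threads: int,
    streaming_processes: int,
    streaming_pool: int,
) -> int:
    """Upper bound on simultaneous PostgreSQL connections without a pooler."""
    web = web_processes * web_threads
    sidekiq = sidekiq_processes * sidekiq_threads
    streaming = streaming_processes * streaming_pool
    return web + sidekiq + streaming


# Illustrative deployment: 4 web processes x 5 threads,
# 3 Sidekiq processes x 25 threads, 1 streaming process with a pool of 10.
total = direct_connections(4, 5, 3, 25, 1, 10)
print(total)  # 105

# PostgreSQL's default max_connections is 100, so this layout would already
# exceed it without pooling or a raised limit.
print(total > 100)  # True
```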
Index Optimization
Mastodon includes database indexes for common queries, but as your data grows:
- Run VACUUM ANALYZE regularly (or configure autovacuum properly)
- Monitor slow queries using pg_stat_statements
- Consider partial indexes for your specific usage patterns
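When reading pg_stat_statements output, sorting by total time rather than mean time usually surfaces the queries that cost the most overall. The sketch below ranks rows that have already been fetched. The rows are invented for illustration, and the column name `total_exec_time` assumes PostgreSQL 13 or later (older versions call it `total_time`).

```python
def top_queries(rows: list, limit: int = 3) -> list:
    """Rank by cumulative execution time; return (query, mean ms per call)."""
    ranked = sorted(rows, key=lambda row: row["total_exec_time"], reverse=True)
    return [
        (row["query"], round(row["total_exec_time"] / row["calls"], 2))
        for row in ranked[:limit]
    ]


# Hypothetical rows; times are in milliseconds.
rows = [
    {"query": "SELECT ... FROM statuses ...", "calls": 1000, "total_exec_time": 5000.0},
    {"query": "SELECT ... FROM accounts ...", "calls": 50000, "total_exec_time": 20000.0},
    {"query": "UPDATE ... SET ...", "calls": 10, "total_exec_time": 900.0},
]

for query, mean_ms in top_queries(rows, limit=2):
    print(query, mean_ms)
# SELECT ... FROM accounts ... 0.4
# SELECT ... FROM statuses ... 5.0
```

Note that the accounts query is fast per call but dominates total time because of its call count, which is the kind of query an index or cache is most likely to help.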
Read Replicas
For larger instances, PostgreSQL read replicas can offload read queries:
- Configure Mastodon to use a read replica for timeline and search queries
- Keep writes on the primary
- This significantly reduces primary database load
Maintenance
- Automate daily backups (pg_dump or WAL-based continuous backup)
- Test restore procedures quarterly
- Monitor disk usage and connection counts
- Plan for major version upgrades (test thoroughly before upgrading)
Sidekiq Tuning
Sidekiq is Mastodon’s job processor. Tuning it is critical for federation performance.
Queue Priorities
Mastodon uses multiple Sidekiq queues with different priorities:
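- default: Jobs that affect local users, such as delivering new posts to followers’ home timelines
- push: Outbound federation delivery to other servers
- ingress: Incoming activities from remote servers
- pull: Lower-priority work such as imports, backups, and fetching remote threads
- mailers: Email delivery
- scheduler: Periodic tasks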
Scaling Workers
- Increase the number of Sidekiq threads for higher throughput
- Run multiple Sidekiq processes with different queue assignments
- Monitor queue latency: if jobs routinely wait more than a few minutes, add capacity
- Consider dedicated Sidekiq servers for large instances
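Splitting queues across processes can be sketched as a simple plan that is turned into Sidekiq command lines. The plan and thread counts below are hypothetical; `-c` (concurrency) and `-q` (queue) are standard Sidekiq options. The helper also checks that no queue is left without a worker, which is an easy mistake when dividing queues by hand.

```python
ALL_QUEUES = ["default", "push", "pull", "mailers", "scheduler"]

# Hypothetical plan: (thread count, queues served by that process).
PLAN = [
    (25, ["default"]),
    (25, ["push", "pull"]),
    (5, ["mailers", "scheduler"]),
]


def sidekiq_command(concurrency: int, queues: list) -> str:
    """Build the command line for one Sidekiq process."""
    parts = ["bundle", "exec", "sidekiq", "-c", str(concurrency)]
    for queue in queues:
        parts += ["-q", queue]
    return " ".join(parts)


def unassigned_queues(plan: list, all_queues: list) -> list:
    """Return queues that no process in the plan would consume."""
    covered = {queue for _, queues in plan for queue in queues}
    return [queue for queue in all_queues if queue not in covered]


for concurrency, queues in PLAN:
    print(sidekiq_command(concurrency, queues))
print(unassigned_queues(PLAN, ALL_QUEUES))  # []
```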
Queue Monitoring
Track these metrics:
- Queue depth (how many jobs are waiting)
- Processing rate (jobs per second)
- Error rate (failed jobs)
- Retry queue size (jobs that failed and are retrying)
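Queue depth and processing rate combine into a rough wait estimate: depth divided by jobs per second. The sketch below applies that arithmetic with a three-minute threshold, which is an assumed value standing in for "a few minutes". It ignores new jobs arriving while the backlog drains, so treat the result as a lower bound.

```python
def estimated_wait_seconds(queue_depth: int, jobs_per_second: float) -> float:
    """Time for the newest job to reach the front, ignoring new arrivals."""
    if jobs_per_second <= 0:
        return float("inf")  # nothing is being processed
    return queue_depth / jobs_per_second


def needs_capacity(queue_depth: int, jobs_per_second: float,
                   max_wait_seconds: float = 180) -> bool:
    """True when the estimated wait exceeds the (assumed) threshold."""
    return estimated_wait_seconds(queue_depth, jobs_per_second) > max_wait_seconds


print(estimated_wait_seconds(12000, 40))  # 300.0
print(needs_capacity(12000, 40))          # True
print(needs_capacity(1200, 40))           # False
```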
Our developer notes discuss monitoring approaches in more detail.
CDN Integration
A CDN (Content Delivery Network) dramatically improves media delivery performance:
Static Assets
Serve Mastodon’s CSS, JavaScript, and images through a CDN:
- Configure CDN_HOST in your environment
- The CDN pulls from your server and caches globally
- Users worldwide get faster page loads
Media CDN
For media (user uploads and federated media):
- Point your object storage bucket through a CDN
- Configure appropriate cache headers
- Set up a custom domain for cleaner URLs
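What "pointing the bucket through a CDN" means for a single media URL can be shown with a small rewrite: the path stays the same and only the host changes. Both hostnames below are placeholders, and the sketch assumes the bucket is addressed by hostname rather than by a path prefix.

```python
from urllib.parse import urlsplit, urlunsplit


def to_cdn_url(media_url: str, cdn_host: str) -> str:
    """Swap the bucket hostname for the CDN hostname, keeping path and query."""
    parts = urlsplit(media_url)
    return urlunsplit(("https", cdn_host, parts.path, parts.query, parts.fragment))


# Placeholder hostnames for illustration only.
origin = "https://bucket.s3.example.com/media_attachments/files/1/original/a.png"
print(to_cdn_url(origin, "media.example.social"))
# https://media.example.social/media_attachments/files/1/original/a.png
```

In practice Mastodon generates these URLs itself once the media host is configured; the function only illustrates the mapping the CDN has to serve.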
Benefits
- Reduced bandwidth on your origin server
- Faster media loading for users worldwide
- Protection against traffic spikes
- Lower overall bandwidth costs (CDN bandwidth is often cheaper at scale)
Monitoring and Alerting
Set up monitoring before you need it:
- Server metrics: CPU, RAM, disk, network (Prometheus + Grafana is popular)
- Application metrics: Request latency, error rates, queue depths
- Database metrics: Connection count, query performance, disk usage
- Alerts: Set thresholds for critical metrics (disk space, queue depth, error rate)
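For disk space in particular, a threshold on days remaining is often more useful than one on percent used. A minimal sketch of that calculation follows. The capacity, usage, growth rate, and 14-day warning window are all illustrative values.

```python
from typing import Optional


def days_until_full(capacity_gb: float, used_gb: float,
                    growth_gb_per_day: float) -> Optional[float]:
    """Linear forecast of remaining days; None if usage is not growing."""
    if growth_gb_per_day <= 0:
        return None
    return (capacity_gb - used_gb) / growth_gb_per_day


def disk_alert(capacity_gb: float, used_gb: float,
               growth_gb_per_day: float, warn_days: float = 14) -> bool:
    """True when the disk is forecast to fill within warn_days."""
    remaining = days_until_full(capacity_gb, used_gb, growth_gb_per_day)
    return remaining is not None and remaining < warn_days


print(days_until_full(500, 430, 7))  # 10.0
print(disk_alert(500, 430, 7))       # True
print(disk_alert(500, 100, 7))       # False
```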
Common Mistakes
- Waiting too long to move to object storage: Start with object storage; retrofitting is painful
- Not monitoring Sidekiq queues: Federation delays are invisible until you look at the queue
- Ignoring media cleanup: Remote media cache can consume terabytes if not managed
- Over-optimizing prematurely: Profile first, then optimize the actual bottleneck
- Skipping connection pooling: PgBouncer is almost always worth the setup effort
Frequently Asked Questions
How many users before I need to scale? It depends on activity patterns more than user count. An instance with 100 very active users may need more resources than one with 500 casual users. Monitor your metrics and scale based on actual load.
Is Docker or bare-metal better for scaling? Both work. Docker simplifies deployment and scaling of individual services. Bare-metal gives you more control. For large instances, Kubernetes deployments are becoming more common. See our tools page for infrastructure resources.
How much does media storage cost? Costs vary by provider and usage. A moderately active instance might use 50–200 GB per month in new media. With remote media cleanup, total storage stays manageable. Budget accordingly.
Can I scale horizontally? Yes. Mastodon supports running multiple web processes, streaming servers, and Sidekiq workers across multiple machines. PostgreSQL and Redis are the shared coordination points, so every machine needs network access to both.
When should I consider managed hosting instead? If infrastructure management takes more time than you want to spend, managed hosting services handle scaling for you. This is a valid choice at any scale. Check our articles hub for hosting guides.
How do relays affect scaling? Relays increase federation traffic and media storage. They improve content diversity but add load. Subscribe to relays judiciously and monitor the impact on your instance’s resources.