Results at a Glance
- 40% Cost Reduction: AWS bill cut in 14 days
- $3,800 Monthly Savings: From $9,400 to $5,600/mo
- $45,600 Annual Savings: Recurring year-over-year
1. Client Overview
Our client is a US-based healthcare technology company operating a patient data management platform serving hospitals and outpatient clinics across three states. Their platform handles protected health information (PHI), appointment scheduling, clinical note storage, and real-time provider communications — all under strict HIPAA regulatory requirements.
With a growing user base and an engineering team fully committed to product development, infrastructure management had taken a back seat. The result was an AWS bill that had quietly ballooned to $9,400 per month, and nobody on the team could explain exactly why.
2. The Problem: $9,400/Month and No Idea Why
When the client’s CTO first contacted Skynats, the conversation started with a telling admission: “We know we’re overpaying. We just don’t know where.”
This is one of the most common situations we encounter. Fast-growing teams provision infrastructure to meet immediate needs and move on. Reserved Instances expire. Test environments stay running. Databases keep their original spec even after traffic patterns shift. Over time, the bill grows — not from bad decisions, but from decisions that were never revisited.
The compounding factor for this client was compliance. Any infrastructure change had to preserve full HIPAA compliance — encryption at rest and in transit, access controls, audit logging, and data isolation. This meant a standard cost optimisation playbook couldn’t simply be applied — every change required validation against the compliance framework first.
🎯 The Core Challenge
Reduce a $9,400/month AWS bill by a meaningful percentage — with full HIPAA compliance maintained at every step, zero downtime, and no changes to application code.
3. The Skynats AWS Infrastructure Audit
Before touching anything, our engineers spent the first three days conducting a full read-only infrastructure audit. This is a non-negotiable step — acting without a complete picture is how optimisation projects cause outages.
Our audit covered:
- EC2 fleet analysis: Instance types, utilisation metrics from CloudWatch, rightsizing opportunities, Reserved Instance coverage, and Spot Instance suitability per workload.
- RDS database review: Instance classes, storage provisioning, Multi-AZ configuration, backup retention, and read replica deployment.
- Data transfer mapping: Cross-AZ traffic, NAT Gateway usage, CloudFront coverage, and inter-region transfer costs.
- Zombie resource sweep: Unattached EBS volumes, idle Elastic IPs, unused load balancers, and forgotten snapshots.
- Auto-scaling review: Scaling policies, minimum/maximum thresholds, and schedule-based scaling opportunities.
- HIPAA compliance layer: Encryption status of all data stores, CloudTrail audit log coverage, VPC flow logs, IAM permission scope, and security group configurations.
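The zombie resource sweep in the list above lends itself to a small read-only script. The following is a minimal sketch, not the tooling used in this engagement: the function names are hypothetical, and it assumes the resource dictionaries have the shape returned by boto3's `describe_volumes` and `describe_addresses` calls.

```python
from typing import Iterable


def unattached_volumes(volumes: Iterable[dict]) -> list[dict]:
    """EBS volumes in the 'available' state are not attached to any instance."""
    return [v for v in volumes if v.get("State") == "available"]


def idle_elastic_ips(addresses: Iterable[dict]) -> list[dict]:
    """Elastic IPs without an AssociationId are allocated but not in use."""
    return [a for a in addresses if "AssociationId" not in a]


def fetch_inventory(region: str) -> tuple[list[dict], list[dict]]:
    """Read-only inventory pull. Requires boto3 and ec2:Describe* permissions."""
    import boto3  # imported lazily so the filters above work without it

    ec2 = boto3.client("ec2", region_name=region)
    volumes: list[dict] = []
    for page in ec2.get_paginator("describe_volumes").paginate():
        volumes.extend(page["Volumes"])
    addresses = ec2.describe_addresses()["Addresses"]
    return volumes, addresses


# Illustrative data only; IDs and sizes are made up.
sample_volumes = [
    {"VolumeId": "vol-a", "State": "available", "Size": 100},
    {"VolumeId": "vol-b", "State": "in-use", "Size": 50},
]
orphans = unattached_volumes(sample_volumes)
print([v["VolumeId"] for v in orphans], sum(v["Size"] for v in orphans), "GiB")
```

Keeping the filters separate from the API calls means the sweep logic can be reviewed and tested without any access to the account.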
4. Key Findings
| Finding | Detail |
|---|---|
| EC2 Over-provisioning | 14 of 22 EC2 instances running at under 12% average CPU utilisation. Six production instances sized 2–3 tiers larger than required. Legacy m4/c4 families carrying a 25–35% price premium over current-generation equivalents. |
| No Reserved Instance Coverage | 100% of compute running On-Demand — the most expensive configuration available on AWS for a production workload with predictable base load. |
| RDS Sizing Mismatch | Primary RDS instance (db.r5.2xlarge) sized for peak load projections from 18 months prior. Actual database CPU rarely exceeded 22%. An unused read replica was still running and being billed. |
| Uncontrolled Data Transfer | EC2 instances communicating across AZ boundaries for non-critical tasks. S3 traffic routed through the NAT Gateway to the public S3 endpoint instead of a VPC endpoint, generating unnecessary NAT Gateway charges. |
| Zombie Resources | 11 unattached EBS volumes (2.4TB), 4 idle Elastic IPs, 2 load balancers with no registered targets, 847GB of unmanaged EBS snapshots — all generating ongoing charges. |
| No Auto-Scaling | Application layer had no auto-scaling groups. Fixed capacity running 24/7 at peak provisioning — even during overnight hours when traffic dropped to under 8% of peak. |
5. The Optimisation — What We Did
With the audit complete and every proposed change reviewed against HIPAA requirements, we executed the optimisation in a sequenced, low-risk order. Changes that could cause service disruption were scheduled during off-peak hours with the client’s engineering team on standby.
5.1 EC2 Right-Sizing and Instance Modernisation
We right-sized 14 EC2 instances, moving each to the appropriate instance size with 30% headroom for traffic spikes, and migrated all instances from legacy m4/c4 families to current-generation m6i/c6i equivalents. Every instance change was preceded by a 72-hour CloudWatch monitoring window and followed by a 48-hour post-change validation period to confirm performance and HIPAA logging continuity.
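The headroom rule can be expressed as a simple projection. This is a rough sketch under stated assumptions: it treats load as CPU-bound and linearly scalable across vCPUs, interprets "30% headroom" as a projected peak of at most 70% CPU, and uses hypothetical function names and sample figures. Memory, network, and per-core performance differences between instance families are ignored here and would need separate checks.

```python
from typing import Iterable, Optional


def projected_cpu(peak_cpu_pct: float, current_vcpus: int, candidate_vcpus: int) -> float:
    """Scale an observed peak CPU percentage to a candidate vCPU count."""
    return peak_cpu_pct * current_vcpus / candidate_vcpus


def smallest_fit(
    peak_cpu_pct: float,
    current_vcpus: int,
    candidates: Iterable[int],
    headroom_pct: float = 30.0,
) -> Optional[int]:
    """Return the smallest vCPU count whose projected peak keeps the headroom."""
    limit = 100.0 - headroom_pct
    for vcpus in sorted(candidates):
        if projected_cpu(peak_cpu_pct, current_vcpus, vcpus) <= limit:
            return vcpus
    return None  # no candidate is large enough


# Hypothetical instance: 8 vCPUs peaking at 20% CPU.
print(smallest_fit(20.0, 8, [2, 4, 8]))  # 4
```

In this example 2 vCPUs would project to 80% CPU at peak, which breaks the headroom rule, so the 4 vCPU size is chosen.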
5.2 Reserved Instances and Savings Plans
We analysed 12 months of usage data to identify the stable base load suitable for Reserved Instance pricing, and configured Compute Savings Plans for variable workloads.
💰 Savings Breakdown: Compute Commitments
- 1-year Reserved Instances on 8 stable production instances: 42% discount vs On-Demand
- Compute Savings Plans on the remaining flexible workload: 20–28% discount
- No upfront payment required — all structured as monthly billing
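To make the discount figures above concrete, here is a small arithmetic sketch. The $1,000/month baseline is a hypothetical figure chosen for illustration, not a number from this engagement; only the discount rates come from the breakdown above.

```python
def committed_monthly_cost(on_demand_monthly: float, discount: float) -> float:
    """Monthly cost after applying a commitment discount (0.42 means 42%)."""
    return on_demand_monthly * (1.0 - discount)


def break_even_months(discount: float, term_months: int = 12) -> float:
    """Months of On-Demand usage that cost the same as the whole committed term."""
    return term_months * (1.0 - discount)


# Hypothetical baseline: $1,000/month of On-Demand spend in each bucket.
ri_cost = committed_monthly_cost(1000.0, 0.42)
sp_cost_low = committed_monthly_cost(1000.0, 0.28)
sp_cost_high = committed_monthly_cost(1000.0, 0.20)
print(f"Reserved Instances: ${ri_cost:,.0f}/mo")
print(f"Savings Plans: ${sp_cost_low:,.0f} to ${sp_cost_high:,.0f}/mo")
print(f"Break-even at 42%: {break_even_months(0.42):.1f} months")
```

The break-even figure is the point at which a workload must keep running for the commitment to beat On-Demand pricing, which is why commitments were limited to the stable base load.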
5.3 RDS Optimisation
We downsized the primary RDS instance from db.r5.2xlarge to db.r6g.large — a two-tier size reduction that still provided 40% headroom above observed peak utilisation. The move to the r6g (Graviton2) family delivered an additional 10–15% cost reduction. The unused read replica was decommissioned after confirming no application queries were targeting its endpoint.
All changes maintained HIPAA compliance: encryption at rest using AWS KMS was preserved and validated post-migration, automated backups retained for 35 days, and CloudTrail logging confirmed active throughout.
5.4 Data Transfer Routing Fix
We corrected the S3 VPC endpoint misconfiguration so that all S3 traffic now flows through the private VPC endpoint rather than the NAT Gateway, which means S3 requests no longer incur NAT Gateway data-processing charges. This single fix reduced the monthly NAT Gateway line item by 68%.
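For readers who want to see what such an endpoint looks like in practice, the sketch below builds the request for boto3's `create_vpc_endpoint` call. It assumes a gateway-type endpoint, which is one common choice for S3 and carries no hourly or data-processing charge; the case study does not state which endpoint type was used. The helper names, VPC ID, and route table ID are hypothetical placeholders.

```python
def s3_gateway_endpoint_request(vpc_id: str, region: str, route_table_ids: list[str]) -> dict:
    """Build keyword arguments for ec2.create_vpc_endpoint (gateway type, S3)."""
    return {
        "VpcEndpointType": "Gateway",
        "VpcId": vpc_id,
        "ServiceName": f"com.amazonaws.{region}.s3",
        # Route tables of the subnets whose S3 traffic should use the endpoint.
        "RouteTableIds": list(route_table_ids),
    }


def create_endpoint(request: dict, region: str) -> str:
    """Create the endpoint and return its ID. Requires boto3 and write access."""
    import boto3  # imported lazily so the request can be built and reviewed offline

    ec2 = boto3.client("ec2", region_name=region)
    response = ec2.create_vpc_endpoint(**request)
    return response["VpcEndpoint"]["VpcEndpointId"]


# Placeholder identifiers for illustration only.
request = s3_gateway_endpoint_request("vpc-0example", "us-east-1", ["rtb-0example"])
print(request["ServiceName"])  # com.amazonaws.us-east-1.s3
```

Building the request separately from the API call allows the change to be reviewed against compliance requirements before anything is applied.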
5.5 Zombie Resource Cleanup
After client sign-off, we removed 11 unattached EBS volumes (2.4TB), 4 idle Elastic IPs, 2 load balancers with zero registered targets, and 847GB of EBS snapshots outside the client’s own retention policy.
5.6 Auto-Scaling Configuration
We implemented Auto Scaling Groups for the application tier with CPU-based scaling policies (scale out at 65% CPU, scale in at 35%) and scheduled scaling that reduced the minimum instance count during confirmed low-traffic windows (02:00–06:00 UTC). Overnight instance count fell by up to 60%, with automatic scale-out ahead of the morning traffic ramp.
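The scaling behaviour described above can be modelled in a few lines. This is an illustrative sketch of the decision logic only, not the Auto Scaling configuration itself: the function names and the instance counts (5 daytime, 2 overnight) are hypothetical, while the CPU thresholds and the time window come from the text.

```python
from datetime import datetime, timezone

SCALE_OUT_CPU = 65.0  # add capacity at or above this average CPU
SCALE_IN_CPU = 35.0   # remove capacity at or below this average CPU


def scaling_action(avg_cpu_pct: float) -> str:
    """Map average CPU utilisation to a scaling decision."""
    if avg_cpu_pct >= SCALE_OUT_CPU:
        return "scale_out"
    if avg_cpu_pct <= SCALE_IN_CPU:
        return "scale_in"
    return "hold"


def minimum_instances(now_utc: datetime, daytime_min: int, overnight_min: int) -> int:
    """Use the lower floor inside the 02:00-06:00 UTC low-traffic window."""
    return overnight_min if 2 <= now_utc.hour < 6 else daytime_min


# Hypothetical floors: 5 instances by day, 2 overnight (a 60% reduction).
night = datetime(2024, 1, 1, 3, 0, tzinfo=timezone.utc)
print(scaling_action(70.0), minimum_instances(night, daytime_min=5, overnight_min=2))
```

The gap between the two thresholds prevents the group from oscillating when utilisation hovers around a single cut-off.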
6. Results
| AWS Service | Before | After |
|---|---|---|
| EC2 Compute | $4,820/mo | $2,340/mo |
| RDS Database | $1,940/mo | $810/mo |
| Data Transfer | $1,280/mo | $410/mo |
| EBS Storage | $760/mo | $320/mo |
| Idle / Zombie Resources | $600/mo | $0/mo |
| TOTAL | $9,400/mo | $5,600/mo ✓ |
✅ Final Result
- Monthly saving: $3,800 (40.4% reduction)
- Annualised saving: $45,600
- HIPAA compliance: Fully maintained — confirmed by post-optimisation compliance review
- Application performance: Unchanged — zero degradation in response times or error rates
- Downtime: Zero
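The headline figures follow directly from the before and after monthly totals. A short sketch of the arithmetic, with a hypothetical helper name:

```python
def savings_summary(before_monthly: float, after_monthly: float) -> dict:
    """Derive monthly saving, percentage reduction, and annualised saving."""
    monthly_saving = before_monthly - after_monthly
    return {
        "monthly_saving": monthly_saving,
        "reduction_pct": round(monthly_saving / before_monthly * 100, 1),
        "annual_saving": monthly_saving * 12,
    }


summary = savings_summary(9400, 5600)
print(summary)  # {'monthly_saving': 3800, 'reduction_pct': 40.4, 'annual_saving': 45600}
```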
7. HIPAA Compliance — Maintained Throughout
Every infrastructure change was assessed against the HIPAA Security Rule before execution. Post-optimisation compliance verification confirmed:
- Encryption: All EBS volumes, RDS instances, and S3 buckets confirmed encrypted at rest (AES-256 / AWS KMS). All data in transit encrypted via TLS 1.2+.
- Access controls: IAM roles reviewed and scoped to least-privilege. No new permissions introduced during optimisation.
- Audit logging: CloudTrail, CloudWatch Logs, and VPC Flow Logs confirmed active and uninterrupted throughout all changes.
- Backup integrity: RDS automated backups and EBS snapshot policies confirmed operational post-resize. 35-day retention maintained.
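The encryption check in the list above can be partly automated. The sketch below is one possible approach rather than the verification procedure used in this engagement. It assumes resource dictionaries shaped like the output of boto3's `describe_volumes` (which carries an `Encrypted` flag) and `describe_db_instances` (which carries `StorageEncrypted`); the function names and sample IDs are hypothetical.

```python
from typing import Iterable


def unencrypted_volumes(volumes: Iterable[dict]) -> list[str]:
    """Return IDs of EBS volumes whose Encrypted flag is missing or False."""
    return [v["VolumeId"] for v in volumes if not v.get("Encrypted", False)]


def unencrypted_db_instances(instances: Iterable[dict]) -> list[str]:
    """Return identifiers of RDS instances without storage encryption."""
    return [
        db["DBInstanceIdentifier"]
        for db in instances
        if not db.get("StorageEncrypted", False)
    ]


# Illustrative data only.
volumes = [
    {"VolumeId": "vol-a", "Encrypted": True},
    {"VolumeId": "vol-b", "Encrypted": False},
]
databases = [{"DBInstanceIdentifier": "primary", "StorageEncrypted": True}]

print(unencrypted_volumes(volumes))          # ['vol-b']
print(unencrypted_db_instances(databases))   # []
```

An empty result from both functions after each change is a quick signal that encryption at rest was preserved, though it does not replace a full compliance review.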
The zombie resource cleanup also improved the compliance posture: removing idle resources reduced the attack surface, and the corrected S3 routing keeps S3 traffic on the private VPC endpoint, closing off one path by which data could leave the VPC.
8. Lessons for Healthcare and Fintech Teams
Four patterns appear repeatedly in regulated-industry AWS environments:
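- Reserved Instance gaps are the single largest cost lever. A 1-year, no-upfront Reserved Instance has no upfront cost to recover, so the discount applies from the first bill; at the 42% discount achieved here, the full-year commitment costs about as much as seven months of On-Demand usage.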
- Data transfer costs are invisible until they’re not. The S3 VPC endpoint fix took 20 minutes and saved over $600/month.
- Auto-scaling is not optional at scale. Fixed-capacity fleets leave 30–60% of compute spend on the table every night.
- Compliance and cost optimisation are not in conflict. In this engagement, the optimisation improved the compliance posture.
Is Your AWS Bill Telling You the Same Story?
If your cloud bill is growing faster than your user base — or you simply can’t explain what each line item is paying for — it’s fixable, and faster than you expect.
🔍 Your free Skynats AWS audit includes:
- Full breakdown of current AWS spend by service and resource
- Top 5 cost reduction opportunities, prioritised by impact
- Compliance risk assessment (HIPAA, PCI-DSS, ISO 27001)
- Implementation timeline and expected savings estimate
- No obligation. No access to billing or application code required.