Multi-Cloud Cost Optimization: FinOps Strategies for AWS, Azure & IBM Cloud

Posted on July 15, 2025 • 5 minutes • 1056 words

Introduction

Cloud costs can spiral quickly. At IBM, we managed cloud spending across AWS, Azure, and IBM Cloud for 200+ teams. Without proper FinOps governance, organizations can waste 30-40% of their cloud spend on unused resources.

This article shares proven strategies for multi-cloud cost optimization.

FinOps Framework

The Three Pillars

Inform: Visibility into cloud spending
Optimize: Reduce waste without sacrificing performance
Operate: Continuous cost management

Implementing Cost Visibility

1. Tagging Strategy

# Standard tags across all clouds
common_tags:
  Environment:
    - dev
    - staging
    - production
  CostCenter:
    - engineering
    - marketing
    - finance
  Owner:
    - team-name
    - email
  Application:
    - api-server
    - web-frontend
    - data-pipeline
  Project:
    - project-name
  CreatedDate:
    - ISO 8601 format
  ManagedBy:
    - terraform
    - manual
    - kubernetes-operator

# AWS Implementation
resource "aws_instance" "web_server" {
  # ami and other required arguments omitted for brevity
  instance_type = "t3.medium"
  
  tags = {
    Name           = "web-server-prod"
    Environment    = "production"
    CostCenter     = "engineering"
    Owner          = "platform-team"
    Application    = "api-server"
    Project        = "customer-portal"
    CreatedDate    = "2025-07-15"
    ManagedBy      = "terraform"
  }
}

# Azure Implementation
resource "azurerm_virtual_machine" "web_server" {
  name = "web-server-prod"
  
  tags = {
    Environment    = "production"
    CostCenter     = "engineering"
    Owner          = "platform-team"
    Application    = "api-server"
  }
}

# IBM Cloud Implementation
resource "ibm_compute_vm_instance" "web_server" {
  hostname = "web-server-prod"
  domain   = "example.com"
  
  tags = ["environment:production", "cost-center:engineering"]
}
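
Because IBM Cloud tags are flat "key:value" strings while AWS and Azure use key/value maps, a compliance check has to normalize them first. The following is a minimal sketch of such a check, not a definitive implementation: the names `REQUIRED_TAGS`, `IBM_KEY_MAP`, `normalize_tags`, and `missing_tags` are hypothetical, and the IBM key spellings are assumed from the example above.

```python
# Standard tag keys taken from the common_tags schema above.
REQUIRED_TAGS = {
    "Environment", "CostCenter", "Owner", "Application",
    "Project", "CreatedDate", "ManagedBy",
}

# Assumed mapping from IBM Cloud style lowercase keys to the standard names.
IBM_KEY_MAP = {
    "environment": "Environment",
    "cost-center": "CostCenter",
    "owner": "Owner",
    "application": "Application",
    "project": "Project",
    "created-date": "CreatedDate",
    "managed-by": "ManagedBy",
}


def normalize_tags(tags):
    """Return tags as a dict.

    Accepts a dict (AWS/Azure style) or a list of "key:value" strings
    (IBM Cloud style).
    """
    if isinstance(tags, dict):
        return dict(tags)
    normalized = {}
    for item in tags:
        key, sep, value = item.partition(":")
        if not sep:
            continue  # Skip tags that are not in key:value form
        key = key.strip()
        normalized[IBM_KEY_MAP.get(key.lower(), key)] = value.strip()
    return normalized


def missing_tags(tags):
    """List the required tag keys that a resource does not carry."""
    return sorted(REQUIRED_TAGS - set(normalize_tags(tags)))
```

Running `missing_tags` on the IBM Cloud example above would report every standard key except Environment and CostCenter, which makes gaps between providers visible early.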

2. Cost Tracking Dashboard

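# Multi-cloud cost aggregation
import os

import boto3
import requests
from datetime import datetime, timedelta

# Subscription to query (the environment variable name is an assumption)
AZURE_SUBSCRIPTION_ID = os.environ["AZURE_SUBSCRIPTION_ID"]

class MultiCloudCostTracker:
    def __init__(self):
        self.aws_client = boto3.client('ce')
        # get_azure_token() and get_ibm_client() are credential helpers
        # not shown here; implement them for your own auth setup
        self.azure_token = self.get_azure_token()
        self.ibm_client = self.get_ibm_client()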
    
    def get_aws_costs(self, days_back=30):
        """Fetch AWS costs from Cost Explorer"""
        response = self.aws_client.get_cost_and_usage(
            TimePeriod={
                'Start': (datetime.now() - timedelta(days=days_back)).strftime('%Y-%m-%d'),
                'End': datetime.now().strftime('%Y-%m-%d')
            },
            Granularity='DAILY',
            Metrics=['UnblendedCost'],
            GroupBy=[
                {'Type': 'DIMENSION', 'Key': 'SERVICE'},
                {'Type': 'TAG', 'Key': 'CostCenter'}
            ]
        )
        
        costs = {}
        for result in response['ResultsByTime']:
            for group in result['Groups']:
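                service = group['Keys'][0]
                # Tag group keys come back as "CostCenter$<value>"; keep only the value
                cost_center = group['Keys'][1].split('$', 1)[-1] or 'untagged'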
                amount = float(group['Metrics']['UnblendedCost']['Amount'])
                
                key = f"{service}:{cost_center}"
                costs[key] = costs.get(key, 0) + amount
        
        return {'aws': costs, 'total': sum(costs.values())}
    
    def get_azure_costs(self, days_back=30):
        """Fetch Azure costs from Cost Management API"""
        headers = {'Authorization': f'Bearer {self.azure_token}'}
        
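        # The api-version query parameter is required; adjust it to a
        # version your subscription supports
        url = (
            'https://management.azure.com/subscriptions/'
            '{subscription}/providers/Microsoft.CostManagement/query'
            '?api-version=2023-03-01'
        ).format(subscription=AZURE_SUBSCRIPTION_ID)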
        
        payload = {
            "type": "Usage",
            "timeframe": "Custom",
            "timePeriod": {
                "from": (datetime.now() - timedelta(days=days_back)).isoformat(),
                "to": datetime.now().isoformat()
            },
            "dataset": {
                "granularity": "Daily",
                "aggregation": {
                    "totalCost": {"name": "PreTaxCost", "function": "Sum"}
                },
                "grouping": [
                    {"type": "Dimension", "name": "ServiceName"},
                    {"type": "TagKey", "name": "CostCenter"}
                ]
            }
        }
        
        response = requests.post(url, headers=headers, json=payload, timeout=30)
        response.raise_for_status()  # Fail loudly on auth or quota errors
        data = response.json()
        
        costs = {}
        for row in data['properties']['rows']:
            service = row[2]
            cost_center = row[3]
            amount = float(row[0])
            
            key = f"{service}:{cost_center}"
            costs[key] = costs.get(key, 0) + amount
        
        return {'azure': costs, 'total': sum(costs.values())}
    
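    def get_ibm_cloud_costs(self, days_back=30):
        """Fetch IBM Cloud costs"""
        # Implementation using IBM Cloud Cost Management API
        # Similar pattern to Azure
        # Placeholder: return the same shape as the other providers so that
        # aggregate_costs() keeps working until this is implemented
        return {'ibm': {}, 'total': 0.0}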
    
    def aggregate_costs(self):
        """Aggregate costs across all clouds"""
        aws_data = self.get_aws_costs()
        azure_data = self.get_azure_costs()
        ibm_data = self.get_ibm_cloud_costs()
        
        total = aws_data['total'] + azure_data['total'] + ibm_data['total']
        
        return {
            'aws': aws_data,
            'azure': azure_data,
            'ibm': ibm_data,
            'total': total,
            'by_cloud': {
                'AWS': aws_data['total'],
                'Azure': azure_data['total'],
                'IBM Cloud': ibm_data['total']
            }
        }
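
The per-cloud dictionaries above are keyed by "service:cost_center", so a chargeback view needs one more rollup step. Here is a minimal sketch of that step, assuming the key format shown in the tracker; the function name `rollup_by_cost_center` and the "untagged" bucket are hypothetical.

```python
from collections import defaultdict


def rollup_by_cost_center(per_cloud_costs):
    """Sum costs per cost center across clouds.

    per_cloud_costs maps a cloud name to a dict of
    {"service:cost_center": amount}, the shape built by the tracker above.
    """
    totals = defaultdict(float)
    for costs in per_cloud_costs.values():
        for key, amount in costs.items():
            # The cost center is whatever follows the last colon in the key
            _service, _sep, cost_center = key.rpartition(":")
            totals[cost_center or "untagged"] += amount
    return dict(totals)
```

A call such as `rollup_by_cost_center({"aws": aws_data["aws"], "azure": azure_data["azure"]})` would then give one number per cost center, with spend that carries no tag value collected under "untagged".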

Cost Optimization Strategies

1. Reserved Instances & Savings Plans

# AWS Reserved Instance recommendation
resource "aws_instance" "web" {
  instance_type = "t3.xlarge"
  
  # On-demand: $0.1664/hour ≈ $1,458/year (≈ $121/month)
  # 1-year reserved: $0.0997/hour ≈ $873/year (≈ $73/month)
  # Savings: 40% (≈ $584/year)
}
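
The arithmetic behind a commitment decision is easy to get wrong when mixing hourly, monthly, and yearly figures. Below is a small sketch that converts the hourly rates quoted above into a monthly comparison and a break-even utilization. The function names are hypothetical, and a month is assumed to be 8760 / 12 = 730 hours.

```python
HOURS_PER_YEAR = 8760
HOURS_PER_MONTH = HOURS_PER_YEAR / 12  # 730 hours


def commitment_savings(on_demand_hourly, committed_hourly, hours=HOURS_PER_MONTH):
    """Compare on-demand and committed cost over a number of hours."""
    on_demand = on_demand_hourly * hours
    committed = committed_hourly * hours
    saved = on_demand - committed
    return {
        "on_demand": round(on_demand, 2),
        "committed": round(committed, 2),
        "saved": round(saved, 2),
        "saved_pct": round(100 * saved / on_demand, 1),
    }


def break_even_utilization(on_demand_hourly, committed_hourly):
    """Fraction of hours an instance must run for the commitment to pay off.

    A reservation is billed whether or not the instance runs, so below this
    utilization plain on-demand would have been cheaper.
    """
    return committed_hourly / on_demand_hourly


monthly = commitment_savings(0.1664, 0.0997)
yearly = commitment_savings(0.1664, 0.0997, hours=HOURS_PER_YEAR)
```

The break-even figure is the more useful of the two for governance: workloads that run well below it are better candidates for scheduling or rightsizing than for a commitment.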

# Mix on-demand and Spot capacity with an EC2 Fleet
resource "aws_ec2_fleet" "web_fleet" {
  launch_template_config {
    launch_template_specification {
      launch_template_id = aws_launch_template.web.id
      version            = "$Latest"
    }
    
    override {
      instance_type             = "t3.xlarge"
      weighted_capacity         = 1
      availability_zone         = "us-east-1a"
      max_price                 = "0.05"  # Spot price cap, ~70% below on-demand
    }
  }
  
  target_capacity_specification {
    default_target_capacity_type = "spot"  # Required argument
    total_target_capacity        = 10
    on_demand_target_capacity    = 3  # 30% on-demand for stability
    spot_target_capacity         = 7  # 70% spot for cost savings
  }
}
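
To see what a 3/7 on-demand to Spot split is worth, the blended hourly rate can be compared with an all on-demand fleet. This is a rough sketch using the rates from the examples above; the function names are hypothetical, and the Spot figure is the price cap, so real Spot charges could be lower and will fluctuate.

```python
def blended_hourly_cost(on_demand_count, spot_count, on_demand_price, spot_price):
    """Hourly cost of a fleet that mixes on-demand and Spot instances."""
    return on_demand_count * on_demand_price + spot_count * spot_price


def fleet_savings_fraction(on_demand_count, spot_count, on_demand_price, spot_price):
    """Savings relative to running the same instance count fully on-demand."""
    total = on_demand_count + spot_count
    baseline = total * on_demand_price
    mixed = blended_hourly_cost(on_demand_count, spot_count,
                                on_demand_price, spot_price)
    return 1 - mixed / baseline


# 3 on-demand + 7 Spot t3.xlarge, Spot priced at the $0.05 cap
savings = fleet_savings_fraction(3, 7, on_demand_price=0.1664, spot_price=0.05)
```

Because the on-demand share is billed at full price, the fleet-wide saving is noticeably smaller than the per-instance Spot discount.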

2. Downsize Underutilized Resources

# Identify underutilized instances
class UnderutilizationAnalyzer:
    def get_all_instances(self):
        """Return dicts with 'id', 'instance_type', and 'instance_type_cost'."""
        raise NotImplementedError("Query your cloud inventory here")

    def get_metrics(self, instance_id):
        """Return average CPU and memory utilization for one instance."""
        raise NotImplementedError("Query your monitoring system here")

    def find_underutilized_instances(self, threshold_cpu=20, threshold_memory=30):
        """Find instances below utilization threshold"""
        underutilized = []
        
        instances = self.get_all_instances()
        
        for instance in instances:
            metrics = self.get_metrics(instance['id'])
            
            avg_cpu = metrics['cpu_utilization']['average']
            avg_memory = metrics['memory_utilization']['average']
            
            # Flag only when both CPU and memory are below their thresholds
            if avg_cpu < threshold_cpu and avg_memory < threshold_memory:
                monthly_cost = instance['instance_type_cost']
                
                underutilized.append({
                    'instance_id': instance['id'],
                    'instance_type': instance['instance_type'],
                    'cpu_usage': avg_cpu,
                    'memory_usage': avg_memory,
                    'monthly_cost': monthly_cost,
                    'potential_savings': monthly_cost * 0.6,  # Assumes downsizing cuts cost by 60%
                    'recommendation': 'downsize_or_terminate'
                })
        
        return underutilized

# Monthly report
analyzer = UnderutilizationAnalyzer()
underutilized = analyzer.find_underutilized_instances()
total_potential_savings = sum(i['potential_savings'] for i in underutilized)
print(f"Total potential savings: ${total_potential_savings:,.2f}/month")
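
Average utilization alone can hide short bursts, so an instance that idles most of the day but peaks during batch jobs may look like a downsizing candidate when it is not. One possible guard, sketched below, also checks the 95th percentile of the CPU samples. The function name and the 60% percentile threshold are assumptions for illustration, not values from the analysis above.

```python
import statistics


def is_safe_to_downsize(cpu_samples, avg_threshold=20.0, p95_threshold=60.0):
    """Return True only if both the mean and the 95th percentile are low.

    cpu_samples is a list of CPU utilization percentages, for example
    5-minute datapoints over the last 30 days.
    """
    if len(cpu_samples) < 2:
        return False  # Not enough data to judge
    average = statistics.fmean(cpu_samples)
    # quantiles(n=100) returns 99 cut points; index 94 is the 95th percentile
    p95 = statistics.quantiles(cpu_samples, n=100)[94]
    return average < avg_threshold and p95 < p95_threshold


steady = [10.0] * 100                # Consistently idle
bursty = [5.0] * 90 + [95.0] * 10    # Low average, heavy peaks
```

Here `steady` passes the check, while `bursty` is rejected even though its average is well under the 20% CPU threshold.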

3. Scheduled Scaling

# AWS Auto Scaling Schedules
resource "aws_autoscaling_group" "web_asg" {
  name = "web-asg"
  
  # Sizing and launch settings omitted; the schedules below adjust capacity
  tag {
    key                 = "Name"
    value               = "web-server"
    propagate_at_launch = true
  }
}

# Scale up for business hours
resource "aws_autoscaling_schedule" "scale_up" {
  scheduled_action_name  = "scale_up_morning"
  min_size               = 3
  max_size               = 20
  desired_capacity       = 10
  recurrence             = "0 8 * * MON-FRI"  # 8 AM weekdays
  autoscaling_group_name = aws_autoscaling_group.web_asg.name
  time_zone              = "America/New_York"
}

# Scale down for off-hours
resource "aws_autoscaling_schedule" "scale_down" {
  scheduled_action_name  = "scale_down_evening"
  min_size               = 1
  max_size               = 5
  desired_capacity       = 2
  recurrence             = "0 18 * * MON-FRI"  # 6 PM weekdays
  autoscaling_group_name = aws_autoscaling_group.web_asg.name
  time_zone              = "America/New_York"
}

# Potential savings: 30-40% for dev/staging environments
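
How much a schedule saves depends on the capacity in each window and on how many hours each window covers. The sketch below estimates instance-hours per week for a two-window schedule like the one above, compared with holding business-hours capacity around the clock. The function name is hypothetical, and the result is an upper bound because it assumes the group sits exactly at the scheduled desired capacity and never scales out.

```python
def scheduled_savings(business_capacity, off_capacity,
                      business_hours_per_day=10, business_days=5):
    """Fraction of instance-hours saved per week by a two-window schedule."""
    week_hours = 7 * 24
    business_hours = business_hours_per_day * business_days
    off_hours = week_hours - business_hours
    scheduled = business_capacity * business_hours + off_capacity * off_hours
    always_on = business_capacity * week_hours
    return 1 - scheduled / always_on


# 8 AM to 6 PM on weekdays at 10 instances, 2 instances otherwise
estimate = scheduled_savings(business_capacity=10, off_capacity=2)
```

Realized savings are typically lower than this estimate once demand-driven scaling, longer business windows, and a smaller always-on baseline are taken into account.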

4. Storage Tiering

# Lifecycle policies for cost optimization
resource "aws_s3_bucket_lifecycle_configuration" "archive" {
  bucket = aws_s3_bucket.data.id
  
  rule {
    id     = "archive_rule"
    status = "Enabled"
    
    # Move to cheaper storage after 30 days
    transition {
      days          = 30
      storage_class = "STANDARD_IA"  # ~46% cheaper than Standard
    }
    
    # Move to Glacier after 90 days
    transition {
      days          = 90
      storage_class = "GLACIER"  # ~83% cheaper than Standard
    }
    
    # Delete after 365 days
    expiration {
      days = 365
    }
  }
}

# Cost comparison (per GB-month):
# Standard:       $0.023
# Standard-IA:    $0.0125 (46% savings)
# Glacier:        $0.004 (83% savings)
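
To put the lifecycle rule in money terms, the storage cost of one object cohort can be summed over the days it spends in each class. This is a simplified sketch using the per-GB prices listed above and a 30-day month. It ignores request, retrieval, and minimum-storage-duration charges, and the function names are hypothetical.

```python
# Storage prices per GB-month, taken from the comparison above
PRICE_PER_GB_MONTH = {
    "STANDARD": 0.023,
    "STANDARD_IA": 0.0125,
    "GLACIER": 0.004,
}


def lifecycle_storage_cost(size_gb, ia_after=30, glacier_after=90, expire_after=365):
    """Storage cost of data that follows the transition schedule until expiry."""
    days_in_class = {
        "STANDARD": ia_after,
        "STANDARD_IA": glacier_after - ia_after,
        "GLACIER": expire_after - glacier_after,
    }
    return sum(
        size_gb * PRICE_PER_GB_MONTH[storage_class] * days / 30
        for storage_class, days in days_in_class.items()
    )


def standard_only_cost(size_gb, expire_after=365):
    """Cost of keeping the same data in Standard until it expires."""
    return size_gb * PRICE_PER_GB_MONTH["STANDARD"] * expire_after / 30


tiered = lifecycle_storage_cost(1000)   # 1 TB with the lifecycle rule
flat = standard_only_cost(1000)         # 1 TB kept in Standard
```

Comparing `tiered` with `flat` shows the storage-only saving; data that is read often after the first transition can erase it through retrieval fees.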

Multi-Cloud Cost Comparison

Example: Web Application

Component	AWS	Azure	IBM Cloud
Compute (t3.xlarge × 3)	$150	$130	$120
Database (8 vCPU)	$1,000	$950	$880
Storage (1TB)	$23	$20	$18
Network	$50	$40	$35
Total	$1,223	$1,140	$1,053

Optimization Savings

Strategy	AWS	Azure	IBM Cloud
Reserved Instances	-$350 (28%)	-$342 (30%)	-$290 (27%)
Spot/Preemptible	-$200	-$180	-$150
Rightsizing	-$120	-$110	-$100
Final Cost	$553	$508	$513
Savings	55%	55%	51%
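
The final rows of the table follow from the baseline totals and the three reductions, and a few lines of code make that derivation repeatable when the inputs change. The sketch below treats the reductions as additive, as the table does; in practice reserved and Spot discounts apply to different instances, so they may not stack fully. The names are hypothetical.

```python
# Baseline totals and per-strategy reductions from the tables above (USD)
BASELINE = {"AWS": 1223, "Azure": 1140, "IBM Cloud": 1053}
REDUCTIONS = {
    "AWS": [350, 200, 120],        # Reserved, Spot/Preemptible, Rightsizing
    "Azure": [342, 180, 110],
    "IBM Cloud": [290, 150, 100],
}


def final_cost_and_savings(baseline, reductions):
    """Return (final cost, savings percent rounded to a whole number)."""
    final = baseline - sum(reductions)
    savings_pct = round(100 * (baseline - final) / baseline)
    return final, savings_pct


summary = {
    cloud: final_cost_and_savings(BASELINE[cloud], REDUCTIONS[cloud])
    for cloud in BASELINE
}
```

Keeping the line items as data and deriving the totals avoids drift between the individual reductions and the summary rows.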

Governance & Automation

# Enforce FinOps policies with OPA/Rego
package finops

deny[msg] {
    input.resource.type == "aws_instance"
    not input.resource.tags.CostCenter
    msg := "EC2 instances must have CostCenter tag"
}

deny[msg] {
    input.resource.type == "aws_db_instance"
    input.resource.publicly_accessible == true
    msg := "RDS instances must not be publicly accessible"
}

deny[msg] {
    input.resource.instance_type == "m5.4xlarge"
    input.resource.tags.Environment != "production"
    msg := "m5.4xlarge only allowed in production"
}

deny[msg] {
    input.monthly_cost > 10000
    not input.resource.tags.Owner
    msg := "Resources costing >$10k/month require owner tag"
}
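
These rules expect an input document with a `resource` object (type, tags, and a few attributes) and an optional `monthly_cost`. One way to produce such documents is to reshape the JSON output of `terraform show -json` for a saved plan. The sketch below is an illustration under assumptions: the function name and the `address` field are hypothetical, and `monthly_cost` is left out because a Terraform plan carries no pricing data, so it would have to come from a separate cost estimator.

```python
def plan_to_policy_inputs(plan):
    """Turn a Terraform plan (parsed JSON) into one policy input per resource."""
    inputs = []
    for change in plan.get("resource_changes", []):
        after = (change.get("change") or {}).get("after")
        if after is None:
            continue  # Resource is being destroyed; nothing to check
        resource = {
            "type": change["type"],
            "tags": after.get("tags") or {},  # Treat null tags as empty
        }
        # Copy only the attributes that the policies above inspect
        for attribute in ("instance_type", "publicly_accessible"):
            if attribute in after:
                resource[attribute] = after[attribute]
        inputs.append({"address": change.get("address"), "resource": resource})
    return inputs


example_plan = {
    "resource_changes": [
        {
            "address": "aws_instance.web_server",
            "type": "aws_instance",
            "change": {"after": {"instance_type": "t3.medium", "tags": None}},
        },
        {
            "address": "aws_instance.old",
            "type": "aws_instance",
            "change": {"after": None},
        },
    ]
}
policy_inputs = plan_to_policy_inputs(example_plan)
```

Each resulting document can then be evaluated against the `finops` package in a CI step before `terraform apply`, with the `type` field carrying Terraform's own resource type names.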

Conclusion

FinOps is not just about cutting costs; it’s about aligning engineering, finance, and operations. By implementing governance, automation, and continuous optimization, you unlock significant savings while improving system efficiency.

What’s your biggest cloud cost challenge? Share your FinOps strategies in the comments!