5 Ways to Optimize Snowflake Warehouse Performance
Your Snowflake warehouse costs are probably 40% higher than they need to be.
The good news?
We’ve got five proven tactics to slash those expenses while maintaining peak performance. Here’s how our clients cut their monthly Snowflake bills by an average of 37.6%.
In this guide, we’ll walk you through how to optimize your Snowflake warehouse performance so you can slash costs without sacrificing performance.
We’ll cover five techniques:
- How to right-size your warehouse
- How to automate warehouse management
- How to get your warehouse monitoring right
- Advanced tactics
- How to continue to improve on your warehouse optimizations
1. Right-Sizing Your Warehouse
The biggest warehouse optimization win? Getting your sizing right. Oversized warehouses burn credits unnecessarily; undersized ones create expensive spilling issues.
Identify Oversized Warehouses
The most common warehouse inefficiency is running a warehouse that’s larger than its workload needs. An oversized warehouse burns extra credits for capacity your queries never use. The telltale sign is a consistently low query load percentage.
Use this query to find those overly large warehouses:
SELECT
    warehouse_name,
    warehouse_size,
    MAX(query_load_percent) AS max_query_load,
    AVG(query_load_percent) AS avg_query_load,
    COUNT(*) AS query_count
FROM snowflake.account_usage.query_history
WHERE start_time >= DATEADD('DAY', -30, CURRENT_TIMESTAMP())
    AND query_load_percent IS NOT NULL
GROUP BY 1, 2
HAVING max_query_load < 100 OR avg_query_load < 75
ORDER BY 3, 4;
Keep an eye out for warehouses that have:
- Average query loads below 75%
- Maximum loads under 100%
A properly sized warehouse should use 80-90% of its capacity during peak usage times.
If you’re ready to downsize a warehouse, here’s what you should use:
ALTER WAREHOUSE warehouse_name SET warehouse_size = 'SMALL';
Identify Undersized Warehouses
Unfortunately, undersized warehouses can also give you cost trouble, forcing data to “spill” from memory to disk storage and seriously degrading query performance. That spillage is the key signal for finding undersized warehouses.
You’ll want to keep an eye on both remote and local disk storage.
Here’s how to check local:
SELECT
DATE_TRUNC('day', START_TIME) AS DATE,
WAREHOUSE_NAME,
SUM(BYTES_SPILLED_TO_LOCAL_STORAGE) / POWER(1024, 3) AS "Local Spillage (GB)"
FROM SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORY
WHERE
START_TIME >= CURRENT_DATE() - INTERVAL '30 DAYS'
AND BYTES_SPILLED_TO_LOCAL_STORAGE > 0
GROUP BY 1, 2
ORDER BY "Local Spillage (GB)" DESC;
Here’s how to find remote (this code will run much slower):
SELECT
DATE_TRUNC('day', START_TIME) AS DATE,
WAREHOUSE_NAME,
SUM(BYTES_SPILLED_TO_REMOTE_STORAGE) / POWER(1024, 3) AS "Remote Spillage (GB)"
FROM SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORY
WHERE
START_TIME >= CURRENT_DATE() - INTERVAL '30 DAYS'
AND BYTES_SPILLED_TO_REMOTE_STORAGE > 0
GROUP BY 1, 2
ORDER BY "Remote Spillage (GB)" DESC;
Use this code to keep an eye out for:
- Consistent remote spillage
- Local spillage exceeding 10GB daily for smaller warehouses
- Local spillage exceeding 50GB for larger warehouses
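If a warehouse shows spillage at those levels, the usual fix is to size it up one step and re-check the spillage numbers afterward:

```sql
-- Bump a spilling warehouse up one size, then re-run the spillage queries above
ALTER WAREHOUSE warehouse_name SET warehouse_size = 'LARGE';
```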
The Goldilocks Principle
So, you can’t go too big or too small with warehouses. Finding that perfect Goldilocks size is a matter of balancing performance requirements with cost efficiency.
The best way to do that?
Matching warehouse size to the bytes your queries scan:
- X-Small/Small: Queries scanning less than 1GB
- Medium/Large: Queries scanning 1-20GB
- X-Large/2X-Large: Queries scanning 20GB-1TB
- 3X-Large and above: Queries scanning more than 1TB
Remember: Doubling warehouse size means doubling your hourly cost. Make sure you can justify that budget increase with proportional performance.
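To see where your warehouses fall on this scale, you can profile how much data their queries actually scan. Here’s a sketch using the standard `bytes_scanned` column from query history, with the same 30-day window as the earlier queries:

```sql
-- Average and peak GB scanned per warehouse over the last 30 days
SELECT
    warehouse_name,
    warehouse_size,
    AVG(bytes_scanned) / POWER(1024, 3) AS avg_gb_scanned,
    MAX(bytes_scanned) / POWER(1024, 3) AS max_gb_scanned,
    COUNT(*) AS query_count
FROM snowflake.account_usage.query_history
WHERE start_time >= DATEADD('DAY', -30, CURRENT_TIMESTAMP())
    AND bytes_scanned > 0
GROUP BY 1, 2
ORDER BY avg_gb_scanned DESC;
```

Compare `avg_gb_scanned` against the tiers above: a Large warehouse averaging under 1GB per query is a strong downsizing candidate.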
2. Automated Warehouse Management
The key to Snowflake cost optimization is not how many changes you can make by hand, but how you can set the platform up to optimize itself. If you’re not ready to invest in a third-party tool, you can still get most of the benefit from Snowflake’s built-in automation settings.
Auto-Suspend Configuration
Warehouses without an auto-suspend function will keep consuming budget while idle. Always use auto-suspend to keep from burning money.
Identify warehouses missing auto-suspend with this code:
SHOW WAREHOUSES;
SELECT "name", "auto_suspend"
FROM TABLE(RESULT_SCAN(LAST_QUERY_ID()))
WHERE "auto_suspend" = 0 OR "auto_suspend" IS NULL;
Just make sure to pick the best auto-suspend timeout for each workload pattern:
- ETL/ELT warehouses: Near-immediate suspension (around 60 seconds)
- BI/Analytics warehouses: Longer timeouts (5-10 minutes) to keep caches warm
- Development/Ad-hoc warehouses: Moderate timeouts (2-5 minutes)
Use longer timeouts to keep your cache warm for frequently accessed data. And don’t worry about queries getting cut off: a warehouse only suspends once it has no queries running. Keep in mind that Snowflake bills a 60-second minimum each time a warehouse resumes, so timeouts below 60 seconds rarely save money.
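Applying those timeouts is a one-line change per warehouse. The warehouse names below are placeholders; AUTO_SUSPEND is specified in seconds:

```sql
-- ETL/ELT: suspend almost immediately after the batch finishes
ALTER WAREHOUSE etl_wh SET AUTO_SUSPEND = 60;

-- BI/Analytics: longer timeout keeps the cache warm between dashboard refreshes
ALTER WAREHOUSE bi_wh SET AUTO_SUSPEND = 600;

-- Development/ad-hoc: moderate timeout
ALTER WAREHOUSE dev_wh SET AUTO_SUSPEND = 300;
```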
Auto-Resume Best Practices
Snowflake’s startup time is under 10 seconds, so it’s well worth enabling auto-resume on all of your warehouses to provide a seamless user experience while keeping those automatic cost controls in place.
Here’s how you can check for any warehouses missing this feature:
SHOW WAREHOUSES;
SELECT "name", "auto_resume"
FROM TABLE(RESULT_SCAN(LAST_QUERY_ID()))
WHERE "auto_resume" != 'true';
You can also enable auto-resume universally using this:
ALTER WAREHOUSE warehouse_name SET AUTO_RESUME = TRUE;
3. Warehouse Monitoring and Governance
Now that you know how to cut your warehouse cost down to size, the key is making sure everything stays that way with proper monitoring and governance.
Resource Monitor Implementation
Proper resource monitoring means you never have to worry about runaway warehouse budgets.
Keep an eye on everything using this:
SHOW WAREHOUSES;
SELECT "name"
FROM TABLE(RESULT_SCAN(LAST_QUERY_ID()))
WHERE "resource_monitor" = 'null';
Or you can use a tiered monitoring system with progressive alerts at 75% and 90% usage, then a suspension at 100% that still lets in-flight queries complete:
USE ROLE ACCOUNTADMIN;
CREATE RESOURCE MONITOR monthly_warehouse_monitor
WITH CREDIT_QUOTA = 1000
TRIGGERS
ON 75 PERCENT DO NOTIFY
ON 90 PERCENT DO NOTIFY
ON 100 PERCENT DO SUSPEND;
ALTER WAREHOUSE warehouse_name SET RESOURCE_MONITOR = monthly_warehouse_monitor;
Identifying Idle and Underutilized Warehouses
Your unused warehouses? Waste. And even worse, a security risk.
Before removing one, confirm it isn’t reserved for periodic batch jobs. If it isn’t, consolidate or decommission it to save budget and shrink your security surface.
Look for any warehouses that have been idle for 30+ days with this:
WITH wh_events AS (
SELECT
WAREHOUSE_NAME,
MAX(TIMESTAMP) AS LAST_USED_DATE
FROM snowflake.account_usage.warehouse_events_history
WHERE WAREHOUSE_ID NOT IN (
SELECT DISTINCT WAREHOUSE_ID
FROM snowflake.account_usage.warehouse_events_history
WHERE EVENT_NAME = 'DROP_WAREHOUSE'
)
AND EVENT_NAME IN ('CREATE_WAREHOUSE', 'RESUME_CLUSTER', 'RESUME_WAREHOUSE')
GROUP BY WAREHOUSE_NAME
)
SELECT
WAREHOUSE_NAME,
LAST_USED_DATE,
DATEDIFF(DAY, LAST_USED_DATE, CURRENT_TIMESTAMP()) AS IDLE_DAYS
FROM wh_events
WHERE IDLE_DAYS >= 30
ORDER BY IDLE_DAYS DESC;
Performance Monitoring
Make sure your Snowflake warehouses are in tip-top shape by keeping an eye out for these performance indicators:
- Query load percentage: Should average 70-90%
- Execution time trends: Any increasing trends may indicate sizing issues
- Queued time: Higher numbers indicate insufficient concurrency
- Spillage patterns: Monitor for consistent memory overflow
If you’re not using a third-party tool that can automate optimizations for you, set up your own dashboards so you can track these metrics.
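If you’re building those dashboards yourself, a query along these lines covers the first three indicators (`execution_time` and `queued_overload_time` are reported in milliseconds in query history):

```sql
-- Per-warehouse load, execution time, and queueing over the last 7 days
SELECT
    warehouse_name,
    AVG(query_load_percent) AS avg_load_pct,
    AVG(execution_time) / 1000 AS avg_execution_sec,
    AVG(queued_overload_time) / 1000 AS avg_queued_sec,
    COUNT(*) AS query_count
FROM snowflake.account_usage.query_history
WHERE start_time >= DATEADD('DAY', -7, CURRENT_TIMESTAMP())
    AND warehouse_name IS NOT NULL
GROUP BY 1
ORDER BY avg_queued_sec DESC;
```

Run it weekly and watch for trends: climbing queued time or falling load percentage is your cue to revisit sizing.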
4. Advanced Warehouse Optimization Tactics
You’ve got all the basics, so let’s take a quick look at our favorite advanced tactics to bring down those last stubborn bills.
Workload Segregation
Use dedicated warehouses for each kind of workload. Separating workloads like this prevents them from interfering with one another.
Optimize those warehouses for the specific workload requirements:
- ETL warehouses: Large, short-running, immediate suspension
- BI warehouses: Medium-sized, longer suspension for cache warmth
- Ad-hoc warehouses: Variable sizing based on user needs
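As a sketch, that segregation might look like this (warehouse names and sizes are illustrative; tune them to your own workloads):

```sql
-- Large, aggressive suspension for short-running batch ETL
CREATE WAREHOUSE IF NOT EXISTS etl_wh
    WAREHOUSE_SIZE = 'LARGE'
    AUTO_SUSPEND = 60
    AUTO_RESUME = TRUE;

-- Medium, longer suspension so BI dashboards hit a warm cache
CREATE WAREHOUSE IF NOT EXISTS bi_wh
    WAREHOUSE_SIZE = 'MEDIUM'
    AUTO_SUSPEND = 600
    AUTO_RESUME = TRUE;

-- Small default for ad-hoc work; resize as user needs dictate
CREATE WAREHOUSE IF NOT EXISTS adhoc_wh
    WAREHOUSE_SIZE = 'SMALL'
    AUTO_SUSPEND = 300
    AUTO_RESUME = TRUE;
```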
Query Timeout Management
Stop runaway queries by using warehouse-level timeouts:
ALTER WAREHOUSE warehouse_name SET STATEMENT_TIMEOUT_IN_SECONDS = 3600;
Just make sure you don’t cut off legitimate long-running queries. Most heavy analytical queries should finish within 1-2 hours, so a timeout of 3600-7200 seconds is a reasonable ceiling.
Cache Optimization Through Warehouse Strategy
The key to maximizing cache efficiency? Grouping users accessing similar datasets to the same warehouse.
This will get you an increased cache hit rate and a reduction in data retrieval times, a win-win for both performance and cost efficiency.
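One simple lever for this grouping is each user’s default warehouse. A sketch, with hypothetical user names, routing an analytics team to a shared warehouse so their queries reuse the same warm cache:

```sql
-- Point users who query the same datasets at the same warehouse
ALTER USER jane_analyst SET DEFAULT_WAREHOUSE = bi_wh;
ALTER USER joe_analyst SET DEFAULT_WAREHOUSE = bi_wh;
```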
5. Warehouse Optimization Continuous Improvement
The good news is that you’re well-prepared for some Snowflake warehouse optimization.
The bad news?
This is a time-consuming, ongoing process. You’ll want to continue to tend to your warehouses as your workloads evolve and implement regular review cycles:
- Weekly: Monitor spillage patterns and resource monitor alerts
- Monthly: Review warehouse utilization and sizing appropriateness
- Quarterly: Comprehensive review of workload changes and sizing adjustments
Keeping up with Snowflake optimizations can feel like a full-time job – and that’s because it is.
While manual optimization delivers results, many teams find the ongoing maintenance difficult to keep up with. Automated optimization tools like Yuki handle that heavy lifting, helping clients see cost reductions in weeks, no extra work needed.
Yuki is a plug-and-play Snowflake optimization tool that turns all that work into a few button clicks. It automates these processes, right-sizing your warehouses to perfectly match your queries every time. It’s already saved previous customers 37.6% on Snowflake spend.
Ready to see how much it can help you save? Get your free demo today.