200+
Sites Supported
Global enterprise operations across Americas, EMEA & APAC
99.99%
Uptime Mindset
Reliability treated as a business promise, not an IT metric
$25.5M
Budget Stewardship
Forecasting, chargeback alignment, and run vs. change tradeoffs
Reliability & Resilience
Operational stability
- Standardization and disciplined change control protecting revenue and operations across a 200+ site global footprint.
- Incident response improvements that reduced alert noise and made recovery more consistent: calmer on-call, faster resolution.
- Runbooks, escalation clarity, and repeatable playbooks that removed ambiguity during high-pressure situations.
- Maintained 99.99% uptime target through structured discipline, not heroics.
- Participated in enterprise PCI DSS and ISO audits: delivered clear evidence packages and drove remediation on schedule.
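For a sense of what the 99.99% target above leaves room for, here is a small arithmetic sketch. The function name and the 365-day and 30-day periods are illustrative assumptions, not a description of how uptime was actually measured or reported.

```python
# Downtime permitted by an availability target over a given period.
# Illustrative only: the periods and the function name are assumptions.
MINUTES_PER_DAY = 24 * 60


def downtime_budget_minutes(availability_pct: float, days: float = 365) -> float:
    """Return the minutes of downtime allowed at the given availability."""
    if not 0 <= availability_pct <= 100:
        raise ValueError("availability_pct must be between 0 and 100")
    return (1 - availability_pct / 100) * days * MINUTES_PER_DAY


if __name__ == "__main__":
    for label, days in (("365-day year", 365), ("30-day month", 30)):
        minutes = downtime_budget_minutes(99.99, days)
        print(f"99.99% over a {label}: {minutes:.1f} minutes of downtime")
```

At 99.99%, the whole year's budget is under an hour, which is why disciplined change control matters more than fast firefighting.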
Automation & Maturity
Operational efficiency
- Ansible and Python automation reduced MTTR by 25% and eliminated configuration drift across the global network fleet.
- Better asset visibility and faster triage through repeatable discovery and alerting workflows.
- Reducing toil let engineers focus on higher-impact work, which improved retention and morale.
- Managed and mentored 55+ engineers and regional leaders across multiple time zones and cultures.
- Built a team structure that scaled operations without scaling headcount linearly.
Executive Framing
How I talk about ROI
Most infrastructure investment "looks expensive" until you put it next to the cost of downtime, rework, and burnout.
My approach is to quantify tradeoffs in plain language: business impact, risk, timeline, and total cost of ownership.
Infrastructure decisions are business decisions — and I present them that way.
Example executive translation:
"This upgrade isn't about new hardware. It's about reducing the probability of an outage that would cost X in lost productivity and customer impact,
and it removes a failure mode that's been growing for Y months. Here are the options, the costs, and the risk we carry if we defer."
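The comparison behind a statement like that can be sketched as probability-weighted cost per option. Everything in this example is hypothetical: the `Option` class, the dollar amounts, and the outage probabilities are placeholders chosen for illustration, and a real analysis would also cover multi-year total cost of ownership.

```python
from dataclasses import dataclass


@dataclass
class Option:
    """One choice put in front of stakeholders (all values are hypothetical)."""
    name: str
    upfront_cost: float        # one-time spend in dollars
    annual_outage_prob: float  # estimated chance of a major outage this year
    outage_impact: float       # estimated cost in dollars if that outage occurs

    def first_year_expected_cost(self) -> float:
        """Upfront spend plus the probability-weighted outage cost."""
        return self.upfront_cost + self.annual_outage_prob * self.outage_impact


def rank_options(options: list[Option]) -> list[Option]:
    """Order options from lowest to highest first-year expected cost."""
    return sorted(options, key=lambda o: o.first_year_expected_cost())


# Placeholder figures, not real data.
options = [
    Option("Defer one year", upfront_cost=0,
           annual_outage_prob=0.25, outage_impact=2_000_000),
    Option("Upgrade now", upfront_cost=400_000,
           annual_outage_prob=0.02, outage_impact=2_000_000),
]

if __name__ == "__main__":
    for option in rank_options(options):
        print(f"{option.name}: ${option.first_year_expected_cost():,.0f}")
```

With these placeholder numbers, deferring looks free up front but carries the larger expected cost, which is the point the executive translation makes in plain language.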
What I quantify
- Cost of delay — lost productivity, slower change velocity, hidden "tax" on every project.
- Risk exposure: outage probability, security gaps, and vendor end-of-life (EOL) windows.
- People impact — on-call load, burnout, and the attrition cost that follows.
What good looks like
- Fewer recurring incidents with clearer ownership and faster recovery.
- Better forecasting and fewer surprise costs at year-end.
- Teams that stay, grow, and ship improvements consistently.
- Stakeholders who trust the plan because it's clear and defensible.