200+
Sites Supported
Global enterprise operations across Americas, EMEA & APAC
99.99%
Uptime Mindset
Reliability treated as a business promise, not an IT metric
$25.5M
Budget Stewardship
Forecasting, chargeback alignment, and run vs. change tradeoffs
Reliability & Resilience

Operational stability

  • Standardization and disciplined change control protecting revenue and operations across a 200+ site global footprint.
  • Incident response improvements that cut alert noise and made recovery more consistent: calmer on-call, faster resolution.
  • Runbooks, escalation clarity, and repeatable playbooks that removed ambiguity during high-pressure situations.
  • Maintained a 99.99% uptime target through structured discipline, not heroics.
  • Participated in enterprise PCI DSS and ISO audits: delivered clear evidence packages and kept remediation on schedule.
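
To put the 99.99% target in concrete terms, the sketch below converts an availability percentage into an allowed-downtime budget. It is plain arithmetic, not a tool from any particular environment. The function name `downtime_budget_minutes` and the 365-day year and 30-day month are assumptions made for illustration.

```python
# Minimal sketch: translate an availability target into a downtime budget.
# Assumes a 365-day year and a 30-day month (illustrative simplifications).

MINUTES_PER_YEAR = 365 * 24 * 60      # 525,600 minutes
MINUTES_PER_30_DAYS = 30 * 24 * 60    # 43,200 minutes


def downtime_budget_minutes(availability_pct: float,
                            period_minutes: float = MINUTES_PER_YEAR) -> float:
    """Return the minutes of downtime allowed in a period at a given availability."""
    return period_minutes * (1 - availability_pct / 100)


annual = downtime_budget_minutes(99.99)
monthly = downtime_budget_minutes(99.99, MINUTES_PER_30_DAYS)

print(f"{annual:.1f} min/year, {monthly:.2f} min per 30 days")
# prints: 52.6 min/year, 4.32 min per 30 days
```

At four nines, a single hour-long incident uses more than the entire annual budget, which is why the target depends on disciplined change control rather than fast firefighting.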
Automation & Maturity

Operational efficiency

  • Ansible and Python automation reduced MTTR by 25% and eliminated configuration drift across the global network fleet.
  • Better asset visibility and faster triage through repeatable discovery and alerting workflows.
  • Reduced toil allowed engineers to focus on higher-value, higher-impact work — improving retention and morale.
  • Managed and mentored 55+ engineers and regional leaders across multiple time zones and cultures.
  • Built a team structure that scaled operations without scaling headcount linearly.
Executive Framing

How I talk about ROI

Most infrastructure investment "looks expensive" until you put it next to the cost of downtime, rework, and burnout. My approach is to quantify tradeoffs in plain language: business impact, risk, timeline, and total cost of ownership. Infrastructure decisions are business decisions — and I present them that way.

Example executive translation:
"This upgrade isn't about new hardware. It's about reducing the probability of an outage that would cost X in lost productivity and customer impact, and it removes a failure mode that's been growing for Y months. Here are the options, the costs, and the risk we carry if we defer."

What I quantify

  • Cost of delay: lost productivity, slower change velocity, and the hidden "tax" on every project.
  • Risk exposure: outage probability, security gaps, and vendor end-of-life (EOL) windows.
  • People impact: on-call load, burnout, and the attrition cost that follows.
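
One way to sketch the "upgrade now versus defer" comparison is a simple expected-cost model: upfront spend plus outage probability times outage cost, summed over a planning horizon. Every figure below is a hypothetical placeholder, not data from any real environment, and the `Option` class is an illustrative assumption. The model deliberately ignores discounting and treats the annual outage probability as the expected number of outages per year.

```python
# Minimal sketch of an expected-cost comparison between two options.
# All dollar amounts and probabilities are hypothetical placeholders.
from dataclasses import dataclass


@dataclass
class Option:
    name: str
    upfront_cost: float               # one-time spend to take this option
    annual_outage_probability: float  # rough chance of a major outage per year
    outage_cost: float                # the "X": lost productivity plus customer impact

    def expected_annual_loss(self) -> float:
        # Expected loss = probability of the event times its cost.
        return self.annual_outage_probability * self.outage_cost

    def total_cost(self, years: int) -> float:
        # Upfront spend plus the expected loss carried over the horizon.
        return self.upfront_cost + years * self.expected_annual_loss()


defer = Option("defer", upfront_cost=0.0,
               annual_outage_probability=0.20, outage_cost=500_000.0)
upgrade = Option("upgrade", upfront_cost=150_000.0,
                 annual_outage_probability=0.05, outage_cost=500_000.0)

for opt in (defer, upgrade):
    print(f"{opt.name}: expected 3-year cost = {opt.total_cost(3):,.0f}")
# prints:
# defer: expected 3-year cost = 300,000
# upgrade: expected 3-year cost = 225,000
```

With these placeholder inputs, deferring looks free on the budget line but carries the higher expected cost over three years. In practice the inputs are estimates, so a low and high range per option is usually more defensible than a single number.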

What good looks like

  • Fewer recurring incidents with clearer ownership and faster recovery.
  • Better forecasting and fewer surprise costs at year-end.
  • Teams that stay, grow, and ship improvements consistently.
  • Stakeholders who trust the plan because it's clear and defensible.