Kenny Sheridan Resume

Experience

Member of Technical Staff - Infrastructure Product | Andromeda

Greater Seattle Area, Remote | March 2026 - Present | Website: andromeda.ai

Shape AI infrastructure product direction across robotics, edge, Kubernetes, observability, platform delivery, and customer-facing workflows.
Convert ambiguous infrastructure requirements into shipped systems, acceptance criteria, operator documentation, and leadership-ready decisions.
Drive Kubernetes, Slurm, SUNK, and bare-metal AI/HPC platform work beyond observability, including deployment workflows, control-plane integration, cluster operations, developer experience, and productized infrastructure paths.
Create rich Kubernetes and infrastructure product documentation for customer-facing workflows, designed for online hosting, mobile viewing, and clear operational adoption.
Build repeatable validation environments with reproducible packaging, Kubernetes, and QEMU/KVM.
Guide performance engineering, system modeling, cache strategy, storage evaluation, offload analysis, and workload placement so products reach market quickly without losing operational reliability.

Senior Supercomputing Infrastructure Engineer | San Francisco Compute Company

Greater Seattle Area, Remote | September 2024 - March 2026 | Website: sfcompute.com

Led automated bring-up for 2,000 NVIDIA H100 GPUs, moving bare metal into operational Kubernetes clusters through a single-command Rust-based deployment workflow.
Scaled onboarding from 8 nodes to hundreds of GPU nodes within weeks, reducing product iteration time by eliminating manual provisioning across hardware, networking, and company infrastructure integration.
Deployed distributed supercomputing infrastructure globally for GPU marketplace capacity, with emphasis on scalable utilization, reliability, and operational repeatability.
Built and open-sourced Rust tooling for serialized infrastructure inventory plus object-storage and network throughput profiling across bare-metal AI/HPC environments.
Designed a private Linux-side hardware discovery and lifecycle agent for bare-metal fleet introspection, host validation, and infrastructure control-plane integration.
Optimized compute, SDN, network fabric, high-performance storage, performance testing, custom Kubernetes controllers/operators, resource management, near-metal validation, and repeatable systems setup.

Senior AI and HPC Infrastructure Engineer | TensorWave

Greater Seattle Area, Remote | May 2024 - July 2024 | Website: tensorwave.com

Architected bare-metal AI/HPC infrastructure for on-premises AMD MI300X GPU clusters using EPYC CPUs, Slurm/SUNK operating patterns, and RDMA over Converged Ethernet (RoCE) on traditional TCP/IP networks.
Benchmarked NVIDIA InfiniBand and AMD RoCE designs, including high-bandwidth all_reduce testing over 800G switching infrastructure.
Designed vendor-agnostic AI/ML infrastructure patterns capable of scaling toward hundreds of nodes while reducing accelerator lock-in.
Created deployment documentation for bare-metal GPU cluster setup, Slurm-oriented operating paths, configuration, and operational handoff.

Senior Hardware Infrastructure Automation Engineer | ServiceNow

Kirkland, WA | February 2022 - December 2023 | Website: servicenow.com

Engineered HPC system testing software for distributed enterprise infrastructure, including stress validation, benchmarking, and reliability assessment workflows.
Validated infrastructure hardware for IL5, FedRAMP, and FedRAMP High environments, including Thales SafeNet security devices.
Led migration of internal automation from Python and Bash to Go, improving efficiency across heterogeneous hardware environments.
Built Redfish-based SKU auditing, NIC benchmarking, and GitLab CI/CD workflows for hardware-software validation.

Senior Cloud Hardware Performance Test Engineer | ServiceNow

Kirkland, WA | May 2017 - February 2022 | Website: servicenow.com

Led hardware performance testing across storage, networking, BIOS, firmware, PCIe, FPGAs, SmartNICs, Smart Storage cards, NVMe, Linux filesystems, Weka, VAST, and Ceph.
Worked with ODMs, system engineers, executive stakeholders, and product teams to refine infrastructure roadmaps and train engineers on repeatable test methods.

System Administrator | NexLevel Information Technology

Sacramento, CA | August 2015 - May 2017 | Website: nexlevelit.com

Provided Tier 3 Unix and Windows server support for biometric systems serving 300+ remote clients, including storage recovery, monitoring scripts, and production baseline improvements.

Technical Instructor of Meteorology | U.S. Marine Corps

Quantico, VA | May 2007 - June 2015 | Website: marines.mil

Administered two modular data centers, maintained METMF(R) computing infrastructure, virtualized instructional environments, and managed WAN-connected remote sensing sites.
Produced 300+ surface observations, 100+ forecasts, and 50+ weather warnings cited in Navy and Marine Corps Achievement Medal recognition.

Selected Engineering Work

Repeatable AI infrastructure environments

Nix | QEMU/KVM | Secure packaging | Model-serving cache paths

Set up deterministic infrastructure environments that use Nix for secure packages, QEMU/KVM for sandboxed validation, and caching strategies to serve AI models quickly.

AI-first infrastructure product delivery

Kubernetes | Product engineering | Observability | GPU platforms

Delivered customer-facing and internal infrastructure products at 9,000+ GPU scale, balancing fast iteration, operational adoption, Kubernetes platform work, and production reliability.

Automated GPU bring-up and onboarding

Rust | Kubernetes | Bare metal | NVIDIA H100

Single-command workflow that deploys hardware, joins company infrastructure, configures networking, and removes manual intervention from large-scale GPU node onboarding.

Infrastructure inventory and throughput profiling

Rust | Hardware inventory | Object storage | Network profiling

Built public tooling for serialized hardware reports and portable object-storage and network throughput profiling, supporting faster validation across bare-metal AI/HPC fleets.

Bare-metal lifecycle agent

Rust | Linux agents | Hardware discovery | Control-plane integration

Designed a private host-side agent for hardware discovery, lifecycle state, fleet validation, and integration with infrastructure control planes.

Multi-node GPU cluster networking

AMD MI300X | NVIDIA | RoCE | InfiniBand | 800G fabrics

Designed vendor-agnostic topologies across AMD and NVIDIA accelerators, including RoCE on TCP/IP networks and InfiniBand benchmarking for large-scale AI/HPC clusters.

Skills

Code: Linux, Rust, Go, Bash, NixOS

Build: Nix packaging, QEMU/KVM

Control: Kubernetes operators, GitOps, Argo CD

Agentic: Workflows, harnesses, MCP

Data: Raft reconciliation, embedded databases

Hardware: Bare metal, Redfish, H100, H200, B200, B300, MI300X

GPU: CUDA, ROCm, NCCL

Fabric: RoCE, InfiniBand, NVMe-oF, SONiC, SDN

Overlay: Tailscale, Netmaker

Offload: SmartNICs, Smart Storage cards

Storage: Weka, VAST, Ceph

Telemetry: OpenTelemetry, Prometheus, Grafana

Perf: FIO, iperf3, stress-ng, MPI, PyTorch

Strengths

Defense AI infrastructure: Build offline AI and HPC platforms for regulated, contested, edge, simulation, and datacenter workloads.
Reproducible systems: Make infrastructure repeatable, auditable, and safe to operate across environments.
GPU and edge orchestration: Automate accelerator fleets across providers, clusters, and constrained networks.
Fabric and storage paths: Design data movement, storage access, offload paths, and telemetry for reliable workloads.
Validation-first delivery: Benchmark, stress, profile, model, and place workloads with operational evidence.

Open Source

Forgejo portfolio: Self-hosted forge for systems architecture and private/public engineering work.
GitHub profile: Public Rust, Nix, GPU, MCP, and developer tooling repositories.
Neovim tooling: Lua-based personal and professional editor configuration used since 2015.

Education

Air University / Community College of the Air Force | Keesler AFB | 62 credits in meteorology, forecasting, and atmospheric physics

Certifications

Introduction to Blockchain | Amazon Web Services, 2021
Basic Instructor Course | Community College of the Air Force, 2012
Meteorology and Oceanography Analyst Forecaster | Community College of the Air Force, 2008

Recognition

Navy and Marine Corps Achievement Medal
Good Conduct Medal
Letter of Appreciation, Director of Meteorology
