HPC Kubernetes Solutions Architect (GPU Platforms)
Location: Dallas, TX (Hybrid)
Type: Direct Hire
• Competitive base salary + performance bonus
• 100% company-paid benefits
We are seeking an HPC Kubernetes Solutions Architect to lead the design, integration, and adoption of GPU-accelerated Kubernetes platforms supporting HPC, AI/ML, simulation, and scientific workloads.
This is a highly technical, customer-facing architecture role with ownership across the full solution lifecycle, from discovery and requirements gathering through architecture design, proof-of-concept delivery, deployment, and long-term optimization. The role serves as a trusted advisor to customers while also influencing internal product and engineering direction through real-world feedback.
The ideal candidate brings deep expertise across Kubernetes, GPU orchestration, and HPC environments, along with the ability to design scalable, high-performance platforms and guide customers through complex infrastructure transformations.
• Serve as the primary architectural point of contact for customers adopting GPU-accelerated Kubernetes platforms
• Capture workload requirements, performance objectives, and scaling needs, translating them into reference architectures and solution designs
• Lead customer workshops, technical design sessions, and architecture reviews
• Architect and operate Kubernetes clusters optimized for GPU workloads using NVIDIA GPU Operator, Network Operator, DCGM, and device plugins
• Integrate Multi-Instance GPU (MIG), GPU sharing, and advanced scheduling (Volcano, Slurm integration, kube-scheduler plugins)
• Design and implement multi-tenant Kubernetes environments with strong isolation and performance guarantees
• Develop or extend custom Kubernetes operators and controllers using Go or Python
• Automate HPC infrastructure services and platform operations
• Support Infrastructure-as-Code and GitOps practices using Terraform, Helm, Kustomize, ArgoCD, and FluxCD
• Lead proof-of-concept and benchmarking initiatives to validate performance and scalability
• Utilize profiling tools and workload characterization methodologies to optimize GPU utilization and cluster performance
• Conduct performance tuning across compute, storage, and networking layers
• Define integration strategies across compute, storage, networking, and orchestration layers
• Support CNI integrations (NVIDIA CNI, Multus, Cilium), distributed storage (Lustre, GPFS, Ceph, VAST), and container runtimes
• Ensure seamless integration with HPC schedulers and enterprise systems
• Implement monitoring and telemetry solutions using Prometheus, Grafana, DCGM Exporter, and OpenTelemetry
• Provide visibility into GPU health, cluster utilization, and workload performance
• Partner with HPC, ML, DevOps, and platform teams to ensure scalability and performance in hybrid and on-prem environments
• Collaborate with product and engineering teams to influence roadmap and platform improvements
• Build relationships with ecosystem vendors including NVIDIA, networking providers, and storage partners
• Stay current on GPU roadmaps, interconnect technologies (InfiniBand, RoCE, NVLink), and Kubernetes advancements
• Provide forward-looking guidance to customers on scaling and future architecture evolution
• Represent the organization in technical workshops, design sessions, and industry events
• Extensive experience designing and operating Kubernetes platforms in HPC or GPU-intensive environments
• Deep expertise across:
• Proven ability to design scalable, secure, and resilient Kubernetes-based architectures
• Proficiency in Go or Python for operator development and automation
• Experience with workload profiling, benchmarking, and performance tuning
• Strong customer-facing skills with the ability to translate requirements into actionable architectures
• Experience collaborating across engineering, product, and operations teams
• Experience delivering end-to-end HPC or AI/ML solutions from design through deployment and optimization
• Familiarity with containerized HPC environments (e.g., Singularity/Apptainer)
• Experience with GitOps practices and CI/CD pipelines for Kubernetes platforms
• Contributions to open-source projects in Kubernetes or NVIDIA ecosystems
• Experience advising customers on future-state architectures and emerging technologies
• Bachelor's or Master's degree in Computer Science, Engineering, Physics, or related field
• Relevant certifications such as CKA, CKAD, CKS, AWS Solutions Architect, or Azure Solutions Architect Expert