Systems / Network Engineer

Flam is building AI infrastructure for brands in immersive advertising across all channels: digital, broadcast TV, retail, communications, print, OOH and more.

Vision: The Immersive & Interactive Layer for Every Screen & Surface

Flam aims to redefine how consumers interact with ads, content in every shape and form, retail aisles, live broadcasts and fan moments, turning content and interfaces into shoppable, shareable experiences that deliver measurable ROI. Flam has raised a $14 million Series A round led by global technology investor RTP Global, with participation from Dovetail and select others, bringing its total funding to $22 million.

The next phase of growth is to accelerate R&D on its app-less GenAI infrastructure, which lets brands create, publish and measure high-fidelity MR, 3D and digital experiences in <300 ms on any smartphone, with no app download required. The same infrastructure already powers advertising for Google, Samsung, Emirates and hundreds of global enterprises and agency powerhouses.

Key Focus Areas:

  • Product Roadmap: Upcoming releases include GenAI-driven 3D asset generation, democratised MR deployment at scale, an enterprise suite of products across industries, and infrastructure for broadcasters and fan engagement
  • Geography: Funds will support new enterprise pods in North America, Europe and the Middle East while deepening Asia operations
  • Partnerships: Flam will expand its partner program for creative studios and global platforms, enabling Fortune 500 brands to move from pilot to rapid global roll-out

What You'll Be Doing:

  • Design and implement custom Kubernetes controllers that manage GPU node lifecycle across multiple cloud providers, handling provisioning, bootstrapping, and deprovisioning with zero manual intervention
  • Build and maintain autoscaling systems that analyze pod resource requirements (GPU count, memory, CPU) and automatically provision appropriate instance types from provider APIs
  • Develop microVM-based isolation for multi-tenant GPU workloads using cloud-hypervisor with VFIO GPU passthrough, enabling secure, high-performance compute for inference and training workloads
  • Create node bootstrapping automation that provisions bare-metal or cloud instances with container runtimes, GPU drivers, Kubernetes components, and custom networking configurations via cloud-init, Ansible, or Terraform
  • Implement sophisticated networking solutions connecting hybrid infrastructure - GKE/GCE control planes with external GPU workers using WireGuard, custom CNI configurations, and cross-cloud service mesh
  • Build Kubernetes operators and CRDs for managing ML infrastructure components like model registries, inference endpoints, training job orchestration, and GPU time-slicing configurations
  • Design monitoring, cost optimization, and capacity planning systems that provide visibility into GPU utilization, workload patterns, and infrastructure efficiency across heterogeneous compute pools
  • Work closely with ML researchers to understand workload requirements and translate them into infrastructure automation that enables rapid experimentation and production deployment

What We Need to See:
  • 5+ years of experience building production Kubernetes systems with deep expertise in controllers, operators, CustomResourceDefinitions, and API machinery
  • Strong proficiency in Go and experience building scalable, reliable services that manage complex distributed systems
  • Hands-on experience with GPU infrastructure in Kubernetes - NVIDIA GPU Operator, device plugins, time-slicing configurations, or custom GPU scheduling logic
  • Deep understanding of Kubernetes architecture including admission controllers, scheduler extenders, resource lifecycle management, and cluster autoscaling mechanisms
  • Demonstrated ability to design and implement automation systems that replace manual processes with API-driven, self-service tooling
  • Experience with at least one cloud provider's APIs (GCP, AWS, Azure) for programmatic compute provisioning and management
  • Strong Linux systems knowledge including networking (iptables, WireGuard, CNIs), storage (LVM, device mapper), and virtualization (KVM, QEMU, cloud-hypervisor)
  • Bachelor's/Master's degree in Computer Science, Engineering, or equivalent practical experience

Ways to Stand Out from the Crowd:
  • Experience building hybrid/multi-cloud Kubernetes architectures with cross-provider networking and unified control planes
  • Deep familiarity with microVM technologies (Firecracker, cloud-hypervisor, Kata Containers) and their application to GPU workloads
  • Hands-on experience with VFIO GPU passthrough, SR-IOV, MIG (Multi-Instance GPU), or other GPU virtualization technologies
  • Track record of building custom cloud controllers or provider implementations for bare-metal or specialized compute
  • Experience with ML infrastructure patterns - model serving, training orchestration, experiment tracking, or distributed training frameworks
  • Contributions to upstream Kubernetes projects, CNCF ecosystem tools, or GPU-related open source projects
  • Understanding of cost optimization strategies for GPU compute, including spot instances, preemption handling, and intelligent workload placement
  • Experience with Infrastructure as Code (Terraform, Pulumi) for complex multi-provider deployments
