AI Research Engineer

Flam is building AI infrastructure for brands in immersive advertising, spanning channels including Digital, Broadcast TV, Retail, Communications, Print, and OOH.

Vision: The Immersive & Interactive Layer for Every Screen & Surface

Flam aims to redefine how consumers interact with ads and content of every shape and form, from retail aisles to live broadcasts and fan moments, turning content and interfaces into shoppable, shareable experiences that deliver measurable ROI. Flam has raised a $14 million Series A round led by global technology investor RTP Global, with participation from Dovetail and select others, bringing its total funding to $22 million.

The next phase of growth will accelerate R&D on Flam's app-less GenAI infrastructure, which lets brands create, publish, and measure high-fidelity MR, 3D, and digital experiences in under 300 ms on any smartphone, with no app download required. The same infrastructure already powers advertising for Google, Samsung, Emirates, and hundreds of global enterprises and agency powerhouses.

Key Focus Areas:

Product Roadmap: Upcoming releases include GenAI-driven 3D asset generation, democratised MR deployment at scale, an enterprise suite of products across industries, and infrastructure for broadcasters and fan engagement.
Geography: Funds will support new enterprise pods in North America, Europe and the Middle East while deepening Asia operations.
Partnerships: Flam will expand its partner program for creative studios and global platforms, enabling Fortune 500 brands to move from pilot to rapid global roll-out.

Job Brief - We are seeking Research Engineers to advance artificial intelligence across multiple modalities and applications, working at the intersection of cutting-edge research and production engineering. 

What you'll do:

  • Design and train models across text, vision, and audio modalities from first principles through production deployment. 
  • Build AI systems that perform previously impossible tasks or achieve unprecedented performance levels. 
  • Implement novel architectures and training techniques, then scale them to real applications. 
  • Work across LLMs, vision-language models, agentic systems, and generative AI. 
  • Ship production systems with immediate user impact while contributing to open research. 
  • Collaborate with a small, high-impact team where technical excellence drives product innovation.

Technical Requirements

Core Competencies 
Model Training 
  • FSDP and DeepSpeed frameworks
  • Mixed-precision training (bf16/fp8)
  • Gradient accumulation strategies
  • Effective batch sizes >10,000
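
The bullets above combine in practice: gradient accumulation is how a training loop reaches effective batch sizes in the tens of thousands without the memory for one giant batch. A minimal PyTorch sketch, assuming bf16 autocast on CPU and a toy model and data as illustrative placeholders:

```python
import torch
import torch.nn as nn

# Minimal sketch: bf16 autocast with gradient accumulation.
# Model, data, and hyperparameters are illustrative placeholders.
model = nn.Linear(16, 1)
opt = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.MSELoss()

ACCUM_STEPS = 4  # effective batch = micro-batch size * ACCUM_STEPS
micro_batches = [(torch.randn(8, 16), torch.randn(8, 1)) for _ in range(ACCUM_STEPS)]

opt.zero_grad()
for x, y in micro_batches:
    # Autocast runs the forward pass in bfloat16 where it is safe to do so;
    # master weights and accumulated gradients stay in fp32.
    with torch.autocast(device_type="cpu", dtype=torch.bfloat16):
        loss = loss_fn(model(x), y) / ACCUM_STEPS  # scale so grads average
    loss.backward()  # gradients accumulate across micro-batches

# One optimizer step per effective batch.
opt.step()
opt.zero_grad()
```

FSDP or DeepSpeed would wrap `model` and shard the optimizer state; the accumulation pattern itself is unchanged.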

Architectures 
  • Transformers, diffusion models, VAEs 
  • Mixture-of-experts systems 
  • Retrieval-augmented architectures 
  • Custom attention mechanisms and loss functions
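
"Custom attention mechanisms" starts from being able to write plain scaled dot-product attention by hand. A minimal single-head sketch (the masking convention and shapes are illustrative choices, not a prescribed interface):

```python
import math
import torch

def scaled_dot_product_attention(q, k, v, mask=None):
    """Minimal single-head attention: softmax(QK^T / sqrt(d)) V.

    q, k, v: (batch, seq, d) tensors; mask: optional boolean (seq, seq)
    where True marks positions to hide (an illustrative convention).
    """
    d = q.size(-1)
    scores = q @ k.transpose(-2, -1) / math.sqrt(d)  # (batch, seq, seq)
    if mask is not None:
        scores = scores.masked_fill(mask, float("-inf"))
    weights = torch.softmax(scores, dim=-1)
    return weights @ v

# Causal mask: position i may only attend to positions <= i.
seq = 4
causal = torch.triu(torch.ones(seq, seq, dtype=torch.bool), diagonal=1)
q = k = v = torch.randn(1, seq, 8)
out = scaled_dot_product_attention(q, k, v, mask=causal)
```

Custom variants (sparse patterns, relative position biases, custom losses over `weights`) are edits to exactly these few lines.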

Scale & Infrastructure 
  • Multi-node training 
  • Large-scale dataset handling
  • SLURM/Kubernetes orchestration 
  • Distributed storage (S3/GCS) 
  • Checkpoint management strategies 
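
One common checkpoint management strategy is atomic writes plus keep-last-N retention. A hedged pure-Python sketch (the naming scheme, the `keep_last` policy, and the byte-string stand-in for serialized state are all illustrative assumptions; a real trainer would pass `torch.save` output and coordinate across ranks):

```python
import os
import tempfile

def save_checkpoint(state: bytes, ckpt_dir: str, step: int, keep_last: int = 3) -> str:
    """Write a checkpoint atomically and prune old ones (keep-last-N policy).

    `state` stands in for serialized model/optimizer state; the naming
    scheme and retention policy are illustrative.
    """
    path = os.path.join(ckpt_dir, f"step_{step:08d}.ckpt")
    tmp = path + ".tmp"
    with open(tmp, "wb") as f:
        f.write(state)
    os.replace(tmp, path)  # atomic rename: readers never see partial files

    # Prune: keep only the newest `keep_last` checkpoints.
    ckpts = sorted(p for p in os.listdir(ckpt_dir) if p.endswith(".ckpt"))
    for old in ckpts[:-keep_last]:
        os.remove(os.path.join(ckpt_dir, old))
    return path

ckpt_dir = tempfile.mkdtemp()
for step in range(5):
    save_checkpoint(b"fake-state", ckpt_dir, step, keep_last=3)
remaining = sorted(os.listdir(ckpt_dir))
# Only the three newest checkpoints survive.
```

With S3/GCS-backed storage the rename trick is replaced by upload-then-commit, but the retention logic is the same.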

Performance Optimization
  • CUDA kernel development 
  • FlashAttention implementation 
  • Tensor parallelism 
  • KV cache optimization 
  • Quantization (GPTQ/AWQ) 
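
KV cache optimization rests on one invariant: attending with cached keys/values must reproduce a full recompute. A minimal sketch (single head, and the same random tensor reused as query/key/value purely to keep the example short):

```python
import torch

def attend(q, k, v):
    # Single-head attention for the current query over all cached keys/values.
    scores = q @ k.transpose(-2, -1) / (q.size(-1) ** 0.5)
    return torch.softmax(scores, dim=-1) @ v

# Incremental decoding with a KV cache: per step, compute k/v only for the
# new token and append, instead of recomputing the whole prefix.
torch.manual_seed(0)
d = 8
tokens = [torch.randn(1, 1, d) for _ in range(5)]  # stand-ins for projected q/k/v

k_cache, v_cache, cached_outs = [], [], []
for t in tokens:
    k_cache.append(t)  # in a real model: k = W_k(hidden); here t is reused
    v_cache.append(t)
    k = torch.cat(k_cache, dim=1)
    v = torch.cat(v_cache, dim=1)
    cached_outs.append(attend(t, k, v))

# Full recompute for the last position must match the cached result.
k_full = torch.cat(tokens, dim=1)
full_last = attend(tokens[-1], k_full, k_full)
```

Real optimizations (paged caches, cache quantization) change how `k_cache`/`v_cache` are stored, not this equivalence.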

Specific Technical Skills
  • Implement RLHF/DPO/GRPO from scratch
  • Debug NaN gradients in distributed settings 
  • Profile and optimize GPU memory usage 
  • Write custom CUDA kernels when needed 
  • Build evaluation frameworks for novel tasks 
  • Design multimodal data pipelines 
  • Create production-ready inference servers 
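
As one example of "from scratch": the per-pair DPO loss is a single formula over log-probabilities from the policy and a frozen reference model. A minimal sketch with hand-picked scalar log-probs as illustrative inputs (a real implementation sums token log-probs over each response and batches the computation):

```python
import math

def dpo_loss(logp_chosen, logp_rejected, ref_chosen, ref_rejected, beta=0.1):
    """Direct Preference Optimization loss for one preference pair.

    loss = -log sigmoid(beta * [(logp_c - ref_c) - (logp_r - ref_r)])
    Inputs are summed log-probabilities of the chosen/rejected responses
    under the policy and the frozen reference model.
    """
    margin = (logp_chosen - ref_chosen) - (logp_rejected - ref_rejected)
    return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))

# When the policy prefers the chosen response more than the reference does,
# the margin is positive and the loss falls below log(2).
loss_good = dpo_loss(-10.0, -14.0, -12.0, -12.0)     # positive margin
loss_neutral = dpo_loss(-12.0, -12.0, -12.0, -12.0)  # zero margin
```

At zero margin the loss is exactly log 2 (about 0.693); minimizing it pushes the policy to widen the chosen-vs-rejected gap relative to the reference.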

What we’re looking for:
  • Minimum 4 years of relevant experience
  • M.Tech or Ph.D.

Technical Stack:
  • Training: PyTorch 2.0+, JAX/Flax, custom training loops 
  • Data: Apache Beam, Ray Data, custom preprocessing pipelines 
  • Monitoring: Weights & Biases, custom metrics dashboards 
  • Deployment: Kubernetes, vLLM, NVIDIA Dynamo

Example Projects:
  • Multimodal RAG: Building retrieval systems that understand code, images, and text simultaneously
  • Efficient Fine-tuning: Implementing LoRA variants at 70B+ scale
  • Agent Infrastructure: Creating production-ready tool-use frameworks
  • Custom Architectures: Designing domain-specific models that outperform general models
  • Edge Deployment: Quantizing and optimizing models for consumer hardware
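
The LoRA project above reduces to a small core: freeze the base weight and learn a low-rank update. A minimal sketch (the rank, scaling, and zero-init of `B` follow the standard LoRA recipe; the layer wrapper itself is an illustrative design, not a prescribed one):

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Minimal LoRA adapter: y = W x + (alpha / r) * B(A x).

    The frozen base weight W is augmented by a trainable low-rank
    update B @ A; B starts at zero so training begins at the base model.
    """
    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # only the adapters train
        self.lora_a = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.lora_b = nn.Parameter(torch.zeros(base.out_features, r))
        self.scale = alpha / r

    def forward(self, x):
        return self.base(x) + self.scale * (x @ self.lora_a.T @ self.lora_b.T)

base = nn.Linear(16, 4)
lora = LoRALinear(base, r=4)
x = torch.randn(2, 16)
# B is zero-initialized, so the adapted layer initially matches the base.
```

At 70B+ scale the same pattern is applied to attention and MLP projections, with only the small `A`/`B` matrices held in optimizer state.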

What Sets This Role Apart:
  • Direct ownership and impact—your code powers real products 
  • Work across the entire AI stack, not narrow specialization 
  • Immediate path from research to deployment 
  • Small team with significant technical autonomy 
  • Focus on shipping products that users love