AI Research Engineer

Flam is building AI infrastructure for brands in immersive advertising, spanning channels including Digital, Broadcast TV, Retail, Communications, Print, and OOH.

Vision: The Immersive & Interactive Layer for Every Screen & Surface

Flam aims to redefine how consumers interact with ads and content of every shape and form, from retail aisles to live broadcasts and fan moments, turning content and interfaces into shoppable, shareable experiences that deliver measurable ROI. Flam has raised a $14 million Series A round led by global technology investor RTP Global, with participation from Dovetail and select others, bringing its total funding to $22 million.

The next phase of growth will accelerate R&D on Flam's app-less GenAI infrastructure, which lets brands create, publish, and measure high-fidelity MR, 3D, and digital experiences in under 300 ms on any smartphone, with no app download required. The same infrastructure already powers advertising for Google, Samsung, Emirates, and hundreds of global enterprises and agency powerhouses.

Key Focus Areas:

Product Roadmap: Upcoming releases include GenAI-driven 3D asset generation, democratised MR deployment at scale, an enterprise suite of products across industries, and infrastructure for broadcasters and fan engagement.
Geography: Funds will support new enterprise pods in North America, Europe and the Middle East while deepening Asia operations.
Partnerships: Flam will expand its partner program for creative studios and global platforms, enabling Fortune 500 brands to move from pilot to rapid global roll-out.

Job Brief - We are seeking Research Engineers to advance artificial intelligence across multiple modalities and applications, working at the intersection of cutting-edge research and production engineering. 

What you'll do:

  • Design and train models across text, vision, and audio modalities from first principles through production deployment. 
  • Build AI systems that perform previously impossible tasks or achieve unprecedented performance levels. 
  • Implement novel architectures and training techniques, then scale them to real applications. 
  • Work across LLMs, vision-language models, agentic systems, and generative AI. 
  • Ship production systems with immediate user impact while contributing to open research. 
  • Collaborate with a small, high-impact team where technical excellence drives product innovation.

Technical Requirements

Core Competencies 
Model Training 
  • FSDP and DeepSpeed frameworks
  • Mixed-precision training (bf16/fp8)
  • Gradient accumulation strategies
  • Effective batch sizes >10,000
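
The bullets above combine in practice: gradient accumulation is how a training loop reaches effective batch sizes in the tens of thousands without the memory for one giant batch. A minimal PyTorch sketch, assuming bf16 autocast on CPU and a toy model and data as illustrative placeholders:

```python
import torch
import torch.nn as nn

# Minimal sketch: bf16 autocast with gradient accumulation.
# Model, data, and hyperparameters are illustrative placeholders.
model = nn.Linear(16, 1)
opt = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.MSELoss()

ACCUM_STEPS = 4  # effective batch = micro-batch size * ACCUM_STEPS
micro_batches = [(torch.randn(8, 16), torch.randn(8, 1)) for _ in range(ACCUM_STEPS)]

opt.zero_grad()
for x, y in micro_batches:
    # Autocast runs the forward pass in bfloat16 where it is safe to do so;
    # master weights and accumulated gradients stay in fp32.
    with torch.autocast(device_type="cpu", dtype=torch.bfloat16):
        loss = loss_fn(model(x), y) / ACCUM_STEPS  # scale so grads average
    loss.backward()  # gradients accumulate across micro-batches

# One optimizer step per effective batch.
opt.step()
opt.zero_grad()
```

FSDP or DeepSpeed would wrap `model` and shard the optimizer state; the accumulation pattern itself is unchanged.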

Architectures 
  • Transformers, diffusion models, VAEs 
  • Mixture-of-experts systems 
  • Retrieval-augmented architectures 
  • Custom attention mechanisms and loss functions
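
"Custom attention mechanisms" starts from being able to write plain scaled dot-product attention by hand. A minimal single-head sketch (the masking convention and shapes are illustrative choices, not a prescribed interface):

```python
import math
import torch

def scaled_dot_product_attention(q, k, v, mask=None):
    """Minimal single-head attention: softmax(QK^T / sqrt(d)) V.

    q, k, v: (batch, seq, d) tensors; mask: optional boolean (seq, seq)
    where True marks positions to hide (an illustrative convention).
    """
    d = q.size(-1)
    scores = q @ k.transpose(-2, -1) / math.sqrt(d)  # (batch, seq, seq)
    if mask is not None:
        scores = scores.masked_fill(mask, float("-inf"))
    weights = torch.softmax(scores, dim=-1)
    return weights @ v

# Causal mask: position i may only attend to positions <= i.
seq = 4
causal = torch.triu(torch.ones(seq, seq, dtype=torch.bool), diagonal=1)
q = k = v = torch.randn(1, seq, 8)
out = scaled_dot_product_attention(q, k, v, mask=causal)
```

Custom variants (sparse patterns, relative position biases, custom losses over `weights`) are edits to exactly these few lines.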

Scale & Infrastructure 
  • Multi-node training 
  • Large-scale dataset handling
  • SLURM/Kubernetes orchestration 
  • Distributed storage (S3/GCS) 
  • Checkpoint management strategies 
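
One common checkpoint management strategy is atomic writes plus keep-last-N retention. A hedged pure-Python sketch (the naming scheme, the `keep_last` policy, and the byte-string stand-in for serialized state are all illustrative assumptions; a real trainer would pass `torch.save` output and coordinate across ranks):

```python
import os
import tempfile

def save_checkpoint(state: bytes, ckpt_dir: str, step: int, keep_last: int = 3) -> str:
    """Write a checkpoint atomically and prune old ones (keep-last-N policy).

    `state` stands in for serialized model/optimizer state; the naming
    scheme and retention policy are illustrative.
    """
    path = os.path.join(ckpt_dir, f"step_{step:08d}.ckpt")
    tmp = path + ".tmp"
    with open(tmp, "wb") as f:
        f.write(state)
    os.replace(tmp, path)  # atomic rename: readers never see partial files

    # Prune: keep only the newest `keep_last` checkpoints.
    ckpts = sorted(p for p in os.listdir(ckpt_dir) if p.endswith(".ckpt"))
    for old in ckpts[:-keep_last]:
        os.remove(os.path.join(ckpt_dir, old))
    return path

ckpt_dir = tempfile.mkdtemp()
for step in range(5):
    save_checkpoint(b"fake-state", ckpt_dir, step, keep_last=3)
remaining = sorted(os.listdir(ckpt_dir))
# Only the three newest checkpoints survive.
```

With S3/GCS-backed storage the rename trick is replaced by upload-then-commit, but the retention logic is the same.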

Performance Optimization
  • CUDA kernel development 
  • FlashAttention implementation 
  • Tensor parallelism 
  • KV cache optimization 
  • Quantization (GPTQ/AWQ) 
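
KV cache optimization rests on one invariant: attending with cached keys/values must reproduce a full recompute. A minimal sketch (single head, and the same random tensor reused as query/key/value purely to keep the example short):

```python
import torch

def attend(q, k, v):
    # Single-head attention for the current query over all cached keys/values.
    scores = q @ k.transpose(-2, -1) / (q.size(-1) ** 0.5)
    return torch.softmax(scores, dim=-1) @ v

# Incremental decoding with a KV cache: per step, compute k/v only for the
# new token and append, instead of recomputing the whole prefix.
torch.manual_seed(0)
d = 8
tokens = [torch.randn(1, 1, d) for _ in range(5)]  # stand-ins for projected q/k/v

k_cache, v_cache, cached_outs = [], [], []
for t in tokens:
    k_cache.append(t)  # in a real model: k = W_k(hidden); here t is reused
    v_cache.append(t)
    k = torch.cat(k_cache, dim=1)
    v = torch.cat(v_cache, dim=1)
    cached_outs.append(attend(t, k, v))

# Full recompute for the last position must match the cached result.
k_full = torch.cat(tokens, dim=1)
full_last = attend(tokens[-1], k_full, k_full)
```

Real optimizations (paged caches, cache quantization) change how `k_cache`/`v_cache` are stored, not this equivalence.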

Specific Technical Skills
  • Implement RLHF/DPO/GRPO from scratch
  • Debug NaN gradients in distributed settings 
  • Profile and optimize GPU memory usage 
  • Write custom CUDA kernels when needed 
  • Build evaluation frameworks for novel tasks 
  • Design multimodal data pipelines 
  • Create production-ready inference servers 
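
As one example of "from scratch": the per-pair DPO loss is a single formula over log-probabilities from the policy and a frozen reference model. A minimal sketch with hand-picked scalar log-probs as illustrative inputs (a real implementation sums token log-probs over each response and batches the computation):

```python
import math

def dpo_loss(logp_chosen, logp_rejected, ref_chosen, ref_rejected, beta=0.1):
    """Direct Preference Optimization loss for one preference pair.

    loss = -log sigmoid(beta * [(logp_c - ref_c) - (logp_r - ref_r)])
    Inputs are summed log-probabilities of the chosen/rejected responses
    under the policy and the frozen reference model.
    """
    margin = (logp_chosen - ref_chosen) - (logp_rejected - ref_rejected)
    return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))

# When the policy prefers the chosen response more than the reference does,
# the margin is positive and the loss falls below log(2).
loss_good = dpo_loss(-10.0, -14.0, -12.0, -12.0)     # positive margin
loss_neutral = dpo_loss(-12.0, -12.0, -12.0, -12.0)  # zero margin
```

At zero margin the loss is exactly log 2 (about 0.693); minimizing it pushes the policy to widen the chosen-vs-rejected gap relative to the reference.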

What we’re looking for:
  • Minimum 4 years of relevant experience
  • M.Tech or Ph.D.

Technical Stack:
  • Training: PyTorch 2.0+, JAX/Flax, custom training loops 
  • Data: Apache Beam, Ray Data, custom preprocessing pipelines 
  • Monitoring: Weights & Biases, custom metrics dashboards 
  • Deployment: Kubernetes, vLLM, NVIDIA Dynamo

Example Projects:
  • Multimodal RAG: Building retrieval systems that understand code, images, and text simultaneously
  • Efficient Fine-tuning: Implementing LoRA variants at 70B+ scale
  • Agent Infrastructure: Creating production-ready tool-use frameworks
  • Custom Architectures: Designing domain-specific models that outperform general models
  • Edge Deployment: Quantizing and optimizing models for consumer hardware
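
The LoRA project above reduces to a small core: freeze the base weight and learn a low-rank update. A minimal sketch (the rank, scaling, and zero-init of `B` follow the standard LoRA recipe; the layer wrapper itself is an illustrative design, not a prescribed one):

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Minimal LoRA adapter: y = W x + (alpha / r) * B(A x).

    The frozen base weight W is augmented by a trainable low-rank
    update B @ A; B starts at zero so training begins at the base model.
    """
    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # only the adapters train
        self.lora_a = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.lora_b = nn.Parameter(torch.zeros(base.out_features, r))
        self.scale = alpha / r

    def forward(self, x):
        return self.base(x) + self.scale * (x @ self.lora_a.T @ self.lora_b.T)

base = nn.Linear(16, 4)
lora = LoRALinear(base, r=4)
x = torch.randn(2, 16)
# B is zero-initialized, so the adapted layer initially matches the base.
```

At 70B+ scale the same pattern is applied to attention and MLP projections, with only the small `A`/`B` matrices held in optimizer state.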

What Sets This Role Apart:
  • Direct ownership and impact—your code powers real products 
  • Work across the entire AI stack, not narrow specialization 
  • Immediate path from research to deployment 
  • Small team with significant technical autonomy 
  • Focus on shipping products that users love