GenAI and Open-source LLM Developer
L&T Precision Engineering and Systems
Posted on: March 12, 2026
We are seeking a skilled developer to design, optimize, and deploy offline/on-premises generative AI models and LLMs. This role focuses on building high-throughput, low-latency inference systems and robust MLOps pipelines for private environments.
Responsibilities:
• Deploy and optimize LLMs using frameworks such as TGI, vLLM, and DeepSpeed
• Accelerate models with TensorRT-LLM, quantization, and GPU optimizations
• Build offline RAG/agent systems with LangChain, Google ADK, and SGLang
• Fine-tune models (LoRA/QLoRA) for edge and constrained environments
• Develop CI/CD pipelines with Docker & Kubernetes/OpenShift
• Create production-ready APIs with FastAPI and SQLAlchemy/Alembic
• Research new inference algorithms & quantization techniques
Qualifications:
• Bachelor’s/Master’s in CS, Engineering, or a related field
• 4–5 years of ML experience, including at least 2 years in LLM optimization/deployment
• Strong Python & CUDA/GPU expertise
• Proven record of deploying offline, high-performance LLMs
Preferred:
• Experience in air-gapped/non-cloud environments
• Knowledge of Spark/Kafka for RAG pipelines
• Contributions to open-source LLM frameworks