GenAI and Open-source LLM Developer

L&T Precision Engineering and Systems

IN · Full-time

We are seeking a skilled developer to design, optimize, and deploy generative AI models and LLMs in offline and on-premises settings. The role focuses on building high-throughput, low-latency inference systems and robust MLOps pipelines for private environments.

Responsibilities:

  • Deploy and optimize LLMs using frameworks like TGI, vLLM, and DeepSpeed
  • Accelerate models with TensorRT-LLM, quantization, and GPU optimizations
  • Build offline RAG/agent systems with LangChain, Google ADK, and SGLang
  • Fine-tune models (LoRA/QLoRA) for edge and constrained environments
  • Develop CI/CD pipelines with Docker & Kubernetes/OpenShift
  • Create production-ready APIs with FastAPI, SQLAlchemy/Alembic
  • Research new inference algorithms & quantization techniques

Qualifications:

  • Bachelor’s/Master’s in CS, Engineering, or a related field
  • 4–5 years of ML experience, including at least 2 years in LLM optimization/deployment
  • Strong Python & CUDA/GPU expertise
  • Proven record of deploying offline, high-performance LLMs

Preferred:

  • Experience in air-gapped/non-cloud environments
  • Knowledge of Spark/Kafka for RAG pipelines
  • Contributions to open-source LLM frameworks

Posted 12 Mar 2026 · Listing from OnJob.io.