ML Inference & Optimization Engineer

Careerfit Ai

Maharashtra, IN · Full-Time

You Bring

  • 3+ years of experience deploying and optimizing machine learning models in production, including 1+ years deploying deep learning models
  • Experience deploying async inference APIs (FastAPI, gRPC, Ray Serve, etc.)
  • Understanding of PyTorch internals and inference-time optimization
  • Familiarity with LLM runtimes: vLLM, TGI, TensorRT-LLM, ONNX Runtime, etc.
  • Familiarity with GPU profiling tools (Nsight, nvtop) and model quantization pipelines

Posted 11 Mar 2026 · Listing from OnJob.io.