ML Inference & Optimization Engineer

Careerfit Ai

Maharashtra, IN Full-time
Posted on: March 11, 2026
You Bring
• 3+ years of experience deploying and optimizing machine learning models in production, including 1+ year deploying deep learning models
• Experience deploying async inference APIs (FastAPI, gRPC, Ray Serve, etc.)
• Understanding of PyTorch internals and inference-time optimization
• Familiarity with LLM runtimes: vLLM, TGI, TensorRT-LLM, ONNX Runtime, etc.
• Familiarity with GPU profiling tools (Nsight, nvtop) and model quantization pipelines

