ML Inference & Optimization Engineer
Careerfit Ai
Posted on: March 11, 2026
You Bring
• 3+ years of experience deploying and optimizing machine learning models in production, with 1+ years of experience deploying deep learning models
• Experience deploying async inference APIs (FastAPI, gRPC, Ray Serve, etc.)
• Understanding of PyTorch internals and inference-time optimization
• Familiarity with LLM runtimes: vLLM, TGI, TensorRT-LLM, ONNX Runtime, etc.
• Familiarity with GPU profiling tools (Nsight, nvtop) and model quantization pipelines
About Company
Careerfit Ai
Maharashtra, IN