ML Inference & Optimization Engineer
Careerfit Ai
Maharashtra, IN · Full-Time
You Bring
- 3+ years of experience deploying and optimizing machine learning models in production, with 1+ years deploying deep learning models
- Experience deploying async inference APIs (FastAPI, gRPC, Ray Serve, etc.)
- Understanding of PyTorch internals and inference-time optimization
- Familiarity with LLM runtimes: vLLM, TGI, TensorRT-LLM, ONNX Runtime, etc.
- Familiarity with GPU profiling tools (Nsight, nvtop) and model quantization pipelines
Posted 11 Mar 2026 · Listing from OnJob.io.