ML Inference & Optimization Engineer

Careerfit Ai

Maharashtra, IN · Full-Time

You Bring

3+ years of experience deploying and optimizing machine learning models in production, with 1+ years of experience deploying deep learning models
Experience deploying async inference APIs (FastAPI, gRPC, Ray Serve, etc.)
Understanding of PyTorch internals and inference-time optimization
Familiarity with LLM runtimes: vLLM, TGI, TensorRT-LLM, ONNX Runtime, etc.
Familiarity with GPU profiling tools (Nsight, nvtop) and model quantization pipelines

Posted 11 Mar 2026 · Listing from OnJob.io.
