This is a high-ownership “do whatever it takes” role for someone who wants to operate at founder speed, learn the full stack of an insurance/warranty business, and ship work that directly moves revenue, conversion, and retention.

About SureBright

SureBright is building the world’s first AI-native, agentic warranty company. We power warranty programs through merchants and are launching D2C in early Q2. We’re going vertical end-to-end across pricing, claims management, and fulfillment, and building systems that deliver fast, fair, auditable outcomes for customers while improving merchant conversion and unit economics.

What you’ll do

You will build the agentic layer of our core product: AI systems that reason, take actions, and reliably complete workflows across pricing/underwriting, policy issuance, claims intake, adjudication, fulfillment (repair/replacement/reimbursement), and other parts of the business.

Key responsibilities

• Design and ship production-grade AI agents that run real business processes (not demos)
• Build agentic architectures: orchestration, tool calling, state machines, memory, permissions, audit trails, human-in-the-loop, and fallback paths
• Own our RAG platform end-to-end: ingestion, chunking, embeddings, retrieval, reranking, citations/grounding, and hallucination mitigation
• Build evaluation and monitoring systems: offline eval sets, regression tests, online metrics, drift detection, and red-team suites
• Implement model optimization: prompt systems, structured outputs, fine-tuning where appropriate, latency/cost optimization, caching, and throughput tuning
• Build core ML systems for warranty/claims: document understanding, extraction, classification, anomaly/fraud signals, decision support, and SLA routing
• Partner tightly with product/ops to translate real workflows into deterministic, testable, compliant automation

What you’ll build (examples)

• Underwriting/pricing agents: real-time quote decisions using merchant/product/context signals with strict guardrails and auditability
• Claims copilot + auto-adjudication engine: intake triage, evidence requests, decision proposals with explanation, vendor routing, reimbursement automation
• OEM warranty parsing system: turn messy manufacturer policies into machine-readable coverage logic
• Internal ops copilots: tooling that reduces manual work and increases consistency across customer support, compliance, and finance

Requirements (must have)

• 4+ years building and shipping ML/LLM systems in production (or equivalent founder-level experience)
• Proven experience building agentic products/companies: multi-step workflows, tool use, orchestration, reliability engineering
• Deep hands-on expertise in:
  • RAG and retrieval systems (vector databases, reranking, grounding strategies)
  • LLM evals (golden sets, automated judging, human eval, regression pipelines)
  • Prompting and structured outputs (schemas, function/tool calling, robustness)
  • Model training/fine-tuning fundamentals and tradeoffs (when to tune vs prompt vs retrieve)
• Strong software engineering: clean APIs, testing, observability, performance tuning, secure-by-default design
• Comfortable owning ambiguous problems end-to-end and driving them to measurable outcomes

Strong preference (nice to have)

• Experience building systems with compliance/audit requirements (fintech/insurance/health/enterprise)
• Experience with document AI at scale (PDFs, images, messy inputs), and extracting structured truth reliably
• Experience designing human-in-the-loop workflows and escalation rules for high-stakes decisions
• Experience with infra for LLMs: model hosting, batching, streaming, caching, prompt/version management
• Startup or ex-founder background, especially shipping 0→1 products fast

What success looks like (first 90 days)

• You ship an agentic workflow that replaces meaningful manual ops work and improves a measurable metric (cycle time, accuracy, cost per claim, attach rate, CSAT)
• You implement an eval harness that catches regressions before production and gives us a reliable “quality score” per workflow
• You establish a scalable architecture pattern for agents (permissions, audit logs, observability, fallbacks) that the team can replicate

Tech environment

We’re cloud-native and move fast. Expect Python for ML/agents, TypeScript for product surfaces, Postgres for systems of record, event-driven services, and a modern LLM + retrieval stack with strong observability and CI/CD, running on AWS and Azure.

Why this role is special

• Build an AI-native category-defining company in a massive market
• Direct founder exposure and high leverage: your work will change the trajectory of the company
• Real breadth: growth + underwriting/claims ops + product, in one seat
• Career accelerant: if you perform, your scope and title will grow quickly

Staff AI Engineer (Agentic Systems)
