Senior / Lead Machine Learning Engineer

On-site
Salary not specified
Serbia

Job Description, Responsibilities & Requirements

About the Position

Senior / Lead Machine Learning Engineer, Serving - Serbia

Inworld is a research lab of top researchers and engineers building realtime voice models. Today, our models are the #1 ranked realtime voice models in the world and power the largest consumer-facing AI applications across various categories. Our work spans research and development, realtime inference optimization, and best-in-class APIs and products.

Responsibilities

  • Optimize realtime voice models
  • Develop and maintain high-performance systems
  • Collaborate with research and engineering teams

Requirements

  • Inference Optimization: Deep understanding of modern serving frameworks, such as vLLM or TRT-LLM, and the techniques they use.
  • Model Acceleration: Hands-on experience with quantization, distillation, caching strategies, continuous batching, paged attention, and speculative decoding.
  • High-Performance Systems: Proficiency in C++, CUDA, Rust, or highly optimized Python. Ability to profile code and optimize performance on NVIDIA GPUs.
  • Distributed Systems & Scaling: Experience with Kubernetes, Ray, custom load balancing, multi-GPU/multi-node inference, and handling thousands of concurrent connections.
  • Public work: Non-trivial systems programming projects, open-source contributions to major inference engines, or deep-dive technical write-ups.
  • Full-cycle ownership: Ability to take a model from the research team, containerize it, optimize its serving, and ensure it runs reliably in production.
  • Background: PhD in CS, Physics, Math, or equivalent practical experience building backend or ML systems.
  • Professional fluency in English: Required for daily collaboration with US-based leadership and engineering teams.

Who Thrives Here

  • Comfortable picking a direction and building the map along the way.
  • Believes engineering isn't finished until it’s shipped and stable.
  • Obsessed with the "why" behind architectures and solutions.
  • Thrives on deep context and understanding the fundamental logic behind decisions.

What Working Here Is Like

  • We hand you unclear problems and expect you to make them clear.
  • We value engineers who say "I don't know yet" and then design the benchmark or prototype that finds out.
  • We treat performance, latency, and reliability as first-class product features.
  • Impact comes before everything else, though we support sharing work and open-source contributions that move the field forward.
  • Flat structure, fast iterations, minimal process theater.

We Offer

  • Competitive compensation
  • Opportunity to work with cutting-edge AI technologies
  • Collaborative and innovative work environment

About the Company

Inworld is a leading AI research lab, recognized by CB Insights as one of the 100 most promising AI companies globally and named one of LinkedIn’s Top 10 Startups in the USA. Our technology has powered experiences from companies such as NVIDIA, Microsoft Xbox, Niantic, Logitech Streamlabs, Wishroll, Little Umbrella, and Bible Chat.

For candidates interested in relocating to the San Francisco Bay Area in the future, full U.S. visa and relocation support may be available, subject to business needs and applicable legal and work authorization requirements.

Job Details

Company name: Inworld AI
Location: Serbia
Employment Type: Full-time
Work Mode: On-site
Posted on TheJob: 4/11/2026
Last checked: 6/12/2026