Job Description

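## Requirements


1. Bachelor's degree or above, with 3+ years of backend development experience; expert-level Python backend skills and a solid grounding in data structures and system design.

2. Familiarity with mainstream Python web frameworks (such as FastAPI, Flask, or Django) and an understanding of the asynchronous programming model and coroutine mechanisms.

3. Experience designing high-concurrency, high-availability systems, and an understanding of distributed-systems fundamentals (service registration and discovery, load balancing, rate limiting, circuit breaking, degradation strategies, etc.).

4. Familiarity with Linux development environments, an understanding of how Docker works, hands-on K8s experience, and an understanding of container scheduling and resource management.

5. Experience integrating model services or aggregating external APIs, and an understanding of the performance characteristics, latency control, and cost optimization strategies of inference services.

6. Familiarity with Redis, message queues (Kafka, RabbitMQ, etc.), or the design of caching and asynchronous task systems is preferred.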



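## Preferred Qualifications


1. Experience building algorithm service platforms, shared technical platforms, or cloud-native platforms.

2. Experience designing multi-model scheduling, policy engines, or traffic distribution systems.

3. Familiarity with service mesh and observability stacks (Prometheus, OpenTelemetry, etc.).

4. Knowledge of large-model inference architectures, GPU resource scheduling, or inference performance optimization.

5. Experience with stability engineering for large-scale systems (load testing, capacity planning, failure drills, etc.).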



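## What We Offer


1. The chance to take part, from zero to one, in the architecture design and core module development of the AI Algorithm Services Platform.

2. Involvement in building the multi-model integration and scheduling system, solving real high-concurrency and cost optimization problems.

3. Ample room for technical decision-making, with encouragement to grow in engineering quality and system design.

4. A latest-model MacBook Pro on joining, plus AI tooling support (such as Cursor).

5. A flat, open technical culture with close collaboration with the algorithm and business teams.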



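## Who We Are Looking For


1. You care about system architecture and engineering quality, and want to build a technical foundation that can evolve over the long term.

2. You are keenly interested in AI infrastructure and in turning models into production engineering.

3. You want to help build a core platform that genuinely supports growth in business scale.


Join us in building high-performance, scalable algorithm service infrastructure for the AI era.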


Backend Software Engineer (Python) | AI Algorithm Services Platform

About the Role

We are building a next-generation AI Algorithm Services Platform that standardizes the integration, orchestration, and governance of AI models and external algorithm providers. As a Backend Software Engineer, you will play a key role in designing and developing the engineering infrastructure that enables scalable, high-performance, and highly available AI services.

You will help create a unified platform for model onboarding, routing, scheduling, observability, and runtime governance, forming the foundation of AI-powered applications at scale.

Responsibilities

  • Design and implement engineering frameworks for AI algorithm services, including service packaging, containerization, deployment standards, and runtime governance.
  • Build and enhance a multi-model integration and orchestration platform, supporting model routing, concurrency control, rate limiting, circuit breaking, priority scheduling, and cost optimization strategies.
  • Architect and optimize backend systems for high-concurrency workloads, ensuring scalability, reliability, and fault tolerance under high-QPS and complex dependency scenarios.
  • Develop comprehensive observability capabilities, including logging, metrics, tracing, monitoring, and alerting, to improve system performance and resiliency.
  • Collaborate closely with AI researchers, product managers, and business teams to efficiently deliver AI capabilities and establish reusable engineering best practices.

Requirements

  • Bachelor's degree or above in Computer Science, Software Engineering, or a related field.
  • 3+ years of backend development experience with strong proficiency in Python.
  • Solid understanding of data structures, algorithms, software architecture, and system design principles.
  • Hands-on experience with mainstream Python web frameworks such as FastAPI, Flask, or Django.
  • Strong understanding of asynchronous programming, event-driven architectures, and coroutine mechanisms.
  • Experience designing and operating high-concurrency, highly available systems, with an understanding of distributed-systems fundamentals such as service discovery, load balancing, rate limiting, circuit breaking, and degradation strategies.
  • Familiarity with Linux development environments and container technologies.
  • Practical experience with Docker and Kubernetes, including container orchestration, scheduling, and resource management.
  • Experience integrating AI model services or aggregating external APIs.
  • Understanding of inference service performance characteristics, latency optimization, throughput management, and cost-efficiency strategies.
  • Experience with Redis, message queues (Kafka, RabbitMQ, etc.), caching systems, or asynchronous task processing frameworks is a strong plus.

Preferred Qualifications

  • Experience building AI service platforms, shared technical platforms, or cloud-native infrastructure.
  • Experience designing multi-model orchestration, traffic routing, policy engines, or intelligent scheduling systems.
  • Familiarity with Service Mesh technologies and observability ecosystems such as Prometheus and OpenTelemetry.
  • Understanding of LLM inference architectures, GPU resource scheduling, or inference performance optimization.
  • Experience in large-scale system reliability engineering, including load testing, capacity planning, and failure drills.

What We Offer

  • The opportunity to participate in the end-to-end design and development of an AI Algorithm Services Platform from the ground up.
  • Exposure to real-world challenges in multi-model orchestration, large-scale concurrency, and infrastructure cost optimization.
  • Significant ownership and influence over technical decisions, architecture, and engineering standards.
  • Latest-generation MacBook Pro and access to modern AI productivity tools such as Cursor.
  • A collaborative, flat organizational culture with close interaction between engineering, AI research, and business teams.

We’d Love to Meet Someone Who

  • Is passionate about system architecture, software craftsmanship, and building sustainable, scalable platforms.
  • Has a strong interest in AI infrastructure, model serving, and AI engineering.
  • Wants to help build mission-critical systems that directly support business growth and long-term scalability.

Join us in building high-performance, scalable, and reliable AI infrastructure for the next generation of intelligent applications.

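Company Information
Founded in 2008, Meitu is a technology company with beauty at its core and artificial intelligence as its driving force. Guided by its mission of letting art and technology meet beautifully, Meitu is committed to building excellent imaging and design products that make creating images, videos, and designs simpler, and to supporting the digital upgrade of the beauty industry through its industry solutions. Meitu was listed on the Main Board of the Hong Kong Stock Exchange in December 2016 (stock code: 1357.HK).
Department Information
The Meitu R&D team is responsible for end-to-end technical development and shared platform construction across the Meitu product family. We focus on AI imaging, multi-device collaboration, and data intelligence, using solid engineering to turn creativity into stable, polished product experiences for hundreds of millions of users. We look forward to having you join us and empower creativity with code.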