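## Requirements
1. Bachelor's degree or above, 3+ years of backend development experience, strong proficiency in Python backend development, and solid data structure and system design skills
2. Familiarity with mainstream Python web frameworks (such as FastAPI / Flask / Django), with an understanding of the asynchronous programming model and coroutine mechanisms
3. Experience designing high-concurrency, high-availability systems, and an understanding of distributed systems fundamentals (service registration and discovery, load balancing, rate limiting and circuit breaking, degradation strategies, etc.)
4. Familiarity with Linux development environments, an understanding of how Docker works, hands-on K8s experience, and an understanding of container scheduling and resource management mechanisms
5. Experience integrating model services or aggregating external APIs, with an understanding of the performance characteristics, latency control, and cost optimization strategies of inference services
6. Familiarity with Redis, message queues (Kafka / RabbitMQ, etc.), or the design of caching and asynchronous task systems is preferred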
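## Preferred Qualifications
1. Experience building an algorithm services platform, a shared technical platform, or a cloud-native platform
2. Experience designing multi-model scheduling, policy engine, or traffic distribution systems
3. Familiarity with Service Mesh and observability stacks (Prometheus / OpenTelemetry, etc.)
4. Understanding of large-model inference architectures, GPU resource scheduling, or inference performance optimization
5. Experience with large-scale system stability work (load testing, capacity planning, failure drills, etc.)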
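## What We Offer
1. Take part, from 0 to 1, in the architecture design and core module development of the AI algorithm services platform
2. Help build the multi-model integration and scheduling systems, solving real high-concurrency and cost optimization problems
3. Ample room for technical decision-making, with encouragement to grow in engineering quality and system design
4. A latest-model MacBook Pro on joining, plus AI tooling support (such as Cursor)
5. A flat, open technical culture with close collaboration with the algorithm and business teams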
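## Who We Are Looking For
1. You care about system architecture and engineering quality, and want to build a technical foundation that can evolve over the long term
2. You are keenly interested in AI infrastructure and in bringing models into production engineering
3. You want to help build a core platform system that truly supports business growth at scale
Join us in building high-performance, scalable algorithm service infrastructure for the AI era.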
Backend Software Engineer (Python) | AI Algorithm Services Platform
## About the Role
We are building a next-generation AI Algorithm Services Platform that standardizes the integration, orchestration, and governance of AI models and external algorithm providers. As a Backend Software Engineer, you will play a key role in designing and developing the engineering infrastructure that enables scalable, high-performance, and highly available AI services.
You will help create a unified platform for model onboarding, routing, scheduling, observability, and runtime governance, forming the foundation of AI-powered applications at scale.
## Responsibilities
- Design and implement engineering frameworks for AI algorithm services, including service packaging, containerization, deployment standards, and runtime governance.
- Build and enhance a multi-model integration and orchestration platform, supporting model routing, concurrency control, rate limiting, circuit breaking, priority scheduling, and cost optimization strategies.
- Architect and optimize backend systems for high-concurrency workloads, ensuring scalability, reliability, and fault tolerance under high-QPS and complex dependency scenarios.
- Develop comprehensive observability capabilities, including logging, metrics, tracing, monitoring, and alerting, to improve system performance and resiliency.
- Collaborate closely with AI researchers, product managers, and business teams to efficiently deliver AI capabilities and establish reusable engineering best practices.
## Requirements
- Bachelor's degree or above in Computer Science, Software Engineering, or a related field.
- 3+ years of backend development experience with strong proficiency in Python.
- Solid understanding of data structures, algorithms, software architecture, and system design principles.
- Hands-on experience with mainstream Python web frameworks such as FastAPI, Flask, or Django.
- Strong understanding of asynchronous programming, event-driven architectures, and coroutine mechanisms.
- Experience designing and operating high-availability, distributed systems, including service discovery, load balancing, rate limiting, circuit breaking, and degradation strategies.
- Familiarity with Linux development environments and container technologies.
- Practical experience with Docker and Kubernetes, including container orchestration, scheduling, and resource management.
- Experience integrating AI model services or aggregating external APIs.
- Understanding of inference service performance characteristics, latency optimization, throughput management, and cost-efficiency strategies.
- Experience with Redis, message queues (Kafka, RabbitMQ, etc.), caching systems, or asynchronous task processing frameworks is a strong plus.
## Preferred Qualifications
- Experience building AI service platforms, technical middleware platforms, or cloud-native infrastructure.
- Experience designing multi-model orchestration, traffic routing, policy engines, or intelligent scheduling systems.
- Familiarity with Service Mesh technologies and observability ecosystems such as Prometheus and OpenTelemetry.
- Understanding of LLM inference architectures, GPU resource scheduling, or inference performance optimization.
- Experience in large-scale system reliability engineering, including load testing, capacity planning, and failure drills.
## What We Offer
- The opportunity to participate in the end-to-end design and development of an AI Algorithm Services Platform from the ground up.
- Exposure to real-world challenges in multi-model orchestration, large-scale concurrency, and infrastructure cost optimization.
- Significant ownership and influence over technical decisions, architecture, and engineering standards.
- Latest-generation MacBook Pro and access to modern AI productivity tools such as Cursor.
- A collaborative, flat organizational culture with close interaction between engineering, AI research, and business teams.
## We’d Love to Meet Someone Who
- Is passionate about system architecture, software craftsmanship, and building sustainable, scalable platforms.
- Has a strong interest in AI infrastructure, model serving, and AI engineering.
- Wants to help build mission-critical systems that directly support business growth and long-term scalability.
Join us in building high-performance, scalable, and reliable AI infrastructure for the next generation of intelligent applications.