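* Plan, deploy, and tune the compute performance of big data clusters running Hadoop, Spark, Hive, HBase, Presto, Flink, ClickHouse, and related components, keeping the clusters stable 24×7.
* Handle day-to-day cluster operations, performance tuning, capacity planning, troubleshooting, and permanent fixes for root causes, ensuring services meet their SLA targets.
* Wrap and extend open-source big data components, iterate on features, and fix vulnerabilities; build general-purpose tools and services that support the data platform.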
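* Provide underlying technical support to the data analysis team and help resolve complex technical problems that arise when using the platform.

* Bachelor's degree or above in Computer Science, Software Engineering, Mathematics, or a related field, with at least 3 years of work experience in big data development.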
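* Hands-on experience planning, building, and optimizing medium-to-large clusters (PB-scale data or 50–100 nodes).
* Experience modifying the source code of, or doing secondary development on, open-source frameworks such as Hadoop, Spark, Hive, and Presto is preferred.
* Expert knowledge of mainstream big data ecosystem components, including Hadoop, Spark, Hive, HBase, Presto, Flink, and ClickHouse.
* Proficient in developing with Java, Python, and Shell; expert in HiveSQL performance tuning.
* Able to independently carry out requirements analysis and technology selection, and to design highly available, highly scalable big data solutions.
* Inquisitive and self-driven, with good communication skills, strong resilience under pressure, and the ability to handle cluster emergencies.
* AI development skills, including proficiency with mainstream AI coding productivity tools (such as Codex and CodeCopilot).
* Experience designing, developing, and delivering LLM Agent projects in big data scenarios is preferred.
* Knowledge or mastery of the underlying principles of machine learning algorithms is preferred.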
## Key Responsibilities
* Design, deploy, optimize, and maintain large-scale big data clusters based on technologies such as Hadoop, Spark, Hive, HBase, Presto, Flink, and ClickHouse.
* Ensure 24×7 stability, reliability, and performance of production data platforms through proactive monitoring, capacity planning, and performance tuning.
* Troubleshoot complex system issues, conduct root cause analysis, and implement long-term solutions to meet SLA requirements.
* Develop and enhance internal platform capabilities through customization, secondary development, feature enhancements, and bug fixes of open-source big data components.
* Build reusable tools, frameworks, and platform services to improve engineering efficiency and support data platform evolution.
* Provide technical guidance and infrastructure support for data analysts, data engineers, and business teams, helping resolve complex platform-related challenges.
* Participate in architecture design, technology evaluation, and best-practice establishment for enterprise-scale data platforms.
## Qualifications
* Bachelor's degree or above in Computer Science, Software Engineering, Mathematics, or a related field.
* 3+ years of experience in big data development, platform engineering, or data infrastructure roles.
* Hands-on experience planning, building, and optimizing medium-to-large-scale big data clusters (50–100+ nodes or PB-scale data environments).
* Strong expertise in mainstream big data ecosystem technologies, including Hadoop, Spark, Hive, HBase, Presto, Flink, and ClickHouse.
* Proficient in Java, Python, and Shell scripting for platform development and automation.
* Advanced experience with Hive SQL optimization and large-scale query performance tuning.
* Ability to independently perform requirement analysis, technology selection, and architecture design for highly available and scalable big data solutions.
* Experience with source code modification, customization, or secondary development of open-source frameworks such as Hadoop, Spark, Hive, or Presto is highly preferred.
* Strong problem-solving and communication skills, self-motivation, and the ability to handle critical production incidents under pressure.
* Familiarity with AI-assisted development tools and coding productivity platforms such as Codex, GitHub Copilot, or similar solutions.
## Preferred Qualifications
* Experience designing, developing, and deploying LLM-powered Agent solutions within big data or enterprise data platform environments.
* Understanding of machine learning algorithms and their underlying principles.
* Experience supporting AI/ML workloads on large-scale data infrastructure.