Operate the Japan Sovereign Cloud platform on Oracle Cloud Infrastructure (OCI) and improve its reliability, scalability, and performance.
Collaborate with software engineering, cloud operations, and global OCI teams; apply software engineering principles to automate operations, resolve complex production issues, and enhance service resiliency.
Complete a hands-on operational learning period in the 24x7 environment, covering shift workflows, alerts, incidents, escalation paths, runbooks, and customer-impacting reliability risks; participate in a 24x7 shift rotation across Japanese and international teams.
Partner with shift teams to capture recurring operational issues, improve alert actionability, maintain operational documentation, and deliver practical fixes through tooling, automation, and process improvements.
Tech Stack
Required Skills
Linux-based production environments
Python (scripting/programming)
Reliability Engineering / SRE fundamentals
Cloud computing concepts, networking, distributed systems, and automation
Business-level English and native-level Japanese communication skills
Preferred Skills (if applicable)
Go, Java, Shell, or similar programming languages
Experience building runbooks and automation tooling for incident response
Familiarity with 24x7 on-call operational support
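Career Growth Perspective
Deepen your expertise in the reliability and fault tolerance of cloud operations in the Osaka and wider Japan market, with a focus on Japan's Sovereign Cloud.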