San Jose, United States
Posted 4 weeks ago
About The Company

This company pioneers short-form video creation and social engagement, with a vast, engaged user base. Its platform gives users creative tools, filters, and effects, and its diverse content ecosystem makes it a hub of creativity and expression. A proprietary algorithm personalizes content feeds, which improves user engagement and satisfaction. The company wields significant influence on digital media, making it a valuable partner for innovative collaborations and marketing endeavors.

Global e-commerce is a content e-commerce business carried by the company's international short-video product. It is committed to becoming the first choice for users to discover and purchase good products at affordable prices. The global e-commerce team aims to give users a more tailored and efficient shopping experience and to give merchants reliable platform services across scenarios such as live e-commerce and short-video content e-commerce, so that affordable, high-quality products sell easily and a better life is within reach.

Responsibilities

– Be part of the global SRE on-call rotation and be responsible for Tier-1 online incident response and DevOps support.
– Be responsible for the service levels of a mission-critical, revenue-generating e-commerce platform as well as all supporting infrastructure and services. This role focuses on service reliability, highly scalable design, and release management in a cloud-native environment.
– Define service level indicators and data-driven objectives, and develop DevOps/SRE standards, processes, and methodologies to uphold and improve the uptime, latency, and system health of a core global e-commerce production platform.
– Collaborate across teams with engineering and product to ensure that key stability and maintainability requirements, such as capacity planning and launch reviews, are met, enabling transparent service delivery to customers.
– Design strategies for risk detection and mitigation, disaster recovery and simulation, release management, cost optimisation, engineering quality, etc.
– Build automation geared towards infrastructure-as-code, scalability, and service resiliency.
– Implement best practices around incident management and post-mortems while being part of on-call rotations.

Minimum Qualifications

– Bachelor’s or higher degree in Computer Science or a similar technical field of study, or equivalent practical experience
– 5+ years of experience developing, provisioning, or maintaining production-grade, large-scale distributed systems
– High level of proficiency in Linux OS internals, networking, microservices, databases, caches, etc. in cloud-native environments
– Demonstrable familiarity with programming or scripting languages (Go/Python/Bash/C++, etc.)
– Demonstrable experience in the development and implementation of DevOps and SRE methodologies

Preferred Qualifications

– Experience in designing, analyzing, and troubleshooting large-scale distributed systems
– Systematic problem-solving approach, coupled with effective communication skills and a sense of drive

Job Features
Job Category | DevOps & SRE |
Seniority | Senior IC / Tech Lead |
Base Salary | $180,000 - $280,000 |
Recruiter | martina.wang@ocbridge.ai |