Define metrics to evaluate system performance and runtime, improving observability. Plan system capacities to accommodate business growth and promotions.
Analyze production incidents to establish best practices for a highly available payment architecture.
At least 3 years of relevant work experience with large-scale systems.
...
Lead efforts to improve the stability of payment systems, including monitoring, log management, and creating diagnostic tools.
Conduct regular drills and develop plans for quick service restoration during incidents, participating in on-call rotations as needed.
Define metrics to evaluate system performance and runtime, improving observability. Plan system capacities to accommodate business growth and promotions.
...
Build robust, performant, user-facing web applications in TypeScript/Angular, Python/Django.
Develop, create, and ship new functionality for user interaction and data visualization, using modern APIs and frameworks.
Work with Service Delivery leaders on high-value automation opportunities to consult on, design, develop, and/or support solutions using web technologies including Angular and Django.
...
The role falls under the purview of the Service Automation department, which belongs to Technology and Operations’ Group Infrastructure and Platform Services group. Its primary responsibilities include the following:
· Build robust, performant, user-facing web applications in TypeScript/Angular, Python/Django.
...
Ensure Mean Time to Recovery (MTTR) meets the department's Key Performance Indicators (KPIs). The 2022 target is less than 89 minutes, subject to annual updates. Achieve a timely closure rate of ≥ 95% for major and critical alarms. Address and resolve major and critical alarms within 24 hours.
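As a rough illustration of how the KPIs above could be checked, the following minimal Python sketch computes MTTR and the timely closure rate. The function names, the record layout (pairs of timestamps), and the sample data are all hypothetical; only the 89-minute, 95%, and 24-hour thresholds come from the posting.

```python
from datetime import datetime, timedelta

# Thresholds quoted in the posting
MTTR_TARGET_MINUTES = 89
CLOSURE_RATE_TARGET = 0.95
CLOSURE_WINDOW = timedelta(hours=24)


def mttr_minutes(incidents):
    """Mean time to recovery, in minutes, over (started, recovered) pairs."""
    total_seconds = sum((end - start).total_seconds() for start, end in incidents)
    return total_seconds / len(incidents) / 60


def timely_closure_rate(alarms):
    """Fraction of (raised, closed) alarms closed within the 24-hour window."""
    on_time = sum(1 for raised, closed in alarms if closed - raised <= CLOSURE_WINDOW)
    return on_time / len(alarms)


# Hypothetical sample data, for illustration only
t0 = datetime(2022, 1, 1)
incidents = [
    (t0, t0 + timedelta(minutes=60)),
    (t0, t0 + timedelta(minutes=90)),
    (t0, t0 + timedelta(minutes=30)),
]
alarms = [
    (t0, t0 + timedelta(hours=2)),
    (t0, t0 + timedelta(hours=23)),
    (t0, t0 + timedelta(hours=30)),  # closed late
    (t0, t0 + timedelta(hours=1)),
]

mttr = mttr_minutes(incidents)
rate = timely_closure_rate(alarms)
print(f"MTTR: {mttr:.1f} min, meets target: {mttr < MTTR_TARGET_MINUTES}")
print(f"Timely closure rate: {rate:.0%}, meets target: {rate >= CLOSURE_RATE_TARGET}")
```

In this sample, a single late alarm out of four is enough to miss the 95% closure target even though MTTR is within the limit, which is why the two KPIs are tracked separately.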
The role falls under the purview of the Service Automation department, which belongs to Technology and Operations’ Group Infrastructure and Platform Services group. Its primary responsibilities include the following:
· Build robust, performant, user-facing web applications in TypeScript/Angular, Python/Django.
· Develop, create, and ship new functionality for user interaction and data visualization, using modern APIs and frameworks.
...
Provide effective maintenance and support services to business units. Ensure the availability and robustness of the IT applications for a 24×7 mission-critical system.
Develop and implement automation tools to streamline operational tasks.
Ensure the efficient functioning of applications, monitor performance, automate processes, and enhance system availability.
...
Support and oversee the availability, reliability, resilience, performance, security, and monitoring of applications on Azure Cloud and various supporting platforms to ensure business operational SLAs and SLOs are met.
...
The platform will support a variety of services based on open-source software, such as Kubernetes, Cassandra, Zookeeper, Kafka, and Redis, alongside internally developed services.
Key Qualifications
Strong emphasis on SRE as an engineering subject area, with proficiency in at least one of the following languages: Golang, Rust, Python, Swift.
Successful track record and proven experience as a backend internet services software developer.
Knowledge of SDLC, including continuous integration, testing methodologies, TDD, and agile development methodologies.
Understanding of base internet infrastructure services, including DNS, DHCP, LDAP, server virtualization, and server monitoring in critical, large-scale distributed systems, combining hardware, operating systems, and software.
Understanding of SRE principles, including monitoring, alerting, error budgets, fault analysis, and other common reliability engineering concepts, with a keen eye for opportunities to eliminate toil through code and process improvements.
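For readers unfamiliar with the error-budget concept named above, here is a minimal Python sketch of the arithmetic. The function names, the 99.9% availability SLO, the 30-day window, and the downtime figure are illustrative assumptions and do not come from the posting.

```python
def error_budget_minutes(slo, window_days=30):
    """Allowed downtime in minutes for an availability SLO over the window."""
    window_minutes = window_days * 24 * 60
    return (1 - slo) * window_minutes


def budget_remaining_minutes(slo, downtime_minutes, window_days=30):
    """Budget left after subtracting observed downtime (negative if overspent)."""
    return error_budget_minutes(slo, window_days) - downtime_minutes


# Hypothetical example: a 99.9% SLO over 30 days allows about 43.2 minutes of downtime
budget = error_budget_minutes(0.999)
remaining = budget_remaining_minutes(0.999, downtime_minutes=30)
print(f"Budget: {budget:.1f} min, remaining after 30 min of downtime: {remaining:.1f} min")
```

A team would typically slow risky releases as the remaining budget approaches zero and spend it on faster delivery while plenty is left.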
...