Why American Express?
There’s a difference between having a job and making a difference.
American Express has been making a difference in people’s lives for over 160 years, backing them in moments big and small, granting access, tools, and resources to take on their biggest challenges and reap the greatest rewards.
We’ve also made a difference in the lives of our people, providing a culture of learning and collaboration, and helping them with what they need to succeed and thrive. We have their backs as they grow their skills, conquer new challenges, or even take time to spend with their family or community. And when they’re ready to take on a new career path, we’re right there with them, giving them the guidance and momentum into the best future they envision.
Because we believe that the best way to back our customers is to back our people.
The powerful backing of American Express.
Don’t make a difference without it.
Don’t live life without it.
The Site Reliability Engineering portfolio consists of several mission-critical applications for americanexpress.com. Mobile and Web Engineering (MWE) enterprise applications maintain high (~100%) availability in an extremely high-throughput transactional system with strict performance requirements. The Site Reliability Engineering team of the MWE portfolio works with various Product teams, Staff Architects, Engineering Leaders and Engineering Teams across the Mobile and Web Engineering platform. The team's primary focus is to conceptualize, design, develop and implement frameworks and common components, and to instrument observability tools for the enterprise, in order to ensure high reliability, scalability, availability and performance of the Mobile and Web applications. The team is embarking on a transformation journey to implement a “Robotics first” approach in the Service Delivery and Site Reliability Engineering space.
The Sr. Engineer I (Site Reliability Engineer) role is a hands-on, senior architect-level position supporting American Express' MWE Service Reliability Engineering team. The ideal candidate has experience in full-stack engineering.
What you will be doing:
- Conceptualize and implement Machine Learning-driven Site Reliability Engineering frameworks and components to improve predictive monitoring and drive the SRE team’s journey towards a “Robotics First” approach.
- Research the latest technologies and concepts, conceptualize solutions, and develop proofs of concept that improve the resiliency and performance of the production infrastructure. Design and implement innovative solutions and frameworks that improve software engineering velocity, infrastructure resiliency and security, and data availability.
- Develop common framework components (to be leveraged by enterprise applications) and define standards for configuration, monitoring, reliability and performance engineering.
- Work with operations team to resolve major incidents.
- Continuously improve automated remediation tasks to ensure the highest levels of availability.
Qualifications:
- A BS degree in Computer Science, Computer Engineering, or another technical discipline, or equivalent work experience.
- 10+ years of hands-on technical experience with systems analysis, including design methodology, production support and engineering, and enterprise-level technologies including, but not limited to, OpenShift, WebSphere administration, JEE (JSP, Servlets, XML, Java), and internet-related technologies, to deliver complex internet-facing solutions.
- Hands-on experience with frameworks: Spring Boot, Vert.x, Node.js.
- Experience in designing mission-critical, highly available enterprise applications.
- Hands-on experience with performance testing framework design and with tuning Java applications.
- Experience managing relational and NoSQL databases such as DB2 and Postgres.
- Strong knowledge of Linux internals and experience managing Linux systems in high traffic environments.
- Strong interpersonal communication skills and the ability to work well in a diverse team-focused environment.
- Experience with Splunk and/or ELK.
- Familiarity with financial services and authorization systems.
- Experience with machine learning implementation would be an advantage.