Site Reliability Engineering (SRE) II (JR1023741)

Broadridge

Job Details

Location: Bremner Boulevard, Waterfront Communities-The Island, Toronto, Ontario, M5J 0L7, Canada
Posted: Feb 18, 2021

Job Description

Company Description

Broadridge Financial Solutions, Inc. (BR), a $4 billion global Fintech leader and part of the S&P 500® Index, is a leading provider of investor communications and technology-driven solutions to banks, broker-dealers, asset and wealth managers and corporate issuers. At Broadridge, we do well by doing good. Our unique culture is guided by the Service-Profit Chain: the idea that success is mutual, directly connecting employee engagement, client satisfaction, and the creation of stockholder value. We enable better financial lives by powering investing, governance, and communications for our clients, their customers, and the financial services industry.

Job Description

Reporting to the Director, Service Delivery Engineer, the Site Reliability Engineer II (SRE) will build strategies for using automation tools to create infrastructure hardware, software and other technical components baked into the orchestration. The SRE will assist in building, maintaining and debugging state-of-the-art engineering systems, dive deep into technology, stay at the forefront of the latest tools, technologies, and strategies, and help evaluate, prototype, and introduce them to our team. The SRE will be experienced with agile continuous delivery and DevOps and will champion the culture, processes, and tools required to maintain a frictionless, high-quality development environment.

Key Job Functions/Responsibilities

Participate in defining microservices infrastructure from inception and design, through deployment, operation and continuous refinement
Support services before they go live through activities such as system design, deployment automation, capacity planning and launch reviews
Maintain services once they are live by measuring and monitoring availability, latency and overall system health
Scale systems through mechanisms like automation
Evolve systems by pushing for changes that improve reliability and velocity
Provide input to a Risk Management Plan that anticipates reliability-related and non-reliability-related risks that could adversely impact plant operation
Develop engineering solutions to repetitive failures and all other problems that adversely affect plant operations, including capacity, quality, cost or regulatory compliance issues
Other parameters that define operating condition, reliability and costs of assets
Provide technical support to production, maintenance management and technical personnel
Apply value analysis to repair/replace, repair/redesign, and make/buy decisions

Qualifications

Education

Bachelor’s degree in Computer Science or equivalent combination of education and relevant experience is required

Experience

5 years of combined systems architecture experience across multiple OSs, languages and frameworks, or equivalent experience
3 years’ experience working with firewalls, preventive controls, detective controls, corrective controls and information security controls to protect the confidentiality, integrity and/or availability of information and/or applications
3+ years’ experience supporting multi-tier web applications
Data analysis techniques that can include:
-Reliability modeling and prediction
-Fault Tree Analysis
-Root-cause and Root-Cause Failure Analysis
-Failure Reporting, Analysis and Corrective Action Systems
Minimum 2-4 years’ experience in a role supporting cloud-based solutions or as an SRE
Expert with design, deployment and administration of security and compliance technologies
Expert in distributed reliability engineering with a solid understanding of application data flow and how it meets system infrastructure
Advanced experience working with server and application hardening best practices

Computer Skills/Tools

2+ years’ experience using Terraform, Chef or other automated orchestration systems
2+ years’ experience with multiple continuous integration tools such as Jenkins and GitLab
Experience in one or more of the following: Python, Go, Perl, or shell scripting
Experience with Unix/Linux operating systems internals and administration
Advanced support experience within an operating platform, Windows or Unix
Extensive experience with cloud platforms, Kubernetes and Docker

Skills

Strong working knowledge of networking & load balancing protocols
Strong experience with Web API technologies
Strong analytical and problem-solving skills
Strong experience with a variety of monitoring tools and concepts
Strong written and verbal communication skills with the ability to document and communicate technical solutions at all levels and articulate technical details to different audiences

About Broadridge

Broadridge is a provider of investor communications and technology solutions for broker dealers, banks, mutual funds and corporate issuers.
