
Senior Engineer, Site Reliability Engineering

DocuSign


Job Details

Location: The Castro, San Francisco, CA, USA
Posted: Aug 10, 2021

Job Description

IT, InfoSec, Cyber Risk & Business Operations | Seattle, WA or San Francisco, CA or Remote - US

If this position is eligible for remote employment in the US, the employee can work remotely in all but the following states: Alaska, Hawaii, Iowa, Maine, Mississippi, North Dakota, South Dakota, Vermont, West Virginia and Wyoming.

Colorado base salary range: $106,500 - $130,200 and eligible for bonus, equity and benefits.

Our agreement with employees

DocuSign is committed to building trust and making the world more agreeable for our employees, customers and the communities in which we live and work. You can count on us to listen, be honest, and try our best to do what’s right, every day. At DocuSign, everything is equal. We each have a responsibility to ensure every team member has an equal opportunity to succeed, to be heard, to exchange ideas openly, to build lasting relationships, and to do the work of their life. Best of all, you will be able to feel deep pride in the work you do, because your contribution helps us make the world better than we found it. And for that, you’ll be loved by us, our customers, and the world in which we live.

The team

Our IT, InfoSec, Cyber Risk & Business Ops team is in the business of trust and reliability. We create, maintain and operate scalable technology and data solutions that deliver an exceptional experience for our internal and external customers. We embrace Agile principles and values, favor DevOps practices, and view infrastructure as code, all while we create an infrastructure that scales and supports our growth and ambitious vision. This requires a smart, highly collaborative team that can identify, investigate, and implement new technologies to continue securely scaling our global business.

This position

The position is based within the Platform Engineering team of the Security Technology Group. DocuSign is seeking a passionate, dynamic, and experienced Information Security leader to join our team. This is a unique opportunity to work with everything security in a best-in-class, cloud-based platform on which DocuSign, customer, and partner applications run, as well as the DocuSign enterprise environment that supports that robust platform. Come join a team that lives and breathes information security, and work for a company with security in its DNA. The position reports directly to the Snr. Director, Security Platform Engineering within the Security Technology Group.
In this dynamic and fast-paced role you will build and lead an SRE team whose responsibilities include:

  • Understanding and measuring risk to platform resilience
  • Monitoring of the platforms
  • Evolving automation in security tooling and SRE
  • Release engineering
  • Practical alerting
  • Being on call
  • Effective troubleshooting
  • Emergency response
  • Incident management
  • Root cause analysis and post-mortem culture
  • Testing for reliability
  • Log ingestion and data processing
  • Handling interrupts effectively
  • Communication and collaboration
  • Structured and rational decision making

This is a fantastic opportunity to join a team that lives and breathes information security and to work for a company with security in its DNA. The position reports to the Snr. Director, Security Platform and Threat Detection Engineering, and then to the Deputy CISO, who is a critical member of the senior security leadership team.

This position is an individual contributor role reporting to the Senior Director of Security Operations and is designated Flex or Remote.

Responsibilities

  • Curate and maintain an inventory of all security tooling that has expected SLAs, with appropriate tiering built into the coverage
  • Instrument infrastructure monitoring & application performance monitoring
  • Keep all security tooling resilient, available and at optimal performance by monitoring and alerting on symptoms rather than on outages
    • be preemptive and avoid outages
    • take ownership of tool incidents and swiftly resolve or mitigate platform-impacting issues
    • triage issues across the entire stack: hardware, software, application and network
  • Assist with the post-incident review process by isolating key process, product and platform failures. Engage the Engineering teams with carry-forward action items that will improve ongoing reliability
  • Be curious, innovative and adaptive to develop the SRE culture beyond the practice and always have a continuous-learning mindset
  • Onboard new platforms and tools into SRE, ensuring that adequate documentation, detailed runbooks, sufficient monitoring and appropriate escalation channels have been defined and agreed upon
  • Build end-to-end documentation and instrumentation of the Platform to ensure visibility, automation, self-healing and resiliency throughout the stack
  • Ensure execution readiness at all times through frequent game day exercises and drills
  • Lead and participate in 24x7 Site Reliability rotations and escalation workflows

Basic qualifications

  • 5+ years of experience in team leadership roles
  • 8+ years of experience in various information security roles and in technical roles (e.g. information technology, software engineering, system administration, solution architecture, network engineering, etc.)
  • Experience with:
    • monitoring tools and SIEMs such as Splunk, Datadog, Prometheus, etc.
    • logging tools such as Splunk and Elastic (Elasticsearch, Logstash, Kibana), etc.
    • on-call (PagerDuty), production incident response and performing RCA
    • log management tools and building monitoring dashboards
    • managing SLAs and contributing to the cross-functional team
  • Working knowledge of tools & technologies like:
    • Ansible, Git, CI/CD pipelines (Jenkins), Python, Ruby, REST APIs
    • Splunk architecture and deployment, and container orchestration using Elastic Kubernetes Service/Docker
    • AWS, Azure and GCP, with infrastructure provisioning tools such as Terraform and CloudFormation
  • Excellent troubleshooting skills and good experience with:
    • TCP/IP stack
    • REST APIs
    • end-to-end incident management and the IM lifecycle
  • Must have worked in IT Operations for 3+ years
  • Logical thinker with good communication skills
  • Ability to implement automated testing, continuous integration, and continuous deployment to streamline how we operate Splunk and data collection services internally
  • Ability to collect and send data through REST APIs and the Splunk HTTP Event Collector (HEC)

Preferred qualifications

  • Previous experience in leading a technical SRE team to enable business needs, decisions, and requirements
  • Demonstrated ability to assess and triage service related incidents and problems
  • Ability to articulate risks and risk mitigations to technical, business, and executive audiences in verbal and written formats
  • Experience presenting to large and small diverse audiences
  • Direct experience defining and building programs and processes that maximize efficiency to participants and stakeholders

About us

DocuSign helps organizations connect and automate how they prepare, sign, act on, and manage agreements. As part of the DocuSign Agreement Cloud, DocuSign offers eSignature: the world's #1 way to sign electronically on practically any device, from almost anywhere, at any time. Today, over a million customers and hundreds of millions of users in over 180 countries use DocuSign to accelerate the process of doing business and simplify people's lives. And we help save the world’s forests and embrace environmental sustainability.
It's important to us that we build a talented team that is as diverse as our customers and where all employees feel a deep sense of belonging and thrive. We encourage great talent who bring a range of perspectives to apply for our open positions. DocuSign is an Equal Opportunity Employer and makes hiring decisions based on experience, skill, aptitude and a can-do approach. We will not discriminate based on race, ethnicity, color, age, sex, religion, national origin, ancestry, pregnancy, sexual orientation, gender identity, gender expression, genetic information, physical or mental disability, registered domestic partner status, caregiver status, marital status, veteran or military status, or any other legally protected category.

Accommodations

DocuSign provides reasonable accommodations for qualified individuals with disabilities in job application procedures, including if you have any difficulty using our online system. If you need such an accommodation, you may contact us at [email protected].

About DocuSign

DocuSign helps small- and medium-sized businesses collect information, automate data workflows, and sign on various devices.
