Job Description
Site Reliability Engineer, Cloud Management
Site Reliability Engineering (SRE) is an engineering discipline that combines software and systems engineering to build and run large-scale, massively distributed, fault-tolerant systems. SRE ensures that internally critical and externally visible systems have reliability and uptime appropriate to users' needs and a fast rate of improvement, while keeping an ever-watchful eye on capacity and performance. SRE is a mindset and a set of engineering approaches focused on optimizing existing systems, building infrastructure, and eliminating work through automation. As a Site Reliability Engineer in the Cloud Management team, you will build and operate cloud management solutions for VMware services offered across multiple public and private clouds.
Our team focuses on common service components across the stack. We develop and operate solutions to support public cloud management, CI/CD, container orchestration, security, and monitoring, closing potential gaps between software and service requirements.
We work with various Software Engineering teams building high-performance, reliable cloud systems. You will tackle a variety of business, infrastructure, security, and application problems in a complex ecosystem. You will collaborate with many SaaS teams across all disciplines. These teams will look to you for support and guidance on how to build and operate complex services. Our team is directly responsible for solutions around cloud management, security, reliability, and visibility into cloud systems.
As the SaaS business runs 24x7, the role requires rotational on-call availability (weekdays at work, evenings and weekends for service- or system-related incidents).
Success in this role requires very strong technical skills, a broad background, an understanding of every layer of the software development and cloud ecosystem, and an excellent understanding of the cloud and container management stacks. You should be comfortable working independently and as part of a specialized team.
Minimum Qualifications
- 3+ years in various DevOps/SRE roles
- 3+ years of experience working with AWS
- Experience administering Linux systems in a production environment
- Experience in building and running large-scale systems and application architectures
- Deep understanding of system performance and monitoring
- Understanding of containers and container orchestration
- Experience in one or more of the following languages: Python, Java, Go, and/or Node.js
- Excellent project management skills and the ability to work in a fast-paced and hectic work environment
- Demonstrated skills in priority setting, analysis, communication, time management, scheduling, and multitasking
- Proven verbal and written communication skills
- BS or MS degree in Computer Science or a related field
- U.S. citizen able to attain a U.S. government security clearance and pass regular background investigations
Preferred Qualifications
- Experience with modern container orchestration systems: Kubernetes, Mesos, DC/OS, Swarm
- Experience with infrastructure configuration and automation processes and tools: Terraform, Puppet, Ansible, Chef, Fabric
- Experience with security in the cloud: intrusion detection, penetration testing, and vulnerability scanning
- Experience with monitoring solutions: ELK, Splunk, SUMO, Nagios, Prometheus
- Experience with various data technologies including relational and non-relational databases and message queues
- Good working knowledge of the build automation and continuous integration/delivery ecosystem: Git, Gerrit, Maven/Gradle, Jenkins, Docker, Nexus, Artifactory, Selenium
VMware Company Overview: At VMware, we believe that software has the power to unlock new opportunities for people and our planet. We look beyond the barriers of compromise to engineer new ways to make technologies work together seamlessly. Our cloud, mobility, and security software form a flexible, consistent digital foundation for securely delivering the apps, services, and experiences that are transforming business innovation around the globe. At the core of what we do are our people, who deeply value execution, passion, integrity, customers, and community. Shape what’s possible today at http://careers.vmware.com.
Equal Employment Opportunity Statement: VMware is an Equal Opportunity Employer and Prohibits Discrimination and Harassment of Any Kind: VMware is committed to the principle of equal employment opportunity for all employees and to providing employees with a work environment free of discrimination and harassment. All employment decisions at VMware are based on business needs, job requirements and individual qualifications, without regard to race, color, religion or belief, national, social or ethnic origin, sex (including pregnancy), age, physical, mental or sensory disability, HIV status, sexual orientation, gender identity and/or expression, marital, civil union or domestic partnership status, past or present military service, family medical history or genetic information, family or parental status, or any other status protected by the laws or regulations in the locations where we operate. VMware will not tolerate discrimination or harassment based on any of these characteristics. VMware encourages applicants of all ages. VMware will provide reasonable accommodation to employees who have protected disabilities consistent with local law.