Job Description
Site Reliability Engineer - Marin Software
We are open to remote applicants from most states within the US.
What is exciting about this role?
- Work with a very experienced tech team spread across the San Francisco Bay Area, Texas, Shanghai, Paris, and the UK, operating a follow-the-sun model
- Gain hands-on experience with our sophisticated tech stack, spanning Linux to Hadoop, MySQL to Spark/Presto, monolithic applications to microservices
- Be part of a DevOps initiative that enables making changes to environments, including production, in an agile, trackable, and safe fashion
- Join a team that is constantly extending and improving the dozens of GoCD pipelines that enable developers to push changes many times a day directly to production
- Manage petabytes of data to support our Data Science / Machine Learning initiatives
What is exciting about Marin?
An ally to online marketers, Marin Software delivers the leading independent multichannel digital advertising platform. Our open solution unites search and social to connect our advertisers with customers wherever they are. This synergy, plus the insight and efficiency we bring to advertising, wins more customers, revenue, and ROI for the world’s top brands. Every day, advertisers and agencies use Marin to manage billions of dollars in annualized ad spend.
We have exciting plans for 2022, and this team and the DevOps initiative will play an important part in them.
We offer a good base salary, wide-ranging benefits (including medical insurance), and stock in the business. We are also comfortable with remote working (within the US) and can support visa transfers if relevant.
As a Site Reliability Engineer, you are equal parts AppOps, SysOps, and DevOps. We count on our site reliability engineers (SREs) to empower our users with a rich feature set, high availability, and stellar performance so they can pursue their missions. As we expand our platform, we are seeking an experienced SRE to deliver insights from massive-scale data in real time. Specifically, we are searching for someone who brings fresh ideas, demonstrates a unique and informed viewpoint, and enjoys collaborating with a cross-functional team to develop real-world solutions and positive user experiences at every interaction. You love keeping abreast of the latest industry trends and using them to help you innovate. You have strong leadership qualities, great judgment, clear communication skills, and a track record of delivering great products.
Key Responsibilities:
- Run the production environment by monitoring availability and taking a holistic view of system health.
- Provide primary operational support and engineering for multiple large distributed software applications.
- Improve reliability, quality, and time-to-market of our suite of software solutions.
- Measure and optimize system performance, with an eye toward pushing our capabilities forward, getting ahead of customer needs, and innovating to continually improve.
- Collaborate with Developers on designing large, scalable, and robust systems.
- Continuously remediate, automate, or shift left unplanned work, toil, and issues by:
  - Working with Development to remediate root causes.
  - Enhancing monitoring and detection.
  - Automating tasks away with scripting and coding.
  - Training L1 teams and transitioning tasks to them.
- Develop, manage and follow operational policies and procedures including documentation and training.
Minimum Qualifications:
- Proven experience in application operations engineering, SRE, or DevOps, or a Bachelor’s degree in computer science or another highly technical, scientific discipline.
- Familiarity with software engineering principles (build, test, deploy).
- Hands-on experience writing build and deployment scripts and creating reusable scripts to automate repeatable tasks.
- Hands-on coding experience (structured and OO) with one or more high-level languages, such as Python, Java, Scala, Shell, and JavaScript.
- Experience working with large volumes of data, preferably in Hadoop, Hive, HBase, and/or MySQL.
- Experience with the following tools: Tomcat and equivalent app servers, Jenkins, Git, Jira, Artifactory, and Build / Dependency Management Tools.
- Excellent problem solving and thought leadership skills.
- Strong sense of ownership and the ability to work with a limited set of requirements.
- Team attitude with strong verbal and written communication skills.
- Comfortable working in Linux with log parsing and text formatting skills.
- SQL query skills, with at minimum a knowledge of joins, unions, and aliases.
- Understanding of common system architectures such as web applications, microservices, and distributed applications.
- Understanding of ITIL concepts and continual service improvement.
Desired Qualifications:
- Hands-on experience with infrastructure-as-code tools and concepts: Nomad, Terraform, Ansible, etc.
- Familiarity with SRE/DevOps principles.
- Experience setting up and managing distributed NoSQL databases.
- Extensive experience working in an agile environment (e.g., user stories, iterative development).
- Familiarity with cloud computing platforms (AWS, Google Compute Engine, OpenStack).
- Hands-on experience with virtualization, VMware, etc.
- Experience with test-driven development and software test automation.
- Experience with code review tools such as GitHub, Review Board, Crucible, Fisheye, SVN Bridge, or similar.
Marin Software embraces diversity and is proud to be an equal opportunity employer. As part of our commitment to diversifying our workforce, we do not discriminate on the basis of age, race, sex, gender, gender identity, color, religion, national origin, sexual orientation, marital status, citizenship, veteran status, or disability status, and we operate in compliance with the San Francisco Fair Chance Ordinance.