Lead Site Reliability Engineer (Remote)

3.00 to 7.00 Years   Chennai   13 Oct, 2020
Job Location: Chennai
Education: Not Mentioned
Salary: Not Disclosed
Industry: Internet / E-Commerce
Functional Area: General / Other Software
Employment Type: Full-time

Job Description

Mattermost provides an open source, enterprise-grade messaging platform to the world's leading organizations that allows teams to collaborate securely and privately anywhere. With over 10,000 server downloads per month, our customers include Intel, Samsung, Affirm, the US Department of Defense and more. Our private cloud solutions offer secure, configurable, highly scalable messaging across web, phone and PC with archiving, search, and deep integrations with hundreds of SaaS and on-premises technologies. Headquartered in Palo Alto, California, our company serves customers around the world with a distributed organization spanning the globe.

We value high-impact work, ownership, self-awareness and being focused on customer success. If these values match who you are, we hope you'll learn more about working at Mattermost and apply!

We are looking for an engineer to lead our Site Reliability Engineering team for Mattermost's new SaaS offering. You have a strong blend of software development, infrastructure and networking skills with a keen sense for leadership. You have extensive experience leading a team that ensures high reliability of user-facing production services. You can keep cool under pressure and lead your team in the development of systems and processes to allow for the effective resolution of incidents.

Responsibilities:

  • Lead a team of engineers focused on maintaining high reliability of Mattermost's SaaS offering
  • Build services and tools to ensure the stability of production services
  • Set technical vision and innovate to be on the forefront of self-healing SaaS services
  • Help drive efforts for compliance certifications in our SaaS (SOC2, GDPR, FedRAMP, etc.)
  • Define infrastructure as code with Terraform and other tools
  • Write thoughtful and high-quality code in Go
  • Develop services to handle automatic recovery from incidents and disasters
  • Automate incident or disaster simulations to identify blind spots
  • Lead hiring for your team
  • Execute our performance management process to ensure a high level of performance within the team
  • Work with other Leads to follow our engineering best practices, and ensure alignment with our Leadership Principles
  • Implement, maintain and tune monitoring and alerting systems
  • Deploy applications to and manage Kubernetes clusters
  • Respond on-call to incidents with quick and effective resolutions
Requirements:
  • Bachelor's degree in Computer Science or related fields, or significant professional DevOps or SRE experience
  • Experience with SRE and DevOps methodologies
  • Strong experience running reliable, high scale applications with Kubernetes in production
  • Strong knowledge of container systems such as Kubernetes & Docker
  • Experience defining and leading on-call rotations for highly available SaaS services
  • Previous experience achieving compliance certifications, audits, and remediation for a public SaaS. Examples include one or more of: SOC2, PCI, HIPAA, GDPR, FedRAMP
  • Solid programming skills and experience with, or an ability to quickly become proficient in, Go
  • Experience working with infrastructure as code tools, such as Terraform
  • Ability and willingness to be on-call
Preferences:
  • Experience with distributed application systems using HTTP, WebSockets, RPC, pub/sub, etc. at scale
  • Open source contributions to related projects
  • Knowledge of Grafana and Prometheus
  • Comfortable with GitHub, Jira, Jenkins, CircleCI
  • Experience working in open source communities

Keyskills:
java, academics, acp, algorithms, android, open source, private cloud, computer science, technical vision, networking skills, behavioral training, software development, performance management, reliability engineering

© 2019 Hireejobs All Rights Reserved