Site Reliability Engineer
We are looking for a high-performing Site Reliability Engineer for our client’s project: Europe’s leading last-mile B2B delivery platform.
About the project: the platform instantly connects businesses of all sizes to a fleet of high-quality couriers. It offers a range of services that includes instant, scheduled same-day, and next-day delivery. It is the ideal solution for retailers, e-merchants, grocers, and restaurants that want to delight their customers with flexible and express delivery options.
Main office: Paris, France.
We are looking for a Senior Site Reliability Engineer with expertise in Golang, strong problem-solving skills, and experience in improving system reliability within a cloud-native environment.
Experience / Skills:
Must have:
- 5+ years of experience as a Site Reliability Engineer or DevOps Engineer
- Strong experience with Golang
- Proficiency in observability tools, particularly Prometheus, Grafana, or the ELK stack
- Deep understanding of SLIs (Service Level Indicators) and SLOs (Service Level Objectives)
- Experience with incident management and improving related processes
- Familiarity with cloud-native environments
- Hands-on experience with Kubernetes
- Willingness to participate in on-call rotations
- Upper-Intermediate level of spoken English
Good to have:
- Familiarity with a wider range of open-source observability tools
- Experience in team leadership or mentorship
- Expertise in optimizing incident management workflows and systems
- Exposure to multi-cloud or hybrid-cloud environments
Responsibilities:
- Maintain reliability features, ensuring robust and scalable infrastructure
- Develop features using Golang
- Implement and enhance observability tooling, particularly using Prometheus, Grafana, and the ELK stack, to monitor and improve system performance
- Define and track SLIs (Service Level Indicators) and SLOs (Service Level Objectives) to ensure system reliability and performance meet business needs
- Improve incident management processes, contributing to faster resolution times and more efficient on-call practices
- Ensure reliability in cloud-native environments, deploying and managing systems with a focus on scalability and fault tolerance
- Manage and maintain Kubernetes environments, optimizing deployments and configurations for high availability and resilience
- Participate in on-call rotations, promptly responding to incidents, minimizing downtime, and ensuring business continuity
We offer:
- Competitive salary with regular reviews
- Medical insurance after the 3-month probation period (can be used in Ukraine)
- Vacation (up to 20 working days)
- Paid sick leave (10 working days)
- National holidays as paid time off (11 days)
- Online English courses
- Accountant assistance and legal support
- Flexible working schedule; remote, office-based, or hybrid format
- Fully equipped office space in the city center (prepared for work during blackouts)
- Direct cooperation with the customer
- Dynamic environment with a low level of bureaucracy and great team spirit
- Challenging projects in diverse business domains and a variety of tech stacks
- Communication with top- and senior-level specialists to strengthen your hard skills
- Online and offline team-building events
- Development and support of a volunteering culture