Overview
DCT Bogota, D.C., Capital District, Colombia
Site Reliability Engineer
Responsibilities
- Service & Infrastructure Management: Oversee and manage core platform web services, including API and database servers, to ensure optimal performance and health.
- System Monitoring & Emergency Response: Proactively monitor application and infrastructure health using tools like Grafana, ELK, and Sentry.
- Participate in a compensated, professionally managed 24/7 on-call rotation that is structured for fairness and conducted remotely (no need to be on-site).
You will be backed up by a senior engineer for immediate support, troubleshooting, and swift emergency resolution.
- Automate recurring operational tasks, system deployments, backups, and maintenance procedures to improve efficiency.
- Partner with the Software Development team to provide guidance and embed modern DevOps practices directly into their development workflows.
- Security & Compliance: Assist the IT team in implementing security policies across the entire infrastructure.
Requirements
- 4+ years of experience in a Site Reliability, DevOps, or Software Engineering role with a primary focus on production infrastructure.
- Willingness and ability to participate in a compensated on-call rotation to respond to and resolve after-hours emergencies.
- Linux Expertise: Strong practical experience with Linux system administration, including usage of the command line, shell scripting (Bash), and advanced system-level troubleshooting.
- Containerization: Good understanding of container technologies, with hands-on proficiency using Docker and Docker Compose in a production context.
- Web Server Configuration: Experience configuring and managing web servers, specifically NGINX, for tasks like reverse proxying, load balancing, and SSL termination.
- Strong analytical and problem-solving skills, with the ability to take ownership and drive complex technical challenges to resolution.
Nice to Have
- Knowledge of the Amazon Web Services (AWS) cloud platform.
- Proficiency with infrastructure and application monitoring tools (e.g., Grafana, Amplify, Sentry, ELK stack).
- Networking Fundamentals: Solid understanding of core networking concepts and essential protocols like HTTP/HTTPS and DNS, along with basic familiarity with firewall and interface configuration.
- Experience with database administration (experience with AWS Aurora and PostgreSQL is a strong plus).
- Experience with Redis.
- Experience building and maintaining CI/CD pipelines.
- Experience with modern software development workflows based on Pull Requests, Continuous Delivery, and TDD, as well as an understanding of Agile principles.
- Experience with container orchestration technologies (e.g., Swarm, Podman, Kubernetes/K8s).
- Familiarity with Infrastructure as Code (IaC) principles and tools like Terraform.
- Familiarity with project management tools such as Jira or ClickUp.
The Team You Will Join
- You will join a growing Engineering team based in Bogotá as a Software Engineer focused on Site Reliability.
You will report directly to our SRE Lead, receiving technical guidance and mentorship.
In addition, you will be paired with a dedicated Line Manager whose primary focus is to support your long-term career progression and professional development.
Who we are
- DCT is a global leader in the fleet telematics industry, with over 25 years of software and hardware development and headquarters in Miami, FL, USA.
- Our platform is the backbone for hundreds of customers across diverse industries in more than 25 countries, with a significant and strategic focus on LATAM.
What we offer
- Career Growth & Mentorship: A dedicated Line Manager and a personal training budget are provided to ensure you have the resources and guidance to advance your professional skills and career path.
- A Generative & Collaborative Culture: Join a dynamic and innovative team that embraces a generative culture to build quality products. We encourage curiosity and an open, creative mindset as part of our core principles.
- Flexible Work Environment: We offer a flexible work-from-home policy designed to support a healthy work-life balance for our team members.
- Stability & Impactful Work: Be part of a globally recognized company with a 25-year track record of financial stability and technological innovation.
Your work will have a direct and meaningful impact on a platform that handles massive streams of real-world data and is used by hundreds of leading businesses in the fleet telematics space.
We want to hear from you! Even if the salary or benefits aren't exactly what you're looking for, we encourage you to apply if you believe you're a great fit for the role and the team.
Seniority level
Mid-Senior level
Employment type
Full-time
Job function
Engineering and Information Technology
Industries
Technology, Information and Internet