Roadie
Site Reliability Engineer
Job Summary
The role involves maintaining and enhancing the reliability, scalability, and performance of the company's platform, primarily through working with Kubernetes clusters and various infrastructure systems. Candidates should have experience with scripting, automation, monitoring tools, and cloud technologies like AWS, along with an understanding of DevOps practices. The position requires independent work, problem-solving skills, and the ability to thrive in a fast-paced, agile environment. The company offers comprehensive benefits and flexible work arrangements.
Job Description
Roadie, a UPS company, is a leading logistics and delivery platform that helps businesses tackle the complexities of modern retail with unmatched delivery coverage, flexibility and visibility. Reaching 97% of U.S. households across more than 30,000 zip codes — from urban hubs to rural communities — Roadie provides seamless, scalable solutions that meet a variety of delivery needs.
With a network of more than 310,000 independent drivers nationwide, Roadie offers flexible delivery solutions that make complex logistics challenges easy, including solutions for local same-day delivery, delivery of big and bulky items, ship-from-store and DC-to-door.
Roadie is seeking a Site Reliability Engineer to join our growing Technical Operations Team. We're looking for someone with a solid understanding of site reliability practices and hands-on experience working with production Kubernetes environments. The ideal candidate is a skilled problem solver with in-depth knowledge of standard DevOps principles, AWS, scripting languages and Kubernetes.
What You'll Do
- Support the reliability, scalability, and performance of our platform through hands-on work with our infrastructure and deployment pipelines
- Assist in maintaining and operating Kubernetes clusters (EKS), as well as other systems including Elasticsearch, MSK, RDS, and Redis
- Contribute to the deployment, tuning, and upkeep of observability tools like Prometheus, Loki, Grafana, OpenTelemetry, and New Relic
- Partner with more senior engineers to identify and remediate system bottlenecks and improve resource utilization
- Participate in the monitoring and tracking of service level indicators (SLIs) and service level objectives (SLOs)
- Write scripts and build automation to streamline operations and reduce manual work
- Help troubleshoot production and non-production issues as part of the incident response process
- Participate in an on-call rotation
Technology We're Using Now
- Python, Ruby on Rails, Golang
- React/Redux, Objective-C and Swift, Android
- Postgres, Redshift, Redis, Kafka
- AWS/GCP
- Docker/Kubernetes
- OpenTelemetry/Prometheus/Thanos/Loki/Grafana/New Relic/Sentry
- Git/CircleCI
- ArgoCD
What You Bring
- 3+ years in various SRE roles
- 3+ years in various DevOps/System Engineering roles
- 3+ years of experience building and managing production Kubernetes infrastructure
- 3+ years of experience with popular scripting languages (Python, Ruby, Bash, etc.)
- Experience with infrastructure-as-code tools such as Terraform or Crossplane
- Experience with CI/CD tools (CircleCI, etc.)
- Experience with GitOps tools (ArgoCD)
- Experience using a broad range of AWS technologies (RDS, Elasticsearch, VPC, EKS, S3, CloudFront, MSK, ElastiCache, CloudWatch, etc.)
- Experience developing and maintaining YAML templating systems (Helm charts, Kustomize, etc.)
- Must be able to work independently, be self-motivated and handle multiple priorities
- Comfortable working in a fast-paced agile environment
Finally, you bring a willingness to admit what you don’t know and to quickly learn what you need to.
Why Roadie?
- Competitive compensation packages
- 100% covered health insurance premiums for yourself
- 401k with company match
- Tuition and student loan repayment assistance (that’s right - Roadie will contribute directly to your existing student loans!)
- Flexible work schedule with unlimited PTO
- Monthly 3-day weekends
- Monthly WFH stipend
- Paid sabbatical leave: tenured team members are given time to rest, relax, and explore
- The technology you need to get the job done
Roadie
Roadie is a crowdsourced delivery platform that enables urgent, same-day and local next-day delivery of just about anything, anywhere, across the U.S.