   Current UPHS employees must apply HERE

Site Reliability Engineer

Job ID: 194002
Category: Information Services/Technology/Service Desk/Telecom
Work Type: FT
Location: Philadelphia, PA, United States
Work Schedule: 8:00AM-4:30PM, M-F | Hybrid

Description

Penn Medicine is dedicated to our tripartite mission of providing the highest level of care to patients, conducting innovative research, and educating future leaders in the field of medicine. Working for this leading academic medical center means collaboration with top clinical, technical and business professionals across all disciplines.

Today at Penn Medicine, someone will make a breakthrough. Someone will heal a heart, deliver hopeful news, and give comfort and reassurance. Our employees shape our future each day. Are you living your life's work?

Job Summary:

  • The Site Reliability Engineer (SRE) is responsible for the production systems that enable custom software development in support of areas such as Application Development, Informatics, Predictive Healthcare, and Translational Research. The SRE applies software engineering, systems engineering, and DevOps principles to operations; designs and implements cohesive end-to-end systems for comprehensive solutions with measurable outcomes; and builds resilient, self-healing systems. The SRE values proactive automation, expert tool-smithing, and adherence to design and engineering principles over reactive systems management or traditionally siloed systems administration.

Responsibilities:

  • Designs, builds, and maintains our core infrastructure, while retaining the flexibility to integrate next-generation systems for applied Data Science in healthcare.
  • Applies Systems Engineering and Software Engineering skills to advance core infrastructure, systems design, recurring microservices, tooling, automation, and libraries to “lift all the boats” rather than providing fragmented support for individual applications.
  • Improves and automates system infrastructure and application deployment processes.
  • Implements proactive monitoring and alerting of symptoms, instead of reactive alerting of outages.
  • Secures automated systems infrastructure and microservices.
  • Applies DevOps processes to monitor and stabilize core infrastructure.
  • Prevents incidents, e.g., by reducing baseline noise, streamlining metrics, characterizing expected latency, tuning alert thresholds, ticketing applications that lack effective health checks, and improving playbooks for issue resolution.
  • Uses playbooks to document actions alongside code in source control to turn initial problem discovery and resolution into automated processes.
  • Participates in on-call rotation for systems infrastructure.

Education/Experience:

  • H.S. Diploma/GED and 6+ years of relevant experience. (Required)
    OR
  • Bachelor's degree in a relevant field, such as Computer Science, Systems Engineering, Data Science, Mathematics, or Statistics, and 2+ years of relevant experience; a Master's degree in a relevant technical field may substitute for additional years of experience. (Required)
  • 1+ years of software engineering experience. (Required)
  • 1+ years of experience with infrastructure as code on a cloud provider OR with systems engineering. (Required)

Skills/Ability:

  • Exceptional design and programming skills in a language such as Golang, Python, C, or C++.
  • Exceptional coding skill in ANSI SQL or PL/pgSQL.
  • Familiarity with relational databases and ANSI SQL.
  • Competency in Linux and the Unix shell, or with equivalent operating systems.
  • Production experience with microservice orchestration (e.g., HashiCorp, Kubernetes), logging, metrics, and alerting (e.g., Loki, Grafana, Prometheus, Kibana, Fluentd).
  • Production experience with infrastructure as code on a cloud provider (e.g., Azure, AWS, Google Cloud Platform) with tools like Terraform and Python.
  • Production experience with DevOps automation directly from source control, semantic versioning, and CI/CD (e.g., GitHub Actions, CircleCI, Travis CI).
  • Development of open-source products. 
  • Effective asynchronous communication, documentation of process and code in source control toward automation.
  • Practical understanding of Agile software development process with code review and retrospectives.

We believe that the best care for our patients starts with the best care for our employees. Our employee benefits programs help our employees get healthy and stay healthy. We offer a comprehensive compensation and benefits program that includes one of the finest prepaid tuition assistance programs in the region. Penn Medicine employees are actively engaged and committed to our mission. Together we will continue to make medical advances that help people live longer, healthier lives.

Live Your Life's Work

We are an Equal Opportunity and Affirmative Action employer. Candidates are considered for employment without regard to race, ethnicity, color, sex, sexual orientation, gender identity, religion, national origin, ancestry, age, disability, marital status, familial status, genetic information, domestic or sexual violence victim status, citizenship status, military status, status as a protected veteran or any other status protected by applicable law.
