Job Description
Day to Day
This engineering role focuses on building, scaling, and maintaining reusable platform components and infrastructure that empower software developers across the organization. The engineer will collaborate with architects and engineering teams to design platform infrastructure, automate provisioning, integrate observability, and ensure secure, compliant, and scalable operations. Daily work includes developing Infrastructure as Code, managing cloud and container platforms, maintaining CI/CD-based provisioning workflows, monitoring system health, and embedding performance, security, and disaster recovery best practices into platform services. The engineer will troubleshoot complex technical issues, conduct deep-dive RCAs, support application teams with platform integrations, and continuously improve developer experience and platform reliability through automation, SRE principles, and proactive capacity management.
Specific Responsibilities
Design platform infrastructure across servers, networks, storage, databases, and cloud environments
Implement and manage infrastructure that powers developer-facing platform tools
Ensure regular upgrades, patches, and performance improvements across platform components
Evaluate cloud providers, containerization technologies, and configuration patterns; create reusable abstractions
Write and execute automated Infrastructure as Code; utilize CI/CD for provisioning
Integrate QoS, SLA metrics, and monitoring to support auto-scaling
Incorporate security and disaster recovery practices, including access control, identity, logging, segmentation, encryption, and backups
Integrate enterprise-managed configurations into deployment pipelines
Act as liaison between developers and service providers
Conduct capacity planning and forecasting for OpenShift Virtualization (OSV)
Analyze resource trends and recommend scaling/optimization
Develop automation scripts and playbooks for OSV tasks
Deliver operator updates and changes at scale via automation
Apply SRE principles to improve stability and operational efficiency
Manage RBAC deployment and auditing
Manage namespaces and resource quotas
Maintain end-to-end observability by integrating Dynatrace, Prometheus, and Grafana
Explore and implement event-driven monitoring
Identify abnormalities and observability blind spots
Perform deep-dive RCAs in global compute environments
Monitor VM health, performance, and security
Provide solution design, consulting, and knowledge management
Compensation: $60/hr to $67/hr. Exact compensation may vary based on several factors, including skills, experience, and education. Benefit packages for this role will start on the 31st day of employment and include medical, dental, and vision insurance, as well as HSA, FSA, and DCFSA account options, and 401k retirement account access with employer matching. Employees in this role are also entitled to paid sick leave and/or other paid time off as provided by applicable law.
We are a company committed to creating diverse and inclusive environments where people can bring their full, authentic selves to work every day. We are an equal opportunity/affirmative action employer that believes everyone matters. Qualified candidates will receive consideration for employment regardless of their race, color, ethnicity, religion, sex (including pregnancy), sexual orientation, gender identity and expression, marital status, national origin, ancestry, genetic factors, age, disability, protected veteran status, military or uniformed service member status, or any other status or characteristic protected by applicable laws, regulations, and ordinances. If you need assistance and/or a reasonable accommodation due to a disability during the application or recruiting process, please send a request to HR@insightglobal.com. To learn more about how we collect, keep, and process your private information, please review Insight Global's Workforce Privacy Policy: https://insightglobal.com/workforce-privacy-policy/.
Skills and Requirements
6+ years of IT experience; 4+ years in development
Practical experience in two coding languages or advanced proficiency in one
Strong scripting and automation skills
Expertise in root cause analysis, troubleshooting, and problem-solving
Experience with cloud architecture and cloud infrastructure design
Proficiency with GitHub, CI/CD processes, and Infrastructure as Code
Hands-on experience with Kubernetes and container orchestration
Strong understanding of IT solutions, platform engineering, and developer tooling
Experience with Tekton or other pipeline automation tools
Ability to perform technical analysis and drive utilization management
Experience managing change, upgrades, patches, and platform stability
Ability to collaborate with architects, engineering teams, and platform stakeholders
Strong written and verbal communication skills
Ability to manage multiple priorities in a fast-paced environment
Experience with Ansible automation
Familiarity with GCP cloud services
Experience with Dynatrace, Prometheus, Grafana, or other observability tools
Knowledge of PowerShell, Python, or other scripting languages
Experience with access controls, identity management, and authorization
Understanding of information security practices
Hands-on experience with VMware or virtualization platforms
Familiarity with event-driven architecture (EDA)
Additional certifications or advanced coursework