Lead Cloud Engineer (Remote)
Company: Home Depot
Posted on: March 20, 2023
Working at the world's largest home improvement retailer is a career-defining experience. New associates decide to join us because: they love our people and culture, they get to work with cutting-edge technologies, and they spend every day solving large-scale problems that matter. As a Home Depot QuoteCenter associate, you will impact the daily lives and decisions of our customers who spend billions of dollars at our stores in North America.
At The Home Depot QuoteCenter, our mission is to "enable a frictionless customer experience to sell Pros the complete job for the planned purchase." Behind everything our users see and experience within The Home Depot QuoteCenter application ecosystem is the Build, Integrate and Connect operating model, built and managed by a diverse team of Merchants, Product Managers, Engineers, Marketeers, Operations, and Support professionals who enable Doers to Get More Done.
We're committed to maintaining a fun, engaging, and inclusive environment, ensuring the agility of a close-knit team and driving results that enable The Home Depot to continue to be a leader in our industry. At The Home Depot QuoteCenter, we value diversity in all its forms, and we work hard to support growing associate skills in a fast-paced collaborative environment.
The Home Depot QuoteCenter technology team is focused on radically reimagining the shopping experience at Home Depot utilizing the latest web technologies and data tools.
As a Lead Cloud Engineer, you will be responsible for the infrastructure and security of QuoteCenter's cloud-native platform in GCP as well as the tooling and integration with The Home Depot's wider Enterprise.
This will require you to maintain high site uptime/availability while embracing rapid change and growth using a strong DevSecOps mindset of continuous delivery and site automation. This role requires deep technical knowledge, adaptability, hands-on execution, and a drive towards reliability and disaster-resilience. In this role you will:
Drive the practice of Reliability Engineering in the Cloud Infrastructure domain.
Partner closely with Software Engineering and Architecture teams to develop DevSecOps solutions in cloud infrastructure that are reliable, efficient, and maintainable.
Design and deliver improvements to existing cloud-native processes and technology.
Set the standard for infrastructural engineering excellence.
Mentor and upskill junior team members in Cloud Engineering.
Partner with teams across The Home Depot enterprise.
Serve as a Subject Matter Expert in your domain.
To learn more about QuoteCenter, watch this short video: https://bit.ly/HomeDepotQCPro
Major Tasks, Responsibilities & Key Accountabilities:
25%- Architecture and Solution Design
35%- Implementation and Support
25%- Team Mentoring and Education
10%- Professional Development
5%- Administrative and Planning Activities
Nature and Scope:
This position reports to the Manager, Site Reliability Engineering.
This position has no direct reports.
Environmental Job Requirements:
Typically requires overnight travel less than 10% of the time.
Additional Environmental Job Requirements:
Standard Minimum Qualifications:
Must be eighteen years of age or older.
Must be legally permitted to work in the United States.
Additional Minimum Qualifications:
Education Required: The knowledge, skills and abilities typically acquired through the completion of a high school diploma and/or GED.
Years of Relevant Work Experience: 8 years
5-8 years of relevant work experience
3-5 years Cloud Native Engineering - Expertise in one of the major cloud providers (AWS, GCP, Azure)
Proficient in production systems design including High Availability, Disaster Recovery, Performance, Efficiency, and Security
Strong preference for Cloud Architect Certification in Azure, AWS, or GCP
Required - Technical Proficiencies
Deep experience with
Configuration Management (OS Config, Ansible)
Containerization (Docker, containerd, et al)
Secrets Management (Vault, ASM, GSM, Azure Key Vault, et al)
CI/CD solutions (Spinnaker, Harness.io, Jenkins, CircleCI, GitHub Actions, et al)
Observability Tools (Prometheus, Influx, ELK, New Relic, Datadog)
Proficient in fundamentals of cloud-native Networking and Security.
Proficient in Linux/Unix-based and Windows Operating Systems
Strong understanding of POSIX standards and commands
Deep understanding of modern microservice based architectures and operations
Proficient in production monitoring concepts and implementation including synthetic, real user, application performance, system, log, time-series, and dashboarding
Expert in shell scripting and in standard data serialization languages (JSON, YAML, et al)
Expertise in Git and GitOps workflows
Thorough understanding of data structures and algorithms
Knowledge of software design patterns
Required - Competencies
Cultivates Innovation: Creating new and better ways for the organization to be successful
Action Oriented: Taking on new opportunities and tough challenges with a sense of urgency, high energy and enthusiasm
Collaborates: Building partnerships and working collaboratively with others to meet shared objectives
Communicates Effectively: Developing and delivering multi-mode communications that convey a clear understanding of the unique needs of different audiences
Drives Results: Consistently achieving results, even under tough circumstances
Global Perspective: Taking a broad view when approaching issues; using a global lens
Interpersonal Savvy: Relating openly and comfortably with diverse groups of people
Manages Ambiguity: Operating effectively, even when things are not certain, or the way forward is not clear
Optimizes Work Processes: Knowing the most effective and efficient processes to get things done, with a focus on continuous improvement
Self-Development: Actively seeking new ways to grow and be challenged using both formal and informal development channels
Situational Adaptability: Adapting approach and demeanor in real time to match the shifting demands of different situations
Additional Desired Experience
Exposure to modern object-oriented programming languages (preferably Java or .NET C#)
Experience in destructive testing methodologies and tools such as Chaos Monkey
Experience in defensive coding practices and patterns for high availability
Hands-on experience with Service Discovery and Routing Mesh technologies, e.g. Envoy, Istio, Anthos Service Mesh, et al
Modern Deployment Strategies (Blue/Green, Canary, et al)
Delivery & Execution
Defines team level and infrastructural best practices and engineering excellence
Develops automated mechanisms to drive them forward
E.g. Code Style Guides, Static Code Analysis, Sentinel and/or OPA policies, testing standards etc
Architects network (VPC, Subnet, CIDR, Firewall etc.), IAM, and infrastructure solutions based on business need and future planning for the infrastructure platform
Contributes to meaningful architecture diagrams and other documentation needed for security reviews or other interested parties
Automates infrastructure change management and pipelining (CI/CD)
Automates application-level change management and pipelining (CI/CD)
Constantly reflects on, reviews, and proposes improvements to our infrastructure, security, tooling, processes, standards, and capabilities with a continuous learning and improvement mindset
Collaborates and pairs with outside team members (e.g. Architects, Engineers, product management) to create secure, reliable, scalable software solutions
Drives and delivers on the major workstreams and business requirements
Documents, reviews and ensures that all quality and change control standards are met
Writes custom code or scripts to automate infrastructure, monitoring services, and test cases
Writes custom code or scripts to do "destructive testing" to ensure adequate resiliency in production
Creates meaningful dashboards, logging, alerting, and responses to ensure that issues are captured and addressed proactively
Contributes to enterprise-wide tools to drive destructive testing, automation, or engineering empowerment
Defines Service Level Objectives to constantly measure reliability in production and help prioritize backlog work
Support & Enablement
Fields questions from other product teams or support teams
Monitors tools and participates in conversations to encourage collaboration across product teams
Provides infrastructure support for services running in production
Proactively monitors production Service Level Objectives
Proactively reviews the performance and capacity of all aspects of production: code, infrastructure, data, and message processing
Triages high priority issues and outages as they arise
Collaborates with other Leads in the org to drive skills development and training in cloud tooling.
Participates in and leads learning activities around modern software design and development core practices (communities of practice)
Proactively views articles, tutorials, and videos to learn about new technologies and best practices being used within other technology organizations
Attends conferences and learns how to apply new technologies where appropriate
140,000.00 - 210,000.00
We are an Equal Opportunity Employer and do not discriminate against any employee or applicant for employment because of race, color, sex, age, national origin, religion, sexual orientation, gender identity, status as a veteran, or on the basis of disability or any other federal, state, or local protected class.
Keywords: Home Depot, Atlanta, Lead Cloud Engineer (Remote), Engineering, Atlanta, Georgia