Site Reliability Engineering (SRE) is an engineering discipline that combines software and systems engineering to build and run large-scale, massively distributed, fault-tolerant systems. At Goldman Sachs, SRE is responsible for the availability and reliability of our firm's most critical platform services, and ensures they meet the requirements of our internal and external users. We look for engineers who are motivated to collaborate with our businesses to build and run sustainable production systems, which can evolve and adapt to changes in our fast-paced, global business environment.
HOW YOU WILL FULFILL YOUR POTENTIAL
• Develop, support, administer and consult in the Firm's primary business-critical trading infrastructure
• Create and support automation solutions to improve the reliability of the platform and to increase the productivity of the team
• Adhere to and drive Site Reliability Engineering (SRE) disciplines and processes across the Global team
• Provide essential day-to-day support for a massive-scale distributed system
• Assess monitoring & alert signals to determine impact and risk to the business and help steer the incident management process
• The successful candidate will have outstanding verbal and written communications, a natural ability to learn in a fast-paced environment, and will be a self-starter with plenty of initiative.
SKILLS AND EXPERIENCE WE ARE LOOKING FOR
• Programming expertise in Java/Python, Perl/Shell Scripting
• Strong communication skills with a track record of working and collaborating with global teams
• Ability to handle multiple ongoing assignments and to work independently while contributing as part of a highly collaborative and globally dispersed team
• Strong analytical skills with the ability to break down and communicate complex issues, ideas and solutions
• In-depth knowledge of Unix systems is a prerequisite, as is a willingness to learn new languages and programming paradigms (functional programming, for example)
• Experience managing performance, availability and scale for mid- to large-sized systems
• Previous experience in a blameless SRE environment would be a benefit
• Degree in computer science or engineering, or equivalent industry experience
• Hands-on experience with storage and networking stacks
• Proven experience working in the lifecycle of large distributed systems
• Works effectively and thrives in a team while able to operate independently; self-motivation is essential
• Strong verbal and written communication skills
The Goldman Sachs Group, Inc. is a leading global investment banking, securities and investment management firm that provides a wide range of financial services to a substantial and diversified client base that includes corporations, financial institutions, governments and individuals. Founded in 1869, the firm is headquartered in New York and maintains offices in all major financial centers around the world.
© The Goldman Sachs Group, Inc., 2020. All rights reserved. Goldman Sachs is an equal employment/affirmative action employer Female/Minority/Disability/Vet.