Site Reliability Engineer

Location: New York City
Category: Engineering

Overview:

The company is a confluence of media, technology, and massive amounts of data. An industry transformation is underway, and the company is at the cutting edge. Our engineers develop complex, innovative, and highly scalable technology to change how advertising is bought, sold, and traded. Their breakthroughs create new marketplaces, solve long-standing problems, and push new technology every day. It's a very exciting company in a very exciting industry. Our platform handles billions of transactions per hour and we reach hundreds of millions of Internet and mobile users worldwide... and we're not done yet! The platform and tools we develop are built to scale because this revolution has just begun.

Responsibilities:

The company is currently seeking an experienced Site Reliability Engineer to build and run the services that make up our core platform. Our platform team brings front-end, back-end, analytics, and distributed systems expertise together to create the APIs and user interfaces that are the backbone of our business and our clients' businesses. We look for high-performing individuals who have both broad and deep skill sets as well as the drive to create world-class, resilient, and scalable technology. Please note, this position may be based in either New York or Chicago. You will be front and center in the effort to keep our distributed services fast and reliable, 100% of the time.


Role & Responsibilities



  • Manage the scalability, performance, and availability of the company platform APIs by solving for reliability against existing systems and services spanning the entire stack.

  • Develop tools and automation to minimize delivery time and increase developer productivity.

  • Participate in the design and development of new and evolving services, architecture, and performance standards.

  • Support team members in the development of an SOA strategy and migration path.

  • Participate in capacity planning and service performance analysis and tuning.

  • Respond to and resolve emergent issues. Be on call periodically as part of a shared team.


Qualifications:



  • 5+ years of relevant work experience, including experience with high-volume, production distributed systems environments

  • Fluency with Python, Perl, Shell, Ruby, Scala, Go, or similar

  • Experience managing and deploying full stack, distributed services

  • Experience with system automation tools such as Ansible, Chef, Puppet, SaltStack, etc.

  • Experience with monitoring, alerting, and pipeline analysis tools such as Nagios, Sensu, Graphite, Riemann, Logstash, etc.

  • Expertise in the use and optimization of SQL

  • Experience with queuing/data-pipelining solutions such as Storm, Kafka, RabbitMQ, ZeroMQ, etc.

  • Experience with systems such as PostgreSQL, MySQL, Cassandra, CouchDB, Redis, and Memcached

  • Exposure to AWS and OpenStack APIs preferred

  • Excellent analytical skills, coupled with a strong sense of ownership, urgency and drive


How do we reward our outstanding men and women? Well, we start with company equity, 100% employer-paid medical, dental, vision, short-term and long-term disability insurance, 20 paid time off days, free on-site chair massages, and our 401(k). We then serve up flexible spending accounts (health care, dependent care, transit, and parking), bagel Fridays, free snacks and sodas, and the latest and greatest technology you need to do your job (including a $250 biennial allowance for the latest smartphone). And the cherry on top? Monthly mixers, including potlucks, trivia nights, pick-up basketball, and game nights, to name a few.


 


