Location: New York, NY
The Data Engineer is responsible for building and deploying streaming and batch data pipelines capable of processing and storing petabytes of data quickly and reliably. You will ingest and integrate large volumes of disparate data from a variety of sources, including but not limited to subscriber and listener data, customer journey data, vehicle data, video, and 2nd/3rd-party data. This will involve rapid innovation in large-scale data pipeline design and development to ensure critical data sets are made available to our users and predictive models in a timely manner. We are looking for someone with hands-on experience in all layers of the data stack. The Data Engineer plays a significant role as both an enabler and practitioner of the data- and analytics-driven culture at SiriusXM.
Duties and Responsibilities:
* Build and deploy streaming and batch data pipelines capable of processing and storing petabytes of data quickly and reliably.
* Collaborate with product teams, data analysts and data scientists to design and build data-forward solutions.
* Gather and process all types of data including raw, structured, semi-structured, and unstructured data.
* Integrate with a variety of data providers, ranging from marketing and web analytics to consumer devices, including IoT and telematics.
* Build and maintain dimensional data warehouses in support of business intelligence tools.
* Develop data catalogs and data validations to ensure clarity and correctness of key business metrics.
* Design, code, test, correct and document programs and scripts using agreed standards and tools to achieve a well-engineered result.
* Derive an overall strategy of data management, within an established information architecture (including both structured and unstructured data), that supports the development and secure operation of existing and new information and digital services.
* Plan effective data storage, security, sharing and publishing within the organization.
* Ensure data quality and implement tools and frameworks for automating the identification of data quality issues.
* Collaborate with internal and external data providers on data validation, providing feedback and making customized changes to data feeds and data mappings.
* Provide ongoing support, monitoring, and maintenance of deployed products.
* Drive and maintain a culture of quality, innovation and experimentation.
Qualifications:
* Advanced degree in a relevant field of study strongly desirable, particularly in computer science or engineering.
* Minimum 2 years of professional experience working with data extraction and manipulation logic.
* Minimum 2 years of professional experience with object-oriented programming, functional programming, and data design.
* 1-3 years working with a public cloud big data ecosystem (certification in AWS a plus).
* 1-3 years working with MPP databases, distributed databases, and/or Hadoop.
Requirements and General Skills:
* Passion for data engineering, able to excite and lead by example.
* Hungry and eager to learn new systems and technologies.
* Self-directed and enjoys the challenge and freedom of deciding what is the most impactful thing to work on next.
* Ability to deliver exceptional results through iterative improvement rather than initial perfection.
* Excellent communication and presentation skills and the ability to interact appropriately with all levels of the organization, including business users, technical staff, senior-level colleagues, vendors, and partners.
* An extensive track record that demonstrates effectiveness in driving business results through data and analytics.
* The ability to develop and articulate a compelling vision and generate necessary consensus.
* A successful history of translating business objectives and problems into analytic problems, and analytic solutions into actionable business solutions.
* A proven ability to influence decision making across large organizations.
* A proven ability to hire, develop, and effectively lead deeply technical resources.
* Demonstrate and foster a sense of urgency, strong commitment, and accountability while making sound decisions and achieving goals.
* Articulate, inspire, and engage commitment to a plan of action aligned with organizational mission and goals.
* Create an environment where people from diverse cultures and backgrounds work together effectively.
Technical Skills:
* Experience deploying and running AWS-based data solutions and familiarity with tools such as CloudFormation, IAM, Athena, and Kinesis.
* Experience engineering big-data solutions using technologies like EMR, S3, and Spark, and an in-depth understanding of data partitioning and sharding techniques.
* Experience loading and querying both on-premises and cloud-hosted databases such as Teradata and Redshift.
* Experience building streaming data pipelines using Kafka, Spark, or Flink.
* Familiarity with binary data serialization formats such as Parquet, Avro, and Thrift.
* Experience deploying data notebook and analytic environments such as Jupyter and Databricks.
* Knowledge of the Python data ecosystem, including pandas and NumPy.
* Experience building and deploying ML pipelines: training models, feature development, regression testing.
* Experience with graph-based data workflows using Apache Airflow is a plus.
* Knowledge of data profiling, data modeling, and data pipeline development.
* Strong experience with high-volume heterogeneous data, preferably in distributed systems.
* Strong experience writing distributed, high-volume services in Python, Java, or Scala.
* Familiarity with metadata management, data lineage, and principles of data governance.
* Knowledge of data modeling, data access, and data storage techniques.
* Appreciation of agile software processes, data-driven development, reliability, and responsible experimentation.
Minimum 2 years' experience with the following:
* Professional role in one or more of the following: Development, Engineering, R&D, or Information Technology
Strong and thorough knowledge of the following:
* ETL/ELT Tools
* BI tools
* MDM / Reference Data
* RDBMS, NoSQL and NewSQL
* MS Office Suite
SiriusXM is an equal opportunity employer that does not discriminate on the basis of sex, race, color, age, national origin, religion, creed, physical or mental disability, medical condition, marital status, sexual orientation, gender identity or expression, citizenship, pregnancy, military or veteran status or any other status protected by applicable law.
The requirements and duties described above may be modified or waived by the Company in its sole discretion without notice.
Equal Opportunity Employer Minorities/Women/Protected Veterans/Disabled