Reddit

Staff Data Engineer

Job Description

Posted on: November 2, 2024

Reddit is a community of communities. It’s built on shared interests, passion, and trust and is home to the most open and authentic conversations on the internet. Every day, Reddit users submit, vote, and comment on the topics they care most about. With 100,000+ active communities and 82M+ daily active unique visitors, Reddit is one of the internet’s largest sources of information. For more information, visit redditinc.com.

Responsibilities

Act as the analytics engineering lead within the Ads DS team and a key contributor to the success of data science data quality and automation initiatives.
You will have a keen interest in the collection and quality of underlying data (experiment design and analysis, data deep dive) and in working on ETLs, reporting dashboards, and data aggregations needed for business tracking and/or ML model development.
Develop and maintain robust data pipelines and workflows for data ingestion, processing, and transformation. Work closely with engineering to ensure the quality and reliability of these data pipelines.
Create user-friendly tools and applications for internal use across Data Science and cross-functional teams, streamlining data analysis and reporting processes. Drive widespread adoption of these tools and applications.
Lead transformational efforts to build a data-driven culture at Reddit by enabling data self-service.
Provide technical guidance, mentorship, coaching, and/or training to data analysts.
Serve as a thought partner for data scientists, engineering managers, and leadership on data foundations, communicating and shaping the data foundations roadmap and strategy for Reddit.

Job Requirements

MS or PhD in a quantitative discipline: engineering, statistics, operations research, computer science, informatics, applied mathematics, economics, etc.
7+ years of experience working with large-scale ETL systems (implementation, strategy, and maintenance), building clean, maintainable, object-oriented code (Python preferred) in a production environment.
Strong programming proficiency in Python, SQL, Spark, Scala, etc.
Experience with data modeling, ETL (Extract, Transform, Load) concepts, and patterns for efficient data governance. Experience with manipulating massive-scale structured and unstructured data.
Experience with data workflows (such as Airflow), data modeling, and front-end or back-end engineering.
Experience in data visualization and dashboard design, including tools such as Looker, Tableau, R visualization packages, Streamlit, and D3.
Deep understanding of technical and functional designs for relational and MPP databases.
Proven track record of cross-functional execution and collaboration. Excellent communication skills to collaborate with cross-functional stakeholders at all levels of the company.
Experience in mentoring junior data scientists and analytics engineers.
Self-starter with the ability to work independently and autonomously, as well as part of a team.
