Job Summary:
We are seeking an experienced Data Engineer to join our team. The ideal candidate will have a strong background in data platforms and technologies, hands-on expertise in SQL development, and experience with MPP systems. You will play a critical role in designing, developing, and optimizing our data architecture, ensuring high data quality, and enabling efficient data processing and analytics.
Key Responsibilities:
- Design, develop, and optimize data models (relational & dimensional) and schemas.
- Implement and maintain data architecture, governance, and quality improvements to ensure reliability and consistency.
- Develop and optimize SQL queries and curated datasets, ensuring high performance and scalability.
- Work with cloud-based data warehousing and ETL solutions (e.g., AWS, Azure, GCP, Snowflake, Redshift, BigQuery) and Python to build and maintain data pipelines.
- Leverage Google Cloud Platform (GCP) services, Big Data, and streaming integrations for data processing.
- Utilize PySpark, Pandas, and other data processing libraries for large-scale data transformations.
- Implement best practices for data movement, handling large volumes of data, and reporting.
- Develop and manage ETL workflows using tools such as Apache Airflow and Google Cloud Dataflow.
- Apply data warehouse methodologies like Kimball, Inmon, or Data Vault for effective data organization.
- Ensure smooth integration and deployment using version control and CI/CD tooling such as Git and Terraform.
- Troubleshoot and optimize database performance, query plans, and indexing strategies.
Qualifications & Experience:
- Bachelor’s degree in Computer Science or a related field, or an equivalent combination of education and experience.
- 12+ years of experience in Data Engineering with strong hands-on SQL development.
- Extensive experience with MPP (Massively Parallel Processing) systems.
- Strong understanding of data modeling, data architecture, metadata, and governance best practices.
- Proficiency in Google Cloud Platform (GCP), BigQuery, and cloud-based data solutions.
- Expertise in database programming, performance tuning, and query optimization.
- Experience with data processing frameworks and libraries such as PySpark and Pandas.
- Familiarity with data warehouse best practices and methodologies (Kimball, Inmon, Data Vault).
- Hands-on experience with ETL tools and workflow automation (e.g., Airflow, Dataflow).
- Knowledge of CI/CD practices and tools such as Git, Terraform, and Jenkins.
Preferred Skills:
- Experience in streaming data processing and real-time analytics.
- Strong problem-solving skills and the ability to work in fast-paced, collaborative environments.
- Excellent communication and stakeholder management skills.
If you are passionate about building scalable, high-performance data solutions and want to be part of an innovative team, we’d love to hear from you!
Please reach out to h1b@infyshine.com