Data Engineer - ApplyBoard
  • Kitchener, Ontario, Canada
  • via Whatjobs
Job Description

The Role:
The data engineering team is an experienced group that supports product development and the entire organization. In addition to building ETL pipelines to automate analytics and building integrations between systems, the team builds and maintains the infrastructure that hosts these pipelines and integrations. The team also builds and maintains data access components and provides the tooling and analytics required for our predictive/ML models.

What you will be doing:
  • Build and maintain analytics with Python (pandas/PySpark)
  • Build and maintain ETL pipelines on AWS (EC2/Glue ETLs/Airflow)
  • Build and maintain infrastructure components to support our pipelines and integrations (CDK)
  • Set up and maintain integrations that enable data flow between different systems (AppFlow)
  • Actively contribute to shaping the direction of our data platform, including architecting our data warehouse, machine learning deployment infrastructure, and ETL/ELT workflows
  • Gather and understand data requirements by working with stakeholders across multiple teams
  • Work closely with Engineering, IT, and Security to build processes and standards for our data science platform and how it integrates with data sources across the company
  • Develop ingestion, transformation, and cleansing pipelines to prepare a variety of structured and unstructured data sources for data analytics
  • Maintain our data platform, including managing and improving our Redshift cluster and monitoring our data pipelines
  • Develop infrastructure using CDK to deploy data products to internal and external users
  • Provide operational support to the data science team
  • Be a go-to person for data-related questions company-wide

What you bring to this role:
  • Bachelor’s degree in Engineering, Computer Science, Mathematics, or a related technical discipline
  • 4+ years of experience in the data engineering field
  • Experience in setting up and maintaining a high volume of ETL pipelines
  • Experience in setting up ETL orchestration
  • Familiarity with infrastructure as code (CDK or Terraform) is a plus
  • Advanced knowledge of SQL and knowledge of NoSQL (MongoDB)
  • Ability to communicate effectively with both highly technical and non-technical people
  • Strong analytical skills and an understanding of data science
  • Driven, passionate, and creative; thrives in a fast-paced environment
  • Knowledge of data modeling and system design using UML
  • Experience with AWS computing (e.g. EC2, Lambda) and data storage technologies (e.g. Redshift)

Tech Stack:
  • PostgreSQL
  • Python
  • Pandas
  • Nice to have: PySpark
  • Nice to have: CDK or Terraform
  • AWS
