About the project:
You will be embedded in the Data Platform team of a global mining organization, helping design and deliver a modern Azure-native data infrastructure. The team ingests data from dozens of operational sources — including REST and SOAP APIs, IoT sensors, ERP systems, and flat-file feeds — transforming it into curated, analytics-ready datasets that drive decisions across finance, operations, and supply chain.
This is a hands-on engineering role. You will spend most of your time building and maintaining robust pipelines, optimizing data flows within Azure Synapse and Databricks, and ensuring the platform remains reliable, scalable, and secure.
Key responsibilities:
- Gather and analyze business requirements to identify and prioritize integration opportunities.
- Design, build, and maintain scalable data pipelines using Azure Data Factory to orchestrate ingestion from diverse source systems.
- Develop and optimize data transformation logic and analytical workloads in Azure Synapse Analytics (dedicated and serverless SQL pools, Synapse Pipelines).
- Build and maintain Databricks notebooks and jobs (PySpark / Spark SQL) for large-scale data processing and feature engineering.
- Implement robust API extraction patterns — REST and SOAP — handling pagination, authentication (OAuth 2.0, API keys), rate limiting, and error recovery.
- Design and maintain the lakehouse architecture: raw, curated, and serving layers using Azure Data Lake Storage Gen2.
- Ensure data quality through validation frameworks, monitoring pipelines, alerting, and lineage documentation.
- Collaborate with Analytics Engineers and data consumers to understand downstream requirements and model data accordingly.
- Apply best practices for CI/CD, infrastructure-as-code (Azure DevOps, Terraform / Bicep), and environment management.
- Participate in data governance activities including metadata management, access control, and documentation.
Requirements:
- 4+ years of experience in data engineering or a closely related software engineering role.
- Strong analytical and communication skills.
- Strong hands-on experience with Azure Data Factory — authoring pipelines, triggers, linked services, and data flows.
- Solid experience with Azure Synapse Analytics — SQL pools, Synapse Pipelines, and integration with ADLS Gen2.
- Proven experience with Databricks (PySpark, Delta Lake, job clusters); Unity Catalog experience is a plus.
- Strong API extraction skills: designing reliable ingestion from REST and SOAP endpoints, handling complex authentication and error scenarios.
- Proficiency in SQL (T-SQL or Spark SQL) and Python for data manipulation and pipeline scripting.
- Familiarity with data modelling concepts: star/snowflake schemas, data vault, or medallion architecture.
- Experience with Azure DevOps for source control (Git), CI/CD pipelines, and automated deployments.
- Understanding of data security, role-based access control (RBAC), and encryption in Azure.
- Strong problem-solving mindset and ability to work autonomously in a distributed team environment.
Nice to have:
- Experience with Microsoft Purview (formerly Azure Purview) for data cataloguing and governance.
- Knowledge of streaming data patterns (Azure Event Hubs, Kafka, Spark Streaming).
- Familiarity with dbt or similar SQL-based transformation frameworks.
- Background in mining, heavy industry, or operational technology (OT) data environments.
- Power Platform awareness (not required, but beneficial for understanding downstream consumers).
We offer*:
- Flexible working format: remote, office-based, or hybrid
- A competitive salary and good compensation package
- Personalized career growth
- Professional development tools (mentorship program, tech talks and training sessions, centers of excellence, and more)
- Active tech communities with regular knowledge sharing
- Education reimbursement
- Memorable anniversary presents
- Corporate events and team-building activities
- Other location-specific benefits
*Not applicable for freelancers