Tech Stack
ETL
Job Description, Responsibilities & Requirements
About the Position
We are seeking a Databricks Data Architect to design and build a unified Master Data Management platform for occupational health and safety data. This role involves creating a single source of truth for data from various applications and building an AI layer on top of this platform to deliver clean, connected, and enriched data.
Responsibilities
- Own the end-to-end architecture of a Databricks-based MDM platform for occupational health, safety, incident, risk, and regulatory data.
- Design ingestion and transformation patterns using Databricks, Spark, PySpark, SQL, Delta Lake, Unity Catalog, and Lakeflow where appropriate.
- Define canonical data models, golden-record logic, entity-resolution rules, and survivorship strategies across heterogeneous source systems.
- Build a semantic layer that provides consistent definitions for incidents, organizations, locations, hazards, controls, risks, regulations, corrective actions, and compliance metrics.
- Design graph-based relationship models for linking entities across systems and enriching downstream analytics and AI use cases.
- Architect AI/RAG capabilities for semantic search, regulatory lookup, incident enrichment, data validation, and source-grounded answers over governed enterprise data.
- Embed data quality, lineage, governance, access control, auditability, and monitoring into the platform from the start.
- Partner with product, engineering, compliance, and analytics teams to convert domain requirements into scalable architecture and implementation patterns.
Requirements
- Strong production experience with Databricks Lakehouse architecture, including Spark, PySpark, SQL, Delta Lake, Unity Catalog, and workflow orchestration.
- Hands-on experience designing and building ETL/ELT pipelines for batch and incremental ingestion, cleansing, normalization, deduplication, and enrichment.
- Practical experience with MDM: golden records, survivorship/merge rules, trust ranking, identity resolution, duplicate detection, SCD, and exception workflows.
- Strong data modeling skills for analytical, operational, and semantic consumption patterns.
- Experience designing a semantic layer with shared business definitions, governed metrics, reusable dimensions, and consistent entity definitions.
- Experience with data quality and observability: pipeline SLAs, schema drift, CDC, data contracts, dead-letter handling, and source-to-master reconciliation.
- Experience implementing data governance and security: Unity Catalog lineage, RBAC/ABAC, row/column-level security, PII handling, and regulatory traceability.
- Ability to translate business requirements from product, compliance, and engineering stakeholders into scalable data architecture.
Nice to Have
- Experience with Databricks Lakeflow Connect, Lakeflow Spark Declarative Pipelines, and Lakeflow Jobs.
- Experience with Unity Catalog metric views or comparable semantic-layer technologies.
- Experience with knowledge graphs, graph analytics (e.g. GraphFrames), or graph-based entity resolution that links people, organizations, locations, incidents, hazards, controls, regulations, assets, and corrective actions.
- Experience building AI/RAG solutions over enterprise data using AI Search / Vector Search, embeddings, metadata filtering, retrieval evaluation, and source-grounded generation with citations.
- Experience with ML-based data enrichment, classification, anomaly detection, or entity matching.
- Experience in regulated domains such as occupational health and safety, incident management, risk, compliance, ESG, insurance, healthcare, or industrial operations.
We Offer
- Projects for clients such as PayPal, Wargaming, Xerox, Philips, Adidas, and Toyota.
- Competitive compensation that depends on your qualifications and skills.
- Career development system with clear skill qualifications.
- Flexible working hours aligned to your schedule.
- Options to work remotely.
- Corporate medical insurance covering services at private and public medical centers.
- Online English courses.
- Corporate parties and events for employees and their children.
- Internal conferences, workshops, and meetups for learning and experience sharing.
- Gym membership compensation.
- 5 days of paid sick leave per year with no obligation to submit a sick-leave certificate.
About the Company
Itransition is a global IT consulting and software development company that partners with clients to deliver innovative solutions. Our team is dedicated to providing high-quality services and supporting our clients in achieving their business goals.
How to Apply
Apply by filling in the form alongside this posting or by sending your CV to [email protected].