Databricks Data Architect

Remote

Salary not specified

Tech Stack

ETL

Job Description, Responsibilities & Requirements

About the Position

We are seeking a Databricks Data Architect to design and build a unified Master Data Management platform for occupational health and safety data. This role involves creating a single source of truth for data from various applications and building an AI layer on top of this platform to deliver clean, connected, and enriched data.

Responsibilities

  • Own the end-to-end architecture of a Databricks-based MDM platform for occupational health, safety, incident, risk, and regulatory data.
  • Design ingestion and transformation patterns using Databricks, Spark, PySpark, SQL, Delta Lake, Unity Catalog, and Lakeflow where appropriate.
  • Define canonical data models, golden-record logic, entity-resolution rules, and survivorship strategies across heterogeneous source systems.
  • Build a semantic layer that provides consistent definitions for incidents, organizations, locations, hazards, controls, risks, regulations, corrective actions, and compliance metrics.
  • Design graph-based relationship models for linking entities across systems and enriching downstream analytics and AI use cases.
  • Architect AI/RAG capabilities for semantic search, regulatory lookup, incident enrichment, data validation, and source-grounded answers over governed enterprise data.
  • Embed data quality, lineage, governance, access control, auditability, and monitoring into the platform from the start.
  • Partner with product, engineering, compliance, and analytics teams to convert domain requirements into scalable architecture and implementation patterns.

Requirements

  • Strong production experience with Databricks Lakehouse architecture, including Spark, PySpark, SQL, Delta Lake, Unity Catalog, and workflow orchestration.
  • Hands-on experience designing and building ETL/ELT pipelines for batch and incremental ingestion, cleansing, normalization, deduplication, and enrichment.
  • Practical experience with MDM: golden records, survivorship/merge rules, trust ranking, identity resolution, duplicate detection, SCD, and exception workflows.
  • Strong data modeling skills for analytical, operational, and semantic consumption patterns.
  • Experience designing a semantic layer with shared business definitions, governed metrics, reusable dimensions, and consistent entity definitions.
  • Experience with data quality and observability: pipeline SLAs, schema drift, CDC, data contracts, dead-letter handling, and source-to-master reconciliation.
  • Experience implementing data governance and security: Unity Catalog lineage, RBAC/ABAC, row/column-level security, PII handling, and regulatory traceability.
  • Ability to translate business requirements from product, compliance, and engineering stakeholders into scalable data architecture.

Nice to Have

  • Experience with Databricks Lakeflow Connect, Lakeflow Spark Declarative Pipelines, and Lakeflow Jobs.
  • Experience with Unity Catalog metric views or comparable semantic-layer technologies.
  • Experience with knowledge graphs, graph analytics (e.g. GraphFrames), or graph-based entity resolution that links people, organizations, locations, incidents, hazards, controls, regulations, assets, and corrective actions.
  • Experience building AI/RAG solutions over enterprise data using AI Search / Vector Search, embeddings, metadata filtering, retrieval evaluation, and source-grounded generation with citations.
  • Experience with ML-based data enrichment, classification, anomaly detection, or entity matching.
  • Experience in regulated domains such as occupational health and safety, incident management, risk, compliance, ESG, insurance, healthcare, or industrial operations.

We Offer

  • Projects for clients such as PayPal, Wargaming, Xerox, Philips, Adidas, and Toyota.
  • Competitive compensation that depends on your qualifications and skills.
  • Career development system with clear skill qualifications.
  • Flexible working hours aligned to your schedule.
  • Options to work remotely.
  • Corporate medical insurance covering services at private and public medical centers.
  • Online English courses.
  • Corporate parties and events for employees and their children.
  • Internal conferences, workshops, and meetups for learning and experience sharing.
  • Gym membership compensation.
  • 5 days of paid sick leave per year with no obligation to submit a sick-leave certificate.

About the Company

Itransition is a global IT consulting and software development company that partners with clients to deliver innovative solutions. Our team is dedicated to providing high-quality services and supporting our clients in achieving their business goals.

Apply for

Apply by filling in the application form or by sending your CV to [email protected].

Job Details

Company name: Itransition
Location: Poland
Employment Type: Full-time
Work Mode: Remote
Posted on TheJob: 6/12/2026
Last checked: 6/12/2026
Posted on the source: 6/9/2026