Overview of current Datadrivet repositories and their purposes.

Main Dagster Data Infrastructure Repository

datadrivet-infra-opendatastack

Primary data infrastructure and pipeline repository
  • Purpose: All data processing pipelines, LLM job ad matching, ETL jobs, and analytics infrastructure
  • Technology: Python, dbt, Dagster, Snowflake, Airtable and upstream message integrations
  • Use Cases:
    • Data ingestion from various sources (Airtable, Teamtailor, etc.)
    • Data transformations and modeling
    • Analytics and reporting pipelines
    • LLM job ad matching (Cofinder pipeline)

When to add here: Most data-related projects, utilities, and experiments should start here.

The CoFinder data infrastructure and data pipeline are explained in more detail on this page

Azure Terraform Resources

k8s-dataplatform

All Azure-related resources are deployed through this repository.

Cofinder Portal

datadrivet-cofinder-portal

Fluxcd (Kubernetes Continuous Delivery)

fluxcd-dataplatform

Template Repository

datadrivet-template

Template for creating new Datadrivet repositories
  • Purpose: Standardized starting point for new projects
  • Includes:
    • Pre-configured devenv setup
    • SOPS secrets management
    • Common development scripts
    • GitHub Actions workflows
    • Pre-commit hooks

When to use: Only when you have a valid reason for a new repository.

Support Repositories

nixos-wsl

NixOS-based WSL distribution for Windows development

  • Purpose: Provides Windows developers with a pre-configured Linux environment
  • Includes: Nix, Devenv, Direnv, and common development tools
  • Use Cases: Windows developers who need the full Nix development stack

datadrivet-docs

This documentation site

  • Purpose: Central documentation for Datadrivet infrastructure
  • Technology: Jekyll, GitHub Pages
  • Content: Setup guides, best practices, service documentation

Repository Selection Guide

Start with Existing Repositories

For most new work, add to datadrivet-infra-opendatastack:

  • ✅ New data sources or integrations
  • ✅ Analytics scripts and reports

For CRUD functionality and the MCP server, add to datadrivet-cofinder-portal.

Consider New Repository Only If

  • Different technology stack (i.e., not Python)
  • Different deployment target (e.g., mobile app, embedded system)
  • External collaboration (will be shared with external partners)
  • Regulatory isolation (compliance requires separate codebase)

Getting Started with Existing Repositories

1. Clone and Setup

# Example: Main infrastructure repository
git clone git@github.com:knowit-solutions-cocreate/datadrivet-infra-opendatastack.git
cd datadrivet-infra-opendatastack
direnv allow
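
Before running the steps above, it can help to confirm that the tools they rely on are installed. The following is a minimal sketch, not part of any repository: the helper name missing_tools is hypothetical, and it assumes only that git and direnv (the two commands used above) need to be on your PATH. Any further tooling that the devenv setup pulls in is not checked here.

```shell
# Hypothetical helper: print the name of each given tool that is not on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# git and direnv are the commands used in the clone-and-setup steps.
missing=$(missing_tools git direnv)
if [ -n "$missing" ]; then
  # Left unquoted on purpose so the names print on a single line.
  echo "Missing tools:" $missing
else
  echo "git and direnv found; ready to clone."
fi
```

If anything is reported missing, install it (or use the nixos-wsl distribution on Windows) before cloning.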

2. Explore Available Commands

menu

See each repository’s README.md for more information on how to start contributing.

Contributing Guidelines

  1. Use Vibes with Care - Write your code first and then let your vibe-tool of choice iterate on it. We are interested in how you think. At the time of writing (August 2025), agents/LLMs tend to overlook existing coding patterns, which causes a lot of technical debt.
  2. See rule 1.

Repository Maintenance

Each repository has designated maintainers:

  • Primary contact: Jimmy Flatting
  • Access management: Repository owners manage SOPS keys and permissions
  • Code review: All changes require pull request review

Need Help?

  • General questions: Ask in #cofinder or #cofinder-dev
  • New repository requests: Discuss in the above channels