Existing Repositories

Overview of current Datadrivet repositories and their purposes.

Main Dagster Data Infrastructure Repository

datadrivet-infra-opendatastack

Primary data infrastructure and pipeline repository

Purpose: All data processing pipelines, LLM job ad matching, ETL jobs, and analytics infrastructure
Technology: Python, dbt, Dagster, Snowflake, Airtable, and upstream message integrations
Use Cases:
- Data ingestion from various sources (Airtable, Teamtailor, etc.)
- Data transformations and modeling
- Analytics and reporting pipelines
- LLM job ad matching (Cofinder pipeline)

When to add here: Most data-related projects, utilities, and experiments should start here.

The CoFinder data infrastructure and data pipeline are explained in more detail on this page.

Azure Terraform Resources

k8s-dataplatform

All Azure-related resources are deployed through this repository.

Cofinder Portal

datadrivet-cofinder-portal

Fluxcd (Kubernetes Continuous Delivery)

fluxcd-dataplatform

Template Repository

datadrivet-template

Template for creating new Datadrivet repositories

Purpose: Standardized starting point for new projects
Includes:
- Pre-configured devenv setup
- SOPS secrets management
- Common development scripts
- GitHub Actions workflows
- Pre-commit hooks

When to use: Only when you have a valid reason for a new repository.

Support Repositories

nixos-wsl

NixOS-based WSL distribution for Windows development

Purpose: Provides Windows developers with a pre-configured Linux environment
Includes: Nix, Devenv, Direnv, and common development tools
Use Cases: Windows developers who need the full Nix development stack

datadrivet-docs

This documentation site

Purpose: Central documentation for Datadrivet infrastructure
Technology: Jekyll, GitHub Pages
Content: Setup guides, best practices, service documentation

Repository Selection Guide

Start with Existing Repositories

For most new work, add to datadrivet-infra-opendatastack:

✅ New data sources or integrations
✅ Analytics scripts and reports

For CRUD functionality and the MCP server, add to datadrivet-cofinder-portal.

Consider New Repository Only If

Different technology stack (i.e., not Python)
Different deployment target (e.g., mobile app, embedded system)
External collaboration (will be shared with external partners)
Regulatory isolation (compliance requires separate codebase)
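
The selection guide above can be read as a small decision function. The following is a minimal sketch for illustration only, not tooling that exists in any of the repositories: the `Project` class, its field names, and `suggest_repository` are hypothetical, and the order in which the checks are applied (new-repository criteria first, then the portal, then the default) is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Project:
    """Hypothetical description of a proposed piece of work."""

    uses_python: bool = True
    special_deployment_target: bool = False  # e.g. mobile app, embedded system
    shared_with_external_partners: bool = False
    needs_regulatory_isolation: bool = False
    is_crud_or_mcp: bool = False  # CRUD functionality or the MCP server


def suggest_repository(project: Project) -> str:
    """Return the repository the selection guide would point to."""
    needs_new_repo = (
        not project.uses_python
        or project.special_deployment_target
        or project.shared_with_external_partners
        or project.needs_regulatory_isolation
    )
    if needs_new_repo:
        # Assumed priority: any of the four criteria outweighs the defaults.
        return "new repository (from datadrivet-template)"
    if project.is_crud_or_mcp:
        return "datadrivet-cofinder-portal"
    # Default for most data-related work.
    return "datadrivet-infra-opendatastack"


if __name__ == "__main__":
    print(suggest_repository(Project()))
    print(suggest_repository(Project(is_crud_or_mcp=True)))
    print(suggest_repository(Project(uses_python=False)))
```

In practice the decision is a conversation rather than a function call; the sketch only makes the default explicit: unless one of the four criteria applies, the work belongs in an existing repository.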

Getting Started with Existing Repositories

1. Clone and Setup

# Example: Main infrastructure repository
git clone git@github.com:knowit-solutions-cocreate/datadrivet-infra-opendatastack.git
cd datadrivet-infra-opendatastack
direnv allow

2. Explore Available Commands

menu

See each repository’s README.md for more information on how to start contributing.

Contributing Guidelines

Use Vibes with Care - Write your code first, and then let your vibe-tool of choice iterate on it. We are interested in how you think. At the time of writing (August 2025), agents/LLMs tend to overlook existing coding patterns, which causes a lot of technical debt.
See rule 0 -

Repository Maintenance

Each repository has designated maintainers:

Primary contact: Jimmy Flatting
Access management: Repository owners manage SOPS keys and permissions
Code review: All changes require pull request review

Need Help?

General questions: Ask in #cofinder or #cofinder-dev
New repository requests: Discuss in the above channels