Data Engineer
Comfy Org
Software Engineering, Data Science
San Francisco, CA, USA
Location: San Francisco
Employment Type: Full time
Location Type: On-site
Department: Marketing
The Role
You'll be our first dedicated data engineer, building the data foundation that every team at Comfy will rely on. We stood up the basics of a data stack recently — now we need someone to take ownership of it and turn it into something the whole company trusts.
This means designing dimensional models, establishing dbt standards, building pipeline observability, and making data self-serve for product, BizOps, growth, and engineering. You'll work across Comfy Cloud usage, open-source telemetry, Registry activity, GPU scheduling, and billing data. Our stack includes Snowflake and dbt, but we care way more about your ability to learn and ship than whether you've used these exact tools before.
You might be a good fit if
You've owned a data warehouse end-to-end and built the modeling standards that kept it trustworthy as things scaled
You care about data quality enough to write the tests before anyone asks you to
You're comfortable working directly with non-technical stakeholders to define metrics, resolve ambiguity, and build models people can actually self-serve against
You instinctively reach for automation to eliminate toil, not because it's trendy, but because life is short
You've shipped dimensional models that survived contact with stakeholders who kept changing their minds
What you'll do
Own the warehouse end-to-end. Design dimensional models, establish dbt conventions, and build datasets that serve analytics, product, and finance — starting with unifying Comfy Cloud events, open-source install metrics, and Registry activity into a clean model with dbt tests that prove it's correct.
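To make "dbt tests that prove it's correct" concrete, here is a minimal sketch of the kind of test coverage involved. This is illustrative only: the model and column names (`fct_cloud_usage`, `dim_users`, `event_id`) are hypothetical, not Comfy's actual schema.

```yaml
# models/marts/schema.yml -- model and column names are illustrative
version: 2

models:
  - name: fct_cloud_usage
    description: "Unified fact table: Comfy Cloud events, OSS installs, Registry activity"
    columns:
      - name: event_id
        tests:
          - unique
          - not_null
      - name: user_id
        tests:
          - not_null
          - relationships:
              to: ref('dim_users')
              field: user_id
```

Running `dbt test` against definitions like these catches duplicate events, null keys, and orphaned foreign keys before anyone queries the model.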
Build observability into the stack. Alerting, lineage, freshness monitoring from ingestion to consumption — so you know when an upstream schema change breaks something before the Monday metrics review, not after.
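Freshness monitoring of this kind can be expressed directly in dbt source definitions. A sketch, with hypothetical source and table names:

```yaml
# models/staging/sources.yml -- source and table names are illustrative
version: 2

sources:
  - name: comfy_cloud
    loaded_at_field: _loaded_at
    freshness:
      warn_after: {count: 6, period: hour}
      error_after: {count: 24, period: hour}
    tables:
      - name: usage_events
```

`dbt source freshness` then warns when raw data is more than six hours stale and errors past a day, so a silent upstream failure surfaces as an alert rather than a wrong number.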
Catch data quality problems at the source. Write the controls and tests that surface issues before they show up in a dashboard and someone has to ask why the numbers look wrong.
Make data self-serve for every team. Work directly with BizOps, product, and engineering to understand what they're trying to answer, then build the models that let them answer it themselves — including the free-to-paid conversion funnel, a single agreed-upon definition of "active user," and an end-to-end billing model that reconciles usage events with invoices and flags discrepancies automatically.
Nice to have
Python for data engineering (not notebooks, actual pipeline code)
Snowflake, BigQuery, or similar cloud warehouse
Airflow, Dagster, or similar orchestration
Experience with product analytics or BI tools
Familiarity with ComfyUI or node-based workflow tools