Posts

Using databases with Shiny

Key issues when adding persistent storage to a Shiny application, featuring {golem} app development and DigitalOcean serving

How to Make R Markdown Snow

Much like ice sculpting, applying power tools to absolutely frivolous pursuits

Make grouping a first-class citizen in data quality checks

Which of these numbers doesn’t belong? -1, 0, 1, NA. You can’t judge data quality without data context, so our tools should enable as much context as possible.

Why machine learning hates vegetables

A personal encounter with ‘intelligent’ data products gone wrong

A lightweight data validation ecosystem with R, GitHub, and Slack

A right-sized solution to automated data monitoring, alerting, and reporting using R (pointblank, projmgr), GitHub (Actions, Pages, issues), and Slack

Simple things that are hard and important

How metric definitions, ambiguous calculations, sample sizes, and domain knowledge make calculating a humble average a formidable and thought-deserving task

Workflows for querying databases via R

Simple, self-contained, reproducible examples are a common part of good software documentation. However, in the spirit of brevity, these examples often do not demonstrate the most sustainable or flexible workflows for integrating software tools into large projects.

Understanding the data (error) generating processes for data validation

A data consumer’s guide to validating data based on the failure modes data producers try to avoid

A Tale of Six States: Flexible data extraction with scraping and browser automation

Exploring how Playwright’s headless browser automation (and its friends) can help unite the states’ data

Embedding column-name contracts in data pipelines with dbt

dbt supercharges SQL with Jinja templating, macros, and testing – all of which can be customized to enforce controlled vocabularies and their implied contracts on a data model