I’ve been thoroughly duckpilled. I’ve spent the better part of the last year wrangling a LOT of CSVs, and let me tell you, duckdb has been right there with me, helping me along the way.

[Image: man running from his house yelling 'AHHH IM INSANE WITH ANGER']
me, talking about duckdb to anyone who will listen

And after a year of doing this DAY AFTER DAY AFTER DAY

❯ duckdb
v1.1.3 19864453f7
Enter ".help" for usage hints.
Connected to a transient in-memory database.
Use ".open FILENAME" to reopen on a persistent database.
D create table t as select * from read_csv('ome_more_csv');

I decided that enough was enough and I made DUCKIT.

BEHOLD MY MONSTROUS CREATION!

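duckit() {
  # CSV file to load, passed as the first argument
  csv=$1

  # mktemp creates an empty temp file; the database goes next to it with a .db suffix
  tmp_db=$(mktemp).db
  # load the CSV into table t via stdin, then open an interactive session on that database
  cat "$csv" | duckdb "$tmp_db" "create table t as select * from read_csv('/dev/stdin')" && duckdb "$tmp_db"
}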

Does it clean up after itself? Of course not!

Does it check inputs? Psh.

What happens if duckdb isn’t installed? You’ll probably die, who can tell?
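For anyone bothered by those three answers, here is a rough sketch of what a less feral version might look like. To be clear, none of this is in the original: the name duckit_safe, the exit codes, and the mktemp -d layout are all assumptions made for illustration. It only uses the same two duckdb invocations as duckit itself.

```shell
# duckit_safe: a hypothetical, more defensive variant of duckit (illustrative only).
duckit_safe() {
  local csv="$1"

  # Fail early if the duckdb CLI is not on PATH.
  if ! command -v duckdb >/dev/null 2>&1; then
    echo "duckit_safe: duckdb not found on PATH" >&2
    return 127
  fi

  # Require an argument that points at a readable file.
  if [ -z "$csv" ] || [ ! -r "$csv" ]; then
    echo "usage: duckit_safe FILE.csv" >&2
    return 2
  fi

  # Use a private temp directory so the database path does not exist yet
  # and everything duckdb writes next to it can be removed in one go.
  local tmp_dir
  tmp_dir=$(mktemp -d) || return 1
  local tmp_db="$tmp_dir/duckit.db"

  # Same two steps as duckit: load the CSV into table t, then open a session.
  cat "$csv" | duckdb "$tmp_db" "create table t as select * from read_csv('/dev/stdin')" \
    && duckdb "$tmp_db"
  local rc=$?

  # Clean up once the interactive session ends (or the load fails).
  rm -rf -- "$tmp_dir"
  return "$rc"
}
```

The exit status is kept in rc rather than status because zsh treats status as a read-only special variable. The original duckit remains gloriously unsafe.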

But it lets me do this, which I’ve been dying to do for months now:

❯ duckit free_company_dataset.csv
100% ▕████████████████████████████████████████████████████████████▏ 
v1.1.3 19864453f7
Enter ".help" for usage hints.
D summarize t;
┌──────────────┬─────────────┬─────────────────────────────────────────────────────────────────────┬────────────────────────────────────────────────────────────────────────────────┬───────────────┬────────────────────┬────────────────────┬─────────┬─────────┬─────────┬──────────┬─────────────────┐
│ column_name  │ column_type │                                 min                                 │                                      max                                       │ approx_unique │        avg         │        std         │   q25   │   q50   │   q75   │  count   │ null_percentage │
│   varchar    │   varchar   │                               varchar                               │                                    varchar                                     │     int64     │      varchar       │      varchar       │ varchar │ varchar │ varchar │  int64   │  decimal(9,2)   │
├──────────────┼─────────────┼─────────────────────────────────────────────────────────────────────┼────────────────────────────────────────────────────────────────────────────────┼───────────────┼────────────────────┼────────────────────┼─────────┼─────────┼─────────┼──────────┼─────────────────┤
│ country      │ VARCHAR     │ afghanistan                                                         │ åland islands                                                                  │           283 │                    │                    │         │         │         │ 22734322 │            9.98 │
│ founded      │ BIGINT      │ 1001                                                                │ 2024                                                                           │           937 │ 2006.3161400677536 │ 24.825956869579304 │ 2003    │ 2013    │ 2018    │ 22734322 │           60.79 │
│ id           │ VARCHAR     │ 000053ktcMRCSIbGGboE0QlBPQlh                                        │ zzzzgepytuhECxjtjLcHyAGn554K                                                   │      22169331 │                    │                    │         │         │         │ 22734322 │            0.00 │
│ industry     │ VARCHAR     │ "glass                                                              │ writing and editing                                                            │           164 │                    │                    │         │         │         │ 22734322 │           23.83 │
│ linkedin_url │ VARCHAR     │  ceramics & concrete"                                               │ linkedin.com/company/   ashiraj-education-overseas-consultants-private-limited │      26144844 │                    │                    │         │         │         │ 22734322 │            0.00 │
│ locality     │ VARCHAR     │ "aeroporto \"b\""                                                   │ ’aïn el melh                                                                   │        248569 │                    │                    │         │         │         │ 22734322 │           32.66 │
│ name         │ VARCHAR     │  el"                                                                │ 🫧sl-wash🫧 spécialiste station de lavage auto                                 │      24204406 │                    │                    │         │         │         │ 22734322 │            0.13 │
│ region       │ VARCHAR     │                                 registered estate surveyors & val…  │ 🧚♀️maison sérénité opôno  🪷                                                   │        139478 │                    │                    │         │         │         │ 22734322 │           22.99 │
│ size         │ VARCHAR     │ 1-10                                                                │ žilinský                                                                       │          2558 │                    │                    │         │         │         │ 22734322 │            0.38 │
│ website      │ VARCHAR     │ "google.com/maps/place/28°36'36.4\"n+77°01'43.6\"e/@28.610122.77.…  │ 👁👄👁.fm                                                                        │      20771265 │                    │                    │         │         │         │ 22734322 │           30.49 │
├──────────────┴─────────────┴─────────────────────────────────────────────────────────────────────┴────────────────────────────────────────────────────────────────────────────────┴───────────────┴────────────────────┴────────────────────┴─────────┴─────────┴─────────┴──────────┴─────────────────┤
│ 10 rows                                                                                                                                                                                                                                                                                     12 columns │
└────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
D select count(0) from t;
┌──────────┐
│ count(0) │
│  int64   │
├──────────┤
│ 22734322 │
└──────────┘

note: don’t scroll all the way to the right, the table got all janked