duckdb-local-sql

A 48 MB Parquet file · one SQL query · results in 0.50 s, entirely on your laptop

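A real end-to-end run: take the public NYC TLC yellow taxi Parquet file for 2024-01 (2,964,624 rows), aggregate it by hour with a single SQL query in the DuckDB CLI, and get the result back in 0.50 s; `COPY` the 24-row summary out to Parquet; then draw a 2862 × 1110 chart with matplotlib. Everything runs locally: zero network, zero LLM, zero API key. Every artifact below can be downloaded and opened directly from this page.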

100% local · zero LLM · zero API key · data: NYC TLC yellow taxi, Jan 2024, CC0 · engine: DuckDB v1.5.3

01
Real SQL terminal · real session, captured · 0.50 s
duckdb · ~/nyc-data · zero network
terminal screenshot of duckdb running SELECT hour, count, avg_fare on yellow_tripdata_2024-01.parquet, returning 24 rows with Run Time 0.504 s
rows scanned: 2,964,624 · file: yellow_tripdata_2024-01.parquet (48 MB) · Run Time: 0.504 s
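The screenshot above shows the query only as an image, so here is a minimal sketch of what an hourly aggregation of this kind could look like when driven from Python. The column names `tpep_pickup_datetime`, `fare_amount`, and `trip_distance` come from the TLC yellow taxi schema; the helper names `hourly_sql` and `run_hourly` are hypothetical, and the sketch has no `WHERE` clause, so its counts may not match the published table if the original query filters rows.

```python
PARQUET = "yellow_tripdata_2024-01.parquet"


def hourly_sql(parquet_path: str) -> str:
    """Build an hourly aggregation query (a sketch, not the page's exact SQL)."""
    return f"""
        SELECT hour(tpep_pickup_datetime)   AS hour,
               count(*)                     AS trips,
               round(avg(fare_amount), 2)   AS avg_fare,
               round(avg(trip_distance), 2) AS avg_miles
        FROM read_parquet('{parquet_path}')
        GROUP BY 1
        ORDER BY 1
    """


def run_hourly(parquet_path: str = PARQUET):
    """Run the query in-process and return a list of row tuples.

    Needs the third-party `duckdb` package and the Parquet file on disk.
    """
    import duckdb  # imported lazily so the SQL builder works without it

    return duckdb.sql(hourly_sql(parquet_path)).fetchall()
```

Calling `run_hourly()` in a directory that holds the downloaded file should return one tuple per pickup hour; the same SQL text can also be pasted into the DuckDB CLI.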
02
Result table + exported Parquet · 24 rows · ZSTD compressed · 1.1 KB
top-pickup-hours.parquet · 24 rows · 4 cols · 1.1 KB
hour    trips  avg_fare  avg_miles
   0   77,692     19.90       3.73
   1   52,684     18.16       3.13
   2   36,783     17.13       2.89
   3   24,252     19.01       3.33
   4   16,304     23.68       4.57
   5   18,378     27.67       8.74
   6   40,873     22.20      12.99
   7   82,935     18.87       6.03
   8  116,126     17.97       5.47
   9  127,732     18.05       3.00
  10  137,288     18.12       3.28
  11  148,890     17.70       3.43
  12  162,552     17.90       3.30
  13  167,877     18.53       3.12
  14  180,814     19.38       3.60
  15  187,060     19.24       3.90
  16  187,883     19.61       3.35
  17  203,822     18.26       3.01
  18  210,106     17.16       2.81
  19  181,711     17.76       3.11
  20  157,918     18.19       3.31
  21  158,764     18.43       3.39
  22  141,180     19.24       3.59
  23  107,506     20.43       3.91
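The export step that produces the ZSTD-compressed summary file can be sketched as a `COPY (...) TO ...` statement wrapped around the aggregation query. This is an illustration under assumptions: the helper names `copy_to_parquet_sql` and `export_summary` are hypothetical, and the page does not show the exact `COPY` options it used beyond "ZSTD compressed".

```python
def copy_to_parquet_sql(select_sql: str, out_path: str) -> str:
    """Wrap a SELECT in a DuckDB COPY statement that writes ZSTD Parquet."""
    inner = select_sql.strip().rstrip(";")  # COPY takes the query without a trailing ';'
    return f"COPY ({inner}) TO '{out_path}' (FORMAT PARQUET, COMPRESSION ZSTD)"


def export_summary(select_sql: str, out_path: str = "top-pickup-hours.parquet") -> None:
    """Execute the COPY in-process. Needs the third-party `duckdb` package."""
    import duckdb  # imported lazily so the statement builder works without it

    duckdb.execute(copy_to_parquet_sql(select_sql, out_path))
```

Keeping the statement builder separate from execution makes it easy to print the generated SQL and run it in the DuckDB CLI instead.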
Download / open in any tool

Download the file and open it with any tool you like, for example duckdb -c "SELECT * FROM 'top-pickup-hours.parquet'", pandas, polars, Arrow, Parquet Tools, or Excel with Power Query.

Want to run against the full 48 MB source data? One command:
curl -L -o yellow_tripdata_2024-01.parquet \
  "https://d37ci6vzurychx.cloudfront.net/trip-data/yellow_tripdata_2024-01.parquet"

Then run the same SQL by following the SOP; you should get the same 24 rows (DuckDB's output is deterministic).

file scanned: 48 MB
rows scanned: 2.96 M
SQL run time: 0.50 s
peak hour: 18:00 · 210k trips
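The headline figures can be cross-checked against the 24-row result table with a few lines of standard-library Python. The trip counts below are copied from the table on this page; the variable names are illustrative only.

```python
# Trips per pickup hour, copied from the result table (list index = hour 0..23).
TRIPS_BY_HOUR = [
    77692, 52684, 36783, 24252, 16304, 18378,
    40873, 82935, 116126, 127732, 137288, 148890,
    162552, 167877, 180814, 187060, 187883, 203822,
    210106, 181711, 157918, 158764, 141180, 107506,
]
ROWS_SCANNED = 2_964_624  # row count reported for the source Parquet file

peak_hour = max(range(24), key=lambda h: TRIPS_BY_HOUR[h])
peak_trips = TRIPS_BY_HOUR[peak_hour]
table_total = sum(TRIPS_BY_HOUR)

print(f"peak hour: {peak_hour}:00 · {peak_trips / 1000:.0f}k trips")  # peak hour: 18:00 · 210k trips
print(f"rows scanned: {ROWS_SCANNED / 1e6:.2f} M")                    # rows scanned: 2.96 M
print(f"trips in table: {table_total:,}")                             # trips in table: 2,927,130
```

The table total is slightly below the number of rows scanned, which would be consistent with the original query filtering out some rows; that filter is not shown on this page, so treat this as an observation rather than an explanation.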
03
matplotlib chart · real PNG, generated from the exported Parquet
hour-of-day-fare.png · 2862 × 1110 · matplotlib bar + line
bar chart of NYC taxi trips by hour and line chart of average fare by hour for January 2024, peak at hour 18
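A chart of this shape (trip-count bars plus an average-fare line on a second y-axis) can be sketched with matplotlib as below. This is not the script that produced the PNG on this page: the function names, colors, labels, and the choice of 300 dpi are all assumptions, and the saved image may differ from 2862 × 1110 by a pixel because of rounding.

```python
def figsize_inches(width_px: int, height_px: int, dpi: int) -> tuple:
    """Convert a target pixel size into the inch-based figsize matplotlib expects."""
    return width_px / dpi, height_px / dpi


def plot_hourly(hours, trips, avg_fare, out_path="hour-of-day-fare.png", dpi=300):
    """Draw trips per hour as bars and average fare as a line on a twin axis.

    Needs the third-party `matplotlib` package; imported lazily.
    """
    import matplotlib
    matplotlib.use("Agg")  # render off-screen, no display required
    import matplotlib.pyplot as plt

    fig, ax_trips = plt.subplots(figsize=figsize_inches(2862, 1110, dpi), dpi=dpi)
    ax_trips.bar(hours, trips, color="tab:blue")
    ax_trips.set_xlabel("pickup hour")
    ax_trips.set_ylabel("trips")

    ax_fare = ax_trips.twinx()  # second y-axis sharing the same x-axis
    ax_fare.plot(hours, avg_fare, color="tab:red", marker="o")
    ax_fare.set_ylabel("average fare (USD)")

    fig.savefig(out_path, dpi=dpi)
    plt.close(fig)
```

The three input sequences can be taken from the exported file, for example by reading `top-pickup-hours.parquet` with DuckDB or pandas and passing the `hour`, `trips`, and `avg_fare` columns.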
04
30-second terminal reel · file listing → SQL → COPY → python → PNG
live · 1600×900 · 30.00 s · 100% local · zero network