R. Anderson / Memeroot Ltd · working paper · June 2026

The Row Tax

ETL, ELT, and the Driver Stratum

A rigorous, interactive account of why enterprise data integration processed one row at a time, what it cost, and what the set-based alternative looked like in dated artifacts.

Core claim The pipe won over the engine.

The universal ODBC/JDBC connectivity contract was cursor-shaped; client ETL engines inherited that shape and made row motion the architecture.

01 · The claim

Row-wise ETL was an inherited physical constraint, not just a design preference.

The usual story treats row-by-row ETL as a mapping-canvas style. This site frames it as an architectural consequence of the generic driver layer: the universal pipe was row-shaped, so the engine architecture became row-shaped.

API shape

Cursor in

ODBC and JDBC expose result sets through a current-row cursor. Block fetch and array binding reduce call count, but the abstraction remains a rowset cursor.

Engine shape

Rows through client memory

The DTM/plan engine reads, transforms, and writes records through the driver layer, paying marshalling, network, locking, and index maintenance repeatedly.

Alternative

Set engine owns the work

ELT lands once and executes SQL inside the database: INSERT…SELECT, MERGE, CTAS, partition exchange, parallel query, and direct-path writes.

Argument map

  1. Universal connectivity standardised around ODBC/JDBC.
  2. ODBC/JDBC result handling exposed rows through cursors.
  3. ETL engines internalised this as reader → transform → writer.
  4. Batch windows grew as rows crossed the pipe twice.
  5. Pushdown/ELT later conceded that the database engine was the correct transformation location.

Shadow cast by the API

ODBC/JDBC
cursor API
Client ETL
row buffers
Batch Estate
row tax

This is not an attack on universality. It is an accounting of the physical cost of universality when it becomes the default execution model.

02 · The driver stratum

The connectivity layer became an invisible product line inside everyone else’s product.

The public chain below is separated from the private artifact record. It is included to make the driver-lineage claim inspectable.

Cursor contract

ODBC exposes fetch operations over a result-set cursor. SQLFetch advances to the next rowset; JDBC ResultSet similarly starts before the first row and uses next() to advance row by row. This is the contract that generic tools can rely on across databases.

SQLPrepare(stmt, sql)
SQLExecute(stmt)
while SQLFetch(stmt) == SQL_SUCCESS:
    bind columns
    transform current row
    SQLExecute(insert_or_update)

Merant fossil

Oracle BI EE documentation still references required DataDirect 7.1.5 drivers under a path containing ODBC/Merant/7.1.5. Informatica documentation states that its installation includes DataDirect ODBC drivers. The driver layer is often present, but not architecturally visible.

$ORACLE_HOME/bi/common/ODBC/Merant/7.1.5/lib
ODBCHOME=/export/home/Informatica/10.0.0/ODBC7.1

03 · Interactive cost model

Row tax is the repeated cost of crossing the pipe.

This calculator is illustrative. It does not claim universal benchmark accuracy; it makes the terms visible: row count, call count, per-row work, and set-engine replacement.

Workload

Estimated result

row-wise ETL
set-based ELT
driver calls
time reduction
ETL row engine
ELT set engine

Cursor animation

What array binding fixes — and what it does not

Array fetch/write reduces round-trips. It does not move transformation ownership into the database. The client engine still walks record buffers, still pays data movement, and still treats the set as a stream of rows.

04 · ETL versus ELT

The difference is where transformation executes.

Two diagrams show the same business logic in two execution locations. Use the controls to highlight crossings, execution engine, and failure surface.

ETL: extract → transform → load

Source DB
driver cursor
Client ETL engine
parameterised writes
Target DB
  • Two driver crossings per record path.
  • Transform logic runs outside the database.
  • Updates can become row-level index/lock storms.

ELT: land → transform in place

Source
bulk/direct path
Database set engine
SQL set operation
Consumption table
  • One landing path, then in-engine SQL.
  • Sort/hash/parallel machinery owns the work.
  • MERGE/CTAS/INSERT…SELECT collapse row crossings.

CDC versus full compare

Boundary delta effect

cell comparisons
changed rows
workload ratio

CDC does not make row handling disappear by itself. It prevents the false workload: scanning and comparing unchanged rows nightly because the boundary did not provide change state.

05 · Dated artifacts

The alternative was demonstrated during the row-engine period.

The cards below are generated from the evidence register supplied with the working paper. Private artifact claims are deliberately labelled as artifact evidence, not public corporate record.

Token-substituted SQL generator

Single-pass token resolution can turn a registry row into executable set SQL. Change the inputs and re-render the generated statement.

Dispatcher registry demo

Registration rows create executable dispatch. This is the AFFIRM pattern: row inserts define run types/message tables/consumers, then the dispatcher builder resolves them.

1 · Registry rows
2 · Builder reads metadata
3 · DAG/work queue emitted
4 · Workers execute SQL

ROWID equi-block splitter demo

OWNER_OWK’s Table_Where_Split pattern is represented here with synthetic extents. The point is not the literal ROWID values; it is physical-range splitting into deterministic work units.

06 · Economic reading

The persistence of row-wise architecture had a commercial substrate.

This section is interpretive. The factual chain is sourced; the causal weighting is the paper’s argument.

1

Product economics

ETL licences monetised the client engine. Moving work to the database reduced the engine’s architectural centrality.

2

Connectivity rent

CDC and mainframe connectivity became separately licensed capability rather than default boundary discipline.

3

Labour-market lock-in

Mapping-canvas skills, certifications, and offshore delivery made row-wise estate maintenance reproducible.

4

Terminal concession

When estates became unmovable, emulation monetised the refusal to regenerate formal structure.

COBOL copybook as structure declaration

The argument is not that legacy forms are unintelligible. A copybook is a formal declaration: level numbers define nesting, OCCURS defines repetition, PIC defines type/shape. That is generator input. The expensive decision is to treat it as executable fossil rather than resolvable structure.

01 CUSTOMER-REC.
   05 CUSTOMER-ID        PIC X(12).
   05 BALANCES OCCURS 12.
      10 MONTH-NO       PIC 99.
      10 BALANCE-AMT    PIC S9(11)V99.

07 · Evidence register

Claims, artifact basis, and public-source basis.

The app distinguishes three kinds of support: supplied private artifacts, public corporate/documentation sources, and interpretive claims.

Evidence graph

Public sources, supplied artifacts, and interpretive claims are kept as different node classes. This avoids mixing dated artifact evidence with public corporate history.

08 · Cloudflare pack

Pages-ready static app with optional edge worker.

This pass removes the broken asset references and packages the site as a no-build Cloudflare Pages deployment. It also includes an optional _worker.js edge layer for health/manifest endpoints while preserving pure static fallback.

Direct upload

Drag/drop the folder or ZIP

Works as a static site: index.html, styles.css, app.js, manifest.webmanifest, sw.js, _headers, and _redirects.

Wrangler

Deploy with CLI

Use Wrangler when you want the included _worker.js edge endpoint active from the same root directory.

npx wrangler pages deploy . --project-name row-tax
Git

No build command required

Set the build output directory to the repository root, or keep this folder as the output directory in a larger repo.

Edge feature probe

On Cloudflare, this button checks the optional Worker endpoint. Locally it will report static-only mode, which is expected.

Probe result

Not tested yet.

Included Cloudflare files

_headerssecurity headers, cache policy, permissions policy
_redirectsSPA fallback and clean route aliases
_worker.jsoptional edge health/manifest endpoints
sw.jsoffline cache for the app shell
manifest.webmanifestinstallable PWA metadata
wrangler.tomlPages project metadata and compatibility date