High-performance CSV parser with SIMD acceleration

Built with Zig for native performance. Now with Fast Mode, security hardening, and 22 CLI flags.

bun add turbocsv
  • 269 MB/s peak throughput
  • 5.8x faster than fast-csv
  • 6.35x faster than csv-parse

Performance Comparison

Throughput on an Apple M1 Pro across file sizes from 1K to 100K rows (the 100K-row file is ~10 MB). Higher is better.

Library     1K rows       10K rows      100K rows     100K (wide)
TurboCSV    122.6 MB/s    165.3 MB/s    176.1 MB/s    269.3 MB/s
PapaParse   84.0 MB/s     109.3 MB/s    112.0 MB/s    224.6 MB/s
csv-parse   25.2 MB/s     34.9 MB/s     35.3 MB/s     40.3 MB/s
fast-csv    24.8 MB/s     28.7 MB/s     30.2 MB/s     38.1 MB/s

Run the benchmark yourself: bun run benchmark:compare

Features

SIMD Acceleration

ARM64 NEON and x86 SSE2 vector instructions for parallel character scanning at native speed.

Security & Validation

CSV injection protection, structured error reporting, skip bad rows, size limits, column relaxation.

DataFrame API

Pandas-like operations: select, filter, sort, groupBy, join with lazy evaluation.

Fast Mode

TypeScript-only parser for clean data. Dynamic typing, cast functions, nested JSON support.

22 CLI Flags

All parser options accessible via CLI: trim, comments, range processing, error handling, and more.

Memory-Mapped Files

Process files larger than RAM. Zero-copy parsing keeps data out of the JS heap.

RFC 4180 Compliant

Full support for quoted fields, escaped quotes, and multi-line values.
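
These three behaviors interact: a quoted field may itself contain escaped quotes and embedded newlines. The core rule set can be sketched as a character-by-character state machine. The function below is an illustrative TypeScript reimplementation of the RFC 4180 semantics, not TurboCSV's SIMD code path:

```typescript
// Minimal RFC 4180 field splitter: quoted fields, "" escapes, multi-line values.
function parseCsv(text: string): string[][] {
  const rows: string[][] = [];
  let row: string[] = [];
  let field = "";
  let inQuotes = false;
  for (let i = 0; i < text.length; i++) {
    const c = text[i];
    if (inQuotes) {
      if (c === '"') {
        if (text[i + 1] === '"') { field += '"'; i++; } // "" -> literal quote
        else inQuotes = false;                          // closing quote
      } else field += c;                                // newlines stay in the field
    } else if (c === '"') inQuotes = true;
    else if (c === ",") { row.push(field); field = ""; }
    else if (c === "\n") { row.push(field); rows.push(row); row = []; field = ""; }
    else if (c !== "\r") field += c;
  }
  if (field.length > 0 || row.length > 0) { row.push(field); rows.push(row); }
  return rows;
}
```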

Cross-Platform

Native binaries for macOS, Linux, Windows. WASM fallback for universal compatibility.

What's New in v0.3.0

Major feature release — 22 new CLI flags, security hardening, Fast Mode, and robust error handling.

Security & Validation

  • CSV injection protection with escapeFormulae
  • Structured errors with type, code, row fields
  • skipRecordsWithError — silently drop malformed rows
  • maxRecordSize — reject oversized rows
  • Flexible column count handling (relax constraints)
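
escapeFormulae follows the standard CSV-injection defense: prefix any cell that a spreadsheet would evaluate as a formula with a single quote. A minimal sketch of that rule (TurboCSV's exact trigger set is an assumption here):

```typescript
// Cells beginning with =, +, -, @ (or tab/CR) can execute as formulas when the
// CSV is opened in a spreadsheet; a leading single quote neutralizes them.
const FORMULA_TRIGGERS = new Set(["=", "+", "-", "@", "\t", "\r"]);

function escapeFormula(cell: string): string {
  return FORMULA_TRIGGERS.has(cell[0]) ? "'" + cell : cell;
}
```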

Whitespace & Processing

  • trim, ltrim, rtrim — strip whitespace
  • Greedy empty row skipping with skipEmptyRows
  • fromLine / toLine — parse file ranges
  • Comment support with comments: true
  • Skip rows where all fields are empty
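
Together these options act as a pre-filter over raw records. An illustrative sketch of the combined semantics (option names mirror the list above; the implementation is hypothetical, not TurboCSV's native code):

```typescript
interface LineOptions {
  trim?: boolean;     // strip surrounding whitespace from each field
  comments?: boolean; // drop lines starting with '#'
  fromLine?: number;  // 1-based inclusive start of the range to parse
  toLine?: number;    // 1-based inclusive end of the range to parse
}

function preprocess(records: string[][], opts: LineOptions): string[][] {
  return records
    .filter((_, i) => {
      const line = i + 1;
      if (opts.fromLine && line < opts.fromLine) return false;
      if (opts.toLine && line > opts.toLine) return false;
      return true;
    })
    .filter(r => !(opts.comments && r[0]?.startsWith("#")))
    .map(r => (opts.trim ? r.map(f => f.trim()) : r))
    .filter(r => r.some(f => f.length > 0)); // skip rows where all fields are empty
}
```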

Fast Mode & Typing

  • Fast Mode — TypeScript-only parser for clean data
  • Dynamic typing — auto-convert to numbers/booleans
  • Cast functions — per-column type transformers
  • flatten() / unflatten() for nested JSON
  • unparse() with flattenObjects option
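
The dot-path convention for flatten() and the number/boolean coercion rules for dynamic typing are assumptions here; a sketch of the likely semantics:

```typescript
// Convert nested objects to single-level, dot-path keys (assumed delimiter: ".").
function flatten(obj: Record<string, unknown>, prefix = ""): Record<string, unknown> {
  const out: Record<string, unknown> = {};
  for (const [key, value] of Object.entries(obj)) {
    const path = prefix ? `${prefix}.${key}` : key;
    if (value !== null && typeof value === "object" && !Array.isArray(value)) {
      Object.assign(out, flatten(value as Record<string, unknown>, path));
    } else {
      out[path] = value;
    }
  }
  return out;
}

// Dynamic typing: best-effort conversion of raw strings (assumed rule set).
function dynamicType(raw: string): string | number | boolean {
  if (raw === "true") return true;
  if (raw === "false") return false;
  const n = Number(raw);
  return raw !== "" && !Number.isNaN(n) ? n : raw;
}
```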

Advanced Features

  • Duplicate header handling (rename or error)
  • beforeFirstChunk — transform raw data
  • onRecord — per-record filtering/transform
  • Fixed SIMD quote handling bug
  • 22 CLI flags — all parser options available
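
onRecord behaves as a combined map/filter over parsed records; the contract sketched below (return a record to keep or transform it, return null to drop it) is an assumption based on the description above:

```typescript
type Row = Record<string, string>;
type OnRecord = (rec: Row, rowIndex: number) => Row | null;

// Apply a per-record hook: the hook may transform a record or drop it via null.
function applyOnRecord(records: Row[], hook: OnRecord): Row[] {
  const out: Row[] = [];
  records.forEach((rec, i) => {
    const result = hook(rec, i);
    if (result !== null) out.push(result);
  });
  return out;
}
```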

Quick Start

Basic parsing

import { CSVParser } from "turbocsv";

const parser = new CSVParser("data.csv");

for (const row of parser) {
  console.log(row.get("name"), row.get("email"));
}

parser.close();

Error handling & security

import { CSVParser, unparse, flatten } from "turbocsv";

// Robust parsing with error handling
const parser = new CSVParser("messy.csv", {
  trim: true,                     // Clean whitespace
  skipRecordsWithError: true,     // Skip bad rows
  comments: true,                  // Skip # prefixed lines
  duplicateHeaders: "rename",      // Handle duplicate columns
  dynamicTyping: true,             // Auto-convert types
  maxRecordSize: 10000,            // Reject huge rows
  cast: {                          // Custom transformers
    price: (val) => parseFloat(val.replace("$", "")),
    date: (val) => new Date(val)
  }
});

// Process with structured error handling
for (const row of parser) {
  try {
    processRow(row);
  } catch (error) {
    if (error.code === "TooFewFields") {
      console.log(`Row ${error.row}: Missing fields`);
    }
  }
}

// Secure CSV output
const csv = unparse(data, {
  escapeFormulae: true,    // Prevent CSV injection
  flattenObjects: true     // Handle nested JSON
});

DataFrame operations

import { CSVParser } from "turbocsv";

const parser = new CSVParser("data.csv");
const df = parser.toDataFrame();

// Chain operations
const result = df
  .filter(row => row.age > 18)
  .select("name", "email", "age")
  .sorted("name", "asc")
  .first(100);

// Aggregation
const grouped = df.groupBy("department", [
  { col: "salary", fn: "mean" },
  { col: "id", fn: "count" },
]);

parser.close();

CLI usage

# Trim whitespace and skip bad rows
turbocsv head --trim --skip-errors data.csv

# Fast mode with dynamic typing
turbocsv head --fast --dynamic-typing --format json data.csv

# Validate with structured error reporting
turbocsv validate data.csv
# Output: ERROR [TooFewFields] at row 42: Expected 5 fields, got 3

# Process specific range with comments
turbocsv head --from-line 5 --to-line 20 --comments data.csv

# Security: escape formula injection
turbocsv convert --escape-formulae data.csv -o safe.csv

# Handle duplicate headers
turbocsv head --duplicate-headers rename data.csv

See the full API documentation on GitHub.