High-performance CSV parser with SIMD acceleration

Built with Zig for native performance. Now with Fast Mode, security hardening, and 22 CLI flags.

bun add turbocsv
  • 269 MB/s peak throughput
  • 5.8x faster than fast-csv
  • 6.35x faster than csv-parse

Performance Comparison

Throughput on an Apple M1 Pro across file sizes from 1K to 100K rows (the 100K-row file is ~10 MB). Higher is better.

Library     1K rows       10K rows      100K rows     100K (wide)
TurboCSV    122.6 MB/s    165.3 MB/s    176.1 MB/s    269.3 MB/s
PapaParse   84.0 MB/s     109.3 MB/s    112.0 MB/s    224.6 MB/s
csv-parse   25.2 MB/s     34.9 MB/s     35.3 MB/s     40.3 MB/s
fast-csv    24.8 MB/s     28.7 MB/s     30.2 MB/s     38.1 MB/s

Run the benchmark yourself: bun run benchmark:compare

Features

SIMD Acceleration

ARM64 NEON and x86 SSE2 vector instructions for parallel character scanning at native speed.

Security & Validation

CSV injection protection, structured error reporting, skip bad rows, size limits, column relaxation.

DataFrame API

Pandas-like operations: select, filter, sort, groupBy, join with lazy evaluation.

Fast Mode

TypeScript-only parser for clean data. Dynamic typing, cast functions, nested JSON support.

22 CLI Flags

All parser options accessible via CLI: trim, comments, range processing, error handling, and more.

Memory-Mapped Files

Process files larger than RAM. Zero-copy parsing keeps data out of the JS heap.

RFC 4180 Compliant

Full support for quoted fields, escaped quotes, and multi-line values.
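
These three behaviors interact: a quoted field may itself contain escaped quotes and embedded newlines. The core rule set can be sketched as a character-by-character state machine. The function below is an illustrative TypeScript reimplementation of the RFC 4180 semantics, not TurboCSV's SIMD code path:

```typescript
// Minimal RFC 4180 field splitter: quoted fields, "" escapes, multi-line values.
function parseCsv(text: string): string[][] {
  const rows: string[][] = [];
  let row: string[] = [];
  let field = "";
  let inQuotes = false;
  for (let i = 0; i < text.length; i++) {
    const c = text[i];
    if (inQuotes) {
      if (c === '"') {
        if (text[i + 1] === '"') { field += '"'; i++; } // "" -> literal quote
        else inQuotes = false;                          // closing quote
      } else field += c;                                // newlines stay in the field
    } else if (c === '"') inQuotes = true;
    else if (c === ",") { row.push(field); field = ""; }
    else if (c === "\n") { row.push(field); rows.push(row); row = []; field = ""; }
    else if (c !== "\r") field += c;
  }
  if (field.length > 0 || row.length > 0) { row.push(field); rows.push(row); }
  return rows;
}
```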

Cross-Platform

Native binaries for macOS, Linux, Windows. WASM fallback for universal compatibility.

What's New in v0.3.0

Major feature release — 22 new CLI flags, security hardening, Fast Mode, and robust error handling.

Security & Validation

  • CSV injection protection with escapeFormulae
  • Structured errors with type, code, row fields
  • skipRecordsWithError — silently drop malformed rows
  • maxRecordSize — reject oversized rows
  • Flexible column count handling (relax constraints)
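
escapeFormulae follows the standard CSV-injection defense: prefix any cell that a spreadsheet would evaluate as a formula with a single quote. A minimal sketch of that rule (TurboCSV's exact trigger set is an assumption here):

```typescript
// Cells beginning with =, +, -, @ (or tab/CR) can execute as formulas when the
// CSV is opened in a spreadsheet; a leading single quote neutralizes them.
const FORMULA_TRIGGERS = new Set(["=", "+", "-", "@", "\t", "\r"]);

function escapeFormula(cell: string): string {
  return FORMULA_TRIGGERS.has(cell[0]) ? "'" + cell : cell;
}
```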

Whitespace & Processing

  • trim, ltrim, rtrim — strip whitespace
  • Greedy empty row skipping with skipEmptyRows
  • fromLine / toLine — parse file ranges
  • Comment support with comments: true
  • Skip rows where all fields are empty
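
Together these options act as a pre-filter over raw records. An illustrative sketch of the combined semantics (option names mirror the list above; the implementation is hypothetical, not TurboCSV's native code):

```typescript
interface LineOptions {
  trim?: boolean;     // strip surrounding whitespace from each field
  comments?: boolean; // drop lines starting with '#'
  fromLine?: number;  // 1-based inclusive start of the range to parse
  toLine?: number;    // 1-based inclusive end of the range to parse
}

function preprocess(records: string[][], opts: LineOptions): string[][] {
  return records
    .filter((_, i) => {
      const line = i + 1;
      if (opts.fromLine && line < opts.fromLine) return false;
      if (opts.toLine && line > opts.toLine) return false;
      return true;
    })
    .filter(r => !(opts.comments && r[0]?.startsWith("#")))
    .map(r => (opts.trim ? r.map(f => f.trim()) : r))
    .filter(r => r.some(f => f.length > 0)); // skip rows where all fields are empty
}
```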

Fast Mode & Typing

  • Fast Mode — TypeScript-only parser for clean data
  • Dynamic typing — auto-convert to numbers/booleans
  • Cast functions — per-column type transformers
  • flatten() / unflatten() for nested JSON
  • unparse() with flattenObjects option
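
The dot-path convention for flatten() and the number/boolean coercion rules for dynamic typing are assumptions here; a sketch of the likely semantics:

```typescript
// Convert nested objects to single-level, dot-path keys (assumed delimiter: ".").
function flatten(obj: Record<string, unknown>, prefix = ""): Record<string, unknown> {
  const out: Record<string, unknown> = {};
  for (const [key, value] of Object.entries(obj)) {
    const path = prefix ? `${prefix}.${key}` : key;
    if (value !== null && typeof value === "object" && !Array.isArray(value)) {
      Object.assign(out, flatten(value as Record<string, unknown>, path));
    } else {
      out[path] = value;
    }
  }
  return out;
}

// Dynamic typing: best-effort conversion of raw strings (assumed rule set).
function dynamicType(raw: string): string | number | boolean {
  if (raw === "true") return true;
  if (raw === "false") return false;
  const n = Number(raw);
  return raw !== "" && !Number.isNaN(n) ? n : raw;
}
```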

Advanced Features

  • Duplicate header handling (rename or error)
  • beforeFirstChunk — transform raw data
  • onRecord — per-record filtering/transform
  • Fixed SIMD quote handling bug
  • 22 CLI flags — all parser options available
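
onRecord behaves as a combined map/filter over parsed records; the contract sketched below (return a record to keep or transform it, return null to drop it) is an assumption based on the description above:

```typescript
type Row = Record<string, string>;
type OnRecord = (rec: Row, rowIndex: number) => Row | null;

// Apply a per-record hook: the hook may transform a record or drop it via null.
function applyOnRecord(records: Row[], hook: OnRecord): Row[] {
  const out: Row[] = [];
  records.forEach((rec, i) => {
    const result = hook(rec, i);
    if (result !== null) out.push(result);
  });
  return out;
}
```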

Quick Start

Basic parsing

import { CSVParser } from "turbocsv";

const parser = new CSVParser("data.csv");

for (const row of parser) {
  console.log(row.get("name"), row.get("email"));
}

parser.close();

Error handling & security

import { CSVParser, unparse, flatten } from "turbocsv";

// Robust parsing with error handling
const parser = new CSVParser("messy.csv", {
  trim: true,                     // Clean whitespace
  skipRecordsWithError: true,     // Skip bad rows
  comments: true,                  // Skip # prefixed lines
  duplicateHeaders: "rename",      // Handle duplicate columns
  dynamicTyping: true,             // Auto-convert types
  maxRecordSize: 10000,            // Reject huge rows
  cast: {                          // Custom transformers
    price: (val) => parseFloat(val.replace("$", "")),
    date: (val) => new Date(val)
  }
});

// Process with structured error handling
for (const row of parser) {
  try {
    processRow(row);
  } catch (error) {
    if (error.code === "TooFewFields") {
      console.log(`Row ${error.row}: Missing fields`);
    }
  }
}

// Secure CSV output
const csv = unparse(data, {
  escapeFormulae: true,    // Prevent CSV injection
  flattenObjects: true     // Handle nested JSON
});

DataFrame operations

import { CSVParser } from "turbocsv";

const parser = new CSVParser("data.csv");
const df = parser.toDataFrame();

// Chain operations
const result = df
  .filter(row => row.age > 18)
  .select("name", "email", "age")
  .sorted("name", "asc")
  .first(100);

// Aggregation
const grouped = df.groupBy("department", [
  { col: "salary", fn: "mean" },
  { col: "id", fn: "count" },
]);

parser.close();

CLI usage

# Trim whitespace and skip bad rows
turbocsv head --trim --skip-errors data.csv

# Fast mode with dynamic typing
turbocsv head --fast --dynamic-typing --format json data.csv

# Validate with structured error reporting
turbocsv validate data.csv
# Output: ERROR [TooFewFields] at row 42: Expected 5 fields, got 3

# Process specific range with comments
turbocsv head --from-line 5 --to-line 20 --comments data.csv

# Security: escape formula injection
turbocsv convert --escape-formulae data.csv -o safe.csv

# Handle duplicate headers
turbocsv head --duplicate-headers rename data.csv

See the full API documentation on GitHub.