
CSV Cleaner

De-dupe, trim, normalize, and fill in one pass — entirely in your browser


Real-world CSV files are messy. Export the same spreadsheet from two different tools and you get trailing whitespace in some cells, NA as a string in others, blank rows in the middle, and duplicate header rows pasted in by accident. This tool runs the four most common cleanup passes — de-duplication, whitespace trim, null normalization, and empty-cell fill — in a single sweep, entirely in your browser. Nothing leaves the tab.

Why Clean a CSV in the Browser

Most online CSV cleaners upload your file to a server and run the cleanup remotely. That is fine for fully public data, but it is the wrong default for the CSVs people actually have on their desktops: customer exports, internal payroll snapshots, sales lists, address books, and dozens of other categories where uploading to a stranger is the worst option. This tool runs the entire pipeline locally. The CSV is parsed by PapaParse in your browser, transformed in memory, and offered back as a download. No upload, no temporary file on someone else's disk, and the tab can be closed at any point without leaving a trace anywhere.
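
The whole round trip fits in a few lines of browser code. A minimal sketch, assuming PapaParse for parsing as the page states; `cleanFileLocally` and the `cleanRows` placeholder are illustrative names, not the tool's actual source:

```ts
import Papa from "papaparse";

// Placeholder for the four passes sketched in the next section.
function cleanRows(rows: string[][]): string[][] {
  return rows;
}

// Parse → transform → download, all inside the tab; nothing is uploaded.
function cleanFileLocally(file: File): void {
  Papa.parse<string[]>(file, {
    skipEmptyLines: true,
    complete: (results) => {
      const csv = Papa.unparse(cleanRows(results.data));
      const blob = new Blob([csv], { type: "text/csv" });
      const link = document.createElement("a");
      link.href = URL.createObjectURL(blob); // object URL, not a network URL
      link.download = "cleaned.csv";
      link.click();
      URL.revokeObjectURL(link.href);
    },
  });
}
```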

How the Four Cleanup Passes Work

First, every cell is trimmed of leading and trailing whitespace, including the byte-order mark that Windows Excel exports often add to the first cell. Second, common null markers are normalized: empty strings, NA, N/A, null, NULL, and a hyphen on its own all become true empty cells. Third, the fill pass replaces those empty cells either with a specified default (zero, an empty string, the literal text NULL) or with the previous non-empty value from the same column — useful for hierarchical exports where the parent value only appears in the first row of each group. Finally, duplicate rows are removed; by default the entire row is the key, but you can pick a subset of columns so rows with matching keys (and possibly different timestamps) collapse to one.
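
In code, the order matters: trim before null-matching (so "  NA " is caught), normalize before filling (so markers count as empty), and fill before de-duping (so filled rows compare equal). A compact sketch under those assumptions; the marker list mirrors this page's description, and the option names are invented for illustration:

```ts
// Assumes `rows` excludes the header row.
const NULL_MARKERS = new Set(["", "na", "n/a", "null", "-"]);

function trimCell(cell: string): string {
  // Strip a leading BOM (U+FEFF) along with ordinary whitespace.
  return cell.replace(/^\uFEFF/, "").trim();
}

interface CleanOptions {
  fillDefault?: string;   // e.g. "0", "", or the literal text "NULL"
  fillForward?: boolean;  // carry the previous non-empty value down
  keyCols?: number[];     // dedupe key; whole row when omitted
}

function cleanRows(rows: string[][], opts: CleanOptions = {}): string[][] {
  // Passes 1 and 2: trim every cell, then normalize null markers to "".
  const normalized = rows.map((row) =>
    row.map((cell) => {
      const t = trimCell(cell);
      return NULL_MARKERS.has(t.toLowerCase()) ? "" : t;
    })
  );

  // Pass 3: fill empty cells with a default, or forward-fill per column.
  const lastSeen: string[] = [];
  const filled = normalized.map((row) =>
    row.map((cell, col) => {
      if (cell !== "") {
        lastSeen[col] = cell;
        return cell;
      }
      return opts.fillForward ? lastSeen[col] ?? "" : opts.fillDefault ?? "";
    })
  );

  // Pass 4: drop duplicate rows, keyed on the whole row or chosen columns.
  const seen = new Set<string>();
  return filled.filter((row) => {
    const key = JSON.stringify(opts.keyCols?.map((i) => row[i]) ?? row);
    if (seen.has(key)) return false;
    seen.add(key);
    return true;
  });
}
```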

Use Cases That Justify a Cleanup Pass

Customer-list dedup before importing to a CRM is the single most common case — marketing exports stack up over time, and importing a CSV with 12% duplicates into HubSpot or Salesforce inflates contact counts and skews deduplication downstream. Form-submission exports routinely carry leading whitespace and stray N/A strings because users paste values in. Spreadsheet exports from Google Sheets often have trailing blank rows that break naive Python csv.reader loops. Survey results need null normalization before any statistical tool will treat them sensibly. The same four-pass cleanup that takes an hour of manual work in Excel becomes a two-second browser operation.

How We Compare to Desktop Tools

Excel and Google Sheets can do all four passes, but each requires several manual steps: Remove Duplicates, Find & Replace with regex, conditional column fills via VLOOKUP or array formulas. OpenRefine is the Cadillac for this kind of work and can do far more (faceted editing, clustering, undo history), but it is a desktop install with a learning curve. This tool covers the case where you know exactly what cleanup you want, your file is reasonably sized, and you would rather not spin up a separate application for a one-shot job. PapaParse 5.4 powers the parsing, the same library used by Tableau Public's CSV upload path and by hundreds of data-pipeline tools.

Pair this cleaner with the rest of the UDT data cluster: CSV Viewer for a quick read of the file, CSV Row Filter for predicate-based row selection, CSV to SQL to emit INSERT statements, and JSON ↔ CSV Converter for cross-format work. Use cleanup, then filter, then convert — the sequence that catches the most issues with the least re-work.

Frequently Asked Questions

Is the CSV uploaded anywhere during cleaning?
No. The file is parsed locally by PapaParse 5.4 running in your browser, transformed in memory, and offered back as a download. Nothing is sent to any server. You can verify this in your browser's Network tab while the tool runs.
Which markers count as null when normalizing?
By default: empty strings, NA, N/A, null, NULL, and a standalone hyphen -. The normalization is case-insensitive. You can customize the list in the options if your data uses other markers like #N/A (Excel error code) or None (Python pandas default).
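
A minimal sketch of the matching rule with those two custom markers added; `isNullMarker` is a hypothetical helper name, not the tool's API:

```ts
const DEFAULT_MARKERS = ["", "na", "n/a", "null", "-"];
const CUSTOM_MARKERS = [...DEFAULT_MARKERS, "#n/a", "none"];

function isNullMarker(cell: string, markers: string[] = CUSTOM_MARKERS): boolean {
  return markers.includes(cell.trim().toLowerCase()); // case-insensitive
}

isNullMarker("  N/A ");  // true
isNullMarker("#N/A");    // true — Excel error code, via the custom list
isNullMarker("0");       // false — a real value, never normalized
```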
How does the de-dupe step decide which row to keep?
By default, the first occurrence of each duplicate group is kept and the rest are dropped. If you select specific key columns, rows with matching keys collapse to one regardless of what the other columns contain. There is also a Last option that keeps the most recently seen row instead of the first — useful when later rows are presumed to override earlier ones.
Can I dedupe on a subset of columns?
Yes. By default the full row is the dedupe key. You can pick any combination of columns from a list, and two rows are considered duplicates if their selected key columns match. This is the right setting for cases like a customer list where the email column is the natural identity and other columns may legitimately differ between exports.
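
Both answers reduce to one small routine: build a key from the chosen columns (or the whole row) and decide whether a later occurrence overwrites an earlier one. An illustrative sketch, not the tool's source; rows are modeled as objects keyed by column name:

```ts
type Row = Record<string, string>;

function dedupe(rows: Row[], keyCols: string[], keep: "first" | "last" = "first"): Row[] {
  const byKey = new Map<string, Row>();
  for (const row of rows) {
    const key = keyCols.map((c) => row[c]).join("\u0000");
    // A Map keeps first-seen insertion order; with "last" we overwrite
    // the stored row so the newest contents win.
    if (keep === "last" || !byKey.has(key)) byKey.set(key, row);
  }
  return [...byKey.values()];
}

// Keep the newest row per email address:
// dedupe(rows, ["email"], "last");
```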
What does fill-forward do?
Fill-forward replaces empty cells with the most recent non-empty value from the same column. It is useful for hierarchical exports where a parent value (department, region, category) only appears in the first row of each group and the rest are blank. After fill-forward, every row carries its parent value explicitly, which is what every downstream tool actually needs.
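
On a single column the pass is a three-line loop. A standalone sketch with made-up data:

```ts
function fillForward(column: string[]): string[] {
  let last = "";
  return column.map((cell) => (cell === "" ? last : (last = cell)));
}

fillForward(["Sales", "", "", "Support", ""]);
// → ["Sales", "Sales", "Sales", "Support", "Support"]
```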
Does the cleaner detect the delimiter automatically?
Yes. PapaParse 5.4 auto-detects comma, semicolon, tab, and pipe delimiters by sampling the first few lines and picking the one that yields the most consistent column count. If the auto-detection picks the wrong delimiter (rare, but it can happen with files where the wrong character is more common in the data than between fields), you can override it manually.
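
In PapaParse terms, leaving the delimiter blank requests auto-detection, and the guessed character comes back in the parse metadata; passing an explicit character overrides the guess. A minimal illustration:

```ts
import Papa from "papaparse";

const text = "name;city\nAda;London\nGrace;Arlington";

// An empty `delimiter` asks PapaParse to guess; the winning character
// is reported back in the parse metadata.
const auto = Papa.parse(text, { delimiter: "" });
console.log(auto.meta.delimiter); // ";"

// Explicit override for the rare misdetection:
const forced = Papa.parse(text, { delimiter: "|" });
```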
How big a CSV can the cleaner handle?
There is no hard cap because the work runs in your browser. Practical limits are set by available memory; files up to a few hundred megabytes work fine on a modern laptop. Very large files (multi-gigabyte) may stall a single tab; in those cases split the file first, clean each piece, and concatenate the results.
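
If splitting is impractical, streaming is the usual workaround: PapaParse's step callback delivers rows one at a time, so memory stays bounded. A hedged sketch of that approach (not necessarily how this tool is implemented); `onRow` is a placeholder for whatever writes the cleaned output:

```ts
import Papa from "papaparse";

// Only the de-dupe key set stays resident; rows stream through.
function streamDedupe(file: File, onRow: (row: string[]) => void): void {
  const seen = new Set<string>();
  Papa.parse<string[]>(file, {
    skipEmptyLines: true,
    step: ({ data }) => {
      const key = JSON.stringify(data);
      if (!seen.has(key)) {
        seen.add(key);
        onRow(data);
      }
    },
    complete: () => console.log("done"),
  });
}
```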
Can I clean a TSV or pipe-delimited file?
Yes. Drop the file in and the parser auto-detects the delimiter. The output will use the same delimiter as the input by default. If you want to convert between delimiters at the same time as cleaning, choose the output delimiter explicitly in the options.
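
Delimiter conversion falls out of the parse/serialize split: auto-detect on the way in, set the delimiter explicitly on the way out via Papa.unparse. For illustration:

```ts
import Papa from "papaparse";

const tsvText = "sku\tqty\nA-100\t4\nB-200\t7";

// Read as TSV (auto-detected), write back as comma-separated.
const rows = Papa.parse<string[]>(tsvText, { delimiter: "" }).data;
const csvOut = Papa.unparse(rows, { delimiter: "," });
// csvOut === "sku,qty\r\nA-100,4\r\nB-200,7"
```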

Built by Derek Giordano · Part of Ultimate Design Tools
