fix: drop partial multibyte char when write length is too small by spokodev · Pull Request #374 · feross/buffer

spokodev · 2026-06-25T17:21:51Z

Problem

Buffer#write(string, offset, length, encoding) with an explicit length smaller than the next character's byte length writes a partial, corrupt multibyte sequence and over-reports the byte count.

The Node docs state: "If buf did not contain enough space to fit the entire string, only part of string will be written. However, partially encoded characters will not be written." The return value counts only complete characters.

Repro (node Buffer oracle vs this shim)

const Node = require('buffer').Buffer   // built-in oracle
const Shim = require('buffer/')         // this package

// 'é' is 2 utf8 bytes (c3 a9); length 1 cannot hold it
Node.alloc(8, 0).write('é', 0, 1, 'utf8')  // 0, buffer stays 00 00 ...
Shim.alloc(8, 0).write('é', 0, 1, 'utf8')  // 1, buffer = c3 00 ...   (lone lead byte)

// '€' is 3 utf8 bytes (e2 82 ac); length 2 cannot hold it
Node.alloc(8, 0).write('€', 0, 2, 'utf8')  // 0, buffer stays 00 00 ...
Shim.alloc(8, 0).write('€', 0, 2, 'utf8')  // 2, buffer = e2 82 00 ...

// 'ab' utf16le is 4 bytes; length 3 holds one full 2-byte unit only
Node.alloc(8, 0).write('ab', 0, 3, 'utf16le')  // 2, buffer = 61 00 00 ...
Shim.alloc(8, 0).write('ab', 0, 3, 'utf16le')  // 3, buffer = 61 00 62 ...

Root cause

utf8Write and ucs2Write passed buf.length - offset (the whole remaining buffer) as the byte cap to utf8ToBytes / utf16leToBytes, ignoring the explicit length:

function utf8Write (buf, string, offset, length) {
  return blitBuffer(utf8ToBytes(string, buf.length - offset), buf, offset, length)
}

So the encoders always generated the full multibyte sequence for the next character, and blitBuffer then copied only length bytes of it, truncating the character mid-sequence and returning the truncated byte count.

utf8ToBytes and utf16leToBytes already cap output atomically per character via their units argument (a multibyte char is pushed only if all of its bytes fit). Passing the effective length as that cap makes the encoder drop an incomplete trailing character instead of emitting a partial one.

Fix

function utf8Write (buf, string, offset, length) {
  return blitBuffer(utf8ToBytes(string, length), buf, offset, length)
}

function ucs2Write (buf, string, offset, length) {
  return blitBuffer(utf16leToBytes(string, length), buf, offset, length)
}

Verification

Added a test in test/write.js asserting node parity for the cases above. It fails on master (6 failing assertions) and passes with the fix.
Full suite green: npm test reports all assertions passing.
Differential sweep against the built-in Buffer over 5 encodings, 18 strings (ASCII, multibyte, surrogate pairs, emoji, lone surrogates), offsets 0 to 4, and lengths 0 to 12 (5940 cases): zero mismatches in return value and buffer contents.

Closed issue #48 covered the implicit length case (a buffer too small for the whole string, no length argument); that path was already fixed by the buf.length - offset cap. This is the distinct case where an explicit length argument is smaller than a single character.

Buffer#write(string, offset, length, encoding) with an explicit length smaller than the next character's byte length wrote a partial, corrupt multibyte sequence and over-reported the byte count. utf8Write/ucs2Write passed buf.length - offset (the whole remaining buffer) as the byte cap to utf8ToBytes/utf16leToBytes, so a full multibyte char was always generated and then byte-truncated mid-sequence by blitBuffer. Node guarantees partially encoded characters are never written and counts only complete characters in the return value. Pass the effective length as the cap so an incomplete trailing character is dropped, matching node Buffer.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix: drop partial multibyte char when write length is too small#374

fix: drop partial multibyte char when write length is too small#374
spokodev wants to merge 1 commit into
feross:masterfrom
spokodev:fix/write-explicit-length-partial-char

spokodev commented Jun 25, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

spokodev commented Jun 25, 2026

Problem

Repro (node Buffer oracle vs this shim)

Root cause

Fix

Verification

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant