gh-152415: Exercise curses non-ASCII tests under 8-bit locale encodings by serhiy-storchaka · Pull Request #152416 · python/cpython

serhiy-storchaka · 2026-06-27T13:37:58Z

Issue: Extend curses tests to cover non-ASCII characters under 8-bit locales #152415

The non-ASCII tests in test_curses only exercised what the test runner's locale could encode, in practice UTF-8, so the byte-oriented (8-bit locale) code paths were barely tested and several text-accepting methods were tested only with ASCII.

This extends the character and string I/O tests with cases for 8-bit encodings, each guarded by the existing encodability check (skipped when the current locale cannot represent it): ASCII, a character common to the Latin encodings (é), and ones distinctive to a single encoding (byte 0xA4 is ¤ in ISO-8859-1, € in ISO-8859-15, є in KOI8-U). Running the whole suite under different locales (LANG=en_US.ISO8859-1, en_US.ISO8859-15, uk_UA.koi8u) covers those encodings.
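As a rough illustration of why byte 0xA4 is a useful probe, the sketch below decodes it under the three encodings and shows a simple encodability check. The helper name `encodable` is hypothetical and is not the check used in test_curses; only Python's standard codecs are assumed.

```python
# Illustrative sketch; `encodable` is a hypothetical helper, not the
# actual check used in test_curses.
ENCODINGS = ("iso8859_1", "iso8859_15", "koi8_u")

# The same byte decodes to a different character in each encoding.
decoded = {enc: bytes([0xA4]).decode(enc) for enc in ENCODINGS}

def encodable(ch, encoding):
    """Return True if `ch` can be represented in `encoding`."""
    try:
        ch.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True

for enc, ch in decoded.items():
    # A case for `ch` would run only under a locale that can encode it.
    usable = [e for e in ENCODINGS if encodable(ch, e)]
    print(f"{enc}: 0xA4 -> U+{ord(ch):04X} {ch!a}; encodable in {usable}")
```

A test case for a given character would then be skipped whenever the check returns False for the current locale's encoding.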

It also fills read-side and other gaps found by an audit of the text-accepting API: inch/instr, get_wstr (previously untested), getbkgd/getbkgrnd, unctrl, the default border()/box() ACS cells, and characters given as chtype ints > 127.

A build-specific note is captured in the comments: on a wide build, inch and int (chtype) characters round-trip only Latin-1 code points (the wide build stores text/ints through the locale), so those assertions are guarded by ord < 0x100, while instr covers the full set.
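The effect of the `ord < 0x100` guard on the characters named above can be sketched as follows. The sample list and the function name are assumptions made for illustration and do not come from the test file.

```python
# Hypothetical sample set mirroring the characters named in the description.
SAMPLES = ["A", "é", "¤", "€", "є"]

def passes_latin1_guard(ch):
    """Mirror the `ord < 0x100` guard: True for Latin-1 code points only."""
    return ord(ch) < 0x100

# Characters whose inch()/chtype assertions would run on a wide build.
checked = [ch for ch in SAMPLES if passes_latin1_guard(ch)]
# Characters left to the instr()-based assertions.
skipped = [ch for ch in SAMPLES if not passes_latin1_guard(ch)]

print([f"U+{ord(ch):04X}" for ch in checked])
print([f"U+{ord(ch):04X}" for ch in skipped])
```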

Test-only; no behaviour change. Verified on wide and narrow (ncursesw-disabled) builds under UTF-8, ISO-8859-1, ISO-8859-15 and KOI8-U.

…ncodings

The non-ASCII tests only exercised what the runner's locale could encode (in practice UTF-8). Add 8-bit-encoding cases to the character and string I/O tests, each guarded by the existing encodability check: ASCII, a character common to the Latin encodings ('é'), and ones distinctive to a single encoding (byte 0xA4 is '¤' in ISO-8859-1, '€' in ISO-8859-15, 'є' in KOI8-U). Run the whole suite under different locales to cover them; unrepresentable cases skip.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

…haracter

Read each written character back with in_wch() or instr() rather than inch(), which on a wide build returns the low byte of the code point instead of the locale-encoded byte and so mangles a non-ASCII character of an 8-bit locale. This lets the int-argument cases cover '€'/'є', and adds matching coverage for the str argument.

insch() with an int byte > 127 is checked only for Latin-1: on a wide build ncurses winsch stores a printable byte directly as a code point instead of decoding it through the locale.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
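The mismatch described in this commit message can be shown with plain arithmetic, without touching curses at all. This is a minimal sketch: `low_byte` and `locale_byte` are hypothetical names, and the model of inch() as "low byte of the code point" is taken from the message above rather than verified against ncurses.

```python
def low_byte(ch):
    """Low 8 bits of the code point (what inch() is said to return on a wide build)."""
    return ord(ch) & 0xFF

def locale_byte(ch, encoding):
    """The single byte that represents `ch` in an 8-bit encoding."""
    return ch.encode(encoding)[0]

cases = [("é", "iso8859_1"), ("€", "iso8859_15"), ("є", "koi8_u")]
for ch, enc in cases:
    lo, enc_byte = low_byte(ch), locale_byte(ch, enc)
    status = "match" if lo == enc_byte else "mismatch"
    print(f"{ch!a} in {enc}: low byte 0x{lo:02X}, encoded byte 0x{enc_byte:02X} ({status})")
```

Only the Latin-1 character agrees in both columns, which is consistent with restricting the inch()-style assertions to Latin-1 and reading the other characters back through in_wch() or instr().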

miss-islington-app · 2026-06-27T19:16:36Z

Thanks @serhiy-storchaka for the PR 🌮🎉.. I'm working now to backport this PR to: 3.15.
🐍🍒⛏🤖

miss-islington-app · 2026-06-27T19:16:42Z

Sorry, @serhiy-storchaka, I could not cleanly backport this to 3.15 due to a conflict.
Please backport using cherry_picker on command line.

cherry_picker 003d3620cc0f44caca7bf26c3e6964f5f379645f 3.15

bedevere-app · 2026-06-27T20:09:50Z

GH-152453 is a backport of this pull request to the 3.15 branch.

…encodings (GH-152416) (#152453)

The non-ASCII tests only exercised what the runner's locale could encode (in practice UTF-8). Add 8-bit-encoding cases to the character and string I/O tests, each guarded by the existing encodability check: ASCII, a character common to the Latin encodings ('é'), and ones distinctive to a single encoding (byte 0xA4 is '¤' in ISO-8859-1, '€' in ISO-8859-15, 'є' in KOI8-U). Run the whole suite under different locales to cover them; unrepresentable cases skip.

* gh-152415: Verify character output round-trips in test_output_character

Read each written character back with in_wch() or instr() rather than inch(), which on a wide build returns the low byte of the code point instead of the locale-encoded byte and so mangles a non-ASCII character of an 8-bit locale. This lets the int-argument cases cover '€'/'є', and adds matching coverage for the str argument.

insch() with an int byte > 127 is checked only for Latin-1: on a wide build ncurses winsch stores a printable byte directly as a code point instead of decoding it through the locale.

(cherry picked from commit 003d362)

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>

…encodings (GH-152416) (GH-152453) (GH-152457)

The non-ASCII tests only exercised what the runner's locale could encode (in practice UTF-8). Add 8-bit-encoding cases to the character and string I/O tests, each guarded by the existing encodability check: ASCII, a character common to the Latin encodings ('é'), and ones distinctive to a single encoding (byte 0xA4 is '¤' in ISO-8859-1, '€' in ISO-8859-15, 'є' in KOI8-U). Run the whole suite under different locales to cover them; unrepresentable cases skip.

* gh-152415: Verify character output round-trips in test_output_character

Read each written character back with in_wch() or instr() rather than inch(), which on a wide build returns the low byte of the code point instead of the locale-encoded byte and so mangles a non-ASCII character of an 8-bit locale. This lets the int-argument cases cover '€'/'є', and adds matching coverage for the str argument.

insch() with an int byte > 127 is checked only for Latin-1: on a wide build ncurses winsch stores a printable byte directly as a code point instead of decoding it through the locale.

(cherry picked from commit 003d362)
(cherry picked from commit a75aa41)

Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>

…encodings (GH-152416) (GH-152453) (GH-152456)

The non-ASCII tests only exercised what the runner's locale could encode (in practice UTF-8). Add 8-bit-encoding cases to the character and string I/O tests, each guarded by the existing encodability check: ASCII, a character common to the Latin encodings ('é'), and ones distinctive to a single encoding (byte 0xA4 is '¤' in ISO-8859-1, '€' in ISO-8859-15, 'є' in KOI8-U). Run the whole suite under different locales to cover them; unrepresentable cases skip.

* gh-152415: Verify character output round-trips in test_output_character

Read each written character back with in_wch() or instr() rather than inch(), which on a wide build returns the low byte of the code point instead of the locale-encoded byte and so mangles a non-ASCII character of an 8-bit locale. This lets the int-argument cases cover '€'/'є', and adds matching coverage for the str argument.

insch() with an int byte > 127 is checked only for Latin-1: on a wide build ncurses winsch stores a printable byte directly as a code point instead of decoding it through the locale.

(cherry picked from commit 003d362)
(cherry picked from commit a75aa41)

Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>

bedevere-app Bot added the tests (Tests in the Lib/test dir) label Jun 27, 2026

bedevere-app Bot mentioned this pull request Jun 27, 2026

Extend curses tests to cover non-ASCII characters under 8-bit locales #152415

Closed

bedevere-app Bot added the awaiting core review label Jun 27, 2026

serhiy-storchaka added the needs backport to 3.15 (pre-release feature fixes, bugs and security fixes) and skip news labels Jun 27, 2026

serhiy-storchaka merged commit 003d362 into python:main Jun 27, 2026
47 of 48 checks passed

bedevere-app Bot removed the awaiting core review label Jun 27, 2026

serhiy-storchaka deleted the curses-test-8bit-locales branch June 27, 2026 19:16

miss-islington-app Bot assigned serhiy-storchaka Jun 27, 2026

bedevere-app Bot removed the needs backport to 3.15 (pre-release feature fixes, bugs and security fixes) label Jun 27, 2026

serhiy-storchaka removed their assignment Jun 27, 2026

serhiy-storchaka merged 2 commits into python:main from serhiy-storchaka:curses-test-8bit-locales
