[3.15] gh-152415: Exercise curses non-ASCII tests under 8-bit locale encodings (GH-152416) #152453

Merged
serhiy-storchaka merged 1 commit into python:3.15 from serhiy-storchaka:backport-003d362-3.15
Jun 27, 2026

@serhiy-storchaka serhiy-storchaka commented Jun 27, 2026

Member

The non-ASCII tests only exercised what the runner's locale could encode (in practice UTF-8). Add 8-bit-encoding cases to the character and string I/O tests, each guarded by the existing encodability check: ASCII, a character common to the Latin encodings ('é'), and ones distinctive to a single encoding (byte 0xA4 is '¤' in ISO-8859-1, '€' in ISO-8859-15, 'є' in KOI8-U). Run the whole suite under different locales to cover them; unrepresentable cases skip.
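
As a rough illustration of the byte 0xA4 example above, the following sketch uses only Python's standard codecs and does not touch curses. The `encodable()` helper is a hypothetical stand-in for the suite's encodability check, not the helper the tests actually use, and the sample character list is chosen here for illustration.

```python
# Byte 0xA4 maps to a different character in each 8-bit encoding.
BYTE = b"\xa4"
ENCODINGS = ["iso8859-1", "iso8859-15", "koi8-u"]

decoded = {enc: BYTE.decode(enc) for enc in ENCODINGS}


def encodable(text, encoding):
    """Hypothetical guard: can `text` be represented in `encoding`?"""
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


# Which sample characters a run under each encoding could exercise;
# the rest would be skipped as unrepresentable.
SAMPLES = ["a", "é", "¤", "€", "є"]
coverage = {
    enc: [ch for ch in SAMPLES if encodable(ch, enc)]
    for enc in ENCODINGS
}

for enc in ENCODINGS:
    print(enc, repr(decoded[enc]), coverage[enc])
```

No single locale covers every sample, which is why the description suggests running the whole suite under several locales.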

Read each written character back with in_wch() or instr() rather than inch(), which on a wide build returns the low byte of the code point instead of the locale-encoded byte and so mangles a non-ASCII character of an 8-bit locale. This lets the int-argument cases cover '€'/'є', and adds matching coverage for the str argument.
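
The mismatch described above can be modelled in plain Python, without calling curses. In this sketch `low_byte()` and `locale_byte()` are hypothetical helpers: the first mimics what the description says inch() returns on a wide build, the second computes the byte the locale encoding would actually use.

```python
def low_byte(ch):
    # Model of the described wide-build inch() result:
    # the low 8 bits of the Unicode code point.
    return ord(ch) & 0xFF


def locale_byte(ch, encoding):
    # The single byte that represents `ch` in an 8-bit locale encoding.
    return ch.encode(encoding)[0]


CASES = [("é", "iso8859-1"), ("€", "iso8859-15"), ("є", "koi8-u")]

for ch, enc in CASES:
    print(
        f"{ch!r} in {enc}: low byte {low_byte(ch):#04x}, "
        f"locale byte {locale_byte(ch, enc):#04x}"
    )
```

For 'é' the two values coincide (0xE9), because Latin-1 bytes equal their code points. For '€' (U+20AC) and 'є' (U+0454) the low bytes are 0xAC and 0x54, while the locale byte is 0xA4 in both cases, so a comparison based on the low byte cannot succeed.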

insch() with an int byte > 127 is checked only for Latin-1: on a wide build ncurses winsch stores a printable byte directly as a code point instead of decoding it through the locale.
(cherry picked from commit 003d362)
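
A small model of the insch() restriction, again in plain Python and under the assumption that the stored cell is simply `chr(byte)` as the description states: the stored character matches the locale's decoding of the byte only for Latin-1.

```python
BYTE = 0xA4

# Model of the described wide-build behaviour: the byte is stored
# directly as a code point, with no locale decoding.
stored = chr(BYTE)

# Compare with what decoding through each locale encoding would give.
matches = {}
for enc in ("iso8859-1", "iso8859-15", "koi8-u"):
    expected = bytes([BYTE]).decode(enc)
    matches[enc] = stored == expected
    print(f"{enc}: stored {stored!r}, locale-decoded {expected!r}")
```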

…ocale encodings (pythonGH-152416)

The non-ASCII tests only exercised what the runner's locale could encode (in
practice UTF-8).  Add 8-bit-encoding cases to the character and string I/O
tests, each guarded by the existing encodability check: ASCII, a character
common to the Latin encodings ('é'), and ones distinctive to a single encoding
(byte 0xA4 is '¤' in ISO-8859-1, '€' in ISO-8859-15, 'є' in KOI8-U).  Run the
whole suite under different locales to cover them; unrepresentable cases skip.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* pythongh-152415: Verify character output round-trips in test_output_character

Read each written character back with in_wch() or instr() rather than
inch(), which on a wide build returns the low byte of the code point
instead of the locale-encoded byte and so mangles a non-ASCII character
of an 8-bit locale.  This lets the int-argument cases cover '€'/'є', and
adds matching coverage for the str argument.

insch() with an int byte > 127 is checked only for Latin-1: on a wide
build ncurses winsch stores a printable byte directly as a code point
instead of decoding it through the locale.
(cherry picked from commit 003d362)

Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
bedevere-app Bot added the "tests" label (Tests in the Lib/test dir) Jun 27, 2026
serhiy-storchaka enabled auto-merge (squash) June 27, 2026 20:10
serhiy-storchaka added the "needs backport to 3.13" and "needs backport to 3.14" labels (bugs and security fixes) Jun 27, 2026
serhiy-storchaka merged commit a75aa41 into python:3.15 Jun 27, 2026
59 of 60 checks passed
@miss-islington-app

Thanks @serhiy-storchaka for the PR 🌮🎉.. I'm working now to backport this PR to: 3.13, 3.14.
🐍🍒⛏🤖


bedevere-app Bot commented Jun 27, 2026

GH-152456 is a backport of this pull request to the 3.14 branch.

bedevere-app Bot removed the "needs backport to 3.14" label (bugs and security fixes) Jun 27, 2026

bedevere-app Bot commented Jun 27, 2026

GH-152457 is a backport of this pull request to the 3.13 branch.

bedevere-app Bot removed the "needs backport to 3.13" label (bugs and security fixes) Jun 27, 2026
serhiy-storchaka added a commit that referenced this pull request Jun 27, 2026
…encodings (GH-152416) (GH-152453) (GH-152457)

(cherry picked from commit 003d362)
(cherry picked from commit a75aa41)

Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
serhiy-storchaka added a commit that referenced this pull request Jun 27, 2026
…encodings (GH-152416) (GH-152453) (GH-152456)

(cherry picked from commit 003d362)
(cherry picked from commit a75aa41)

Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
Labels

skip news, tests (Tests in the Lib/test dir)
