If the error handler is used, a new bytes object is created to set as
the object attribute of UnicodeDecodeError, and that bytes object then
replaces the original data. A pointer to the decoded data will became invalid
after destroying that temporary bytes object. So we need other way to return
the first invalid escape from _PyUnicode_DecodeUnicodeEscapeInternal().
_PyBytes_DecodeEscape() does not have such issue, because it does not
use the error handlers registry, but it should be changed for compatibility
with _PyUnicode_DecodeUnicodeEscapeInternal().
(cherry picked from commit 9f69a58623bd01349a18ba0c7a9cb1dad6a51e8e)
(cherry picked from commit 6279eb8c076d89d3739a6edb393e43c7929b429d)
gh-131762: Fixed dereferencing the pointer 'parser_token->metadata' with a NULL value (GH-131764)
(cherry picked from commit 2c686a9ac243800b630d4a09622c8eb789f5b354)
Co-authored-by: rialbat <47256826+rialbat@users.noreply.github.com>
gh-125331: Allow the parser to activate future imports on the fly (GH-125482)
(cherry picked from commit 3bd3e09588bfde7edba78c55794a0e28e2d21ea5)
Co-authored-by: Pablo Galindo Salgado <Pablogsal@gmail.com>
gh-130740: Move some `stdbool.h` includes after `Python.h` (#130738)
Move some `#include <stdbool.h>` after `#include "Python.h"` when `pyconfig.h` is not
included first and when we are in a platform-agnostic context. This is to avoid having
features defined by `stdbool.h` before those decided by `Python.h` (this caused some
build failures when compiling CPython with `zig cc`).
(cherry-picked from commit 214562ed4ddc248b007f718ed92ebcc0c3669611)
---------
Co-authored-by: Hugo Beauzée-Luyssen <hugo@beauzee.fr>
[3.12] gh-126105: Fix crash in `ast` module, when `._fields` is deleted (GH-126115)
Previously, if the `ast.AST._fields` attribute was deleted, attempts to create a new `as`t node would crash due to the assumption that `_fields` always had a non-NULL value. Now it has been fixed by adding an extra check to ensure that `_fields` does not have a NULL value (this can happen when you manually remove `_fields` attribute).
(cherry picked from commit b2eaa75b176e07730215d76d8dce4d63fb493391)
Co-authored-by: sobolevn <mail@sobolevn.me>
This backports several PRs for gh-113993, making interned strings mortal so they can be garbage-collected when no longer needed.
* Allow interned strings to be mortal, and fix related issues (GH-120520)
* Add an InternalDocs file describing how interning should work and how to use it.
* Add internal functions to *explicitly* request what kind of interning is done:
- `_PyUnicode_InternMortal`
- `_PyUnicode_InternImmortal`
- `_PyUnicode_InternStatic`
* Switch uses of `PyUnicode_InternInPlace` to those.
* Disallow using `_Py_SetImmortal` on strings directly.
You should use `_PyUnicode_InternImmortal` instead:
- Strings should be interned before immortalization, otherwise you're possibly
interning a immortalizing copy.
- `_Py_SetImmortal` doesn't handle the `SSTATE_INTERNED_MORTAL` to
`SSTATE_INTERNED_IMMORTAL` update, and those flags can't be changed in
backports, as they are now part of public API and version-specific ABI.
* Add private `_only_immortal` argument for `sys.getunicodeinternedsize`, used in refleak test machinery.
Make sure the statically allocated string singletons are unique. This means these sets are now disjoint:
- `_Py_ID`
- `_Py_STR` (including the empty string)
- one-character latin-1 singletons
Now, when you intern a singleton, that exact singleton will be interned.
* Add a `_Py_LATIN1_CHR` macro, use it instead of `_Py_ID`/`_Py_STR` for one-character latin-1 singletons everywhere (including Clinic).
* Intern `_Py_STR` singletons at startup.
* Beef up the tests. Cover internal details (marked with `@cpython_only`).
* Add lots of assertions
* Don't immortalize in PyUnicode_InternInPlace; keep immortalizing in other API (GH-121364)
* Switch PyUnicode_InternInPlace to _PyUnicode_InternMortal, clarify docs
* Document immortality in some functions that take `const char *`
This is PyUnicode_InternFromString;
PyDict_SetItemString, PyObject_SetAttrString;
PyObject_DelAttrString; PyUnicode_InternFromString;
and the PyModule_Add convenience functions.
Always point out a non-immortalizing alternative.
* Don't immortalize user-provided attr names in _ctypes
* Immortalize names in code objects to avoid crash (GH-121903)
* Intern latin-1 one-byte strings at startup (GH-122303)
There are some 3.12-specific changes, mainly to allow statically allocated strings in deepfreeze. (In 3.13, deepfreeze switched to the general `_Py_ID`/`_Py_STR`.)
Co-authored-by: Eric Snow <ericsnowcurrently@gmail.com>
* gh-123321: Fix Parser/myreadline.c to prevent a segfault during a multi-threaded race (GH-123323)
(cherry picked from commit a4562fedadb73fe1e978dece65c3bcefb4606678)
Co-authored-by: Bar Harel <bharel@barharel.com>
* Remove @requires_gil_enabled for 3.12
---------
Co-authored-by: Bar Harel <bharel@barharel.com>
Co-authored-by: Sam Gross <colesbury@gmail.com>
- Cache line object to avoid creating a Unicode object
for all of the tokens in the same line.
- Speed up byte offset to column offset conversion by using the
smallest buffer possible to measure the difference.
(cherry picked from commit d87b0151062e36e67f9e42e1595fba5bf23a485c)
Co-authored-by: Lysandros Nikolaou <lisandrosnik@gmail.com>
Co-authored-by: Pablo Galindo <pablogsal@gmail.com>
gh-113602: Bail out when the parser tries to override existing errors (GH-113607)
(cherry picked from commit 9ed36d533ab8b256f0a589b5be6d7a2fdcf4aff2)
Signed-off-by: Pablo Galindo <pablogsal@gmail.com>
Co-authored-by: Pablo Galindo Salgado <Pablogsal@gmail.com>
(cherry picked from commit 48c49739f5502fc7aa82f247ab2e4d7b55bdca62)
Co-authored-by: Yilei Yang <yileiyang@google.com>
Co-authored-by: Gregory P. Smith [Google LLC] <greg@krypto.org>
gh-112388: Fix an error that was causing the parser to try to overwrite tokenizer errors (GH-112410)
(cherry picked from commit 2c8b19174274c183eb652932871f60570123fe99)
Signed-off-by: Pablo Galindo <pablogsal@gmail.com>
Co-authored-by: Pablo Galindo Salgado <Pablogsal@gmail.com>
gh-111380: Show SyntaxWarnings only once when parsing if invalid syntax is encouintered (GH-111381)
(cherry picked from commit 3d2f1f0b830d86f16f42c42b54d3ea4453dac318)
Co-authored-by: Pablo Galindo Salgado <Pablogsal@gmail.com>
[3.12] gh-110938: Fix error messages for indented blocks with functions and classes with generic type parameters (GH-110973)
(cherry picked from commit 24e4ec7766fd471deb5b7e5087f0e7dba8576cfb)
Co-authored-by: Pablo Galindo Salgado <Pablogsal@gmail.com>
gh-110259: Fix f-strings with multiline expressions and format specs (GH-110271)
(cherry picked from commit cc389ef627b2a486ab89d9a11245bef48224efb1)
Signed-off-by: Pablo Galindo <pablogsal@gmail.com>
Co-authored-by: Pablo Galindo Salgado <Pablogsal@gmail.com>
gh-88943: Improve syntax error for non-ASCII character that follows a numerical literal (GH-109081)
It now points on the invalid non-ASCII character, not on the valid numerical literal.
(cherry picked from commit b2729e93e9d73503b1fda4ea4fecd77c58909091)
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
GH-107263: Increase C stack limit for most functions, except `_PyEval_EvalFrameDefault()` (GH-107535)
* Set C recursion limit to 1500, set cost of eval loop to 2 frames, and compiler mutliply to 2.
(cherry picked from commit fa45958450aa3489607daf9855ca0474a2a20878)
Co-authored-by: Mark Shannon <mark@hotpy.org>