1112 Commits

Author SHA1 Message Date
Serhiy Storchaka
73b3040f59
[3.11] gh-133767: Fix use-after-free in the unicode-escape decoder with an error handler (GH-129648) (GH-133944) (GH-134341)
If the error handler is used, a new bytes object is created to set as
the object attribute of UnicodeDecodeError, and that bytes object then
replaces the original data. A pointer to the decoded data becomes invalid
once that temporary bytes object is destroyed, so we need another way to return
the first invalid escape from _PyUnicode_DecodeUnicodeEscapeInternal().

_PyBytes_DecodeEscape() does not have this issue, because it does not
use the error handlers registry, but it should be changed for compatibility
with _PyUnicode_DecodeUnicodeEscapeInternal().
(cherry picked from commit 9f69a58623bd01349a18ba0c7a9cb1dad6a51e8e)
(cherry picked from commit 6279eb8c076d89d3739a6edb393e43c7929b429d)
(cherry picked from commit a75953b347716fff694aa59a7c7c2489fa50d1f5)

Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
2025-06-02 17:52:52 +02:00
Grigoriev Semyon
3bc0d2b851
[3.11] gh-109120: Fix syntax error in handling of incorrect star expressions… (#117464)
gh-109120: Fix syntax error in handling of incorrect star expressions (#117444)

(cherry picked from commit c97d3af2391e62ef456ef2365d48ab9b8cdbe27b)
2024-04-03 11:37:39 +01:00
Alex Waygood
a30a1e7a49
[3.11] gh-115881: Ensure ast.parse() parses conditional context managers even with low feature_version passed (#115920) (#115960)
2024-02-26 16:27:51 +00:00
Miss Islington (bot)
35a43d4394
[3.11] gh-115823: Calculate correctly error locations when dealing with implicit encodings (GH-115824) (#115950)
gh-115823: Calculate correctly error locations when dealing with implicit encodings (GH-115824)
(cherry picked from commit 015b97d19a24a169cc3c0939119e1228791e4253)

Co-authored-by: Pablo Galindo Salgado <Pablogsal@gmail.com>
2024-02-26 16:08:37 +00:00
Miss Islington (bot)
1c381ec4ed
[3.11] gh-113602: Bail out when the parser tries to override existing errors (GH-113607) (#113653)
gh-113602: Bail out when the parser tries to override existing errors (GH-113607)
(cherry picked from commit 9ed36d533ab8b256f0a589b5be6d7a2fdcf4aff2)

Signed-off-by: Pablo Galindo <pablogsal@gmail.com>
Co-authored-by: Pablo Galindo Salgado <Pablogsal@gmail.com>
2024-01-02 13:22:39 +00:00
Serhiy Storchaka
4b358d754c
[3.11] gh-106905: Use separate structs to track recursion depth in each PyAST_mod2obj call. (GH-113035) (GH-113472) (GH-113476)
(cherry picked from commit 48c49739f5502fc7aa82f247ab2e4d7b55bdca62)
(cherry picked from commit d58a5f453f59f44ccf09b1a9b11a0b879ac6f35b)

Co-authored-by: Yilei Yang <yileiyang@google.com>
Co-authored-by: Gregory P. Smith [Google LLC] <greg@krypto.org>
2023-12-25 20:40:33 +00:00
Miss Islington (bot)
390a5b81a9
[3.11] gh-112387: Fix error positions for decoded strings with backwards tokenize errors (GH-112409) (#112469)
gh-112387: Fix error positions for decoded strings with backwards tokenize errors (GH-112409)
(cherry picked from commit 45d648597b1146431bf3d91041e60d7f040e70bf)

Signed-off-by: Pablo Galindo <pablogsal@gmail.com>
Co-authored-by: Pablo Galindo Salgado <Pablogsal@gmail.com>
2023-11-27 19:05:20 +00:00
Miss Islington (bot)
43b081bfc4
[3.11] gh-112388: Fix an error that was causing the parser to try to overwrite tokenizer errors (GH-112410) (#112467)
gh-112388: Fix an error that was causing the parser to try to overwrite tokenizer errors (GH-112410)
(cherry picked from commit 2c8b19174274c183eb652932871f60570123fe99)

Signed-off-by: Pablo Galindo <pablogsal@gmail.com>
Co-authored-by: Pablo Galindo Salgado <Pablogsal@gmail.com>
2023-11-27 18:56:27 +00:00
Miss Islington (bot)
08e4e11b75
[3.11] gh-111380: Show SyntaxWarnings only once when parsing if invalid syntax is encountered (GH-111381) (#111383)
gh-111380: Show SyntaxWarnings only once when parsing if invalid syntax is encountered (GH-111381)
(cherry picked from commit 3d2f1f0b830d86f16f42c42b54d3ea4453dac318)

Co-authored-by: Pablo Galindo Salgado <Pablogsal@gmail.com>
2023-10-31 13:29:42 +00:00
Pablo Galindo Salgado
22cde39fbf
[3.11] bpo-43950: handle wide unicode characters in tracebacks (GH-28150) (#111373)
2023-10-27 09:46:20 +09:00
Pablo Galindo Salgado
4e4a3e161f
[3.11] gh-110696: Fix incorrect syntax error message for incorrect argument unpacking (GH-110706) (#110766)
2023-10-18 13:59:17 +01:00
Lysandros Nikolaou
1af7b7db0d
[3.11] gh-107450: Check for overflow in the tokenizer and fix overflow test (GH-110832) (#110939)
(cherry picked from commit a1ac5590e0f8fe008e5562d22edab65d0c1c5507)

Co-authored-by: Filipe Laíns <lains@riseup.net>
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
2023-10-18 00:34:56 +02:00
Miss Islington (bot)
c9214b90f4
[3.11] gh-107450: Raise OverflowError when parser column offset overflows (GH-110754) (#110763)
(cherry picked from commit fb7843ee895ac7f6eeb58f356b1a320eea081cfc)

Co-authored-by: Lysandros Nikolaou <lisandrosnik@gmail.com>
2023-10-12 09:57:36 +00:00
Serhiy Storchaka
dae62d456e
[3.11] gh-88943: Improve syntax error for non-ASCII character that follows a numerical literal (GH-109081) (GH-109091)
It now points to the invalid non-ASCII character, not to the valid numerical literal.
(cherry picked from commit b2729e93e9d73503b1fda4ea4fecd77c58909091)
2023-09-07 14:54:07 +00:00
Miss Islington (bot)
c0c4186858
[3.11] GH-105588: Add missing error checks to some obj2ast_* converters (GH-105839)
GH-105588: Add missing error checks to some obj2ast_* converters (GH-105589)
(cherry picked from commit a4056c8f9c2d9970d39e3cb6bffb255cd4b8a42c)

Co-authored-by: Brandt Bucher <brandtbucher@microsoft.com>
2023-06-15 23:13:51 +00:00
Miss Islington (bot)
b764347572
[3.11] Fix typo in the tokenizer (GH-104950) (#104952)
(cherry picked from commit 705e387dd81b971cb1ee5727da54adfb565f61d0)

Co-authored-by: Stepfen Shawn <m18824909883@163.com>
2023-05-25 23:32:04 -07:00
Lysandros Nikolaou
a09d3901a5
[3.11] gh-96670: Raise SyntaxError when parsing NULL bytes (GH-97594) (#104195)
2023-05-07 11:12:04 +01:00
Miss Islington (bot)
7b2ac6cf3d
[3.11] gh-102310: Change error range for invalid bytes literals (GH-103663) (#103703)
2023-04-23 17:21:27 -06:00
Miss Islington (bot)
abd6e97020
[3.11] GH-102711: Fix warnings found by clang (GH-102712) (#103075)
Building Python with clang produces some warnings:

Parser/pegen.c:812:31: warning: a function declaration without a prototype is deprecated in all versions of C [-Wstrict-prototypes]
_PyPegen_clear_memo_statistics()
                              ^
                               void

Parser/pegen.c:820:29: warning: a function declaration without a prototype is deprecated in all versions of C [-Wstrict-prototypes]
_PyPegen_get_memo_statistics()
                            ^
                             void

Fix them to make clang happy.

(cherry picked from commit 7703def37e4fa7d25c3d23756de8f527daa4e165)

Signed-off-by: Chenxi Mao <chenxi.mao@suse.com>
Co-authored-by: Chenxi Mao <chenxi.mao@suse.com>
2023-03-28 11:27:30 +02:00
Pablo Galindo Salgado
58de2eb26b
[3.11] gh-102416: Do not memoize incorrectly loop rules in the parser (GH-102467). (#102473)
2023-03-06 17:13:28 +00:00
Pablo Galindo Salgado
31b82abb5c
[3.11] gh-101046: Fix a potential memory leak in the parser when raising MemoryError (GH-101051) (#101085)
Co-authored-by: Pablo Galindo Salgado <Pablogsal@gmail.com>
2023-01-16 23:48:51 +00:00
Miss Islington (bot)
2b97ddd512
gh-100050: Fix an assertion error when raising unclosed parenthesis errors in the tokenizer (GH-100065)
(cherry picked from commit 97e7004cfe48305bcd642c653b406dc7470e196d)

Co-authored-by: Pablo Galindo Salgado <Pablogsal@gmail.com>
Automerge-Triggered-By: GH:pablogsal
2022-12-07 01:18:00 -08:00
Pablo Galindo Salgado
6282ef6c3f
[3.11] gh-99891: Fix infinite recursion in the tokenizer when showing warnings (GH-99893) (GH-99896)
Automerge-Triggered-By: GH:pablogsal.
(cherry picked from commit 417206a05c4545bde96c2bbbea92b53e6cac0d48)

Co-authored-by: Pablo Galindo Salgado <Pablogsal@gmail.com>
2022-12-01 00:57:04 -08:00
Miss Islington (bot)
f381644819
gh-99581: Fix a buffer overflow in the tokenizer when copying lines that fill the available buffer (GH-99605)
(cherry picked from commit e13d1d9dda8c27691180bc618bd5e9bf43dfa89f)

Co-authored-by: Pablo Galindo Salgado <Pablogsal@gmail.com>
2022-11-20 12:53:02 -08:00
Lysandros Nikolaou
152a437b8d
[3.11] gh-99211: Point to except/except* on syntax errors when mixing them (GH-99215) (GH-99622)
gh-99211: Point to except/except* on syntax errors when mixing them (GH-99215)

(cherry picked from commit 9c4232ae8972a33f84e875cfdd866318a1233e47)
2022-11-20 19:29:05 +01:00
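
For illustration only (not part of the commit): a minimal sketch of the case this change targets, a try statement that mixes except and except*. The helper name syntax_error_for and the "<sketch>" filename are made up for this example. The exact message and the reported position depend on the interpreter version, so the sketch prints them rather than assuming particular values.

```python
# Source that mixes except and except* on the same try statement.
MIXED_SOURCE = """\
try:
    pass
except ValueError:
    pass
except* TypeError:
    pass
"""


def syntax_error_for(code):
    """Return the SyntaxError raised when compiling code, or None."""
    try:
        compile(code, "<sketch>", "exec")
    except SyntaxError as exc:
        return exc
    return None


err = syntax_error_for(MIXED_SOURCE)
if err is not None:
    # lineno and offset show where the interpreter places the error marker.
    print(f"{err.msg} (line {err.lineno}, column {err.offset})")
```

On interpreters without except* support the source is rejected as well, only with a different message.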
Irit Katriel
d8a42bcaf0
[3.11] gh-99153: set location on SyntaxError for try with both except and except* (GH-99160) (#99168)
2022-11-07 09:41:20 +00:00
Nikita Sobolev
8c6ced36ab
[3.11] gh-96587: Raise SyntaxError for PEP654 on older feature_version (GH-96588) (#96591)
(cherry picked from commit 2c7d2e8d46164efb6e27a64081d8e949f6876515)

Co-authored-by: Nikita Sobolev <mail@sobolevn.me>
2022-10-05 15:00:13 -07:00
Miss Islington (bot)
f2d7fa8839
gh-96678: Fix UB of null pointer arithmetic (GH-96782)
Automerge-Triggered-By: GH:pablogsal
(cherry picked from commit 81e36f350b75d2ed2668825f7df6e059b57f859c)

Co-authored-by: Matthias Görgens <matthias.goergens@gmail.com>
2022-09-13 08:03:40 -07:00
Miss Islington (bot)
ffafa9b91d
gh-96268: Fix loading invalid UTF-8 (GH-96270)
This makes tokenizer.c:valid_utf8 match stringlib/codecs.h:decode_utf8.

It also fixes an off-by-one error, introduced in 3.10, in the line number the tokenizer reports for bad UTF-8.
(cherry picked from commit 8bc356a7dd50cbdb46d10b8c7e457832431f5d9e)

Co-authored-by: Michael Droettboom <mdboom@gmail.com>
2022-09-07 14:49:17 -07:00
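
For illustration only (not part of the commit): a rough sketch of how loading a source file with invalid UTF-8 can be observed from Python. It writes a file whose second line contains a byte that is never valid in UTF-8 and runs it with the current interpreter. The file name bad_utf8.py and the helper run_source_file are hypothetical. The wording of the error and the reported line number vary between versions, so the sketch prints the final traceback line instead of assuming it.

```python
import os
import subprocess
import sys
import tempfile

# Line 2 holds a byte (0xFF) that can never appear in valid UTF-8.
BAD_SOURCE = b'x = 1\ny = "\xff"\n'


def run_source_file(data):
    """Write data to a temporary .py file and run it with this interpreter."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "bad_utf8.py")
        with open(path, "wb") as handle:
            handle.write(data)
        return subprocess.run(
            [sys.executable, path],
            capture_output=True,
            text=True,
            errors="replace",
        )


result = run_source_file(BAD_SOURCE)
lines = result.stderr.strip().splitlines()
# The last line of stderr carries the error type and message.
print(lines[-1] if lines else "")
```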
Miss Islington (bot)
bb0dab5c48
gh-96611: Fix error message for invalid UTF-8 in mid-multiline string (GH-96623)
(cherry picked from commit 05692c67c51b78a5a5a7bb61d646519025e38015)

Co-authored-by: Michael Droettboom <mdboom@gmail.com>
2022-09-06 16:40:17 -07:00
Gregory P. Smith
f8b71da9aa
[3.11] gh-95778: CVE-2020-10735: Prevent DoS by very large int() (#96500)
Integer to and from text conversions via CPython's bignum `int` type are not safe against denial of service attacks due to malicious input. Very large input strings with hundreds of thousands of digits can consume several CPU seconds.

This PR comes fresh from a pile of work done in our private PSRT security response team repo.

This backports https://github.com/python/cpython/pull/96499 aka 511ca9452033ef95bc7d7fc404b8161068226002

Signed-off-by: Christian Heimes [Red Hat] <christian@python.org>
Tons-of-polishing-up-by: Gregory P. Smith [Google] <greg@krypto.org>
Reviewed in the private PSRT repo by many others (see the NEWS entry in the PR).

* Issue: gh-95778

I wrote up [a one pager for the release managers](https://docs.google.com/document/d/1KjuF_aXlzPUxTK4BMgezGJ2Pn7uevfX7g0_mvgHlL7Y/edit#).
2022-09-02 09:48:57 -07:00
Shantanu
7fc8221794
[3.11] gh-94996: Disallow lambda pos only params with feature_version < (3, 8) (GH-95934) (GH-95936)
(cherry picked from commit a965db37f27ffb232312bc13d9a509f0d93fcd20)

Co-authored-by: Shantanu <12621235+hauntsaninja@users.noreply.github.com>

Automerge-Triggered-By: GH:lysnikolaou
2022-08-12 12:41:09 -07:00
Miss Islington (bot)
4abf84602f
gh-94996: Disallow parsing pos only params with feature_version < (3, 8) (GH-94997)
(cherry picked from commit b5e3ea286289fcad12be78480daf3756e350f69f)

Co-authored-by: Shantanu <12621235+hauntsaninja@users.noreply.github.com>
2022-08-12 10:53:09 -07:00
Miss Islington (bot)
1221e8c400
gh-95876: Fix format string in pegen error location code (GH-95877)
(cherry picked from commit b4c857d0fd74abb1ede6fe083c4fa3ca728b2b83)

Co-authored-by: Christian Heimes <christian@python.org>
2022-08-11 02:19:20 -07:00
Miss Islington (bot)
d3cc99bdce
gh-95355: Check tokens[0] after allocating memory (GH-95356)
GH-95355

Automerge-Triggered-By: GH:pablogsal
(cherry picked from commit b946f529efb4a623ac4ad968d8091edb81ebdcdb)

Co-authored-by: Honglin Zhu <zhuhonglin.zhl@alibaba-inc.com>
2022-07-28 03:29:50 -07:00
Miss Islington (bot)
86eb500068
[3.11] gh-95185: Check recursion depth in the AST constructor (GH-95186) (GH-95208)
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
(cherry picked from commit 00474472944944b346d8409cfded84bb299f601a)

Co-authored-by: Pablo Galindo Salgado <Pablogsal@gmail.com>
2022-07-26 12:19:22 +02:00
Miss Islington (bot)
7733aa048e
gh-94949: Disallow parsing parenthesised ctx mgr with old feature_version (GH-94950)
* gh-94949: Disallow parsing parenthesised ctx manager with old feature_version

* 📜🤖 Added by blurb_it.

* Allow it with feature_version=(3, 9) as well

Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com>
(cherry picked from commit 0daba822212cd5d6c63384a27f390f0945330c2b)

Co-authored-by: Shantanu <12621235+hauntsaninja@users.noreply.github.com>
2022-07-18 14:57:45 -07:00
Miss Islington (bot)
7dc236d116
gh-94947: Disallow parsing walrus with feature_version < (3, 8) (GH-94948)
* gh-94947: Disallow parsing walrus with feature_version < (3, 8)

* oops, commit the parser

* 📜🤖 Added by blurb_it.

Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com>
(cherry picked from commit ae0be5a53bb4caee3de4888341addd9c94133f2d)

Co-authored-by: Shantanu <12621235+hauntsaninja@users.noreply.github.com>
2022-07-18 02:46:21 -07:00
Miss Islington (bot)
e121cb5814
gh-94869: Fix the location in some expressions for multi-line f-string ast nodes (GH-94895)
(cherry picked from commit 2e9da8e3522764d09f1d6054a2be567e91a30812)

Co-authored-by: Pablo Galindo Salgado <Pablogsal@gmail.com>
2022-07-16 12:16:51 -07:00
Miss Islington (bot)
d49c99f10d
gh-94360: Fix a tokenizer crash when reading encoded files with syntax errors from stdin (GH-94386)
* gh-94360: Fix a tokenizer crash when reading encoded files with syntax errors from stdin

Signed-off-by: Pablo Galindo <pablogsal@gmail.com>

* nitty nit

Co-authored-by: Łukasz Langa <lukasz@langa.pl>
(cherry picked from commit 36fcde61ba48c4e918830691ecf4092e4e3b9b99)

Co-authored-by: Pablo Galindo Salgado <Pablogsal@gmail.com>
2022-07-05 10:09:51 -07:00
Miss Islington (bot)
442dd8ffa5
gh-94192: Fix error for dictionary literals with invalid expression as value. (GH-94304)
* Fix error for dictionary literals with invalid expression as value.

* Remove trailing whitespace
(cherry picked from commit 8c237a7a71d52f996f58dc58f6b6ce558d209494)

Co-authored-by: wookie184 <wookie1840@gmail.com>
2022-06-26 12:07:02 -07:00
Pablo Galindo Salgado
65ed8b47ee
[3.11] gh-92858: Improve error message for some suites with syntax error before ':' (GH-92894) (#94180)
(cherry picked from commit 2fc83ac3afa161578200dbf8d823a20e0801c0c0)

Co-authored-by: wookie184 <wookie1840@gmail.com>
2022-06-23 18:38:06 +01:00
Miss Islington (bot)
f9d0240db8
gh-93671: Avoid exponential backtracking in deeply nested sequence patterns in match statements (GH-93680)
Co-authored-by: Łukasz Langa <lukasz@langa.pl>
(cherry picked from commit 53a8b17895e91d08f76a2fb59a555d012cd85ab4)

Co-authored-by: Pablo Galindo Salgado <Pablogsal@gmail.com>
2022-06-10 09:21:04 -07:00
Miss Islington (bot)
376d53771d
gh-93418: Fix an assert when an f-string expression is followed by an '=', but no closing brace. (gh-93419) (gh-93422)
(cherry picked from commit ee70c70aa93d7a41cbe47a0b361b17f9d7ec8acd)

Co-authored-by: Eric V. Smith <ericvsmith@users.noreply.github.com>
2022-06-01 21:04:27 -04:00
Miss Islington (bot)
b425d887aa
gh-92597: Ensure that AST nodes without explicit end positions can be compiled (GH-93359)
(cherry picked from commit 705eaec28f7bee530b1c1635ba385a49a1feaf32)

Co-authored-by: Pablo Galindo Salgado <Pablogsal@gmail.com>
2022-05-31 16:26:16 -07:00
Miss Islington (bot)
7afccd34a6
gh-90473: Decrease recursion limit and skip tests on WASI (GH-92803)
(cherry picked from commit 137fd3d88aa46669f5717734e823f4c594ab2843)

Co-authored-by: Christian Heimes <christian@python.org>
2022-05-19 08:05:52 -07:00
Victor Stinner
d716a0dfe2
Use static inline function Py_EnterRecursiveCall() (#91988)
Currently, calling Py_EnterRecursiveCall() and
Py_LeaveRecursiveCall() may use a function call or a static inline
function call, depending on whether the internal pycore_ceval.h header
file is included. Use a different name for the static inline
function to ensure that the static inline function is always used in
Python internals for best performance. This is similar to the approach
used for PyThreadState_GET() (function call) and _PyThreadState_GET()
(static inline function).

* Rename _Py_EnterRecursiveCall() to _Py_EnterRecursiveCallTstate()
* Rename _Py_LeaveRecursiveCall() to _Py_LeaveRecursiveCallTstate()
* pycore_ceval.h: Rename Py_EnterRecursiveCall() to
  _Py_EnterRecursiveCall() and Py_LeaveRecursiveCall() to
  _Py_LeaveRecursiveCall()
2022-05-04 13:30:23 +02:00
Serhiy Storchaka
3483299a24
gh-81548: Deprecate octal escape sequences with value larger than 0o377 (GH-91668)
2022-04-30 13:16:27 +03:00
Serhiy Storchaka
43a8bf1ea4
gh-87999: Change warning type for numeric literal followed by keyword (GH-91980)
The warning emitted by the Python parser for a numeric literal
immediately followed by a keyword has been changed from a deprecation
warning to a syntax warning.
2022-04-27 20:15:14 +03:00
Matthieu Dartiailh
aa0f056a00
bpo-47212: Improve error messages for un-parenthesized generator expressions (GH-32302)
2022-04-05 14:47:13 +01:00