299 Commits

Author SHA1 Message Date
yui-knk
cf74ff714a Change return value of gets function to be rb_parser_string_t * instead of VALUE
This change reduces parser's dependency on ruby object.
2024-05-04 11:59:10 +09:00
Nobuyoshi Nakada
a6308ca958 ripper: Move DSL line pattern 2024-04-29 08:38:23 +09:00
ydah
f9cf923af2 Use user defined parameterizing rules 2024-04-29 08:38:23 +09:00
卜部昌平
c844968b72 ruby tool/update-deps --fix 2024-04-27 21:55:28 +09:00
yui-knk
33929ef995 Move encoding object conversion outside of parser
Reduce the parser's dependence on `VALUE` and `rb_enc_from_encoding`.
2024-04-23 13:11:46 +09:00
Nobuyoshi Nakada
afa0d58580 Adjust indent [ci skip] 2024-04-23 09:21:38 +09:00
yui-knk
2992e1074a Refactor parser compile functions
Refactor parser compile functions to reduce the dependence
on ruby functions.
This commit includes these changes

1. Refactor `gets`, `input` and `gets_` of `parser_params`

Parser needs two different data structures to get the next line: a function (`gets`) and input data (`input`).
However, `gets_` is used for both the function (`call`) and the input data (`ptr`).
`call` is used for managing the general callback function when `rb_ruby_parser_compile_generic` is used.
`ptr` is used for managing the current pointer on the String when `parser_compile_string` is used.
This commit changes the parser to use only `gets` and `input`, then removes `gets_`.

2. Move parser_compile functions and `gets` functions from parse.y to ruby_parser.c

This change reduces the dependence on ruby functions from parser.

3. Change ruby_parser and ripper to take care of `VALUE input` GC mark

Move the responsibility of calling `rb_gc_mark` for `VALUE input` from parser to ruby_parser and ripper.
`input` is arbitrary data pointer from the viewpoint of parser.

4. Introduce rb_parser_compile_array function

The caller of `rb_parser_compile_generic` needs to take care of GC because ruby_parser doesn’t know
the details of `lex_gets` and `input`.
Introduce `rb_parser_compile_array` to reduce the complexity of ast.c.
2024-04-23 07:20:22 +09:00
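The split between a line-fetching function (`gets`) and opaque input data (`input`) is visible from the Ruby side as well: Ripper accepts either a String or any object that responds to `gets`. The following is a minimal sketch of that behaviour, not the C implementation; `ArrayLineSource` is a hypothetical helper introduced only for illustration.

```ruby
require "ripper"
require "stringio"

# Hypothetical line source: hands out one line per call and nil at the end,
# which is the contract Ripper expects from an object that responds to #gets.
class ArrayLineSource
  def initialize(lines)
    @lines = lines.dup
  end

  def gets
    @lines.shift
  end
end

source = "x = 1\nputs x\n"

from_string = Ripper.sexp(source)                                # String input
from_io     = Ripper.sexp(StringIO.new(source))                  # object with #gets
from_lines  = Ripper.sexp(ArrayLineSource.new(source.lines))     # custom #gets

puts from_string == from_io
puts from_string == from_lines
```

All three calls feed the same source text to the parser, so the resulting S-expressions are expected to be identical regardless of how the lines are delivered.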
yui-knk
38b8bdb8ea Remove undefined function's prototype declaration
89cfc152071 removed the definition of these functions.
2024-04-14 10:51:16 +09:00
yui-knk
e816ab0b0c Remove rb_imemo_tmpbuf_t from parser
No parser semantic value type is `VALUE` anymore, so there is no need to
use imemo for managing the semantic value stack.
2024-04-02 19:37:27 +09:00
yui-knk
799e854897 [Feature #20331] Simplify parser warnings for hash keys duplication and when clause duplication
This commit simplifies warnings for hash keys duplication and when clause duplication,
based on the discussion of https://bugs.ruby-lang.org/issues/20331.
Warnings are reported only when a string is the same as another.
2024-04-02 08:26:58 +09:00
S-H-GAMELINKS
060a71d4e7 Fix Ripper memory allocation size when enabled Universal Parser
The size of `struct parser_params` differs by 8 bytes between `ripper_s_allocate` and `rb_ruby_parser_allocate` when the universal parser is
enabled.
As a result, `*r->p` is not fully initialized in `ripper_s_allocate`, as shown below.

```console
(gdb) p *r->p
$2 = {heap = 0x0, lval = 0x0, yylloc = 0x0, lex = {strterm = 0x0, gets = 0x0, input = 0, string_buffer = {head = 0x0, last = 0x0}, lastline = 0x0,
    nextline = 0x0, pbeg = 0x0, pcur = 0x0, pend = 0x0, ptok = 0x0, gets_ = {ptr = 0, call = 0x0}, state = EXPR_NONE, paren_nest = 0, lpar_seen = 0,
    debug = 0, has_shebang = 0, token_seen = 0, token_info_enabled = 0, error_p = 0, cr_seen = 0, value = 0, result = 0, parsing_thread = 0, s_value = 0,
    s_lvalue = 0, s_value_stack = 2097}
````

This seems to cause `double free or corruption (!prev)` and SEGV.
This commit fixes it by introducing the `rb_ripper_parser_params_allocate` and `rb_ruby_parser_config` functions for Ripper, so that a `struct parser_params` of the same size is returned.
2024-03-21 18:10:02 +09:00
Nobuyoshi Nakada
e670892497 Remove no longer needed matching 2024-03-17 18:47:18 +09:00
Nobuyoshi Nakada
9e470ebdcd Revert "Remove flip-flop usages from build scripts"
This reverts commit 301fa452f7a9cdea922103e9c50d85a2d5652d0d.
2024-03-17 18:28:28 +09:00
Jean Boussier
09d8c99cdc Ensure test suite is compatible with --frozen-string-literal
As preparation for https://bugs.ruby-lang.org/issues/20205,
making sure the test suite is compatible with frozen string
literals makes things easier.
2024-03-14 17:56:15 +01:00
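A typical adjustment of this kind replaces in-place mutation of a string literal with an explicitly unfrozen string. The following is a minimal sketch under that assumption; `build_label` is a hypothetical helper, not code from the test suite.

```ruby
# frozen_string_literal: true

# Hypothetical helper: concatenates parts into a fresh, mutable buffer.
def build_label(parts)
  # Unary plus returns an unfrozen string: the receiver itself when it is
  # already mutable, or a duplicate when the literal is frozen.
  buf = +""
  parts.each { |part| buf << part }
  buf
end

puts build_label(%w[rip per])
```

Writing `buf = ""` followed by `buf << part` would raise `FrozenError` when string literals are frozen; `+""` (or `String.new`) behaves the same way in both modes.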
yui-knk
7cb8fd7800 Move ripper_validate_object to ripper_init.c.tmpl 2024-02-20 19:19:31 +09:00
yui-knk
89cfc15207 [Feature #20257] Rearchitect Ripper
Introduce another semantic value stack for Ripper so that
Ripper can manage both Node and Ruby Object separately.
This rearchitecture of Ripper solves the following issues,
so this commit also adds test cases for them.

* [Bug 10436] https://bugs.ruby-lang.org/issues/10436
* [Bug 18988] https://bugs.ruby-lang.org/issues/18988
* [Bug 20055] https://bugs.ruby-lang.org/issues/20055

Checked that the differences in `Ripper.sexp` output for files under `/test/ruby`
are only in test_pattern_matching.rb.
The differences come from the differences between
the `new_hash_pattern_tail` functions of the parser and Ripper.
Ripper's `new_hash_pattern_tail` didn’t call `assignable`, so
`kw_rest_arg` wasn’t marked as a local variable.
This is also fixed by this commit.

```
--- a/./tmp/before/test_pattern_matching.rb
+++ b/./tmp/after/test_pattern_matching.rb
@@ -3607,7 +3607,7 @@
                  [:in,
                   [:hshptn, nil, [], [:var_field, [:@ident, "a", [984, 13]]]],
                   [[:binary,
-                    [:vcall, [:@ident, "a", [985, 10]]],
+                    [:var_ref, [:@ident, "a", [985, 10]]],
                     :==,
                     [:hash, nil]]],
                   nil]]],
@@ -3662,7 +3662,7 @@
                  [:in,
                   [:hshptn, nil, [], [:var_field, [:@ident, "a", [993, 13]]]],
                   [[:binary,
-                    [:vcall, [:@ident, "a", [994, 10]]],
+                    [:var_ref, [:@ident, "a", [994, 10]]],
                     :==,
                     [:hash,
                      [:assoclist_from_args,
@@ -3813,7 +3813,7 @@
                    [:command,
                     [:@ident, "raise", [1022, 10]],
                     [:args_add_block,
-                     [[:vcall, [:@ident, "b", [1022, 16]]]],
+                     [[:var_ref, [:@ident, "b", [1022, 16]]]],
                      false]]],
                   [:else, [[:var_ref, [:@kw, "true", [1024, 10]]]]]]]],
                nil,
@@ -3876,7 +3876,7 @@
                      [:@int, "0", [1033, 15]]],
                     :"&&",
                     [:binary,
-                     [:vcall, [:@ident, "b", [1033, 20]]],
+                     [:var_ref, [:@ident, "b", [1033, 20]]],
                      :==,
                      [:hash, nil]]]],
                   nil]]],
@@ -3946,7 +3946,7 @@
                      [:@int, "0", [1042, 15]]],
                     :"&&",
                     [:binary,
-                     [:vcall, [:@ident, "b", [1042, 20]]],
+                     [:var_ref, [:@ident, "b", [1042, 20]]],
                      :==,
                      [:hash,
                       [:assoclist_from_args,
@@ -5206,7 +5206,7 @@
                      [[:assoc_new,
                        [:@label, "c:", [1352, 22]],
                        [:@int, "0", [1352, 25]]]]]],
-                   [:vcall, [:@ident, "r", [1352, 29]]]],
+                   [:var_ref, [:@ident, "r", [1352, 29]]]],
                   false]]],
                [:binary,
                 [:call,
@@ -5299,7 +5299,7 @@
                       [:assoc_new,
                        [:@label, "c:", [1367, 34]],
                        [:@int, "0", [1367, 37]]]]]],
-                   [:vcall, [:@ident, "r", [1367, 41]]]],
+                   [:var_ref, [:@ident, "r", [1367, 41]]]],
                   false]]],
               [:binary,
                [:call,
@@ -5931,7 +5931,7 @@
              [:in,
               [:hshptn, nil, [], [:var_field, [:@ident, "r", [1533, 11]]]],
               [[:binary,
-                [:vcall, [:@ident, "r", [1534, 8]]],
+                [:var_ref, [:@ident, "r", [1534, 8]]],
                 :==,
                 [:hash,
                  [:assoclist_from_args,
```
2024-02-20 17:33:58 +09:00
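The `:vcall` to `:var_ref` change in the diff can be observed with a small script. This is a sketch for illustration only: `wrappers_of_ident` is a hypothetical helper, and which of the two node types is reported depends on whether the running Ruby includes this commit.

```ruby
require "ripper"

# Hypothetical helper: walk a Ripper S-expression and collect the node types
# that directly wrap an identifier token with the given name.
def wrappers_of_ident(sexp, name, found = [])
  return found unless sexp.is_a?(Array)

  child = sexp[1]
  if child.is_a?(Array) && child[0] == :@ident && child[1] == name
    found << sexp[0]
  end
  sexp.each { |element| wrappers_of_ident(element, name, found) }
  found
end

# A hash pattern whose keyword rest argument is referenced in the body.
source = "h = {}\ncase h\nin {**rest}\n  rest == {}\nend\n"

kinds = wrappers_of_ident(Ripper.sexp(source), "rest")
p kinds
```

The first entry is the `:var_field` inside `:hshptn`. The second entry describes the reference in the body: `:vcall` when `kw_rest_arg` was not registered as a local variable, `:var_ref` once it is.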
yui-knk
928f388415 [DOC] Fix Ripper DSL input example
'!' suffix is needed for event dispatch.
2024-01-30 22:49:22 +09:00
yui-knk
ee7f63ebba Make lastline and nextline to be rb_parser_string
This commit changes `struct parser_params` lastline and nextline
from `VALUE` (String object) to `rb_parser_string_t *` so that
dependency on Ruby Object is reduced.
`parser_string_buffer_t string_buffer` is added to `struct parser_params`
to manage `rb_parser_string_t` pointers of each line. All allocated line
strings are freed in `rb_ruby_parser_free`.
2024-01-23 08:58:16 +09:00
KJ Tsanaktsidis
61da90c1b8 Mark asan fake stacks during machine stack marking
ASAN leaves a pointer to the fake frame on the stack; we can use the
__asan_addr_is_in_fake_stack API to work out the extent of the fake
stack and thus mark any VALUEs contained therein.

[Bug #20001]
2024-01-19 09:55:12 +11:00
yui-knk
52d9e55903 Statically allocate parser config 2024-01-12 21:17:41 +09:00
KJ Tsanaktsidis
688a6ff510 Revert "Mark asan fake stacks during machine stack marking"
This reverts commit d10bc3a2b8300cffc383e10c3730871e851be24c.
2024-01-12 17:58:54 +11:00
KJ Tsanaktsidis
d10bc3a2b8 Mark asan fake stacks during machine stack marking
ASAN leaves a pointer to the fake frame on the stack; we can use the
__asan_addr_is_in_fake_stack API to work out the extent of the fake
stack and thus mark any VALUEs contained therein.

[Bug #20001]
2024-01-12 17:29:48 +11:00
S-H-GAMELINKS
1b8d01136c Introduce Numeric Node's 2024-01-07 09:24:34 +09:00
Nobuyoshi Nakada
7016ab873e Verify that events2table.c was generated successfully 2023-12-28 18:07:49 +09:00
KJ Tsanaktsidis
f8effa209a Change the semantics of rb_postponed_job_register
Our current implementation of rb_postponed_job_register suffers from
some safety issues that can lead to interpreter crashes (see bug #1991).
Essentially, the issue is that jobs can be called with the wrong
arguments.

We made two attempts to fix this whilst keeping the promised semantics,
but:
  * The first one involved masking/unmasking when flushing jobs, which
    was believed to be too expensive
  * The second one involved a lock-free, multi-producer, single-consumer
    ringbuffer, which was too complex

The critical insight behind this third solution is that essentially the
only users of these APIs are a) internal, or b) profiling gems.

For a), none of the usages actually require variable data; they will
work just fine with the preregistration interface.

For b), generally profiling gems only call a single callback with a
single piece of data (which is actually usually just zero) for the life
of the program. The ringbuffer is complex because it needs to support
multi-word inserts of job & data (which can't be atomic); but nobody
actually even needs that functionality, really.

So, this commit:
  * Introduces a pre-registration API for jobs, with a GVL-requiring
    rb_postponed_job_preregister, which returns a handle which can be
    used with an async-signal-safe rb_postponed_job_trigger.
  * Deprecates rb_postponed_job_register (and re-implements it on top of
    the preregister function for compatibility)
  * Moves all the internal usages of postponed job register to
    pre-registration
2023-12-10 15:00:37 +09:00
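The pre-registration idea can be illustrated with a toy model. This is not the C API and makes no claim about its internals: `PostponedJobTable`, `MAX_JOBS`, and the bitmask representation are assumptions chosen for the sketch. The point it shows is that triggering only needs to set a flag for a known handle, so no job or data has to be passed at trigger time.

```ruby
# Toy model of pre-registered postponed jobs (hypothetical, for illustration).
class PostponedJobTable
  MAX_JOBS = 32 # assumed table size for this sketch

  def initialize
    @jobs = []       # handle (index) -> [callable, data]
    @triggered = 0   # bitmask of handles waiting to run
  end

  # Done ahead of time, where allocation and locking are acceptable.
  def preregister(callable, data)
    raise "job table full" if @jobs.size >= MAX_JOBS

    @jobs << [callable, data]
    @jobs.size - 1
  end

  # The cheap part: only flips one bit for an existing handle.
  def trigger(handle)
    @triggered |= (1 << handle)
  end

  # Runs every triggered job once and clears the pending set.
  def flush
    pending = @triggered
    @triggered = 0
    @jobs.each_with_index do |(callable, data), handle|
      callable.call(data) if pending[handle] == 1
    end
  end
end

log = []
table = PostponedJobTable.new
handle = table.preregister(->(data) { log << data }, :sample)

3.times { table.trigger(handle) } # repeated triggers coalesce in this model
table.flush
table.flush                       # nothing pending, so nothing runs

p log
```

In this model the callback and its data are fixed at registration, which matches the observation in the commit message that callers generally use a single callback with a single piece of data for the life of the program.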
yui-knk
9ea1ee66c9 Stop creating ripper.h because it's not used 2023-10-20 12:56:04 +09:00
Nobuyoshi Nakada
ceec988f2e ripper: Support member references in the DSL 2023-10-10 00:09:52 +09:00
yui-knk
cecd1de2eb Use rb_node_opt_arg_t and rb_node_kw_arg_t instead of NODE 2023-10-01 09:19:42 +09:00
Nobuyoshi Nakada
d647709d1a Extract ripper_parser_params 2023-09-30 20:17:38 +09:00
yui-knk
74c6781153 Change RNode structure from union to struct
All kinds of AST nodes use the same struct RNode, which has u1, u2, u3 union members
for holding different kinds of data.
This has two problems.

1. Low flexibility of data structure

Some nodes, for example NODE_TRUE, don’t use u1, u2, u3. On the other hand,
NODE_OP_ASGN2 needs more than three union members. However, because they share the same
structure definition, NODE_TRUE has to allocate three unused union members and
NODE_OP_ASGN2 has to be split into another node.
This change removes that restriction, making it possible to
change the data structure for each node type.

2. No compile time check for union member access

With a union, it is the developer’s responsibility to use the correct member for each node type.
This change clarifies which node has which type of fields and enables compile time checks.

This commit also changes node_buffer_elem_struct buf management to handle
different size data with alignment.
2023-09-28 11:58:10 +09:00
Nobuyoshi Nakada
fbe4db5182 ripper: Support named references in the DSL 2023-09-25 23:04:09 +09:00
Nobuyoshi Nakada
69d7871b02 ripper: Preprocess ripper-dispatchable types only
Keep the other types, which do not have setter macros for ripper.
2023-09-17 16:22:01 +09:00
Nobuyoshi Nakada
f2102e4015 Set ripper_init.c.tmpl to C mode [ci skip] 2023-09-10 19:20:31 +09:00
卜部昌平
d9cba2fc74 include missing header 2023-08-25 17:27:53 +09:00
卜部昌平
eec85a6309 tool/update-deps --fix 2023-08-25 17:27:53 +09:00
yui-knk
0a570a0069 Fix #line directive filename of ripper.c
Before:

```c
/* First part of user prologue.  */
#line 14 "parse.y"
```

After:

```c
/* First part of user prologue.  */
#line 14 "ripper.y"
```
2023-07-16 19:27:08 +09:00
Nobuyoshi Nakada
5c77402d88 Fix null pointer access in Ripper#initialize
In `rb_ruby_ripper_parser_allocate`, `r->p` is NULL between creating
`self` and `parser_params` assignment.  As GC can happen there, the
typed-data functions for it need to consider the case.
2023-07-16 15:41:10 +09:00
yui-knk
82cd70ef93 Use functions defined by parser_st.c to reduce dependency on st.c 2023-07-15 12:50:40 +09:00
yui-knk
b2bccf053b Include ripper.h into $distcleanfiles 2023-07-09 13:02:25 +09:00
Nobuyoshi Nakada
c89f519170 More dependencies for ripper 2023-06-29 18:47:56 +09:00
Peter Zhu
a500eb9f8c Fix memory leak in Ripper
The following script leaks memory in Ripper:

```ruby
require "ripper"

20.times do
  100_000.times do
    Ripper.parse("")
  end

  puts `ps -o rss= -p #{$$}`
end
```
2023-06-28 09:50:51 -04:00
Nobuyoshi Nakada
70483f6ca4 Add missing dependencies 2023-06-12 19:10:29 +09:00
yui-knk
b481b673d7 [Feature #19719] Universal Parser
Introduce Universal Parser mode for the parser.
This commit includes these changes:

* Introduce `UNIVERSAL_PARSER` macro. All of CRuby related functions
  are passed via `struct rb_parser_config_struct` when this macro is enabled.
* Add CI task with 'cppflags=-DUNIVERSAL_PARSER' for ubuntu.
2023-06-12 18:23:48 +09:00
yui-knk
7b803eafa2 Ripper does not depend on Bison [ci skip]
It also uses Lrama, so there is no dependency on Bison.
2023-06-03 10:34:24 +09:00
yui-knk
3a4206c7a1 No need to define "BISON" on extconf.rb
"BISON" is defined in "ext/ripper/depend".
2023-06-02 09:28:30 +09:00
Nobuyoshi Nakada
3fe45a3123 Process parse.y without temporary files 2023-05-15 19:10:24 +09:00
Nobuyoshi Nakada
bdaa491565 Add user argument to some macros used by bison 2023-05-14 15:38:48 +09:00
Nobuyoshi Nakada
3150516aab Preprocess input parse.y from stdin 2023-05-14 15:38:48 +09:00
Yuichiro Kaneko
a1b01e7701 Use Lrama LALR parser generator instead of Bison
https://bugs.ruby-lang.org/issues/19637

Co-authored-by: Nobuyoshi Nakada <nobu@ruby-lang.org>
2023-05-12 18:25:10 +09:00
Matt Valentine-House
2a34bcaa10 Update VPATH for socket, & dependencies
The socket extension's rubysocket.h pulls in the "private" include/gc.h,
which now depends on vm_core.h. vm_core.h pulls in id.h.

When tool/update-deps generates the dependencies for the makefiles, it
generates the line for id.h to be based on VPATH, which is configured in
the extconf.rb for each of the extensions. By default VPATH does not
include the actual source directory of the current Ruby so the
dependency fails to resolve and linking fails.

We need to append the topdir and top_srcdir to VPATH to have the
dependency picked up correctly (and I believe we need both of these to
cope with in-tree and out-of-tree builds).

I copied this from the approach taken in
https://github.com/ruby/ruby/blob/master/ext/objspace/extconf.rb#L3
2023-04-06 11:07:16 +01:00