489 Commits

Author SHA1 Message Date
HASUMI Hitoshi
55a402bb75 Add line_count field to rb_ast_body_t
This patch adds `int line_count` field to `rb_ast_body_t` structure.
Instead, we no longer cast `script_lines` to Fixnum.

## Background

Ref https://github.com/ruby/ruby/pull/10618

In the PR above, we have decoupled IMEMO from `rb_ast_t`.
This means we could lift the five-words-restriction of the structure
that forced us to unionize `rb_ast_t *` and `FIXNUM` in one field.

## Relating refactor

- Remove the second parameter of `rb_ruby_ast_new()` function

## Attention

I will remove a code that assigns -1 to line_count, in `rb_binding_add_dynavars()`
of vm.c, because I don't think it is necessary.
But I will make another PR for this so that we can atomically revert
in case I was wrong (See the comment on the code)
2024-04-27 12:08:26 +09:00
yui-knk
140c59c633 Set SCRIPT_LINES__ outside of parser
Parser should not depend on functions defiend on "ruby_parser.c".
2024-04-26 20:34:49 +09:00
HASUMI Hitoshi
2244c58b00 [Universal parser] Decouple IMEMO from rb_ast_t
This patch removes the `VALUE flags` member from the `rb_ast_t` structure making `rb_ast_t` no longer an IMEMO object.

## Background

We are trying to make the Ruby parser generated from parse.y a universal parser that can be used by other implementations such as mruby.
To achieve this, it is necessary to exclude VALUE and IMEMO from parse.y, AST, and NODE.

## Summary (file by file)

- `rubyparser.h`
  - Remove the `VALUE flags` member from `rb_ast_t`
- `ruby_parser.c` and `internal/ruby_parser.h`
  - Use TypedData_Make_Struct VALUE which wraps `rb_ast_t` `in ast_alloc()` so that GC can manage it
    - You can retrieve `rb_ast_t` from the VALUE by `rb_ruby_ast_data_get()`
  - Change the return type of `rb_parser_compile_XXXX()` functions from `rb_ast_t *` to `VALUE`
  - rb_ruby_ast_new() which internally `calls ast_alloc()` is to create VALUE vast outside ruby_parser.c
- `iseq.c` and `vm_core.h`
  - Amend the first parameter of `rb_iseq_new_XXXX()` functions from `rb_ast_body_t *` to `VALUE`
  - This keeps the VALUE of AST on the machine stack to prevent being removed by GC
- `ast.c`
  - Almost all change is replacement `rb_ast_t *ast` with `VALUE vast` (sorry for the big diff)
  - Fix `node_memsize()`
    - Now it includes `rb_ast_local_table_link`, `tokens` and script_lines
- `compile.c`, `load.c`, `node.c`, `parse.y`, `proc.c`, `ruby.c`, `template/prelude.c.tmpl`, `vm.c` and `vm_eval.c`
  - Follow-up due to the above changes
- `imemo.{c|h}`
  - If an object with `imemo_ast` appears, considers it a bug

Co-authored-by: Nobuyoshi Nakada <nobu@ruby-lang.org>
2024-04-26 11:21:08 +09:00
Peter Zhu
214811974b Add ruby_mimcalloc
Many places call ruby_mimmalloc then MEMZERO. This can be reduced by
using ruby_mimcalloc instead.
2024-04-24 15:30:43 -04:00
yui-knk
33929ef995 Move encoding object conversion outside of parser
Reduce the parser's dependence on `VALUE` and `rb_enc_from_encoding`.
2024-04-23 13:11:46 +09:00
yui-knk
2992e1074a Refactor parser compile functions
Refactor parser compile functions to reduce the dependence
on ruby functions.
This commit includes these changes

1. Refactor `gets`, `input` and `gets_` of `parser_params`

Parser needs two different data structure to get next line, function (`gets`) and input data (`input`).
However `gets_` is used for both function (`call`) and input data (`ptr`).
`call` is used for managing general callback function when `rb_ruby_parser_compile_generic` is used.
`ptr` is used for managing the current pointer on String when `parser_compile_string` is used.
This commit changes parser to used only `gets` and `input` then removes `gets_`.

2. Move parser_compile functions and `gets` functions from parse.y to ruby_parser.c

This change reduces the dependence on ruby functions from parser.

3. Change ruby_parser and ripper to take care of `VALUE input` GC mark

Move the responsibility of calling `rb_gc_mark` for `VALUE input` from parser to ruby_parser and ripper.
`input` is arbitrary data pointer from the viewpoint of parser.

4. Introduce rb_parser_compile_array function

Caller of `rb_parser_compile_generic` needs to take care about GC because ruby_parser doesn’t know
about the detail of `lex_gets` and `input`.
Introduce `rb_parser_compile_array` to reduce the complexity of ast.c.
2024-04-23 07:20:22 +09:00
yui-knk
af169472c7 Remove unused function 2024-04-20 19:30:26 +09:00
yui-knk
d07df8567e Parser and universal parser share wrapper functions 2024-04-20 18:08:33 +09:00
Peter Zhu
81240493a3 Remove unused rb_size_pool_slot_size 2024-04-18 10:19:42 -04:00
Jean Boussier
3a7846b1aa Add a hint of ASCII-8BIT being BINARY
[Feature #18576]

Since outright renaming `ASCII-8BIT` is deemed to backward incompatible,
the next best thing would be to only change its `#inspect`, particularly
in exception messages.
2024-04-18 10:17:26 +02:00
Matt Valentine-House
065710c0f5 Initialize external GC Library
Co-Authored-By: Peter Zhu <peter@peterzhu.ca>
2024-04-15 19:50:47 +01:00
HASUMI Hitoshi
9b1e97b211 [Universal parser] DeVALUE of p->debug_lines and ast->body.script_lines
This patch is part of universal parser work.

## Summary
- Decouple VALUE from members below:
  - `(struct parser_params *)->debug_lines`
  - `(rb_ast_t *)->body.script_lines`
- Instead, they are now `rb_parser_ary_t *`
  - They can also be a `(VALUE)FIXNUM` as before to hold line count
- `ISEQ_BODY(iseq)->variable.script_lines` remains VALUE
  - In order to do this,
  - Add `VALUE script_lines` param to `rb_iseq_new_with_opt()`
  - Introduce `rb_parser_build_script_lines_from()` to convert `rb_parser_ary_t *` into `VALUE`

## Other details
- Extend `rb_parser_ary_t *`. It previously could only store `rb_parser_ast_token *`, now can store script_lines, too
- Change tactics of building the top-level `SCRIPT_LINES__` in `yycompile0()`
  - Before: While parsing, each line of the script is added to `SCRIPT_LINES__[path]`
  - After: After `yyparse(p)`, `SCRIPT_LINES__[path]` will be built from `p->debug_lines`
- Remove the second parameter of `rb_parser_set_script_lines()` to make it simple
- Introduce `script_lines_free()` to be called from `rb_ast_free()` because the GC no longer takes care of the script_lines
- Introduce `rb_parser_string_deep_copy()` in parse.y to maintain script_lines when `rb_ruby_parser_free()` called
  - With regard to this, please see *Future tasks* below

## Future tasks
- Decouple IMEMO from `rb_ast_t *`
  - This lifts the five-members-restriction of Ruby object,
  - So we will be able to move the ownership of the `lex.string_buffer` from parser to AST
  - Then we remove `rb_parser_string_deep_copy()` to make the whole thing simple
2024-04-15 20:51:54 +09:00
yui-knk
515e52a0b1 Emit warn event for duplicated hash keys on ripper
Need to use `rb_warn` macro instead of calling `rb_compile_warn`
directly to emit `warn` event on ripper.
2024-04-15 06:29:25 +09:00
Jean Boussier
1b830740ba compile.c: use rb_enc_interned_str to reduce allocations
The `rb_fstring(rb_enc_str_new())` pattern is inneficient because:

- It passes a mutable string to `rb_fstring` so if it has to be interned
  it will first be duped.
- It an equivalent interned string already exists, we allocated the string
  for nothing.

With `rb_enc_interned_str` we either directly get the pre-existing string
with 0 allocations, or efficiently directly intern the one we create
without first duping it.
2024-04-11 09:04:31 +02:00
Samuel Williams
b473d304d4
Revert "Enumerator should use a non-blocking fiber. (#10478)" (#10480)
This reverts commit dfa0897de89251a631a67460b941cd24a14c9b55.

This commit accidentally included some change in `parse.h`. Reverting
and re-applying the relevant changes.
2024-04-07 10:20:22 +00:00
Samuel Williams
dfa0897de8
Enumerator should use a non-blocking fiber. (#10478) 2024-04-07 19:18:09 +12:00
yui-knk
6bfabd076b Remove undefined function's prototype declaration 2024-04-07 11:21:04 +09:00
yui-knk
7767db2379 Fix ripper to dispatch warning event for duplicated when clause
Need to separate `check_literal_when` function for parser and
ripper otherwise warning event is not dispatched because
parser `rb_warning1` is used in ripper.
2024-04-07 11:15:09 +09:00
Nobuyoshi Nakada
58918788ab [Bug #20342] Consider wrapped load in main methods 2024-04-05 01:33:08 +09:00
Matt Valentine-House
ef19234b10 Merge rb_objspace_alloc and Init_heap.
Co-Authored-By: Peter Zhu <peter@peterzhu.ca>
2024-04-04 15:00:57 +01:00
Nobuyoshi Nakada
3ac6a03b2e
Revert "hijack SIGCHLD handler for internal use"
This reverts commit 054a412d540e7ed2de63d68da753f585ea6616c3.
SIGCHLD `waidpid`, `waitpid_lock` and related code, have been removed
at ruby/ruby#7527.
2024-04-04 21:48:14 +09:00
yui-knk
6056773105 Move shareable_constant_value logic from parse.y to compile.c 2024-04-04 08:44:10 +09:00
yui-knk
e816ab0b0c Remove rb_imemo_tmpbuf_t from parser
No parser semantic value types are `VALUE` then no need to
use imemo for managing semantic value stack anymore.
2024-04-02 19:37:27 +09:00
yui-knk
799e854897 [Feature #20331] Simplify parser warnings for hash keys duplication and when clause duplication
This commit simplifies warnings for hash keys duplication and when clause duplication,
based on the discussion of https://bugs.ruby-lang.org/issues/20331.
Warnings are reported only when strings are same to ohters.
2024-04-02 08:26:58 +09:00
yui-knk
8066e3ea6e Remove VALUE from struct rb_strterm_struct
In the past, it was imemo. However a075c55 changed it.
Therefore no need to use `VALUE` for the first field.
2024-04-02 07:26:35 +09:00
Peter Zhu
f14e52c8c4 Fix setting GC stress at boot when objspace not available 2024-03-27 09:39:23 -04:00
eileencodes
e16086b7f2 Refactor init_copy gc attributes
This PR moves `rb_copy_wb_protected_attribute` and
`rb_gc_copy_finalizer` into a single function called
`rb_gc_copy_attributes` to be called by `init_copy`. This reduces the
surface area of the GC API.

Co-authored-by: Peter Zhu <peter@peterzhu.ca>
2024-03-26 14:29:36 -04:00
Peter Zhu
9cf754b648 Fix --debug=gc_stress flag
ruby_env_debug_option gets called after Init_gc_stress, so the
--debug=gc_stress flag never works.
2024-03-25 13:07:39 -04:00
S-H-GAMELINKS
060a71d4e7 Fix Ripper memory allocation size when enabled Universal Parser
The size of `struct parser_params` is 8 bytes difference in `ripper_s_allocate` and `rb_ruby_parser_allocate` when the universal parser is
enabled.
This causes a situation where `*r->p` is not fully initialized in `ripper_s_allocate` as shown below.

```console
(gdb) p *r->p
$2 = {heap = 0x0, lval = 0x0, yylloc = 0x0, lex = {strterm = 0x0, gets = 0x0, input = 0, string_buffer = {head = 0x0, last = 0x0}, lastlin
e = 0x0,
    nextline = 0x0, pbeg = 0x0, pcur = 0x0, pend = 0x0, ptok = 0x0, gets_ = {ptr = 0, call = 0x0}, state = EXPR_NONE, paren_nest = 0, lpar
_seen = 0,
    debug = 0, has_shebang = 0, token_seen = 0, token_info_enabled = 0, error_p = 0, cr_seen = 0, value = 0, result = 0, parsing_thread = 0, s_value = 0,
    s_lvalue = 0, s_value_stack = 2097}
````

This seems to cause `double free or corruption (!prev)` and SEGV.
So, fixing this by introduce `rb_ripper_parser_params_allocate` and `rb_ruby_parser_config` functions for Ripper, and `struct parser_params` same size is returned.
2024-03-21 18:10:02 +09:00
Peter Zhu
e07441f05f Make rb_aligned_malloc private
It is not used anywhere else.
2024-03-20 10:27:41 -04:00
Takashi Kokubun
cbcb2d46fc
[DOC] Unify Doxygen formats (#10285) 2024-03-19 10:59:25 -07:00
Étienne Barrié
12be40ae6b Implement chilled strings
[Feature #20205]

As a path toward enabling frozen string literals by default in the future,
this commit introduce "chilled strings". From a user perspective chilled
strings pretend to be frozen, but on the first attempt to mutate them,
they lose their frozen status and emit a warning rather than to raise a
`FrozenError`.

Implementation wise, `rb_compile_option_struct.frozen_string_literal` is
no longer a boolean but a tri-state of `enabled/disabled/unset`.

When code is compiled with frozen string literals neither explictly enabled
or disabled, string literals are compiled with a new `putchilledstring`
instruction. This instruction is identical to `putstring` except it marks
the String with the `STR_CHILLED (FL_USER3)` and `FL_FREEZE` flags.

Chilled strings have the `FL_FREEZE` flag as to minimize the need to check
for chilled strings across the codebase, and to improve compatibility with
C extensions.

Notes:
  - `String#freeze`: clears the chilled flag.
  - `String#-@`: acts as if the string was mutable.
  - `String#+@`: acts as if the string was mutable.
  - `String#clone`: copies the chilled flag.

Co-authored-by: Jean Boussier <byroot@ruby-lang.org>
2024-03-19 09:26:49 +01:00
Peter Zhu
3f5f04afa7 Remove duplicated function prototype rb_gc_disable_no_rest 2024-03-18 16:39:04 -04:00
Peter Zhu
4469729558 Remove rb_raw_obj_info_basic
It's not used outside of gc.c.
2024-03-18 10:19:11 -04:00
tompng
0ff2c7fe6f Faster Integer.sqrt for large bignum
Integer.sqrt uses Newton's method.
This pull request reduces the precision which was unnecessarily high in each calculation step.
2024-03-18 13:52:27 +09:00
Peter Zhu
ff51dc5654 [Feature #20265] Remove rb_newobj_of and RB_NEWOBJ_OF 2024-03-14 12:53:04 -04:00
Jean Boussier
315bde5a0f Exception#set_backtrace accept arrays of Backtrace::Location
[Feature #13557]

Setting the backtrace with an array of strings is lossy. The resulting
exception will return nil on `#backtrace_locations`.

By accepting an array of `Backtrace::Location` instance, we can rebuild
a `Backtrace` instance and have a fully functioning Exception.

Co-Authored-By: Étienne Barrié <etienne.barrie@gmail.com>
2024-03-14 11:38:40 +01:00
Peter Zhu
6ad347a105 Don't directly read the SIZE_POOL_COUNT in shapes
This removes the assumption about SIZE_POOL_COUNT for shapes.
2024-03-13 09:55:52 -04:00
Peter Zhu
2fc551e34e Simplify NEWOBJ_OF macro 2024-03-13 09:05:52 -04:00
Peter Zhu
2c349cf4b6 Use NEWOBJ_OF_ec in NEWOBJ_OF_0 2024-03-11 19:30:09 -04:00
Jean Boussier
2d80b6093f Retire RUBY_MARK_UNLESS_NULL
Marking `Qnil` or `Qfalse` works fine, having
an extra macro to avoid it isn't needed.
2024-03-08 14:13:14 +01:00
Jean Boussier
d4f3dcf4df Refactor VM root modules
This `st_table` is used to both mark and pin classes
defined from the C API. But `vm->mark_object_ary` already
does both much more efficiently.

Currently a Ruby process starts with 252 rooted classes,
which uses `7224B` in an `st_table` or `2016B` in an `RArray`.

So a baseline of 5kB saved, but since `mark_object_ary` is
preallocated with `1024` slots but only use `405` of them,
it's a net `7kB` save.

`vm->mark_object_ary` is also being refactored.

Prior to this changes, `mark_object_ary` was a regular `RArray`, but
since this allows for references to be moved, it was marked a second
time from `rb_vm_mark()` to pin these objects.

This has the detrimental effect of marking these references on every
minors even though it's a mostly append only list.

But using a custom TypedData we can save from having to mark
all the references on minor GC runs.

Addtionally, immediate values are now ignored and not appended
to `vm->mark_object_ary` as it's just wasted space.
2024-03-06 15:33:43 -05:00
Jean Boussier
b4a69351ec Move FL_SINGLETON to FL_USER1
This frees FL_USER0 on both T_MODULE and T_CLASS.

Note: prior to this, FL_SINGLETON was never set on T_MODULE,
so checking for `FL_SINGLETON` without first checking that
`FL_TYPE` was `T_CLASS` was valid. That's no longer the case.
2024-03-06 13:11:41 -05:00
Jean Boussier
e626da82ea Don't pin named structs defined in Ruby
[Bug #20311]

`rb_define_class_under` assumes it's called from C and that the
reference might be held in a C global variable, so it adds the
class to the VM root.

In the case of `Struct.new('Name')` it's wasteful and make
the struct immortal.
2024-03-01 08:23:38 +01:00
Peter Zhu
5481dbef07 Make rb_define_finalizer_no_check private 2024-02-28 13:45:19 -05:00
Peter Zhu
dcc976add9 Remove unused rb_gc_id2ref_obj_tbl 2024-02-28 12:21:38 -05:00
Peter Zhu
78ae6dbb11 Remove rb_objspace_marked_object_p
rb_objspace_marked_object_p is no longer used in the objspace module, so
we can remove it.
2024-02-26 17:05:34 -05:00
Peter Zhu
7538703d1b Make rb_objspace_data_type_memsize private
rb_objspace_data_type_memsize is not used in the objspace module, so we
can make it private.
2024-02-26 17:05:34 -05:00
Peter Zhu
c9b6cd4223 Remove unused rb_objspace_each_objects_without_setup 2024-02-26 14:34:24 -05:00
Peter Zhu
e65315a725 Extract imemo functions from gc.c into imemo.c 2024-02-22 11:35:09 -05:00