128 Commits

Author SHA1 Message Date
Matt Valentine-House
7444f415db rename rb_gc_obj_free_on_sweep -> rb_gc_obj_needs_cleanup_p 2026-01-26 18:01:09 +00:00
Matt Valentine-House
8e73aa7ffe We don't need this wrapper function anymore 2026-01-26 18:01:09 +00:00
Matt Valentine-House
efde37b712 Move the gc fast path out of the default GC impl
It relies too much on VM level concerns, such that it can't be built
with modular GC enabled.

We'll move it into the VM, and then expose it to the GC
implementations so they can use it.
2026-01-26 18:01:09 +00:00
Matt Valentine-House
211714f1bf Clarify the use of some FLAGS 2026-01-26 18:01:09 +00:00
Matt Valentine-House
c21f3490d1 Implement a fast path for sweeping (gc_sweep_fast_path_p).
[Feature #21846]

There is a single path through our GC Sweeping code, and we always call
rb_gc_obj_free_vm_weak_references and rb_gc_obj_free before adding the
object back to the freelist.

We do this even when the object has no external resources that require
being free'd and has no weak references pointing to it.

This commit introduces a conservative fast path through gc_sweep_plane
that uses the object flags to identify certain cases where these calls
can be skipped - for these objects we just add them straight back on the
freelist. Any object for which gc_sweep_fast_path_p returns false will
use the current full sweep code (referred to here as the slow path).

Currently there are 2 checks that
will _always_ require an object to go down the slow path:

1. Has its object_id been observed and stored in the id2ref_table
2. Has it got generic ivars in the gen_fields table

If neither of these is true, then we run some flag checks on the object
and send the following cases down the fast path:

- Objects that are not heap allocated
- Embedded strings that aren't in the fstring table
- Embedded Arrays
- Embedded Hashes
- Embedded Bignums
- Embedded Strings
- Floats, Rationals and Complex
- Various IMEMO subtypes that do no allocation
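
As a rough illustration of the order of these checks, here is a toy Ruby model of the decision. It is not the C code in the patch: `SweptObject`, its fields and `sweep_fast_path?` are hypothetical names, and the non-heap-allocated and IMEMO cases are left out.

```ruby
# Toy model only: every name below is hypothetical and stands in for object flags.
SweptObject = Struct.new(:type, :embedded, :has_object_id, :has_generic_ivars,
                         :in_fstring_table, keyword_init: true)

EMBEDDABLE_TYPES  = %i[string array hash bignum].freeze
ALWAYS_FAST_TYPES = %i[float rational complex].freeze

def sweep_fast_path?(obj)
  # The two checks that always force the slow path.
  return false if obj.has_object_id || obj.has_generic_ivars
  return true if ALWAYS_FAST_TYPES.include?(obj.type)
  # Other types only qualify when they are embedded.
  return false unless EMBEDDABLE_TYPES.include?(obj.type) && obj.embedded

  # Embedded strings still take the slow path when they are in the fstring table.
  !(obj.type == :string && obj.in_fstring_table)
end
```

In the patch itself the decision is made from the object flags inside the sweep loop; the model only mirrors which cases end up on which path.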

We've benchmarked this code using ruby-bench as well as the gcbench
benchmarks inside Ruby (benchmarks/gc) and this patch results in a
modest speed improvement on almost all of the headline benchmarks (2% in
railsbench with YJIT enabled), and an observable 30% improvement in time
spent sweeping during the GC benchmarks:

```
master: ruby 4.1.0dev (2026-01-19T12:03:33Z master 859920dfd2) +YJIT +PRISM [x86_64-linux]
experiment: ruby 4.1.0dev (2026-01-16T21:36:46Z mvh-sweep-fast-pat.. c3ffe377a1) +YJIT +PRISM [x86_64-linux]

--------------  -----------  ----------  ---------------  ----------  ------------------  -----------------
bench           master (ms)  stddev (%)  experiment (ms)  stddev (%)  experiment 1st itr  master/experiment
lobsters        N/A          N/A         N/A              N/A         N/A                 N/A
activerecord    132.5        0.9         132.5            1.0         1.056               1.001
chunky-png      577.2        0.4         580.1            0.4         0.994               0.995
erubi-rails     902.9        0.2         894.3            0.2         1.040               1.010
hexapdf         1763.9       3.3         1760.6           3.7         1.027               1.002
liquid-c        56.9         0.6         56.7             1.4         1.004               1.003
liquid-compile  46.3         2.1         46.1             2.1         1.005               1.004
liquid-render   77.8         0.8         75.1             0.9         1.023               1.036
mail            114.7        0.4         113.0            1.4         1.054               1.015
psych-load      1635.4       1.4         1625.9           0.5         0.988               1.006
railsbench      1685.4       2.4         1650.1           2.0         0.989               1.021
rubocop         133.5        8.1         130.3            7.8         1.002               1.024
ruby-lsp        140.3        1.9         137.5            1.8         1.007               1.020
sequel          64.6         0.7         63.9             0.7         1.003               1.011
shipit          1196.2       4.3         1181.5           4.2         1.003               1.012
--------------  -----------  ----------  ---------------  ----------  ------------------  -----------------

Legend:
- experiment 1st itr: ratio of master/experiment time for the first benchmarking iteration.
- master/experiment: ratio of master/experiment time. Higher is better for experiment. Above 1 represents a speedup.
```

```
Benchmark      │    Wall(B)   Sweep(B)  Mark(B) │    Wall(E)   Sweep(E)  Mark(E) │   Wall Δ  Sweep Δ
───────────────┼─────────────────────────────────┼─────────────────────────────────┼──────────────────
null           │     0.000s        1ms      4ms │     0.000s        1ms      4ms │       0%       0%
hash1          │     4.330s      875ms     46ms │     3.960s      531ms     44ms │    +8.6%   +39.3%
hash2          │     6.356s      243ms    988ms │     6.298s      176ms    1.03s │    +0.9%   +27.6%
rdoc           │    37.337s      2.42s    1.09s │    36.678s      2.11s    1.20s │    +1.8%   +13.1%
binary_trees   │     3.366s      426ms    252ms │     3.082s      275ms    239ms │    +8.4%   +35.4%
ring           │     5.252s       14ms    2.47s │     5.327s       12ms    2.43s │    -1.4%   +14.3%
redblack       │     2.966s       28ms     41ms │     2.940s       21ms     38ms │    +0.9%   +25.0%
───────────────┼─────────────────────────────────┼─────────────────────────────────┼──────────────────

Legend: (B) = Baseline, (E) = Experiment, Δ = improvement (positive = faster)
        Wall = total wallclock, Sweep = GC sweeping time, Mark = GC marking time
        Times are median of 3 runs
```
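
The Sweep Δ column is consistent with a relative reduction in sweep time, `(B - E) / B`. A small sketch that recomputes two rows from the rounded values shown in the table; the helper name is made up for this example, and rows whose inputs are rounded more coarsely (such as rdoc) can differ in the last digit:

```ruby
# Relative improvement of the experiment over the baseline, in percent.
def improvement_percent(baseline, experiment)
  ((baseline - experiment) / baseline.to_f * 100).round(1)
end

improvement_percent(875, 531) # hash1 sweep (ms), about 39.3
improvement_percent(426, 275) # binary_trees sweep (ms), about 35.4
```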

These results are also borne out when YJIT is disabled:

```
master: ruby 4.1.0dev (2026-01-19T12:03:33Z master 859920dfd2) +PRISM [x86_64-linux]
experiment: ruby 4.1.0dev (2026-01-16T21:36:46Z mvh-sweep-fast-pat.. c3ffe377a1) +PRISM [x86_64-linux]

--------------  -----------  ----------  ---------------  ----------  ------------------  -----------------
bench           master (ms)  stddev (%)  experiment (ms)  stddev (%)  experiment 1st itr  master/experiment
lobsters        N/A          N/A         N/A              N/A         N/A                 N/A
activerecord    389.6        0.3         377.5            0.3         1.032               1.032
chunky-png      1123.4       0.2         1109.2           0.2         1.013               1.013
erubi-rails     1754.3       0.1         1725.7           0.1         1.035               1.017
hexapdf         3346.5       0.9         3326.9           0.7         1.003               1.006
liquid-c        84.0         0.5         83.5             0.5         0.992               1.006
liquid-compile  74.0         1.5         73.5             1.4         1.011               1.008
liquid-render   199.9        0.4         199.6            0.4         1.000               1.002
mail            177.8        0.4         176.4            0.4         1.069               1.008
psych-load      2749.6       0.7         2777.0           0.0         0.980               0.990
railsbench      2983.0       1.0         2965.5           0.8         1.041               1.006
rubocop         228.8        1.0         227.5            1.2         1.015               1.005
ruby-lsp        221.8        0.9         216.1            0.8         1.011               1.026
sequel          89.1         0.5         89.1             1.8         1.005               1.000
shipit          2385.6       1.6         2371.8           1.0         1.002               1.006
--------------  -----------  ----------  ---------------  ----------  ------------------  -----------------

Legend:
- experiment 1st itr: ratio of master/experiment time for the first benchmarking iteration.
- master/experiment: ratio of master/experiment time. Higher is better for experiment. Above 1 represents a speedup.
```

```
Benchmark      │    Wall(B)   Sweep(B)  Mark(B) │    Wall(E)   Sweep(E)  Mark(E) │   Wall Δ  Sweep Δ
───────────────┼─────────────────────────────────┼─────────────────────────────────┼──────────────────
null           │     0.000s        1ms      4ms │     0.000s        1ms      3ms │       0%       0%
hash1          │     4.349s      877ms     45ms │     4.045s      532ms     44ms │    +7.0%   +39.3%
hash2          │     6.575s      235ms    967ms │     6.540s      181ms    1.04s │    +0.5%   +23.0%
rdoc           │    45.782s      2.23s    1.14s │    44.925s      1.90s    1.01s │    +1.9%   +15.0%
binary_trees   │     6.433s      426ms    252ms │     6.268s      278ms    240ms │    +2.6%   +34.7%
ring           │     6.584s       17ms    2.33s │     6.738s       13ms    2.33s │    -2.3%   +30.8%
redblack       │    13.334s       31ms     42ms │    13.296s       24ms    107ms │    +0.3%   +22.6%
───────────────┼─────────────────────────────────┼─────────────────────────────────┼──────────────────

Legend: (B) = Baseline, (E) = Experiment, Δ = improvement (positive = faster)
        Wall = total wallclock, Sweep = GC sweeping time, Mark = GC marking time
        Times are median of 3 runs
```
2026-01-26 18:01:09 +00:00
Nobuyoshi Nakada
8ca2f6489b
Revert "Fix rb_interned_str: create strings with BINARY (aka ASCII_8BIT) encoding"
This reverts commit 1f3c52dc155fb7fbc42fc8e146924091ba1dfa20.
2026-01-17 13:38:55 +09:00
Peter Zhu
8a586af33b Don't force major GC when there are allocatable slots
[Bug #21838]

When we have allocatable slots, we can grow the heap instead of forcing
a major GC. This prevents major GC from being run very often in certain situations.
See the ticket for more details.
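
A minimal sketch of that decision, assuming a simplified model in which the GC only looks at free and allocatable slot counts. The method and parameter names are hypothetical, and the real heuristics in the default GC are more involved:

```ruby
# Hypothetical model: decides what to do when a heap is short on free slots.
def action_when_slots_run_low(free_slots:, wanted_slots:, allocatable_slots:)
  return :continue if free_slots >= wanted_slots

  # With room left to grow, adding pages is preferred over forcing a major GC.
  allocatable_slots > 0 ? :grow_heap : :force_major_gc
end

action_when_slots_run_low(free_slots: 10, wanted_slots: 100, allocatable_slots: 500)
# => :grow_heap
```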

On ruby-bench, we can see that this patch doesn't cause any significant
regressions:

    --------------  -----------  ----------  ---------  -----------  ----------  ---------  --------------  -------------
    bench           master (ms)  stddev (%)  RSS (MiB)  branch (ms)  stddev (%)  RSS (MiB)  branch 1st itr  master/branch
    activerecord    148.2        0.3         59.2       150.0        0.8         69.7       1.015           0.988
    chunky-png      435.2        0.3         72.9       438.8        0.1         66.7       0.993           0.992
    erubi-rails     733.8        1.2         118.7      704.8        0.2         98.3       1.077           1.041
    hexapdf         1400.4       1.1         247.0      1405.0       0.9         223.7      0.986           0.997
    liquid-c        32.5         3.3         32.8       32.5         2.1         30.7       1.042           0.999
    liquid-compile  31.0         1.7         35.1       33.4         3.9         32.8       0.938           0.928
    liquid-render   84.7         0.4         30.8       86.3         0.4         30.8       0.981           0.982
    lobsters        594.7        0.6         310.5      596.6        0.4         306.0      1.057           0.997
    mail            75.6         2.8         53.3       76.9         0.7         53.2       0.968           0.982
    psych-load      1122.8       1.2         29.2       1145.1       0.4         31.7       0.964           0.981
    railsbench      1244.7       0.3         115.5      1254.8       1.1         115.2      0.939           0.992
    rubocop         103.7        0.5         94.1       104.3        0.5         92.4       0.985           0.994
    ruby-lsp        88.3         0.6         78.5       88.5         1.2         77.9       0.992           0.997
    sequel          26.9         0.9         33.6       28.3         1.4         32.1       0.954           0.952
    shipit          1119.3       1.5         171.4      1075.7       2.1         162.5      1.873           1.040
    --------------  -----------  ----------  ---------  -----------  ----------  ---------  --------------  -------------
2026-01-16 17:02:03 -05:00
Jean Boussier
1f3c52dc15 Fix rb_interned_str: create strings with BINARY (aka ASCII_8BIT) encoding
[Bug #21842]

The documentation always stated as much, and it's consistent with the
rb_str_* family of functions.
2026-01-16 22:44:38 +01:00
John Hawthorn
c56ce8a6c1 Remove objspace->flags.has_newobj_hook
We aren't using this anymore and the hook is called in gc.c
2026-01-16 12:46:20 -08:00
Peter Zhu
6e480e6714 Allow symbols to move in compaction 2026-01-15 17:57:27 -05:00
Peter Zhu
f2833e358c Fix generational GC for weak references
Fixes issue pointed out in https://bugs.ruby-lang.org/issues/21084#note-7.
The following script crashes:

    wmap = ObjectSpace::WeakMap.new

    GC.disable # only manual GCs
    GC.start
    GC.start

    retain = []
    50.times do
      k = Object.new
      wmap[k] = true
      retain << k
    end

    GC.start # wmap promoted, other objects still young

    retain.clear

    GC.start(full_mark: false)

    wmap.keys.each(&:itself) # call method on keys to cause crash
2025-12-30 10:59:21 -05:00
Peter Zhu
01cd9c9fad Add rb_gc_register_pinning_obj 2025-12-29 09:03:31 -05:00
Peter Zhu
10b97f52fd Implement declaring weak references
[Feature #21084]

 # Summary

The current way of marking weak references uses `rb_gc_mark_weak(VALUE *ptr)`.
This presents challenges because Ruby's GC is incremental, meaning that if the
`ptr` changes (e.g. realloc'd or free'd), then we could have an invalid memory
access. This also overwrites `*ptr = Qundef` if `*ptr` is dead, which prevents
any cleanup from being run (e.g. freeing memory or deleting entries from hash
tables). This ticket proposes `rb_gc_declare_weak_references` which declares
that an object has weak references and calls a cleanup function after marking,
allowing the object to clean up any memory for dead objects.

 # Introduction

In [[Feature #19783]](https://bugs.ruby-lang.org/issues/19783), I introduced an
API allowing objects to mark weak references, the function signature looks like
this:

```c
void rb_gc_mark_weak(VALUE *ptr);
```

`rb_gc_mark_weak` is called during the marking phase of the GC to specify that
the memory at `ptr` holds a pointer to a Ruby object that is weakly referenced.
`rb_gc_mark_weak` appends this pointer to a list that is processed after the
marking phase of the GC. If the object at `*ptr` is no longer alive, then it
overwrites the object reference with a special value (`*ptr = Qundef`).

However, this API resulted in two challenges:

1. Ruby's default GC is incremental, which means that the GC is not run in one
   phase, but rather split into chunks of work that interleaves with Ruby
   execution. The `ptr` passed into `rb_gc_mark_weak` could be on the malloc
   heap, and that memory could be realloc'd or even free'd. We had to use
   workarounds such as `rb_gc_remove_weak` to ensure that there were no illegal
   memory accesses. This made `rb_gc_mark_weak` difficult to use, impacted
   runtime performance, and increased memory usage.
2. When an object dies, `rb_gc_mark_weak` only overwrites the reference with
   `Qundef`. This means that if we want to do any cleanup (e.g. free a piece of
   memory or delete a hash table entry), we could not do that and had to defer
   this process elsewhere (e.g. during marking or runtime).

In this ticket, I'm proposing a new API for weak references. Instead of an
object marking its weak references during the marking phase, the object declares
that it has weak references using the `rb_gc_declare_weak_references` function.
This declaration occurs during runtime (e.g. after the object has been created)
rather than during GC.

After an object declares that it has weak references, it will have its callback
function called after marking as long as that object is alive. This callback
function can then call a special function `rb_gc_handle_weak_references_alive_p`
to determine whether its references are alive. This will allow the callback
function to do whatever it wants on the object, allowing it to perform any
cleanup work it needs.
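
The flow described above can be modelled in a few lines of Ruby. This is a toy model under stated assumptions, not the C API: `ToyGC`, `ToyWeakMap` and their methods are invented for the example, with `alive?` standing in for `rb_gc_handle_weak_references_alive_p`.

```ruby
# Toy model of "declare, then get a callback after marking". All names are hypothetical.
class ToyGC
  def initialize
    @declared = []                    # objects that declared weak references
    @marked = {}.compare_by_identity  # objects found alive by the last mark phase
  end

  # Called at runtime, e.g. right after the holder has been created.
  def declare_weak_references(holder)
    @declared << holder
  end

  def alive?(obj)
    @marked.key?(obj)
  end

  # Simulates the end of marking: dead holders are dropped, live ones get a callback.
  def finish_marking(live_objects)
    @marked.clear
    live_objects.each { |obj| @marked[obj] = true }
    @declared.select! { |holder| alive?(holder) }
    @declared.each { |holder| holder.handle_weak_references(self) }
  end
end

class ToyWeakMap
  def initialize
    @table = {}.compare_by_identity
  end

  def []=(key, value)
    @table[key] = value
  end

  def size
    @table.size
  end

  # The callback can do real cleanup, here deleting entries whose key or value died.
  def handle_weak_references(gc)
    @table.delete_if { |key, value| !gc.alive?(key) || !gc.alive?(value) }
  end
end
```

Because the holder decides what to do in its callback, it can delete the whole entry rather than being left with a `Qundef`-style placeholder to clean up later.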

This significantly simplifies the code for `ObjectSpace::WeakMap` and
`ObjectSpace::WeakKeyMap` because it no longer needs to have the workarounds for
the limitations of `rb_gc_mark_weak`.

 # Performance

The performance results below demonstrate that `ObjectSpace::WeakMap#[]=` is now
about 60% faster because the implementation has been simplified and the number
of allocations has been reduced. We can see that there is not a significant
impact on the performance of `ObjectSpace::WeakMap#[]`.

Base:

```
ObjectSpace::WeakMap#[]=
                          4.620M (± 6.4%) i/s  (216.44 ns/i) -     23.342M in   5.072149s
ObjectSpace::WeakMap#[]
                         30.967M (± 1.9%) i/s   (32.29 ns/i) -    154.998M in   5.007157s
```

Branch:

```
ObjectSpace::WeakMap#[]=
                          7.336M (± 2.8%) i/s  (136.31 ns/i) -     36.755M in   5.013983s
ObjectSpace::WeakMap#[]
                         30.902M (± 5.4%) i/s   (32.36 ns/i) -    155.901M in   5.064060s
```
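
For reference, the "about 60% faster" figure follows from the iterations-per-second numbers above. A quick check using only the values reported in the two runs (the helper name is made up for the example):

```ruby
# How much faster the branch is than the base, in percent, from iterations/second.
def percent_faster(base_ips, branch_ips)
  ((branch_ips / base_ips.to_f - 1) * 100).round(1)
end

percent_faster(4.620e6, 7.336e6) # ObjectSpace::WeakMap#[]=, about 58.8
```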

Code:

```
require "bundler/inline"

gemfile do
  source "https://rubygems.org"
  gem "benchmark-ips"
end

wmap = ObjectSpace::WeakMap.new
key = Object.new
val = Object.new
wmap[key] = val

Benchmark.ips do |x|
  x.report("ObjectSpace::WeakMap#[]=") do |times|
    i = 0
    while i < times
      wmap[Object.new] = Object.new
      i += 1
    end
  end

  x.report("ObjectSpace::WeakMap#[]") do |times|
    i = 0
    while i < times
      wmap[key]
      wmap[val] # does not exist
      i += 1
    end
  end
end
```

 # Alternative designs

Currently, `rb_gc_declare_weak_references` is designed to be an internal-only
API. This allows us to assume the object types that call
`rb_gc_declare_weak_references`. In the future, if we want to open up this API
to third parties, we may want to change this function to something like:

```c
void rb_gc_add_cleaner(VALUE obj, void (*callback)(VALUE obj));
```

This will allow the third party to implement a custom `callback` that gets
called after the marking phase of GC to clean up any dead references. I chose
not to implement this design because it is less efficient as we would need to
store a mapping from `obj` to `callback`, which requires extra memory.
2025-12-25 09:18:17 -05:00
Peter Zhu
e2cf92eddc Move special const check to gc.c for rb_gc_impl_object_moved_p 2025-12-23 13:54:08 -05:00
Benoit Daloze
4d4f414a60 Use RBIMPL_ASSERT_OR_ASSUME instead of ASSUME for better errors when it does not hold 2025-12-16 21:00:27 +01:00
Jean Boussier
094418a6de gc.h: Reintroduce immediate guard in rb_obj_written
This guard was removed in https://github.com/ruby/ruby/pull/13497
on the justification that some GC may need to be notified even for
immediate.

But the two currently available GCs don't, and there are plenty
of assumptions that GCs don't everywhere, notably in YJIT and ZJIT.

This optimization is also not so micro (but not huge either).
I routinely see 1-2% wasted there on micro-benchmarks.

So perhaps if in the future we actually need this, it might make
sense to introduce a way for GCs to declare that as an option,
but in the meantime it's extra overhead with little gain.
2025-12-16 21:00:27 +01:00
John Hawthorn
1c29fbeca0 GC_DEBUG_STRESS_TO_CLASS should only be for debug
I believe this was accidentally left in as part of
2beb3798bac52624c3170138f8ef65869f1da6c0
2025-12-10 16:02:01 -08:00
Peter Zhu
791acc5697 Revert "gc.c: Pass shape_id to newobj_init"
This reverts commit 228d13f6ed914d1e7f6bd2416e3f5be8283be865.

This commit makes default.c and mmtk.c depend on shape.h, which prevents
them from building independently.
2025-12-05 15:40:39 -08:00
John Hawthorn
a773bbf0cc Track small malloc/free changes in thread local 2025-12-03 12:37:07 -08:00
John Hawthorn
9913d8da1f Group malloc counters together 2025-12-03 12:37:07 -08:00
Jean Boussier
228d13f6ed gc.c: Pass shape_id to newobj_init
Attempt to fix the following SEGV:

```
ruby(gc_mark) ../src/gc/default/default.c:4429
ruby(gc_mark_children+0x45) [0x560b380bf8b5] ../src/gc/default/default.c:4625
ruby(gc_mark_stacked_objects) ../src/gc/default/default.c:4647
ruby(gc_mark_stacked_objects_all) ../src/gc/default/default.c:4685
ruby(gc_marks_rest) ../src/gc/default/default.c:5707
ruby(gc_marks+0x4e7) [0x560b380c41c1] ../src/gc/default/default.c:5821
ruby(gc_start) ../src/gc/default/default.c:6502
ruby(heap_prepare+0xa4) [0x560b380c4efc] ../src/gc/default/default.c:2074
ruby(heap_next_free_page) ../src/gc/default/default.c:2289
ruby(newobj_cache_miss) ../src/gc/default/default.c:2396
ruby(RB_SPECIAL_CONST_P+0x0) [0x560b380c5df4] ../src/gc/default/default.c:2420
ruby(RB_BUILTIN_TYPE) ../src/include/ruby/internal/value_type.h:184
ruby(newobj_init) ../src/gc/default/default.c:2136
ruby(rb_gc_impl_new_obj) ../src/gc/default/default.c:2500
ruby(newobj_of) ../src/gc.c:996
ruby(rb_imemo_new+0x37) [0x560b380d8bed] ../src/imemo.c:46
ruby(imemo_fields_new) ../src/imemo.c:105
ruby(rb_imemo_fields_new) ../src/imemo.c:120
```

I have no reproduction, but my understanding based on the backtrace
and error is that GC is triggered inside `newobj_init` causing the
new object to be marked while in an incomplete state.

I believe the fix is to pass the `shape_id` down to `newobj_init`
so it can be set before the GC has a chance to trigger.
2025-12-03 19:51:48 +01:00
John Hawthorn
4161c78a9d Add remembered flag to heap dump
This should be less common than many of the other flags, so should
not inflate the heap too much. This is desirable because reducing the
number of remembered objects will improve minor GC speeds.
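
One way to inspect this from Ruby is through the per-object JSON produced by `ObjectSpace.dump`. A minimal sketch: `heap_dump_flags` is a helper made up for this example, and the `"remembered"` key name is an assumption based on the commit title, so it may be absent on Rubies without this change.

```ruby
require "json"
require "objspace"

# Returns the "flags" hash from the heap dump entry of a heap-allocated object.
# Not meant for immediates such as small Integers, which dump as bare values.
def heap_dump_flags(obj)
  JSON.parse(ObjectSpace.dump(obj)).fetch("flags", {})
end

flags = heap_dump_flags(Object.new)
flags["remembered"] # nil unless the dump reports the flag (assumed key name)
```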
2025-12-01 15:02:26 -08:00
Nobuyoshi Nakada
806e554cc0
Compare with the upper bound of the loop variable
Fix sign-compare warning
2025-11-30 14:14:03 +09:00
John Hawthorn
5e2e45fc24 Fix for modgc 2025-11-27 16:04:16 -08:00
John Hawthorn
9929dc4440 Mask off unused VWA bits 2025-11-27 16:04:16 -08:00
John Hawthorn
67a14e94c6 Set age bitmap outside of adding to freelist
This allows us to do less work when allocating a fresh page.
2025-11-26 10:40:56 -08:00
John Hawthorn
795e290ead Avoid extra set of age bit flags 2025-11-26 10:40:56 -08:00
Peter Zhu
8bf333a199 Fix live object count for multi-Ractor forking
Since we do not run a Ractor barrier before forking, it's possible that
another Ractor is halfway through allocating an object during forking.
This may lead to allocated_objects_count being off by one.

For example, the following script reproduces the bug:

    100.times do |i|
      Ractor.new(i) do |j|
        10000.times do |i|
          "#{j}-#{i}"
        end
        Ractor.receive
      end
      pid = fork { GC.verify_internal_consistency }
      _, status = Process.waitpid2 pid
      raise unless status.success?
    end

We need to run with `taskset -c 1` to force it to use a single CPU core
to more consistently reproduce the bug:

    heap_pages_final_slots: 1, total_freed_objects: 16628
    test.rb:8: [BUG] inconsistent live slot number: expect 19589, but 19588.
    ruby 4.0.0dev (2025-11-25T03:06:55Z master 55892f5994) +PRISM [x86_64-linux]

    -- Control frame information -----------------------------------------------
    c:0007 p:---- s:0029 e:000028 l:y b:---- CFUNC  :verify_internal_consistency
    c:0006 p:0004 s:0025 e:000024 l:n b:---- BLOCK  test.rb:8 [FINISH]
    c:0005 p:---- s:0022 e:000021 l:y b:---- CFUNC  :fork
    c:0004 p:0012 s:0018 E:0014c0 l:n b:---- BLOCK  test.rb:8
    c:0003 p:0024 s:0011 e:000010 l:y b:0001 METHOD <internal:numeric>:257
    c:0002 p:0005 s:0006 E:001730 l:n b:---- EVAL   test.rb:1 [FINISH]
    c:0001 p:0000 s:0003 E:001d20 l:y b:---- DUMMY  [FINISH]

    -- Ruby level backtrace information ----------------------------------------
    test.rb:1:in '<main>'
    <internal:numeric>:257:in 'times'
    test.rb:8:in 'block in <main>'
    test.rb:8:in 'fork'
    test.rb:8:in 'block (2 levels) in <main>'
    test.rb:8:in 'verify_internal_consistency'

    -- Threading information ---------------------------------------------------
    Total ractor count: 1
    Ruby thread count for this ractor: 1

    -- C level backtrace information -------------------------------------------
    ruby(rb_print_backtrace+0x14) [0x61b67ac48b60] vm_dump.c:1105
    ruby(rb_vm_bugreport) vm_dump.c:1450
    ruby(rb_bug_without_die_internal+0x5f) [0x61b67a818a28] error.c:1098
    ruby(rb_bug) error.c:1116
    ruby(gc_verify_internal_consistency_+0xbdd) [0x61b67a83d8ed] gc/default/default.c:5186
    ruby(gc_verify_internal_consistency+0x2d) [0x61b67a83d960] gc/default/default.c:5241
    ruby(rb_gc_verify_internal_consistency) gc/default/default.c:8950
    ruby(gc_verify_internal_consistency_m) gc/default/default.c:8966
    ruby(vm_call_cfunc_with_frame_+0x10d) [0x61b67a9e50fd] vm_insnhelper.c:3902
    ruby(vm_sendish+0x111) [0x61b67a9eeaf1] vm_insnhelper.c:6124
    ruby(vm_exec_core+0x84) [0x61b67aa07434] insns.def:903
    ruby(vm_exec_loop+0xa) [0x61b67a9f8155] vm.c:2811
    ruby(rb_vm_exec) vm.c:2787
    ruby(vm_yield_with_cref+0x90) [0x61b67a9fd2ea] vm.c:1865
    ruby(vm_yield) vm.c:1873
    ruby(rb_yield) vm_eval.c:1362
    ruby(rb_protect+0xef) [0x61b67a81fe6f] eval.c:1154
    ruby(rb_f_fork+0x16) [0x61b67a8e98ab] process.c:4293
    ruby(rb_f_fork) process.c:4284
2025-11-25 14:19:30 -08:00
Peter Zhu
55892f5994 Fix style for rb_gc_impl_after_fork 2025-11-24 19:06:55 -08:00
Peter Zhu
86b210203e Fix style for rb_gc_impl_before_fork 2025-11-24 19:06:55 -08:00
John Hawthorn
9764306c48 Accurate GC.stat under multi-Ractor mode 2025-11-20 17:19:40 -08:00
Peter Zhu
f5f69d4114 Implement heap_final_slots in GC.stat_heap
[Feature #20408]
2025-11-19 15:25:29 -08:00
Peter Zhu
83bf05427d Implement heap_free_slots in GC.stat_heap
[Feature #20408]
2025-11-19 15:25:29 -08:00
Peter Zhu
fa02d7a01f Implement heap_live_slots in GC.stat_heap
[Feature #20408]
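
Together with the two neighbouring commits, this lets the per-heap counters be read side by side. A small sketch, assuming the keys are named after the commit titles (`heap_live_slots`, `heap_free_slots`, `heap_final_slots`); on a Ruby without these changes the values are simply `nil`, and `heap_slot_summary` is a name made up for the example:

```ruby
# Collects the three per-heap slot counters into one small hash per heap.
def heap_slot_summary
  return {} unless GC.respond_to?(:stat_heap)

  GC.stat_heap.transform_values do |stats|
    {
      live:  stats[:heap_live_slots],
      free:  stats[:heap_free_slots],
      final: stats[:heap_final_slots],
    }
  end
end

heap_slot_summary.each do |heap_index, slots|
  puts "heap #{heap_index}: #{slots}"
end
```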
2025-11-19 15:25:29 -08:00
Peter Zhu
a731080f46 Make rb_gc_obj_optimal_size always return allocatable size
It may return sizes that aren't allocatable for arrays and strings.
2025-11-09 11:14:54 -08:00
Peter Zhu
827f11fce3 Move rb_gc_verify_shareable to gc.c
rb_gc_verify_shareable is not GC implementation specific so it should live
in gc.c.
2025-11-08 17:58:25 -08:00
Luke Gruber
f1f2dfebe8
Release VM lock before running finalizers (#15050)
We shouldn't run any ruby code with the VM lock held.
2025-11-04 14:46:01 -05:00
Luke Gruber
16af727908
Avoid taking vm barrier in heap_prepare() (#14425)
We can avoid taking this barrier if we're not incremental marking or lazy sweeping.
I found this was taking a significant amount of samples when profiling `Psych.load`
in multiple ractors due to the vm barrier. With this change, we get significant improvements
in ractor benchmarks that allocate lots of objects.

-- Psych.load benchmark --

```
Before:            After:
r:   itr:   time   r:   itr:   time
0    #1:  960ms    0    #1:  943ms
0    #2:  979ms    0    #2:  939ms
0    #3:  968ms    0    #3:  948ms
0    #4:  963ms    0    #4:  946ms
0    #5:  964ms    0    #5:  944ms
1    #1:  947ms    1    #1:  940ms
1    #2:  950ms    1    #2:  947ms
1    #3:  962ms    1    #3:  950ms
1    #4:  947ms    1    #4:  945ms
1    #5:  947ms    1    #5:  943ms
2    #1: 1131ms    2    #1: 1005ms
2    #2: 1153ms    2    #2:  996ms
2    #3: 1155ms    2    #3: 1003ms
2    #4: 1205ms    2    #4: 1012ms
2    #5: 1179ms    2    #5: 1012ms
4    #1: 1555ms    4    #1: 1209ms
4    #2: 1509ms    4    #2: 1244ms
4    #3: 1529ms    4    #3: 1254ms
4    #4: 1512ms    4    #4: 1267ms
4    #5: 1513ms    4    #5: 1245ms
6    #1: 2122ms    6    #1: 1584ms
6    #2: 2080ms    6    #2: 1532ms
6    #3: 2079ms    6    #3: 1476ms
6    #4: 2021ms    6    #4: 1463ms
6    #5: 1999ms    6    #5: 1461ms
8    #1: 2741ms    8    #1: 1630ms
8    #2: 2711ms    8    #2: 1632ms
8    #3: 2688ms    8    #3: 1654ms
8    #4: 2641ms    8    #4: 1684ms
8    #5: 2656ms    8    #5: 1752ms
```
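
To put a number on the improvement, the per-iteration times above can be averaged for one group of rows. A short sketch using the rows labelled `8` copied from the table, assuming the `r:` column is the number of Ractors; the constant and helper names are made up for the example:

```ruby
BEFORE_8_RACTORS_MS = [2741, 2711, 2688, 2641, 2656].freeze
AFTER_8_RACTORS_MS  = [1630, 1632, 1654, 1684, 1752].freeze

def mean(values)
  values.sum / values.size.to_f
end

mean(BEFORE_8_RACTORS_MS)                            # 2687.4
mean(AFTER_8_RACTORS_MS)                             # 1670.4
mean(BEFORE_8_RACTORS_MS) / mean(AFTER_8_RACTORS_MS) # about 1.61x
```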
2025-11-03 14:30:59 -05:00
Koichi Sasada
a177799807 catch up modular-gc 2025-10-23 13:08:26 +09:00
Koichi Sasada
bc00c4468e use SET_SHAREABLE
to adopt strict shareable rule.

* (basically) shareable objects only refer to shareable objects
* (exception) shareable objects can refer to unshareable objects
  but should not leak references to unshareable objects to the Ruby world
2025-10-23 13:08:26 +09:00
Koichi Sasada
45907b1b00 add SET_SHAREABLE macros
* `RB_OBJ_SET_SHAREABLE(obj)` makes obj shareable.
  All objects reachable from `obj` should be shareable.
* `RB_OBJ_SET_FROZEN_SHAREABLE(obj)` does the same as above
  but freezes `obj` before making it shareable.

Also `rb_gc_verify_shareable(obj)` is introduced to strictly check
that `obj` does not violate the shareable rule (a shareable object
only refers to shareable objects).

The rule has some exceptions: some shareable objects can refer to
unshareable objects. For example, a Ractor object (which is a shareable
object) can refer to Ractor-local objects.
To handle such cases, a `check_shareable` flag is also introduced.

`STRICT_VERIFY_SHAREABLE` macro is also introduced to verify
the strict shareable rule at `SET_SHAREABLE`.
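
The same rule is visible from Ruby through `Ractor.shareable?` and `Ractor.make_shareable`. A small Ruby-level illustration, not the C macros themselves; the constant names are made up for the example:

```ruby
# Frozen on the outside, but it still refers to a mutable String.
FROZEN_SHELL = [+"mutable"].freeze

# make_shareable freezes everything reachable from the object.
DEEPLY_SHAREABLE = Ractor.make_shareable([+"now frozen"])

Ractor.shareable?(FROZEN_SHELL)     # => false
Ractor.shareable?(DEEPLY_SHAREABLE) # => true
```

The first array is what the strict check is meant to catch: the container looks shareable on its own, but it refers to an object that is not.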
2025-10-23 13:08:26 +09:00
John Hawthorn
9e4a756963 Use BUILTIN_TYPE in gc_mark_check_t_none 2025-10-15 17:13:19 -07:00
John Hawthorn
17a5a5e2ef Take a full VM barrier in gc_rest
This isn't (yet?) safe to do because it concurrently modifies GC
structures and dfree functions are not necessarily safe to do without
stopping all Ractors.

If it was safe to do this we should also do it for
gc_enter_event_continue. I do think sweeping could be done concurrently
with the mutator and in parallel, but that requires more work first.
2025-10-10 10:18:00 -07:00
Luke Gruber
ff198ad904 Add assertion to rb_gc_impl_writebarrier
We should only be executing WBs when GC is not running. We ran into this
issue when debugging 3cd2407045a67838cf2ab949e5164676b6870958.
2025-10-03 11:15:56 -07:00
John Hawthorn
1f0da24049 ASAN poison parent_object after marking
Previously we were tracking down a bug where this was used after it was
no longer valid.

Co-authored-by: Luke Gruber <luke.gru@gmail.com>
2025-10-02 13:24:00 -07:00
John Hawthorn
3cd2407045 Don't call gc_mark from IO::buffer compact
Previously on our mark_and_move we were calling rb_gc_mark, which isn't
safe to call at compaction time.

Co-authored-by: Luke Gruber <luke.gru@gmail.com>
2025-10-02 13:24:00 -07:00
Peter Zhu
ba52af6fc3 Always set parent_object in GC
When we mark a T_NONE, we crash with the object and parent object information
in the bug report. However, if the parent object is young then it is Qfalse.
For example, a bug report looks like:

    [BUG] try to mark T_NONE object (obj: 0x00003990e42d7c70 T_NONE/, parent: (none))

This commit changes it to always set the parent object and also adds a
new field parent_object_old_p to quickly determine if the parent object
is old or not.
2025-09-26 17:17:42 -04:00
Peter Zhu
4a082b5d34 Fix assertion in rb_gc_impl_mark_weak
The FL_WB_PROTECTED flag is no longer used and is not set on objects, so
that assertion cannot be true. Instead, we should use RVALUE_WB_UNPROTECTED.
2025-09-21 13:12:34 -04:00
Peter Zhu
74075617b1 Remove setting v1, v2, v3 when creating a new object
Setting v1, v2, v3 when we allocate an object assumes that we always
allocate 40 byte objects. By removing v1, v2, v3, we can make the base
slot size another size.
2025-09-17 09:25:17 -04:00
Nobuyoshi Nakada
bb5cd8e049
Get rid of strcpy and magic numbers 2025-09-13 16:36:48 +09:00