58 Commits

Author SHA1 Message Date
Jean Boussier
1c7e19f961 rb_free_tmp_buffer: use ruby_sized_xfree
We know the buffer length, we might as well feed that information
back to the GC.
2026-01-16 21:11:17 +01:00
Peter Zhu
01cd9c9fad Add rb_gc_register_pinning_obj 2025-12-29 09:03:31 -05:00
Peter Zhu
56147001ec Move MEMO_NEW to imemo.c and rename to rb_imemo_memo_new 2025-12-29 09:03:31 -05:00
Peter Zhu
ade779b1e1 Implement callcache using declare weak references 2025-12-25 09:18:17 -05:00
Luke Gruber
4fb537b1ee
Make tracepoints with set_trace_func or TracePoint.new ractor local (#15468)
Before this change, GC'ing any Ractor object caused you to lose all
enabled tracepoints across all ractors (even main). Now tracepoints are
ractor-local and this doesn't happen. Internal events are still global.

Fixes [Bug #19112]
2025-12-16 14:06:55 -05:00
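As a rough illustration of the API touched by 4fb537b1ee, here is a minimal single-ractor sketch that enables a `TracePoint` only for the duration of a block. The `greet` method and the `calls` array are made up for the example, and the sketch does not attempt to reproduce the cross-ractor bug itself.

```ruby
# Collect the names of Ruby-defined methods called while tracing is on.
calls = []
trace = TracePoint.new(:call) { |tp| calls << tp.method_id }

# Hypothetical method used only to generate a :call event.
def greet
  :hello
end

trace.enable { greet } # tracing is active only inside this block
greet                  # not recorded: the tracepoint is disabled again

p calls
```

With ractor-local tracepoints, each ractor would have to set up its own `TracePoint` in this way to observe its own calls.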
Nobuyoshi Nakada
2f53985da9
Revert miscommit at "Reset the cache variable before retrying"
This reverts commit 26a9e0b4e31f7b5a9cbd755e0a15823a8fa51bae partially.
2025-11-26 11:35:15 +09:00
Nobuyoshi Nakada
26a9e0b4e3
Reset the cache variable before retrying 2025-11-26 10:47:17 +09:00
Satoshi Tagomori
e84b91a292 Box: mark/move Box object referred via ENV/rb_env_t 2025-11-26 10:10:47 +09:00
Koichi Sasada
bc00c4468e use SET_SHAREABLE
to adopt strict shareable rule.

* (basically) shareable objects only refer to shareable objects
* (exception) shareable objects can refer to unshareable objects
  but should not leak references to unshareable objects to the Ruby world
2025-10-23 13:08:26 +09:00
Peter Zhu
3ec597f619 Fix memory leak in cloning complex imemo_fields
When we clone a complex imemo_fields, it creates the imemo_fields
using rb_imemo_fields_new_complex, which allocates and initializes a new
st_table. However, st_replace will directly replace any existing fields
in the st_table, causing it to leak.

For example, this script demonstrates the leak:

    obj = Class.new
    8.times do |i|
      obj.instance_variable_set(:"@test#{i}", nil)
      obj.remove_instance_variable(:"@test#{i}")
    end

    obj.instance_variable_set(:"@test", 1)

    10.times do
      100_000.times do
        obj.dup
      end

      puts `ps -o rss= -p #{$$}`
    end

Before:

    26320
    39296
    52320
    63136
    75520
    87008
    97856
    114800
    120864
    133504

After:

    16288
    20112
    20416
    20720
    20800
    20864
    21184
    21424
    21904
    21904
2025-09-21 08:24:21 -04:00
Peter Zhu
1663e2fbc8 Fix capacity of imemo_fields objects created from rb_imemo_fields_new_complex_tbl
The imemo_fields_new function takes a capacity in the number of fields to
preallocate. rb_imemo_fields_new_complex_tbl is using it incorrectly because
it is preallocating sizeof(struct rb_fields) number of fields.
2025-09-19 11:05:43 -04:00
Peter Zhu
477b1e79b7 Directly use rb_imemo_new in imemo_fields_new_complex
We should not assume that a complex imemo_field takes only one additional
VALUE space. This is fragile as it will break if we add additional fields
to complex imemo_field.
2025-09-19 08:46:09 -04:00
Peter Zhu
cfc5c56503 Clear out memory for newly allocated tmpbuf 2025-09-17 09:25:17 -04:00
John Hawthorn
e4f09a8c94 Remove next field and unused method from tmpbuf
These used to be used by the parser
2025-09-15 16:08:13 -07:00
Peter Zhu
7dd9c76ad4 Make imemo_tmpbuf not write-barrier protected
imemo_tmpbuf is not write-barrier protected and uses mark maybe to mark
the buffer it holds. The normal rb_imemo_new creates a write-barrier
protected object which can make the tmpbuf miss marking references.
2025-09-15 11:43:05 -04:00
Peter Zhu
1e3e04cd65 Move rb_imemo_tmpbuf_new to imemo.c 2025-09-15 11:43:05 -04:00
Peter Zhu
b0ce1fd549 Combine rb_imemo_tmpbuf_auto_free_pointer and rb_imemo_tmpbuf_new 2025-09-15 09:25:20 -04:00
Peter Zhu
adcde78dbf Use IMEMO_NEW in rb_imemo_tmpbuf_new 2025-09-12 10:05:24 -04:00
Peter Zhu
61d26c35bf Don't pin method hooks of bmethods 2025-08-27 10:34:40 -04:00
Jean Boussier
5257e1298c Replace ROBJECT_EMBED by ROBJECT_HEAP
The embed layout is way more common than the heap one,
especially since VWA.

I think it makes for more readable code to invert the
flag.
2025-08-27 12:41:07 +02:00
Jean Boussier
14bdf4b57d Ensure T_OBJECT and T_IMEMO/fields have identical layout 2025-08-26 13:44:59 +02:00
Jean Boussier
b6bf44ae0f variable.c: handle cleared fields_obj in genfields cache
[Bug #21547]

Followup: https://github.com/ruby/ruby/pull/14201

When adding an instance variable and the IMEMO/fields need to be
larger, we allocate a new one and clear the old one.

Since the old one may still be in other ec's cache, on a hit we must
check the IMEMO/fields isn't a stale one.
2025-08-21 14:17:29 +02:00
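The stale-entry check described in b6bf44ae0f can be modeled in plain Ruby. This is only a sketch under assumptions: `FieldsObj`, `Owner`, and `FieldsCache` are hypothetical names invented here, and the real logic lives in C inside variable.c.

```ruby
# Hypothetical model of a fields object that is "cleared" once replaced.
FieldsObj = Struct.new(:slots, :cleared)

class Owner
  attr_reader :fields_obj

  def initialize
    @fields_obj = FieldsObj.new([], false)
  end

  # Growing allocates a new fields object and clears the old one,
  # mirroring the behavior described in the commit message.
  def add_field(value)
    old = @fields_obj
    @fields_obj = FieldsObj.new(old.slots + [value], false)
    old.slots = nil
    old.cleared = true
  end
end

# A one-entry cache remembering the last (owner, fields_obj) pair.
class FieldsCache
  def initialize
    @owner = nil
    @fields_obj = nil
  end

  def lookup(owner)
    # A hit needs the same owner AND a fields object that was not cleared.
    return @fields_obj if @owner.equal?(owner) && !@fields_obj.cleared

    # Miss or stale entry: refresh from the owner.
    @owner = owner
    @fields_obj = owner.fields_obj
  end
end

owner = Owner.new
cache = FieldsCache.new
first = cache.lookup(owner)  # fills the cache
owner.add_field(42)          # replaces and clears the cached object
second = cache.lookup(owner) # stale entry detected, cache refreshed
```

Without the `cleared` check, the second lookup would hand back the old, emptied object.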
Jean Boussier
10aa4134d4 imemo_fields: store owner object in RBasic.klass
It is much more convenient than storing the klass, especially
when dealing with `object_id` as it allows to update the id2ref
table without having to dereference the owner, which may be
garbage at that point.
2025-08-13 19:53:18 +02:00
John Hawthorn
c41c323f1a
Invalidate CCs when cme is invalidated in marking
* Skip assertion when cc->klass is Qundef
* Invalidate CCs when cme is invalidated in marking
* Add additional assertions that CC references stay valid

Co-authored-by: Peter Zhu <peter@peterzhu.ca>
2025-08-07 15:39:45 -07:00
John Hawthorn
a9f6fe0914 Avoid marking CC children after invalidation
Once klass becomes Qundef, it's disconnected and won't be invalidated
when the CME is. So once that happens we must not mark or attempt to
move the cme_ field.
2025-08-06 15:57:13 -07:00
Jean Boussier
547f111b5b Refactor vm_lookup_cc to allow lock-free lookups in RClass.cc_tbl
In multi-ractor mode, the `cc_tbl` mutations use the RCU pattern,
which allows lock-less reads.

Based on the assumption that invalidations and misses should be
increasingly rare as the process ages, locking on modification
isn't a big concern.
2025-08-01 10:42:04 +02:00
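The RCU idea mentioned in 547f111b5b can be sketched in Ruby as follows. `RcuTable` is a hypothetical class written for illustration only; it is not how `cc_tbl` is implemented in the VM, and it only shows the general shape of the technique: readers use an immutable snapshot without locking, while writers copy, modify, and publish a new snapshot.

```ruby
# Hypothetical illustration of Read-Copy-Update (RCU).
class RcuTable
  def initialize
    @table = {}.freeze      # readers only ever see frozen snapshots
    @write_lock = Mutex.new # writers serialize among themselves
  end

  # Lock-free read: look up in whatever snapshot is current.
  def [](key)
    @table[key]
  end

  # Write: copy the snapshot, modify the copy, publish it in one assignment.
  def store(key, value)
    @write_lock.synchronize do
      copy = @table.dup # dup of a frozen Hash is not frozen
      copy[key] = value
      @table = copy.freeze
    end
  end

  # Expose the current snapshot (used below to show old readers are unaffected).
  def snapshot
    @table
  end
end

table = RcuTable.new
old_snapshot = table.snapshot
table.store(:foo, 1)
```

A reader that grabbed `old_snapshot` before the write keeps seeing a consistent (if outdated) table, which is why writes can afford to take a lock while reads do not.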
Jean Boussier
f2a7e48dea Make RClass.cc_table a managed object
For now this doesn't change anything, but now that the table
is managed by GC, it opens the door to using RCU when in multi-ractor
mode, hence allowing unsynchronized reads.
2025-08-01 10:42:04 +02:00
Jean Boussier
fc5e1541e4 Use rb_gc_mark_weak for cc->klass.
One of the biggest remaining contention points is `RClass.cc_table`.
The logical solution would be to turn it into a managed object, so
we can use an RCU strategy, given it's read heavy.

However, that's not currently possible because the table can't
be freed before the owning class, given the class free function
MUST go over all the CC entries to invalidate them.

However if the `CC->klass` reference is weak marked, then the
GC will take care of setting the reference to `Qundef`.
2025-08-01 10:42:04 +02:00
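A loose Ruby-level analogy for the weak `CC->klass` reference in fc5e1541e4 uses the standard library's `WeakRef`. The `CacheEntry` class is hypothetical, and returning `nil` here merely stands in for the GC resetting the reference to `Qundef`; the VM does this in C via `rb_gc_mark_weak`.

```ruby
require "weakref"

# Hypothetical stand-in for a cache entry that weakly references its class.
class CacheEntry
  def initialize(klass)
    @klass_ref = WeakRef.new(klass)
  end

  # Returns the class while it is alive, or nil once it has been collected.
  def klass
    @klass_ref.__getobj__
  rescue WeakRef::RefError
    nil
  end

  # The entry is usable only while the class it points at still exists.
  def valid?
    !klass.nil?
  end
end

klass = Class.new # strong reference keeps the class alive
entry = CacheEntry.new(klass)
```

Because the entry does not keep the class alive, the owner of the entry never needs to walk its entries to invalidate them when the class is freed.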
Jean Boussier
7ee127d2d1 Get rid of imemo_ast
It has been marked as obsolete for a while and I see no reason
to keep it.
2025-07-29 13:05:12 +02:00
Peter Zhu
f186f2cb70 Remove unused imemo_parser_strterm 2025-07-24 09:49:13 -04:00
Peter Zhu
b2a7b76992 Remove dead rb_cc_table_free 2025-07-14 11:11:47 -04:00
Peter Zhu
127cc425b7 Remove dead rb_cc_table_mark 2025-07-14 11:11:47 -04:00
Jean Boussier
8faa32327b Add missing write barriers in rb_imemo_fields_clone. 2025-06-17 15:28:05 +02:00
Jean Boussier
cd9f447be2 Refactor generic fields to use T_IMEMO/fields objects.
Followup: https://github.com/ruby/ruby/pull/13589

This simplifies a lot of things: as we no longer need to manually
manage the memory, we can use the Read-Copy-Update pattern and
avoid numerous race conditions.

Co-Authored-By: Étienne Barrié <etienne.barrie@gmail.com>
2025-06-17 15:28:05 +02:00
Jean Boussier
164486a954 Refactor rb_imemo_fields_new to not assume T_CLASS 2025-06-17 15:28:05 +02:00
Jean Boussier
fb68721f63 Rename imemo_class_fields -> imemo_fields 2025-06-17 15:28:05 +02:00
Jean Boussier
a74c385208 Make setting and accessing class ivars lock-free
Now that class fields have been delegated to a T_IMEMO/class_fields
when we're in multi-ractor mode, we can read and write class instance
variables in an atomic way using Read-Copy-Update (RCU).

Note that when in multi-ractor mode, we always use RCU. In theory
we don't need to: if we ensured the field is written
before the shape is updated, it would be safe.

Benchmark:

```ruby
Warning[:experimental] = false

class Foo
  @foo = 1
  @bar = 2
  @baz = 3
  @egg = 4
  @spam = 5

  class << self
    attr_reader :foo, :bar, :baz, :egg, :spam
  end
end

ractors = 8.times.map do
  Ractor.new do
    1_000_000.times do
      Foo.bar + Foo.baz * Foo.egg - Foo.spam
    end
  end
end

if Ractor.method_defined?(:value)
  ractors.each(&:value)
else
  ractors.each(&:take)
end
```

This branch vs Ruby 3.4:

```bash
$ hyperfine -w 1 'ruby --disable-all ../test.rb' './miniruby ../test.rb'

Benchmark 1: ruby --disable-all ../test.rb
  Time (mean ± σ):      3.162 s ±  0.071 s    [User: 2.783 s, System: 10.809 s]
  Range (min … max):    3.093 s …  3.337 s    10 runs

Benchmark 2: ./miniruby ../test.rb
  Time (mean ± σ):     208.7 ms ±   4.6 ms    [User: 889.7 ms, System: 6.9 ms]
  Range (min … max):   202.8 ms … 222.0 ms    14 runs

Summary
  ./miniruby ../test.rb ran
   15.15 ± 0.47 times faster than ruby --disable-all ../test.rb
```
2025-06-12 14:55:13 +02:00
Jean Boussier
8b5ac5abf2 Fix class instance variable inside namespaces
Now that classes fields are delegated to an object with its own
shape_id, we no longer need to mark all classes as TOO_COMPLEX.
2025-06-12 13:43:29 +02:00
Jean Boussier
3abdd4241f Turn rb_classext_t.fields into a T_IMEMO/class_fields
This behaves almost exactly like a T_OBJECT; the layout is entirely
compatible.

This aims to solve two problems.

First, it solves the problem of namespaced classes having
a single `shape_id`. Now each namespaced classext
has an object that can hold the namespace-specific
shape.

Second, it opens the door to later making class instance variable
writes atomic, hence being able to read class variables
without locking the VM.
In the future, in multi-ractor mode, we can do the write
on a copy of the `fields_obj` and then atomically swap it.

Considerations:

  - Right now the `RClass` shape_id is always synchronized,
    but with namespaces we should likely mark classes that have
    multiple namespaces with a specific shape flag.
2025-06-12 07:58:16 +02:00
Satoshi Tagomori
382645d440 namespace on read 2025-05-11 23:32:50 +09:00
Alan Wu
3e04f7b69f
Only mark cc->cme_ on valid imemo_callcache
We observed T_NONE on `cc->cme_` on a --repeat-count=50 run of a compaction
test on CI:
http://ci.rvm.jp/results/trunk-repeat50@ruby-sp2-noble-docker/5654900

During reference updating for imemo_callcache in
rb_imemo_mark_and_move(), if `cc->klass` is not live, but `cc->cme_` is
live and moved, we go to the vm_cc_invalidate() path which
leaves `cc->cme_` not updated and stale. In the next marking run after
compaction, CME would've become a T_NONE.

So to quote the comment above "... cc is invalidated by
`vm_cc_invalidate()` and cc->cme is not be accessed."
2025-03-16 16:00:08 -04:00
Peter Zhu
62a1528020 Pass allocation size to rb_imemo_new
This would allow imemo to take advantage of VWA and allocate sizes larger
than RVALUE (40 bytes).
2025-01-08 09:11:59 -05:00
Peter Zhu
d0f9f3e2c6 Remove IMEMO_DEBUG
The code path hasn't compiled for almost a year, since 330830dd1a44b6e497250a14d93efae6fa363f82,
so probably nobody uses it.
2025-01-07 11:01:08 -05:00
Peter Zhu
33f95d632d Don't unpoison the CC in vm_ccs_free
The poison status is maintained by the GC, so don't unpoison it in vm_ccs_free.
If the object is not a garbage object, then it should not be poisoned.
2024-12-19 16:25:23 -05:00
Alan Wu
5978f2f114
Fix use-after-free in vm_ccs_free()
`struct rb_callcache *` points to an imemo object on the GC heap when
pushed into `struct rb_class_cc_entries`, but by the time vm_ccs_free()
runs, the entire GC page the imemo was on could already be deallocated.
With the right conditions, vm_ccs_free() wrote to freed memory.
rb_objspace_garbage_object_p() by itself is not enough to determine
liveness.

I conjectured this situation to be possible in
<https://github.com/ruby/ruby/pull/11995> using hints from crashes
in the wild. With c37bdfa5311be0aa8503b995299fb9547cede0a6 ("Make
asan_poison_object poison the whole slot"), the in-tree test suite
now recreates this scenario[^1][^2][^3].

Use rb_gc_pointer_to_heap_p(). Other uses of
rb_objspace_garbage_object_p() could be making the same mistake, but
correcting them might introduce serious performance regressions, so
leave them alone for now.

[^1]: http://ci.rvm.jp/results/trunk_asan@ruby-sp1/5477412
[^2]: http://ci.rvm.jp/results/trunk_asan@ruby-sp1/5477445
[^3]: http://ci.rvm.jp/results/trunk_asan@ruby-sp1/5477448
2024-12-19 12:28:21 -05:00
Peter Zhu
a58675386c Prefix asan_poison_object with rb 2024-12-19 09:14:34 -05:00
Matt Valentine-House
551be8219e Place all non-default GC API behind USE_SHARED_GC
So that it doesn't get included in the generated binaries for builds
that don't support loading shared GC modules.

Co-Authored-By: Peter Zhu <peter@peterzhu.ca>
2024-11-25 13:05:23 +00:00
Peter Zhu
22f12b0a62 Use rb_id_table_foreach_values for marking CC table
We don't use the key, so we can speed it up by not needing to convert the
key to ID in the iterator.
2024-09-10 10:09:50 -04:00
Peter Zhu
51bd816517 [Feature #20470] Split GC into gc_impl.c
This commit splits gc.c into two files:

- gc.c now only contains code not specific to Ruby GC. This includes
  code to mark objects (which the GC implementation may choose not to
  use) and wrappers for internal APIs that the implementation may need
  to use (e.g. locking the VM).

- gc_impl.c now contains the implementation of Ruby's GC. This includes
  marking, sweeping, compaction, and statistics. Most importantly,
  gc_impl.c only uses public APIs in Ruby and a limited set of functions
  exposed in gc.c. This allows us to build gc_impl.c independently of
  Ruby and plug Ruby's GC into itself.
2024-07-03 09:03:40 -04:00
Aaron Patterson
e5160a9c60 Mark the class on orphan call caches
"super" CC's are "orphans", meaning there is no class CC table that
points at them.  Since they are orphans, we should mark the class
reference so that if the cache happens to be used, the class will still
be alive.
2024-06-18 09:28:25 -07:00