Currently, root fibers of threads do not have a corresponding Ruby object
backing them by default (one is allocated only when an object is required,
such as when Fiber.current is called). This is a problem for the new GC
weak references design in #12606, since Thread is not declared as having
weak references but it does hold weak references (the generic ivar cache).
This commit changes it to always allocate a Fiber object for the root
fiber.
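The lazily-allocated backing object mentioned above is the one Ruby code sees via Fiber.current; a minimal sketch of that behavior (variable names are mine):

```ruby
# Fiber.current on a thread's root fiber returns its backing Fiber object
# (previously allocated lazily on first access; now allocated up front).
main_root = Fiber.current
p main_root.alive?                 # => true: the root fiber is currently running
p main_root.equal?(Fiber.current)  # => true: repeat access yields the same object

t = Thread.new { Fiber.current }
p t.value.equal?(main_root)        # => false: each thread has its own root fiber
```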
This is easier to access as ec->ractor_id than by pointer-chasing through
ec->thread->ractor->ractor_id.
Co-authored-by: Luke Gruber <luke.gru@gmail.com>
Previously, when using a JIT and Ractors at the same time with debug
assertions turned on, this could rarely fail with:
vm_core.h:1448: Assertion Failed: VM_ENV_FLAGS:FIXNUM_P(flags)
When using Ractors, any time the VM lock is acquired, that may join a
barrier as another Ractor initiates GC. This could be made to happen
reliably by replacing the invalidation with a call to rb_gc().
This assertion failure happens because
VM_STACK_ENV_WRITE(ep, 0, (VALUE)env);
sets VM_ENV_DATA_INDEX_FLAGS to the environment, which is not a
valid set of flags (it should be a fixnum). Although we update cfp->ep,
rb_execution_context_mark will also mark the PREV_EP, and until the
recursive calls to vm_make_env_each all finish the "next" ep may still
be pointing to the stack env we've just escaped.
I'm not completely sure why we need to store this on the stack - why is
setting cfp->ep not enough? I'm also not sure why
rb_execution_context_mark needs to mark the prev_ep.
Before this change, GC'ing any Ractor object caused you to lose all
enabled tracepoints across all ractors (even main). Now tracepoints are
ractor-local and this doesn't happen. Internal events are still global.
Fixes [Bug #19112]
to adopt a strict shareable rule:
* (basically) shareable objects only refer to shareable objects
* (exception) shareable objects can refer to unshareable objects,
but must not leak references to unshareable objects into the Ruby world
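The rule above can be observed from Ruby via Ractor.shareable? (a minimal sketch; the variable names are illustrative):

```ruby
# Shareable objects may only (visibly) refer to shareable objects.
deep_frozen = [1, :a, "s".freeze].freeze
p Ractor.shareable?(deep_frozen)   # => true: every reachable object is shareable

leaky = [+"mutable string"].freeze # the container is frozen...
p Ractor.shareable?(leaky)         # => false: ...but it leaks a reference to an
                                   #    unshareable (unfrozen) String
```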
This is due to the way MMTK frees objects, which happens on another native
thread. Because of this, there's no `ec`, so we can't grab the VM lock.
This was causing issues in release builds of MMTK on CI like:
```
/home/runner/work/ruby/ruby/build/ruby(sigsegv+0x46) [0x557905117ef6] ../src/signal.c:948
/lib/x86_64-linux-gnu/libc.so.6(0x7f555f845330) [0x7f555f845330]
/home/runner/work/ruby/ruby/build/ruby(rb_ec_thread_ptr+0x0) [0x5579051d59d5] ../src/vm_core.h:2087
/home/runner/work/ruby/ruby/build/ruby(rb_ec_ractor_ptr) ../src/vm_core.h:2036
/home/runner/work/ruby/ruby/build/ruby(rb_current_execution_context) ../src/vm_core.h:2105
/home/runner/work/ruby/ruby/build/ruby(rb_current_ractor_raw) ../src/vm_core.h:2104
/home/runner/work/ruby/ruby/build/ruby(rb_current_ractor) ../src/vm_core.h:2112
/home/runner/work/ruby/ruby/build/ruby(rb_current_ractor) ../src/vm_core.h:2110
/home/runner/work/ruby/ruby/build/ruby(vm_locked) ../src/vm_sync.c:15
/home/runner/work/ruby/ruby/build/ruby(rb_vm_lock_enter_body) ../src/vm_sync.c:141
/home/runner/work/ruby/ruby/build/ruby(rb_vm_lock_enter+0xa) [0x557905390a5a] ../src/vm_sync.h:76
/home/runner/work/ruby/ruby/build/ruby(fiber_pool_stack_release) ../src/cont.c:777
/home/runner/work/ruby/ruby/build/ruby(fiber_stack_release+0xe) [0x557905392075] ../src/cont.c:919
/home/runner/work/ruby/ruby/build/ruby(cont_free) ../src/cont.c:1087
/home/runner/work/ruby/ruby/build/ruby(fiber_free) ../src/cont.c:1180
```
This would have run into an assertion error in a debug build, but we don't run debug builds of MMTK on GitHub's CI.
Co-authored-by: john.hawthorn@shopify.com
* To show the environment stack when the current namespace is unexpected or
namespace detection is broken
* It is displayed only when RUBY_BUGREPORT_NAMESPACE_ENV=1 is specified
We allocate the stack of the main thread using malloc, but we never set
malloc_stack to true or set context_stack. If we fork, the main thread may
no longer be the original main thread, so RUBY_FREE_AT_EXIT reports its
memory as being leaked.
This commit allows the main thread to free its own VM stack at shutdown.
* The current namespace should be based on the Ruby-level location (file and
line number in the .rb source), which we can basically get via LEP(ep)
(the VM_ENV_FLAG_LOCAL flag is set)
* But a control frame with VM_FRAME_MAGIC_CFUNC is also a LOCAL frame, because
it's a visible Ruby-level frame without block handlers
* So, for namespace detection, LEP(ep) is not enough and we need to skip CFUNC
frames to fetch their callers
* Check all control frames (instead of filtering by VM_FRAME_RUBYFRAME_P),
because VM_FRAME_FLAG_NS_REQUIRE can be set on non-Ruby frames
* Skip CFUNC frames in the root namespace for Kernel#require (etc.) to avoid
wrongly detecting the root namespace from those frames
Calling rb_current_namespace() in rb_namespace_current() would show the
definition namespace of Namespace.current itself (which is always the
root), but users expect to see the namespace of the place where
Namespace.current is called.
to fix inconsistent and wrong current-namespace detection.
This includes:
* Moving load_path and related things from rb_vm_t to rb_namespace_t to simplify
accessing those values via namespace (instead of accessing either vm or ns)
* Initializing root_namespace earlier and consolidating builtin_namespace into root_namespace
* Adding VM_FRAME_FLAG_NS_REQUIRE for checkpoints to detect a namespace to load/require files
* Removing implicit refinements in the root namespace which was used to determine
the namespace to be loaded (replaced by VM_FRAME_FLAG_NS_REQUIRE)
* Removing namespaces from rb_proc_t because its namespace can be identified by lexical context
* Starting to use ep[VM_ENV_DATA_INDEX_SPECVAL] to store the current namespace when
the frame type is MAGIC_TOP or MAGIC_CLASS (block handlers don't exist in this case)
call-seq:
Ractor.shareable_proc(self: nil){} -> shareable proc
It returns a shareable Proc object. The Proc object is
shareable, and the self in the block will be replaced with
the value passed via the `self:` keyword.
In a shareable Proc, the outer variables should
* (1) refer to shareable objects
* (2) not be overwritten
```ruby
a = 42
Ractor.shareable_proc{ p a }
#=> OK
b = 43
Ractor.shareable_proc{ p b; b = 44 }
#=> Ractor::IsolationError because 'b' is reassigned in the block.
c = 44
Ractor.shareable_proc{ p c }
#=> Ractor::IsolationError because 'c' will be reassigned outside of the block.
c = 45
d = 45
d = 46 if cond
Ractor.shareable_proc{ p d }
#=> Ractor::IsolationError because 'd' was reassigned outside of the block.
```
The last `d` case can be relaxed in a future version.
The above check is done by static analysis at compile time,
so reflection features such as `Binding#local_variable_set`
cannot be detected.
```ruby
e = 42
shpr = Ractor.shareable_proc{ p e } #=> OK
binding.local_variable_set(:e, 43)
shpr.call #=> 42 (returns the value captured at creation time)
```
Ractor.shareable_lambda is also introduced.
[Feature #21550]
[Feature #21557]
Add the macro `RB_THREAD_CURRENT_EC_NOINLINE` to manage the condition for
using the no-inline version of rb_current_ec, for better maintainability.
Note that `vm_core.h` includes `THREAD_IMPL_H` via `#include THREAD_IMPL_H`.
`THREAD_IMPL_H` can be `thread_none.h`, `thread_pthread.h` or
`thread_win32.h`, according to `tool/m4/ruby_thread.m4`, which creates
`THREAD_IMPL_H`.
It is okay that this commit only defines `RB_THREAD_CURRENT_EC_NOINLINE` in
`thread_pthread.h`. In the `thread_none.h` case, the thread feature is not
used at all, including Thread-Local Storage (TLS), and in the
`thread_win32.h` case, `RB_THREAD_LOCAL_SPECIFIER` is not defined. In the
`thread_pthread.h` case, `RB_THREAD_LOCAL_SPECIFIER` is defined in
`configure.ac`; in the `thread_none.h` case, it is defined in
`thread_none.h`.
This commit fixes the failures in bootstraptest/test_ractor.rb reported on
the issue ticket <https://bugs.ruby-lang.org/issues/21534>.
TLS (Thread-Local Storage) may not be accessible across .so boundaries on
ppc64le too, though I am not sure about that. The comment "// TLS can not be accessed across
.so on ..." in this commit comes from the following commit.
319afed20f (diff-408391c43b2372cfe1fefb3e1c2531df0184ed711f46d229b08964ec9e8fa8c7R118)
> // on Darwin, TLS can not be accessed across .so
These failures only happened when configuring with cppflags=-DRUBY_DEBUG and
-O3 on ppc64le.
The reproducing steps were below.
```
$ ./autogen.sh
$ ./configure -C --disable-install-doc cppflags=-DRUBY_DEBUG
$ make -j4
$ make btest BTESTS=bootstraptest/test_ractor.rb
...
FAIL 2/147 tests failed
make: *** [uncommon.mk:913: yes-btest] Error 1
```
The steps with a reproducing script based on the `bootstraptest/test_ractor.rb`
were below.
```
$ cat test_ractor_1.rb
counts = []
counts << Ractor.count
p counts.inspect
ractors = (1..2).map { Ractor.new { Ractor.receive } }
counts << Ractor.count
p counts.inspect
ractors[0].send('End 0').join
sleep 0.1 until ractors[0].inspect =~ /terminated/
counts << Ractor.count
p counts.inspect
ractors[1].send('End 1').join
sleep 0.1 until ractors[1].inspect =~ /terminated/
counts << Ractor.count
p counts.inspect
$ make run TESTRUN_SCRIPT=test_ractor_1.rb
...
vm_core.h:2017: Assertion Failed: rb_current_execution_context:ec == rb_current_ec_noinline()
...
```
The assertion failure happened at the following line.
f3206cc79b/vm_core.h (L2017)
This fix is similar to the following commit for arm64.
f7059af50a
Fixes [Bug #21534]
This change makes `RubyVM::AST.of` and `.node_id_for_backtrace_location`
return a parent node of NODE_SCOPE (such as NODE_DEFN) instead of the
NODE_SCOPE node itself.
(In future, we may remove NODE_SCOPE, which is a bit hacky AST node.)
This is preparation for [Feature #21543].
There is a high likelihood that `rb_obj_fields` is called
consecutively for the same object.
If we keep a cache of the last IMEMO/fields we interacted with,
we can avoid having to look up the `gen_fields_tbl`, synchronize
on the VM lock, etc.
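The objects in question are those whose instance variables live in the generic fields table rather than inline in the object. A hypothetical illustration (the class name is mine, not from the commit):

```ruby
# Instance variables on an Array subclass are "generic ivars": they are
# stored in a VM-global table keyed by the object, not inside the object.
# Repeated reads on the same object are exactly the pattern this cache helps.
class Tagged < Array
  def initialize(tag)
    super()
    @tag = tag
  end
  attr_reader :tag
end

t = Tagged.new(:a)
3.times { p t.tag }  # consecutive lookups for the same object => cache hits
```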
On yjit-bench, I instrumented the hit rate of this cache at:
- `shipit`: 38%, with 111k hits.
- `lobsters`: 59%, with 367k hits.
- `rubocop`: 100% with only 300 hits.
I also ran a micro-benchmark which shows that ivar access is:
- 1.25x faster when the cache is hit in single ractor mode.
- 2x faster when the cache is hit in multi ractor mode.
- 1.06x slower when the cache misses in single ractor mode.
- 1.01x slower when the cache misses in multi ractor mode.
```yml
prelude: |
class GenIvar < Array
def initialize(...)
super
@iv = 1
end
attr_reader :iv
end
a = GenIvar.new
b = GenIvar.new
benchmark:
hit: a.iv; a.iv; a.iv; a.iv; a.iv; a.iv; a.iv; a.iv; a.iv; a.iv; a.iv; a.iv; a.iv; a.iv; a.iv; a.iv; a.iv; a.iv; a.iv; a.iv;
miss: a.iv; b.iv; a.iv; b.iv; a.iv; b.iv; a.iv; b.iv; a.iv; b.iv; a.iv; b.iv; a.iv; b.iv; a.iv; b.iv; a.iv; b.iv; a.iv; b.iv;
```
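For reference, a runnable, dependency-free approximation of the yml benchmark above (the `elapsed` helper is mine; wall-clock timing is only indicative compared to benchmark-driver):

```ruby
# Time a block using a monotonic clock.
def elapsed
  t0 = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  yield
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - t0
end

# Same setup as the yml prelude: generic ivars on an Array subclass.
class GenIvar < Array
  def initialize(...)
    super
    @iv = 1
  end
  attr_reader :iv
end

a = GenIvar.new
b = GenIvar.new

n = 1_000_000
hit  = elapsed { n.times { a.iv } }             # same object every time: cache hits
miss = elapsed { (n / 2).times { a.iv; b.iv } } # alternating objects: cache misses
printf("hit: %.3fs  miss: %.3fs\n", hit, miss)
```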
Single ractor:
```
compare-ruby: ruby 3.5.0dev (2025-08-12T02:14:57Z master 428937a536) +YJIT +PRISM [arm64-darwin24]
built-ruby: ruby 3.5.0dev (2025-08-12T09:25:35Z gen-fields-cache 9456c35893) +YJIT +PRISM [arm64-darwin24]
warming up..
| |compare-ruby|built-ruby|
|:-----|-----------:|---------:|
|hit | 4.090M| 5.121M|
| | -| 1.25x|
|miss | 3.756M| 3.534M|
| | 1.06x| -|
```
Multi-ractor:
```
compare-ruby: ruby 3.5.0dev (2025-08-12T02:14:57Z master 428937a536) +YJIT +PRISM [arm64-darwin24]
built-ruby: ruby 3.5.0dev (2025-08-12T09:25:35Z gen-fields-cache 9456c35893) +YJIT +PRISM [arm64-darwin24]
warming up..
| |compare-ruby|built-ruby|
|:-----|-----------:|---------:|
|hit | 2.205M| 4.460M|
| | -| 2.02x|
|miss | 2.117M| 2.094M|
| | 1.01x| -|
```
One of the biggest remaining contention points is `RClass.cc_table`.
The logical solution would be to turn it into a managed object, so
we can use an RCU strategy, given it's read heavy.
However, that's not currently possible, because the table can't be
freed before the owning class: the class free function MUST go over
all the CC entries to invalidate them.
But if the `CC->klass` reference is weak-marked, then the GC will
take care of setting the reference to `Qundef`.
When sharing between threads we need both atomic reads and writes. We
probably didn't need this in some cases (where we weren't running in
multi-ractor mode), but I think it's best to be consistent.