This commit adds a handle_weak_references field to rb_data_type_struct, a
callback invoked when weak references are handled. This means TypedData
objects no longer need to expose their rb_data_type_struct and weak reference
handling function.
Currently, the root fiber of a thread does not have a corresponding Ruby object
backing it by default (one is only allocated when an object is required, such
as when Fiber.current is called). This is a problem for the new GC weak
references design in #12606, since Thread is not declared as having weak
references yet it does hold them (the generic ivar cache).
This commit changes the behavior to always allocate a Fiber object for the
root fiber.
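The allocation itself happens inside the VM and is not directly observable from Ruby, but `Fiber.current` is one of the calls that previously forced the root fiber's object into existence; for example:
```ruby
# Fiber.current returns the calling thread's current (here: root) fiber.
# Before this change the backing Fiber object was only allocated on demand;
# now it exists as soon as the thread does.
t = Thread.new { Fiber.current }
p t.value.class  # => Fiber
```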
* Document Range#to_set
* Update Thread#raise and Fiber#raise signatures and docs
* Add reference to String#strip to character_selectors.rdoc
* Update *nil docs when calling methods
* Enhance Array#find and #rfind docs
* Add a notice to Kernel#raise about cause:
Mutexes spend a significant amount of time in `rb_fiber_serial`
because it can't be inlined (except with LTO).
The fiber struct is opaque, so the function can't be defined as inlineable.
Ideally the whole fiber struct would not be opaque to the rest of
Ruby core, but that is tricky to do.
Instead we can store the fiber serial in the execution context
itself, making its access cheaper:
```
$ hyperfine './miniruby-baseline --yjit /tmp/mut.rb' './miniruby-inline-serial --yjit /tmp/mut.rb'
Benchmark 1: ./miniruby-baseline --yjit /tmp/mut.rb
Time (mean ± σ): 4.011 s ± 0.084 s [User: 3.977 s, System: 0.011 s]
Range (min … max): 3.950 s … 4.245 s 10 runs
Benchmark 2: ./miniruby-inline-serial --yjit /tmp/mut.rb
Time (mean ± σ): 3.495 s ± 0.150 s [User: 3.448 s, System: 0.009 s]
Range (min … max): 3.340 s … 3.869 s 10 runs
Summary
./miniruby-inline-serial --yjit /tmp/mut.rb ran
1.15 ± 0.05 times faster than ./miniruby-baseline --yjit /tmp/mut.rb
```
```ruby
i = 10_000_000
mut = Mutex.new
while i > 0
  i -= 1
  # Ten uncontended lock/unlock cycles per iteration, to amplify the
  # per-call cost of Mutex#synchronize relative to loop overhead.
  mut.synchronize { }
  mut.synchronize { }
  mut.synchronize { }
  mut.synchronize { }
  mut.synchronize { }
  mut.synchronize { }
  mut.synchronize { }
  mut.synchronize { }
  mut.synchronize { }
  mut.synchronize { }
end
```
Previously this held a pointer to the Fiber itself, which required
marking it (marking was only implemented recently; prior to that it was
buggy). Using a monotonically increasing integer instead lets us avoid
having a free function and keeps everything simpler.
My main motivations for this change are that the root fiber lazily
allocates self, which makes a correct write barrier implementation
challenging, and that I want to avoid sending Mutexes to the remembered
set when they are locked by a short-lived Fiber.
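As a rough conceptual sketch in plain Ruby (a toy illustration of the design, not the C implementation; all names here are made up): the lock records an integer identifying the owning fiber rather than a reference to the Fiber object, so there is nothing to mark or free.
```ruby
NEXT_SERIAL = (1..).each  # monotonically increasing source of serials

class ToyLock
  def lock(fiber_serial)
    @owner_serial = fiber_serial   # store an Integer, not a Fiber reference
  end

  def owned_by?(fiber_serial)
    @owner_serial == fiber_serial  # ownership check is a plain integer compare
  end
end

a = NEXT_SERIAL.next  # serial assigned to fiber A when it is created
b = NEXT_SERIAL.next  # serial assigned to fiber B

lock = ToyLock.new
lock.lock(a)
p lock.owned_by?(a)  # => true
p lock.owned_by?(b)  # => false
```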
This is due to the way MMTK frees objects, which happens on another native thread.
There is no `ec` on that thread, so we can't grab the VM lock.
This was causing issues in release builds of MMTK on CI like:
```
/home/runner/work/ruby/ruby/build/ruby(sigsegv+0x46) [0x557905117ef6] ../src/signal.c:948
/lib/x86_64-linux-gnu/libc.so.6(0x7f555f845330) [0x7f555f845330]
/home/runner/work/ruby/ruby/build/ruby(rb_ec_thread_ptr+0x0) [0x5579051d59d5] ../src/vm_core.h:2087
/home/runner/work/ruby/ruby/build/ruby(rb_ec_ractor_ptr) ../src/vm_core.h:2036
/home/runner/work/ruby/ruby/build/ruby(rb_current_execution_context) ../src/vm_core.h:2105
/home/runner/work/ruby/ruby/build/ruby(rb_current_ractor_raw) ../src/vm_core.h:2104
/home/runner/work/ruby/ruby/build/ruby(rb_current_ractor) ../src/vm_core.h:2112
/home/runner/work/ruby/ruby/build/ruby(rb_current_ractor) ../src/vm_core.h:2110
/home/runner/work/ruby/ruby/build/ruby(vm_locked) ../src/vm_sync.c:15
/home/runner/work/ruby/ruby/build/ruby(rb_vm_lock_enter_body) ../src/vm_sync.c:141
/home/runner/work/ruby/ruby/build/ruby(rb_vm_lock_enter+0xa) [0x557905390a5a] ../src/vm_sync.h:76
/home/runner/work/ruby/ruby/build/ruby(fiber_pool_stack_release) ../src/cont.c:777
/home/runner/work/ruby/ruby/build/ruby(fiber_stack_release+0xe) [0x557905392075] ../src/cont.c:919
/home/runner/work/ruby/ruby/build/ruby(cont_free) ../src/cont.c:1087
/home/runner/work/ruby/ruby/build/ruby(fiber_free) ../src/cont.c:1180
```
This would have run into an assertion error in a debug build, but we don't run debug builds of MMTK on GitHub's CI.
Co-authored-by: John Hawthorn <john.hawthorn@shopify.com>
* Add support for `cause:` argument to `Fiber#raise` and `Thread#raise`.
The behaviour is consistent with the `Kernel#raise` and
`Exception#initialize` methods, allowing the `cause:` argument to be
passed to `Fiber#raise` and `Thread#raise`. This ensures that the
`cause:` argument is handled correctly, providing more consistent and
expected behaviour when raising exceptions in fibers and threads (a
usage sketch follows below).
[Feature #21360]
* Shared specs for Fiber/Thread/Kernel raise.
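A usage sketch, assuming the keyword behaves like `Kernel#raise`'s `cause:` (the error classes and messages are only illustrative):
```ruby
original = RuntimeError.new("original failure")

f = Fiber.new { Fiber.yield }
f.resume  # suspend the fiber at Fiber.yield

begin
  # Raise inside the fiber; the unrescued exception propagates back here.
  f.raise(ArgumentError, "re-raised in fiber", cause: original)
rescue ArgumentError => e
  p e.cause  # => #<RuntimeError: original failure>
end
```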
---------
Co-authored-by: Samuel Williams <samuel.williams@shopify.com>
When creating fibers in multiple ractors at the same time, there were
issues with the manipulation of this structure, causing segfaults.
I didn't add any tests for this because I'm making a more general
PR in the very near future to be able to run test methods (the test-all suite)
inside multiple ractors at the same time. That is how this bug was
caught: running test/ruby/test_fiber.rb inside 10 ractors at once.
We had been using a stub weak definition of `mprotect` in wasm/missing.c,
but wasi-sdk 23 added mprotect emulation to wasi-libc[^1], so the
emulation is now linked instead. However, the emulation doesn't support
PROT_NONE and fails with ENOSYS, so we need to avoid calling mprotect
entirely on WASI.
[^1]: 7528b13170
The fiber pool allocations form a singly-linked list, so when we're
running with RUBY_FREE_AT_EXIT we need to walk the list and free
each element; otherwise it can be detected as a memory leak.
Use PR_SET_VMA_ANON_NAME to set human-readable names for anonymous
virtual memory areas mapped by `mmap()` when compiled and run on Linux
5.17 or higher. This makes it convenient for developers to debug mmap.
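For reference, named anonymous mappings show up in /proc/<pid>/maps as `[anon:<name>]` entries on such kernels; a quick way to list them from within a Ruby process (the actual label strings Ruby assigns are not spelled out here):
```ruby
# List anonymous mappings that carry a PR_SET_VMA_ANON_NAME label.
# Requires Linux 5.17+; labelled regions appear as "[anon:<name>]".
File.foreach("/proc/self/maps") do |line|
  puts line if line.include?("[anon:")
end
```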
28a1c4f33e3349a98c04b8e068d9c674eb936064 seems to cause an improper
ensure clause to be called. [Bug #20655]
Rather than fixing it properly, I bet it would be much better to simply revert
that commit. This reduces unneeded complexity. Jumping into a block
called by a C function like Hash#each with callcc is the user's fault.
It does not need serious support.
This reverts commit dfa0897de89251a631a67460b941cd24a14c9b55.
This commit accidentally included some change in `parse.h`. Reverting
and re-applying the relevant changes.
Currently, fiber stacks are marked separately from the rest of the
execution context. The fiber code deliberately does _NOT_ set
ec->machine.stack_end on the saved EC, so that the code in
`rb_execution_context_mark` does not mark it; the stack marking is done
in `cont_mark` instead.
We can instead set ec->machine.stack_end and skip the separate stack
marking in `cont_mark`; that way, all machine stack marking shares the
same code (which does the necessary ASAN things).
[Bug #20310]
callcc's implementation is fundamentally incompatible with ASAN. Since
callcc is deprecated and almost never used, it's probably OK to disable
callcc when ruby is compiled with ASAN.
[Bug #20273]
When a forked process was started in a thread, this would result in a
double free during exit of the child process:

    RUBY_FREE_AT_EXIT=1 ./miniruby -e 'Thread.new { fork { } }.join; Process.waitpid'
This is because the main thread in the forked process was not the
initial VM thread, and the new thread's stack was freed as part of
objectspace iteration.
This change also allows rb_threadptr_root_fiber_release to run without
EC being available.
When the RUBY_FREE_ON_SHUTDOWN environment variable is set, manually free memory at shutdown.
Co-authored-by: Nobuyoshi Nakada <nobu@ruby-lang.org>
Co-authored-by: Peter Zhu <peter@peterzhu.ca>
* YJIT: Cancel on-stack jit_return on invalidation
Co-authored-by: Alan Wu <alansi.xingwu@shopify.com>
* Use RUBY_VM_CONTROL_FRAME_STACK_OVERFLOW_P
---------
Co-authored-by: Alan Wu <alansi.xingwu@shopify.com>
Contrary to my initial assumption, rb_threadptr_root_fiber_setup() is
called for each Ractor, not just once at boot. I moved the call to
rb_jit_cont_init() so that it is not called multiple times.