* Consistent and clear.
* Avoids the confusion that "column number" might be understood
as a column in an editor starting at 1 (they all start at 0).
https://github.com/ruby/prism/commit/91f1c4b9d5
In the C API, we want to use slices instead of locations in the
AST. In this case a "slice" is effectively the same thing as the
location, except it is represented using a 32-bit offset and a
32-bit length. This will cut the space used by all of the
locations in the AST in half.
Note that from the Ruby/Java/JavaScript side, this is effectively
an invisible change. This only impacts the C/Rust side.
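As a rough illustration of the space saving, here is a minimal sketch. The layouts are assumptions about a 64-bit build (a location held as two pointer-sized values), not prism's actual struct definitions, and the variable names are hypothetical:

```ruby
# Hypothetical layouts, for illustration only.
source = "foo.bar(1)"
start_offset = 4
length = 3

# Location: start and end stored as two 64-bit (pointer-sized) values.
location = [start_offset, start_offset + length].pack("Q2")
# Slice: a 32-bit offset and a 32-bit length.
slice = [start_offset, length].pack("L2")

# The slice still identifies the same bytes of the source.
offset, len = slice.unpack("L2")
slice_text = source.byteslice(offset, len)

puts location.bytesize # 16
puts slice.bytesize    # 8
puts slice_text        # bar
```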
Since `on_sp` is emitted, it doesn't do a whole lot anymore.
This leaves one incompatibility for code like `"x#$%"`.
Ripper confuses this for bare interpolation with a global, but `$%` is not a valid global name. Still,
it emits two string tokens in such a case. It doesn't make sense for prism to work around this bug,
so the affected files are added as excludes.
Since the only usage of this method makes sense for testing in prism itself,
the method is removed instead of deprecated.
https://github.com/ruby/prism/commit/31be379f98
We would like to do type matching on the VRegId. Extracting the VRegId
from a usize makes the code a bit easier to understand and refactor.
MemBase uses a VReg, and there is also a VReg in Opnd. We should be
sharing types between these two, so this is a step in the direction of
sharing a type.
Continually locking a mutex m can lead to starvation if all other threads are on the waitq of m.
See https://bugs.ruby-lang.org/issues/21840 for more details.
Solution:
When a thread `T1` wakes up `T2` during mutex unlock but `T1` or any other thread successfully acquires it
before `T2`, then we record the `running_time` of the thread during mutex acquisition. Then during unlock, if
that thread's `running_time` is less than the saved running time, we set it back to the saved time.
Fixes [Bug #21840]
It relies too much on VM level concerns, such that it can't be built
with modular GC enabled.
We'll move it into the VM, and then expose it to the GC
implementations so they can use it.
timer_thread_check_exceed() was returning true when the remaining time
was less than 1ms, treating it as "too short time". This caused
sub-millisecond sleeps (like sleep(0.0001)) to return immediately
instead of actually sleeping.
The fix removes this optimization that was incorrectly short-circuiting
short sleep durations. Now the timeout is only considered exceeded when
the actual deadline has passed.
Note: There's still a separate performance issue where MN_THREADS mode
is slower for sub-millisecond sleeps due to the timer thread using
millisecond-resolution polling. This will require a separate fix to
use sub-millisecond timeouts in kqueue/epoll.
[Bug #21836]
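The difference between the two checks can be sketched as follows. This is an illustration only: the helper names and the nanosecond representation are assumptions, not the actual C code of `timer_thread_check_exceed()`:

```ruby
# Hypothetical helpers; times are in nanoseconds.
ONE_MS_NS = 1_000_000

# Old behavior as described: less than 1ms remaining counted as exceeded.
def exceeded_old?(deadline_ns, now_ns)
  deadline_ns - now_ns < ONE_MS_NS
end

# New behavior: exceeded only once the deadline has been reached.
def exceeded_new?(deadline_ns, now_ns)
  now_ns >= deadline_ns
end

now = 0
deadline = 100_000 # sleep(0.0001) is 100 microseconds from now

old_result = exceeded_old?(deadline, now) # true: the sleep returns immediately
new_result = exceeded_new?(deadline, now) # false: the thread actually sleeps
```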
This reverts commit 23f9a0d655c4d405bb2397a147a1523436205486, "win32:
Strip CR from batch files", and adds CR to the other batch files too.
`cmd.exe` seems to work well with LF at a glance, but sometimes `goto`
jumps to an unexpected line. This is probably because it is looking
for the beginning of a line, assuming that all lines end with CRLF,
and as a result mistaking the `goto` operand for a label.
- ### Problem
If you have a `version` in your config file (this feature was
introduced in #6817), then running any `bundle` command will
make Bundler re-exec and ultimately run the `bundle` binstub twice.
### Details
When the `bundle` binstub gets executed, a `require "bundler"` is
evaluated. RubyGems tries to require the `bundler.rb` file from
the right `bundler` gem (in the event where you have multiple
bundler versions in your system).
RubyGems will prioritize a bundler version based on a few
heuristics.
b50c40c92a/lib/rubygems/bundler_version_finder.rb (L19-L21)
This prioritization logic doesn't take into account the bundler version
a user has specified in their config. So what happens is:
1. User executes the `bundle` binstub.
2. `require 'bundler'` is evaluated.
3. RubyGems prioritizes activating the bundler version specified
in the Gemfile.lock.
4. The CLI starts, and [Auto switch kicks in](b50c40c92a/bundler/lib/bundler/cli.rb (L81)). Bundler detects that
the user specified a version in their config and the current Bundler
version doesn't match.
5. Bundler exits and re-execs with the right bundler version.
### Solution
This patch introduces two fixes. First, it reads the bundler config
file and checks the local config first and then the global
config, because the local config has precedence over the global one.
Second, the prioritization takes into account the version in the config
and lets RubyGems activate the right version in order to prevent
re-exec moments later.
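A minimal sketch of the two fixes, under stated assumptions: the helper names are hypothetical, and the `"BUNDLE_VERSION"` key and hash-shaped configs stand in for however the config files are actually parsed:

```ruby
# Hypothetical helper: pick the configured Bundler version, letting the
# local config win over the global one. The "BUNDLE_VERSION" key and the
# hash shape are assumptions made for this sketch.
def configured_bundler_version(local_config, global_config)
  local_config["BUNDLE_VERSION"] || global_config["BUNDLE_VERSION"]
end

# Hypothetical prioritization: move the configured version to the front
# and keep the relative order of everything else.
def prioritize(versions, configured)
  return versions unless configured
  match, rest = versions.partition { |v| v == configured }
  match + rest
end

local  = { "BUNDLE_VERSION" => "2.6.0" }
global = { "BUNDLE_VERSION" => "2.5.0" }

picked  = configured_bundler_version(local, global)
ordered = prioritize(["4.0.0", "2.5.0", "2.6.0"], picked)

puts picked          # 2.6.0
puts ordered.inspect # ["2.6.0", "4.0.0", "2.5.0"]
```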
Finally, I also want to fix this problem because it's a step toward
fixing https://github.com/ruby/rubygems/issues/8106. I'll open
a follow-up patch to explain.
https://github.com/ruby/rubygems/commit/d6e0f43133
- Fix https://github.com/ruby/rubygems/issues/9238
- ### Problem
This is an issue that bites gem maintainers from time to time, with
the most recent one in https://github.com/minitest/minitest/issues/1040#issuecomment-3679370619
The issue is summarized as follows:
1) A gem "X" has a feature in "lib/feature.rb"
2) Maintainer wants to extract this feature into its own gem "Y"
3) Maintainer cuts a release of X without that feature.
4) Users install the new version of X and also install the new
gem "Y" since the feature is now extracted.
5) When a call to "require 'feature'" is encountered, RubyGems will
fail to load the right gem, resulting in a `LoadError`.
### Details
Now that we have two gems (old version of X and new gem Y) with
the same path, RubyGems will detect that `feature.rb` can be loaded
from the old version of X, but if the new version of X had already
been loaded, then RubyGems will raise due to versions conflicting.
```ruby
require 'x' # Loads the new version of X without the feature which was extracted.
require 'feature' # RubyGems sees that the old version of X includes that file and tries to activate the spec.
```
### Solution
I propose that RubyGems fall back to a spec that's not yet loaded.
We try to find a spec by its path and filter it out in case a spec
with the same name has already been loaded.
It's worth noting that RubyGems already has a
`find_inactive_by_path`, but we can't use it. This method only checks
whether the spec object is active and doesn't check whether another spec
with the same name has been loaded. The new method we are introducing
verifies this.
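The lookup described above can be sketched roughly like this. The `Spec` struct and `find_unloaded_by_path` are hypothetical stand-ins for illustration, not the RubyGems API or the method added by this patch:

```ruby
Spec = Struct.new(:name, :version, :files)

# Hypothetical helper: find a spec that provides `path`, skipping any spec
# whose name is already loaded (activating it would conflict).
def find_unloaded_by_path(specs, loaded_names, path)
  specs.find do |spec|
    spec.files.include?(path) && !loaded_names.include?(spec.name)
  end
end

specs = [
  Spec.new("x", "1.0", ["x.rb", "feature.rb"]), # old X still ships the feature
  Spec.new("x", "2.0", ["x.rb"]),               # new X, feature extracted
  Spec.new("y", "1.0", ["feature.rb"]),         # the extracted gem
]
loaded_names = ["x"] # `require 'x'` already activated x 2.0

found = find_unloaded_by_path(specs, loaded_names, "feature.rb")
puts found.name # y
```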
https://github.com/ruby/rubygems/commit/f298e2c68e
Timeout with 0-valued timespec means try to get an event, but return
immediately if there is none. Apparently timespec can have other
members, so best to 0 it out in that case.
If an exception is raised by the SSLContext#servername_cb proc, the
handshake should be canceled by sending an "unrecognized_name" alert to
the client, and the exception should be re-raised from SSLSocket#accept.
Add more direct assertions to confirm these behaviors.
https://github.com/ruby/openssl/commit/ac8df7f30f
The errno reported in an OpenSSL::SSL::SSLError raised by
SSLSocket#accept and #connect sometimes does not match what SSL_accept()
or SSL_connect() actually encountered. Depending on the evaluation order
of arguments passed to ossl_raise(), errno may be overwritten by
peeraddr_ip_str().
While we could just fix peeraddr_ip_str(), we should avoid passing
around errno since it is error-prone. Replace rb_sys_fail() and
rb_io_{maybe_,}wait_{read,writ}able() with equivalents that do not read
errno.
https://github.com/ruby/openssl/commit/bfc7df860f
It was not clear to me that you have to do anything for this command to work.
Previous versions (for example on the 3.4 branch) had this check
but it got lost along the way.
Without this when the folder doesn't exist, you get this error (after it deleted all the files):
```
$ ./tool/sync_default_gems.rb syntax_suggest
Sync ruby/syntax_suggest
./tool/sync_default_gems.rb:464:in 'SyncDefaultGems.check_prerelease_version': undefined method 'version' for nil (NoMethodError)
puts "#{gem}-#{spec.version} is not latest version of rubygems.org" if spec.version.to_s != latest_version
^^^^^^^^
from ./tool/sync_default_gems.rb:436:in 'SyncDefaultGems.sync_default_gems'
from ./tool/sync_default_gems.rb:942:in '<module:SyncDefaultGems>'
from ./tool/sync_default_gems.rb:10:in '<main>'
```
Now you get
```
$ ./tool/sync_default_gems.rb syntax_suggest
Sync ruby/syntax_suggest
Expected '../ruby/syntax_suggest' (/home/earlopain/Documents/ruby/syntax_suggest) to be a directory, but it didn't exist.
```
This was changed in b722631b48
Since then, `sync_lib` has been unused, so delete it.
After Kokubun requested named unions, I realized we don't actually need
a `Type::subtract` function. It was only used for the ad-hoc unions.
Also, add a test that is illustrative of what we can get from this
partial SSI.
This is a follow up to #15816. Since I was only optimizing `invokesuper` for monomorphic cases, I could track that with a boolean value (actually, `Option` in this case). But, `TypeDistribution` is a better way to track this information and will put us on better footing if we end up handling polymorphic cases.
Do a sort of "partial static single information (SSI)" form that learns
types of operands from branch instructions. A branchif, for example,
tells us that in the truthy path, we know the operand is not nil, and
not false. Similarly, in the falsy path, we know the operand is either
nil or false.
Add a RefineType instruction to attach this information.
This PR does this in SSA construction because it's pretty
straightforward, but we can also do a more aggressive version of this
that can learn information about e.g. int ranges from other checks later
in the optimization pipeline.
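What a `branchif` teaches us about its operand can be sketched with a tiny bitset. The type bits and the helper below are hypothetical; ZJIT's real type lattice is much richer and is written in Rust:

```ruby
# Hypothetical type bits for this sketch only.
NIL_T   = 0b001
FALSE_T = 0b010
OTHER_T = 0b100 # every truthy value, lumped together
FALSY   = NIL_T | FALSE_T
ANY     = FALSY | OTHER_T

# Types the operand of a `branchif` can have on each outgoing edge.
def refine_on_branchif(operand_type)
  {
    truthy: operand_type & ~FALSY, # not nil and not false
    falsy:  operand_type & FALSY,  # nil or false
  }
end

refined = refine_on_branchif(ANY)
nil_only = refine_on_branchif(NIL_T)
```

In this picture, a RefineType instruction on each edge would carry the narrowed bitset for later passes to use.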
This PR is a follow-up to #15816. There, I introduced the `GuardSuperMethodEntry` HIR instruction and that needed the LEP. The LEP was also used by `GetBlockHandler`. Consequently, the codegen for `invokesuper` ended up loading the LEP twice. By introducing a new HIR instruction, we can load the LEP once and use it in both `GetBlockHandler` and `GuardSuperMethodEntry`.
I also updated `IsBlockGiven`, which conditionally loaded the LEP. To ensure we only use `GetLEP` in the cases we need it, I lifted most of the `IsBlockGiven` handler to HIR. As an added benefit, this addressed a TODO that @tekknolagi had written: when `block_given?` is called outside of a method we can rewrite to a constant `false`.
We could use `GetLEP` in the handling of `Defined`, but that looked a bit more involved and I wanted to keep this PR focused, so I'm suggesting we handle that as future work.
We want to use [linear scan register allocation](https://bernsteinbear.com/blog/linear-scan/), but a prerequisite is having a CFG available. Previously, LIR only had a linear block of instructions; this PR introduces a CFG to the LIR backend. I've done my best to ensure that the "hot path" machine code we generate is the same (as I was testing I noticed that side exit machine code was being dumped in a different order).
This PR doesn't make any changes to the existing register allocator; it simply introduces a CFG to LIR. The basic blocks in the LIR CFG always start with a label (the first instruction is a label) and the last 0, 1, or 2 instructions will be jump instructions. No other jump instructions should appear mid-block.
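The block invariants can be expressed as a small checker. This is a sketch over a made-up instruction representation (`[opcode, *operands]` arrays and a hypothetical `JUMPS` list), not the LIR data structures themselves:

```ruby
# Hypothetical instruction representation: [opcode, *operands].
JUMPS = %i[jmp je jne].freeze

# Returns true if the block starts with a label, ends with at most two
# jumps, and has no jump anywhere else.
def well_formed_block?(insns)
  return false unless insns.first&.first == :label
  body = insns.drop(1)
  trailing = body.reverse.take_while { |insn| JUMPS.include?(insn.first) }.size
  return false if trailing > 2
  body.take(body.size - trailing).none? { |insn| JUMPS.include?(insn.first) }
end

good = [[:label, :bb1], [:add, :r0, 1], [:jne, :bb2], [:jmp, :bb3]]
bad  = [[:label, :bb1], [:jmp, :bb2], [:add, :r0, 1]] # jump mid-block

puts well_formed_block?(good) # true
puts well_formed_block?(bad)  # false
```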
The logic for the different methods branched at the class level instead of inside the method in order to be eagerly aggressive about runtime performance. This code is currently only used once for the document, where it's invoked ~N times (where N is the number of lines):
```ruby
module SyntaxSuggest
  class CleanDocument
    # ...
    def join_trailing_slash!
      trailing_groups = @document.select(&:trailing_slash?).map do |code_line|
        take_while_including(code_line.index..) { |x| x.trailing_slash? }
      end
      join_groups(trailing_groups)
      self
    end
    # ...
  end
end
```
Since this is not currently a hot-spot I think merging the branches and using a case statement is a reasonable tradeoff and avoids the need to do specific version testing.
An alternative idea was presented in #241 of behavior-based testing for branch logic (which I would prefer), however, calling the code triggered requiring a `DelegateClass` when the `syntax_suggest/api` is being required.
https://github.com/ruby/syntax_suggest/commit/ab122c455f
The actual algorithm is largely unchanged, just allowed to use
singlebyte checks for common encodings.
It could certainly be optimized much further, as here again it often
scans from the front of the string when we're interested in the back of
it. But the algorithm has many Windows-only corner cases, so I'd rather
ship a good improvement now and come back to it later.
Most of the improvement here is from the reduced setup cost (avoid double
null checks, avoid duping the argument, etc.), and skipping the multi-byte
checks.
```
compare-ruby: ruby 4.1.0dev (2026-01-19T03:51:30Z master 631bf19b37) +PRISM [arm64-darwin25]
built-ruby: ruby 4.1.0dev (2026-01-21T08:21:05Z opt-basename 7eb11745b2) +PRISM [arm64-darwin25]
```
| |compare-ruby|built-ruby|
|:----------|-----------:|---------:|
|long | 3.412M| 18.158M|
| | -| 5.32x|
|long_name | 1.981M| 8.580M|
| | -| 4.33x|
|withext | 3.200M| 12.986M|
| | -| 4.06x|
There is no splitting for these, so let's add an assert to try and catch
misuse. VRegs are not necessarily registers in the end, so this is best
effort. In those situations they'll get a less proximate panic message.
Since we automatically preserve registers across calls, it's never
necessary to manually and imprecisely do it with `C{Push,Pop}All`.
Delete them to remove the maintenance burden and reduce confusion.
You're supposed to return the first argument.
```rb
# Before
[[:stmts_new], [:rescue_mod, nil, nil], [:stmts_add, nil, nil], [:program, nil]]
# After
[[:stmts_new], [:rescue_mod, "1", "2"], [:stmts_add, nil, "1"], [:program, nil]]
```
The correct result would be:
`[[:rescue_mod, "1", "2"], [:stmts_new], [:stmts_add, nil, "1"], [:program, nil]]`
But the order depends on the prism AST so it seems very difficult to match.
https://github.com/ruby/prism/commit/94e0107729
Resolves https://github.com/Shopify/ruby/issues/880
Implemented this by using the code generation for `GuardType` as a reference.
Not sure if this is the best way to go about it, but it seems to work.
As `TARGET_SO_DIR_TIMESTAMP` contains `ruby_version`, after bumping
`RUBY_ABI_VERSION` it should not exist. Usually such outdated
files will be removed by `make outdate-bundled-gems` automatically
invoked by `make up`.
* Handle line continuations.
* Handle space at the end of file in LexCompat.
https://github.com/ruby/prism/commit/32bd13eb7d
Co-authored-by: Earlopain <14981592+Earlopain@users.noreply.github.com>
Similar optimizations to the ones performed in GH-15907.
- Skip the expensive multi-byte encoding handling for the common
encodings that are known to be safe.
- Use `CheckPath` to save on copying the argument and only scan it for
NULL bytes once.
- Create the return string with rb_enc_str_new instead of rb_str_subseq
as it's going to be a very small string anyway.
This could be optimized a little bit further by searching for both `.` and `dirsep`
in one pass.
```
compare-ruby: ruby 4.1.0dev (2026-01-19T03:51:30Z master 631bf19b37) +PRISM [arm64-darwin25]
built-ruby: ruby 4.1.0dev (2026-01-20T07:33:42Z master 6fb50434e3) +PRISM [arm64-darwin25]
```
| |compare-ruby|built-ruby|
|:----------|-----------:|---------:|
|long | 3.606M| 22.229M|
| | -| 6.17x|
|long_name | 2.254M| 13.416M|
| | -| 5.95x|
|short | 16.488M| 29.969M|
| | -| 1.82x|
`strrdirsep` quite inefficiently searches for the last separator from the front
of the string.
This is surprising but necessary because in Shift_JIS, `0x5c` can
be the second byte of some multi-byte characters, as such it's
not possible to do a pure ASCII search. And it's even more costly
because for each character we need to do expensive checks to
handle this possibility.
However in the overwhelming majority of cases, paths are encoded
in UTF-8 or ASCII, so for these common encodings we can use the
more logical and efficient algorithm.
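To make the Shift_JIS problem concrete, and to show the backwards scan that is safe for UTF-8, here is a small sketch. `last_separator_index` is a hypothetical helper that only looks for `/`; it is not `strrdirsep` and ignores the Windows-specific cases:

```ruby
# In Shift_JIS, the second byte of some characters is 0x5c ("\").
sjis_bytes = "\u8868".encode("Shift_JIS").bytes # U+8868 is two bytes in Shift_JIS
puts sjis_bytes.map { |b| format("%02x", b) }.join(" ") # 95 5c

# In UTF-8 every byte of a multi-byte character is >= 0x80, so an ASCII
# separator can be found by scanning bytes from the end.
def last_separator_index(path)
  i = path.bytesize - 1
  i -= 1 while i >= 0 && path.getbyte(i) != 0x2f # "/"
  i # -1 when there is no separator
end

idx = last_separator_index("/usr/lib/\u8868.rb")
puts idx # 8
```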
```
compare-ruby: ruby 4.1.0dev (2026-01-17T14:40:03Z master 00a3b71eaf) +PRISM [arm64-darwin25]
built-ruby: ruby 4.1.0dev (2026-01-19T07:43:57Z file-dirname-lower.. a8d3535e5b) +PRISM [arm64-darwin25]
```
| |compare-ruby|built-ruby|
|:------|-----------:|---------:|
|long | 3.974M| 23.674M|
| | -| 5.96x|
|short | 15.281M| 29.034M|
| | -| 1.90x|
- `str_null_check` was performed twice, once by `FilePathStringValue`
and a second time by `StringValueCStr`.
- `StringValueCStr` was checking for the terminator presence, but we
don't care about that.
- `FilePathStringValue` calls `rb_str_new_frozen` to ensure `fname`
isn't mutated, but that's costly for such a check. Instead we
can do it in debug mode only.
- `rb_enc_get` is slow because it accepts arbitrary objects, even immediates,
so it has to do numerous type checks. Add a much faster `rb_str_enc_get`
when we know we're dealing with a string.
- `rb_enc_copy` is slow for the same reasons. Since we already have the
encoding, we can use `rb_enc_str_new` instead.
Previously, there were a lot of nops after conditional branches. They
come from branches to LIR labels:
./miniruby --zjit-call-threshold=1 --zjit-dump-disasm -e 'Object || String'
# Insn: v14 CheckInterrupts
# RUBY_VM_CHECK_INTS(ec)
ldur w2, [x20, #0x20]
tst w2, w2
b.ne #0x120900278
nop
nop
nop
nop
nop
# Insn: v15 Test v11
tst x0, #-5
mov x2, #0
mov x3, #1
csel x2, x2, x3, eq
# Insn: v16 IfTrue v15, bb3(v6, v11)
tst x2, x2
b.eq #0x120900198
nop
nop
nop
nop
nop
They gunk up the disassembly and can't be helpful for speed. This commit
removes them. I think they were accidentally inherited from certain YJIT
branches that require padding for patching. ZJIT doesn't have these
requirements.
Use a single branch instruction for conditional branches to labels; Jmp
already uses a single `B` instruction. This will work for assemblers
that generate less than ~260,000 instructions -- plenty.
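The "~260,000" figure follows from the A64 encoding of `B.cond`, which holds a 19-bit signed offset counted in 4-byte instructions. A quick worked calculation (the constant names are just for this sketch):

```ruby
# A64 `B.cond` encodes a 19-bit signed offset counted in 4-byte instructions.
IMM_BITS   = 19
INSN_BYTES = 4

max_insns = 2**(IMM_BITS - 1)      # backward reach; forward reach is one less
max_bytes = max_insns * INSN_BYTES # the same reach in bytes

puts max_insns # 262144
puts max_bytes # 1048576 (1 MiB)
```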
Let the CodeBlock::label_ref() callback return a failure, so we can
fail compilation instead of panicking in case we do get large offsets.
Related to [Bug #21842].
* rb_interned_str: document what decides whether the returned string is
in US-ASCII or BINARY encoding.
* rb_interned_str_cstr: include the same description as rb_interned_str
for the encoding. This one was still missing the update for US-ASCII
and erroneously said the returned string was always in BINARY encoding.
* rb_str_to_interned_str: document how the encoding of the result is
defined.
Co-authored-by: Herwin <herwinw@users.noreply.github.com>
`chompdirsep` searches from the start of the string each time, which
perhaps is necessary for certain encodings (not even sure?) but for
the common encodings it's very wasteful. Instead we can start from the
back of the string and only compare one or two characters in most cases.
Also replace `StringValueCStr` with the simpler `rb_str_null_check`
as we only care about whether the string contains `NULL` bytes, we
don't care whether it is NULL terminated or not.
We also only check the final string for NULLs.
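The idea can be sketched in Ruby as follows. Both helpers are hypothetical and only handle `/` in an ASCII-compatible encoding such as UTF-8; the real `chompdirsep` and `File.join` are C code with more cases:

```ruby
SEP = 0x2f # "/"

# Hypothetical helper: byte length of `path` without trailing separators,
# found by walking backwards. Assumes an encoding in which 0x2f never
# appears inside a multi-byte character (e.g. UTF-8).
def length_without_trailing_seps(path)
  len = path.bytesize
  len -= 1 while len > 0 && path.getbyte(len - 1) == SEP
  len
end

# Hypothetical two-argument join built on top of it. Only the final
# string is checked for null bytes.
def join2(a, b)
  joined = a.byteslice(0, length_without_trailing_seps(a)) + "/" + b.sub(%r{\A/+}, "")
  raise ArgumentError, "string contains null byte" if joined.include?("\0")
  joined
end

puts join2("a//", "/b")    # a/b
puts join2("/usr", "lib")  # /usr/lib
```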
```
compare-ruby: ruby 4.1.0dev (2026-01-17T14:40:03Z master 00a3b71eaf) +PRISM [arm64-darwin25]
built-ruby: ruby 4.1.0dev (2026-01-18T12:55:15Z spedup-file-join 5948e92e03) +PRISM [arm64-darwin25]
warming up....
| |compare-ruby|built-ruby|
|:-------------|-----------:|---------:|
|two_strings | 2.477M| 19.317M|
| | -| 7.80x|
|many_strings | 547.577k| 10.298M|
| | -| 18.81x|
|array | 515.280k| 523.291k|
| | -| 1.02x|
|mixed | 621.840k| 635.422k|
| | -| 1.02x|
```
`File.join` is a hotspot for common libraries such as Zeitwerk
and Bootsnap. It has a fairly flexible signature, but 99% of
the time it's called with just two (or a small number of) UTF-8 strings.
If we optimistically optimize for that use case we can cut down a large
number of type and encoding checks, significantly speeding up the method.
The one remaining expensive check we could try to optimize is `str_null_check`.
Given it's common to use the same base string for joining, we could memoize it.
Also we could precompute it for literal strings.
```
compare-ruby: ruby 4.1.0dev (2026-01-17T14:40:03Z master 00a3b71eaf) +PRISM [arm64-darwin25]
built-ruby: ruby 4.1.0dev (2026-01-18T12:10:38Z spedup-file-join 069bab58d4) +PRISM [arm64-darwin25]
warming up....
| |compare-ruby|built-ruby|
|:-------------|-----------:|---------:|
|two_strings | 2.475M| 9.444M|
| | -| 3.82x|
|many_strings | 551.975k| 2.346M|
| | -| 4.25x|
|array | 514.946k| 522.034k|
| | -| 1.01x|
|mixed | 621.236k| 633.189k|
| | -| 1.02x|
```
InvokeProc and HIR effects landed without an intermediate rebase, so we
got a conflict in the form of a type checker error (a new opcode not
handled in a new function).
**Progress**
I've added a new directory, `zjit/src/hir_effect`. It follows the same structure as `zjit/src/hir_type` and includes:
- a Ruby script to generate a Rust file containing a bitset of effects we want to track
- a modified `hir.rs` to include an `effects_of` function that catalogs effects for each HIR instruction, similar to `infer_type`. Right now these effects are not specialized: all instructions currently return the top of the lattice (any effect)
- a module file for effects at `zjit/src/hir_effect/mod.rs` that, again, mirrors `zjit/src/hir_type/mod.rs`. This contains a lot of helper functions and lattice operations like union and intersection
**Design Idea**
The effect system is bitset-based rather than range-based. This is the first kind of effect system described in [Max's blog post](https://bernsteinbear.com/blog/compiler-effects/).
Practically, having effects defined for each HIR instruction should allow us to have better generalization than the implicit effect system we have for C functions that we annotate as elidable, leaf, etc. Additionally, this could allow us to reason about the effects of multiple HIR instructions unioned together, something I don't believe currently exists.
**Practical Goals**
This PR replaces `has_effects` with a new effects-based `is_elidable` function. This has no behavior change to the JIT, but will make it easier to reason about effects of basic blocks and CCalls with the new design. We may be able to accomplish other quality of life improvements, such as consolidation of `nogc`, `leaf`, and other annotations.
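A bitset-based effect lattice of this kind can be sketched in a few lines. The effect names and the elidability rule below are placeholders chosen for illustration; they are not the effects generated for ZJIT:

```ruby
# Hypothetical effect bits; the names are placeholders for this sketch.
module Effect
  NONE    = 0b000
  READ    = 0b001
  WRITE   = 0b010
  CONTROL = 0b100
  ANY     = READ | WRITE | CONTROL # top of the lattice

  def self.union(a, b)
    a | b
  end

  def self.intersect(a, b)
    a & b
  end

  # a is a subset of b when it has no bits outside b.
  def self.subset?(a, b)
    (a & ~b).zero?
  end

  # Assumed rule for this sketch: elidable if it at most reads.
  def self.elidable?(effects)
    subset?(effects, READ)
  end
end

# Effects of two instructions considered together.
block_effects = Effect.union(Effect::READ, Effect::WRITE)

puts Effect.elidable?(Effect::READ) # true
puts Effect.elidable?(block_effects) # false
```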
This is everything that `irb` uses. It works in their test-suite, but there are 20 failures when using the shim that I haven't looked into at all.
`parse` is not used by `irb`. `scan` is, and it's basically `parse` but also including errors. `irb` doesn't seem to care about the errors, so I didn't implement that.
https://github.com/ruby/prism/commit/2c5826b39f
This patch silences the "this won't work in the next version of Ruby"
warning displayed when irb is autoloaded via `binding.irb`.
main.rb:1: warning: irb used to be loaded from the standard library, but is not part of the default gems since Ruby 4.0.0.
You can add irb to your Gemfile or gemspec to fix this error.
/.../irb.rb:9: warning: reline used to be loaded from the standard library, but is not part of the default gems since Ruby 4.0.0.
You can add reline to your Gemfile or gemspec to fix this error.
From: main.rb @ line 1 :
=> 1: binding.irb
/.../input-method.rb:284: warning: rdoc used to be loaded from the standard library, but is not part of the default gems since Ruby 4.0.0.
You can add rdoc to your Gemfile or gemspec to fix this error.
This warning is incorrect and misleading: users should not need to
add irb (and its dependencies) to their Gemfiles to use
`binding.irb`, even in future versions of Ruby. It is agreed that the
runtime takes care of that.
This patch fixes a problem where `binding.irb` (= force_activate('irb'))
fails under `bundle exec` when the Gemfile does not contain `irb` and
does contain a gem which is (1) not installed in GEM_HOME (2) sourced
using `path:`/`git:`.
The original approach constructing a temporary definition fails since
it does not set the equivalent of `path:`/`git:`.
Always reconstructing a definition from a Gemfile and applying lockfile
constraints should be a more robust approach.
[Bug #21723]
The previous example code was too complex and included extra logic
that's not relevant to its main usage: `bind`.
The new example code focuses on `bind_call` so that readers can
understand how it works more easily.
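For readers unfamiliar with it, here is a minimal illustration of `UnboundMethod#bind_call`. This is not the example from the documentation change; the `Proxy` class is made up for this sketch:

```ruby
class Proxy
  # Pretend to be something else.
  def class
    String
  end
end

proxy = Proxy.new

# Take Kernel#class as an UnboundMethod and call it with `proxy` as the
# receiver, bypassing Proxy#class.
real_class = Kernel.instance_method(:class).bind_call(proxy)

puts proxy.class # String
puts real_class  # Proxy
```

`bind_call(recv, ...)` behaves like `bind(recv).call(...)` without creating the intermediate `Method` object.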
* ZJIT: Profile `invokesuper` instructions
* ZJIT: Introduce the `InvokeSuperDirect` HIR instruction
The new instruction is an optimized version of `InvokeSuper` when we know the `super` target is an ISEQ.
* ZJIT: Expand definition of unspecializable to more complex cases
* ZJIT: Ensure `invokesuper` optimization works when the inheritance hierarchy is modified
* ZJIT: Simplify `invokesuper` specialization to most common case
Looking at ruby-bench, most `super` calls don't pass a block, which means we can use the already optimized `SendWithoutBlockDirect`.
* ZJIT: Track `super` method entries directly to avoid GC issues
Because the method entry isn't typed as a `VALUE`, we set up barriers on its `VALUE` fields. But, that was insufficient as the method entry itself could be collected in certain cases, resulting in dangling objects. Now we track the method entry as a `VALUE` and can more naturally mark it and its children.
* ZJIT: Optimize `super` calls with simple argument forms
* ZJIT: Report the reason why we can't optimize an `invokesuper` instance
* ZJIT: Revise send fallback reasons for `super` calls
* ZJIT: Assert `super` calls are `FCALL` and don't need visibility checks
Previously, cpop_all() did not in fact restore the register mapping
state since it was effectively doing a no-op
`self.ctx.set_reg_mapping(self.ctx.get_reg_mapping())`. This desync in
bookkeeping led to issues with the --yjit-dump-insns option because
print_str() used to use cpush_all() and cpop_all().
cpush_all() and cpop_all() in theory enabled these `print_*` utilities
to work in more spots, but with automatically spilling in asm.ccall(),
the benefits are now limited. They also have a bug at the moment. Stop
using them to dodge the bug.
io.c: pre-allocate IO.select result arrays based on input size
The ternary (rp?rb_ary_new():rb_ary_new2(0)) became pointless after
commit a51f30c671 (Variable Width Allocation, Mar 2022) made both
rb_ary_new() and rb_ary_new2(0) equivalent.
Instead of just removing the dead code, improve on the original intent
by pre-allocating based on the actual input array size. This avoids
reallocations when many FDs are ready.
Benchmark (100 ready FDs): ~8% improvement (5.59 -> 5.11 us/op)
Resolves https://github.com/Shopify/ruby/issues/915
When we have `LoadField` with a `Shape` return type, we can fold it similar to the object case.
`GuardBitEquals` can be removed when the argument is `Const` and the values are equal.
The behaviors for loading instance variables from frozen/dynamic objects are already covered in existing tests so no new tests were added.
Commit 981ee02c7c ("Fix performance problem with /k/i and /s/i") was
merged for Ruby 4.0 to enable partial Boyer-Moore optimization for
patterns containing 's' or 'k' by using the prefix before those
characters.
However, when 's' or 'k' appears at the start of a pattern (no usable
prefix), set_bm_skip() returns 0 and the code returned early without
setting any optimization mode, leaving reg->optimize at
ONIG_OPTIMIZE_NONE. This caused up to 30x slowdown for patterns like
/slackware/i when matched against strings with non-ASCII characters.
This patch keeps the improvement from 981ee02c7c for patterns with
3+ char prefix, while fixing the regression by falling back to
ONIG_OPTIMIZE_EXACT_IC with the full pattern when the usable prefix
is less than 3 characters.
Before: /\bslackware\b/i with non-ASCII string: 2.24 us/op
After: /\bslackware\b/i with non-ASCII string: 0.70 us/op (3.2x faster)
[Bug #21824]
on `has_commit` check for the `backport` command.
I don't maintain a local "master" branch in my ruby repository for stable
branch maintenance. I want just running `git fetch origin` to be enough to
make it work. It should also work for those who pull origin/master into
their local master.
One per version seems excessive.
Do note that `rubocop-ast` used to require individual parser files. I wouldn't consider that to be part of the API since everything is autoloaded.
From a GitHub code search, I didn't find anyone else doing it like that.
https://github.com/ruby/prism/commit/458f622c34
When onig_reg_init() returns an error, onig_free_body() which is called
via onig_new() may crash because some members are not properly
initialized. Fix it.
https://github.com/k-takata/Onigmo/commit/d2a090a57e
The 'w' format (BER compressed integer) was allocating an empty
string with rb_str_new(0, 0) then immediately overwriting it with
the correctly-sized allocation. Remove the wasted first allocation.
~50% improvement on BER pack benchmarks.
RSTRUCT_LEN / RSTRUCT_GET / RSTRUCT_SET all exist in two
versions: a public one that does type and frozen checks,
and a private one that doesn't.
The problem is that this is error-prone because the public version
is always accessible, but the private one requires including
`internal/struct.h`. So you may have some code that relies on the
public version, and later on the private header is included and
changes the behavior.
This already led to introducing a bug in YJIT & ZJIT:
https://github.com/ruby/ruby/pull/15835
This batch file used `nmake` on the old `command.com` to extract the
parent directory name of this file and to get around the command line
argument length limit. However, Windows 9X support as a build host
ended over a decade ago, and this file now utilizes the functionality
of `cmd.exe` already.
Ripper exposes Ripper::Lexer::State in its output, which is a bit of a problem. To make this work, I basically copy-pasted the implementation.
I'm unsure if that is acceptable and added a test to make sure that these values never go out of sync.
I don't imagine them changing often, prism maps them 1:1 for its own usage.
This also fixed the shim by accident. `Ripper.lex` went to `Translation::Ripper.lex` when it should have been the original. Removing the need for the original resolves that issue.
https://github.com/ruby/prism/commit/2c0bea076d
If the baseruby is explicitly specified, fail because the option is
not accepted if it does not meet the requirements. If the option is
not specified, just display the warning and continue, in the hope that
it is not needed.
Follow up GH-15809
When requiring a file like "benchmark/ips", the warning system would
incorrectly warn about the "benchmark" gem not being a default gem,
even when the user has "benchmark-ips" (a separate third-party gem)
in their Gemfile.
The fix checks if a hyphenated version of the require path exists in
the bundle specs before issuing a warning. For example, requiring
"benchmark/ips" now checks for both "benchmark" and "benchmark-ips"
in the Gemfile.
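The check can be sketched like this. Both helpers are hypothetical and operate on plain arrays of gem names rather than on the real bundle specs:

```ruby
# Hypothetical helper: gem names that could provide a require path.
def candidate_gem_names(require_path)
  first = require_path.split("/").first
  [first, require_path.tr("/", "-")].uniq
end

# Hypothetical check: warn only if none of the candidates is in the bundle.
def warn_about_default_gem?(require_path, bundled_names)
  (candidate_gem_names(require_path) & bundled_names).empty?
end

puts candidate_gem_names("benchmark/ips").inspect # ["benchmark", "benchmark-ips"]
puts warn_about_default_gem?("benchmark/ips", ["benchmark-ips"]) # false
puts warn_about_default_gem?("benchmark/ips", ["rake"])          # true
```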
[Bug #21828]
This test obtains an available port number by calling `TCPServer.new`,
then closes it and passes the same port number as `local_port` to `TCPSocket.new`.
However, `TCPSocket.new` could occasionally fail with `Errno::EADDRINUSE`
at the bind(2) step.
I believe this happens when tests are run in parallel and another process
on the same host happens to bind the same port in the short window between
closing the `TCPServer` and calling `TCPSocket.new`.
To address this race condition, the test now retries with a newly selected
available port when such a conflict occurs.
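The retry pattern looks roughly like the following. This is a sketch, not the test's actual code: `with_available_port` and `free_port` are hypothetical helpers, and `pick_port` is injectable so the retry logic can be exercised without real sockets:

```ruby
require "socket"

# Hypothetical helper: yield a port that was free a moment ago, picking a
# new one if the block hits Errno::EADDRINUSE.
def with_available_port(retries: 3, pick_port: -> { free_port })
  attempts = 0
  begin
    attempts += 1
    yield pick_port.call
  rescue Errno::EADDRINUSE
    retry if attempts <= retries
    raise
  end
end

# Ask the OS for a free port, then release it (this is where the race is:
# another process may bind the port before we use it).
def free_port
  server = TCPServer.new("127.0.0.1", 0)
  server.addr[1]
ensure
  server&.close
end
```

A caller would wrap the `TCPSocket.new(host, port, local_host, local_port)` call in the block, so a conflict at bind(2) simply triggers another attempt with a fresh port.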
I stumbled across a bundler bug that had me scratching my head for
a while, because I hadn't experienced it before.
In some cases when changing the source in a gemfile from a
`Source::Gemspec` to either a `Source::Path` or `Source::Git`, only the
parent gem will have its gem replaced and updated, and the child
components will retain the original version. This only happens if the gem
version of the `Source::Gemspec` and `Source::Git` are the same. It also
requires another gem to share a dependency with the one being updated.
For example if I have the following gemfile:
```
gem "rails", "~> 8.1.1"
gem "propshaft"
```
Rails has a component called `actionpack` which `propshaft` depends on.
If I change `rails` to point at a git source (or path source), only the
path for `rails` gets updated:
```
gem "rails", github: "rails/rails", branch: "8-1-stable"
gem "propshaft"
```
Because `actionpack` is a dependency of `propshaft`, it will remain in
the rubygems source in the lock file WHILE the other gems are correctly
pointing to the git source.
Gemfile.lock:
```
GIT
remote: https://github.com/rails/rails.git
revision: 9439f463e0ef
branch: 8-1-stable
specs:
actioncable (8.1.1)
...
actionmailbox (8.1.1)
...
actionmailer (8.1.1)
...
actiontext (8.1.1)
...
activejob (8.1.1)
...
activemodel (8.1.1)
...
activerecord (8.1.1)
...
activestorage (8.1.1)
...
rails (8.1.1)
...
railties (8.1.1)
...
GEM
remote: https://rubygems.org/
specs:
action_text-trix (2.1.15)
railties
actionpack (8.1.1) <===== incorrectly left in Rubygems source
...
```
The lockfile will contain `actionpack` in the rubygems source, but it will
be missing from the git source, so the path will be incorrect. Running
`bundle show` on Rails will point to the correct place:
```
$ bundle show rails
/Users/eileencodes/.gem/ruby/3.4.4/bundler/gems/rails-9439f463e0ef
```
but a bundle show on actionpack will be incorrect:
```
$ bundle show actionpack
/Users/eileencodes/.gem/ruby/3.4.4/gems/actionpack-8.1.1
```
This bug requires the following to reproduce:
1) A gem like Rails that contains components that are released as their
own standalone gem is added to the gemfile pointing to rubygems
2) A second gem is added that depends on one of the gems in the first
gem (like propshaft does on actionpack)
3) The Rails gem is updated to use a git source, pointing to the same
version that is being used by rubygems (ie 8.1.1)
4) `bundle` will only update the path for Rails component gems if no
other gem depends on it.
This incorrectly leaves Rails (or any gem like it) using two different
codepaths / gem source code.
https://github.com/ruby/rubygems/commit/dff76ba4f6
Add a `rustc_flags` option to configure for customizing the flags used
when compiling with rustc. Its value is appended to the existing
defaults in RUSTC_FLAGS.
Co-authored-by: Alan Wu <XrXr@users.noreply.github.com>
Previously: In #9218 a reproduction is shared where running `bundle clean` using a binstub (`bin/bundle`) results in bundler removing itself. This results in Ruby falling back to its default bundler version. This behavior seems to be present for as long as there has been a default version of bundler (Ruby 2.6+).
Now: Bundler will explicitly add its current version number to the specs to be preserved. This prevents `bundle clean` from removing the current bundler version.
close https://github.com/ruby/rubygems/pull/9218
https://github.com/ruby/rubygems/commit/e3f0167ae4
When sleeping with `sleep`, currently the main thread can get woken up from sigchld
from any thread (subprocess exited). The timer thread wakes up the main thread when this
happens, as it checks for signals. The main thread then executes the ruby sigchld handler
if one is registered and is supposed to go back to sleep immediately. This is not ideal but
it's the way it's worked for a while. In commit 8d8159e7d8 I added writes to `th->status`
before and after `wait_running_turn` in `thread_sched_to_waiting_until_wakeup`, which is
called from `sleep`. This is usually the right way to set the thread's status, but `sleep`
is an exception because the writes to `th->status` are done in `sleep_forever`. There's a
loop that checks `th->status` in `sleep_forever`. When the main thread got woken up from
sigchld it saw the changed `th->status` and continued to run the main thread instead of
going back to sleep.
The following script shows the error. It was returning instead of sleeping forever.
```ruby
t = Thread.new do
  sleep 0.3
  `echo hello` # Spawns subprocess
  puts "Subprocess exited"
end
puts "Main thread sleeping..."
result = sleep # Should block forever
puts "sleep returned: #{result.inspect}"
```
Fixes [Bug #21812]
"Code" (when used to refer to what we create in Ruby or any other programming language) is an abstract non-count noun, so it cannot be pluralized. ("Codes" would be used when referring to specific countable things like PIN codes, which is a different use of the word "code".)
This is somewhat confusing because English allows converting count nouns into non-count nouns, and converting non-count nouns into count nouns, and because many words have both forms.
For an example of converting a non-count noun to a count noun, "water" is normally a non-count noun:
> The world is covered with water.
but people who work in restaurants often use the word as a count noun, as a shorthand for "cup of water":
> I need 7 waters on the big table by the window.
For an example of the opposite conversion, "worm" is normally a count noun:
> There are lots of worms in the puddle.
but someone might use it as a non-count noun when talking about non-distinct remains of worms:
> You have worm all over the bottom of your shoe!
So although a given noun can be flexible enough to be used in either way—even when it is unconventional—there is a definite change of meaning when using a word as a count noun or a non-count noun.
* https://github.com/ruby/ruby/actions/runs/20694508956/job/59407571754
1)
UNIXSocket.pair emulates unnamed sockets with a temporary file with a path FAILED
Expected "C:\\a\\_temp\\102424668889-2384.($)".match? /\\AppData\\Local\\Temp\\\d+-\d+\.\(\$\)\z/
to be truthy but was false
This commit adds a field handle_weak_references to rb_data_type_struct for
the callback when handling weak references. This saves TypedData objects
from needing to expose their rb_data_type_struct and weak references function.
Before, passing the wrong number of arguments (e.g., 2) to
OpenSSL::PKey::EC::Group.new raised a generic "wrong number of
arguments" error.
This change updates it to show the actual argument count and the
expected options (1 or 4), making debugging easier for the user.
Example:
ArgumentError: wrong number of arguments (given 2, expected 1 or 4)
I hope it helps!
https://github.com/ruby/openssl/commit/783c99e6c7
Although the example code comments indicate that it returns `false`,
a non-matching result for `=~` is actually `nil`.
```ruby
Foo.foo.blank? #=> false
"foo".blank? #=> false
```
https://github.com/ruby/ruby/blob/v4.0.0-preview3/doc/language/box.md?plain=1#L115-L122
This PR replaces `=~` with `match?` so that it returns the expected `false`.
Since this makes the result a boolean, it also aligns with the expected behavior of
a predicate method name like `blank?`.
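The difference between the two operators can be seen in a short example. The pattern below is an assumption chosen for illustration; the exact regexp used in `box.md` is not quoted here.

```ruby
# Assumed pattern for "blank": empty or whitespace only.
blank_pattern = /\A\s*\z/

# `=~` returns the index of the match, or nil when nothing matches.
p("foo" =~ blank_pattern)        # prints nil
p("   " =~ blank_pattern)        # prints 0

# `match?` always returns true or false, which suits a predicate method.
p("foo".match?(blank_pattern))   # prints false
p("   ".match?(blank_pattern))   # prints true
```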
Based on the example, it appears that `foo.rb` and `main.rb` are expected to be in the same directory.
Since Ruby 1.9, the current directory is not included in `$LOAD_PATH` by default.
As a result, running `box.require('foo')` as shown in the sample code raises a `LoadError`:
```console
main.rb:2:in `Ruby::Box#require': cannot load such file -- foo (LoadError)
from main.rb:2:in `<main>'
```
To avoid this, it seems simplest to show either `box.require('./foo')` or `box.require_relative('foo')`.
In this PR, `box.require('foo')` is replaced with `box.require_relative('foo')` to make the intention of
using a relative path explicit.
This should reduce the chance that users trying Ruby Box will run into an unexpected error.
Anonymous memberless Structs and Data were returning `#<struct >` and
`#<data >` with a trailing space. Now they return `#<struct>` and
`#<data>` to match attrless class behavior and look a bit more compact.
Thread::Queue spends a significant amount of time in array functions,
checking for invariants we know aren't a problem, and whether the backing
array needs to be reordered.
By using a ring buffer we can remove a lot of overhead (~23% faster).
```
$ hyperfine './miniruby --yjit /tmp/q.rb' './miniruby-qrb --yjit /tmp/q.rb'
Benchmark 1: ./miniruby --yjit /tmp/q.rb
Time (mean ± σ): 1.050 s ± 0.191 s [User: 0.988 s, System: 0.004 s]
Range (min … max): 0.984 s … 1.595 s 10 runs
Benchmark 2: ./miniruby-qrb --yjit /tmp/q.rb
Time (mean ± σ): 844.2 ms ± 3.1 ms [User: 840.4 ms, System: 2.8 ms]
Range (min … max): 838.6 ms … 848.9 ms 10 runs
Summary
./miniruby-qrb --yjit /tmp/q.rb ran
1.24 ± 0.23 times faster than ./miniruby --yjit /tmp/q.rb
```
```
q = Queue.new([1, 2, 3, 4, 5, 6, 7, 8])
i = 2_000_000
while i > 0
  i -= 1
  q.push(q.pop)
  q.push(q.pop)
  q.push(q.pop)
  q.push(q.pop)
  q.push(q.pop)
  q.push(q.pop)
  q.push(q.pop)
  q.push(q.pop)
  q.push(q.pop)
  q.push(q.pop)
end
```
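To illustrate the ring-buffer idea mentioned above, here is a toy version in plain Ruby. It is only a sketch: the class name `RingBuffer` is hypothetical, it is not the C implementation from this commit, and unlike `Thread::Queue#pop` it returns `nil` on an empty buffer instead of blocking.

```ruby
# Toy ring buffer: push and pop only move indexes, so no elements are
# shifted around as they would be with Array#shift on a plain array.
class RingBuffer
  attr_reader :size

  def initialize(capacity = 8)
    @buffer = Array.new([capacity, 1].max)
    @head = 0   # index of the oldest element
    @size = 0   # number of stored elements
  end

  def push(value)
    grow if @size == @buffer.size
    @buffer[(@head + @size) % @buffer.size] = value
    @size += 1
    self
  end

  def pop
    return nil if @size.zero?
    value = @buffer[@head]
    @buffer[@head] = nil                 # drop the reference
    @head = (@head + 1) % @buffer.size   # wrap around at the end
    @size -= 1
    value
  end

  private

  # Double the capacity, copying elements in FIFO order starting at index 0.
  def grow
    ordered = Array.new(@size) { |i| @buffer[(@head + i) % @buffer.size] }
    @buffer = ordered + Array.new(@buffer.size)
    @head = 0
  end
end
```

In steady state, as in the `q.push(q.pop)` loop above, the buffer never grows and each operation is a couple of index updates.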
A plain `char` may be `signed` or `unsigned` depending on the
implementation. Also, bitwise ORing of `signed` values is not
guaranteed to be `signed`. To ensure portability, we should logical-OR
each comparison, but casting to `signed char` is usually sufficient.
https://github.com/ruby/json/commit/8ad744c532
The flags for `rb_data_type_t::flags` are public constants for
defining `rb_data_type_t`. The embedded data flag and mask are
internal implementation details.
Fixes issue pointed out in https://bugs.ruby-lang.org/issues/21084#note-7.
The following script crashes:
wmap = ObjectSpace::WeakMap.new
GC.disable # only manual GCs
GC.start
GC.start
retain = []
50.times do
  k = Object.new
  wmap[k] = true
  retain << k
end
GC.start # wmap promoted, other objects still young
retain.clear
GC.start(full_mark: false)
wmap.keys.each(&:itself) # call method on keys to cause crash
* Consistent with plain `blocks` and `for` blocks and methods
where the source_location covers their entire definition.
* Matches the documentation which mentions
"where the definition starts/ends".
* Partially reverts d357d50f0a74409446f4cccec78593373f5adf2f
which was a workaround to be compatible with parse.y.
* This reverts commit 065c48cdf11a1c4cece84db44ed8624d294f8fd5.
* This functionality is very valuable and has already taken 14 years
to agree on the API.
* Let's just document it's byte columns (in the next commit).
* See https://bugs.ruby-lang.org/issues/21783#note-9
Without this change, classes (including iclasses) are allocated
as un-boxable classes after user boxes are initialized (after script
evaluation starts). In this situation, iclasses are created as
un-boxable classes when core modules are included by a class in the
root box. This causes problems because such an iclass is in the root
box but can't have multiple classexts.
This change makes it possible to allocate boxable classes even after
initializing user boxes. Classes created in the root box will be
boxable, and those can have 2 or more classexts.
This commit adds an `expect1_opening` function that expects a token and
attaches the error to the opening token location rather than the current
position. This is useful for errors about missing closing tokens, where
we want to point to the line with the opening token rather than the end
of the file.
For example:
```ruby
def foo
def bar
def baz
^ expected an `end` to close the `def` statement
^ expected an `end` to close the `def` statement
^ expected an `end` to close the `def` statement
```
This would previously produce three identical errors at the end of the
file. After this commit, they would be reported at the opening token
location:
```ruby
def foo
^~~ expected an `end` to close the `def` statement
def bar
^~~ expected an `end` to close the `def` statement
def baz
^~~ expected an `end` to close the `def` statement
```
I considered using the end of the line where the opening token is
located, but in some cases that would be less useful than the opening
token location itself. For example:
```ruby
def foo def bar def baz
```
Here the end of the line where the opening token is located would be the
same for each of the unclosed `def` nodes.
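The idea of attaching the error to the opening token can be illustrated with a toy scanner. This is not prism's parser or its `expect1_opening` function; `unclosed_def_errors` and `OpenToken` are hypothetical names, and the scanner only recognizes bare `def` and `end` words.

```ruby
# Toy model: remember where each `def` was opened, and report unclosed
# ones at that location instead of at the end of the file.
OpenToken = Struct.new(:line, :column)

def unclosed_def_errors(source)
  open_defs = []
  source.each_line.with_index(1) do |text, line|
    text.scan(/\b(def|end)\b/) do
      match = Regexp.last_match
      if match[1] == "def"
        # Column is a 0-based character offset within the line.
        open_defs.push(OpenToken.new(line, match.begin(0)))
      else
        open_defs.pop
      end
    end
  end
  open_defs.map do |tok|
    "#{tok.line}:#{tok.column}: expected an `end` to close the `def` statement"
  end
end
```

Because each error carries the line and column of its own opening token, three `def`s on one line still produce three distinct locations.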
https://github.com/ruby/prism/commit/2d7829f060
`ALLOCA` with too large a size may result in stack overflow.
Incidentally, this suppresses the GCC false maybe-uninitialized
warning in `product_each`.
Also shrink `struct product_state` when `sizeof(int) < sizeof(VALUE)`.
Fixes the following compiler warnings:
random.c: In function `random_init`:
random.c:416:38: warning: `rng` may be used uninitialized in this function [-Wmaybe-uninitialized]
416 | unsigned int major = rng->version.major;
| ~~~~~~~~~~~~^~~~~~
random.c: In function `random_bytes`:
random.c:1284:8: warning: `rng` may be used uninitialized in this function [-Wmaybe-uninitialized]
1284 | rng->get_bytes(rnd, ptr, n);
| ~~~^~~~~~~~~~~
random.c:1299:34: note: `rng` was declared here
1299 | const rb_random_interface_t *rng;
| ^~~
random.c: In function `rand_random_number`:
random.c:1606:12: warning: `rng` may be used uninitialized in this function [-Wmaybe-uninitialized]
1606 | return rand_range(obj, rng, rnd, vmax);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
random.c:1624:34: note: `rng` was declared here
1624 | const rb_random_interface_t *rng;
| ^~~
random.c: In function `random_rand`:
random.c:1120:15: warning: `rng` may be used uninitialized in this function [-Wmaybe-uninitialized]
1120 | return rng->get_int32(rnd);
| ~~~^~~~~~~~~~~
random.c:1573:34: note: `rng` was declared here
1573 | const rb_random_interface_t *rng;
| ^~~
Currently, the root fiber of a thread does not have a corresponding Ruby
object backing it by default (it does have one when an object is required,
such as when Fiber.current is called). This is a problem for the new GC weak
references design in #12606 since Thread is not declared as having weak
references but it does hold weak references (the generic ivar cache).
This commit changes it to always allocate a Fiber object for the root
fiber.
`fast fallback` cannot be used with an explicitly specified local port,
because concurrent binds to the same `local_host:local_port`
can raise `Errno::EADDRINUSE`.
This issue is more likely to occur on hosts with `IPV6_V6ONLY` disabled,
because IPv6 binds can also occupy IPv4-mapped IPv6 address space.
Commit https://github.com/ruby/openssl/commit/1de3b80a46c2 (cipher: make output buffer String independent,
2024-12-10) ensures the output buffer String has sufficient capacity,
but the length can be shorter. The assert() is simply incorrect and
should be removed.
Also remove a similar assert() in Cipher#final. While not incorrect, it
is not useful either.
https://github.com/ruby/openssl/commit/0ce6ab97dd
The RSET_IS_MEMBER macro had a parameter named 'sobj' but the macro
body used 'set' instead, causing the first argument to be ignored.
This worked by accident because all current callers use a variable
named 'set', but would cause compilation failure if called with a
differently named variable:
error: use of undeclared identifier 'set'
Changed the parameter name from 'sobj' to 'set' to match the macro
body and be consistent with other RSET_* macros.
```
Errno::EACCES: Permission denied @ rb_file_s_rename
...
D:/a/ruby/ruby/src/lib/rubygems/util/atomic_file_writer.rb:42:in 'File.rename'
```
It may be caused by atomic_file_writer.rb
This PR updates the fallback version for `Prism::Translation::ParserCurrent` from 3.4 to 4.0.
Currently, the fallback resolves to `Parser34`, as shown below:
```console
$ ruby -v -rprism -rprism/translation/parser_current -e 'p Prism::Translation::ParserCurrent'
ruby 3.0.7p220 (2024-04-23 revision 724a071175) [x86_64-darwin23]
warning: `Prism::Translation::Current` is loading Prism::Translation::Parser34, but you are running 3.0.
Prism::Translation::Parser34
```
Following the comment "Keep this in sync with released Ruby.",
it seems like the right time to set this to Ruby 4.0, which is scheduled for release this week.
https://github.com/ruby/prism/commit/115f0a118c
This change updates `write_binary` to write the gem's files through
`AtomicFileWriter.open`, provided by a new class. This implementation
is borrowed from Active Support's [`atomic_write`](https://github.com/rails/rails/blob/main/activesupport/lib/active_support/core_ext/file/atomic.rb).
An atomic write first writes the contents to a temporary file and, once
that is complete, sets permissions and renames the file. If writing fails
(e.g. on a failed download, an error, or for some other reason), the
real file will not be created. The changes made here make `verify_gz`
obsolete: we don't need to verify the file if we have successfully created
it atomically. If it exists, it is not corrupt. If it is corrupt, the
file won't exist on disk.
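The write-then-rename pattern described above can be sketched as follows. This is an illustration under assumptions, not the `AtomicFileWriter` code added in this commit; the helper name `atomic_write` and its `permissions:` parameter are hypothetical.

```ruby
require "tempfile"

# Hypothetical helper: the target path only ever appears once the
# temporary file has been written completely.
def atomic_write(path, permissions: 0o644)
  dir = File.dirname(path)
  # Create the temp file next to the target so the rename stays on the
  # same filesystem.
  temp = Tempfile.create([".#{File.basename(path)}", ".tmp"], dir)
  begin
    temp.binmode
    yield temp
    temp.close
    File.chmod(permissions, temp.path)
    File.rename(temp.path, path)
  ensure
    # On failure the temp file is removed and the target is never created.
    temp.close unless temp.closed?
    File.unlink(temp.path) if File.exist?(temp.path)
  end
end
```

A caller would use it like `atomic_write("foo.gem") { |io| io.write(data) }`; if the block raises, no `foo.gem` is left behind.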
While writing tests for this functionality I replaced the
`RemoteFetcher` stub with `FakeFetcher` except for where we really do
need to overwrite the `RemoteFetcher`. The new test implementation is much
clearer on what it's trying to accomplish versus the prior test
implementation.
https://github.com/ruby/rubygems/commit/0cd4b54291
[Feature #21084]
# Summary
The current way of marking weak references uses `rb_gc_mark_weak(VALUE *ptr)`.
This presents challenges because Ruby's GC is incremental, meaning that if the
`ptr` changes (e.g. realloc'd or free'd), then we could have an invalid memory
access. This also overwrites `*ptr = Qundef` if `*ptr` is dead, which prevents
any cleanup from being run (e.g. freeing memory or deleting entries from hash
tables). This ticket proposes `rb_gc_declare_weak_references` which declares
that an object has weak references and calls a cleanup function after marking,
allowing the object to clean up any memory for dead objects.
# Introduction
In [[Feature #19783]](https://bugs.ruby-lang.org/issues/19783), I introduced an
API allowing objects to mark weak references, the function signature looks like
this:
```c
void rb_gc_mark_weak(VALUE *ptr);
```
`rb_gc_mark_weak` is called during the marking phase of the GC to specify that
the memory at `ptr` holds a pointer to a Ruby object that is weakly referenced.
`rb_gc_mark_weak` appends this pointer to a list that is processed after the
marking phase of the GC. If the object at `*ptr` is no longer alive, then it
overwrites the object reference with a special value (`*ptr = Qundef`).
However, this API resulted in two challenges:
1. Ruby's default GC is incremental, which means that the GC is not run in one
phase, but rather split into chunks of work that interleave with Ruby
execution. The `ptr` passed into `rb_gc_mark_weak` could be on the malloc
heap, and that memory could be realloc'd or even free'd. We had to use
workarounds such as `rb_gc_remove_weak` to ensure that there were no illegal
memory accesses. This made `rb_gc_mark_weak` difficult to use, impacted
runtime performance, and increased memory usage.
2. When an object dies, `rb_gc_mark_weak` only overwrites the reference with
`Qundef`. This means that if we want to do any cleanup (e.g. free a piece of
memory or delete a hash table entry), we could not do that and had to defer
this process elsewhere (e.g. during marking or runtime).
In this ticket, I'm proposing a new API for weak references. Instead of an
object marking its weak references during the marking phase, the object declares
that it has weak references using the `rb_gc_declare_weak_references` function.
This declaration occurs during runtime (e.g. after the object has been created)
rather than during GC.
After an object declares that it has weak references, it will have its callback
function called after marking as long as that object is alive. This callback
function can then call a special function `rb_gc_handle_weak_references_alive_p`
to determine whether its references are alive. This will allow the callback
function to do whatever it wants on the object, allowing it to perform any
cleanup work it needs.
This significantly simplifies the code for `ObjectSpace::WeakMap` and
`ObjectSpace::WeakKeyMap` because it no longer needs to have the workarounds for
the limitations of `rb_gc_mark_weak`.
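The declare-then-callback flow described above can be modeled with a toy in plain Ruby. Everything here is a hypothetical stand-in: `ToyGC`, `ToyWeakMap`, and their methods only mirror the roles of `rb_gc_declare_weak_references` and `rb_gc_handle_weak_references_alive_p`, and a list of live objects replaces real marking.

```ruby
# Toy model only: plain Ruby objects stand in for the GC and its mark bits.
class ToyGC
  def initialize
    @declared = []                       # objects holding weak references
    @marked = {}.compare_by_identity     # filled in by the "marking phase"
  end

  # Called at runtime, e.g. right after the object has been created.
  def declare_weak_references(obj)
    @declared << obj
  end

  # Used by callbacks to ask whether a weakly referenced object survived.
  def alive?(obj)
    @marked.key?(obj)
  end

  # `live_objects` stands in for whatever marking found reachable.
  def collect(live_objects)
    @marked = {}.compare_by_identity
    live_objects.each { |o| @marked[o] = true }
    # After marking, only declared objects that are still alive get
    # their callback.
    @declared.select! { |o| alive?(o) }
    @declared.each { |o| o.handle_weak_references(self) }
  end
end

class ToyWeakMap
  def initialize
    @table = {}.compare_by_identity
  end

  def []=(key, value)
    @table[key] = value
  end

  def size
    @table.size
  end

  # The callback deletes dead entries directly, rather than leaving a
  # placeholder value behind for later cleanup.
  def handle_weak_references(gc)
    @table.delete_if { |k, v| !gc.alive?(k) || !gc.alive?(v) }
  end
end
```

The point of the model is that cleanup happens inside the object's own callback, so no pointer into the object's storage has to be remembered by the collector between GC steps.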
# Performance
The performance results below demonstrate that `ObjectSpace::WeakMap#[]=` is now
about 60% faster because the implementation has been simplified and the number
of allocations has been reduced. We can see that there is not a significant
impact on the performance of `ObjectSpace::WeakMap#[]`.
Base:
```
ObjectSpace::WeakMap#[]=
4.620M (± 6.4%) i/s (216.44 ns/i) - 23.342M in 5.072149s
ObjectSpace::WeakMap#[]
30.967M (± 1.9%) i/s (32.29 ns/i) - 154.998M in 5.007157s
```
Branch:
```
ObjectSpace::WeakMap#[]=
7.336M (± 2.8%) i/s (136.31 ns/i) - 36.755M in 5.013983s
ObjectSpace::WeakMap#[]
30.902M (± 5.4%) i/s (32.36 ns/i) - 155.901M in 5.064060s
```
Code:
```
require "bundler/inline"
gemfile do
  source "https://rubygems.org"
  gem "benchmark-ips"
end
wmap = ObjectSpace::WeakMap.new
key = Object.new
val = Object.new
wmap[key] = val
Benchmark.ips do |x|
  x.report("ObjectSpace::WeakMap#[]=") do |times|
    i = 0
    while i < times
      wmap[Object.new] = Object.new
      i += 1
    end
  end
  x.report("ObjectSpace::WeakMap#[]") do |times|
    i = 0
    while i < times
      wmap[key]
      wmap[val] # does not exist
      i += 1
    end
  end
end
```
# Alternative designs
Currently, `rb_gc_declare_weak_references` is designed to be an internal-only
API. This allows us to assume the object types that call
`rb_gc_declare_weak_references`. In the future, if we want to open up this API
to third parties, we may want to change this function to something like:
```c
void rb_gc_add_cleaner(VALUE obj, void (*callback)(VALUE obj));
```
This will allow the third party to implement a custom `callback` that gets
called after the marking phase of GC to clean up any dead references. I chose
not to implement this design because it is less efficient as we would need to
store a mapping from `obj` to `callback`, which requires extra memory.
* 3.3.2 to [v3.3.3][csv-v3.3.3], [v3.3.4][csv-v3.3.4], [v3.3.5][csv-v3.3.5]
* repl_type_completor 0.1.12
* mutex_m 0.3.0
* resolv-replace 0.2.0
* rdoc 7.1.0
### RubyGems and Bundler
Ruby 4.0 bundles RubyGems and Bundler version 4. See the following links for details.
* [Upgrading to RubyGems/Bundler 4 - RubyGems Blog](https://blog.rubygems.org/2025/12/03/upgrade-to-rubygems-bundler-4.html)
* [4.0.0 Released - RubyGems Blog](https://blog.rubygems.org/2025/12/03/4.0.0-released.html)
* [4.0.1 Released - RubyGems Blog](https://blog.rubygems.org/2025/12/09/4.0.1-released.html)
* [4.0.2 Released - RubyGems Blog](https://blog.rubygems.org/2025/12/17/4.0.2-released.html)
* [4.0.3 Released - RubyGems Blog](https://blog.rubygems.org/2025/12/23/4.0.3-released.html)
## Supported platforms
* Windows
* Dropped support for MSVC versions older than 14.0 (_MSC_VER 1900).
This means Visual Studio 2015 or later is now required.
## Compatibility issues
* The following methods were removed from Ractor due to the addition of `Ractor::Port`:
* `Ractor.yield`
* `Ractor#take`
* `Ractor#close_incoming`
* `Ractor#close_outgoing`
[[Feature #21262]]
* `ObjectSpace._id2ref` is deprecated. [[Feature #15408]]
* `Process::Status#&` and `Process::Status#>>` have been removed.
They were deprecated in Ruby 3.3. [[Bug #19868]]
* `rb_path_check` has been removed. This function was used for
`$SAFE` path checking which was removed in Ruby 2.7,
and was already deprecated.
[[Feature #20971]]
* A backtrace for `ArgumentError` of "wrong number of arguments" now
includes the receiver's class or module name (e.g., in `Foo#bar`
instead of in `bar`). [[Bug #21698]]
* Backtraces no longer display `internal` frames.
These methods now appear as if they are in the Ruby source file,
consistent with other C-implemented methods. [[Bug #20968]]
Before:
```
$ ruby -e '[1].fetch_values(42)'
<internal:array>:211:in 'Array#fetch': index 42 outside of array bounds: -1...1 (IndexError)
from <internal:array>:211:in 'block in Array#fetch_values'
from <internal:array>:211:in 'Array#map!'
from <internal:array>:211:in 'Array#fetch_values'
from -e:1:in '<main>'
```
After:
```
$ ruby -e '[1].fetch_values(42)'
-e:1:in 'Array#fetch_values': index 42 outside of array bounds: -1...1 (IndexError)
from -e:1:in '<main>'
```
## Stdlib compatibility issues
* CGI library is removed from the default gems. Now we only provide `cgi/escape` for
the following methods:
* `CGI.escape` and `CGI.unescape`
* `CGI.escapeHTML` and `CGI.unescapeHTML`
* `CGI.escapeURIComponent` and `CGI.unescapeURIComponent`
* `CGI.escapeElement` and `CGI.unescapeElement`
[[Feature #21258]]
* With the move of `Set` from stdlib to core class, `set/sorted_set.rb` has
been removed, and `SortedSet` is no longer an autoloaded constant. Please
install the `sorted_set` gem and `require 'sorted_set'` to use `SortedSet`.
[[Feature #21287]]
* Net::HTTP
* The default behavior of automatically setting the `Content-Type` header
to `application/x-www-form-urlencoded` for requests with a body
(e.g., `POST`, `PUT`) when the header was not explicitly set has been
removed. If your application relied on this automatic default, your
requests will now be sent without a Content-Type header, potentially
breaking compatibility with certain servers.
[[GH-net-http #205]]
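As a rough illustration of adapting to two of the stdlib changes above, the sketch below requires only `cgi/escape` for escaping and sets the `Content-Type` header explicitly on a request. The URL and form fields are placeholders, and no request is actually sent.

```ruby
require "cgi/escape"
require "net/http"
require "uri"

# The escaping helpers remain available through cgi/escape.
escaped = CGI.escapeHTML("<b>Ruby & Rails</b>")

# Set Content-Type explicitly instead of relying on the removed default.
uri = URI("https://example.com/submit")   # placeholder URL
request = Net::HTTP::Post.new(uri)
request["Content-Type"] = "application/x-www-form-urlencoded"
request.body = URI.encode_www_form(name: "ruby", version: "4.0")
```

Setting the header yourself behaves the same on older Ruby versions, so this form is safe to adopt before upgrading.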
## C API updates
* IO
* `rb_thread_fd_close` is deprecated and now a no-op. If you need to expose
file descriptors from C extensions to Ruby code, create an `IO` instance
using `RUBY_IO_MODE_EXTERNAL` and use `rb_io_close(io)` to close it (this
also interrupts and waits for all pending operations on the `IO`
instance). Directly closing file descriptors does not interrupt pending
operations, and may lead to undefined behaviour. In other words, if two
`IO` objects share the same file descriptor, closing one does not affect
the other. [[Feature #18455]]
* GVL
* `rb_thread_call_with_gvl` now works with or without the GVL.
This allows gems to avoid checking `ruby_thread_has_gvl_p`.
Please still be diligent about the GVL. [[Feature #20750]]
* Set
* A C API for `Set` has been added. The following methods are supported:
[[Feature #21459]]
* `rb_set_foreach`
* `rb_set_new`
* `rb_set_new_capa`
* `rb_set_lookup`
* `rb_set_add`
* `rb_set_clear`
* `rb_set_delete`
* `rb_set_size`
## Implementation improvements
* `Class#new` (ex. `Object.new`) is faster in all cases, but especially when passing keyword arguments. This has also been integrated into YJIT and ZJIT. [[Feature #21254]]
* GC heaps of different size pools now grow independently, reducing memory usage when only some pools contain long-lived objects
* GC sweeping is faster on pages of large objects
* "Generic ivar" objects (String, Array, `TypedData`, etc.) now use a new internal "fields" object for faster instance variable access
* The GC avoids maintaining an internal `id2ref` table until it is first used, making `object_id` allocation and GC sweeping faster
* `object_id` and `hash` are faster on Class and Module objects
* Larger bignum Integers can remain embedded using variable width allocation
`StringScanner`, and some internal objects are now write-barrier protected,
which reduces GC overhead.
### Ractor
A lot of work has gone into making Ractors more stable, performant, and usable. These improvements bring Ractor implementation closer to leaving experimental status.
* Performance improvements
* Frozen strings and the symbol table internally use a lock-free hash set [[Feature #21268]]
* Method cache lookups avoid locking in most cases
* Class (and generic ivar) instance variable access is faster and avoids locking
* CPU cache contention is avoided in object allocation by using a per-ractor counter
* CPU cache contention is avoided in xmalloc/xfree by using a thread-local counter
* `object_id` avoids locking in most cases
* Bug fixes and stability
* Fixed possible deadlocks when combining Ractors and Threads
* Fixed issues with require and autoload in a Ractor
* Fixed encoding/transcoding issues across Ractors
* Fixed race conditions in GC operations and method invalidation
* Fixed issues with processes forking after starting a Ractor
* GC allocation counts are now accurate under Ractors
* Fixed TracePoints not working after GC [[Bug #19112]]
## JIT
* ZJIT
* Introduce an [experimental method-based JIT compiler](https://docs.ruby-lang.org/en/master/jit/zjit_md.html).
Where available, ZJIT can be enabled at runtime with the `--zjit` option or by calling `RubyVM::ZJIT.enable`.
When building Ruby, Rust 1.85.0 or later is required to include ZJIT support.
* As of Ruby 4.0.0, ZJIT is faster than the interpreter, but not yet as fast as YJIT.
We encourage experimentation with ZJIT, but advise against deploying it in production for now.
* Our goal is to make ZJIT faster than YJIT and production-ready in Ruby 4.1.
* YJIT
* `RubyVM::YJIT.runtime_stats`
* `ratio_in_yjit` no longer works in the default build.
Use `--enable-yjit=stats` on `configure` to enable it on `--yjit-stats`.
* Add `invalidate_everything` to default stats, which is
incremented when all code is invalidated by TracePoint.
* Add `mem_size:` and `call_threshold:` options to `RubyVM::YJIT.enable`.
* RJIT
* `--rjit` is removed. We will move the implementation of the third-party JIT API
to the [ruby/rjit](https://github.com/ruby/rjit) repository.
* 3.3.2 to [v3.3.3][csv-v3.3.3], [v3.3.4][csv-v3.3.4], [v3.3.5][csv-v3.3.5]
* repl_type_completor 0.1.12
### RubyGems and Bundler
Ruby 4.0 bundled RubyGems and Bundler version 4. see the following links for details.
* [Upgrading to RubyGems/Bundler 4 - RubyGems Blog](https://blog.rubygems.org/2025/12/03/upgrade-to-rubygems-bundler-4.html)
* [4.0.0 Released - RubyGems Blog](https://blog.rubygems.org/2025/12/03/4.0.0-released.html)
* [4.0.1 Released - RubyGems Blog](https://blog.rubygems.org/2025/12/09/4.0.1-released.html)
* [4.0.2 Released - RubyGems Blog](https://blog.rubygems.org/2025/12/17/4.0.2-released.html)
* [4.0.3 Released - RubyGems Blog](https://blog.rubygems.org/2025/12/23/4.0.3-released.html)
## Supported platforms
* Windows
* Dropped support for MSVC versions older than 14.0 (_MSC_VER 1900).
This means Visual Studio 2015 or later is now required.
## Compatibility issues
* The following methods were removed from Ractor due to the addition of `Ractor::Port`:
* `Ractor.yield`
* `Ractor#take`
* `Ractor#close_incoming`
* `Ractor#close_outgoing`
[[Feature #21262]]
* `ObjectSpace._id2ref` is deprecated. [[Feature #15408]]
* `Process::Status#&` and `Process::Status#>>` have been removed.
They were deprecated in Ruby 3.3. [[Bug #19868]]
* `rb_path_check` has been removed. This function was used for
`$SAFE` path checking which was removed in Ruby 2.7,
and was already deprecated.
[[Feature #20971]]
* A backtrace for `ArgumentError` of "wrong number of arguments" now
include the receiver's class or module name (e.g., in `Foo#bar`
instead of in `bar`). [[Bug #21698]]
* Backtraces no longer display `internal` frames.
These methods now appear as if it is in the Ruby source file,
consistent with other C-implemented methods. [[Bug #20968]]
Before:
```
ruby -e '[1].fetch_values(42)'
<internal:array>:211:in 'Array#fetch': index 42 outside of array bounds: -1...1 (IndexError)
from <internal:array>:211:in 'block in Array#fetch_values'
from <internal:array>:211:in 'Array#map!'
from <internal:array>:211:in 'Array#fetch_values'
from -e:1:in '<main>'
```
After:
```
$ ruby -e '[1].fetch_values(42)'
-e:1:in 'Array#fetch_values': index 42 outside of array bounds: -1...1 (IndexError)
from -e:1:in '<main>'
```
## Stdlib compatibility issues
* CGI library is removed from the default gems. Now we only provide `cgi/escape` for
the following methods:
* `CGI.escape` and `CGI.unescape`
* `CGI.escapeHTML` and `CGI.unescapeHTML`
* `CGI.escapeURIComponent` and `CGI.unescapeURIComponent`
* `CGI.escapeElement` and `CGI.unescapeElement`
[[Feature #21258]]
* With the move of `Set` from stdlib to core class, `set/sorted_set.rb` has
been removed, and `SortedSet` is no longer an autoloaded constant. Please
install the `sorted_set` gem and `require 'sorted_set'` to use `SortedSet`.
[[Feature #21287]]
* Net::HTTP
* The default behavior of automatically setting the `Content-Type` header
to `application/x-www-form-urlencoded` for requests with a body
(e.g., `POST`, `PUT`) when the header was not explicitly set has been
removed. If your application relied on this automatic default, your
requests will now be sent without a Content-Type header, potentially
breaking compatibility with certain servers.
[[GH-net-http #205]]
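As a rough sketch of adapting to the CGI and Net::HTTP changes above: load only `cgi/escape` for the escaping helpers, and set `Content-Type` explicitly on requests that carry a form body. The helper name `build_form_post`, the `example.com` URL, and the form fields are assumptions made for illustration. No request is actually sent.

```ruby
require "cgi/escape"
require "net/http"
require "uri"

# The escaping helpers remain available after requiring only cgi/escape.
puts CGI.escape("a b&c=d")
puts CGI.escapeHTML("<b>Tom & Jerry</b>")

# Hypothetical helper: build a POST request with an explicit Content-Type,
# rather than relying on the removed automatic default.
def build_form_post(uri, fields)
  request = Net::HTTP::Post.new(uri)
  request.body = URI.encode_www_form(fields)
  request["Content-Type"] = "application/x-www-form-urlencoded"
  request
end

# Placeholder URL and fields, used only to show the resulting header.
request = build_form_post(URI("https://example.com/submit"), name: "Alice", lang: "ruby")
puts request["Content-Type"]
```

Setting the header explicitly keeps the request the same whether or not the running Ruby still applies the old default.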
## C API updates
* IO
* `rb_thread_fd_close` is deprecated and now a no-op. If you need to expose
file descriptors from C extensions to Ruby code, create an `IO` instance
using `RUBY_IO_MODE_EXTERNAL` and use `rb_io_close(io)` to close it (this
also interrupts and waits for all pending operations on the `IO`
instance). Directly closing file descriptors does not interrupt pending
operations, and may lead to undefined behaviour. In other words, if two
`IO` objects share the same file descriptor, closing one does not affect
the other. [[Feature #18455]]
* GVL
* `rb_thread_call_with_gvl` now works with or without the GVL.
This allows gems to avoid checking `ruby_thread_has_gvl_p`.
Please still be diligent about the GVL. [[Feature #20750]]
* Set
* A C API for `Set` has been added. The following methods are supported:
[[Feature #21459]]
* `rb_set_foreach`
* `rb_set_new`
* `rb_set_new_capa`
* `rb_set_lookup`
* `rb_set_add`
* `rb_set_clear`
* `rb_set_delete`
* `rb_set_size`
## Implementation improvements
* `Class#new` (e.g., `Object.new`) is faster in all cases, but especially when passing keyword arguments. This has also been integrated into YJIT and ZJIT. [[Feature #21254]]
* GC heaps of different size pools now grow independently, reducing memory usage when only some pools contain long-lived objects
* GC sweeping is faster on pages of large objects
* "Generic ivar" objects (String, Array, `TypedData`, etc.) now use a new internal "fields" object for faster instance variable access
* The GC avoids maintaining an internal `id2ref` table until it is first used, making `object_id` allocation and GC sweeping faster
* `object_id` and `hash` are faster on Class and Module objects
* Larger bignum Integers can remain embedded using variable width allocation
`StringScanner`, and some internal objects are now write-barrier protected,
which reduces GC overhead.
### Ractor
A lot of work has gone into making Ractors more stable, performant, and usable. These improvements bring the Ractor implementation closer to leaving experimental status.
* Performance improvements
* Frozen strings and the symbol table internally use a lock-free hash set [[Feature #21268]]
* Method cache lookups avoid locking in most cases
* Class (and generic ivar) instance variable access is faster and avoids locking
* CPU cache contention is avoided in object allocation by using a per-ractor counter
* CPU cache contention is avoided in xmalloc/xfree by using a thread-local counter
* `object_id` avoids locking in most cases
* Bug fixes and stability
* Fixed possible deadlocks when combining Ractors and Threads
* Fixed issues with require and autoload in a Ractor
* Fixed encoding/transcoding issues across Ractors
* Fixed race conditions in GC operations and method invalidation
* Fixed issues with processes forking after starting a Ractor
* GC allocation counts are now accurate under Ractors
* Fixed TracePoints not working after GC [[Bug #19112]]
## JIT
* ZJIT
* Introduce an [experimental method-based JIT compiler](https://docs.ruby-lang.org/en/master/jit/zjit_md.html).
Where available, ZJIT can be enabled at runtime with the `--zjit` option or by calling `RubyVM::ZJIT.enable`.
When building Ruby, Rust 1.85.0 or later is required to include ZJIT support.
* As of Ruby 4.0.0, ZJIT is faster than the interpreter, but not yet as fast as YJIT.
We encourage experimentation with ZJIT, but advise against deploying it in production for now.
* Our goal is to make ZJIT faster than YJIT and production-ready in Ruby 4.1.
* YJIT
* `RubyVM::YJIT.runtime_stats`
* `ratio_in_yjit` no longer works in the default build.
Use `--enable-yjit=stats` on `configure` to enable it on `--yjit-stats`.
* Add `invalidate_everything` to default stats, which is
incremented when all code is invalidated by TracePoint.
* Add `mem_size:` and `call_threshold:` options to `RubyVM::YJIT.enable`.
* RJIT
* `--rjit` is removed. We will move the implementation of the third-party JIT API
to the [ruby/rjit](https://github.com/ruby/rjit) repository.
- `MakeMakefile`: A module used to generate a Makefile for C extensions
- `RbConfig`: Information about your Ruby configuration and build
- `Gem`: A package management framework for Ruby
- `Pathname`: Representation of the name of a file or directory on the filesystem. Pathname is a core class, but only methods that depend on other libraries are provided as a library.
## Extensions
@ -58,7 +59,6 @@ of each.
- Time ([GitHub][time]): Extends the Time class with methods for parsing and conversion
- Timeout ([GitHub][timeout]): Auto-terminate potentially long-running operations in Ruby
- TmpDir ([GitHub][tmpdir]): Extends the Dir class to manage the OS temporary file path
- TSort ([GitHub][tsort]): Topological sorting using Tarjan's algorithm
- UN ([GitHub][un]): Utilities to replace common UNIX commands
- URI ([GitHub][uri]): A Ruby module providing support for Uniform Resource Identifiers
- YAML ([GitHub][yaml]): The Ruby client library for the Psych YAML implementation
@ -75,7 +75,6 @@ of each.
- IO#wait ([GitHub][io-wait]): Provides the feature for waiting until IO is readable or writable without blocking.
- JSON ([GitHub][json]): Implements JavaScript Object Notation for Ruby
- OpenSSL ([GitHub][openssl]): Provides SSL, TLS, and general-purpose cryptography for Ruby
- Pathname ([GitHub][pathname]): Representation of the name of a file or directory on the filesystem
- Psych ([GitHub][psych]): A YAML parser and emitter for Ruby
- StringIO ([GitHub][stringio]): Pseudo-I/O on String objects
- StringScanner ([GitHub][strscan]): Provides lexical scanning operations on a String
@ -126,6 +125,7 @@ of each.
- [reline][reline-doc] ([GitHub][reline]): GNU Readline and Editline in a pure Ruby implementation
- [readline]: Wrapper for the Readline extension and Reline
- [fiddle]: A libffi wrapper for Ruby
- [tsort]: Topological sorting using Tarjan's algorithm