We want to use [linear scan register allocation](https://bernsteinbear.com/blog/linear-scan/), but a prerequisite is having a CFG available. Previously LIR only had a linear block of instructions, this PR introduces a CFG to the LIR backend. I've done my best to ensure that the "hot path" machine code we generate is the same (as I was testing I noticed that side exit machine code was being dumped in a different order).
This PR doesn't make any changes to the existing register allocator, it simply introduces a CFG to LIR. The basic blocks in the LIR CFG always start with a label (the first instruction is a label) and the last 0, 1, or 2 instructions will be jump instructions. No other jump instructions should appear mid-block.
The reason this logic for different methods branches in the class instead of internally was to be eagerly aggressive about runtime performance. This code is currently only used once for the document where it's invoked ~N times (where N is number of lines):
```ruby
module SyntaxSuggest
class CleanDocument
# ...
def join_trailing_slash!
trailing_groups = @document.select(&:trailing_slash?).map do |code_line|
take_while_including(code_line.index..) { |x| x.trailing_slash? }
end
join_groups(trailing_groups)
self
end
```
Since this is not currently a hot-spot I think merging the branches and using a case statement is a reasonable tradeoff and avoids the need to do specific version testing.
An alternative idea was presented in #241 of behavior-based testing for branch logic (which I would prefer), however, calling the code triggered requiring a `DelegateClass` when the `syntax_suggest/api` is being required.
https://github.com/ruby/syntax_suggest/commit/ab122c455f
The actual algorithm is largely unchanged, just allowed to use
singlebyte checks for common encodings.
It could certainly be optimized much further, as here again it often
scans from the front of the string when we're interested in the back of
it. But the algorithm as many Windows only corner cases so I'd rather
ship a good improvement now and eventually come back to it later.
Most of improvement here is from the reduced setup cost (avodi double
null checks, avoid duping the argument, etc), and skipping the multi-byte
checks.
```
compare-ruby: ruby 4.1.0dev (2026-01-19T03:51:30Z master 631bf19b37) +PRISM [arm64-darwin25]
built-ruby: ruby 4.1.0dev (2026-01-21T08:21:05Z opt-basename 7eb11745b2) +PRISM [arm64-darwin25]
```
| |compare-ruby|built-ruby|
|:----------|-----------:|---------:|
|long | 3.412M| 18.158M|
| | -| 5.32x|
|long_name | 1.981M| 8.580M|
| | -| 4.33x|
|withext | 3.200M| 12.986M|
| | -| 4.06x|
There is no splitting for these so let's add a assert to try and catch
misuse. VRegs are not necessarily registers in the end, so this is best
effort. In those situations they'll get a less proximate panic message.
Since we automatically preserve registers across calls, it's never
necessary to manually and imprecisely do it with `C{Push,Pop}All`.
Delete them to remove the maintenance burden and reduce confusion.
You're supposed to return the first argument.
```rb
# Before
[[:stmts_new], [:rescue_mod, nil, nil], [:stmts_add, nil, nil], [:program, nil]]
# After
[[:stmts_new], [:rescue_mod, "1", "2"], [:stmts_add, nil, "1"], [:program, nil]]
```
The correct result would be:
`[[:rescue_mod, "1", "2"], [:stmts_new], [:stmts_add, nil, "1"], [:program, nil]]`
But the order depends on the prism AST so it seems very difficult to match.
https://github.com/ruby/prism/commit/94e0107729
Resolves https://github.com/Shopify/ruby/issues/880
Implemented this by using the code generation for `GuardType` as a reference.
Not sure if this is the best way to go about it, but it seems to work.
As `TARGET_SO_DIR_TIMESTAMP` contains `ruby_version`, after bumping
`RUBY_ABI_VERSION` it should not be existing. Usually such outdated
files will be removed by `make outdate-bundled-gems` automatically
invoked by `make up`.
* Handle line continuations.
* Handle space at the end of file in LexCompat.
https://github.com/ruby/prism/commit/32bd13eb7d
Co-authored-by: Earlopain <14981592+Earlopain@users.noreply.github.com>
Similar optimizations to the ones performed in GH-15907.
- Skip the expensive multi-byte encoding handling for the common
encodings that are known to be safe.
- Use `CheckPath` to save on copying the argument and only scan it for
NULL bytes once.
- Create the return string with rb_enc_str_new instead of rb_str_subseq
as it's going to be a very small string anyway.
This could be optimized a little bit further by searching for both `.` and `dirsep`
in one pass,
```
compare-ruby: ruby 4.1.0dev (2026-01-19T03:51:30Z master 631bf19b37) +PRISM [arm64-darwin25]
built-ruby: ruby 4.1.0dev (2026-01-20T07:33:42Z master 6fb50434e3) +PRISM [arm64-darwin25]
```
| |compare-ruby|built-ruby|
|:----------|-----------:|---------:|
|long | 3.606M| 22.229M|
| | -| 6.17x|
|long_name | 2.254M| 13.416M|
| | -| 5.95x|
|short | 16.488M| 29.969M|
| | -| 1.82x|
`strrdirsep` quite innficiently search for the last separator from the front
of the string.
This is surprising but necessary because in Shift-JS, `0x5c` can
be the second byte of some multi-byte characters, as such it's
not possible to do a pure ASCII search. And it's even more costly
because for each character we need to do expensive checks to
handle this possibility.
However in the overwhelming majority of cases, paths are encoded
in UTF-8 or ASCII, so for these common encodings we can use the
more logical and efficient algorithm.
```
compare-ruby: ruby 4.1.0dev (2026-01-17T14:40:03Z master 00a3b71eaf) +PRISM [arm64-darwin25]
built-ruby: ruby 4.1.0dev (2026-01-19T07:43:57Z file-dirname-lower.. a8d3535e5b) +PRISM [arm64-darwin25]
```
| |compare-ruby|built-ruby|
|:------|-----------:|---------:|
|long | 3.974M| 23.674M|
| | -| 5.96x|
|short | 15.281M| 29.034M|
| | -| 1.90x|
- `str_null_check` was performed twice, once by `FilePathStringValue`
and a second time by `StringValueCStr`.
- `StringValueCStr` was checking for the terminator presence, but we
don't care about that.
- `FilePathStringValue` calls `rb_str_new_frozen` to ensure `fname`
isn't mutated, but that's costly for such a check. Instead we
can do it in debug mode only.
- `rb_enc_get` is slow because it accepts arbitrary objects, even immediates,
so it has to do numerous type checks. Add a much faster `rb_str_enc_get`
when we know we're dealing with a string.
- `rb_enc_copy` is slow for the same reasons, since we already have the
encoding, we can use `rb_enc_str_new` instead.
Previously, there were a lot of nops after conditional branches. They
come from branch to LIR labels:
./miniruby --zjit-call-threshold=1 --zjit-dump-disasm -e 'Object || String'
# Insn: v14 CheckInterrupts
# RUBY_VM_CHECK_INTS(ec)
ldur w2, [x20, #0x20]
tst w2, w2
b.ne #0x120900278
nop
nop
nop
nop
nop
# Insn: v15 Test v11
tst x0, #-5
mov x2, #0
mov x3, #1
csel x2, x2, x3, eq
# Insn: v16 IfTrue v15, bb3(v6, v11)
tst x2, x2
b.eq #0x120900198
nop
nop
nop
nop
nop
They gunk up the disassembly and can't be helpful for speed. This commit
removes them. I think they were accidentally inherited from certain YJIT
branches that require padding for patching. ZJIT doesn't have these
requirements.
Use a single branch instruction for conditional branches to labels; Jmp
already uses a single `B` instruction. This will work for assemblers
that generate less than ~260,000 instructions -- plenty.
Let the CodeBlock::label_ref() callback return a failure, so we can
fail compilation instead of panicking in case we do get large offsets.
Related to [Bug #21842].
* rb_interned_str: document what decides whether the returned string is
in US-ASCII or BINARY encoding.
* rb_interned_str_cstr: include the same description as rb_interned_str
for the encoding. This one was still missing the update for US-ASCII
and erroneously said the returned string was alwasy in BINARY encoding
* rb_str_to_interned_str: document how the encoding of the result is
defined.
Co-authored-by: Herwin <herwinw@users.noreply.github.com>