- `str_null_check` was performed twice, once by `FilePathStringValue`
and a second time by `StringValueCStr`.
- `StringValueCStr` was checking for the terminator presence, but we
don't care about that.
- `FilePathStringValue` calls `rb_str_new_frozen` to ensure `fname`
isn't mutated, but that's costly for such a check. Instead we
can do it in debug mode only.
- `rb_enc_get` is slow because it accepts arbitrary objects, even immediates,
so it has to do numerous type checks. Add a much faster `rb_str_enc_get`
when we know we're dealing with a string.
- `rb_enc_copy` is slow for the same reasons, since we already have the
encoding, we can use `rb_enc_str_new` instead.
Previously, there were a lot of nops after conditional branches. They
come from branch to LIR labels:
./miniruby --zjit-call-threshold=1 --zjit-dump-disasm -e 'Object || String'
# Insn: v14 CheckInterrupts
# RUBY_VM_CHECK_INTS(ec)
ldur w2, [x20, #0x20]
tst w2, w2
b.ne #0x120900278
nop
nop
nop
nop
nop
# Insn: v15 Test v11
tst x0, #-5
mov x2, #0
mov x3, #1
csel x2, x2, x3, eq
# Insn: v16 IfTrue v15, bb3(v6, v11)
tst x2, x2
b.eq #0x120900198
nop
nop
nop
nop
nop
They gunk up the disassembly and can't be helpful for speed. This commit
removes them. I think they were accidentally inherited from certain YJIT
branches that require padding for patching. ZJIT doesn't have these
requirements.
Use a single branch instruction for conditional branches to labels; Jmp
already uses a single `B` instruction. This will work for assemblers
that generate less than ~260,000 instructions -- plenty.
Let the CodeBlock::label_ref() callback return a failure, so we can
fail compilation instead of panicking in case we do get large offsets.
Related to [Bug #21842].
* rb_interned_str: document what decides whether the returned string is
in US-ASCII or BINARY encoding.
* rb_interned_str_cstr: include the same description as rb_interned_str
for the encoding. This one was still missing the update for US-ASCII
and erroneously said the returned string was alwasy in BINARY encoding
* rb_str_to_interned_str: document how the encoding of the result is
defined.
Co-authored-by: Herwin <herwinw@users.noreply.github.com>
`chompdirsep` searches from the start of the string each time, which
perhaps is necessary for certain encodings (not even sure?) but for
the common encodings it's very wasteful. Instead we can start from the
back of the string and only compare one or two characters in most cases.
Also replace `StringValueCStr` for the simpler `rb_str_null_check`
as we only care about whether the string contains `NULL` bytes, we
don't care whether it is NULL terminated or not.
We also only check the final string for NULLs.
```
compare-ruby: ruby 4.1.0dev (2026-01-17T14:40:03Z master 00a3b71eaf) +PRISM [arm64-darwin25]
built-ruby: ruby 4.1.0dev (2026-01-18T12:55:15Z spedup-file-join 5948e92e03) +PRISM [arm64-darwin25]
warming up....
| |compare-ruby|built-ruby|
|:-------------|-----------:|---------:|
|two_strings | 2.477M| 19.317M|
| | -| 7.80x|
|many_strings | 547.577k| 10.298M|
| | -| 18.81x|
|array | 515.280k| 523.291k|
| | -| 1.02x|
|mixed | 621.840k| 635.422k|
| | -| 1.02x|
```
`File.join` is a hotspot for common libraries such as Zeitwerk
and Bootsnap. It has a fairly flexible signature, but 99% of
the time it's called with just two (or a small number of) UTF-8 strings.
If we optimistically optimize for that use case we can cut down a large
number of type and encoding checks, significantly speeding up the method.
The one remaining expensive check we could try to optimize is `str_null_check`.
Given it's common to use the same base string for joining, we could memoize it.
Also we could precompute it for literal strings.
```
compare-ruby: ruby 4.1.0dev (2026-01-17T14:40:03Z master 00a3b71eaf) +PRISM [arm64-darwin25]
built-ruby: ruby 4.1.0dev (2026-01-18T12:10:38Z spedup-file-join 069bab58d4) +PRISM [arm64-darwin25]
warming up....
| |compare-ruby|built-ruby|
|:-------------|-----------:|---------:|
|two_strings | 2.475M| 9.444M|
| | -| 3.82x|
|many_strings | 551.975k| 2.346M|
| | -| 4.25x|
|array | 514.946k| 522.034k|
| | -| 1.01x|
|mixed | 621.236k| 633.189k|
| | -| 1.02x|
```
InvokeProc and HIR effects landed without an intermediate rebase so we
got a conflict in the form of a type checker error (not handled new
opcode in a new function).
**Progress**
I've added a new directory, `zjit/src/hir_effect`. It follows the same structure as `zjit/src/hir_type` and includes:
- a ruby script to generate a rust file containing a bitset of effects we want to track
- a modified `hir.rs` to include an `effects_of` function that catalogs effects for each HIR instruction, similar to `infer_type`. Right now these effects are not specialized, all instructions currently return the top of the lattice (any effect)
- a module file for effects at `zjit/src/hir_effect/mod.rs` that again, mirrors `zjit/src/hir_type/mod.rs`. This contains a lot of helper functions and lattice operations like union and intersection
**Design Idea**
The effect system is bitset-based rather than range-based. This is the first kind of effect system described in [Max's blog post](https://bernsteinbear.com/blog/compiler-effects/).
Practically, having effects defined for each HIR instruction should allow us to have better generalization than the implicit effect system we have for c functions that we annotation as elidable, leaf, etc. Additionally, this could allow us to reason about the effects of multiple HIR instructions unioned together, something I don't believe currently exists.
**Practical Goals**
This PR replaces `has_effects` with a new effects-based `is_elidable` function. This has no behavior change to the JIT, but will make it easier to reason about effects of basic blocks and CCalls with the new design. We may be able to accomplish other quality of life improvements, such as consolidation of `nogc`, `leaf`, and other annotations.