558 Commits

Author SHA1 Message Date
Kevin Menard
4a21b83693
ZJIT: Optimize common invokesuper cases (#15816)
* ZJIT: Profile `invokesuper` instructions

* ZJIT: Introduce the `InvokeSuperDirect` HIR instruction

The new instruction is an optimized version of `InvokeSuper` when we know the `super` target is an ISEQ.

* ZJIT: Expand definition of unspecializable to more complex cases

* ZJIT: Ensure `invokesuper` optimization works when the inheritance hierarchy is modified

* ZJIT: Simplify `invokesuper` specialization to most common case

Looking at ruby-bench, most `super` calls don't pass a block, which means we can use the already optimized `SendWithoutBlockDirect`.

* ZJIT: Track `super` method entries directly to avoid GC issues

Because the method entry isn't typed as a `VALUE`, we set up barriers on its `VALUE` fields. That was insufficient, however, because the method entry itself could be collected in certain cases, resulting in dangling objects. Now we track the method entry as a `VALUE` and can more naturally mark it and its children.

* ZJIT: Optimize `super` calls with simple argument forms

* ZJIT: Report the reason why we can't optimize an `invokesuper` instance

* ZJIT: Revise send fallback reasons for `super` calls

* ZJIT: Assert `super` calls are `FCALL` and don't need visibility checks
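
As a rough illustration of the call shapes these bullets talk about, the Ruby sketch below contrasts a `super` call with simple positional arguments and no block against one that forwards a block explicitly. The `Base` and `Padded` classes are hypothetical, and which forms ZJIT actually specializes is decided by the compiler, not by this example.

```ruby
# Hypothetical classes, used only to illustrate `super` call shapes.
class Base
  def area(width, height)
    width * height
  end

  def each_side(width, height)
    yield width
    yield height
  end
end

class Padded < Base
  # `super` with simple positional arguments and no block: the common
  # case the commit message says it focuses on.
  def area(width, height)
    super(width + 2, height + 2)
  end

  # `super` that forwards a block explicitly: a more complex form that
  # the commit message does not claim to specialize.
  def each_side(width, height, &blk)
    super(width + 2, height + 2, &blk)
  end
end

padded = Padded.new
padded_area = padded.area(3, 4)

sides = []
padded.each_side(3, 4) { |side| sides << side }
```

Both calls behave identically with or without a JIT; the difference is only in how much the compiler has to handle at the `super` site.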
2026-01-14 19:10:06 -05:00
Max Bernstein
0eb53053f0
ZJIT: Specialize setinstancevariable when ivar is already in shape (#15290)
Don't support shape transitions for now.
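
A minimal Ruby sketch of the distinction, assuming CRuby's object shapes, where an object's shape changes when it gains a new instance variable. Writing to an ivar the object already has leaves the set of ivars unchanged, while the first write to a new ivar is a shape transition. The `Counter` class is hypothetical.

```ruby
# Hypothetical class, used only to contrast the two kinds of ivar write.
class Counter
  def initialize
    @count = 0        # first write: adds @count to the object
  end

  def increment
    @count += 1       # @count already exists, so no ivar is added here
  end

  def label=(text)
    @label = text     # the first call adds @label: a shape transition,
                      # which the commit message says is not supported yet
  end
end

c = Counter.new
ivars_before = c.instance_variables
c.increment
ivars_after_increment = c.instance_variables
c.label = "hits"
ivars_after_label = c.instance_variables
```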
2025-11-25 18:50:55 +00:00
Max Bernstein
f52edf172d
ZJIT: Specialize monomorphic DefinedIvar (#15281)
This lets us constant-fold common monomorphic cases.
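
For context, `defined?(@ivar)` is the Ruby-level construct behind this kind of check. The sketch below shows the common memoization idiom that uses it. The `Config` class is hypothetical, and reading "monomorphic" as "the site only ever sees objects with one layout" is an assumption about the commit's terminology.

```ruby
# Hypothetical class showing a typical `defined?(@ivar)` site.
class Config
  def settings
    # `defined?(@settings)` is nil until the ivar is first assigned,
    # and the string "instance-variable" afterwards.
    return @settings if defined?(@settings)
    @settings = load_settings
  end

  def loaded?
    defined?(@settings) ? true : false
  end

  private

  def load_settings
    { verbose: false }
  end
end

config = Config.new
loaded_before = config.loaded?
config.settings
loaded_after = config.loaded?
```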
2025-11-21 11:48:36 -05:00
Satoshi Tagomori
d2a587c791
renaming internal data structures and functions from namespace to box
2025-11-07 13:14:54 +09:00
Max Bernstein
02267417da
ZJIT: Profile specific objects for invokeblock (#15051)
I made a special kind of `ProfiledType` that looks at specific objects, not just their classes/shapes (https://github.com/ruby/ruby/pull/15051). Then I profiled some of our benchmarks.

For lobsters:

```
Top-6 invokeblock handler (100.0% of total 1,064,155):
        megamorphic: 494,931 (46.5%)
   monomorphic_iseq: 337,171 (31.7%)
        polymorphic: 113,381 (10.7%)
  monomorphic_ifunc:  52,260 ( 4.9%)
  monomorphic_other:  38,970 ( 3.7%)
        no_profiles:  27,442 ( 2.6%)
```

For railsbench:

```
Top-6 invokeblock handler (100.0% of total 2,529,104):
   monomorphic_iseq: 834,452 (33.0%)
        megamorphic: 818,347 (32.4%)
        polymorphic: 632,273 (25.0%)
  monomorphic_ifunc: 224,243 ( 8.9%)
  monomorphic_other:  19,595 ( 0.8%)
        no_profiles:     194 ( 0.0%)
```

For shipit:

```
Top-6 invokeblock handler (100.0% of total 2,104,148):
        megamorphic: 1,269,889 (60.4%)
        polymorphic:   411,475 (19.6%)
        no_profiles:   173,367 ( 8.2%)
  monomorphic_other:   118,619 ( 5.6%)
   monomorphic_iseq:    84,891 ( 4.0%)
  monomorphic_ifunc:    45,907 ( 2.2%)
```

Seems like a monomorphic case for a specific ISEQ actually isn't a bad way of going about this, at least to start...
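
To make the categories more concrete, here is a small Ruby sketch. `yield` compiles to the `invokeblock` instruction, so each method below contains one profiled site. Mapping these two situations onto the `monomorphic_iseq` and polymorphic/megamorphic counters is an interpretation of the counter names, and the helper methods are hypothetical.

```ruby
# Hypothetical helpers; each `yield` is one `invokeblock` site.
def apply_once(value)
  yield value
end

def apply_shared(value)
  yield value
end

# apply_once is only ever called with this one block literal, so its
# `yield` always invokes the same block body.
doubled = (1..3).map { |n| apply_once(n) { |x| x * 2 } }

# apply_shared is called with several different block literals, so its
# `yield` sees several distinct block bodies over time.
mixed = [
  apply_shared(1) { |x| x + 1 },
  apply_shared(2) { |x| x * 10 },
  apply_shared(3) { |x| x.to_s },
]
```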
2025-11-05 20:01:17 +00:00
Stan Lo
e047cea280
ZJIT: Optimize send with block into CCallWithFrame (#14863)
Since `Send` has a block iseq, I updated `CCallWithFrame` to take an optional `blockiseq` as well, and then generate `CCallWithFrame` for `Send` when the condition is right.
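
At the Ruby level, the call shape in question is a method call with a literal block whose target is implemented in C. The sketch below uses `String#each_char` and `Hash#each_pair`, which are assumed here to be C-implemented in CRuby. Whether ZJIT turns these particular calls into `CCallWithFrame` depends on conditions the description does not spell out.

```ruby
# Calls with a literal block to methods assumed to be implemented in C.
counts = Hash.new(0)
"banana".each_char { |ch| counts[ch] += 1 }

pairs = []
counts.each_pair { |ch, n| pairs << "#{ch}=#{n}" }
```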

## Stats

`liquid-render` Benchmark

  | Metric               | Before             | After              | Change              |
  |----------------------|--------------------|--------------------|---------------------|
  | send_no_profiles     | 3,209,418 (34.1%)  | 4,119 (0.1%)       | -3,205,299 (-99.9%) |
  | dynamic_send_count   | 9,410,758 (23.1%)  | 6,459,678 (15.9%)  | -2,951,080 (-31.4%) |
  | optimized_send_count | 31,269,388 (76.9%) | 34,220,474 (84.1%) | +2,951,086 (+9.4%)  |

`lobsters` Benchmark

  | Metric               | Before     | After      | Change              |
  |----------------------|------------|------------|---------------------|
  | send_no_profiles     | 10,769,052 | 2,902,865  | -7,866,187 (-73.0%) |
  | dynamic_send_count   | 45,673,185 | 42,880,160 | -2,793,025 (-6.1%)  |
  | optimized_send_count | 75,142,407 | 78,378,514 | +3,236,107 (+4.3%)  |
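
The Change columns in both tables follow from the Before and After counts: the delta is `after - before`, and the percentage is that delta relative to the Before value. A small Ruby sketch of the arithmetic, using a hypothetical `change` helper that is not part of the benchmark tooling:

```ruby
# Hypothetical helper: returns [absolute delta, percent change vs. before].
def change(before, after)
  delta = after - before
  percent = (delta * 100.0 / before).round(1)
  [delta, percent]
end

# liquid-render rows from the table above
liquid_no_profiles = change(3_209_418, 4_119)
liquid_dynamic     = change(9_410_758, 6_459_678)
liquid_optimized   = change(31_269_388, 34_220_474)

# lobsters rows from the table above
lobsters_no_profiles = change(10_769_052, 2_902_865)
lobsters_dynamic     = change(45_673_185, 42_880_160)
lobsters_optimized   = change(75_142_407, 78_378_514)
```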


### `liquid-render` Before

<details>

```
Average of last 22, non-warmup iters: 262ms
***ZJIT: Printing ZJIT statistics on exit***
Top-20 not inlined C methods (96.9% of total 10,370,809):
                    Kernel#respond_to?: 5,069,204 (48.9%)
                             Hash#key?: 2,394,488 (23.1%)
                          Set#include?:   778,429 ( 7.5%)
                            String#===:   326,134 ( 3.1%)
                             String#<<:   203,231 ( 2.0%)
                            Integer#<<:   166,768 ( 1.6%)
                          Kernel#is_a?:   164,272 ( 1.6%)
                         Kernel#format:   124,262 ( 1.2%)
                             Integer#/:   124,262 ( 1.2%)
                              Array#<<:   115,325 ( 1.1%)
                     Regexp.last_match:    94,862 ( 0.9%)
                              Hash#[]=:    88,485 ( 0.9%)
                    String#start_with?:    55,933 ( 0.5%)
             CGI::EscapeExt#escapeHTML:    55,471 ( 0.5%)
                           Array#shift:    55,298 ( 0.5%)
                            Regexp#===:    48,928 ( 0.5%)
                             String#=~:    48,477 ( 0.5%)
                         Array#unshift:    47,331 ( 0.5%)
                         String#empty?:    42,870 ( 0.4%)
                            Array#push:    41,215 ( 0.4%)
Top-20 not annotated C methods (97.1% of total 10,394,421):
                    Kernel#respond_to?: 5,069,204 (48.8%)
                             Hash#key?: 2,394,488 (23.0%)
                          Set#include?:   778,429 ( 7.5%)
                            String#===:   326,134 ( 3.1%)
                          Kernel#is_a?:   208,664 ( 2.0%)
                             String#<<:   203,231 ( 2.0%)
                            Integer#<<:   166,768 ( 1.6%)
                             Integer#/:   124,262 ( 1.2%)
                         Kernel#format:   124,262 ( 1.2%)
                              Array#<<:   115,325 ( 1.1%)
                     Regexp.last_match:    94,862 ( 0.9%)
                              Hash#[]=:    88,485 ( 0.9%)
                    String#start_with?:    55,933 ( 0.5%)
             CGI::EscapeExt#escapeHTML:    55,471 ( 0.5%)
                           Array#shift:    55,298 ( 0.5%)
                            Regexp#===:    48,928 ( 0.5%)
                             String#=~:    48,477 ( 0.5%)
                         Array#unshift:    47,331 ( 0.5%)
                         String#empty?:    42,870 ( 0.4%)
                            Array#push:    41,215 ( 0.4%)
Top-2 not optimized method types for send (100.0% of total 2,382):
  cfunc: 1,196 (50.2%)
   iseq: 1,186 (49.8%)
Top-4 not optimized method types for send_without_block (100.0% of total 2,561,006):
       iseq: 2,442,091 (95.4%)
  optimized:   118,882 ( 4.6%)
      alias:        20 ( 0.0%)
       null:        13 ( 0.0%)
Top-9 not optimized instructions (100.0% of total 685,128):
             invokeblock: 227,376 (33.2%)
                 opt_neq: 166,471 (24.3%)
                 opt_and: 166,471 (24.3%)
                  opt_eq:  66,721 ( 9.7%)
             invokesuper:  39,363 ( 5.7%)
                  opt_le:  16,278 ( 2.4%)
               opt_minus:   1,574 ( 0.2%)
  opt_send_without_block:     772 ( 0.1%)
                  opt_or:     102 ( 0.0%)
Top-8 send fallback reasons (100.0% of total 9,410,758):
                              send_no_profiles: 3,209,418 (34.1%)
                send_without_block_polymorphic: 2,858,558 (30.4%)
  send_without_block_not_optimized_method_type: 2,561,006 (27.2%)
                     not_optimized_instruction:   685,128 ( 7.3%)
                send_without_block_no_profiles:    91,913 ( 1.0%)
                send_not_optimized_method_type:     2,382 ( 0.0%)
                      obj_to_string_not_string:     2,352 ( 0.0%)
       send_without_block_cfunc_array_variadic:         1 ( 0.0%)
Top-3 unhandled YARV insns (100.0% of total 83,682):
  getclassvariable: 83,431 (99.7%)
              once:    137 ( 0.2%)
       getconstant:    114 ( 0.1%)
Top-3 compile error reasons (100.0% of total 5,431,910):
  register_spill_on_alloc: 4,665,393 (85.9%)
        exception_handler:   766,347 (14.1%)
  register_spill_on_ccall:       170 ( 0.0%)
Top-11 side exit reasons (100.0% of total 14,635,508):
                        compile_error: 5,431,910 (37.1%)
                  guard_shape_failure: 3,436,341 (23.5%)
                   guard_type_failure: 2,545,791 (17.4%)
                      unhandled_splat: 2,162,907 (14.8%)
                      unhandled_kwarg:   952,568 ( 6.5%)
                  unhandled_yarv_insn:    83,682 ( 0.6%)
                   unhandled_hir_insn:    19,112 ( 0.1%)
     patchpoint_stable_constant_names:     1,608 ( 0.0%)
               obj_to_string_fallback:       902 ( 0.0%)
          patchpoint_method_redefined:       599 ( 0.0%)
  block_param_proxy_not_iseq_or_ifunc:        88 ( 0.0%)
                             send_count: 40,680,153
                     dynamic_send_count:  9,410,758 (23.1%)
                   optimized_send_count: 31,269,395 (76.9%)
              iseq_optimized_send_count: 13,886,902 (34.1%)
      inline_cfunc_optimized_send_count:  7,011,684 (17.2%)
non_variadic_cfunc_optimized_send_count:  4,670,333 (11.5%)
    variadic_cfunc_optimized_send_count:  5,700,476 (14.0%)
dynamic_getivar_count:                         1,144,613
dynamic_setivar_count:                           950,830
compiled_iseq_count:                                 402
failed_iseq_count:                                    48
compile_time:                                      976ms
profile_time:                                    3,223ms
gc_time:                                            22ms
invalidation_time:                                   0ms
vm_write_pc_count:                            37,744,491
vm_write_sp_count:                            37,511,865
vm_write_locals_count:                        37,511,865
vm_write_stack_count:                         37,511,865
vm_write_to_parent_iseq_local_count:             558,177
vm_read_from_parent_iseq_local_count:         14,317,032
code_region_bytes:                             2,211,840
side_exit_count:                              14,635,508
total_insn_count:                            476,097,972
vm_insn_count:                               253,795,154
zjit_insn_count:                             222,302,818
ratio_in_zjit:                                     46.7%
```

</details>

### `liquid-render` After

<details>

```
Average of last 21, non-warmup iters: 272ms
***ZJIT: Printing ZJIT statistics on exit***
Top-20 not inlined C methods (96.8% of total 10,093,966):
                    Kernel#respond_to?: 4,932,224 (48.9%)
                             Hash#key?: 2,329,928 (23.1%)
                          Set#include?:   757,389 ( 7.5%)
                            String#===:   317,494 ( 3.1%)
                             String#<<:   197,831 ( 2.0%)
                            Integer#<<:   162,268 ( 1.6%)
                          Kernel#is_a?:   159,892 ( 1.6%)
                         Kernel#format:   120,902 ( 1.2%)
                             Integer#/:   120,902 ( 1.2%)
                              Array#<<:   112,225 ( 1.1%)
                     Regexp.last_match:    92,382 ( 0.9%)
                              Hash#[]=:    86,145 ( 0.9%)
                    String#start_with?:    54,953 ( 0.5%)
                           Array#shift:    54,038 ( 0.5%)
             CGI::EscapeExt#escapeHTML:    53,971 ( 0.5%)
                            Regexp#===:    47,848 ( 0.5%)
                             String#=~:    47,237 ( 0.5%)
                         Array#unshift:    46,051 ( 0.5%)
                         String#empty?:    41,750 ( 0.4%)
                            Array#push:    40,115 ( 0.4%)
Top-20 not annotated C methods (97.1% of total 10,116,938):
                    Kernel#respond_to?: 4,932,224 (48.8%)
                             Hash#key?: 2,329,928 (23.0%)
                          Set#include?:   757,389 ( 7.5%)
                            String#===:   317,494 ( 3.1%)
                          Kernel#is_a?:   203,084 ( 2.0%)
                             String#<<:   197,831 ( 2.0%)
                            Integer#<<:   162,268 ( 1.6%)
                         Kernel#format:   120,902 ( 1.2%)
                             Integer#/:   120,902 ( 1.2%)
                              Array#<<:   112,225 ( 1.1%)
                     Regexp.last_match:    92,382 ( 0.9%)
                              Hash#[]=:    86,145 ( 0.9%)
                    String#start_with?:    54,953 ( 0.5%)
                           Array#shift:    54,038 ( 0.5%)
             CGI::EscapeExt#escapeHTML:    53,971 ( 0.5%)
                            Regexp#===:    47,848 ( 0.5%)
                             String#=~:    47,237 ( 0.5%)
                         Array#unshift:    46,051 ( 0.5%)
                         String#empty?:    41,750 ( 0.4%)
                            Array#push:    40,115 ( 0.4%)
Top-2 not optimized method types for send (100.0% of total 182,938):
   iseq: 178,414 (97.5%)
  cfunc:   4,524 ( 2.5%)
Top-4 not optimized method types for send_without_block (100.0% of total 2,492,246):
       iseq: 2,376,511 (95.4%)
  optimized:   115,702 ( 4.6%)
      alias:        20 ( 0.0%)
       null:        13 ( 0.0%)
Top-9 not optimized instructions (100.0% of total 667,727):
             invokeblock: 221,375 (33.2%)
                 opt_neq: 161,971 (24.3%)
                 opt_and: 161,971 (24.3%)
                  opt_eq:  64,921 ( 9.7%)
             invokesuper:  39,243 ( 5.9%)
                  opt_le:  15,838 ( 2.4%)
               opt_minus:   1,534 ( 0.2%)
  opt_send_without_block:     772 ( 0.1%)
                  opt_or:     102 ( 0.0%)
Top-9 send fallback reasons (100.0% of total 6,287,956):
                send_without_block_polymorphic: 2,782,058 (44.2%)
  send_without_block_not_optimized_method_type: 2,492,246 (39.6%)
                     not_optimized_instruction:   667,727 (10.6%)
                send_not_optimized_method_type:   182,938 ( 2.9%)
                send_without_block_no_profiles:    89,613 ( 1.4%)
                              send_polymorphic:    66,962 ( 1.1%)
                              send_no_profiles:     4,059 ( 0.1%)
                      obj_to_string_not_string:     2,352 ( 0.0%)
       send_without_block_cfunc_array_variadic:         1 ( 0.0%)
Top-3 unhandled YARV insns (100.0% of total 81,482):
  getclassvariable: 81,231 (99.7%)
              once:    137 ( 0.2%)
       getconstant:    114 ( 0.1%)
Top-3 compile error reasons (100.0% of total 5,286,310):
  register_spill_on_alloc: 4,540,413 (85.9%)
        exception_handler:   745,727 (14.1%)
  register_spill_on_ccall:       170 ( 0.0%)
Top-12 side exit reasons (100.0% of total 14,244,881):
                        compile_error: 5,286,310 (37.1%)
                  guard_shape_failure: 3,346,873 (23.5%)
                   guard_type_failure: 2,477,071 (17.4%)
                      unhandled_splat: 2,104,447 (14.8%)
                      unhandled_kwarg:   926,828 ( 6.5%)
                  unhandled_yarv_insn:    81,482 ( 0.6%)
                   unhandled_hir_insn:    18,672 ( 0.1%)
     patchpoint_stable_constant_names:     1,608 ( 0.0%)
               obj_to_string_fallback:       902 ( 0.0%)
          patchpoint_method_redefined:       599 ( 0.0%)
  block_param_proxy_not_iseq_or_ifunc:        88 ( 0.0%)
                            interrupt:         1 ( 0.0%)
                             send_count: 39,591,410
                     dynamic_send_count:  6,287,956 (15.9%)
                   optimized_send_count: 33,303,454 (84.1%)
              iseq_optimized_send_count: 13,514,283 (34.1%)
      inline_cfunc_optimized_send_count:  6,823,745 (17.2%)
non_variadic_cfunc_optimized_send_count:  7,417,432 (18.7%)
    variadic_cfunc_optimized_send_count:  5,547,994 (14.0%)
dynamic_getivar_count:                        1,110,647
dynamic_setivar_count:                          927,309
compiled_iseq_count:                                403
failed_iseq_count:                                   48
compile_time:                                     968ms
profile_time:                                   3,547ms
gc_time:                                           22ms
invalidation_time:                                  0ms
vm_write_pc_count:                           36,735,108
vm_write_sp_count:                           36,508,262
vm_write_locals_count:                       36,508,262
vm_write_stack_count:                        36,508,262
vm_write_to_parent_iseq_local_count:            543,097
vm_read_from_parent_iseq_local_count:        13,930,672
code_region_bytes:                            2,228,224
side_exit_count:                             14,244,881
total_insn_count:                           463,357,969
vm_insn_count:                              247,003,727
zjit_insn_count:                            216,354,242
ratio_in_zjit:                                    46.7%
```

</details>

### `lobsters` Before

<details>

```
Average of last 10, non-warmup iters: 898ms
***ZJIT: Printing ZJIT statistics on exit***
Top-20 not inlined C methods (61.3% of total 19,495,906):
                                  String#<<: 1,764,437 ( 9.1%)
                               Kernel#is_a?: 1,615,120 ( 8.3%)
                                   Hash#[]=: 1,159,455 ( 5.9%)
                              Regexp#match?:   777,496 ( 4.0%)
                              String#empty?:   722,953 ( 3.7%)
                                  Hash#key?:   685,258 ( 3.5%)
                         Kernel#respond_to?:   602,017 ( 3.1%)
                              TrueClass#===:   447,671 ( 2.3%)
                             FalseClass#===:   439,276 ( 2.3%)
                             Array#include?:   426,758 ( 2.2%)
                        Kernel#block_given?:   405,271 ( 2.1%)
                                 Hash#fetch:   382,302 ( 2.0%)
                 ObjectSpace::WeakKeyMap#[]:   356,654 ( 1.8%)
                         String#start_with?:   353,793 ( 1.8%)
                            Kernel#kind_of?:   340,341 ( 1.7%)
                                 Kernel#dup:   328,162 ( 1.7%)
                                 String.new:   306,667 ( 1.6%)
                                  String#==:   287,549 ( 1.5%)
                             BasicObject#!=:   284,642 ( 1.5%)
                              String#length:   256,070 ( 1.3%)
Top-20 not annotated C methods (62.4% of total 19,796,172):
                               Kernel#is_a?: 1,993,676 (10.1%)
                                  String#<<: 1,764,437 ( 8.9%)
                                   Hash#[]=: 1,159,634 ( 5.9%)
                              Regexp#match?:   777,496 ( 3.9%)
                              String#empty?:   738,030 ( 3.7%)
                                  Hash#key?:   685,258 ( 3.5%)
                         Kernel#respond_to?:   602,017 ( 3.0%)
                              TrueClass#===:   447,671 ( 2.3%)
                             FalseClass#===:   439,276 ( 2.2%)
                             Array#include?:   426,758 ( 2.2%)
                        Kernel#block_given?:   425,813 ( 2.2%)
                                 Hash#fetch:   382,302 ( 1.9%)
                 ObjectSpace::WeakKeyMap#[]:   356,654 ( 1.8%)
                         String#start_with?:   353,793 ( 1.8%)
                            Kernel#kind_of?:   340,375 ( 1.7%)
                                 Kernel#dup:   328,169 ( 1.7%)
                                 String.new:   306,667 ( 1.5%)
                                  String#==:   293,520 ( 1.5%)
                             BasicObject#!=:   284,825 ( 1.4%)
                              String#length:   256,070 ( 1.3%)
Top-2 not optimized method types for send (100.0% of total 115,007):
  cfunc: 76,172 (66.2%)
   iseq: 38,835 (33.8%)
Top-6 not optimized method types for send_without_block (100.0% of total 8,003,641):
       iseq: 3,999,211 (50.0%)
    bmethod: 1,750,271 (21.9%)
  optimized: 1,653,426 (20.7%)
      alias:   591,342 ( 7.4%)
       null:     8,174 ( 0.1%)
      cfunc:     1,217 ( 0.0%)
Top-13 not optimized instructions (100.0% of total 7,590,826):
             invokesuper: 4,335,446 (57.1%)
             invokeblock: 1,329,215 (17.5%)
             sendforward:   841,463 (11.1%)
                  opt_eq:   810,614 (10.7%)
                opt_plus:   141,773 ( 1.9%)
               opt_minus:    52,270 ( 0.7%)
  opt_send_without_block:    43,248 ( 0.6%)
                 opt_neq:    15,047 ( 0.2%)
                opt_mult:    13,824 ( 0.2%)
                  opt_or:     7,451 ( 0.1%)
                  opt_lt:       348 ( 0.0%)
                  opt_ge:        91 ( 0.0%)
                  opt_gt:        36 ( 0.0%)
Top-9 send fallback reasons (100.0% of total 45,673,212):
                send_without_block_polymorphic: 17,390,335 (38.1%)
                              send_no_profiles: 10,769,053 (23.6%)
  send_without_block_not_optimized_method_type:  8,003,641 (17.5%)
                     not_optimized_instruction:  7,590,826 (16.6%)
                send_without_block_no_profiles:  1,757,109 ( 3.8%)
                send_not_optimized_method_type:    115,007 ( 0.3%)
       send_without_block_cfunc_array_variadic:     31,149 ( 0.1%)
                      obj_to_string_not_string:     15,518 ( 0.0%)
       send_without_block_direct_too_many_args:        574 ( 0.0%)
Top-9 unhandled YARV insns (100.0% of total 1,242,228):
         expandarray: 622,203 (50.1%)
        checkkeyword: 316,111 (25.4%)
    getclassvariable: 120,540 ( 9.7%)
       getblockparam:  88,480 ( 7.1%)
  invokesuperforward:  78,842 ( 6.3%)
   opt_duparray_send:  14,149 ( 1.1%)
         getconstant:   1,588 ( 0.1%)
          checkmatch:     288 ( 0.0%)
                once:      27 ( 0.0%)
Top-3 compile error reasons (100.0% of total 6,769,693):
  register_spill_on_alloc: 6,188,305 (91.4%)
  register_spill_on_ccall:   347,108 ( 5.1%)
        exception_handler:   234,280 ( 3.5%)
Top-17 side exit reasons (100.0% of total 20,142,827):
                        compile_error: 6,769,693 (33.6%)
                   guard_type_failure: 5,169,050 (25.7%)
                  guard_shape_failure: 3,726,362 (18.5%)
                  unhandled_yarv_insn: 1,242,228 ( 6.2%)
  block_param_proxy_not_iseq_or_ifunc:   984,480 ( 4.9%)
                      unhandled_kwarg:   800,154 ( 4.0%)
                unknown_newarray_send:   539,317 ( 2.7%)
     patchpoint_stable_constant_names:   340,283 ( 1.7%)
                      unhandled_splat:   229,440 ( 1.1%)
                   unhandled_hir_insn:   147,351 ( 0.7%)
        patchpoint_no_singleton_class:   128,856 ( 0.6%)
          patchpoint_method_redefined:    32,718 ( 0.2%)
           block_param_proxy_modified:    25,274 ( 0.1%)
              patchpoint_no_ep_escape:     7,559 ( 0.0%)
               obj_to_string_fallback:        24 ( 0.0%)
               guard_type_not_failure:        22 ( 0.0%)
                            interrupt:        16 ( 0.0%)
                             send_count: 120,815,640
                     dynamic_send_count:  45,673,212 (37.8%)
                   optimized_send_count:  75,142,428 (62.2%)
              iseq_optimized_send_count:  32,188,039 (26.6%)
      inline_cfunc_optimized_send_count:  23,458,483 (19.4%)
non_variadic_cfunc_optimized_send_count:  14,809,797 (12.3%)
    variadic_cfunc_optimized_send_count:   4,686,109 ( 3.9%)
dynamic_getivar_count:                       13,023,437
dynamic_setivar_count:                       12,311,158
compiled_iseq_count:                              4,806
failed_iseq_count:                                  466
compile_time:                                   8,943ms
profile_time:                                      99ms
gc_time:                                           45ms
invalidation_time:                                239ms
vm_write_pc_count:                          113,652,291
vm_write_sp_count:                          111,209,623
vm_write_locals_count:                      111,209,623
vm_write_stack_count:                       111,209,623
vm_write_to_parent_iseq_local_count:            516,800
vm_read_from_parent_iseq_local_count:        11,225,587
code_region_bytes:                           22,609,920
side_exit_count:                             20,142,827
total_insn_count:                           926,088,942
vm_insn_count:                              297,636,255
zjit_insn_count:                            628,452,687
ratio_in_zjit:                                    67.9%
```

</details>

### `lobsters` After

<details>

```
Average of last 10, non-warmup iters: 919ms
***ZJIT: Printing ZJIT statistics on exit***
Top-20 not inlined C methods (61.3% of total 19,495,868):
                                  String#<<: 1,764,437 ( 9.1%)
                               Kernel#is_a?: 1,615,110 ( 8.3%)
                                   Hash#[]=: 1,159,455 ( 5.9%)
                              Regexp#match?:   777,496 ( 4.0%)
                              String#empty?:   722,953 ( 3.7%)
                                  Hash#key?:   685,258 ( 3.5%)
                         Kernel#respond_to?:   602,016 ( 3.1%)
                              TrueClass#===:   447,671 ( 2.3%)
                             FalseClass#===:   439,276 ( 2.3%)
                             Array#include?:   426,758 ( 2.2%)
                        Kernel#block_given?:   405,271 ( 2.1%)
                                 Hash#fetch:   382,302 ( 2.0%)
                 ObjectSpace::WeakKeyMap#[]:   356,654 ( 1.8%)
                         String#start_with?:   353,793 ( 1.8%)
                            Kernel#kind_of?:   340,341 ( 1.7%)
                                 Kernel#dup:   328,162 ( 1.7%)
                                 String.new:   306,667 ( 1.6%)
                                  String#==:   287,545 ( 1.5%)
                             BasicObject#!=:   284,642 ( 1.5%)
                              String#length:   256,070 ( 1.3%)
Top-20 not annotated C methods (62.4% of total 19,796,134):
                               Kernel#is_a?: 1,993,666 (10.1%)
                                  String#<<: 1,764,437 ( 8.9%)
                                   Hash#[]=: 1,159,634 ( 5.9%)
                              Regexp#match?:   777,496 ( 3.9%)
                              String#empty?:   738,030 ( 3.7%)
                                  Hash#key?:   685,258 ( 3.5%)
                         Kernel#respond_to?:   602,016 ( 3.0%)
                              TrueClass#===:   447,671 ( 2.3%)
                             FalseClass#===:   439,276 ( 2.2%)
                             Array#include?:   426,758 ( 2.2%)
                        Kernel#block_given?:   425,813 ( 2.2%)
                                 Hash#fetch:   382,302 ( 1.9%)
                 ObjectSpace::WeakKeyMap#[]:   356,654 ( 1.8%)
                         String#start_with?:   353,793 ( 1.8%)
                            Kernel#kind_of?:   340,375 ( 1.7%)
                                 Kernel#dup:   328,169 ( 1.7%)
                                 String.new:   306,667 ( 1.5%)
                                  String#==:   293,516 ( 1.5%)
                             BasicObject#!=:   284,825 ( 1.4%)
                              String#length:   256,070 ( 1.3%)
Top-4 not optimized method types for send (100.0% of total 4,749,678):
   iseq: 2,563,391 (54.0%)
  cfunc: 2,064,888 (43.5%)
  alias:   118,577 ( 2.5%)
   null:     2,822 ( 0.1%)
Top-6 not optimized method types for send_without_block (100.0% of total 8,003,641):
       iseq: 3,999,211 (50.0%)
    bmethod: 1,750,271 (21.9%)
  optimized: 1,653,426 (20.7%)
      alias:   591,342 ( 7.4%)
       null:     8,174 ( 0.1%)
      cfunc:     1,217 ( 0.0%)
Top-13 not optimized instructions (100.0% of total 7,590,818):
             invokesuper: 4,335,442 (57.1%)
             invokeblock: 1,329,215 (17.5%)
             sendforward:   841,463 (11.1%)
                  opt_eq:   810,610 (10.7%)
                opt_plus:   141,773 ( 1.9%)
               opt_minus:    52,270 ( 0.7%)
  opt_send_without_block:    43,248 ( 0.6%)
                 opt_neq:    15,047 ( 0.2%)
                opt_mult:    13,824 ( 0.2%)
                  opt_or:     7,451 ( 0.1%)
                  opt_lt:       348 ( 0.0%)
                  opt_ge:        91 ( 0.0%)
                  opt_gt:        36 ( 0.0%)
Top-10 send fallback reasons (100.0% of total 43,152,037):
                send_without_block_polymorphic: 17,390,322 (40.3%)
  send_without_block_not_optimized_method_type:  8,003,641 (18.5%)
                     not_optimized_instruction:  7,590,818 (17.6%)
                send_not_optimized_method_type:  4,749,678 (11.0%)
                              send_no_profiles:  2,893,666 ( 6.7%)
                send_without_block_no_profiles:  1,757,109 ( 4.1%)
                              send_polymorphic:    719,562 ( 1.7%)
       send_without_block_cfunc_array_variadic:     31,149 ( 0.1%)
                      obj_to_string_not_string:     15,518 ( 0.0%)
       send_without_block_direct_too_many_args:        574 ( 0.0%)
Top-9 unhandled YARV insns (100.0% of total 1,242,215):
         expandarray: 622,203 (50.1%)
        checkkeyword: 316,111 (25.4%)
    getclassvariable: 120,540 ( 9.7%)
       getblockparam:  88,467 ( 7.1%)
  invokesuperforward:  78,842 ( 6.3%)
   opt_duparray_send:  14,149 ( 1.1%)
         getconstant:   1,588 ( 0.1%)
          checkmatch:     288 ( 0.0%)
                once:      27 ( 0.0%)
Top-3 compile error reasons (100.0% of total 6,769,688):
  register_spill_on_alloc: 6,188,305 (91.4%)
  register_spill_on_ccall:   347,108 ( 5.1%)
        exception_handler:   234,275 ( 3.5%)
Top-17 side exit reasons (100.0% of total 20,144,372):
                        compile_error: 6,769,688 (33.6%)
                   guard_type_failure: 5,169,204 (25.7%)
                  guard_shape_failure: 3,726,374 (18.5%)
                  unhandled_yarv_insn: 1,242,215 ( 6.2%)
  block_param_proxy_not_iseq_or_ifunc:   984,480 ( 4.9%)
                      unhandled_kwarg:   800,154 ( 4.0%)
                unknown_newarray_send:   539,317 ( 2.7%)
     patchpoint_stable_constant_names:   340,283 ( 1.7%)
                      unhandled_splat:   229,440 ( 1.1%)
                   unhandled_hir_insn:   147,351 ( 0.7%)
        patchpoint_no_singleton_class:   130,252 ( 0.6%)
          patchpoint_method_redefined:    32,716 ( 0.2%)
           block_param_proxy_modified:    25,274 ( 0.1%)
              patchpoint_no_ep_escape:     7,559 ( 0.0%)
               obj_to_string_fallback:        24 ( 0.0%)
               guard_type_not_failure:        22 ( 0.0%)
                            interrupt:        19 ( 0.0%)
                             send_count: 120,812,030
                     dynamic_send_count:  43,152,037 (35.7%)
                   optimized_send_count:  77,659,993 (64.3%)
              iseq_optimized_send_count:  32,187,900 (26.6%)
      inline_cfunc_optimized_send_count:  23,458,491 (19.4%)
non_variadic_cfunc_optimized_send_count:  17,327,499 (14.3%)
    variadic_cfunc_optimized_send_count:   4,686,103 ( 3.9%)
dynamic_getivar_count:                       13,023,424
dynamic_setivar_count:                       12,310,991
compiled_iseq_count:                              4,806
failed_iseq_count:                                  466
compile_time:                                   9,012ms
profile_time:                                     104ms
gc_time:                                           44ms
invalidation_time:                                239ms
vm_write_pc_count:                          113,648,665
vm_write_sp_count:                          111,205,997
vm_write_locals_count:                      111,205,997
vm_write_stack_count:                       111,205,997
vm_write_to_parent_iseq_local_count:            516,800
vm_read_from_parent_iseq_local_count:        11,225,587
code_region_bytes:                           23,052,288
side_exit_count:                             20,144,372
total_insn_count:                           926,090,214
vm_insn_count:                              297,647,811
zjit_insn_count:                            628,442,403
ratio_in_zjit:                                    67.9%
```

</details>
2025-10-20 20:10:25 +00:00
Max Bernstein
1d95d75c3f
ZJIT: Profile opt_succ and inline Integer#succ for Fixnum (#14846)
As far as I can tell, this is called frequently only in the
benchmark harness.
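
For readers unfamiliar with the method, here is a minimal sketch of what `Integer#succ` does on a small integer (a Fixnum inside CRuby) and on one too large to be a Fixnum. The `count_to` helper is hypothetical and only imitates the kind of counting loop a harness might run; it is not taken from the benchmark harness itself.

```ruby
# Hypothetical helper: counts from 0 up to n using Integer#succ,
# the call that CRuby compiles to the opt_succ instruction.
def count_to(n)
  i = 0
  i = i.succ while i < n
  i
end

small = 41      # fits in a Fixnum
big   = 2**64   # too large for a Fixnum on 64-bit platforms

puts small.succ
puts big.succ
puts count_to(1_000)
```

Only the Fixnum case is the one this commit describes as inlined; the large-integer call is shown for contrast.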
2025-10-15 23:40:45 -04:00
Max Bernstein
de9298635d
ZJIT: Profile opt_size, opt_length, opt_regexpmatch2 (#14837)
These bring `send_without_block_no_profiles` numbers down further.

On lobsters:
  Before:  send_without_block_no_profiles: 1,293,375
  After:   send_without_block_no_profiles: 998,724
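
As a rough illustration of the source forms involved, the sketch below lists the `opt_*` instructions CRuby emits for three small snippets. The `insn_names` helper and the `SNIPPETS` table are hypothetical names introduced here. Which instructions appear depends on the Ruby version, so treat the printed names as something to inspect rather than a guarantee.

```ruby
# Hypothetical helper: names of the YARV instructions CRuby emits for `src`.
# Returns [] on Ruby implementations without RubyVM::InstructionSequence.
def insn_names(src)
  return [] unless defined?(RubyVM::InstructionSequence)
  # The last element of the array form is the instruction body; instructions
  # are Arrays, while line numbers and labels are Integers and Symbols.
  body = RubyVM::InstructionSequence.compile(src).to_a.last
  body.select { |entry| entry.is_a?(Array) }.map(&:first)
end

SNIPPETS = {
  "size"   => "a = [1, 2, 3]; a.size",
  "length" => "s = 'abc'; s.length",
  "match"  => "s = 'abc'; s =~ /b/",
}

SNIPPETS.each do |label, src|
  # Keep only the specialized opt_* instructions for readability.
  opts = insn_names(src).select { |name| name.to_s.start_with?("opt_") }
  puts "#{label}: #{opts.join(', ')}"
end
```

On CRuby versions that define them, one would expect `opt_size`, `opt_length`, and `opt_regexpmatch2` to show up in the respective lines.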

all stats before:

```
***ZJIT: Printing ZJIT statistics on exit***
Top-20 not inlined C methods (71.1% of total 15,575,335):
                                    Hash#[]: 4,519,774 (29.0%)
                               Kernel#is_a?: 1,030,758 ( 6.6%)
                                  String#<<:   851,929 ( 5.5%)
                                   Hash#[]=:   742,941 ( 4.8%)
                              Regexp#match?:   399,889 ( 2.6%)
                              String#empty?:   353,775 ( 2.3%)
                                  Hash#key?:   349,129 ( 2.2%)
                         String#start_with?:   334,961 ( 2.2%)
                         Kernel#respond_to?:   316,527 ( 2.0%)
                 ObjectSpace::WeakKeyMap#[]:   238,978 ( 1.5%)
                              TrueClass#===:   235,771 ( 1.5%)
                             FalseClass#===:   231,144 ( 1.5%)
                             Array#include?:   211,381 ( 1.4%)
                                 Hash#fetch:   204,702 ( 1.3%)
                        Kernel#block_given?:   181,792 ( 1.2%)
         ActiveSupport::OrderedOptions#_get:   181,272 ( 1.2%)
                                 Kernel#dup:   179,340 ( 1.2%)
                             BasicObject#!=:   175,997 ( 1.1%)
                                  Class#new:   168,078 ( 1.1%)
                            Kernel#kind_of?:   165,600 ( 1.1%)
Top-20 not annotated C methods (71.6% of total 15,737,478):
                                    Hash#[]: 4,519,784 (28.7%)
                               Kernel#is_a?: 1,212,649 ( 7.7%)
                                  String#<<:   851,929 ( 5.4%)
                                   Hash#[]=:   743,120 ( 4.7%)
                              Regexp#match?:   399,889 ( 2.5%)
                              String#empty?:   361,013 ( 2.3%)
                                  Hash#key?:   349,129 ( 2.2%)
                         String#start_with?:   334,961 ( 2.1%)
                         Kernel#respond_to?:   316,527 ( 2.0%)
                 ObjectSpace::WeakKeyMap#[]:   238,978 ( 1.5%)
                              TrueClass#===:   235,771 ( 1.5%)
                             FalseClass#===:   231,144 ( 1.5%)
                             Array#include?:   211,381 ( 1.3%)
                                 Hash#fetch:   204,702 ( 1.3%)
                        Kernel#block_given?:   191,661 ( 1.2%)
         ActiveSupport::OrderedOptions#_get:   181,272 ( 1.2%)
                                 Kernel#dup:   179,347 ( 1.1%)
                             BasicObject#!=:   176,181 ( 1.1%)
                                  Class#new:   168,078 ( 1.1%)
                            Kernel#kind_of?:   165,634 ( 1.1%)
Top-2 not optimized method types for send (100.0% of total 72,318):
  cfunc: 48,055 (66.4%)
   iseq: 24,263 (33.6%)
Top-6 not optimized method types for send_without_block (100.0% of total 4,523,648):
       iseq: 2,271,904 (50.2%)
    bmethod:   985,636 (21.8%)
  optimized:   949,702 (21.0%)
      alias:   310,746 ( 6.9%)
       null:     5,106 ( 0.1%)
      cfunc:       554 ( 0.0%)
Top-13 not optimized instructions (100.0% of total 4,293,096):
             invokesuper: 2,373,391 (55.3%)
             invokeblock:   811,872 (18.9%)
             sendforward:   505,448 (11.8%)
                  opt_eq:   451,754 (10.5%)
                opt_plus:    74,403 ( 1.7%)
               opt_minus:    36,225 ( 0.8%)
  opt_send_without_block:    21,792 ( 0.5%)
                 opt_neq:     7,231 ( 0.2%)
                opt_mult:     6,752 ( 0.2%)
                  opt_or:     3,753 ( 0.1%)
                  opt_lt:       348 ( 0.0%)
                  opt_ge:        91 ( 0.0%)
                  opt_gt:        36 ( 0.0%)
Top-9 send fallback reasons (100.0% of total 25,824,463):
                send_without_block_polymorphic: 9,721,727 (37.6%)
                              send_no_profiles: 5,894,760 (22.8%)
  send_without_block_not_optimized_method_type: 4,523,648 (17.5%)
                     not_optimized_instruction: 4,293,096 (16.6%)
                send_without_block_no_profiles: 1,293,386 ( 5.0%)
                send_not_optimized_method_type:    72,318 ( 0.3%)
       send_without_block_cfunc_array_variadic:    15,134 ( 0.1%)
                      obj_to_string_not_string:     9,765 ( 0.0%)
       send_without_block_direct_too_many_args:       629 ( 0.0%)
Top-9 unhandled YARV insns (100.0% of total 690,482):
         expandarray: 328,490 (47.6%)
        checkkeyword: 190,694 (27.6%)
    getclassvariable:  59,901 ( 8.7%)
  invokesuperforward:  49,503 ( 7.2%)
       getblockparam:  48,651 ( 7.0%)
   opt_duparray_send:  11,978 ( 1.7%)
         getconstant:     952 ( 0.1%)
          checkmatch:     290 ( 0.0%)
                once:      23 ( 0.0%)
Top-3 compile error reasons (100.0% of total 3,752,502):
  register_spill_on_alloc: 3,457,791 (92.1%)
  register_spill_on_ccall:   176,348 ( 4.7%)
        exception_handler:   118,363 ( 3.2%)
Top-14 side exit reasons (100.0% of total 10,860,787):
                        compile_error: 3,752,502 (34.6%)
                   guard_type_failure: 2,638,903 (24.3%)
                  guard_shape_failure: 1,917,195 (17.7%)
                  unhandled_yarv_insn:   690,482 ( 6.4%)
  block_param_proxy_not_iseq_or_ifunc:   535,787 ( 4.9%)
                      unhandled_kwarg:   421,943 ( 3.9%)
                           patchpoint:   370,449 ( 3.4%)
                unknown_newarray_send:   314,785 ( 2.9%)
                      unhandled_splat:   122,060 ( 1.1%)
                   unhandled_hir_insn:    76,396 ( 0.7%)
           block_param_proxy_modified:    19,193 ( 0.2%)
               obj_to_string_fallback:       566 ( 0.0%)
                            interrupt:       504 ( 0.0%)
               guard_type_not_failure:        22 ( 0.0%)
                             send_count: 66,945,801
                     dynamic_send_count: 25,824,463 (38.6%)
                   optimized_send_count: 41,121,338 (61.4%)
              iseq_optimized_send_count: 18,587,368 (27.8%)
      inline_cfunc_optimized_send_count:  6,958,635 (10.4%)
non_variadic_cfunc_optimized_send_count: 12,911,155 (19.3%)
    variadic_cfunc_optimized_send_count:  2,664,180 ( 4.0%)
dynamic_getivar_count:                        7,365,975
dynamic_setivar_count:                        7,245,897
compiled_iseq_count:                              4,794
failed_iseq_count:                                  450
compile_time:                                     760ms
profile_time:                                       9ms
gc_time:                                            8ms
invalidation_time:                                 55ms
vm_write_pc_count:                           64,284,053
vm_write_sp_count:                           62,940,297
vm_write_locals_count:                       62,940,297
vm_write_stack_count:                        62,940,297
vm_write_to_parent_iseq_local_count:            292,446
vm_read_from_parent_iseq_local_count:         6,470,923
code_region_bytes:                           23,019,520
side_exit_count:                             10,860,787
total_insn_count:                           517,576,320
vm_insn_count:                              163,188,910
zjit_insn_count:                            354,387,410
ratio_in_zjit:                                    68.5%
```

all stats after:

```
***ZJIT: Printing ZJIT statistics on exit***
Top-20 not inlined C methods (70.4% of total 15,740,856):
                                    Hash#[]: 4,519,792 (28.7%)
                               Kernel#is_a?: 1,030,776 ( 6.5%)
                                  String#<<:   851,940 ( 5.4%)
                                   Hash#[]=:   742,914 ( 4.7%)
                              Regexp#match?:   399,887 ( 2.5%)
                              String#empty?:   353,775 ( 2.2%)
                                  Hash#key?:   349,139 ( 2.2%)
                         String#start_with?:   334,961 ( 2.1%)
                         Kernel#respond_to?:   316,529 ( 2.0%)
                 ObjectSpace::WeakKeyMap#[]:   238,978 ( 1.5%)
                              TrueClass#===:   235,771 ( 1.5%)
                             FalseClass#===:   231,144 ( 1.5%)
                             Array#include?:   211,381 ( 1.3%)
                                 Hash#fetch:   204,702 ( 1.3%)
                        Kernel#block_given?:   181,788 ( 1.2%)
         ActiveSupport::OrderedOptions#_get:   181,272 ( 1.2%)
                                 Kernel#dup:   179,341 ( 1.1%)
                             BasicObject#!=:   175,996 ( 1.1%)
                                  Class#new:   168,079 ( 1.1%)
                            Kernel#kind_of?:   165,600 ( 1.1%)
Top-20 not annotated C methods (70.9% of total 15,902,999):
                                    Hash#[]: 4,519,802 (28.4%)
                               Kernel#is_a?: 1,212,667 ( 7.6%)
                                  String#<<:   851,940 ( 5.4%)
                                   Hash#[]=:   743,093 ( 4.7%)
                              Regexp#match?:   399,887 ( 2.5%)
                              String#empty?:   361,013 ( 2.3%)
                                  Hash#key?:   349,139 ( 2.2%)
                         String#start_with?:   334,961 ( 2.1%)
                         Kernel#respond_to?:   316,529 ( 2.0%)
                 ObjectSpace::WeakKeyMap#[]:   238,978 ( 1.5%)
                              TrueClass#===:   235,771 ( 1.5%)
                             FalseClass#===:   231,144 ( 1.5%)
                             Array#include?:   211,381 ( 1.3%)
                                 Hash#fetch:   204,702 ( 1.3%)
                        Kernel#block_given?:   191,657 ( 1.2%)
         ActiveSupport::OrderedOptions#_get:   181,272 ( 1.1%)
                                 Kernel#dup:   179,348 ( 1.1%)
                             BasicObject#!=:   176,180 ( 1.1%)
                                  Class#new:   168,079 ( 1.1%)
                            Kernel#kind_of?:   165,634 ( 1.0%)
Top-2 not optimized method types for send (100.0% of total 72,318):
  cfunc: 48,055 (66.4%)
   iseq: 24,263 (33.6%)
Top-6 not optimized method types for send_without_block (100.0% of total 4,523,637):
       iseq: 2,271,900 (50.2%)
    bmethod:   985,636 (21.8%)
  optimized:   949,695 (21.0%)
      alias:   310,746 ( 6.9%)
       null:     5,106 ( 0.1%)
      cfunc:       554 ( 0.0%)
Top-13 not optimized instructions (100.0% of total 4,293,128):
             invokesuper: 2,373,401 (55.3%)
             invokeblock:   811,890 (18.9%)
             sendforward:   505,449 (11.8%)
                  opt_eq:   451,754 (10.5%)
                opt_plus:    74,403 ( 1.7%)
               opt_minus:    36,228 ( 0.8%)
  opt_send_without_block:    21,792 ( 0.5%)
                 opt_neq:     7,231 ( 0.2%)
                opt_mult:     6,752 ( 0.2%)
                  opt_or:     3,753 ( 0.1%)
                  opt_lt:       348 ( 0.0%)
                  opt_ge:        91 ( 0.0%)
                  opt_gt:        36 ( 0.0%)
Top-9 send fallback reasons (100.0% of total 25,530,605):
                send_without_block_polymorphic: 9,722,499 (38.1%)
                              send_no_profiles: 5,894,763 (23.1%)
  send_without_block_not_optimized_method_type: 4,523,637 (17.7%)
                     not_optimized_instruction: 4,293,128 (16.8%)
                send_without_block_no_profiles:   998,732 ( 3.9%)
                send_not_optimized_method_type:    72,318 ( 0.3%)
       send_without_block_cfunc_array_variadic:    15,134 ( 0.1%)
                      obj_to_string_not_string:     9,765 ( 0.0%)
       send_without_block_direct_too_many_args:       629 ( 0.0%)
Top-9 unhandled YARV insns (100.0% of total 690,482):
         expandarray: 328,490 (47.6%)
        checkkeyword: 190,694 (27.6%)
    getclassvariable:  59,901 ( 8.7%)
  invokesuperforward:  49,503 ( 7.2%)
       getblockparam:  48,651 ( 7.0%)
   opt_duparray_send:  11,978 ( 1.7%)
         getconstant:     952 ( 0.1%)
          checkmatch:     290 ( 0.0%)
                once:      23 ( 0.0%)
Top-3 compile error reasons (100.0% of total 3,752,500):
  register_spill_on_alloc: 3,457,792 (92.1%)
  register_spill_on_ccall:   176,348 ( 4.7%)
        exception_handler:   118,360 ( 3.2%)
Top-14 side exit reasons (100.0% of total 10,860,797):
                        compile_error: 3,752,500 (34.6%)
                   guard_type_failure: 2,638,909 (24.3%)
                  guard_shape_failure: 1,917,203 (17.7%)
                  unhandled_yarv_insn:   690,482 ( 6.4%)
  block_param_proxy_not_iseq_or_ifunc:   535,784 ( 4.9%)
                      unhandled_kwarg:   421,947 ( 3.9%)
                           patchpoint:   370,474 ( 3.4%)
                unknown_newarray_send:   314,786 ( 2.9%)
                      unhandled_splat:   122,067 ( 1.1%)
                   unhandled_hir_insn:    76,395 ( 0.7%)
           block_param_proxy_modified:    19,193 ( 0.2%)
               obj_to_string_fallback:       566 ( 0.0%)
                            interrupt:       469 ( 0.0%)
               guard_type_not_failure:        22 ( 0.0%)
                             send_count: 66,945,326
                     dynamic_send_count: 25,530,605 (38.1%)
                   optimized_send_count: 41,414,721 (61.9%)
              iseq_optimized_send_count: 18,587,439 (27.8%)
      inline_cfunc_optimized_send_count:  7,086,426 (10.6%)
non_variadic_cfunc_optimized_send_count: 13,076,682 (19.5%)
    variadic_cfunc_optimized_send_count:  2,664,174 ( 4.0%)
dynamic_getivar_count:                       7,365,985
dynamic_setivar_count:                       7,245,954
compiled_iseq_count:                             4,794
failed_iseq_count:                                 450
compile_time:                                    748ms
profile_time:                                      9ms
gc_time:                                           8ms
invalidation_time:                                58ms
vm_write_pc_count:                          64,155,801
vm_write_sp_count:                          62,812,041
vm_write_locals_count:                      62,812,041
vm_write_stack_count:                       62,812,041
vm_write_to_parent_iseq_local_count:           292,448
vm_read_from_parent_iseq_local_count:        6,470,939
code_region_bytes:                          23,052,288
side_exit_count:                            10,860,797
total_insn_count:                          517,576,915
vm_insn_count:                             163,192,099
zjit_insn_count:                           354,384,816
ratio_in_zjit:                                   68.5%
```
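
The percentage fields in the dump above appear to be simple ratios of the raw counters. The sketch below re-derives a few of them from values copied out of that dump. The `pct` helper and the `STATS` hash are hypothetical, and the relationships are inferred from the numbers rather than from ZJIT's source.

```ruby
# Counter values copied from the "all stats after" dump above.
STATS = {
  send_count:           66_945_326,
  dynamic_send_count:   25_530_605,
  optimized_send_count: 41_414_721,
  total_insn_count:     517_576_915,
  vm_insn_count:        163_192_099,
  zjit_insn_count:      354_384_816,
}

# Hypothetical helper: `part` as a percentage of `total`, to one decimal place.
def pct(part, total)
  (part * 100.0 / total).round(1)
end

# Dynamic and optimized sends add up to the total send count.
puts STATS[:dynamic_send_count] + STATS[:optimized_send_count] == STATS[:send_count]
# Interpreter and ZJIT instruction counts add up to the total instruction count.
puts STATS[:vm_insn_count] + STATS[:zjit_insn_count] == STATS[:total_insn_count]

puts "dynamic sends: #{pct(STATS[:dynamic_send_count], STATS[:send_count])}%"
puts "ratio_in_zjit: #{pct(STATS[:zjit_insn_count], STATS[:total_insn_count])}%"
```

The two percentages agree with the `dynamic_send_count` and `ratio_in_zjit` lines of the dump.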
2025-10-14 16:17:54 -04:00
Max Bernstein
d75207d004
ZJIT: Profile opt_ltlt and opt_aset (#14834)
These bring `send_without_block_no_profiles` numbers down dramatically.

On lobsters:
  Before: send_without_block_no_profiles: 3,466,375
  After:  send_without_block_no_profiles: 1,293,375
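
To put a number on the drop, the relative reduction can be worked out from the two counts quoted above. This is a small sketch; `reduction_pct` is a hypothetical helper, not part of ZJIT or its stats tooling.

```ruby
# Hypothetical helper: relative reduction from `before` to `after`,
# as a percentage rounded to one decimal place.
def reduction_pct(before, after)
  ((before - after) * 100.0 / before).round(1)
end

before = 3_466_375  # send_without_block_no_profiles before this change
after  = 1_293_375  # send_without_block_no_profiles after this change

puts "removed:   #{before - after}"
puts "reduction: #{reduction_pct(before, after)}%"
```

With these inputs the reduction comes to about 62.7%, that is, 2,173,000 fewer fallbacks of this kind.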

all stats before:

```
***ZJIT: Printing ZJIT statistics on exit***
Top-20 not inlined C methods (70.4% of total 14,174,061):
                                    Hash#[]: 4,519,771 (31.9%)
                               Kernel#is_a?: 1,030,757 ( 7.3%)
                              Regexp#match?:   399,885 ( 2.8%)
                              String#empty?:   353,775 ( 2.5%)
                                  Hash#key?:   349,125 ( 2.5%)
                                   Hash#[]=:   344,348 ( 2.4%)
                         String#start_with?:   334,961 ( 2.4%)
                         Kernel#respond_to?:   316,527 ( 2.2%)
                 ObjectSpace::WeakKeyMap#[]:   238,978 ( 1.7%)
                              TrueClass#===:   235,770 ( 1.7%)
                             FalseClass#===:   231,143 ( 1.6%)
                             Array#include?:   211,383 ( 1.5%)
                                 Hash#fetch:   204,702 ( 1.4%)
                        Kernel#block_given?:   181,793 ( 1.3%)
         ActiveSupport::OrderedOptions#_get:   181,272 ( 1.3%)
                                 Kernel#dup:   179,341 ( 1.3%)
                             BasicObject#!=:   175,996 ( 1.2%)
                                  Class#new:   168,079 ( 1.2%)
                            Kernel#kind_of?:   165,600 ( 1.2%)
                                  String#==:   157,734 ( 1.1%)
Top-20 not annotated C methods (71.1% of total 14,336,035):
                                    Hash#[]: 4,519,781 (31.5%)
                               Kernel#is_a?: 1,212,647 ( 8.5%)
                              Regexp#match?:   399,885 ( 2.8%)
                              String#empty?:   361,013 ( 2.5%)
                                  Hash#key?:   349,125 ( 2.4%)
                                   Hash#[]=:   344,348 ( 2.4%)
                         String#start_with?:   334,961 ( 2.3%)
                         Kernel#respond_to?:   316,527 ( 2.2%)
                 ObjectSpace::WeakKeyMap#[]:   238,978 ( 1.7%)
                              TrueClass#===:   235,770 ( 1.6%)
                             FalseClass#===:   231,143 ( 1.6%)
                             Array#include?:   211,383 ( 1.5%)
                                 Hash#fetch:   204,702 ( 1.4%)
                        Kernel#block_given?:   191,662 ( 1.3%)
         ActiveSupport::OrderedOptions#_get:   181,272 ( 1.3%)
                                 Kernel#dup:   179,348 ( 1.3%)
                             BasicObject#!=:   176,180 ( 1.2%)
                                  Class#new:   168,079 ( 1.2%)
                            Kernel#kind_of?:   165,634 ( 1.2%)
                                  String#==:   163,666 ( 1.1%)
Top-2 not optimized method types for send (100.0% of total 72,318):
  cfunc: 48,055 (66.4%)
   iseq: 24,263 (33.6%)
Top-6 not optimized method types for send_without_block (100.0% of total 4,536,895):
       iseq: 2,281,897 (50.3%)
    bmethod:   985,679 (21.7%)
  optimized:   952,914 (21.0%)
      alias:   310,745 ( 6.8%)
       null:     5,106 ( 0.1%)
      cfunc:       554 ( 0.0%)
Top-13 not optimized instructions (100.0% of total 4,293,123):
             invokesuper: 2,373,396 (55.3%)
             invokeblock:   811,891 (18.9%)
             sendforward:   505,449 (11.8%)
                  opt_eq:   451,754 (10.5%)
                opt_plus:    74,403 ( 1.7%)
               opt_minus:    36,227 ( 0.8%)
  opt_send_without_block:    21,792 ( 0.5%)
                 opt_neq:     7,231 ( 0.2%)
                opt_mult:     6,752 ( 0.2%)
                  opt_or:     3,753 ( 0.1%)
                  opt_lt:       348 ( 0.0%)
                  opt_ge:        91 ( 0.0%)
                  opt_gt:        36 ( 0.0%)
Top-9 send fallback reasons (100.0% of total 27,795,022):
                send_without_block_polymorphic: 9,505,835 (34.2%)
                              send_no_profiles: 5,894,763 (21.2%)
  send_without_block_not_optimized_method_type: 4,536,895 (16.3%)
                     not_optimized_instruction: 4,293,123 (15.4%)
                send_without_block_no_profiles: 3,466,407 (12.5%)
                send_not_optimized_method_type:    72,318 ( 0.3%)
       send_without_block_cfunc_array_variadic:    15,134 ( 0.1%)
                      obj_to_string_not_string:     9,918 ( 0.0%)
       send_without_block_direct_too_many_args:       629 ( 0.0%)
Top-9 unhandled YARV insns (100.0% of total 690,482):
         expandarray: 328,490 (47.6%)
        checkkeyword: 190,694 (27.6%)
    getclassvariable:  59,901 ( 8.7%)
  invokesuperforward:  49,503 ( 7.2%)
       getblockparam:  48,651 ( 7.0%)
   opt_duparray_send:  11,978 ( 1.7%)
         getconstant:     952 ( 0.1%)
          checkmatch:     290 ( 0.0%)
                once:      23 ( 0.0%)
Top-3 compile error reasons (100.0% of total 3,752,391):
  register_spill_on_alloc: 3,457,680 (92.1%)
  register_spill_on_ccall:   176,348 ( 4.7%)
        exception_handler:   118,363 ( 3.2%)
Top-14 side exit reasons (100.0% of total 10,852,021):
                        compile_error: 3,752,391 (34.6%)
                   guard_type_failure: 2,630,877 (24.2%)
                  guard_shape_failure: 1,917,208 (17.7%)
                  unhandled_yarv_insn:   690,482 ( 6.4%)
  block_param_proxy_not_iseq_or_ifunc:   535,784 ( 4.9%)
                      unhandled_kwarg:   421,989 ( 3.9%)
                           patchpoint:   369,799 ( 3.4%)
                unknown_newarray_send:   314,786 ( 2.9%)
                      unhandled_splat:   122,062 ( 1.1%)
                   unhandled_hir_insn:    76,394 ( 0.7%)
           block_param_proxy_modified:    19,193 ( 0.2%)
               obj_to_string_fallback:       566 ( 0.0%)
                            interrupt:       468 ( 0.0%)
               guard_type_not_failure:        22 ( 0.0%)
                             send_count: 66,989,407
                     dynamic_send_count: 27,795,022 (41.5%)
                   optimized_send_count: 39,194,385 (58.5%)
              iseq_optimized_send_count: 18,060,194 (27.0%)
      inline_cfunc_optimized_send_count:  6,960,130 (10.4%)
non_variadic_cfunc_optimized_send_count: 11,523,682 (17.2%)
    variadic_cfunc_optimized_send_count:  2,650,379 ( 4.0%)
dynamic_getivar_count:                        7,365,982
dynamic_setivar_count:                        7,245,929
compiled_iseq_count:                              4,795
failed_iseq_count:                                  449
compile_time:                                     846ms
profile_time:                                      12ms
gc_time:                                            9ms
invalidation_time:                                 61ms
vm_write_pc_count:                           64,326,442
vm_write_sp_count:                           62,982,524
vm_write_locals_count:                       62,982,524
vm_write_stack_count:                        62,982,524
vm_write_to_parent_iseq_local_count:            292,448
vm_read_from_parent_iseq_local_count:         6,471,353
code_region_bytes:                           22,708,224
side_exit_count:                             10,852,021
total_insn_count:                           517,550,288
vm_insn_count:                              162,946,459
zjit_insn_count:                            354,603,829
ratio_in_zjit:                                    68.5%
```

all stats after:

```
***ZJIT: Printing ZJIT statistics on exit***
Top-20 not inlined C methods (71.1% of total 15,575,343):
                                    Hash#[]: 4,519,778 (29.0%)
                               Kernel#is_a?: 1,030,758 ( 6.6%)
                                  String#<<:   851,931 ( 5.5%)
                                   Hash#[]=:   742,938 ( 4.8%)
                              Regexp#match?:   399,886 ( 2.6%)
                              String#empty?:   353,775 ( 2.3%)
                                  Hash#key?:   349,127 ( 2.2%)
                         String#start_with?:   334,961 ( 2.2%)
                         Kernel#respond_to?:   316,529 ( 2.0%)
                 ObjectSpace::WeakKeyMap#[]:   238,978 ( 1.5%)
                              TrueClass#===:   235,771 ( 1.5%)
                             FalseClass#===:   231,144 ( 1.5%)
                             Array#include?:   211,380 ( 1.4%)
                                 Hash#fetch:   204,701 ( 1.3%)
                        Kernel#block_given?:   181,792 ( 1.2%)
         ActiveSupport::OrderedOptions#_get:   181,272 ( 1.2%)
                                 Kernel#dup:   179,341 ( 1.2%)
                             BasicObject#!=:   175,997 ( 1.1%)
                                  Class#new:   168,079 ( 1.1%)
                            Kernel#kind_of?:   165,600 ( 1.1%)
Top-20 not annotated C methods (71.6% of total 15,737,486):
                                    Hash#[]: 4,519,788 (28.7%)
                               Kernel#is_a?: 1,212,649 ( 7.7%)
                                  String#<<:   851,931 ( 5.4%)
                                   Hash#[]=:   743,117 ( 4.7%)
                              Regexp#match?:   399,886 ( 2.5%)
                              String#empty?:   361,013 ( 2.3%)
                                  Hash#key?:   349,127 ( 2.2%)
                         String#start_with?:   334,961 ( 2.1%)
                         Kernel#respond_to?:   316,529 ( 2.0%)
                 ObjectSpace::WeakKeyMap#[]:   238,978 ( 1.5%)
                              TrueClass#===:   235,771 ( 1.5%)
                             FalseClass#===:   231,144 ( 1.5%)
                             Array#include?:   211,380 ( 1.3%)
                                 Hash#fetch:   204,701 ( 1.3%)
                        Kernel#block_given?:   191,661 ( 1.2%)
         ActiveSupport::OrderedOptions#_get:   181,272 ( 1.2%)
                                 Kernel#dup:   179,348 ( 1.1%)
                             BasicObject#!=:   176,181 ( 1.1%)
                                  Class#new:   168,079 ( 1.1%)
                            Kernel#kind_of?:   165,634 ( 1.1%)
Top-2 not optimized method types for send (100.0% of total 72,318):
  cfunc: 48,055 (66.4%)
   iseq: 24,263 (33.6%)
Top-6 not optimized method types for send_without_block (100.0% of total 4,523,650):
       iseq: 2,271,911 (50.2%)
    bmethod:   985,636 (21.8%)
  optimized:   949,696 (21.0%)
      alias:   310,747 ( 6.9%)
       null:     5,106 ( 0.1%)
      cfunc:       554 ( 0.0%)
Top-13 not optimized instructions (100.0% of total 4,293,126):
             invokesuper: 2,373,395 (55.3%)
             invokeblock:   811,894 (18.9%)
             sendforward:   505,449 (11.8%)
                  opt_eq:   451,754 (10.5%)
                opt_plus:    74,403 ( 1.7%)
               opt_minus:    36,228 ( 0.8%)
  opt_send_without_block:    21,792 ( 0.5%)
                 opt_neq:     7,231 ( 0.2%)
                opt_mult:     6,752 ( 0.2%)
                  opt_or:     3,753 ( 0.1%)
                  opt_lt:       348 ( 0.0%)
                  opt_ge:        91 ( 0.0%)
                  opt_gt:        36 ( 0.0%)
Top-9 send fallback reasons (100.0% of total 25,824,512):
                send_without_block_polymorphic: 9,721,725 (37.6%)
                              send_no_profiles: 5,894,761 (22.8%)
  send_without_block_not_optimized_method_type: 4,523,650 (17.5%)
                     not_optimized_instruction: 4,293,126 (16.6%)
                send_without_block_no_profiles: 1,293,404 ( 5.0%)
                send_not_optimized_method_type:    72,318 ( 0.3%)
       send_without_block_cfunc_array_variadic:    15,134 ( 0.1%)
                      obj_to_string_not_string:     9,765 ( 0.0%)
       send_without_block_direct_too_many_args:       629 ( 0.0%)
Top-9 unhandled YARV insns (100.0% of total 690,482):
         expandarray: 328,490 (47.6%)
        checkkeyword: 190,694 (27.6%)
    getclassvariable:  59,901 ( 8.7%)
  invokesuperforward:  49,503 ( 7.2%)
       getblockparam:  48,651 ( 7.0%)
   opt_duparray_send:  11,978 ( 1.7%)
         getconstant:     952 ( 0.1%)
          checkmatch:     290 ( 0.0%)
                once:      23 ( 0.0%)
Top-3 compile error reasons (100.0% of total 3,752,504):
  register_spill_on_alloc: 3,457,793 (92.1%)
  register_spill_on_ccall:   176,348 ( 4.7%)
        exception_handler:   118,363 ( 3.2%)
Top-14 side exit reasons (100.0% of total 10,860,754):
                        compile_error: 3,752,504 (34.6%)
                   guard_type_failure: 2,638,901 (24.3%)
                  guard_shape_failure: 1,917,198 (17.7%)
                  unhandled_yarv_insn:   690,482 ( 6.4%)
  block_param_proxy_not_iseq_or_ifunc:   535,785 ( 4.9%)
                      unhandled_kwarg:   421,947 ( 3.9%)
                           patchpoint:   370,447 ( 3.4%)
                unknown_newarray_send:   314,786 ( 2.9%)
                      unhandled_splat:   122,065 ( 1.1%)
                   unhandled_hir_insn:    76,395 ( 0.7%)
           block_param_proxy_modified:    19,193 ( 0.2%)
               obj_to_string_fallback:       566 ( 0.0%)
                            interrupt:       463 ( 0.0%)
               guard_type_not_failure:        22 ( 0.0%)
                             send_count: 66,945,926
                     dynamic_send_count: 25,824,512 (38.6%)
                   optimized_send_count: 41,121,414 (61.4%)
              iseq_optimized_send_count: 18,587,430 (27.8%)
      inline_cfunc_optimized_send_count:  6,958,641 (10.4%)
non_variadic_cfunc_optimized_send_count: 12,911,166 (19.3%)
    variadic_cfunc_optimized_send_count:  2,664,177 ( 4.0%)
dynamic_getivar_count:                        7,365,985
dynamic_setivar_count:                        7,245,942
compiled_iseq_count:                              4,794
failed_iseq_count:                                  450
compile_time:                                     852ms
profile_time:                                      13ms
gc_time:                                           11ms
invalidation_time:                                 63ms
vm_write_pc_count:                           64,284,194
vm_write_sp_count:                           62,940,427
vm_write_locals_count:                       62,940,427
vm_write_stack_count:                        62,940,427
vm_write_to_parent_iseq_local_count:            292,447
vm_read_from_parent_iseq_local_count:         6,470,931
code_region_bytes:                           23,019,520
side_exit_count:                             10,860,754
total_insn_count:                           517,576,267
vm_insn_count:                              163,188,187
zjit_insn_count:                            354,388,080
ratio_in_zjit:                                    68.5%
```
2025-10-14 19:09:53 +00:00
Aiden Fox Ivey
d7f2a1ec9a
ZJIT: Profile opt_aref (#14778)
* ZJIT: Profile opt_aref

* ZJIT: Add test for opt_aref

* ZJIT: Move test and add hash opt test

* ZJIT: Update zjit bindgen

* ZJIT: Add inspect calls to opt_aref tests
2025-10-09 09:42:09 -07:00
Satoshi Tagomori
4f47327287 Update current namespace management by using control frames and lexical contexts
to fix inconsistent and incorrect detection of the current namespace.

This includes:
* Moving load_path and related things from rb_vm_t to rb_namespace_t to simplify
  accessing those values via namespace (instead of accessing either vm or ns)
* Initializing root_namespace earlier and consolidating builtin_namespace into root_namespace
* Adding VM_FRAME_FLAG_NS_REQUIRE for checkpoints to detect a namespace to load/require files
* Removing implicit refinements in the root namespace which was used to determine
  the namespace to be loaded (replaced by VM_FRAME_FLAG_NS_REQUIRE)
* Removing namespaces from rb_proc_t because its namespace can be identified by lexical context
* Starting to use ep[VM_ENV_DATA_INDEX_SPECVAL] to store the current namespace when
  the frame type is MAGIC_TOP or MAGIC_CLASS (block handlers don't exist in this case)
2025-09-29 01:15:38 +09:00
André Luiz Tiago Soares
f7fe43610b
ZJIT: Optimize ObjToString with type guards (#14469)
* failing test for ObjToString optimization with GuardType

* profile ObjToString receiver and rewrite with guard

* adjust integration tests for objtostring type guard optimization

* Implement new GuardTypeNot HIR; objtostring sends to_s directly on profiled nonstrings

* codegen for GuardTypeNot

* typo fixes

* better name for tests; fix side exit reason for GuardTypeNot

* revert accidental change

* make bindgen

* Fix is_string to identify subclasses of String; fix codegen for identifying if val is String
2025-09-09 08:58:03 -07:00
Takashi Kokubun
799d57bd01 insns.def: Drop unused leafness_of_check_ints
It was used to let MJIT override the leafness of the instruction when it
decides to remove check_ints for it. Now that MJIT is gone, nobody needs
to "override" the leafness using this.
2025-09-05 09:15:00 -07:00
Max Bernstein
b6f4b5399d
ZJIT: Specialize monomorphic GetIvar (#14388)
Specialize monomorphic `GetIvar` into:

* `GuardType(HeapObject)`
* `GuardShape`
* `LoadIvarEmbedded` or `LoadIvarExtended`

This requires profiling self for `getinstancevariable` (it's not on the operand
stack).

This also optimizes `GetIvar`s that happen as a result of inlining
`attr_reader` and `attr_accessor`.

Also move some (newly) shared JIT helpers into jit.c.
2025-08-29 12:46:08 -04:00
Max Bernstein
4652879f43
ZJIT: Specialize some Sends (#14363)
* ZJIT: Profile and specialize Array#empty?
* ZJIT: Specialize BasicObject#==
* ZJIT: Specialize Hash#empty?
* ZJIT: Specialize BasicObject#!

Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com>
2025-08-27 19:03:32 +00:00
Aaron Patterson
fb6e3a8000 Remove opt_aref_with and opt_aset_with
When these instructions were introduced it was common to read from a
hash with mutable string literals.  However, these days, I think these
instructions are fairly rare.

I tested this with the lobsters benchmark, and saw no difference in
speed.  In order to be sure, I tracked down every use of this
instruction in the lobsters benchmark, and there were only 4 places
where it was used.

Additionally, this patch fixes a case where "chilled strings" should
emit a warning but they don't.

```ruby
class Foo
  def self.[](x)= x.gsub!(/hello/, "hi")
end

Foo["hello world"]
```

Removing these instructions shows this warning:

```
> ./miniruby -vw test.rb
ruby 3.5.0dev (2025-08-25T21:36:50Z rm-opt_aref_with dca08e286c) +PRISM [arm64-darwin24]
test.rb:2: warning: literal string will be frozen in the future (run with --debug-frozen-string-literal for more information)
```

[Feature #21553]
2025-08-26 13:02:17 -07:00
Stan Lo
10b582dab6 ZJIT: Profile opt_and and opt_or instructions 2025-07-09 17:50:41 -04:00
Stan Lo
79915e6f78 ZJIT: Profile nil? calls
This allows ZJIT to profile `nil?` calls and create type guards for
its receiver.

- Add `zjit_profile` to `opt_nil_p` insn
- Start profiling `opt_nil_p` calls
- Use `runtime_exact_ruby_class` instead of `exact_ruby_class` to determine
  the profiled receiver class
2025-07-08 15:51:43 -04:00
Aaron Patterson
55c9c75b47 Maintain same behavior regardless of tracepoint state
Always use opt_new behavior regardless of tracepoint state.
2025-05-15 14:19:48 -07:00
Max Bernstein
b42c8398ba Don't support blockarg in opt_new
We don't calculate the correct argc so the bookkeeping slot is something
else (unexpected) instead of Qnil (expected).
2025-04-29 09:13:25 -07:00
Aaron Patterson
ec3b48d3da Deopt if iseq trace events are enabled 2025-04-25 13:46:05 -07:00
Aaron Patterson
8ac8225c50 Inline Class#new.
This commit inlines instructions for Class#new.  To make this work, we
added a new YARV instruction, `opt_new`.  `opt_new` checks whether or
not the `new` method is the default allocator method.  If it is, it
allocates the object, and pushes the instance on the stack.  If not, the
instruction jumps to the "slow path" method call instructions.
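
As a rough illustration only, the fast-path/slow-path decision described above can be modeled in plain Ruby. `sketch_opt_new`, `Point`, and `Cached` are hypothetical names made up for this sketch; the real check happens inside the VM and, unlike this model, does not build an intermediate keyword hash.

```ruby
# Hypothetical helper: models the decision `opt_new` makes, in plain Ruby.
def sketch_opt_new(klass, *args, **kwargs, &block)
  if klass.method(:new).owner == Class
    # Fast path: `new` is the default allocator, so allocate the object
    # and call `initialize` on it directly.
    obj = klass.allocate
    obj.send(:initialize, *args, **kwargs, &block)
    obj
  else
    # Slow path: `new` was overridden, so make an ordinary method call.
    klass.new(*args, **kwargs, &block)
  end
end

class Point
  attr_reader :x, :y

  def initialize(x:, y:)
    @x = x
    @y = y
  end
end

class Cached
  # Overriding `new` forces the slow path in this model.
  def self.new(*)
    :from_custom_new
  end
end

point = sketch_opt_new(Point, x: 1, y: 2)
p [point.x, point.y]
p sketch_opt_new(Cached)
```

`Point` takes the fast path because its `new` is still owned by `Class`; `Cached` defines its own `new`, so the sketch falls back to a regular call.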

Old instructions:

```
> ruby --dump=insns -e'Object.new'
== disasm: #<ISeq:<main>@-e:1 (1,0)-(1,10)>
0000 opt_getconstant_path                   <ic:0 Object>             (   1)[Li]
0002 opt_send_without_block                 <calldata!mid:new, argc:0, ARGS_SIMPLE>
0004 leave
```

New instructions:

```
> ./miniruby --dump=insns -e'Object.new'
== disasm: #<ISeq:<main>@-e:1 (1,0)-(1,10)>
0000 opt_getconstant_path                   <ic:0 Object>             (   1)[Li]
0002 putnil
0003 swap
0004 opt_new                                <calldata!mid:new, argc:0, ARGS_SIMPLE>, 11
0007 opt_send_without_block                 <calldata!mid:initialize, argc:0, FCALL|ARGS_SIMPLE>
0009 jump                                   14
0011 opt_send_without_block                 <calldata!mid:new, argc:0, ARGS_SIMPLE>
0013 swap
0014 pop
0015 leave
```

This commit speeds up basic object allocation (`Foo.new`) by 60%, but
classes that take keyword parameters see an even bigger benefit because
no hash is allocated when instantiating the object (3x to 6x faster).

Here is an example that uses `Hash.new(capacity: 0)`:

```
> hyperfine "ruby --disable-gems -e'i = 0; while i < 10_000_000; Hash.new(capacity: 0); i += 1; end'" "./ruby --disable-gems -e'i = 0; while i < 10_000_000; Hash.new(capacity: 0); i += 1; end'"
Benchmark 1: ruby --disable-gems -e'i = 0; while i < 10_000_000; Hash.new(capacity: 0); i += 1; end'
  Time (mean ± σ):      1.082 s ±  0.004 s    [User: 1.074 s, System: 0.008 s]
  Range (min … max):    1.076 s …  1.088 s    10 runs

Benchmark 2: ./ruby --disable-gems -e'i = 0; while i < 10_000_000; Hash.new(capacity: 0); i += 1; end'
  Time (mean ± σ):     627.9 ms ±   3.5 ms    [User: 622.7 ms, System: 4.8 ms]
  Range (min … max):   622.7 ms … 633.2 ms    10 runs

Summary
  ./ruby --disable-gems -e'i = 0; while i < 10_000_000; Hash.new(capacity: 0); i += 1; end' ran
    1.72 ± 0.01 times faster than ruby --disable-gems -e'i = 0; while i < 10_000_000; Hash.new(capacity: 0); i += 1; end'
```

This commit changes the backtrace for `initialize`:

```
aaron@tc ~/g/ruby (inline-new)> cat test.rb
class Foo
  def initialize
    puts caller
  end
end

def hello
  Foo.new
end

hello
aaron@tc ~/g/ruby (inline-new)> ruby -v test.rb
ruby 3.4.2 (2025-02-15 revision d2930f8e7a) +PRISM [arm64-darwin24]
test.rb:8:in 'Class#new'
test.rb:8:in 'Object#hello'
test.rb:11:in '<main>'
aaron@tc ~/g/ruby (inline-new)> ./miniruby -v test.rb
ruby 3.5.0dev (2025-03-28T23:59:40Z inline-new c4157884e4) +PRISM [arm64-darwin24]
test.rb:8:in 'Object#hello'
test.rb:11:in '<main>'
```

It also increases memory usage for calls to `new` by 112 bytes:

```
aaron@tc ~/g/ruby (inline-new)> cat test.rb
require "objspace"

class Foo
  def initialize
    puts caller
  end
end

def hello
  Foo.new
end

puts ObjectSpace.memsize_of(RubyVM::InstructionSequence.of(method(:hello)))
aaron@tc ~/g/ruby (inline-new)> make runruby
RUBY_ON_BUG='gdb -x ./.gdbinit -p' ./miniruby -I./lib -I. -I.ext/common  ./tool/runruby.rb --extout=.ext  -- --disable-gems  ./test.rb
656
aaron@tc ~/g/ruby (inline-new)> ruby -v test.rb
ruby 3.4.2 (2025-02-15 revision d2930f8e7a) +PRISM [arm64-darwin24]
544
```

Thanks to @ko1 for coming up with this idea!

Co-Authored-By: John Hawthorn <john@hawthorn.email>
2025-04-25 13:46:05 -07:00
Alan Wu
2cf769376b Add profiling for opt_send_without_block
Split out from the CCall changes since we discussed during pairing that
this is useful to unblock some other changes. No tests since no one
consumes this profiling data yet.
2025-04-18 21:53:01 +09:00
Takashi Kokubun
8b2a4625cb Profile instructions for fixnum arithmetic (https://github.com/Shopify/zjit/pull/24)
* Profile instructions for fixnum arithmetic

* Drop PartialEq from Type

* Do not push PatchPoint onto the stack

* Avoid pushing the output of the guards

* Pop operands after guards

* Test HIR from profiled runs

* Implement Display for new instructions

* Drop unused FIXNUM_BITS

* Use a Rust function to split lines

* Use Display for GuardType operands

Co-authored-by: Max Bernstein <max@bernsteinbear.com>

* Fix tests with Display-ed values

---------

Co-authored-by: Max Bernstein <max@bernsteinbear.com>
2025-04-18 21:52:59 +09:00
Takashi Kokubun
0a543daf15 Add zjit_* instructions to profile the interpreter (https://github.com/Shopify/zjit/pull/16)
* Add zjit_* instructions to profile the interpreter

* Rename FixnumPlus to FixnumAdd

* Update a comment about Invalidate

* Rename Guard to GuardType

* Rename Invalidate to PatchPoint

* Drop unneeded debug!()

* Plan on profiling the types

* Use the output of GuardType as type refined outputs
2025-04-18 21:52:59 +09:00
Nobuyoshi Nakada
c9d433947e
Adjust style [ci skip] 2025-03-18 16:22:34 +09:00
Aaron Patterson
ae7890df33 Use the EC parameter in instructions.
The forwarding instructions should use the `ec` parameter passed to
vm_exec_core instead of trying to look up the EC via `GET_EC()`.  It's
cheaper to get the local than to try looking up a global
2025-03-13 15:21:37 -07:00
Randy Stauner
1dd40ec18a
Optimize instructions when creating an array just to call include? (#12123)
* Add opt_duparray_send insn to skip the allocation on `#include?`

If the method isn't going to modify the array we don't need to copy it.
This avoids the allocation / array copy for things like `[:a, :b].include?(x)`.

This adds a BOP for include? and tracks redefinition for it on Array.

Co-authored-by: Andrew Novoselac <andrew.novoselac@shopify.com>

* YJIT: Implement opt_duparray_send include_p

Co-authored-by: Andrew Novoselac <andrew.novoselac@shopify.com>

* Update opt_newarray_send to support simple forms of include?(arg)

Similar to opt_duparray_send but for non-static arrays.

* YJIT: Implement opt_newarray_send include_p

---------

Co-authored-by: Andrew Novoselac <andrew.novoselac@shopify.com>
2024-11-26 14:31:08 -05:00
Étienne Barrié
bf9879791a Optimized instruction for Hash#freeze
If a Hash which is empty or only using literals is frozen, we detect
this as a peephole optimization and change the instructions to be
`opt_hash_freeze`.
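
A small script for observing the effect, assuming nothing beyond core Ruby. `empty_frozen_hash` is a name chosen for this sketch. Whether the two results are the same object depends on whether the running Ruby uses the optimized instruction, so the script prints that rather than relying on it.

```ruby
# Hypothetical helper wrapping the literal-plus-freeze pattern the
# peephole optimization looks for.
def empty_frozen_hash
  {}.freeze
end

first = empty_frozen_hash
second = empty_frozen_hash

# Always true: the result is frozen either way.
p first.frozen?
# Version-dependent: may be true when no new Hash is allocated per call.
p first.equal?(second)
```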

[Feature #20684]

Co-authored-by: Jean Boussier <byroot@ruby-lang.org>
2024-09-05 12:46:02 +02:00
Étienne Barrié
a99707cd9c Optimized instruction for Array#freeze
If an Array which is empty or only using literals is frozen, we detect
this as a peephole optimization and change the instructions to be
`opt_ary_freeze`.

[Feature #20684]

Co-authored-by: Jean Boussier <byroot@ruby-lang.org>
2024-09-05 12:46:02 +02:00
Alan Wu
525008cd78
Delete newarraykwsplat
The pushtoarraykwsplat instruction was designed to replace newarraykwsplat,
and we now meet the condition for deletion mentioned in
77c1233f79a0f96a081b70da533fbbde4f3037fa.
2024-08-13 20:56:35 +00:00
Randy Stauner
acbb8d4fb5 Expand opt_newarray_send to support Array#pack with buffer keyword arg
Use an enum for the method arg instead of needing to add an id
that doesn't map to an actual method name.

$ ruby --dump=insns -e 'b = "x"; [v].pack("E*", buffer: b)'

before:

```
== disasm: #<ISeq:<main>@-e:1 (1,0)-(1,34)>
local table (size: 1, argc: 0 [opts: 0, rest: -1, post: 0, block: -1, kw: -1@-1, kwrest: -1])
[ 1] b@0
0000 putchilledstring                       "x"                       (   1)[Li]
0002 setlocal_WC_0                          b@0
0004 putself
0005 opt_send_without_block                 <calldata!mid:v, argc:0, FCALL|VCALL|ARGS_SIMPLE>
0007 newarray                               1
0009 putchilledstring                       "E*"
0011 getlocal_WC_0                          b@0
0013 opt_send_without_block                 <calldata!mid:pack, argc:2, kw:[#<Symbol:0x000000000023110c>], KWARG>
0015 leave
```

after:

```
== disasm: #<ISeq:<main>@-e:1 (1,0)-(1,34)>
local table (size: 1, argc: 0 [opts: 0, rest: -1, post: 0, block: -1, kw: -1@-1, kwrest: -1])
[ 1] b@0
0000 putchilledstring                       "x"                       (   1)[Li]
0002 setlocal_WC_0                          b@0
0004 putself
0005 opt_send_without_block                 <calldata!mid:v, argc:0, FCALL|VCALL|ARGS_SIMPLE>
0007 putchilledstring                       "E*"
0009 getlocal                               b@0, 0
0012 opt_newarray_send                      3, 5
0015 leave
```
2024-07-29 16:26:58 -04:00
Aaron Patterson
a661c82972 Refactor so we don't have _cd
This should make the diff more clean
2024-06-18 09:28:25 -07:00
Aaron Patterson
cc97a27008 Add two new instructions for forwarding calls
This commit adds `sendforward` and `invokesuperforward` for forwarding
parameters to calls

Co-authored-by: Matt Valentine-House <matt@eightbitraptor.com>
2024-06-18 09:28:25 -07:00
Aaron Patterson
cdf33ed5f3 Optimized forwarding callers and callees
This patch optimizes forwarding callers and callees. It only optimizes methods that only take `...` as their parameter, and then pass `...` to other calls.

Calls it optimizes look like this:

```ruby
def bar(a) = a
def foo(...) = bar(...) # optimized
foo(123)
```

```ruby
def bar(a) = a
def foo(...) = bar(1, 2, ...) # optimized
foo(123)
```

```ruby
def bar(*a) = a

def foo(...)
  list = [1, 2]
  bar(*list, ...) # optimized
end
foo(123)
```

All variants of the above but using `super` are also optimized, including a bare super like this:

```ruby
def foo(...)
  super
end
```

This patch eliminates intermediate allocations made when calling methods that accept `...`.
We can observe allocation elimination like this:

```ruby
def m
  x = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - x
end

def bar(a) = a
def foo(...) = bar(...)

def test
  m { foo(123) }
end

test
p test # allocates 1 object on master, but 0 objects with this patch
```

```ruby
def bar(a, b:) = a + b
def foo(...) = bar(...)

def test
  m { foo(1, b: 2) }
end

test
p test # allocates 2 objects on master, but 0 objects with this patch
```

How does it work?
-----------------

This patch works by using a dynamic stack size when passing forwarded parameters to callees.
The caller's info object (known as the "CI") contains the stack size of the
parameters, so we pass the CI object itself as a parameter to the callee.
When forwarding parameters, the forwarding ISeq uses the caller's CI to determine how much stack to copy, then copies the caller's stack before calling the callee.
The CI at the forwarded call site is adjusted using information from the caller's CI.

I think this description is kind of confusing, so let's walk through an example with code.

```ruby
def delegatee(a, b) = a + b

def delegator(...)
  delegatee(...)  # CI2 (FORWARDING)
end

def caller
  delegator(1, 2) # CI1 (argc: 2)
end
```

Before we call the delegator method, the stack looks like this:

```
Executing Line | Code                                  | Stack
---------------+---------------------------------------+--------
              1| def delegatee(a, b) = a + b           | self
              2|                                       | 1
              3| def delegator(...)                    | 2
              4|   #                                   |
              5|   delegatee(...)  # CI2 (FORWARDING)  |
              6| end                                   |
              7|                                       |
              8| def caller                            |
          ->  9|   delegator(1, 2) # CI1 (argc: 2)     |
             10| end                                   |
```

The ISeq for `delegator` is tagged as "forwardable", so when `caller` calls in
to `delegator`, it writes `CI1` on to the stack as a local variable for the
`delegator` method.  The `delegator` method has a special local called `...`
that holds the caller's CI object.

Here is the ISeq disasm for `delegator`:

```
== disasm: #<ISeq:delegator@-e:1 (1,0)-(1,39)>
local table (size: 1, argc: 0 [opts: 0, rest: -1, post: 0, block: -1, kw: -1@-1, kwrest: -1])
[ 1] "..."@0
0000 putself                                                          (   1)[LiCa]
0001 getlocal_WC_0                          "..."@0
0003 send                                   <calldata!mid:delegatee, argc:0, FCALL|FORWARDING>, nil
0006 leave                                  [Re]
```

The local called `...` will contain the caller's CI: CI1.

Here is the stack when we enter `delegator`:

```
Executing Line | Code                                  | Stack
---------------+---------------------------------------+--------
              1| def delegatee(a, b) = a + b           | self
              2|                                       | 1
              3| def delegator(...)                    | 2
           -> 4|   #                                   | CI1 (argc: 2)
              5|   delegatee(...)  # CI2 (FORWARDING)  | cref_or_me
              6| end                                   | specval
              7|                                       | type
              8| def caller                            |
              9|   delegator(1, 2) # CI1 (argc: 2)     |
             10| end                                   |
```

The CI at `delegatee` on line 5 is tagged as "FORWARDING", so it knows to
memcopy the caller's stack before calling `delegatee`.  In this case, it will
memcopy self, 1, and 2 to the stack before calling `delegatee`.  It knows how much
memory to copy from the caller because `CI1` contains stack size information
(argc: 2).
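
The copy step can be mimicked with a toy model in plain Ruby. This is only a sketch: arrays stand in for the VM stack, and `CallInfo` and `forwarding_send` are hypothetical names invented here, not VM structures or functions.

```ruby
# Toy model only: plain Ruby arrays stand in for the VM stack.
CallInfo = Struct.new(:mid, :argc, :forwarding)

# Stack of `caller` just before `delegator(1, 2)`: receiver, then arguments.
caller_stack = [:self, 1, 2]
ci1 = CallInfo.new(:delegator, 2, false) # call site in `caller`
ci2 = CallInfo.new(:delegatee, 0, true)  # call site in `delegator`

# On entry, `delegator` holds CI1 in its `...` local, followed by the
# frame bookkeeping slots shown in the diagrams.
delegator_stack = caller_stack + [ci1, :cref_or_me, :specval, :type]

def forwarding_send(stack, caller_stack, call_site, forwarded_ci)
  raise ArgumentError, "not a forwarding call site" unless call_site.forwarding

  # Receiver plus `argc` arguments, as recorded in the caller's CI.
  slots = forwarded_ci.argc + 1
  copied = caller_stack.last(slots)

  # The effective call combines the call site's own argc with the caller's.
  effective = CallInfo.new(call_site.mid, call_site.argc + forwarded_ci.argc, false)
  [stack + copied, effective]
end

new_stack, effective_ci = forwarding_send(delegator_stack, caller_stack, ci2, ci1)
p new_stack.last(3)
p effective_ci.argc
```

The last three slots of `new_stack` are the copied `self`, `1`, and `2`, and the effective argc is 2, taken from CI1.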

Before executing the `send` instruction, we push `...` on the stack.  The
`send` instruction pops `...`, and because it is tagged with `FORWARDING`, it
knows to memcopy (using the information in the CI it just popped):

```
== disasm: #<ISeq:delegator@-e:1 (1,0)-(1,39)>
local table (size: 1, argc: 0 [opts: 0, rest: -1, post: 0, block: -1, kw: -1@-1, kwrest: -1])
[ 1] "..."@0
0000 putself                                                          (   1)[LiCa]
0001 getlocal_WC_0                          "..."@0
0003 send                                   <calldata!mid:delegatee, argc:0, FCALL|FORWARDING>, nil
0006 leave                                  [Re]
```

Instruction 0001 puts the caller's CI on the stack.  `send` is tagged with
FORWARDING, so it reads the CI and _copies_ the caller's stack to this stack:

```
Executing Line | Code                                  | Stack
---------------+---------------------------------------+--------
              1| def delegatee(a, b) = a + b           | self
              2|                                       | 1
              3| def delegator(...)                    | 2
              4|   #                                   | CI1 (argc: 2)
           -> 5|   delegatee(...)  # CI2 (FORWARDING)  | cref_or_me
              6| end                                   | specval
              7|                                       | type
              8| def caller                            | self
              9|   delegator(1, 2) # CI1 (argc: 2)     | 1
             10| end                                   | 2
```

The "FORWARDING" call site combines information from CI1 with CI2 in order
to support passing other values in addition to the `...` value, as well as
to perfectly forward splat args, kwargs, etc.

Since we're able to copy the stack from `caller` in to `delegator`'s stack, we
can avoid allocating objects.

I want to do this to eliminate object allocations for delegate methods.
My long term goal is to implement `Class#new` in Ruby and it uses `...`.

I was able to implement `Class#new` in Ruby
[here](https://github.com/ruby/ruby/pull/9289).
If we adopt the technique in this patch, then we can optimize allocating
objects that take keyword parameters for `initialize`.

For example, this code will allocate 2 objects: one for `SomeObject`, and one
for the kwargs:

```ruby
SomeObject.new(foo: 1)
```

If we combine this technique, plus implement `Class#new` in Ruby, then we can
reduce allocations for this common operation.

Co-Authored-By: John Hawthorn <john@hawthorn.email>
Co-Authored-By: Alan Wu <XrXr@users.noreply.github.com>
2024-06-18 09:28:25 -07:00
Jean Boussier
730e3b2ce0 Stop exposing rb_str_chilled_p
[Feature #20205]

Now that chilled strings no longer appear as frozen, there is no
need to offer an API to check for chilled strings.

We however need to change `rb_check_frozen_internal` to no
longer be a macro, as it needs to check for chilled strings.
2024-06-02 13:53:35 +02:00
Nobuyoshi Nakada
49fcd33e13 Introduce a specialize instruction for Array#pack
Instructions for this code:

```ruby
# frozen_string_literal: true

[a].pack("C")
```

Before this commit:

```
== disasm: #<ISeq:<main>@test.rb:1 (1,0)-(3,13)>
0000 putself                                                          (   3)[Li]
0001 opt_send_without_block                 <calldata!mid:a, argc:0, FCALL|VCALL|ARGS_SIMPLE>
0003 newarray                               1
0005 putobject                              "C"
0007 opt_send_without_block                 <calldata!mid:pack, argc:1, ARGS_SIMPLE>
0009 leave
```

After this commit:

```
== disasm: #<ISeq:<main>@test.rb:1 (1,0)-(3,13)>
0000 putself                                                          (   3)[Li]
0001 opt_send_without_block                 <calldata!mid:a, argc:0, FCALL|VCALL|ARGS_SIMPLE>
0003 putobject                              "C"
0005 opt_newarray_send                      2, :pack
0008 leave
```

Co-authored-by: Maxime Chevalier-Boisvert <maxime.chevalierboisvert@shopify.com>
Co-authored-by: Aaron Patterson <tenderlove@ruby-lang.org>
2024-05-23 12:11:50 -07:00
Étienne Barrié
12be40ae6b Implement chilled strings
[Feature #20205]

As a path toward enabling frozen string literals by default in the future,
this commit introduces "chilled strings". From a user perspective, chilled
strings pretend to be frozen, but on the first attempt to mutate them,
they lose their frozen status and emit a warning rather than raising a
`FrozenError`.
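
A minimal script for trying this out, assuming a file with no `frozen_string_literal` magic comment. Whether a warning is printed depends on the Ruby version and on the `deprecated` warning category being enabled, so no warning text is shown here.

```ruby
# Chilled-string warnings are assumed to be in the deprecated category,
# so enable that category explicitly.
Warning[:deprecated] = true

# A plain literal: chilled on Rubies that implement this feature.
s = "hello"

# Mutating it succeeds; on Rubies with chilled strings this is the point
# where the warning is emitted instead of a FrozenError being raised.
s << " world"

puts s
```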

Implementation-wise, `rb_compile_option_struct.frozen_string_literal` is
no longer a boolean but a tri-state of `enabled/disabled/unset`.

When code is compiled with frozen string literals neither explicitly enabled
nor disabled, string literals are compiled with a new `putchilledstring`
instruction. This instruction is identical to `putstring` except it marks
the String with the `STR_CHILLED (FL_USER3)` and `FL_FREEZE` flags.

Chilled strings have the `FL_FREEZE` flag so as to minimize the need to check
for chilled strings across the codebase, and to improve compatibility with
C extensions.

Notes:
  - `String#freeze`: clears the chilled flag.
  - `String#-@`: acts as if the string was mutable.
  - `String#+@`: acts as if the string was mutable.
  - `String#clone`: copies the chilled flag.

Co-authored-by: Jean Boussier <byroot@ruby-lang.org>
2024-03-19 09:26:49 +01:00
Jeremy Evans
77c1233f79 Add pushtoarraykwsplat instruction to avoid unnecessary array allocation
This is designed to replace the newarraykwsplat instruction, which is
no longer used in the parse.y compiler after this commit.  This avoids
an unnecessary array allocation in the case where ARGSCAT is followed
by LIST with keyword:

```ruby
a = []
kw = {}
[*a, 1, **kw]
```

Previous Instructions:

```
0000 newarray                               0                         (   1)[Li]
0002 setlocal_WC_0                          a@0
0004 newhash                                0                         (   2)[Li]
0006 setlocal_WC_0                          kw@1
0008 getlocal_WC_0                          a@0                       (   3)[Li]
0010 splatarray                             true
0012 putobject_INT2FIX_1_
0013 putspecialobject                       1
0015 newhash                                0
0017 getlocal_WC_0                          kw@1
0019 opt_send_without_block                 <calldata!mid:core#hash_merge_kwd, argc:2, ARGS_SIMPLE>
0021 newarraykwsplat                        2
0023 concattoarray
0024 leave
```

New Instructions:

```
0000 newarray                               0                         (   1)[Li]
0002 setlocal_WC_0                          a@0
0004 newhash                                0                         (   2)[Li]
0006 setlocal_WC_0                          kw@1
0008 getlocal_WC_0                          a@0                       (   3)[Li]
0010 splatarray                             true
0012 putobject_INT2FIX_1_
0013 pushtoarray                            1
0015 putspecialobject                       1
0017 newhash                                0
0019 getlocal_WC_0                          kw@1
0021 opt_send_without_block                 <calldata!mid:core#hash_merge_kwd, argc:2, ARGS_SIMPLE>
0023 pushtoarraykwsplat
0024 leave
```

pushtoarraykwsplat is designed to be simpler than newarraykwsplat.
It does not take a variable number of arguments from the stack, it
pops the top of the stack, and appends it to the second from the top,
unless the top of the stack is an empty hash.
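
Modeled in plain Ruby, with an array standing in for the VM stack, the behavior described above looks roughly like this. `sketch_pushtoarraykwsplat` is a hypothetical name for this sketch, not the VM implementation.

```ruby
# Toy model: the last element of `stack` is the top of the stack.
def sketch_pushtoarraykwsplat(stack)
  hash = stack.pop                        # pop the keyword-splat hash
  stack.last.push(hash) unless hash.empty? # append unless it is empty
  stack
end

with_empty = sketch_pushtoarraykwsplat([[1], {}])
with_pairs = sketch_pushtoarraykwsplat([[1], { b: 2 }])

p with_empty
p with_pairs
```

An empty hash leaves the array untouched, while a non-empty hash becomes the array's last element.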

During this work, I found the ARGSPUSH followed by HASH with keyword
did not compile correctly, as it pushed the generated hash to the
array even if the hash was empty.  This fixes the behavior, to use
pushtoarraykwsplat instead of pushtoarray in that case:

```ruby
a = []
kw = {}
[*a, **kw]

[{}] # Before

[] # After
```

This does not remove the newarraykwsplat instruction, as it is still
referenced in the prism compiler (which should be updated similar
to this), YJIT (only in the bindings, it does not appear to be
implemented), and RJIT (in a couple comments).  After those are
updated, the newarraykwsplat instruction should be removed.
2024-02-20 10:47:44 -08:00
Alan Wu
e878bbd641 Allow foo(**nil, &block_arg)
Previously, `**nil` by itself worked, but if you add a block argument,
it raised a conversion error. The presence of the block argument
shouldn't change how keyword splat works.

See: <https://bugs.ruby-lang.org/issues/20064>
2024-02-12 13:02:50 -05:00
Nobuyoshi Nakada
361b3efe16
Use UNDEF_P 2024-01-30 14:48:59 +09:00
Jeremy Evans
b8516d6d01 Add pushtoarray VM instruction
This instruction is similar to concattoarray, but it takes the
number of arguments to push to the array, removes that number
of arguments from the stack, and adds them to the array now at
the top of the stack.
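
As a sketch of those semantics in plain Ruby, with an array standing in for the VM stack (`sketch_pushtoarray` is a hypothetical name, not the VM code):

```ruby
# Toy model: the last element of `stack` is the top of the stack.
def sketch_pushtoarray(stack, num)
  values = stack.pop(num)  # remove `num` arguments from the stack
  stack.last.push(*values) # append them to the array now on top
  stack
end

splatted = [10, 20]              # stands in for the array left by splatarray
stack = [:self, splatted, 1]     # state before `pushtoarray 1` in f(*a, 1)
sketch_pushtoarray(stack, 1)

p stack
p stack.last.equal?(splatted)
```

The array is extended in place, so no second array is created for the trailing argument.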

This allows `f(*a, 1)` to allocate only a single array on the
caller side (which can be reused on the callee side in the case of
`def f(*a)`). Prior to this commit, `f(*a, 1)` would generate
3 arrays:

* a dupped by splatarray true
* 1 wrapped in array by newarray
* a dupped again by concatarray

Instructions Before for `a = []; f(*a, 1)`:

```
0000 newarray                               0                         (   1)[Li]
0002 setlocal_WC_0                          a@0
0004 putself
0005 getlocal_WC_0                          a@0
0007 splatarray                             true
0009 putobject_INT2FIX_1_
0010 newarray                               1
0012 concatarray
0013 opt_send_without_block                 <calldata!mid:f, argc:1, ARGS_SPLAT|FCALL>
0015 leave
```

Instructions After for `a = []; f(*a, 1)`:

```
0000 newarray                               0                         (   1)[Li]
0002 setlocal_WC_0                          a@0
0004 putself
0005 getlocal_WC_0                          a@0
0007 splatarray                             true
0009 putobject_INT2FIX_1_
0010 pushtoarray                            1
0012 opt_send_without_block                 <calldata!mid:f, argc:1, ARGS_SPLAT|ARGS_SPLAT_MUT|FCALL>
0014 leave
```

With these changes, method calls to Ruby methods should
implicitly allocate at most one array.

Ignore typeprof bundled gem failure due to unrecognized instruction.
2024-01-24 18:25:55 -08:00
Jeremy Evans
6e06d0d180 Add concattoarray VM instruction
This instruction is similar to concatarray, but assumes the first
object is already an array, and appends to it directly.  This is
different than concatarray, which will create a new array instead
of appending to an existing array.

Additionally, for both concatarray and concattoarray, if the second
argument cannot be converted to an array, then just push it onto
the array, instead of creating a new array to wrap it, and then
using concatarray.  This saves an array allocation in that case.
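
A rough plain-Ruby model of the append-in-place behavior, including the case where the second object cannot be converted to an array. `sketch_concattoarray` is a hypothetical name, and using `respond_to?(:to_a)` is a simplifying assumption about how the conversion check works.

```ruby
# Toy model: the last element of `stack` is the top of the stack.
def sketch_concattoarray(stack)
  obj = stack.pop
  ary = stack.last
  if obj.respond_to?(:to_a)
    ary.concat(obj.to_a) # convertible: append its elements in place
  else
    ary.push(obj)        # not convertible: push the object itself
  end
  stack
end

target = [1]
stack = [:self, target, [2, 3]]
sketch_concattoarray(stack)   # models `*a` appended to an existing array
stack.push(4)
sketch_concattoarray(stack)   # models `*1`-style splat of a non-array

p stack
p stack.last.equal?(target)
```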

This allows `f(*a, *a, *1)` to allocate only a single array on the
caller side (which can be reused on the callee side in the case of
`def f(*a)`). Prior to this commit, `f(*a, *a, *1)` would generate
4 arrays:

* a dupped by splatarray true
* a dupped again by first concatarray
* 1 wrapped in array by third splatarray
* result of [*a, *a] dupped by second concatarray

Instructions Before for `a = []; f(*a, *a, *1)`:

```
0000 newarray                               0                         (   1)[Li]
0002 setlocal_WC_0                          a@0
0004 putself
0005 getlocal_WC_0                          a@0
0007 splatarray                             true
0009 getlocal_WC_0                          a@0
0011 splatarray                             false
0013 concatarray
0014 putobject_INT2FIX_1_
0015 splatarray                             false
0017 concatarray
0018 opt_send_without_block                 <calldata!mid:g, argc:1, ARGS_SPLAT|ARGS_SPLAT_MUT|FCALL>
0020 leave
```

Instructions After for `a = []; f(*a, *a, *1)`:

```
0000 newarray                               0                         (   1)[Li]
0002 setlocal_WC_0                          a@0
0004 putself
0005 getlocal_WC_0                          a@0
0007 splatarray                             true
0009 getlocal_WC_0                          a@0
0011 concattoarray
0012 putobject_INT2FIX_1_
0013 concattoarray
0014 opt_send_without_block                 <calldata!mid:f, argc:1, ARGS_SPLAT|ARGS_SPLAT_MUT|FCALL>
0016 leave
```
2024-01-24 18:25:55 -08:00
Jeremy Evans
a950f23078 Ensure f(**kw, &block) calls kw.to_hash before block.to_proc
Previously, block.to_proc was called first, by vm_caller_setup_arg_block.
kw.to_hash was called later inside CALLER_SETUP_ARG or setup_parameters_complex.

This adds a splatkw instruction that is inserted before sends with
ARGS_BLOCKARG and KW_SPLAT and without KW_SPLAT_MUT. This is not needed in the
KW_SPLAT_MUT case, because then you know the value is a hash, and you don't
need to call to_hash on it.

The splatkw instruction checks whether the second-to-top stack entry is a hash,
and if not, replaces it with the value of calling to_hash on it (using
rb_to_hash_type).  As it is always before a send with ARGS_BLOCKARG and
KW_SPLAT, second to top is the keyword splat, and top is the passed block.
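
Sketched in plain Ruby with an array as the stack, the instruction's effect is roughly the following. `sketch_splatkw` and `Options` are hypothetical names for this illustration, and the sketch ignores error handling for objects that lack `to_hash`.

```ruby
# Toy model: stack[-1] is the block argument, stack[-2] the keyword splat.
def sketch_splatkw(stack)
  kw = stack[-2]
  # Convert before the send, so `to_hash` runs ahead of the block's `to_proc`.
  stack[-2] = kw.to_hash unless kw.is_a?(Hash)
  stack
end

class Options
  def to_hash
    { verbose: true }
  end
end

block = proc { }
stack = [:self, Options.new, block]
sketch_splatkw(stack)

p stack[-2].is_a?(Hash)
p stack[-1].equal?(block)
```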
2023-12-09 13:15:47 -08:00
Peter Zhu
0aed37b973 Make expandarray compaction safe
The expandarray instruction can allocate an array, which can trigger
a GC compaction. However, since it does not increment the sp until the
end of the instruction, the objects it places on the stack are not
marked or reference updated by the GC. Those objects can then move,
leaving broken or incorrect references on the stack.

This commit changes the instruction to be handles_sp so the sp is
incremented inside of the instruction right after the object is written
on the stack.
2023-12-01 17:13:56 -05:00
Takashi Kokubun
5808999d30
YJIT: Fallback opt_getconstant_path for const_missing (#8623)
* YJIT: Fallback opt_getconstant_path for const_missing

* Fix a comment [ci skip]

* Remove a wrapper function
2023-10-13 08:52:23 -07:00
Takashi Kokubun
1702b0f438
Remove unused opt_call_c_function insn (#7750) 2023-04-21 23:58:07 -07:00
Aaron Patterson
c5fc1ce975 Emit special instruction for array literal + .(hash|min|max)
This commit introduces a new instruction `opt_newarray_send` which is
used when there is an array literal followed by either the `hash`,
`min`, or `max` method.

```
[a, b, c].hash
```

Will emit an `opt_newarray_send` instruction.  This instruction falls
back to a method call if the "interested" method has been monkey
patched.
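
The fallback can be checked from Ruby itself. The sketch below temporarily monkey patches `Array#max` and then restores it; `pair_max` and `__original_max` are names made up for this example.

```ruby
def pair_max(a, b)
  [a, b].max # array literal followed by `max`
end

before = pair_max(1, 2)

class Array
  alias_method :__original_max, :max
  def max(*)
    :patched
  end
end

# The call must now go through the redefined method.
patched = pair_max(1, 2)

class Array
  alias_method :max, :__original_max
  remove_method :__original_max
end

restored = pair_max(1, 2)
p [before, patched, restored] # => [2, :patched, 2]
```

The patched result shows that the call site respects the redefinition instead of computing the maximum directly.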

Here are some examples of the instructions generated:

```
$ ./miniruby --dump=insns -e '[@a, @b].max'
== disasm: #<ISeq:<main>@-e:1 (1,0)-(1,12)> (catch: FALSE)
0000 getinstancevariable                    :@a, <is:0>               (   1)[Li]
0003 getinstancevariable                    :@b, <is:1>
0006 opt_newarray_send                      2, :max
0009 leave
$ ./miniruby --dump=insns -e '[@a, @b].min'
== disasm: #<ISeq:<main>@-e:1 (1,0)-(1,12)> (catch: FALSE)
0000 getinstancevariable                    :@a, <is:0>               (   1)[Li]
0003 getinstancevariable                    :@b, <is:1>
0006 opt_newarray_send                      2, :min
0009 leave
$ ./miniruby --dump=insns -e '[@a, @b].hash'
== disasm: #<ISeq:<main>@-e:1 (1,0)-(1,13)> (catch: FALSE)
0000 getinstancevariable                    :@a, <is:0>               (   1)[Li]
0003 getinstancevariable                    :@b, <is:1>
0006 opt_newarray_send                      2, :hash
0009 leave
```

[Feature #18897] [ruby-core:109147]

Co-authored-by: John Hawthorn <jhawthorn@github.com>
2023-04-18 17:16:22 -07:00
Takashi Kokubun
9947574b9c Refactor jit_func_t and jit_exec
I closed https://github.com/ruby/ruby/pull/7543, but part of the diff
seems useful regardless, so I extracted it.
2023-03-16 10:42:17 -07:00
Takashi Kokubun
9a43c63d43
YJIT: Implement throw instruction (#7491)
* Break up jit_exec from vm_sendish

* YJIT: Implement throw instruction

* YJIT: Explain what rb_vm_throw does [ci skip]
2023-03-14 13:39:06 -07:00