73 Commits

Author SHA1 Message Date
Benoit Daloze
bf36ad9c12 ZJIT: remove unused rb_RSTRUCT_LEN() 2026-01-12 08:44:26 +01:00
Benoit Daloze
916c0a8105 ZJIT: remove unused rb_RSTRUCT_SET() 2026-01-12 08:44:26 +01:00
Benoit Daloze
5e27581c3b ZJIT: Use rb_zjit_writebarrier_check_immediate() instead of rb_gc_writebarrier() in gen_write_barrier()
* To avoid calling rb_gc_writebarrier() with an immediate value in gen_write_barrier(),
  and avoid the LIR jump issue.
2025-12-16 21:00:27 +01:00
Abrar Habib
edca81a1bb
ZJIT: Add codegen for FixnumDiv (#15452)
Fixes https://github.com/Shopify/ruby/issues/902

This pull request adds code generation for dividing fixnums.
Testing confirms the normal case, flooring, and side-exiting on division by zero.
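
For reference, a minimal Ruby sketch of the three behaviours named above, as seen from user code. The helper `safe_div` is hypothetical and exists only for this illustration; how ZJIT implements the side exit internally is not shown here.

```ruby
# Integer#/ on small integers floors toward negative infinity.
floor_cases = {
  [7, 2]   => 3,   # normal case
  [-7, 2]  => -4,  # floors, does not truncate toward zero
  [7, -2]  => -4,
  [-7, -2] => 3,
}

floor_cases.each do |(a, b), expected|
  raise "unexpected quotient for #{a} / #{b}" unless a / b == expected
end

# Division by zero must still raise, so compiled code cannot simply
# emit a machine divide; it has to hand this case back to the interpreter.
def safe_div(a, b)
  a / b
rescue ZeroDivisionError
  :division_by_zero
end
```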
2025-12-09 12:41:09 +00:00
Alan Wu
109ddd291e ZJIT: Avoid binding to rb_iseq_constant_body
Its definition changes depending on e.g. whether there is YJIT in the
build.
2025-12-05 15:49:25 -05:00
Max Bernstein
0af85a1fe2
ZJIT: Optimize setivar with shape transition (#15375)
Since we do a decent job of pre-sizing objects, don't handle the case where we would need to re-size an object. Also don't handle too-complex shapes.
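
As a rough illustration of what a shape transition looks like from Ruby code (the `Point` class is hypothetical, and the comments describe CRuby's object shapes in general rather than ZJIT's exact code path):

```ruby
# Hypothetical class, for illustration only.
class Point
  def initialize(x, y)
    @x = x  # first write of @x: the object moves to a shape that includes @x
    @y = y  # first write of @y: a second shape transition
  end
end

point = Point.new(1, 2)
puts point.instance_variables.inspect
```

Every `Point.new` call walks the same two transitions, which is the kind of setivar this commit targets; objects that would need resizing or that have too-complex shapes are left to the fallback path.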

lobsters stats before:

```
Top-20 calls to C functions from JIT code (79.4% of total 90,051,140):
                             rb_vm_opt_send_without_block: 19,762,433 (21.9%)
                                rb_vm_setinstancevariable:  7,698,314 ( 8.5%)
                                             rb_hash_aref:  6,767,461 ( 7.5%)
                                          rb_vm_env_write:  5,373,080 ( 6.0%)
                                               rb_vm_send:  5,049,229 ( 5.6%)
                                rb_vm_getinstancevariable:  4,535,259 ( 5.0%)
                                        rb_obj_is_kind_of:  3,746,306 ( 4.2%)
                           rb_ivar_get_at_no_ractor_check:  3,745,237 ( 4.2%)
                                        rb_vm_invokesuper:  3,037,467 ( 3.4%)
                                             rb_ary_entry:  2,351,983 ( 2.6%)
                               rb_vm_opt_getconstant_path:  1,344,740 ( 1.5%)
                                        rb_vm_invokeblock:  1,184,474 ( 1.3%)
                                                 Hash#[]=:  1,064,288 ( 1.2%)
                                       rb_gc_writebarrier:  1,006,972 ( 1.1%)
                                rb_ec_ary_new_from_values:    902,687 ( 1.0%)
                                                    fetch:    898,667 ( 1.0%)
                                        rb_str_buf_append:    833,787 ( 0.9%)
                               rb_class_allocate_instance:    822,024 ( 0.9%)
                                               Hash#fetch:    699,580 ( 0.8%)
                                                    _bi20:    682,068 ( 0.8%)
Top-4 setivar fallback reasons (100.0% of total 7,732,326):
  shape_transition: 6,032,109 (78.0%)
   not_monomorphic: 1,469,300 (19.0%)
      not_t_object:   172,636 ( 2.2%)
       too_complex:    58,281 ( 0.8%)
```

lobsters stats after:

```
Top-20 calls to C functions from JIT code (79.0% of total 88,322,656):
                             rb_vm_opt_send_without_block: 19,777,880 (22.4%)
                                             rb_hash_aref:  6,771,589 ( 7.7%)
                                          rb_vm_env_write:  5,372,789 ( 6.1%)
                                       rb_gc_writebarrier:  5,195,527 ( 5.9%)
                                               rb_vm_send:  5,049,145 ( 5.7%)
                                rb_vm_getinstancevariable:  4,538,485 ( 5.1%)
                                        rb_obj_is_kind_of:  3,746,241 ( 4.2%)
                           rb_ivar_get_at_no_ractor_check:  3,745,172 ( 4.2%)
                                        rb_vm_invokesuper:  3,037,157 ( 3.4%)
                                             rb_ary_entry:  2,351,968 ( 2.7%)
                                rb_vm_setinstancevariable:  1,703,337 ( 1.9%)
                               rb_vm_opt_getconstant_path:  1,344,730 ( 1.5%)
                                        rb_vm_invokeblock:  1,184,290 ( 1.3%)
                                                 Hash#[]=:  1,061,868 ( 1.2%)
                                rb_ec_ary_new_from_values:    902,666 ( 1.0%)
                                                    fetch:    898,666 ( 1.0%)
                                        rb_str_buf_append:    833,784 ( 0.9%)
                               rb_class_allocate_instance:    821,778 ( 0.9%)
                                               Hash#fetch:    755,913 ( 0.9%)
Top-4 setivar fallback reasons (100.0% of total 1,703,337):
            not_monomorphic: 1,472,405 (86.4%)
               not_t_object:   172,629 (10.1%)
                too_complex:    58,281 ( 3.4%)
  new_shape_needs_extension:        22 ( 0.0%)
```
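
The deltas between the two runs can be worked out directly from the numbers above. A small Ruby sketch, using only figures copied from the two tables (the hash keys are made up for this example):

```ruby
# Call counts copied from the "before" and "after" tables.
before = { setivar: 7_698_314, writebarrier: 1_006_972, total: 90_051_140 }
after  = { setivar: 1_703_337, writebarrier: 5_195_527, total: 88_322_656 }

setivar_drop      = before[:setivar] - after[:setivar]
setivar_drop_pct  = (100.0 * setivar_drop / before[:setivar]).round(1)
writebarrier_rise = after[:writebarrier] - before[:writebarrier]
total_drop        = before[:total] - after[:total]

puts "rb_vm_setinstancevariable: -#{setivar_drop} calls (-#{setivar_drop_pct}%)"
puts "rb_gc_writebarrier:        +#{writebarrier_rise} calls"
puts "all C calls from JIT code: -#{total_drop} calls"
```

So most of the removed `rb_vm_setinstancevariable` calls reappear as `rb_gc_writebarrier` calls, and the net reduction in C calls is smaller than the setivar drop alone.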

I also noticed that primitive printing in HIR was broken so I fixed that.

Co-authored-by: Aaron Patterson <tenderlove@ruby-lang.org>
2025-12-03 21:27:56 -05:00
Max Bernstein
3efd8c6764
ZJIT: Inline Kernel#class (#15397)
We generally know the receiver's class from profile info. I see 600k of these when running lobsters.
2025-12-04 01:25:52 +00:00
Benoit Daloze
07ea9a3809 ZJIT: Optimize GetIvar for non-T_OBJECT
* All Invariant::SingleRactorMode PatchPoints are replaced by
  assume_single_ractor_mode() to fix https://github.com/Shopify/ruby/issues/875
  for SingleRactorMode patchpoints.
2025-12-02 01:42:14 +01:00
Max Bernstein
8aed311038 ZJIT: Specialize String#<< with Fixnum
Append a codepoint.
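
For context, this is the Ruby-level behaviour being specialized: `String#<<` with an Integer argument appends that codepoint instead of a string. A short sketch (variable names are arbitrary):

```ruby
ascii = +"ab"      # unary plus gives an unfrozen copy
ascii << 99        # Integer argument: appended as codepoint 99 ("c")
ascii << "d"       # String argument: ordinary append

utf8 = +"snow"
utf8 << 0x2603     # codepoint U+2603 in a UTF-8 string

puts ascii
puts utf8
```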
2025-12-01 15:19:26 -08:00
Max Bernstein
0eb53053f0
ZJIT: Specialize setinstancevariable when ivar is already in shape (#15290)
Don't support shape transitions for now.
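
A sketch of the case this covers, assuming a hypothetical `Counter` class: the write in `increment` hits an ivar the object already has, so the object's shape does not change.

```ruby
# Hypothetical class, for illustration only.
class Counter
  def initialize
    @count = 0    # @count becomes part of the object's shape here
  end

  def increment
    @count += 1   # rewrites an existing ivar; the shape stays the same
  end
end

counter = Counter.new
3.times { counter.increment }
```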
2025-11-25 18:50:55 +00:00
Alan Wu
7a09df45f2 Name the iseq->body->param struct and update bindings for JITs
This will make reading the parameters nicer for the JITs. Should be
no-op for the C side.
2025-11-20 19:52:28 -05:00
Jacob
f3f3e76882
Extract KW_SPECIFIED_BITS_MAX for JITs (GH-15039)
Rename to `VM_KW_SPECIFIED_BITS_MAX` now that it's in `vm_core.h`.
2025-11-18 22:32:11 +00:00
Max Bernstein
38d31dc49b
ZJIT: Untag block handler (#15085)
Storing the tagged block handler in profiles is not GC-safe (nice catch,
Kokubun). Store the untagged block handler instead.

Fix bug in https://github.com/ruby/ruby/pull/15051
2025-11-06 20:07:02 +00:00
Max Bernstein
02267417da
ZJIT: Profile specific objects for invokeblock (#15051)
I made a special kind of `ProfiledType` that looks at specific objects, not just their classes/shapes (https://github.com/ruby/ruby/pull/15051). Then I profiled some of our benchmarks.

For lobsters:

```
Top-6 invokeblock handler (100.0% of total 1,064,155):
        megamorphic: 494,931 (46.5%)
   monomorphic_iseq: 337,171 (31.7%)
        polymorphic: 113,381 (10.7%)
  monomorphic_ifunc:  52,260 ( 4.9%)
  monomorphic_other:  38,970 ( 3.7%)
        no_profiles:  27,442 ( 2.6%)
```

For railsbench:

```
Top-6 invokeblock handler (100.0% of total 2,529,104):
   monomorphic_iseq: 834,452 (33.0%)
        megamorphic: 818,347 (32.4%)
        polymorphic: 632,273 (25.0%)
  monomorphic_ifunc: 224,243 ( 8.9%)
  monomorphic_other:  19,595 ( 0.8%)
        no_profiles:     194 ( 0.0%)
```

For shipit:

```
Top-6 invokeblock handler (100.0% of total 2,104,148):
        megamorphic: 1,269,889 (60.4%)
        polymorphic:   411,475 (19.6%)
        no_profiles:   173,367 ( 8.2%)
  monomorphic_other:   118,619 ( 5.6%)
   monomorphic_iseq:    84,891 ( 4.0%)
  monomorphic_ifunc:    45,907 ( 2.2%)
```

Seems like a monomorphic case for a specific ISEQ actually isn't a bad way of going about this, at least to start...
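
To make the categories concrete, here is a sketch of a single `yield` site reached by one block versus several. The method `apply` is hypothetical, and which bucket ZJIT's profiler would assign to each call pattern is an assumption based on the category names above.

```ruby
# Hypothetical helper with a single yield (invokeblock) site.
def apply(value)
  yield value
end

# One literal block reaches the yield on every call, so the site only
# ever sees one block ISEQ: the likely "monomorphic_iseq" pattern.
mono_total = 0
10.times { |i| mono_total += apply(i) { |v| v * 2 } }

# Three different literal blocks reach the same yield site, so a
# profile of that site sees several distinct block ISEQs.
blocks_total = 0
blocks_total += apply(1) { |v| v + 1 }
blocks_total += apply(1) { |v| v * 10 }
blocks_total += apply(1) { |v| v - 1 }
```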
2025-11-05 20:01:17 +00:00
Aiden Fox Ivey
7a736545e9
ZJIT: Specialize Array#pop for no argument case (#14933)
Fixes https://github.com/Shopify/ruby/issues/814

This change specializes the case of calling `Array#pop` on a non-frozen array with no arguments. `Array#pop` exists in the non-inlined C function list in the ZJIT SFR performance burndown list.

If in the future it is helpful, this patch could be extended to support the case where an argument is provided, but this initial work seeks to elide the Ruby frame normally pushed in the case of `Array#pop` without an argument.
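
For reference, the Ruby-level cases involved. Only the first call matches the specialized form (no argument, non-frozen receiver); the others show the cases that are left alone. Variable names are arbitrary.

```ruby
stack = [1, 2, 3]

last  = stack.pop      # no argument: returns the removed element
pair  = stack.pop(2)   # with an argument: returns an Array of removed elements
empty = [].pop         # empty receiver: returns nil

# A frozen receiver must still raise, so it cannot take the fast path.
frozen_result =
  begin
    [1].freeze.pop
  rescue FrozenError => e
    e.class
  end
```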
2025-10-28 11:44:25 -04:00
Max Bernstein
fa5481bc06 ZJIT: Fetch Primitive.attr!(leaf) for InvokeBuiltin
Fix https://github.com/Shopify/ruby/issues/670
2025-10-22 17:10:14 -07:00
Alan Wu
bb7f3d17ed YJIT: ZJIT: Extract common bindings to jit.c and remove unnamed enums.
The type name bindgen picks for anonymous enums creates desync issues on
the bindgen CI checks.
2025-10-21 16:48:45 -04:00
Alan Wu
35c2230734 ZJIT: Fix binding to INVALID_SHAPE_ID under -std=c99 -pedantic
```
  /src/jit.c:19:5: error: ISO C restricts enumerator values to range of 'int' (4294967295 is too large) [-Werror,-Wpedantic]
     19 |     RB_INVALID_SHAPE_ID = INVALID_SHAPE_ID,
        |     ^                     ~~~~~~~~~~~~~~~~
```
2025-10-21 16:48:45 -04:00
Max Bernstein
fba349e658
ZJIT: Implement expandarray (#14847)
Only support the simple case: no splat or rest.
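
A sketch of the Ruby forms involved. The mapping from each form to `expandarray` flags is described from memory of how CRuby typically compiles multiple assignment, so treat the comments as an assumption rather than a statement about this patch.

```ruby
pair = [1, 2]

# Simple case: a fixed number of targets and no splat.
a, b = pair

# Too few elements: the missing targets are filled with nil.
x, y, z = pair

# Splat/rest form: typically uses expandarray with a splat flag,
# which is the form the message above says is not supported yet.
first, *rest = [1, 2, 3]

puts [a, b, x, y, z, first, rest].inspect
```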

lobsters before:

<details>

```
***ZJIT: Printing ZJIT statistics on exit***
Top-20 not inlined C methods (60.5% of total 11,039,954):
                               Kernel#is_a?: 1,030,769 ( 9.3%)
                                  String#<<:   851,954 ( 7.7%)
                                   Hash#[]=:   742,941 ( 6.7%)
                              Regexp#match?:   399,894 ( 3.6%)
                              String#empty?:   353,775 ( 3.2%)
                                  Hash#key?:   349,147 ( 3.2%)
                         String#start_with?:   334,961 ( 3.0%)
                         Kernel#respond_to?:   316,528 ( 2.9%)
                 ObjectSpace::WeakKeyMap#[]:   238,978 ( 2.2%)
                              TrueClass#===:   235,771 ( 2.1%)
                             FalseClass#===:   231,144 ( 2.1%)
                             Array#include?:   211,385 ( 1.9%)
                                 Hash#fetch:   204,702 ( 1.9%)
                        Kernel#block_given?:   181,797 ( 1.6%)
                                 Kernel#dup:   179,341 ( 1.6%)
                             BasicObject#!=:   175,997 ( 1.6%)
                                  Class#new:   168,079 ( 1.5%)
                            Kernel#kind_of?:   165,600 ( 1.5%)
                                  String#==:   157,735 ( 1.4%)
                       Module#clock_gettime:   144,992 ( 1.3%)
Top-20 not annotated C methods (61.4% of total 11,202,087):
                               Kernel#is_a?: 1,212,660 (10.8%)
                                  String#<<:   851,954 ( 7.6%)
                                   Hash#[]=:   743,120 ( 6.6%)
                              Regexp#match?:   399,894 ( 3.6%)
                              String#empty?:   361,013 ( 3.2%)
                                  Hash#key?:   349,147 ( 3.1%)
                         String#start_with?:   334,961 ( 3.0%)
                         Kernel#respond_to?:   316,528 ( 2.8%)
                 ObjectSpace::WeakKeyMap#[]:   238,978 ( 2.1%)
                              TrueClass#===:   235,771 ( 2.1%)
                             FalseClass#===:   231,144 ( 2.1%)
                             Array#include?:   211,385 ( 1.9%)
                                 Hash#fetch:   204,702 ( 1.8%)
                        Kernel#block_given?:   191,666 ( 1.7%)
                                 Kernel#dup:   179,348 ( 1.6%)
                             BasicObject#!=:   176,181 ( 1.6%)
                                  Class#new:   168,079 ( 1.5%)
                            Kernel#kind_of?:   165,634 ( 1.5%)
                                  String#==:   163,667 ( 1.5%)
                       Module#clock_gettime:   144,992 ( 1.3%)
Top-2 not optimized method types for send (100.0% of total 72,318):
  cfunc: 48,055 (66.4%)
   iseq: 24,263 (33.6%)
Top-6 not optimized method types for send_without_block (100.0% of total 4,523,682):
       iseq: 2,271,936 (50.2%)
    bmethod:   985,636 (21.8%)
  optimized:   949,703 (21.0%)
      alias:   310,747 ( 6.9%)
       null:     5,106 ( 0.1%)
      cfunc:       554 ( 0.0%)
Top-13 not optimized instructions (100.0% of total 4,293,171):
             invokesuper: 2,373,404 (55.3%)
             invokeblock:   811,926 (18.9%)
             sendforward:   505,452 (11.8%)
                  opt_eq:   451,754 (10.5%)
                opt_plus:    74,404 ( 1.7%)
               opt_minus:    36,228 ( 0.8%)
  opt_send_without_block:    21,792 ( 0.5%)
                 opt_neq:     7,231 ( 0.2%)
                opt_mult:     6,752 ( 0.2%)
                  opt_or:     3,753 ( 0.1%)
                  opt_lt:       348 ( 0.0%)
                  opt_ge:        91 ( 0.0%)
                  opt_gt:        36 ( 0.0%)
Top-9 send fallback reasons (100.0% of total 25,530,724):
                send_without_block_polymorphic: 9,722,491 (38.1%)
                              send_no_profiles: 5,894,788 (23.1%)
  send_without_block_not_optimized_method_type: 4,523,682 (17.7%)
                     not_optimized_instruction: 4,293,171 (16.8%)
                send_without_block_no_profiles:   998,746 ( 3.9%)
                send_not_optimized_method_type:    72,318 ( 0.3%)
       send_without_block_cfunc_array_variadic:    15,134 ( 0.1%)
                      obj_to_string_not_string:     9,765 ( 0.0%)
       send_without_block_direct_too_many_args:       629 ( 0.0%)
Top-9 unhandled YARV insns (100.0% of total 690,950):
         expandarray: 328,490 (47.5%)
        checkkeyword: 190,694 (27.6%)
    getclassvariable:  59,901 ( 8.7%)
  invokesuperforward:  49,503 ( 7.2%)
       getblockparam:  49,119 ( 7.1%)
   opt_duparray_send:  11,978 ( 1.7%)
         getconstant:     952 ( 0.1%)
          checkmatch:     290 ( 0.0%)
                once:      23 ( 0.0%)
Top-3 compile error reasons (100.0% of total 3,718,636):
  register_spill_on_alloc: 3,418,255 (91.9%)
  register_spill_on_ccall:   182,018 ( 4.9%)
        exception_handler:   118,363 ( 3.2%)
Top-14 side exit reasons (100.0% of total 10,860,385):
                        compile_error: 3,718,636 (34.2%)
                   guard_type_failure: 2,638,926 (24.3%)
                  guard_shape_failure: 1,917,209 (17.7%)
                  unhandled_yarv_insn:   690,950 ( 6.4%)
  block_param_proxy_not_iseq_or_ifunc:   535,789 ( 4.9%)
                      unhandled_kwarg:   455,347 ( 4.2%)
                           patchpoint:   370,476 ( 3.4%)
                unknown_newarray_send:   314,786 ( 2.9%)
                      unhandled_splat:   122,071 ( 1.1%)
                   unhandled_hir_insn:    76,397 ( 0.7%)
           block_param_proxy_modified:    19,193 ( 0.2%)
               obj_to_string_fallback:       566 ( 0.0%)
               guard_type_not_failure:        22 ( 0.0%)
                            interrupt:        17 ( 0.0%)
                             send_count: 62,244,604
                     dynamic_send_count: 25,530,724 (41.0%)
                   optimized_send_count: 36,713,880 (59.0%)
              iseq_optimized_send_count: 18,587,512 (29.9%)
      inline_cfunc_optimized_send_count:  7,086,414 (11.4%)
non_variadic_cfunc_optimized_send_count:  8,375,754 (13.5%)
    variadic_cfunc_optimized_send_count:  2,664,200 ( 4.3%)
dynamic_getivar_count:                        7,365,995
dynamic_setivar_count:                        7,245,005
compiled_iseq_count:                              4,796
failed_iseq_count:                                  447
compile_time:                                     814ms
profile_time:                                       9ms
gc_time:                                            9ms
invalidation_time:                                 72ms
vm_write_pc_count:                           64,156,223
vm_write_sp_count:                           62,812,449
vm_write_locals_count:                       62,812,449
vm_write_stack_count:                        62,812,449
vm_write_to_parent_iseq_local_count:            292,458
vm_read_from_parent_iseq_local_count:         6,599,701
code_region_bytes:                           22,953,984
side_exit_count:                             10,860,385
total_insn_count:                           517,606,340
vm_insn_count:                              162,979,530
zjit_insn_count:                            354,626,810
ratio_in_zjit:                                    68.5%
```

</details>

lobsters after:

<details>

```
***ZJIT: Printing ZJIT statistics on exit***
Top-20 not inlined C methods (59.9% of total 11,291,815):
                               Kernel#is_a?: 1,046,269 ( 9.3%)
                                  String#<<:   851,954 ( 7.5%)
                                   Hash#[]=:   743,274 ( 6.6%)
                              Regexp#match?:   399,894 ( 3.5%)
                              String#empty?:   353,775 ( 3.1%)
                                  Hash#key?:   349,147 ( 3.1%)
                         String#start_with?:   334,961 ( 3.0%)
                         Kernel#respond_to?:   316,502 ( 2.8%)
                 ObjectSpace::WeakKeyMap#[]:   238,978 ( 2.1%)
                              TrueClass#===:   235,771 ( 2.1%)
                             FalseClass#===:   231,144 ( 2.0%)
                                String#sub!:   219,579 ( 1.9%)
                             Array#include?:   211,385 ( 1.9%)
                                 Hash#fetch:   204,702 ( 1.8%)
                        Kernel#block_given?:   181,797 ( 1.6%)
                                 Kernel#dup:   179,341 ( 1.6%)
                             BasicObject#!=:   175,997 ( 1.6%)
                                  Class#new:   168,079 ( 1.5%)
                            Kernel#kind_of?:   165,600 ( 1.5%)
                                  String#==:   157,742 ( 1.4%)
Top-20 not annotated C methods (60.9% of total 11,466,928):
                               Kernel#is_a?: 1,239,923 (10.8%)
                                  String#<<:   851,954 ( 7.4%)
                                   Hash#[]=:   743,453 ( 6.5%)
                              Regexp#match?:   399,894 ( 3.5%)
                              String#empty?:   361,013 ( 3.1%)
                                  Hash#key?:   349,147 ( 3.0%)
                         String#start_with?:   334,961 ( 2.9%)
                         Kernel#respond_to?:   316,502 ( 2.8%)
                 ObjectSpace::WeakKeyMap#[]:   238,978 ( 2.1%)
                              TrueClass#===:   235,771 ( 2.1%)
                             FalseClass#===:   231,144 ( 2.0%)
                                String#sub!:   219,579 ( 1.9%)
                             Array#include?:   211,385 ( 1.8%)
                                 Hash#fetch:   204,702 ( 1.8%)
                        Kernel#block_given?:   191,666 ( 1.7%)
                                 Kernel#dup:   179,348 ( 1.6%)
                             BasicObject#!=:   176,181 ( 1.5%)
                                  Class#new:   168,079 ( 1.5%)
                            Kernel#kind_of?:   165,634 ( 1.4%)
                                  String#==:   163,674 ( 1.4%)
Top-2 not optimized method types for send (100.0% of total 72,318):
  cfunc: 48,055 (66.4%)
   iseq: 24,263 (33.6%)
Top-6 not optimized method types for send_without_block (100.0% of total 4,524,016):
       iseq: 2,272,269 (50.2%)
    bmethod:   985,636 (21.8%)
  optimized:   949,704 (21.0%)
      alias:   310,747 ( 6.9%)
       null:     5,106 ( 0.1%)
      cfunc:       554 ( 0.0%)
Top-13 not optimized instructions (100.0% of total 4,294,241):
             invokesuper: 2,375,446 (55.3%)
             invokeblock:   810,955 (18.9%)
             sendforward:   505,451 (11.8%)
                  opt_eq:   451,754 (10.5%)
                opt_plus:    74,404 ( 1.7%)
               opt_minus:    36,228 ( 0.8%)
  opt_send_without_block:    21,792 ( 0.5%)
                 opt_neq:     7,231 ( 0.2%)
                opt_mult:     6,752 ( 0.2%)
                  opt_or:     3,753 ( 0.1%)
                  opt_lt:       348 ( 0.0%)
                  opt_ge:        91 ( 0.0%)
                  opt_gt:        36 ( 0.0%)
Top-9 send fallback reasons (100.0% of total 25,534,542):
                send_without_block_polymorphic: 9,723,469 (38.1%)
                              send_no_profiles: 5,896,023 (23.1%)
  send_without_block_not_optimized_method_type: 4,524,016 (17.7%)
                     not_optimized_instruction: 4,294,241 (16.8%)
                send_without_block_no_profiles:   998,947 ( 3.9%)
                send_not_optimized_method_type:    72,318 ( 0.3%)
       send_without_block_cfunc_array_variadic:    15,134 ( 0.1%)
                      obj_to_string_not_string:     9,765 ( 0.0%)
       send_without_block_direct_too_many_args:       629 ( 0.0%)
Top-8 unhandled YARV insns (100.0% of total 362,460):
        checkkeyword: 190,694 (52.6%)
    getclassvariable:  59,901 (16.5%)
  invokesuperforward:  49,503 (13.7%)
       getblockparam:  49,119 (13.6%)
   opt_duparray_send:  11,978 ( 3.3%)
         getconstant:     952 ( 0.3%)
          checkmatch:     290 ( 0.1%)
                once:      23 ( 0.0%)
Top-3 compile error reasons (100.0% of total 3,798,744):
  register_spill_on_alloc: 3,495,669 (92.0%)
  register_spill_on_ccall:   184,712 ( 4.9%)
        exception_handler:   118,363 ( 3.1%)
Top-15 side exit reasons (100.0% of total 10,637,319):
                        compile_error: 3,798,744 (35.7%)
                   guard_type_failure: 2,655,504 (25.0%)
                  guard_shape_failure: 1,917,217 (18.0%)
  block_param_proxy_not_iseq_or_ifunc:   535,789 ( 5.0%)
                      unhandled_kwarg:   455,492 ( 4.3%)
                           patchpoint:   370,478 ( 3.5%)
                  unhandled_yarv_insn:   362,460 ( 3.4%)
                unknown_newarray_send:   314,786 ( 3.0%)
                      unhandled_splat:   122,071 ( 1.1%)
                   unhandled_hir_insn:    83,066 ( 0.8%)
           block_param_proxy_modified:    19,193 ( 0.2%)
             guard_int_equals_failure:     1,914 ( 0.0%)
               obj_to_string_fallback:       566 ( 0.0%)
               guard_type_not_failure:        22 ( 0.0%)
                            interrupt:        17 ( 0.0%)
                             send_count: 62,495,067
                     dynamic_send_count: 25,534,542 (40.9%)
                   optimized_send_count: 36,960,525 (59.1%)
              iseq_optimized_send_count: 18,582,072 (29.7%)
      inline_cfunc_optimized_send_count:  7,086,638 (11.3%)
non_variadic_cfunc_optimized_send_count:  8,392,657 (13.4%)
    variadic_cfunc_optimized_send_count:  2,899,158 ( 4.6%)
dynamic_getivar_count:                        7,365,994
dynamic_setivar_count:                        7,248,500
compiled_iseq_count:                              4,780
failed_iseq_count:                                  463
compile_time:                                     816ms
profile_time:                                       9ms
gc_time:                                           11ms
invalidation_time:                                 70ms
vm_write_pc_count:                           64,363,541
vm_write_sp_count:                           63,022,221
vm_write_locals_count:                       63,022,221
vm_write_stack_count:                        63,022,221
vm_write_to_parent_iseq_local_count:            292,458
vm_read_from_parent_iseq_local_count:         6,850,977
code_region_bytes:                           23,019,520
side_exit_count:                             10,637,319
total_insn_count:                           517,303,190
vm_insn_count:                              160,562,103
zjit_insn_count:                            356,741,087
ratio_in_zjit:                                    69.0%
```

</details>
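
A quick cross-check of the two lobsters dumps, using only numbers copied from them (the hash keys are made up for this sketch):

```ruby
# Figures copied from the lobsters "before" and "after" dumps.
before = {
  total_insn: 517_606_340, zjit_insn: 354_626_810,
  unhandled_yarv: 690_950, expandarray: 328_490, side_exits: 10_860_385,
}
after = {
  total_insn: 517_303_190, zjit_insn: 356_741_087,
  unhandled_yarv: 362_460, side_exits: 10_637_319,
}

# ratio_in_zjit is zjit_insn_count as a share of total_insn_count.
ratio = ->(stats) { (100.0 * stats[:zjit_insn] / stats[:total_insn]).round(1) }

unhandled_drop = before[:unhandled_yarv] - after[:unhandled_yarv]
side_exit_drop = before[:side_exits] - after[:side_exits]

puts "ratio_in_zjit: #{ratio.(before)}% -> #{ratio.(after)}%"
puts "unhandled_yarv_insn exits removed: #{unhandled_drop}"
puts "side exits removed overall: #{side_exit_drop}"
```

The drop in `unhandled_yarv_insn` exits equals the old `expandarray` count exactly, while total side exits fall by less because `compile_error` and the guard failure exits grow slightly.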

railsbench before:

<details>

```
***ZJIT: Printing ZJIT statistics on exit***
Top-20 not inlined C methods (66.1% of total 25,524,934):
                                   Hash#[]=: 1,700,237 ( 6.7%)
                             String#getbyte: 1,572,123 ( 6.2%)
                                  String#<<: 1,494,022 ( 5.9%)
                               Kernel#is_a?: 1,429,930 ( 5.6%)
                              String#empty?: 1,370,323 ( 5.4%)
                              Regexp#match?: 1,235,067 ( 4.8%)
                         Kernel#respond_to?: 1,198,251 ( 4.7%)
                                  Hash#key?: 1,087,406 ( 4.3%)
                             String#setbyte:   810,022 ( 3.2%)
                                  Integer#^:   766,624 ( 3.0%)
                        Kernel#block_given?:   603,613 ( 2.4%)
                                  String#==:   590,409 ( 2.3%)
                                  Class#new:   506,216 ( 2.0%)
                                Hash#delete:   455,288 ( 1.8%)
                             BasicObject#!=:   428,771 ( 1.7%)
                                 Hash#fetch:   408,621 ( 1.6%)
                         String#ascii_only?:   373,915 ( 1.5%)
                 ObjectSpace::WeakKeyMap#[]:   287,957 ( 1.1%)
                               NilClass#===:   277,244 ( 1.1%)
                               Kernel#Array:   269,590 ( 1.1%)
Top-20 not annotated C methods (66.8% of total 25,392,654):
                                   Hash#[]=: 1,700,416 ( 6.7%)
                             String#getbyte: 1,572,123 ( 6.2%)
                               Kernel#is_a?: 1,515,672 ( 6.0%)
                                  String#<<: 1,494,022 ( 5.9%)
                              String#empty?: 1,370,478 ( 5.4%)
                              Regexp#match?: 1,235,067 ( 4.9%)
                         Kernel#respond_to?: 1,198,251 ( 4.7%)
                                  Hash#key?: 1,087,406 ( 4.3%)
                             String#setbyte:   810,022 ( 3.2%)
                                  Integer#^:   766,624 ( 3.0%)
                        Kernel#block_given?:   603,613 ( 2.4%)
                                  String#==:   601,115 ( 2.4%)
                                  Class#new:   506,216 ( 2.0%)
                                Hash#delete:   455,288 ( 1.8%)
                             BasicObject#!=:   428,876 ( 1.7%)
                                 Hash#fetch:   408,621 ( 1.6%)
                         String#ascii_only?:   373,915 ( 1.5%)
                 ObjectSpace::WeakKeyMap#[]:   287,957 ( 1.1%)
                               NilClass#===:   277,244 ( 1.1%)
                               Kernel#Array:   269,590 ( 1.1%)
Top-2 not optimized method types for send (100.0% of total 186,159):
   iseq: 112,747 (60.6%)
  cfunc:  73,412 (39.4%)
Top-6 not optimized method types for send_without_block (100.0% of total 8,142,248):
       iseq: 3,464,671 (42.6%)
  optimized: 2,632,884 (32.3%)
    bmethod: 1,290,701 (15.9%)
      alias:   706,020 ( 8.7%)
       null:    47,942 ( 0.6%)
      cfunc:        30 ( 0.0%)
Top-11 not optimized instructions (100.0% of total 8,394,873):
             invokesuper: 5,602,274 (66.7%)
             invokeblock: 1,764,936 (21.0%)
             sendforward:   551,832 ( 6.6%)
                  opt_eq:   441,959 ( 5.3%)
                opt_plus:    31,635 ( 0.4%)
  opt_send_without_block:     1,163 ( 0.0%)
                  opt_lt:       372 ( 0.0%)
                opt_mult:       251 ( 0.0%)
                  opt_ge:       193 ( 0.0%)
                 opt_neq:       149 ( 0.0%)
                  opt_or:       109 ( 0.0%)
Top-8 send fallback reasons (100.0% of total 40,748,753):
                send_without_block_polymorphic: 12,933,923 (31.7%)
                              send_no_profiles:  9,033,636 (22.2%)
                     not_optimized_instruction:  8,394,873 (20.6%)
  send_without_block_not_optimized_method_type:  8,142,248 (20.0%)
                send_without_block_no_profiles:  1,839,228 ( 4.5%)
       send_without_block_cfunc_array_variadic:    215,046 ( 0.5%)
                send_not_optimized_method_type:    186,159 ( 0.5%)
                      obj_to_string_not_string:      3,640 ( 0.0%)
Top-9 unhandled YARV insns (100.0% of total 1,604,456):
    getclassvariable: 458,136 (28.6%)
       getblockparam: 455,921 (28.4%)
        checkkeyword: 265,425 (16.5%)
  invokesuperforward: 239,383 (14.9%)
         expandarray: 137,305 ( 8.6%)
         getconstant:  48,100 ( 3.0%)
          checkmatch:     149 ( 0.0%)
                once:      23 ( 0.0%)
   opt_duparray_send:      14 ( 0.0%)
Top-3 compile error reasons (100.0% of total 5,570,130):
  register_spill_on_alloc: 4,994,130 (89.7%)
        exception_handler:   356,784 ( 6.4%)
  register_spill_on_ccall:   219,216 ( 3.9%)
Top-13 side exit reasons (100.0% of total 12,412,181):
                        compile_error: 5,570,130 (44.9%)
                  unhandled_yarv_insn: 1,604,456 (12.9%)
                  guard_shape_failure: 1,462,872 (11.8%)
                   guard_type_failure:   845,891 ( 6.8%)
  block_param_proxy_not_iseq_or_ifunc:   765,968 ( 6.2%)
                      unhandled_kwarg:   658,341 ( 5.3%)
                           patchpoint:   504,437 ( 4.1%)
                      unhandled_splat:   446,990 ( 3.6%)
                unknown_newarray_send:   332,740 ( 2.7%)
                   unhandled_hir_insn:   160,205 ( 1.3%)
           block_param_proxy_modified:    59,589 ( 0.5%)
               obj_to_string_fallback:       553 ( 0.0%)
                            interrupt:         9 ( 0.0%)
                             send_count: 119,067,587
                     dynamic_send_count:  40,748,753 (34.2%)
                   optimized_send_count:  78,318,834 (65.8%)
              iseq_optimized_send_count:  39,936,542 (33.5%)
      inline_cfunc_optimized_send_count:  12,857,358 (10.8%)
non_variadic_cfunc_optimized_send_count:  19,722,584 (16.6%)
    variadic_cfunc_optimized_send_count:   5,802,350 ( 4.9%)
dynamic_getivar_count:                      10,980,323
dynamic_setivar_count:                      12,962,726
compiled_iseq_count:                             2,531
failed_iseq_count:                                 245
compile_time:                                    414ms
profile_time:                                     21ms
gc_time:                                          33ms
invalidation_time:                                 5ms
vm_write_pc_count:                         129,093,714
vm_write_sp_count:                         126,023,084
vm_write_locals_count:                     126,023,084
vm_write_stack_count:                      126,023,084
vm_write_to_parent_iseq_local_count:           385,461
vm_read_from_parent_iseq_local_count:       11,266,484
code_region_bytes:                          12,156,928
side_exit_count:                            12,412,181
total_insn_count:                          866,780,158
vm_insn_count:                             216,821,134
zjit_insn_count:                           649,959,024
ratio_in_zjit:                                   75.0%
```

</details>

railsbench after:

<details>

```
***ZJIT: Printing ZJIT statistics on exit***
Top-20 not inlined C methods (66.0% of total 25,597,895):
                                   Hash#[]=: 1,724,042 ( 6.7%)
                             String#getbyte: 1,572,123 ( 6.1%)
                                  String#<<: 1,494,022 ( 5.8%)
                               Kernel#is_a?: 1,429,946 ( 5.6%)
                              String#empty?: 1,370,323 ( 5.4%)
                              Regexp#match?: 1,235,067 ( 4.8%)
                         Kernel#respond_to?: 1,198,251 ( 4.7%)
                                  Hash#key?: 1,087,406 ( 4.2%)
                             String#setbyte:   810,022 ( 3.2%)
                                  Integer#^:   766,624 ( 3.0%)
                        Kernel#block_given?:   603,613 ( 2.4%)
                                  String#==:   590,699 ( 2.3%)
                                  Class#new:   506,216 ( 2.0%)
                                Hash#delete:   455,288 ( 1.8%)
                             BasicObject#!=:   428,771 ( 1.7%)
                                 Hash#fetch:   408,621 ( 1.6%)
                         String#ascii_only?:   373,915 ( 1.5%)
                 ObjectSpace::WeakKeyMap#[]:   287,957 ( 1.1%)
                               NilClass#===:   277,244 ( 1.1%)
                               Kernel#Array:   269,590 ( 1.1%)
Top-20 not annotated C methods (66.7% of total 25,465,615):
                                   Hash#[]=: 1,724,221 ( 6.8%)
                             String#getbyte: 1,572,123 ( 6.2%)
                               Kernel#is_a?: 1,515,688 ( 6.0%)
                                  String#<<: 1,494,022 ( 5.9%)
                              String#empty?: 1,370,478 ( 5.4%)
                              Regexp#match?: 1,235,067 ( 4.8%)
                         Kernel#respond_to?: 1,198,251 ( 4.7%)
                                  Hash#key?: 1,087,406 ( 4.3%)
                             String#setbyte:   810,022 ( 3.2%)
                                  Integer#^:   766,624 ( 3.0%)
                        Kernel#block_given?:   603,613 ( 2.4%)
                                  String#==:   601,405 ( 2.4%)
                                  Class#new:   506,216 ( 2.0%)
                                Hash#delete:   455,288 ( 1.8%)
                             BasicObject#!=:   428,876 ( 1.7%)
                                 Hash#fetch:   408,621 ( 1.6%)
                         String#ascii_only?:   373,915 ( 1.5%)
                 ObjectSpace::WeakKeyMap#[]:   287,957 ( 1.1%)
                               NilClass#===:   277,244 ( 1.1%)
                               Kernel#Array:   269,590 ( 1.1%)
Top-2 not optimized method types for send (100.0% of total 186,159):
   iseq: 112,747 (60.6%)
  cfunc:  73,412 (39.4%)
Top-6 not optimized method types for send_without_block (100.0% of total 8,142,248):
       iseq: 3,464,671 (42.6%)
  optimized: 2,632,884 (32.3%)
    bmethod: 1,290,701 (15.9%)
      alias:   706,020 ( 8.7%)
       null:    47,942 ( 0.6%)
      cfunc:        30 ( 0.0%)
Top-11 not optimized instructions (100.0% of total 8,442,456):
             invokesuper: 5,649,857 (66.9%)
             invokeblock: 1,764,936 (20.9%)
             sendforward:   551,832 ( 6.5%)
                  opt_eq:   441,959 ( 5.2%)
                opt_plus:    31,635 ( 0.4%)
  opt_send_without_block:     1,163 ( 0.0%)
                  opt_lt:       372 ( 0.0%)
                opt_mult:       251 ( 0.0%)
                  opt_ge:       193 ( 0.0%)
                 opt_neq:       149 ( 0.0%)
                  opt_or:       109 ( 0.0%)
Top-8 send fallback reasons (100.0% of total 40,796,314):
                send_without_block_polymorphic: 12,933,921 (31.7%)
                              send_no_profiles:  9,033,616 (22.1%)
                     not_optimized_instruction:  8,442,456 (20.7%)
  send_without_block_not_optimized_method_type:  8,142,248 (20.0%)
                send_without_block_no_profiles:  1,839,228 ( 4.5%)
       send_without_block_cfunc_array_variadic:    215,046 ( 0.5%)
                send_not_optimized_method_type:    186,159 ( 0.5%)
                      obj_to_string_not_string:      3,640 ( 0.0%)
Top-8 unhandled YARV insns (100.0% of total 1,467,151):
    getclassvariable: 458,136 (31.2%)
       getblockparam: 455,921 (31.1%)
        checkkeyword: 265,425 (18.1%)
  invokesuperforward: 239,383 (16.3%)
         getconstant:  48,100 ( 3.3%)
          checkmatch:     149 ( 0.0%)
                once:      23 ( 0.0%)
   opt_duparray_send:      14 ( 0.0%)
Top-3 compile error reasons (100.0% of total 5,825,923):
  register_spill_on_alloc: 5,225,940 (89.7%)
        exception_handler:   356,784 ( 6.1%)
  register_spill_on_ccall:   243,199 ( 4.2%)
Top-13 side exit reasons (100.0% of total 12,530,763):
                        compile_error: 5,825,923 (46.5%)
                  unhandled_yarv_insn: 1,467,151 (11.7%)
                  guard_shape_failure: 1,462,876 (11.7%)
                   guard_type_failure:   845,913 ( 6.8%)
  block_param_proxy_not_iseq_or_ifunc:   765,968 ( 6.1%)
                      unhandled_kwarg:   658,341 ( 5.3%)
                           patchpoint:   504,437 ( 4.0%)
                      unhandled_splat:   446,990 ( 3.6%)
                unknown_newarray_send:   332,740 ( 2.7%)
                   unhandled_hir_insn:   160,273 ( 1.3%)
           block_param_proxy_modified:    59,589 ( 0.5%)
               obj_to_string_fallback:       553 ( 0.0%)
                            interrupt:         9 ( 0.0%)
                             send_count: 119,163,569
                     dynamic_send_count:  40,796,314 (34.2%)
                   optimized_send_count:  78,367,255 (65.8%)
              iseq_optimized_send_count:  39,911,967 (33.5%)
      inline_cfunc_optimized_send_count:  12,857,393 (10.8%)
non_variadic_cfunc_optimized_send_count:  19,770,401 (16.6%)
    variadic_cfunc_optimized_send_count:   5,827,494 ( 4.9%)
dynamic_getivar_count:                      10,980,323
dynamic_setivar_count:                      12,986,381
compiled_iseq_count:                             2,523
failed_iseq_count:                                 252
compile_time:                                    420ms
profile_time:                                     21ms
gc_time:                                          30ms
invalidation_time:                                 4ms
vm_write_pc_count:                         128,973,665
vm_write_sp_count:                         125,926,968
vm_write_locals_count:                     125,926,968
vm_write_stack_count:                      125,926,968
vm_write_to_parent_iseq_local_count:           385,752
vm_read_from_parent_iseq_local_count:       11,267,766
code_region_bytes:                          12,189,696
side_exit_count:                            12,530,763
total_insn_count:                          866,667,490
vm_insn_count:                             217,813,201
zjit_insn_count:                           648,854,289
ratio_in_zjit:                                   74.9%
```

</details>
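
As a sanity check on the dumps above, `ratio_in_zjit` appears to be `zjit_insn_count` divided by `total_insn_count`, where `total_insn_count` is the sum of `vm_insn_count` and `zjit_insn_count`. That relationship is an assumption inferred from the numbers, not taken from the ZJIT source. A minimal sketch, with hypothetical helper names:

```rust
/// Percentage of executed instructions that ran in JIT-compiled code.
/// Assumption: the ratio is zjit / (vm + zjit), inferred from the stats dump.
fn ratio_in_zjit(vm_insn_count: u64, zjit_insn_count: u64) -> f64 {
    let total = vm_insn_count + zjit_insn_count;
    if total == 0 {
        // Nothing executed: report 0% rather than dividing by zero.
        return 0.0;
    }
    100.0 * zjit_insn_count as f64 / total as f64
}

/// Format a percentage with one decimal place, matching the dump's layout.
fn format_ratio(ratio: f64) -> String {
    format!("{:.1}%", ratio)
}
```

Feeding in the "after" counters (`217_813_201` and `648_854_289`) gives `74.9%`, and the "before" counters (`216_821_134` and `649_959_024`) give `75.0%`, which agrees with both dumps.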
2025-10-20 10:55:52 -04:00
Max Bernstein
7a474e1fbd
ZJIT: Inline String#getbyte (#14842) 2025-10-16 02:01:00 +00:00
Alan Wu
4c426e98a8 ZJIT: Use rb_gc_disable() over rb_gc_disable_no_rest()
no_rest() trips an assert inside the GC when we allocate with the GC
disabled this way:

    (gc_continue) ../src/gc/default/default.c:2029
    (newobj_cache_miss+0x128) [0x105040048] ../src/gc/default/default.c:2370
    (rb_gc_impl_new_obj+0x7c) [0x105036374] ../src/gc/default/default.c:2482
    (newobj_of) ../src/gc.c:995
    (rb_method_entry_alloc+0x40) [0x1051e6c64] ../src/vm_method.c:1102
    (rb_method_entry_complement_defined_class) ../src/vm_method.c:1180
    (prepare_callable_method_entry+0x14c) [0x1051e87b8] ../src/vm_method.c:1728
    (callable_method_entry_or_negative+0x1e8) [0x1051e809c] ../src/vm_method.c:1874

It tries to continue the GC because it was out of space. Looks like
it's not safe to allocate new objects after using
rb_gc_disable_no_rest(); existing usages use it for malloc calls.
2025-10-15 16:36:46 -04:00
Alan Wu
31a1a39ace ZJIT: Never yield to the GC while compiling
This fixes a reliable "ZJIT saw a dead object" repro on my machine, and should
fix the flaky ones on CI. The code for disabling the GC is the same as
the code in newobj_of().

See: https://github.com/ruby/ruby/actions/runs/18511676257/job/52753782036
2025-10-15 13:27:30 -04:00
Alan Wu
5bda42e4de ZJIT: Include GC object dump when seeing dead objects
Strictly more info than just the builtin_type from `assert_ne!`.

Old:

    assertion `left != right` failed: ZJIT should only see live objects
      left: 0
     right: 0

New:

    ZJIT saw a dead object. T_type=0, out-of-heap:0x0000000110d4bb40

Also, the new `VALUE::obj_info` is more flexible for print debugging than the
dump_info() it replaces. It now allows you to use it as part of a `format!`
string instead of always printing to stderr for you.
2025-10-14 22:34:50 -04:00
Aiden Fox Ivey
50cd34c4e8
ZJIT: Add Insn::ArrayArefFixnum to accelerate Array#[] (#14717)
* ZJIT: Add Insn::ArrayArefFixnum to accelerate Array#[]

* ZJIT: Use result from GuardType in ArrayArefFixnum

* ZJIT: Unbox index for aref_fixnum

* ZJIT: Change condition and add ArrayArefFixnum test

* ZJIT: Fix ArrayArefFixnum display for InsnPrinter

* ZJIT: Change insta test
2025-10-10 10:22:15 -07:00
Max Bernstein
09e5c5eed1
ZJIT: Name enum for bindgen (#14802)
Relying on having the same compiler version and behavior across
platforms is brittle, as Kokubun points out. Instead, name the enum so
we don't have to rely on gensym stability.

Fix https://github.com/Shopify/ruby/issues/787
2025-10-09 17:06:49 +00:00
Aiden Fox Ivey
2f1c30cd50
ZJIT: Add --zjit-trace-exits (#14640)
Add side exit tracing functionality for ZJIT
2025-09-30 15:55:33 +00:00
Max Bernstein
254b9b4952 ZJIT: Expand the list of safe allocators
It's not just the default allocator; other allocators are also leaf.
2025-09-19 22:38:29 -04:00
Stan Lo
797a4115bb
ZJIT: Support variadic C calls (#14575)
* ZJIT: Support variadic C calls

This reduces the `dynamic_send_count` in `liquid-render` by ~21%

* ZJIT: Reuse gen_push_frame

* ZJIT: Avoid optimizing variadic C call when tracing is enabled
2025-09-18 14:46:03 -04:00
Max Bernstein
88e0ac35a3 ZJIT: Prevent custom allocator in ObjectAllocClass 2025-09-17 17:27:35 -04:00
Max Bernstein
7a82f1faa0 ZJIT: Const-fold IsMethodCfunc 2025-09-17 17:27:35 -04:00
Max Bernstein
c7c6bcc9c8
ZJIT: Print local names in FrameState (#14571) 2025-09-16 15:41:08 -04:00
Takashi Kokubun
d3cb347a40
ZJIT: Share more code with YJIT in jit.c (#14520)
* ZJIT: Share more code with YJIT in jit.c

* Fix ZJIT references to JIT
2025-09-12 20:34:55 +00:00
Takashi Kokubun
4131ace07a
ZJIT, YJIT: Drop "// From xxx.h" comments in bindgen (#14519) 2025-09-11 21:50:27 -07:00
Takashi Kokubun
866e474ac8
ZJIT: Fix backtraces on opt_new (#14461) 2025-09-08 09:50:33 -07:00
Takashi Kokubun
4f030951f2
ZJIT: Invalidate local variables on EP escape (#14448) 2025-09-05 11:26:01 -07:00
Stan Lo
856db87a2a
ZJIT: Add patchpoint for TracePoint (#14420)
ZJIT: Add patchpoint for TracePoint activation

Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com>
Co-authored-by: Alan Wu <XrXr@users.noreply.github.com>
2025-09-04 11:37:06 -07:00
Stan Lo
77a421fb05
ZJIT: Clear jit entry from iseqs after TracePoint activation (#14407)
ZJIT: Remove JITed code after TracePoint is enabled
2025-09-02 19:20:08 +00:00
Stan Lo
3f3a54efff Add rb_jit_vm_unlock and share it in ZJIT and YJIT 2025-08-29 12:55:14 -07:00
Stan Lo
561050496c Add rb_jit_vm_lock_then_barrier and share it in ZJIT and YJIT 2025-08-29 12:55:14 -07:00
Stan Lo
2f6a9c5167 Add rb_jit_multi_ractor_p and share it in ZJIT and YJIT 2025-08-29 12:55:14 -07:00
Max Bernstein
b6f4b5399d
ZJIT: Specialize monomorphic GetIvar (#14388)
Specialize monomorphic `GetIvar` into:

* `GuardType(HeapObject)`
* `GuardShape`
* `LoadIvarEmbedded` or `LoadIvarExtended`

This requires profiling self for `getinstancevariable` (it's not on the operand
stack).

This also optimizes `GetIvar`s that happen as a result of inlining
`attr_reader` and `attr_accessor`.

Also move some (newly) shared JIT helpers into jit.c.
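
The guard-then-load sequence described above can be modelled roughly as follows. This is a toy sketch, not ZJIT's HIR or codegen: `Value`, `HeapObject`, `Storage`, `SideExit`, and `SpecializedGetIvar` are all hypothetical names, and ivars are plain `i64`s. The real compiler would pick `LoadIvarEmbedded` or `LoadIvarExtended` at compile time from the profiled shape, whereas this model checks the storage kind at run time.

```rust
/// Hypothetical stand-in for a shape id.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
struct ShapeId(u32);

/// Where an object keeps its ivars in this toy model.
#[derive(Debug)]
enum Storage {
    Embedded([i64; 3]), // slots stored inline in the object
    Extended(Vec<i64>), // slots stored in a separate buffer
}

#[derive(Debug)]
struct HeapObject {
    shape: ShapeId,
    storage: Storage,
}

/// A receiver is either an immediate or a heap object.
#[derive(Debug)]
enum Value {
    Immediate(i64),
    Heap(HeapObject),
}

/// Reasons the specialized path gives up and falls back to the interpreter.
#[derive(Debug, PartialEq)]
enum SideExit {
    GuardTypeFailure,
    GuardShapeFailure,
}

/// A GetIvar specialized for one profiled shape and one slot index.
struct SpecializedGetIvar {
    expected_shape: ShapeId,
    index: usize,
}

impl SpecializedGetIvar {
    fn run(&self, recv: &Value) -> Result<i64, SideExit> {
        // GuardType(HeapObject): immediates have no ivar storage.
        let obj = match recv {
            Value::Heap(obj) => obj,
            Value::Immediate(_) => return Err(SideExit::GuardTypeFailure),
        };
        // GuardShape: the slot index is only valid for the profiled shape.
        if obj.shape != self.expected_shape {
            return Err(SideExit::GuardShapeFailure);
        }
        // LoadIvarEmbedded or LoadIvarExtended.
        // Assumption: a matching shape implies `index` is in bounds.
        Ok(match &obj.storage {
            Storage::Embedded(slots) => slots[self.index],
            Storage::Extended(slots) => slots[self.index],
        })
    }
}
```

Once both guards pass, the load itself is a single indexed read, which is the point of specializing the monomorphic case.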
2025-08-29 12:46:08 -04:00
Max Bernstein
8521725225 ZJIT: Generate code for ArrayExtend 2025-08-28 10:14:37 -07:00
Max Bernstein
07e28ba486 ZJIT: Generate code for DefinedIvar 2025-08-28 10:14:37 -07:00
Takashi Kokubun
76810fc349
ZJIT: Implement side exit stats (#14357) 2025-08-27 10:01:07 -07:00
Étienne Barrié
b0c80c2be8 Remove unused SPECIAL_CONST_SHAPE_ID
Its usage was removed in 306d50811dd060d876d1eb364a0d5e6106f5e4f1.
2025-08-21 17:41:39 +02:00
Daniel Colson
fc5ee247d5
ZJIT: Compile toregexp (#14200)
`toregexp` is fairly similar to `concatstrings`, so this commit extracts
a helper for pushing and popping operands on the native stack.

There's probably opportunity to move some of this into lir (e.g. Alan
suggested a push_many that could use STP on ARM to push 2 at a time),
but I might save that for another day.
2025-08-19 10:02:13 -04:00
Max Bernstein
ef95e5ba3d
ZJIT: Profile type+shape distributions (#13901)
ZJIT uses the interpreter to take type profiles of what objects pass through
the code. It stores a compressed record of the history per opcode for the
opcodes we select.

Before this change, we re-used the HIR Type data-structure, a shallow type
lattice, to store historical type information. This was quick for bringup but
is quite lossy as profiles go: we get one bit per built-in type seen, and if we
see a non-built-in type in addition, we end up with BasicObject. Not very
helpful. Additionally, it does not give us any notion of cardinality: how many
of each type did we see?

This change brings with it a much more interesting slice of type history: a
histogram. A Distribution holds a record of the top-N (where N is fixed at Ruby
compile-time) `(Class, ShapeId)` pairs and their counts. It also holds an
*other* count in case we see more than N pairs.

Using this distribution, we can make more informed decisions about when we
should use type information. We can determine if we are strictly monomorphic,
very nearly monomorphic, or something else. Maybe the call-site is polymorphic,
so we should have a polymorphic inline cache. Exciting stuff.

I also plumb this new distribution into the HIR part of the compilation
pipeline.
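
To make the histogram idea concrete, here is a minimal sketch of a fixed-size distribution with an overflow bucket. It is not the ZJIT implementation: `ProfiledType`, the field names, and the classification helpers are hypothetical, and it keeps the first N distinct pairs it sees rather than maintaining a true top-N ordering.

```rust
/// Hypothetical stand-in for a `(Class, ShapeId)` pair.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
struct ProfiledType {
    class: u64,
    shape: u32,
}

/// Fixed-capacity histogram: up to N distinct pairs with counts,
/// plus an `other` counter for everything that did not fit.
#[derive(Debug)]
struct Distribution<const N: usize> {
    buckets: [Option<(ProfiledType, u64)>; N],
    other: u64,
}

impl<const N: usize> Distribution<N> {
    fn new() -> Self {
        Distribution { buckets: [None; N], other: 0 }
    }

    /// Record one observation of `ty`.
    fn observe(&mut self, ty: ProfiledType) {
        // Bump the count if this pair already has a bucket.
        for slot in self.buckets.iter_mut() {
            if let Some((seen, count)) = slot {
                if *seen == ty {
                    *count += 1;
                    return;
                }
            }
        }
        // Otherwise claim the first free bucket.
        for slot in self.buckets.iter_mut() {
            if slot.is_none() {
                *slot = Some((ty, 1));
                return;
            }
        }
        // No room left: only remember that something else was seen.
        self.other += 1;
    }

    /// Number of distinct pairs that have their own bucket.
    fn num_seen(&self) -> usize {
        self.buckets.iter().flatten().count()
    }

    /// Exactly one pair was ever observed.
    fn is_monomorphic(&self) -> bool {
        self.other == 0 && self.num_seen() == 1
    }

    /// Several pairs were observed, all of which fit in the buckets.
    fn is_polymorphic(&self) -> bool {
        self.other == 0 && self.num_seen() > 1
    }
}
```

Unlike a one-bit-per-type lattice, the counts let a consumer distinguish a site that saw one pair a thousand times from one that saw many pairs once each.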
2025-08-05 16:56:04 -04:00
Takashi Kokubun
b22eb0e468
ZJIT: Add --zjit-stats (#14034) 2025-07-29 10:00:15 -07:00
Alan Wu
960fae438b
ZJIT: Add missing write barrier in profiling (GH-13922)
Fixes `TestZJIT::test_require_rubygems`. It was crashing locally due to
false collection of a live object. See
<https://alanwu.space/post/write-barrier/>.

Co-authored-by: Max Bernstein <max@bernsteinbear.com>
Co-authored-by: Takashi Kokubun <takashi.kokubun@shopify.com>
Co-authored-by: Stan Lo <stan.lo@shopify.com>
2025-07-16 23:25:37 +00:00
Takashi Kokubun
acc3172530
ZJIT: Profile each instruction at most num_profiles times (#13903)
* ZJIT: Profile each instruction at most num_profiles times

* Use saturating_add for num_profiles
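
A rough sketch of why a capped profile counter benefits from `saturating_add`: with a narrow integer type, a wrapping increment would eventually roll over to zero and re-enable profiling. The `InsnProfile` struct, its `u8` counter width, and the `should_profile` method are assumptions made for illustration; only the `num_profiles` name comes from the commit message.

```rust
/// Hypothetical per-instruction profiling state.
struct InsnProfile {
    /// How many times this instruction has been seen (assumed to be a u8 here).
    num_profiles: u8,
}

impl InsnProfile {
    /// Returns true while the instruction has been profiled fewer than
    /// `limit` times, and counts this execution either way.
    fn should_profile(&mut self, limit: u8) -> bool {
        let profile = self.num_profiles < limit;
        // saturating_add pins the counter at u8::MAX instead of wrapping
        // back to 0, so profiling never silently restarts.
        self.num_profiles = self.num_profiles.saturating_add(1);
        profile
    }
}
```

Calling `should_profile(3)` a thousand times on a fresh `InsnProfile` returns `true` only for the first three calls and leaves the counter at `u8::MAX`.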
2025-07-16 09:53:10 -07:00