> ..., and on other POSIX systems we'll use `read`.
As `pm_string_mapped_init`'s doc comment says, it should fall back to a
`read(2)`-based implementation on platforms without memory-mapped files,
such as WASI, but it didn't. This commit fixes that by calling
`pm_string_file_init` in the fallback case.
Also, the `defined(_POSIX_MAPPED_FILES)` check for the `read(2)`-based path
is unnecessary and prevents the fallback from being executed, so this
change removes it.
https://github.com/ruby/prism/commit/b3d9064b71
Previously, GCC 11 on x86-64 inlined the heavyweight logic for
potentially triggering GC into newobj_alloc(). This slowed down
the hotter code path where the ractor cache hits, causing a degradation
in allocation throughput.
Move the logic out of line into a separate function and mark it as never
inlined.
This restores allocation throughput to the same level as
98eeadc ("Development of 3.4.0 started.").
To evaluate, instrument miniruby so it allocates a bunch of objects and
then exits:
diff --git a/eval.c b/eval.c
--- a/eval.c
+++ b/eval.c
@@ -92,6 +92,15 @@ ruby_setup(void)
     }
     EC_POP_TAG();
+    rb_gc_disable();
+    rb_execution_context_t *ec = GET_EC();
+    long const n = 20000000;
+    for (long i = 0; i < n; ++i) {
+        rb_wb_protected_newobj_of(ec, 0, T_OBJECT, 40);
+    }
+    printf("alloc %ld\n", n);
+    exit(0);
+
     return state;
 }
With `3.3-equiv` being 98eeadc, and `pre` being f2728c3393d
and `post` being this commit, I have:
$ hyperfine -L buildtag post,pre,3.3-equiv '/ruby/build-{buildtag}/miniruby'
Benchmark 1: /ruby/build-post/miniruby
Time (mean ± σ): 873.4 ms ± 2.8 ms [User: 377.6 ms, System: 490.2 ms]
Range (min … max): 868.3 ms … 877.8 ms 10 runs
Benchmark 2: /ruby/build-pre/miniruby
Time (mean ± σ): 960.1 ms ± 2.8 ms [User: 430.8 ms, System: 523.9 ms]
Range (min … max): 955.5 ms … 964.2 ms 10 runs
Benchmark 3: /ruby/build-3.3-equiv/miniruby
Time (mean ± σ): 886.9 ms ± 2.8 ms [User: 379.5 ms, System: 501.0 ms]
Range (min … max): 883.0 ms … 890.8 ms 10 runs
Summary
'/ruby/build-post/miniruby' ran
1.02 ± 0.00 times faster than '/ruby/build-3.3-equiv/miniruby'
1.10 ± 0.00 times faster than '/ruby/build-pre/miniruby'
These results are from a Skylake server with GCC 11.
We discovered that having gc.o and gc_impl.o in separate translation
units diminishes codegen quality with GCC 11 on x86-64. This commit
solves that problem by including gc/default.c into gc.c, giving the
optimizer visibility into the bodies of functions again in builds
that do not use link-time optimization, which are common.
This effectively restores things to the way they were before
[Feature #20470] from the optimizer's perspective while maintaining the
ability to build gc/default.c as a DSO.
There were a few functions duplicated across gc.c and gc/default.c.
Extract them and put them into gc/gc.h.
[Bug #20653]
This commit refactors how Onigmo handles timeouts. Instead of raising a
timeout error itself, onig_search now returns ONIGERR_TIMEOUT, so the
caller can free memory first and then raise the timeout error.
This fixes a memory leak in String#start_with? when the regexp times out.
For example:
regex = Regexp.new("^#{"(a*)" * 10_000}x$", timeout: 0.000001)
str = "a" * 1000000 + "x"
10.times do
  100.times do
    str.start_with?(regex)
  rescue
  end
  puts `ps -o rss= -p #{$$}`
end
Before:
33216
51936
71152
81728
97152
103248
120384
133392
133520
133616
After:
14912
15376
15824
15824
16128
16128
16144
16144
16160
16160
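The control flow described above (report the timeout through a return value, let the caller release what it owns, then raise) can be sketched in plain Ruby. This is only an illustration of the pattern: `Region`, `search`, `match`, and `MatchTimeout` are hypothetical stand-ins, not Onigmo's or Ruby's internals, and the `:timeout` symbol merely plays the role of ONIGERR_TIMEOUT.

```ruby
# Hypothetical stand-ins for illustration; none of these names are real APIs.
class MatchTimeout < StandardError; end

# Models a resource the caller owns and must release.
class Region
  def initialize
    @freed = false
  end

  def free
    @freed = true
  end

  def freed?
    @freed
  end
end

# Plays the role of the search function: it reports a timeout through its
# return value and leaves cleanup to the caller instead of raising.
def search(timed_out)
  timed_out ? :timeout : 0
end

# Plays the role of the caller: release resources first, then raise.
def match(region, timed_out)
  result = search(timed_out)
  if result == :timeout
    region.free
    raise MatchTimeout, "regexp match timeout"
  end
  result
end

region = Region.new
begin
  match(region, true)
rescue MatchTimeout
  puts "freed before raise: #{region.freed?}"
end
```

Because the raise happens in the caller rather than deep inside the search, there is a well-defined point at which everything allocated for the match can be released.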
We use the pre-existence of `rake_path` to decide whether we need to
regenerate dummy test gems in `tmp`. When changing rubies, the previous
implementation wrongly believes that the correct `rake_path` exists
and skips regenerating the dummy gems, giving an error like the following
when specs are run:
```
(...)
Could not find rubygems-generate_index lib directory in /path/to/rubygems/bundler/tmp/1/gems/base/ruby/3.2.0
# ./spec/support/builders.rb:253:in `block in update_repo'
# ./spec/support/helpers.rb:337:in `block in with_gem_path_as'
# ./spec/support/helpers.rb:351:in `without_env_side_effects'
# ./spec/support/helpers.rb:332:in `with_gem_path_as'
# ./spec/support/builders.rb:251:in `update_repo'
# ./spec/support/builders.rb:228:in `build_repo'
# ./spec/support/builders.rb:197:in `build_repo4'
# ./spec/commands/lock_spec.rb:103:in `block (2 levels) in <top (required)>'
(...)
```
To fix this, pin the part of the path that depends on the implementation
and the Ruby version so that we don't get false positives.
https://github.com/rubygems/rubygems/commit/fafacfa210
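A minimal sketch of the false positive, assuming the lookup is a glob over a base directory laid out as `<engine>/<version>/...` (as the `ruby/3.2.0` segment in the error above suggests). The helper names `ruby_scope`, `loose_rake_path`, and `scoped_rake_path` are hypothetical and are not taken from Bundler's spec support code.

```ruby
require "rbconfig"
require "tmpdir"
require "fileutils"

# Engine and ABI version of the running interpreter, e.g. "ruby/3.2.0".
def ruby_scope
  File.join(RUBY_ENGINE, RbConfig::CONFIG["ruby_version"])
end

# Wildcards match gems left behind by any engine or version.
def loose_rake_path(base)
  Dir[File.join(base, "*", "*", "**", "rake*.gem")].first
end

# Only matches gems generated for the running interpreter.
def scoped_rake_path(base)
  Dir[File.join(base, ruby_scope, "**", "rake*.gem")].first
end

Dir.mktmpdir do |base|
  # Simulate leftovers from a different Ruby ("0.0.0" never matches a real one).
  stale = File.join(base, "ruby", "0.0.0", "cache")
  FileUtils.mkdir_p(stale)
  FileUtils.touch(File.join(stale, "rake-13.0.0.gem"))

  puts "loose lookup found a gem:  #{!loose_rake_path(base).nil?}"
  puts "scoped lookup found a gem: #{!scoped_rake_path(base).nil?}"
end
```

With the scoped lookup, switching rubies yields no match, so the dummy gems get regenerated.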
Removes the symlink for gems.rb.tt and instead uses the singular
template file. Only the destination filename for the gemfile reads from
the `init_gems_rb` setting.
https://github.com/rubygems/rubygems/commit/43ce0e1666
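As a rough sketch of the idea, a single template can serve both cases if only the destination name depends on the setting. The template name `Gemfile.tt` and the helper names below are assumptions made for illustration, not Bundler's actual code.

```ruby
# Assumed name of the one remaining template file.
GEMFILE_TEMPLATE = "Gemfile.tt"

# Only the destination filename depends on the `init_gems_rb` setting.
def gemfile_destination(init_gems_rb)
  init_gems_rb ? "gems.rb" : "Gemfile"
end

# Maps the single template to whichever destination applies.
def template_mapping(init_gems_rb)
  { GEMFILE_TEMPLATE => gemfile_destination(init_gems_rb) }
end

p template_mapping(false)
p template_mapping(true)
```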
I get a slight boost from these with GCC 11 on Intel Skylake.
Part of a larger story to fix an allocation throughput regression
compared to 98eeadc ("Development of 3.4.0 started.") as the baseline.
[Bug #20650]
The capture group allocates memory that is leaked when it times out.
For example:
re = Regexp.new("^#{"(a*)" * 10_000}x$", timeout: 0.000001)
str = "a" * 1000000 + "x"
10.times do
  100.times do
    re =~ str
  rescue Regexp::TimeoutError
  end
  puts `ps -o rss= -p #{$$}`
end
Before:
34688
56416
78288
100368
120784
140704
161904
183568
204320
224800
After:
16288
16288
16880
16896
16912
16928
16944
17184
17184
17200
* Note that we could shift the flags by 2 on serialize & deserialize,
but it does not seem worth it, as it would not reduce the serialized size
by any significant amount, i.e. average was 0.799 before #2924.
* $ bundle exec rake serialized_size:topgems
Before:
Total sizes for top 100 gems:
total source size: 90207647
total serialized size: 69477115
total serialized/total source: 0.770
Stats of ratio serialized/source per file:
average: 0.844
median: 0.825
1st quartile: 0.597
3rd quartile: 1.064
min - max: 0.078 - 3.792
After:
Total sizes for top 100 gems:
total source size: 90207647
total serialized size: 66150209
total serialized/total source: 0.733
Stats of ratio serialized/source per file:
average: 0.800
median: 0.779
1st quartile: 0.568
3rd quartile: 1.007
min - max: 0.076 - 3.675
https://github.com/ruby/prism/commit/e012072f70
* $ bundle exec rake serialized_size:topgems
Before:
Total sizes for top 100 gems:
total source size: 90207647
total serialized size: 86284647
total serialized/total source: 0.957
Stats of ratio serialized/source per file:
average: 0.952
median: 0.937
1st quartile: 0.669
3rd quartile: 1.206
min - max: 0.080 - 4.065
After:
Total sizes for top 100 gems:
total source size: 90207647
total serialized size: 69477115
total serialized/total source: 0.770
Stats of ratio serialized/source per file:
average: 0.844
median: 0.825
1st quartile: 0.597
3rd quartile: 1.064
min - max: 0.078 - 3.792
https://github.com/ruby/prism/commit/cf90fe5759
OpenSSL::ASN1 is being rewritten in Ruby. To make it easier, let's
remove the dependency on the instance variables and the internal-use
function ossl_asn1_get_asn1type() from code outside OpenSSL::ASN1.
This also fixes the insufficient validation of the passed value with
its tagging.
https://github.com/ruby/openssl/commit/35a157462e
The spec is actually testing a behaviour stemming from NUM2INT(), and
since `sizeof(long)>=sizeof(int)`, `min_long-1` always makes NUM2INT()
raise `RangeError`.
There is no guarantee that Integer#size will continue to return
`sizeof(long)` for small integers.
Use the `l!` specifier for Array#pack instead. It is a public
interface that has a direct relationship with the `long` type.
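A minimal sketch of deriving the bounds of `long` from the `l!` specifier rather than from Integer#size. The variable names are illustrative and are not taken from the spec suite.

```ruby
# "l!" packs a native C `long`, so the packed width is sizeof(long).
long_bytes = [0].pack("l!").bytesize
long_bits  = long_bytes * 8

# Two's-complement bounds for a signed integer of that width.
min_long = -(2**(long_bits - 1))
max_long = 2**(long_bits - 1) - 1

# Both bounds survive a round trip through the native `long` encoding.
puts [min_long].pack("l!").unpack1("l!") == min_long
puts [max_long].pack("l!").unpack1("l!") == max_long

# A value such as this one is below the range of `long`, which is the
# kind of input the spec feeds to a NUM2INT()-based method.
below_min_long = min_long - 1
puts below_min_long < min_long
```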
What a "word" is when talking about sizes is confusing because it's a
highly overloaded term. Intel, Microsoft, and GDB are just a few vendors
that have their own definition of what a "word" is. Specs that used the
"wordsize" guard actually were mostly testing for the size of the C
`long` fundamental type, so rename the guard for clarity.
Also, get the size of `long` directly from RbConfig instead of guessing
using Integer#size. Integer#size is not guaranteed to have anything to
do with the `long` type.
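For illustration, the size of `long` can be read from RbConfig via the standard `rbconfig/sizeof` library. The constant and helper names below are hypothetical and may differ from the guard the specs actually use.

```ruby
require "rbconfig/sizeof"

# sizeof(long) in bits, taken from the build configuration.
C_LONG_BITS = RbConfig::SIZEOF["long"] * 8

# Hypothetical guard helper: true when `long` has the given width in bits.
def c_long_size?(bits)
  C_LONG_BITS == bits
end

puts "long is #{C_LONG_BITS} bits"
puts "agrees with pack('l!'): #{C_LONG_BITS == [0].pack("l!").bytesize * 8}"
```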
This addresses one of the issues in the `test_kw_splat_nil` failure, but
doesn't make the test pass because of other changes that need to be made
to Prism directly.
One issue was that, for the following code, Prism was using
`putobject` with an empty hash whereas the parse.y parser used `putnil`.
```ruby
:ok.itself(**nil)
```
Before:
```
0000 putobject :ok ( 1)[Li]
0002 putobject {}
0004 opt_send_without_block <calldata!mid:itself, argc:1, KW_SPLAT>
0006 leave
```
After:
```
== disasm: #<ISeq:<main>@test2.rb:1 (1,0)-(1,17)>
0000 putobject :ok ( 1)[Li]
0002 putnil
0003 opt_send_without_block <calldata!mid:itself, argc:1, KW_SPLAT>
0005 leave
```
Related to ruby/prism#2935.
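To check which instruction a given build emits, the snippet can be compiled and disassembled with `RubyVM::InstructionSequence` (CRuby only). This is a small sketch; the output depends on the Ruby version and on which parser the running interpreter uses, so no particular listing is assumed here.

```ruby
# Compile without executing, then inspect the generated instructions.
iseq = RubyVM::InstructionSequence.compile(":ok.itself(**nil)")
disasm = iseq.disasm

puts disasm
puts "uses putnil for the keyword splat: #{disasm.include?("putnil")}"
```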