2351 Commits

Author SHA1 Message Date
Jean Boussier
f48e45d1e9 Move object_id in object fields.
And get rid of the `obj_to_id_tbl`

It's no longer needed, the `object_id` is now stored inline
in the object alongside instance variables.

We still need the inverse table in case `_id2ref` is invoked, but
we lazily build it by walking the heap if that happens.
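
The lazy inverse lookup described above can be sketched roughly as follows. This is a toy model under stated assumptions, not Ruby's implementation: `toy_obj`, `toy_heap`, `toy_object_id` and `toy_id2ref` are hypothetical names, the "heap" is a fixed array, and keeping the table in sync after it has been built is an assumption of the sketch.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy model only: every name here is hypothetical, not Ruby's internals. */
#define TOY_HEAP_SIZE 8

typedef struct toy_obj {
    uint64_t object_id;   /* stored inline in the object; 0 = not assigned yet */
} toy_obj;

static toy_obj toy_heap[TOY_HEAP_SIZE];
static uint64_t toy_next_id = 1;

/* Inverse table (id -> object), built only if toy_id2ref is ever called. */
static toy_obj *toy_id_to_obj[TOY_HEAP_SIZE + 1];
static int toy_inverse_built = 0;

/* Forward direction: assign the id lazily and keep it inside the object. */
static uint64_t toy_object_id(toy_obj *obj)
{
    if (obj->object_id == 0) {
        obj->object_id = toy_next_id++;
        /* Assumption: once the inverse table exists, keep it up to date. */
        if (toy_inverse_built) toy_id_to_obj[obj->object_id] = obj;
    }
    return obj->object_id;
}

/* Reverse direction: walk the heap once to build the table, then look up. */
static toy_obj *toy_id2ref(uint64_t id)
{
    if (!toy_inverse_built) {
        for (size_t i = 0; i < TOY_HEAP_SIZE; i++) {
            uint64_t oid = toy_heap[i].object_id;
            if (oid != 0) toy_id_to_obj[oid] = &toy_heap[i];
        }
        toy_inverse_built = 1;
    }
    if (id == 0 || id > TOY_HEAP_SIZE) return NULL;
    return toy_id_to_obj[id];
}
```

Programs that never call the reverse lookup never pay for the table; the cost is a full heap walk on the first reverse lookup.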

The `object_id` concern is also no longer a GC implementation
concern, but a generic implementation.

Co-Authored-By: Matt Valentine-House <matt@eightbitraptor.com>
2025-05-08 07:58:05 +02:00
Jean Boussier
0ea210d1ea Rename ivptr -> fields, next_iv_index -> next_field_index
Instance variables will no longer be the only thing stored inline
via shapes, so keeping the `iv_index` and `ivptr` names
would be confusing.

`field` encompasses anything that can be stored in a VALUE array.

Similarly, `gen_ivtbl` becomes `gen_fields_tbl`.
2025-05-08 07:58:05 +02:00
Nobuyoshi Nakada
bbf1130f91 Add RBIMPL_ATTR_NONSTRING_ARRAY() macro for GCC 15 2025-05-05 18:25:04 +09:00
Jeremy Evans
ce51ef30df Save one VALUE per embedded RTypedData
This halves the amount of memory used for embedded RTypedData if they
are one VALUE (8 bytes on 64-bit platforms) over the slot size limit.

For Set, on 64-bit it uses an embedded 56-byte struct.  With the
previous implementation, the embedded struct starts at offset 32,
resulting in a total size of 88.  Since that is over the 80 byte
limit, it goes to the next highest bucket, 160 bytes, wasting 72
bytes.  This allows it to fit in a 80 byte bucket, which reduces
the total size for small sets from 224 bytes (160 bytes
embedded, 64 bytes malloc, 72 bytes wasted in embedding) to 144
bytes (80 bytes embedded, 64 bytes malloc, 0 bytes wasted in
embedding).
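
The arithmetic above can be checked with a small sketch. The slot sizes below are an assumption (doubling from 40 bytes); only the 80 and 160 byte buckets come from this message, and `toy_slot_for` and `toy_wasted` are hypothetical helpers.

```c
#include <stddef.h>

/* Assumed slot sizes; only 80 and 160 are taken from the commit message. */
static const size_t toy_slot_sizes[] = { 40, 80, 160, 320, 640 };

/* Smallest assumed slot that can hold `size` bytes, or 0 if none fits. */
static size_t toy_slot_for(size_t size)
{
    size_t count = sizeof(toy_slot_sizes) / sizeof(toy_slot_sizes[0]);
    for (size_t i = 0; i < count; i++) {
        if (size <= toy_slot_sizes[i]) return toy_slot_sizes[i];
    }
    return 0;
}

/* Bytes left unused in the slot when a struct of `struct_size` bytes
 * is embedded starting at `offset`. */
static size_t toy_wasted(size_t offset, size_t struct_size)
{
    size_t needed = offset + struct_size;
    return toy_slot_for(needed) - needed;
}
```

With the 56-byte Set struct, offset 32 needs 88 bytes and lands in the 160 byte slot with 72 bytes unused, while offset 24 needs exactly 80 bytes and leaves nothing unused.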

Any other embedded RTypedData will see similar advantages if they
are currently one VALUE over the limit.

To implement this, remove the typed_flag from struct RTypedData.
Embed the typed_flag information in the type member, which is
now a tagged pointer using VALUE type, using the low 2 bits
as flags (1 bit for typed flag, the other for the embedded flag).
To get the actual pointer, RTYPEDDATA_TYPE masks out
the low 2 bits and then casts.  That moves the RTypedData data
pointer from offset 32 to offset 24 (on 64-bit).
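
A minimal sketch of the tagged-pointer idea, assuming the pointed-to type is aligned to at least 4 bytes so the low 2 bits are free. The names and the assignment of bits to flags are hypothetical, not the actual definitions in the headers.

```c
#include <stdint.h>

/* Hypothetical names; which bit means what is an assumption of this sketch. */
typedef struct toy_data_type { const char *name; } toy_data_type;

#define TOY_TYPED_FLAG    ((uintptr_t)1)  /* assumed: bit 0 marks "typed" */
#define TOY_EMBEDDED_FLAG ((uintptr_t)2)  /* assumed: bit 1 marks "embedded" */
#define TOY_FLAG_MASK     ((uintptr_t)3)

/* Store the flags in the low 2 bits of the type pointer. */
static uintptr_t toy_tag(const toy_data_type *type, uintptr_t flags)
{
    return (uintptr_t)type | (flags & TOY_FLAG_MASK);
}

/* Mask out the low 2 bits, then cast back to the real pointer. */
static const toy_data_type *toy_untag(uintptr_t tagged)
{
    return (const toy_data_type *)(tagged & ~TOY_FLAG_MASK);
}

static int toy_embedded_p(uintptr_t tagged)
{
    return (tagged & TOY_EMBEDDED_FLAG) != 0;
}
```

Because the flags live inside the pointer word, no separate flag member is needed, which is what frees one VALUE in the struct.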

A vast amount of code in the internals (and probably external C
extensions) expects the following code to work for both RData and
non-embedded RTypedData:

```c
DATA_PTR(obj) = some_pointer;
```

Allow this to work by moving the data pointer in RData between
the dmark and dfree pointers, so it is at the same offset (24
on 64-bit).
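
The matching offsets can be illustrated with stand-in structs. These are simplified, hypothetical layouts (a two-word header followed by the members named in this message), not the real declarations.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the real structs; member names are illustrative. */
struct toy_basic { uintptr_t flags; uintptr_t klass; };

struct toy_rdata {
    struct toy_basic basic;
    void (*dmark)(void *);
    void *data;              /* placed between dmark and dfree */
    void (*dfree)(void *);
};

struct toy_rtypeddata {
    struct toy_basic basic;
    uintptr_t type;          /* tagged pointer, flags in the low 2 bits */
    void *data;
};

/* Returns 1 when both layouts keep `data` at the same offset. */
static int toy_data_offsets_match(void)
{
    return offsetof(struct toy_rdata, data) ==
           offsetof(struct toy_rtypeddata, data);
}
```

On a typical 64-bit platform both offsets come out as 24, which is why a single `DATA_PTR`-style accessor can serve both layouts.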

Other than these changes to the include files, the only changes
needed were to gc.c, to account for the new struct layouts,
handle setting the low bits in the type member, and to use
RTYPEDDATA_TYPE(obj) instead of RTYPEDDATA(obj)->type.
2025-05-05 09:46:32 +09:00
Jean Boussier
c65991978b get_next_shape_internal: Skip VM lock for single child case
If the shape has only one child, we check it lock-free without
compromising thread safety.
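
A rough sketch of the fast path, under heavy assumptions: the real shape tree, its child storage and the VM lock are all different. Here `toy_shape`, `only_child` and `toy_find_child` are hypothetical, and the "VM lock" is only a counter so the sketch stays self-contained.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Toy model: all names are hypothetical and the "VM lock" is only a counter. */
typedef struct toy_shape {
    int edge_name;                          /* transition that leads to this shape */
    _Atomic(struct toy_shape *) only_child; /* non-NULL only with exactly one child */
    struct toy_shape *children[4];          /* slow-path table, guarded by the lock */
    size_t child_count;
} toy_shape;

static int toy_lock_acquisitions = 0;
static void toy_vm_lock(void)   { toy_lock_acquisitions++; }
static void toy_vm_unlock(void) { }

static toy_shape *toy_find_child(toy_shape *parent, int edge_name)
{
    /* Fast path: a single atomic load, no lock taken. */
    toy_shape *child = atomic_load_explicit(&parent->only_child,
                                            memory_order_acquire);
    if (child != NULL && child->edge_name == edge_name) return child;

    /* Slow path: search the full child table under the lock. */
    toy_shape *found = NULL;
    toy_vm_lock();
    for (size_t i = 0; i < parent->child_count; i++) {
        if (parent->children[i]->edge_name == edge_name) {
            found = parent->children[i];
            break;
        }
    }
    toy_vm_unlock();
    return found;
}
```

A hit on the single child returns without touching the lock; any miss falls back to the locked search.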

I haven't computed hard data as to how often that is the case,
but we can assume that it's not too rare for shapes to have
a single child that is often requested, typically when freezing
an object.
2025-04-30 23:32:33 +02:00
Nobuyoshi Nakada
b42afa1dbc
Suppress gcc 15 unterminated-string-initialization warnings 2025-04-30 20:04:10 +09:00
Alan Wu
719486a642 Fix C23 (GCC 15) WIN32 compatibility for rb_define_* functions
Fixes [Bug #21286]
2025-04-30 19:44:59 +09:00
Matt Valentine-House
5e8b744dbc RUBY_T_{TRUE,FALSE} comments were reversed
[ci skip]
2025-04-30 08:08:54 +02:00
John Hawthorn
b28363a838 Work on ATOMIC_VALUE_SET 2025-04-18 13:03:54 +09:00
Samuel Williams
8d21f666b8
Introduce enum rb_io_mode. (#7894) 2025-04-16 07:50:37 +00:00
Samuel Williams
4e970c5d5a Expose ruby_thread_has_gvl_p. 2025-04-14 18:28:09 +09:00
Richard Böhme
3aee7b982b Mark first argument to all C-API tracepoint functions as nonnull 2025-03-28 23:08:28 +09:00
Richard Böhme
04ebedf7f0 Make rb_tracearg_(parameters|eval_script|instruction_sequence) public C-API
This allows C-Extension developers to call those methods to retrieve
information about a TracePoint's parameters, eval script and
instruction sequence.

Implements [Feature #20757]
2025-03-28 23:08:28 +09:00
Nobuyoshi Nakada
bb7f1619d2
Suppress sign-conversion warning [ci skip] 2025-03-18 16:22:49 +09:00
Nobuyoshi Nakada
47d75b65bf Make wrapper of main for wasm more generic 2025-03-16 17:33:58 +09:00
Nobuyoshi Nakada
453f88f7f1 Make ASAN default option string built-in libruby
The content depends on Ruby internals and is not the responsibility
of the caller.  Revive the `RUBY_GLOBAL_SETUP` macro to define the hook function.
2025-03-16 17:33:58 +09:00
Nobuyoshi Nakada
42c0722f83
[DOC] Fix the comment for RUBY_CONST_ID and rb_intern
RUBY_CONST_ID has never been deprecated; `rb_intern` is handy but it
is using non-standard GCC extensions and does not cache the ID with
other compilers.
2025-02-28 12:55:46 +09:00
Peter Zhu
16f41eca53 Remove dead iv_index_tbl field in RObject 2025-02-12 14:03:07 -05:00
Nobuyoshi Nakada
c961d093b1 [Bug #21024] <cstdbool> header has been useless
And finally deprecated in C++17.
Patched by jprokop (Jarek Prokop).
2025-01-14 21:56:14 +09:00
Nobuyoshi Nakada
8891890bff
Mark rb_path_check as internal only 2025-01-14 11:26:29 +09:00
Nobuyoshi Nakada
f7fd42ce74
Move the declaration of rb_path_check
Although this function is unrelated to hash, it was defined in hash.c
to check PATH environment variable originally.  Then the definition
was moved to file.c but the declaration was left in the hash.c block.
2025-01-13 19:10:26 +09:00
Nobuyoshi Nakada
d9d08484d2
[DOC] Fix the description of rb_path_check
c.f. #20971
2025-01-12 13:53:15 +09:00
Nobuyoshi Nakada
1b3037081e
[Bug #21024] <cstdbool> header is deprecated in C++17 2025-01-11 12:21:57 +09:00
Peter Zhu
99ff0224a5 Move rbimpl_size_add_overflow from gc.c to memory.h 2025-01-02 11:03:04 -05:00
Yukihiro "Matz" Matsumoto
2f064b3b4b
Development of 3.5.0 started. 2024-12-25 18:15:17 +09:00
Naohisa Goto
528ec70604 use RBIMPL_ATTR_MAYBE_UNUSED
The macro MAYBE_UNUSED, prepared by ./configure, may not be defined in
some environments such as Oracle Developer Studio 12.5 on Solaris 10.

This fixes [Bug #20963]
2024-12-18 23:37:22 +09:00
Alan Wu
6336431a64 [DOC] rb_id2name(): Note truncation danger (+minor copyediting)
Thanks, nobu!
2024-12-17 21:50:00 -05:00
Peter Zhu
375fec7c53 [DOC] Add note to rb_id2name about GC compaction 2024-12-17 16:32:13 -05:00
Nobuyoshi Nakada
5a7a1a4a13
Win32: Fix rbimpl_size_mul_overflow on arm64
`_umul128` is specific to the x86_64 platform; get the high word with
`__umulh` on arm64.
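
For illustration, here is a portable sketch of the "inspect the high word" overflow check, written without either MSVC intrinsic. `toy_mul_high` and `toy_mul_overflow_p` are hypothetical names, and the sketch assumes a 64-bit `size_t`-like operand.

```c
#include <stdint.h>

/* Portable sketch of "multiply and inspect the high word"; names are hypothetical. */
static uint64_t toy_mul_high(uint64_t a, uint64_t b)
{
    uint64_t a_lo = (uint32_t)a, a_hi = a >> 32;
    uint64_t b_lo = (uint32_t)b, b_hi = b >> 32;

    uint64_t lo_lo = a_lo * b_lo;
    uint64_t hi_lo = a_hi * b_lo;
    uint64_t lo_hi = a_lo * b_hi;
    uint64_t hi_hi = a_hi * b_hi;

    /* Sum of everything that lands in bits 32..63; its top half is the carry. */
    uint64_t cross = (lo_lo >> 32) + (uint32_t)hi_lo + (uint32_t)lo_hi;
    return hi_hi + (hi_lo >> 32) + (lo_hi >> 32) + (cross >> 32);
}

/* The 64-bit product overflowed exactly when the high word is non-zero. */
static int toy_mul_overflow_p(uint64_t a, uint64_t b)
{
    return toy_mul_high(a, b) != 0;
}
```

An intrinsic that returns only the upper 64 bits of the product is sufficient for this check, since the low word can be obtained with an ordinary multiplication.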
2024-12-17 20:25:06 +09:00
Nobuyoshi Nakada
86f00c9922
[DOC] Update rb_strlen_lit
It is not "in bytes" for wide char literal.
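
A sketch of why the result is a character count rather than a byte count for wide literals. `TOY_STRLEN_LIT` is a hypothetical macro written for this example; it is not claimed to match the definition of `rb_strlen_lit`.

```c
#include <stddef.h>

/* Hypothetical macro: element count of a string literal, excluding the
 * terminator. Concatenating "" makes the compiler reject non-literals. */
#define TOY_STRLEN_LIT(lit) (sizeof(lit "") / sizeof((lit "")[0]) - 1)
```

Dividing by the element size makes `TOY_STRLEN_LIT(L"abc")` evaluate to 3, whereas `sizeof(L"abc") - 1` would give a byte-based value that depends on the size of `wchar_t`.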
2024-12-13 13:49:33 +09:00
Alan Wu
c0e12bf8d2 Fix typos in public headers [ci skip] 2024-12-04 16:26:31 -05:00
Alan Wu
88764dde78 [DOC] Rewrite docs for rb_sym2str()
Explaining this by reference to rb_id2str() obscures a few important
details because IDs and symbols don't map to each other perfectly (you
can have a dynamic symbol without an ID!) Also, it used to take 2
redirections to get to concrete information, and I think being more
direct is friendlier.
2024-11-29 18:33:44 -05:00
Alan Wu
2a0006c101 [DOC] Mention that rb_id2str() returns a frozen string 2024-11-29 18:33:44 -05:00
Samuel Williams
9c268302bf
Introduce Fiber::Scheduler#blocking_operation_wait. (#12016)
Redirect `rb_nogvl` blocking operations to the fiber scheduler if possible
to prevent stalling the event loop.

[Feature #20876]
2024-11-20 19:40:17 +13:00
Jean byroot Boussier
6deeec5d45
Mark strings returned by Symbol#to_s as chilled (#12065)
* Use FL_USER0 for ELTS_SHARED

This makes space in RString for two bits for chilled strings.

* Mark strings returned by `Symbol#to_s` as chilled

[Feature #20350]

`STR_CHILLED` now spans two user flags. If one bit is set it
marks a chilled string literal, if it's the other it marks a
`Symbol#to_s` chilled string.
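
The two-bit scheme can be sketched as follows. The bit positions and the `TOY_` names are hypothetical; the real flags are defined in Ruby's internals.

```c
#include <stdint.h>

/* Hypothetical bit positions, for illustration only. */
#define TOY_CHILLED_LITERAL ((uint32_t)1 << 0)  /* chilled string literal */
#define TOY_CHILLED_SYMBOL  ((uint32_t)1 << 1)  /* chilled Symbol#to_s result */
#define TOY_CHILLED         (TOY_CHILLED_LITERAL | TOY_CHILLED_SYMBOL)

/* A string is chilled if either bit of the two-bit mask is set. */
static int toy_chilled_p(uint32_t flags)
{
    return (flags & TOY_CHILLED) != 0;
}

/* The specific bit tells which kind of chilled string it is. */
static int toy_chilled_symbol_p(uint32_t flags)
{
    return (flags & TOY_CHILLED_SYMBOL) != 0;
}
```

One mask answers "is it chilled at all?", while the individual bit selects which warning message applies.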

Since it's not possible, and doesn't make much sense to include
debug info when `--debug-frozen-string-literal` is set, we can't
include allocation source, but we can safely include the symbol
name in the warning message, making it much easier to find the source
of the issue.

Co-Authored-By: Étienne Barrié <etienne.barrie@gmail.com>

---------

Co-authored-by: Étienne Barrié <etienne.barrie@gmail.com>
Co-authored-by: Jean Boussier <jean.boussier@gmail.com>
2024-11-13 09:20:00 -05:00
Nobuyoshi Nakada
f17cfb4958 Add missing macros for __has_builtin 2024-11-12 16:40:52 +09:00
Nobuyoshi Nakada
bce1bd1dc1
rb_strlen_lit: support wide string literals 2024-11-10 22:52:17 +09:00
Nobuyoshi Nakada
2f88a9258d
Fix sign-conversion warnings on IL32 platforms
If `long` and `int` are the same size, `unsigned int` max would exceed
`signed long` range.  It is guaranteed by `RB_POSFIXABLE` that `v` can
be cast to `long` safely here.
2024-11-10 21:57:56 +09:00
Samuel Williams
3b9896acfc
Revert "Introduce Fiber Scheduler blocking_region hook. (#11963)" (#12013)
This reverts some of commit 87fb44dff6409a19d12052cf0fc07ba80a4c45ac.

We will rename and propose a slightly different interface.
2024-11-06 22:19:40 +13:00
Koichi Sasada
ab7ab9e450 Warning[:strict_unused_block]
to show unused block warning strictly.

```ruby
class C
  def f = nil
end

class D
  def f = yield
end

[C.new, D.new].each{|obj| obj.f{}}
```

In this case, `D#f` accepts a block, but `C#f` does not.
Some call sites pass a block with `obj.f{}` where `obj` may be
either a `C` or a `D`. To avoid warning in such cases, the
"unused block" warning is emitted only if no method with the
same name accepts a block.
In the example above, `C.new.f{}` doesn't show any warning
because there is a method with the same name, `D#f`, which accepts a block.

We call this default behavior "relax mode".

The new `strict_unused_block` warning category switches from
"relax mode" to "strict mode": same-name methods are not checked,
and `C.new.f{}` will produce a warning.

[Feature #15554]
2024-11-06 11:06:18 +09:00
Nobuyoshi Nakada
e2909570bb
Include windows.h for LONG and Interlocked functions 2024-11-02 22:27:03 +09:00
Samuel Williams
87fb44dff6
Introduce Fiber Scheduler blocking_region hook. (#11963) 2024-10-31 17:26:37 +13:00
Nobuyoshi Nakada
7d1011d3fa Fix false warning by gcc 14 for aarch64
gcc 14 for aarch64 with `-O3` may emit a false positive warning for a
pointer access of `RB_BUILTIN_TYPE` called from `RB_TYPE_P`.  `Qfalse`
shouldn't get there because of `RB_SPECIAL_CONST_P`, but the optimizer
seems to ignore this condition in some cases (`ASSUME` just before the
access doesn't seem to have any effect either).  Only by reversing the
order in `RB_SPECIAL_CONST_P` to compare with 0 first does the warning
seem to go away.
2024-10-23 23:02:15 +09:00
Nobuyoshi Nakada
9a90cd2284 Cast between function pointers and object pointers via uintptr_t
- ISO C forbids conversion of function pointer to object pointer type
- ISO C forbids conversion of object pointer to function pointer type
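
A minimal sketch of the cast pattern, assuming (as on mainstream platforms, though ISO C does not guarantee it) that both pointer kinds fit in `uintptr_t`. The `toy_` names are hypothetical.

```c
#include <stdint.h>

typedef int (*toy_func)(int);

static int toy_twice(int x) { return 2 * x; }

/* Function pointer -> integer -> object pointer, avoiding a direct cast. */
static void *toy_func_to_ptr(toy_func f)
{
    return (void *)(uintptr_t)f;
}

/* Object pointer -> integer -> function pointer. */
static toy_func toy_ptr_to_func(void *p)
{
    return (toy_func)(uintptr_t)p;
}
```

Going through the integer type avoids the direct function-to-object pointer conversion that pedantic compilers diagnose.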
2024-10-08 23:29:49 +09:00
Samuel Williams
c878843b2c
Better handling of timeout in rb_io_maybe_wait_*. (#9531) 2024-10-04 19:36:06 +13:00
Samuel Williams
96d69d2df2
Clarify rb_io_maybe_wait behaviour. (#9527) 2024-10-04 18:40:38 +13:00
NAITOH Jun
373f679e48 fix rb_memsearch() document
## Why?
The explanation of x and y is reversed.

ddbd644001/re.c (L251-L256)
```
long
rb_memsearch(const void *x0, long m, const void *y0, long n, rb_encoding *enc)
{
    const unsigned char *x = x0, *y = y0;

    if (m > n) return -1;
```
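
The `m > n` check in the excerpt implies that `x` (length `m`) is the pattern and `y` (length `n`) is the text being searched. A naive stand-in with the same argument order, minus the encoding parameter, could look like this; `toy_memsearch` is hypothetical and is not the algorithm `rb_memsearch` uses.

```c
#include <stddef.h>
#include <string.h>

/* Naive stand-in: find the m-byte pattern x inside the n-byte text y.
 * Returns the byte offset of the first match, or -1 if there is none. */
static long toy_memsearch(const void *x0, long m, const void *y0, long n)
{
    const unsigned char *x = x0, *y = y0;

    if (m > n) return -1;   /* the pattern cannot be longer than the text */
    for (long i = 0; i + m <= n; i++) {
        if (memcmp(x, y + i, (size_t)m) == 0) return i;
    }
    return -1;
}
```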
2024-09-24 15:12:48 +09:00
Alan Wu
3fa5b4be19 [DOC] Mention rb_io_fdopen() takes ownership of the FD 2024-08-28 17:27:57 -04:00
Nobuyoshi Nakada
1fd0a1b4ce
Fix sign-conversion warning
```
../../.././include/ruby/internal/special_consts.h:349:36: error: conversion to ‘VALUE’ {aka ‘long unsigned int’} from ‘int’ may change the sign of the result [-Werror=sign-conversion]
  349 |     return RB_SPECIAL_CONST_P(obj) * RUBY_Qtrue;
      |            ~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~
```
2024-08-11 16:04:37 +09:00
Peter Zhu
10574857ce Fix memory leak in Regexp capture group when timeout
[Bug #20650]

The capture group allocates memory that is leaked when it times out.

For example:

    re = Regexp.new("^#{"(a*)" * 10_000}x$", timeout: 0.000001)
    str = "a" * 1000000 + "x"

    10.times do
      100.times do
        re =~ str
      rescue Regexp::TimeoutError
      end

      puts `ps -o rss= -p #{$$}`
    end

Before:

    34688
    56416
    78288
    100368
    120784
    140704
    161904
    183568
    204320
    224800

After:

    16288
    16288
    16880
    16896
    16912
    16928
    16944
    17184
    17184
    17200
2024-07-25 09:23:49 -04:00