323 Commits

Author SHA1 Message Date
Peter Zhu
d729c1575e Output object_id in ObjectSpace.dump
Outputs the object ID in the dump for objects that have it seen.
2025-01-30 11:48:14 -05:00
Koichi Sasada
c695536cc8 use st_update to prevent table extension
to prevent the following scenario:

1. `delete_unique_str()` can be called while GC (sweeping)
2. it calls `st_insert()` to decrement the counter
3. `st_insert()` can try to extend the table even if the key exists
4. `xmalloc` while GC and cause BUG
2024-12-23 11:05:34 +09:00
Peter Zhu
a58675386c Prefix asan_poison_object with rb 2024-12-19 09:14:34 -05:00
Peter Zhu
516a6cd1ad Check whether object is valid in allocation_info_tracer_compact
When reference updating ObjectSpace.trace_object_allocations, we need to
check whether the object is valid or not because it does not mark the
object so the object may be dead. This can cause a segmentation fault
if the object is on a free heap page.

For example, the following script crashes:

    require "objspace"

    objs = []
    ObjectSpace.trace_object_allocations do
      1_000_000.times do
        objs << Object.new
      end
    end

    objs = nil

    # Free pages that the objs were on
    GC.start

    # Run compaction and check that it doesn't crash
    GC.compact
2024-12-16 12:24:24 -05:00
Peter Zhu
15765eac0a Fix ObjectSpace.trace_object_allocations for compaction
We need to reinsert into the ST table when an object moves because it is
a numtable that hashes on the object address, so when an object moves we
need to reinsert it rather than just updating the key.
2024-12-16 10:12:54 -05:00
Peter Zhu
b038530506 Fix compaction check for ObjectSpace.trace_object_allocations
We should be checking for key for moved objects rather than the value
because the key is a Ruby object and the value is malloc'd memory.
2024-12-16 10:12:54 -05:00
Alan Wu
476d655053 objspace_dump: Use FILE* to avoid crashing in mark functions
We observed crashes from rb_io_bufwrite() thread switching (through
rb_thread_check_ints()) in the middle of rb_execution_context_mark(). By
the time rb_execution_context_mark() gets a timeslice again, it read
garbage from a frame that was already popped in another thread, crashing
the process in SEGV. Other mark functions probably have their own ways
of breaking, but clearly, the usual IO code do too much for this
perilous pseudo GC context.

Use `FILE*` like before 5001cc47169614ea07d87651c95c2ee185e374e0
("Optimize ObjectSpace.dump_all"). Also, add type checking for
the private _dump methods.

Co-authored-by: Peter Zhu <peter@peterzhu.ca>
2024-12-09 16:08:35 -05:00
Jean Boussier
ee1cd1656f ObjectSpace.dump: handle Module#set_temporary_name
[Bug #20892]

Until the introduction of that method, it was impossible for a
Module name not to be valid JSON, hence it wasn't going through
the slower escaping function.

This assumption no longer hold.
2024-11-12 20:21:27 +01:00
Peter Zhu
51bd816517 [Feature #20470] Split GC into gc_impl.c
This commit splits gc.c into two files:

- gc.c now only contains code not specific to Ruby GC. This includes
  code to mark objects (which the GC implementation may choose not to
  use) and wrappers for internal APIs that the implementation may need
  to use (e.g. locking the VM).

- gc_impl.c now contains the implementation of Ruby's GC. This includes
  marking, sweeping, compaction, and statistics. Most importantly,
  gc_impl.c only uses public APIs in Ruby and a limited set of functions
  exposed in gc.c. This allows us to build gc_impl.c independently of
  Ruby and plug Ruby's GC into itself.
2024-07-03 09:03:40 -04:00
卜部昌平
c844968b72 ruby tool/update-deps --fix 2024-04-27 21:55:28 +09:00
Étienne Barrié
12be40ae6b Implement chilled strings
[Feature #20205]

As a path toward enabling frozen string literals by default in the future,
this commit introduce "chilled strings". From a user perspective chilled
strings pretend to be frozen, but on the first attempt to mutate them,
they lose their frozen status and emit a warning rather than to raise a
`FrozenError`.

Implementation wise, `rb_compile_option_struct.frozen_string_literal` is
no longer a boolean but a tri-state of `enabled/disabled/unset`.

When code is compiled with frozen string literals neither explictly enabled
or disabled, string literals are compiled with a new `putchilledstring`
instruction. This instruction is identical to `putstring` except it marks
the String with the `STR_CHILLED (FL_USER3)` and `FL_FREEZE` flags.

Chilled strings have the `FL_FREEZE` flag as to minimize the need to check
for chilled strings across the codebase, and to improve compatibility with
C extensions.

Notes:
  - `String#freeze`: clears the chilled flag.
  - `String#-@`: acts as if the string was mutable.
  - `String#+@`: acts as if the string was mutable.
  - `String#clone`: copies the chilled flag.

Co-authored-by: Jean Boussier <byroot@ruby-lang.org>
2024-03-19 09:26:49 +01:00
Jean Boussier
b4a69351ec Move FL_SINGLETON to FL_USER1
This frees FL_USER0 on both T_MODULE and T_CLASS.

Note: prior to this, FL_SINGLETON was never set on T_MODULE,
so checking for `FL_SINGLETON` without first checking that
`FL_TYPE` was `T_CLASS` was valid. That's no longer the case.
2024-03-06 13:11:41 -05:00
Lazarus Lazaridis
61ea202f8b
[DOC] Fix invalid documentation for reachable_objects_from (#10172)
Previous documentation is stating the opposite (that the method won't
work for CRuby).
2024-03-05 01:35:51 +09:00
Peter Zhu
386a006630 Use rb_hash_foreach in objspace.c
Using RHASH_TBL_RAW is a private API, so we should use rb_hash_foreach
rather than RHASH_TBL_RAW with st_foreach.
2024-02-23 10:24:21 -05:00
KJ Tsanaktsidis
61da90c1b8 Mark asan fake stacks during machine stack marking
ASAN leaves a pointer to the fake frame on the stack; we can use the
__asan_addr_is_in_fake_stack API to work out the extent of the fake
stack and thus mark any VALUEs contained therein.

[Bug #20001]
2024-01-19 09:55:12 +11:00
KJ Tsanaktsidis
688a6ff510 Revert "Mark asan fake stacks during machine stack marking"
This reverts commit d10bc3a2b8300cffc383e10c3730871e851be24c.
2024-01-12 17:58:54 +11:00
KJ Tsanaktsidis
d10bc3a2b8 Mark asan fake stacks during machine stack marking
ASAN leaves a pointer to the fake frame on the stack; we can use the
__asan_addr_is_in_fake_stack API to work out the extent of the fake
stack and thus mark any VALUEs contained therein.

[Bug #20001]
2024-01-12 17:29:48 +11:00
Jean Boussier
6391ae9ebc objspace_dump.c: dump call cache ids with dump_append_id
Not all `ID` have an associated string.

Fixes a SEGFAULT in ObjectSpace.dump_all spec.
2023-11-22 10:24:35 +01:00
yui-knk
c3ab946e86 ObjectSpace.count_nodes doesn't count nodes
Node has not been managed by GC from Ruby 2.5.
Therefore these codes are not needed. If ObjectSpace depends on Node,
it needs to update the file when node type is updated. Delete node
related codes to avoid such update.
2023-11-21 14:39:06 +09:00
Aaron Patterson
6fce8c7980 Don't try compacting ivars on Classes that are "too complex"
Too complex classes use a hash table to store ivs, and should always pin
their IVs.  We shouldn't touch those classes in compaction.
2023-11-20 16:09:48 -08:00
Peter Zhu
68869e9bd9 Revert "Revert "Remove SHAPE_CAPACITY_CHANGE shapes""
This reverts commit 5f3fb4f4e397735783743fe52a7899b614bece20.
2023-11-13 18:26:36 -05:00
John Hawthorn
b41270842a Record more info from CALLCACHE in heap dumps
This records the called_id and klass from imemo_callcache objects in
heap dumps.
2023-11-13 15:03:11 -08:00
Peter Zhu
5f3fb4f4e3 Revert "Remove SHAPE_CAPACITY_CHANGE shapes"
This reverts commit f6910a61122931e4193bcc0fad18d839c319b720.

We're seeing crashes in the test suite of Shopify's core monolith after
this change.
2023-11-10 11:27:49 -05:00
Peter Zhu
f6910a6112 Remove SHAPE_CAPACITY_CHANGE shapes
We don't need to create a shape to transition capacity as we can
transition the capacity when the capacity of the SHAPE_IVAR changes.
2023-11-09 09:25:02 -05:00
Peter Zhu
38ba040d8b Make every initial size pool shape a root shape
This commit makes every initial size pool shape a root shape and assigns
it a capacity of 0.
2023-11-02 13:42:11 -04:00
John Hawthorn
1c871c08d9 Switch mid dump to dump_append_string_value
I don't think it's possible to create a CI with a mid which would need
escaping to be in a JSON string, but I think we might as well not rely
on that assumption.
2023-10-12 10:22:32 +02:00
John Hawthorn
635b92099e Fix ObjectSpace.dump with super() callinfo
super() uses 0 as mid for its callinfo, so we need to check for that to
avoid a segfault when using dump_all.
2023-10-12 10:22:32 +02:00
Nobuyoshi Nakada
a5cc6341c0
Remove NODE_VALUES
This node type was added for the multi-value experiment back in 2004.
The feature itself was removed after a few years, but this is its
remnant.
2023-10-06 03:39:58 +09:00
Nobuyoshi Nakada
70e1635950 Move internal NODE_DEF_TEMP to parse.y 2023-10-05 14:23:42 +09:00
Peter Zhu
63e504d6e6 Dump name of method for imemo callinfo
This commit dumps the `mid` of the imemo callinfo when calling
`ObjectSpace.dump_all`.
2023-10-02 09:49:13 -04:00
yui-knk
68ae87546e Merge NODE_DEF_TEMP and NODE_DEF_TEMP2 2023-09-29 19:36:34 +09:00
yui-knk
74c6781153 Change RNode structure from union to struct
All kind of AST nodes use same struct RNode, which has u1, u2, u3 union members
for holding different kind of data.
This has two problems.

1. Low flexibility of data structure

Some nodes, for example NODE_TRUE, don’t use u1, u2, u3. On the other hand,
NODE_OP_ASGN2 needs more than three union members. However they use same
structure definition, need to allocate three union members for NODE_TRUE and
need to separate NODE_OP_ASGN2 into another node.
This change removes the restriction so make it possible to
change data structure by each node type.

2. No compile time check for union member access

It’s developer’s responsibility for using correct member for each node type when it’s union.
This change clarifies which node has which type of fields and enables compile time check.

This commit also changes node_buffer_elem_struct buf management to handle
different size data with alignment.
2023-09-28 11:58:10 +09:00
yui-knk
b481b673d7 [Feature #19719] Universal Parser
Introduce Universal Parser mode for the parser.
This commit includes these changes:

* Introduce `UNIVERSAL_PARSER` macro. All of CRuby related functions
  are passed via `struct rb_parser_config_struct` when this macro is enabled.
* Add CI task with 'cppflags=-DUNIVERSAL_PARSER' for ubuntu.
2023-06-12 18:23:48 +09:00
Samuel Williams
3fe09eba9d
Add deprecations for public struct rb_io members. (#7916)
* Add deprecations for public struct rb_io members.
2023-06-08 20:22:43 +09:00
eileencodes
40f090f433 Revert "Revert "Fix cvar caching when class is cloned""
This reverts commit 10621f7cb9a0c70e568f89cce47a02e878af6778.

This was reverted because the gc integrity build started failing. We
have figured out a fix so I'm reopening the PR.

Original commit message:

Fix cvar caching when class is cloned

The class variable cache that was added in
ruby#4544 changed the behavior of class
variables on cloned classes. As reported when a class is cloned AND a
class variable was set, and the class variable was read from the
original class, reading a class variable from the cloned class would
return the value from the original class.

This was happening because the IC (inline cache) is stored on the ISEQ
which is shared between the original and cloned class, therefore they
share the cache too.

To fix this we are now storing the `cref` in the cache so that we can
check if it's equal to the current `cref`. If it's different we don't
want to read from the cache. If it's the same we do. Cloned classes
don't share the same cref with their original class.

This will need to be backported to 3.1 in addition to 3.2 since the bug
exists in both versions.

We also added a marking function which was missing.

Fixes [Bug #19379]

Co-authored-by: Aaron Patterson <tenderlove@ruby-lang.org>
2023-06-05 11:11:12 -07:00
Aaron Patterson
10621f7cb9
Revert "Fix cvar caching when class is cloned"
This reverts commit 77d1b082470790c17c24a2f406b4fec5d522636b.
2023-06-01 14:55:36 -07:00
eileencodes
77d1b08247 Fix cvar caching when class is cloned
The class variable cache that was added in
https://github.com/ruby/ruby/pull/4544 changed the behavior of class
variables on cloned classes. As reported when a class is cloned AND a
class variable was set, and the class variable was read from the
original class, reading a class variable from the cloned class would
return the value from the original class.

This was happening because the IC (inline cache) is stored on the ISEQ
which is shared between the original and cloned class, therefore they
share the cache too.

To fix this we are now storing the `cref` in the cache so that we can
check if it's equal to the current `cref`. If it's different we don't
want to read from the cache. If it's the same we do. Cloned classes
don't share the same cref with their original class.

This will need to be backported to 3.1 in addition to 3.2 since the bug
exists in both versions.

We also added a marking function which was missing.

Fixes [Bug #19379]

Co-authored-by: Aaron Patterson <tenderlove@ruby-lang.org>
2023-06-01 08:52:48 -07:00
NARUSE, Yui
85dcc4866d Revert "Hide most of the implementation of struct rb_io. (#6511)"
This reverts commit 18e55fc1e1ec20e8f3166e3059e76c885fc9f8f2.

fix [Bug #19704]
https://bugs.ruby-lang.org/issues/19704
This breaks compatibility for extension libraries. Such changes
need a discussion.
2023-06-01 08:43:22 +09:00
Samuel Williams
18e55fc1e1
Hide most of the implementation of struct rb_io. (#6511)
* Add rb_io_path and rb_io_open_descriptor.

* Use rb_io_open_descriptor to create PTY objects

* Rename FMODE_PREP -> FMODE_EXTERNAL and expose it

FMODE_PREP I believe refers to the concept of a "pre-prepared" file, but
FMODE_EXTERNAL is clearer about what the file descriptor represents and
aligns with language in the IO::Buffer module.

* Ensure that rb_io_open_descriptor closes the FD if it fails

If FMODE_EXTERNAL is not set, then it's guaranteed that Ruby will be
responsible for closing your file, eventually, if you pass it to
rb_io_open_descriptor, even if it raises an exception.

* Rename IS_EXTERNAL_FD -> RUBY_IO_EXTERNAL_P

* Expose `rb_io_closed_p`.

* Add `rb_io_mode` to get IO mode.

---------

Co-authored-by: KJ Tsanaktsidis <ktsanaktsidis@zendesk.com>
2023-05-30 10:02:40 +09:00
Matt Valentine-House
2a34bcaa10 Update VPATH for socket, & dependencies
The socket extensions rubysocket.h pulls in the "private" include/gc.h,
which now depends on vm_core.h. vm_core.h pulls in id.h

when tool/update-deps generates the dependencies for the makefiles, it
generates the line for id.h to be based on VPATH, which is configured in
the extconf.rb for each of the extensions. By default VPATH does not
include the actual source directory of the current Ruby so the
dependency fails to resolve and linking fails.

We need to append the topdir and top_srcdir to VPATH to have the
dependancy picked up correctly (and I believe we need both of these to
cope with in-tree and out-of-tree builds).

I copied this from the approach taken in
https://github.com/ruby/ruby/blob/master/ext/objspace/extconf.rb#L3
2023-04-06 11:07:16 +01:00
Matt Valentine-House
5e4b80177e Update the depend files 2023-02-28 09:09:00 -08:00
Matt Valentine-House
f38c6552f9 Remove intern/gc.h from Make deps 2023-02-27 10:11:56 -08:00
zverok
e1b447a323 [DOC] Improve ObjectSpace#dump_XXX method docs
* remove false call-seq (output from Ruby parsing is cleaner)
* explain output: argument in plain words
* change parameter name in docs of #dump_shapes (typo)
2023-02-19 22:32:52 +02:00
Jean Boussier
7413079dae Encapsulate RCLASS_ATTACHED_OBJECT
Right now the attached object is stored as an instance variable
and all the call sites that either get or set it have to know how it's
stored.

It's preferable to hide this implementation detail behind accessors
so that it is easier to change how it's stored.
2023-02-15 15:24:22 +01:00
Matt Valentine-House
72aba64fff Merge gc.h and internal/gc.h
[Feature #19425]
2023-02-09 10:32:29 -05:00
Nobuyoshi Nakada
899ea35035
Extract include/ruby/internal/attr/packed_struct.h
Split `PACKED_STRUCT` and `PACKED_STRUCT_UNALIGNED` macros into the
macros bellow:
* `RBIMPL_ATTR_PACKED_STRUCT_BEGIN`
* `RBIMPL_ATTR_PACKED_STRUCT_END`
* `RBIMPL_ATTR_PACKED_STRUCT_UNALIGNED_BEGIN`
* `RBIMPL_ATTR_PACKED_STRUCT_UNALIGNED_END`
2023-02-08 12:34:13 +09:00
Peter Zhu
2056c0a7c6 Add embedded status to dumps of T_OBJECT
This commit adds `"embedded":true` in ObjectSpace.dump for T_OBJECTs
that are embedded.
2023-01-05 16:00:36 -05:00
Peter Zhu
b8a3f1bd45 Fix crash in tracing object allocations
ObjectSpace.trace_object_allocations_start could crash since it adds a
TracePoint for when objects are freed. However, TracePoint could crash
since it modifies st tables while inside the GC that is trying to free
the object. This could cause a memory allocation to happen which would
crash if it triggers another GC.

See a crash log: http://ci.rvm.jp/results/trunk@ruby-sp1/4373707
2023-01-04 09:10:58 -05:00
Nobuyoshi Nakada
f527a0911d
[DOC] [Bug #19290] fix formatting 2023-01-01 14:50:39 +09:00
Jemma Issroff
e9ba3042e1 Indicate if a shape is too_complex in ObjectSpace#dump 2022-12-15 13:41:47 -08:00