mirror of https://github.com/ruby/ruby.git synced 2026-01-27 04:24:23 +00:00

schneems 5487ee4fe8 [ruby/syntax_suggest] Fix sibling bug to #177

While #177 was reported as being caused by a comment, the underlying problem is the newline that we generated in place of that comment. The prior commit fixed that problem by preserving whitespace before the comment. That guarantees that a block will form there from the frontier before it can be expanded there via a "neighbors" method. Since empty lines are valid Ruby code, the block will be hidden and is safe.

## Problem setup

This failure mode is not fixed by the prior commit, because the indentation is 0. To provide good results, we must make the algorithm less greedy. One heuristic/signal to follow is developer-added newlines. If a developer puts a newline between two pieces of code, it's more likely they're unrelated. For example:

```
port = rand(1000...9999)
stub_request(:any, "localhost:#{port}")

query = Cutlass::FunctionQuery.new(
  port: port
).call

expect(WebMock).to have_requested(:post, "localhost:#{port}").
  with(body: "{}")
```

This code is split into three chunks by the developer. Each is likely (but not guaranteed) to be intended to stand on its own (in terms of syntax). Stopping at such a newline is good for scanning neighbors (same indent or higher) within a method, but bad for parsing neighbors across methods.
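As a rough illustration of the "developer-added newline" signal, the sketch below splits the example above into chunks at blank lines. The helper name `chunks_by_blank_line` is hypothetical and is not part of syntax_suggest; the real algorithm works on indentation-aware blocks rather than plain string splitting.

```ruby
# Hypothetical helper (not syntax_suggest's API): group consecutive
# non-blank lines into chunks, using blank lines as separators.
def chunks_by_blank_line(source)
  source.lines
    .slice_when { |before, after| before.strip.empty? != after.strip.empty? }
    .reject { |group| group.all? { |line| line.strip.empty? } }
    .map(&:join)
end

# Quoted heredoc so that "#{port}" is kept literally instead of interpolated.
source = <<~'RUBY'
  port = rand(1000...9999)
  stub_request(:any, "localhost:#{port}")

  query = Cutlass::FunctionQuery.new(
    port: port
  ).call

  expect(WebMock).to have_requested(:post, "localhost:#{port}").
    with(body: "{}")
RUBY

chunks = chunks_by_blank_line(source)
puts chunks.length # => 3
```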

## Problem

Code is expanded to capture all neighbors, and then it decreases indent level, which allows it to capture surrounding scope (think moving from within the method to also capturing the `def/end` definition). Once the indentation level has been decreased, we go back to scanning neighbors, but now neighbors also contain keywords.

For example:

```
  1 def bark
  2
  3 end
  4
  5 def sit
  6 end
```

In this case, if lines 4, 5, and 6 are in a block, then when it tries to expand neighbors it will expand up. If it stops after line 2 or 3 it may cause problems, since there's a valid keyword/end pair but the block will be checked without it.

TL;DR: It's good to stop scanning code after hitting a newline when you're in a method, but it causes a problem scanning code between methods when everything inside of one of the methods is an empty line.

In this case it grabs the end on line 3 and since the problem was an extra end, the program now compiles correctly. It incorrectly assumes that the block it captured was causing the problem.
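To make the false positive concrete, here is a minimal sketch. It assumes a hypothetical input (the commit does not show the failing file): the two methods above followed by one stray `end`. It uses `Ripper.sexp`, which returns `nil` on a syntax error, as the validity check.

```ruby
require "ripper"

# Hypothetical input: the two methods from the example, plus one stray `end`.
lines = [
  "def bark",
  "",
  "end",
  "",
  "def sit",
  "end",
  "end" # the real mistake: an extra `end`
]

# Ripper.sexp returns nil when the source contains a syntax error.
def valid?(lines)
  !Ripper.sexp(lines.join("\n")).nil?
end

# Hide document lines 3..6 (array indexes 2..5): the unbalanced block that
# holds the `end` of `bark` but not its `def`.
remaining = lines.reject.with_index { |_line, index| (2..5).cover?(index) }

puts valid?(lines)     # => false
puts valid?(remaining) # => true
```

With the block hidden, `def bark` pairs up with the stray `end`, so the document parses and the search would wrongly blame the hidden lines.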

## Extra bit of context

One other technical detail is that after we've decided to stop scanning code for a new neighbor block expansion, we look around the block and grab any empty lines. Basically, adding empty lines before or after a code block does not affect the parsing of that block.
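A minimal sketch of that step, assuming a block is represented as an inclusive range of array indexes; `capture_blank_neighbors` is a hypothetical name, not a syntax_suggest method:

```ruby
# Hypothetical helper: widen an inclusive index range so that it also covers
# any blank lines directly above or below it.
def capture_blank_neighbors(lines, first, last)
  first -= 1 while first > 0 && lines[first - 1].strip.empty?
  last += 1 while last < lines.length - 1 && lines[last + 1].strip.empty?
  [first, last]
end

lines = ["def bark", "", "end", "", "def sit", "end"]

# Start from `def sit` / `end` (indexes 4..5, document lines 5..6).
first, last = capture_blank_neighbors(lines, 4, 5)
p [first, last] # => [3, 5]
```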

## The fix

Since we know that this problem only happens when there's a newline inside of a method, and we know this particular failure mode is due to having an invalid block (capturing an extra end, but not its keyword), we have all the metadata we need to detect this scenario and correct it.

We know that the next line above our block must be code or empty (since we grabbed extra newlines). Same for code below it. We can count all the keywords and ends in the block. If they are balanced, it's likely (but not guaranteed) we formed the block correctly. If they're imbalanced, look above or below (depending on the nature of the imbalance) and check whether adding that line would balance the count.

This concept of balance and "leaning" comes from work in https://github.com/ruby/syntax_suggest/pull/152 and has proven useful, but has not been formally introduced into the main branch.
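The balance check can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the commit: `keyword?` and `end?` are hypothetical regex-based stand-ins for real lexing, and `balance_one_line` is a made-up name. It only ever widens the block by a single line.

```ruby
# Naive, regex-based stand-ins for real lexing (an assumption of this sketch).
KEYWORD_LINE = /\A\s*(def|class|module|if|unless|while|until|case|begin)\b|\bdo(\s*\|[^|]*\|)?\s*\z/
END_LINE = /\A\s*end\b/

def keyword?(line)
  KEYWORD_LINE.match?(line)
end

def end?(line)
  END_LINE.match?(line)
end

# Returns the (possibly widened) inclusive index range for the block.
def balance_one_line(lines, first, last)
  block = lines[first..last]
  lean = block.count { |line| keyword?(line) } - block.count { |line| end?(line) }

  if lean == -1 && first > 0 && keyword?(lines[first - 1])
    first -= 1 # one extra `end`: take the keyword line just above
  elsif lean == 1 && last < lines.length - 1 && end?(lines[last + 1])
    last += 1 # one extra keyword: take the `end` line just below
  end

  [first, last]
end

lines = ["def bark", "", "end", "", "def sit", "end"]

p balance_one_line(lines, 1, 5) # => [0, 5]
p balance_one_line(lines, 4, 5) # => [4, 5]
```

In the first call the block holds two `end` lines but only one keyword, and the line above is `def bark`, so the block grows by one line to include it. In the second call the counts already match, so the range is returned unchanged.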

## Outcome

Adding this extra check introduced no regressions and fixed the test case. It might be possible there's a mirror or similar problem that we're not handling. That will come out in time. It might also be possible that this causes a worse case in some code not under test. That too would come out in time.

One other possible concern with adding logic in this area (which is a hot codepath) is performance. This extra count check will be performed for every block. In general, the two most helpful performance strategies I've found are reducing the total number of blocks (therefore reducing overall N internal iterations) and making better matches (the parser that determines whether a closed block is valid is a major bottleneck). If we can split valid code into valid blocks, then it's only evaluated by the parser once, whereas invalid code must be continuously re-checked by the parser until it becomes valid, or is determined to be the cause of the core problem.

This extra logic should very rarely result in a change, but when it does it should tend to produce slightly larger blocks (by one line) and more accurate blocks.

Informally it seems to have no impact on performance:

```
This branch:
DEBUG_DISPLAY=1 bundle exec rspec spec/ --format=failures  3.01s user 1.62s system 113% cpu 4.076 total
```

```
On main:
DEBUG_DISPLAY=1 bundle exec rspec spec/ --format=failures  3.02s user 1.64s system 113% cpu 4.098 total
```

https://github.com/ruby/syntax_suggest/commit/13739c6946

2023-04-06 15:45:28 +09:00

.github

Bump github/codeql-action from 2.2.9 to 2.2.10

2023-04-06 12:43:19 +09:00

basictest

…

benchmark

Remove MJIT-specific benchmarks

2023-03-06 22:36:57 -08:00

bin

util/rubocop -A --only Layout/EmptyLineAfterMagicComment

2023-03-23 17:18:49 +09:00

bootstraptest

YJIT: Add codegen for Integer methods (#7665)

2023-04-05 13:19:31 -07:00

ccan

…

coroutine

Add support for LoongArch (#7343)

2023-02-22 13:11:33 +09:00

coverage

…

cygwin

…

defs

BundledGem.dummy_spec needs to checkout revision after cloning repository.

2023-03-08 17:48:43 +09:00

doc

Add BIN as an entry in the glossary (#7667)

2023-04-05 16:11:04 -07:00

enc

Fix handling of 6-byte codepoints in left_adjust_char_head in CESU-8 encoding

2023-03-18 15:43:54 +09:00

ext

[Feature #19579] Remove !USE_RVARGC code (#7655)

2023-04-04 17:30:06 -04:00

gems

Fix a test in typeprof

2023-04-01 15:13:08 -07:00

include

[Feature #19579] Remove !USE_RVARGC code (#7655)

2023-04-04 17:30:06 -04:00

internal

[Feature #19579] Remove !USE_RVARGC code (#7655)

2023-04-04 17:30:06 -04:00

lib

[ruby/syntax_suggest] Fix sibling bug to #177

2023-04-06 15:45:28 +09:00

libexec

[ruby/irb] Removed Release Version and Revisions for old VCS software

2023-01-11 22:29:09 +00:00

man

Add RUBY_GC_HEAP_INIT_SIZE_%d_SLOTS to pre-init pools granularly

2023-02-08 09:26:07 +01:00

misc

gdb: Fix a command example

2023-04-01 00:23:35 -07:00

missing

…

sample

Add all-ruby-quine as a sample code

2023-02-27 11:20:42 +09:00

spec

[ruby/syntax_suggest] Fix sibling bug to #177

2023-04-06 15:45:28 +09:00

template

Check leaked global symbols by default

2023-04-03 10:07:22 +09:00

test

Add missing test for Data.initialize

2023-04-06 09:24:38 +03:00

tool

core_assertions.rb: Prefer CPU time clocks

2023-04-06 00:19:03 +09:00

wasm

…

win32

s/MJIT/RJIT/

2023-03-06 23:44:01 -08:00

yjit

YJIT: Add codegen for Integer methods (#7665)

2023-04-05 13:19:31 -07:00

.appveyor.yml

Bump the required BASERUBY version to 2.5 (#7504)

2023-03-10 23:40:22 -08:00

.cirrus.yml

Check leaked global symbols by default

2023-04-03 10:07:22 +09:00

.dir-locals.el

…

.document

s/mjit/rjit/

2023-03-06 23:44:01 -08:00

.editorconfig

…

.gdbinit

Expand tabs in .gdbinit

2023-03-31 00:52:47 -07:00

.git-blame-ignore-revs

Ignore parse.y expand tabs commit

2023-03-09 09:34:04 -08:00

.gitattributes

…

.gitignore

s/mjit/rjit/

2023-03-06 23:44:01 -08:00

.indent.pro

…

.rdoc_options

…

.rspec_parallel

…

.travis.yml

…

aclocal.m4

…

addr2line.c

addr2line.c: Silence GCC 11 false -Wmaybe-uninitialized warning

2023-01-16 15:45:51 -05:00

addr2line.h

…

array.c

[Feature #19579] Remove !USE_RVARGC code (#7655)

2023-04-04 17:30:06 -04:00

array.rb

Array#first and Array#last in Ruby

2023-03-23 14:03:12 +09:00

ast.c

Add utility macros DECIMAL_SIZE_OF and DECIMAL_SIZE_OF_BYTES

2023-02-14 15:18:21 +09:00

ast.rb

Fix spelling (#7389)

2023-02-27 09:56:06 -08:00

autogen.sh

…

bignum.c

Stop exporting symbols for MJIT

2023-03-06 21:59:23 -08:00

BSDL

…

builtin.c

Check loading built-in binaries

2023-03-08 13:59:21 +09:00

builtin.h

Remove MJIT's builtin function compiler

2023-03-07 23:16:24 -08:00

class.c

Adjust styles [ci skip]

2023-03-08 14:02:46 +09:00

common.mk

YJIT: Add codegen for Integer methods (#7665)

2023-04-05 13:19:31 -07:00

compar.c

Change ArgumentError message when Comparable#clamp receives min value higher than max value

2023-01-17 21:25:11 -08:00

compile.c

vm_call_single_noarg_inline_builtin

2023-03-23 14:03:12 +09:00

complex.c

[DOC] Enhanced RDoc for NilClass (#7500)

2023-03-13 12:55:59 -04:00

configure.ac

Add Dir.fchdir

2023-03-24 11:18:57 -07:00

constant.h

Stop exporting symbols for MJIT

2023-03-06 21:59:23 -08:00

cont.c

RJIT: Do nothing on jit_cont_free

2023-03-09 22:31:51 -08:00

CONTRIBUTING.md

…

COPYING

…

COPYING.ja

…

darray.h

Fix spelling (#7405)

2023-02-28 10:05:30 -08:00

debug_counter.c

Stop exporting symbols for MJIT

2023-03-06 21:59:23 -08:00

debug_counter.h

Refactor to separate marking and sweeping phases

2023-02-21 08:05:31 -05:00

debug.c

[Feature #19579] Remove !USE_RVARGC code (#7655)

2023-04-04 17:30:06 -04:00

dir.c

Suppress -Wdiscarded-qualifiers warning where fchdir is unusable

2023-04-04 11:27:43 +09:00

dir.rb

…

dln_find.c

…

dln.c

Update dln.c to fix error output from dln_open()

2023-03-21 19:10:19 +09:00

dln.h

…

dmydln.c

…

dmyenc.c

…

dmyext.c

…

encindex.h

…

encoding.c

Mark Encoding as Write Barrier protected

2023-02-07 11:48:57 +01:00

enum.c

Remove (newly unneeded) remarks about aliases

2023-02-19 14:26:34 -08:00

enumerator.c

Implement declarative references for enumerator

2023-03-17 19:20:40 +00:00

error.c

Stop exporting symbols for MJIT

2023-03-06 21:59:23 -08:00

eval_error.c

[Bug #19242] Prohibit circular causes to be loaded

2022-12-20 14:12:38 +09:00

eval_intern.h

…

eval_jump.c

…

eval.c

Remove obsoleted functions in rjit.c

2023-03-07 23:59:50 -08:00

file.c

Should not reach end of non-void function

2023-03-22 18:53:11 +09:00

gc.c

Ensure ruby_xfree won't segfault if called after vm_destruct

2023-04-05 12:57:32 -04:00

gc.rb

Add marking and sweeping time to GC.stat

2023-02-21 08:05:31 -05:00

gem_prelude.rb

…

golf_prelude.rb

…

goruby.c

…

GPL

…

hash.c

Change Hash#compact to keep default values and compare_by_identity flag

2023-03-24 10:55:13 -07:00

hrtime.h

…

id_table.c

Stop exporting symbols for MJIT

2023-03-06 21:59:23 -08:00

id_table.h

Transition complex objects to "too complex" shape

2022-12-15 10:06:04 -08:00

inits.c

Move WeakMap and WeakKeyMap code to weakmap.c

2023-03-10 09:32:10 -05:00

insns.def

Refactor jit_func_t and jit_exec

2023-03-16 10:42:17 -07:00

internal.h

Don't redefine RB_OBJ_WRITE

2023-01-18 08:49:32 -05:00

io_buffer.c

Support IO#pread / IO#pwrite using fiber scheduler. (#7594)

2023-03-31 00:48:55 +13:00

io.c

Support IO#pread / IO#pwrite using fiber scheduler. (#7594)

2023-03-31 00:48:55 +13:00

io.rb

…

iseq.c

Remove unused VM_CALL_BLOCKISEQ flag

2023-04-01 10:22:47 -07:00

iseq.h

Rename iseq_mark_and_update to iseq_mark_and_move

2023-02-08 12:43:25 -05:00

kernel.rb

Partially revert GH-7511

2023-03-15 09:53:49 -07:00

KNOWNBUGS.rb

…

LEGAL

Remove about ext/psych/yaml which is no longer bundled [ci skip]

2023-01-11 18:05:15 +09:00

lex.c.blt

…

load.c

Revert "reuse open(2) from rb_file_load_ok on POSIX-like system"

2023-02-27 09:24:45 -08:00

loadpath.c

…

localeinit.c

…

main.c

Enable DEBUG_LOG feature on USE_RUBY_DEBUG_LOG

2023-03-01 17:18:43 +09:00

marshal.c

Marshal.load: restore instance variables on Regexp

2023-02-21 13:57:04 +01:00

marshal.rb

…

math.c

…

memory_view.c

…

method.h

…

mini_builtin.c

Check loading built-in binaries

2023-03-08 13:59:21 +09:00

miniinit.c

…

NEWS.md

Revert "Fix transient heap mode"

2023-04-04 12:59:14 -07:00

nilclass.rb

…

node.c

…

node.h

Disallow mixed usage of ... and */**

2022-12-15 18:56:24 +09:00

numeric.c

Stop exporting symbols for MJIT

2023-03-06 21:59:23 -08:00

numeric.rb

Rename builtin attr :inline to :leaf

2023-03-11 14:25:12 -08:00

object.c

Use an st table for "too complex" objects

2023-03-20 13:54:18 -07:00

pack.c

Fix a typo in BUG message [ci skip]

2023-01-20 00:20:27 +09:00

pack.rb

…

parse.y

* in an array pattern should not be parsed as nil in ripper

2023-04-01 16:35:24 +09:00

prelude.rb

Fix ruby_testoptions on RubyCI

2023-03-08 12:00:14 -08:00

probes_helper.h

Stop exporting symbols for MJIT

2023-03-06 21:59:23 -08:00

probes.d

…

proc.c

Stop exporting symbols for MJIT

2023-03-06 21:59:23 -08:00

process.c

[DOC] Clarify behavior of abort() with no argument

2023-04-05 07:37:07 -07:00

ractor_core.h

relax assertion

2023-03-31 18:08:34 +09:00

ractor.c

show debug log for ractor_terminal_interrupt_all

2023-03-30 14:56:37 +09:00

ractor.rb

Ractor::Selector#empty?

2023-03-03 00:08:02 +09:00

random.c

…

range.c

Remove (newly unneeded) remarks about aliases

2023-02-19 14:26:34 -08:00

rational.c

[DOC] Enhanced RDoc for NilClass (#7500)

2023-03-13 12:55:59 -04:00

re.c

Stop exporting symbols for MJIT

2023-03-06 21:59:23 -08:00

README.EXT

…

README.EXT.ja

…

README.ja.md

s/MJIT/RJIT/

2023-03-06 23:44:01 -08:00

README.md

s/MJIT/RJIT/

2023-03-06 23:44:01 -08:00

regcomp.c

…

regenc.c

…

regenc.h

…

regerror.c

…

regexec.c

[Bug #19476]: correct cache index computation for repetition (#7457)

2023-03-13 18:31:13 +09:00

regint.h

…

regparse.c

…

regparse.h

…

regsyntax.c

…

rjit_c.c

RJIT: Support entry with different PCs

2023-04-02 15:27:40 -07:00

rjit_c.h

RJIT: Support entry with different PCs

2023-04-02 15:27:40 -07:00

rjit_c.rb

[Feature #19579] Remove !USE_RVARGC code (#7655)

2023-04-04 17:30:06 -04:00

rjit.c

RJIT: Add --rjit-verify-ctx option

2023-04-04 00:35:29 -07:00

rjit.h

RJIT: Add --rjit-verify-ctx option

2023-04-04 00:35:29 -07:00

rjit.rb

RJIT: Implement --rjit-trace-exits

2023-03-12 15:15:08 -07:00

ruby_assert.h

…

ruby_atomic.h

…

ruby-runner.c

Stop building mjit_build_dir.so

2023-03-06 22:14:44 -08:00

ruby.c

[Feature #19579] Remove !USE_RVARGC code (#7655)

2023-04-04 17:30:06 -04:00

rubystub.c

…

scheduler.c

Support IO#pread / IO#pwrite using fiber scheduler. (#7594)

2023-03-31 00:48:55 +13:00

shape.c

Lazily allocate id tables for children

2023-03-22 12:50:42 -07:00

shape.h

Adjust SHAPE_BUFFER_SIZE with shape_id_t

2023-03-24 13:52:55 -07:00

signal.c

Remove SIGCHLD waidpid. (#7527)

2023-03-15 19:48:27 +13:00

siphash.c

…

siphash.h

…

sparc.c

…

sprintf.c

…

st.c

Use an st table for "too complex" objects

2023-03-20 13:54:18 -07:00

strftime.c

…

string.c

[Feature #19579] Remove !USE_RVARGC code (#7655)

2023-04-04 17:30:06 -04:00

string.rb

[DOC] Add missing escape

2023-03-16 18:37:19 +09:00

struct.c

[DOC] Fix broken link Data#deconstruct_keys

2023-03-08 12:26:26 +09:00

symbol.c

Merge gc.h and internal/gc.h

2023-02-09 10:32:29 -05:00

symbol.h

…

symbol.rb

Remove (newly unneeded) remarks about aliases

2023-02-19 14:26:34 -08:00

thread_none.c

pass th to thread_sched_to_waiting()

2023-03-31 18:50:10 +09:00

thread_none.h

…

thread_pthread.c

pass th to thread_sched_to_waiting()

2023-03-31 18:50:10 +09:00

thread_pthread.h

nt->serial for RUBY_DEBUG_LOG

2023-03-31 11:28:18 +09:00

thread_sync.c

Correctly clean up keeping_mutexes before resuming any other threads. (#7460)

2023-03-07 20:23:00 +13:00

thread_sync.rb

…

thread_win32.c

pass th to thread_sched_to_waiting()

2023-03-31 18:50:10 +09:00

thread_win32.h

…

thread.c

fix deadlock on Thread#join

2023-04-04 07:57:51 +09:00

time.c

Fix crash in Time on 32-bit systems

2023-04-04 11:12:07 -04:00

timev.h

Fix crash in Time on 32-bit systems

2023-04-04 11:12:07 -04:00

timev.rb

[Feature #18033] Add precision: option

2022-12-16 22:52:59 +09:00

trace_point.rb

[DOC] Update TracePoint#binding docs for 3.2 behavior

2023-02-19 22:32:52 +02:00

transcode_data.h

…

transcode.c

[Feature #19579] Remove !USE_RVARGC code (#7655)

2023-04-04 17:30:06 -04:00

transient_heap.c

Merge gc.h and internal/gc.h

2023-02-09 10:32:29 -05:00

transient_heap.h

…

util.c

…

variable.c

Use an st table for "too complex" objects

2023-03-20 13:54:18 -07:00

variable.h

…

version.c

s/mjit/rjit/

2023-03-06 23:44:01 -08:00

version.h

…

vm_args.c

Hash#dup for kwsplat arguments

2023-03-15 18:05:13 +09:00

vm_backtrace.c

Suppress -Wsign-compare warning

2023-03-23 23:31:46 +09:00

vm_callinfo.h

Remove unused VM_CALL_BLOCKISEQ flag

2023-04-01 10:22:47 -07:00

vm_core.h

rb_th_serial(th) allows th == NULL

2023-04-04 15:42:37 +09:00

vm_debug.h

…

vm_dump.c

Add thread and ractor counts to bug reports

2023-03-16 10:46:30 -04:00

vm_eval.c

Remove unused jit_enable_p flag

2023-03-14 14:01:53 -07:00

vm_exec.c

…

vm_exec.h

Refactor jit_func_t and jit_exec

2023-03-16 10:42:17 -07:00

vm_insnhelper.c

vm_call_single_noarg_inline_builtin

2023-03-23 14:03:12 +09:00

vm_insnhelper.h

s/mjit/rjit/

2023-03-06 23:44:01 -08:00

vm_method.c

RJIT: Stop allowing leaked globals rjit_*

2023-03-08 23:24:38 -08:00

vm_opts.h

Remove an unused VM option

2023-03-13 20:54:00 -07:00

vm_sync.c

Move RB_VM_SAVE_MACHINE_CONTEXT to internal/thread.h

2023-03-15 21:26:26 +00:00

vm_sync.h

…

vm_trace.c

vm_call_single_noarg_inline_builtin

2023-03-23 14:03:12 +09:00

vm.c

nt->serial for RUBY_DEBUG_LOG

2023-03-31 11:28:18 +09:00

vsnprintf.c

…

warning.rb

[DOC] [Bug #19290] fix formatting

2023-01-01 14:50:39 +09:00

weakmap.c

ObjectSpace::WeakMap: clean inverse reference when an entry is re-assigned

2023-03-17 17:50:08 +00:00

yjit.c

YJIT: Add codegen for Integer methods (#7665)

2023-04-05 13:19:31 -07:00

yjit.h

YJIT: Add --yjit-pause and RubyVM::YJIT.resume (#7609)

2023-03-28 15:21:19 -04:00

yjit.rb

YJIT: Count the number of actually written bytes (#7658)

2023-04-05 10:32:04 -04:00

README.md

What is Ruby?

Ruby is an interpreted object-oriented programming language often used for web development. It also offers many scripting features to process plain text and serialized files, or manage system tasks. It is simple, straightforward, and extensible.

Features of Ruby

Simple Syntax
Normal Object-oriented Features (e.g. class, method calls)
Advanced Object-oriented Features (e.g. mix-in, singleton-method)
Operator Overloading
Exception Handling
Iterators and Closures
Garbage Collection
Dynamic Loading of Object Files (on some architectures)
Highly Portable (works on many Unix-like/POSIX compatible platforms as well as Windows, macOS, etc.) cf. https://github.com/ruby/ruby/blob/master/doc/maintainers.md#platform-maintainers

How to get Ruby

For a complete list of ways to install Ruby, including using third-party tools like rvm, see:

https://www.ruby-lang.org/en/downloads/

You can download release packages and the snapshot of the repository. If you want to download whole versions of Ruby, please visit https://www.ruby-lang.org/en/downloads/releases/.

Download with Git

The mirror of the Ruby source tree can be checked out with the following command:

$ git clone https://github.com/ruby/ruby.git

There are some other branches under development. Try the following command to see the list of branches:

$ git ls-remote https://github.com/ruby/ruby.git

You may also want to use https://git.ruby-lang.org/ruby.git (actual master of Ruby source) if you are a committer.

How to build

See Building Ruby

Ruby home page

https://www.ruby-lang.org/

Documentation

Mailing list

There is a mailing list to discuss Ruby. To subscribe to this list, please send the following phrase:

subscribe

in the mail body (not subject) to the address ruby-talk-request@ruby-lang.org.

Copying

See the file COPYING.

Feedback

Questions about the Ruby language can be asked on the Ruby-Talk mailing list or on websites like https://stackoverflow.com.

Bugs should be reported at https://bugs.ruby-lang.org. Read "Reporting Issues" for more information.

Contributing

See "Contributing to Ruby", which includes setup and build instructions.

The Author

Ruby was originally designed and developed by Yukihiro Matsumoto (Matz) in 1995.

matz@ruby-lang.org

Languages

Ruby 58.9%

C 29.5%

Rust 6.1%

C++ 2.9%

Yacc 0.9%

Other 1.6%