In the C API, we want to use slices instead of locations in the
AST. In this case a "slice" is effectively the same thing as the
location, expect it is represented using a 32-bit offset and a
32-bit length. This will cut down on half of the space of all of
the locations in the AST.
Note that from the Ruby/Java/JavaScript side, this is effectively
an invisible change. This only impacts the C/Rust side.
Since `on_sp` is emitted, it doesn't do a whole lot anymore.
This leaves one incompatibility for code like `"x#$%"`
Ripper confuses this for bare interpolation with a global, but `$%` is not a valid global name. Still,
it emits two string tokens in such a case. It doesn't make sense for prism to work around this bug,
so the affected files are added as excludes.
Since the only usage of this method makes sense for testing in prism itself,
the method is removed instead of deprecated.
https://github.com/ruby/prism/commit/31be379f98
You're supposed to return the first argument.
```rb
# Before
[[:stmts_new], [:rescue_mod, nil, nil], [:stmts_add, nil, nil], [:program, nil]]
# After
[[:stmts_new], [:rescue_mod, "1", "2"], [:stmts_add, nil, "1"], [:program, nil]]
```
The correct result would be:
`[[:rescue_mod, "1", "2"], [:stmts_new], [:stmts_add, nil, "1"], [:program, nil]]`
But the order depends on the prism AST so it seems very difficult to match.
https://github.com/ruby/prism/commit/94e0107729
* Handle line continuations.
* Handle space at the end of file in LexCompat.
https://github.com/ruby/prism/commit/32bd13eb7d
Co-authored-by: Earlopain <14981592+Earlopain@users.noreply.github.com>
This is everything that `irb` uses. It works in their test-suite, but there are 20 failures when using the shim that I haven't looked into at all.
`parse` is not used by `irb`. `scan` is, and it's basically `parse` but also including errors. `irb` doesn't seem to care about the errors, so I didn't implement that.
https://github.com/ruby/prism/commit/2c5826b39f
One per version seems excessive.
Do note that `rubocop-ast` used to require individual parser files. I wouldn't consider that to be part of the API since everything is autoloaded.
From a GitHub code search, I didn't find anyone else doing it like that.
https://github.com/ruby/prism/commit/458f622c34
Ripper exposes Ripper::Lexer:State in its output, which is a bit of a problem. To make this work, I basically copy-pasted the implementation.
I'm unsure if that is acceptable and added a test to make sure that these values never go out of sync.
I don't imagine them changing often, prism maps them 1:1 for its own usage.
This also fixed the shim by accident. `Ripper.lex` went to `Translation::Ripper.lex` when it should have been the original. Removing the need for the original resolves that issue.
https://github.com/ruby/prism/commit/2c0bea076d
This commit adds an `expect1_opening` function that expects a token and
attaches the error to the opening token location rather than the current
position. This is useful for errors about missing closing tokens, where
we want to point to the line with the opening token rather than the end
of the file.
For example:
```ruby
def foo
def bar
def baz
^ expected an `end` to close the `def` statement
^ expected an `end` to close the `def` statement
^ expected an `end` to close the `def` statement
```
This would previously produce three identical errors at the end of the
file. After this commit, they would be reported at the opening token
location:
```ruby
def foo
^~~ expected an `end` to close the `def` statement
def bar
^~~ expected an `end` to close the `def` statement
def baz
^~~ expected an `end` to close the `def` statement
```
I considered using the end of the line where the opening token is
located, but in some cases that would be less useful than the opening
token location itself. For example:
```ruby
def foo def bar def baz
```
Here the end of the line where the opening token is located would be the
same for each of the unclosed `def` nodes.
https://github.com/ruby/prism/commit/2d7829f060
The lexer did not jump to the `heredoc_end`, causing the heredoc end delimiter
to be parsed twice.
Normally the heredocs get flushed when a newline is encountered. But because
the newline is part of the string delimiter, that codepath is not taken.
Fixes [Bug #21758]
https://github.com/ruby/prism/commit/7440eb4b11
Mostly not having to list version-specific excludes when testing against ripper/parse.y
Also don't test new syntax additions against the parser gems. The version support
for them may (or may not) be expanded but we shouldn't bother while the ruby version
hasn't even released yet.
(ruby_parser translation is not versioned, so let as is for now)
I also removed excludes that have since been implemented by parse.y
https://github.com/ruby/prism/commit/e5a0221c37
The unicode version has been updated upstream, which means new
codepoints mapped to alpha/alnum/isupper flags. We need to update
our tables to match.
I'm purposefully not adding a version check here, since that is
such a large amount of code. It's possible that we could include
different tables depending on a macro (like UNICODE_VERSION) or
something to that effect, but it's such a minimal impact on the
running of the actual parser that I don't think it's necessary.
https://github.com/ruby/prism/commit/78925fe5b6
When a pattern match is using a string as a hash pattern key and is
using it incorrectly, we were previously assuming it was a symbol.
In the case of an error, that's not the case. So we need to add a
missing node in this case.
https://github.com/ruby/prism/commit/f0b06d6269
They were being parsed as `p((p a, &block) => value)`.
When we get to this point, we must not just have parsed a command call, always consuming the `=>` is not correct.
Closes [Bug #21622]
https://github.com/ruby/prism/commit/796ab0edf4
When there are nested capture variables inside of a pattern match
that has an alternation pattern, it is a syntax error. Currently it
only adds a syntax error when it is at the top level of the pattern.
This one has been on my mind for a while now.
Currently, there are only tests against the latest syntax version.
This changes the snapshot structure as follows:
* Snapshots at their current location are tested against all syntax versions
* Snapshots inside a version folder like "3.3" are tested against all versions starting from that version
* Snapshots inside a version folder like "3.3-4.2" are tested against all versions in the given range.
This makes sure that as new syntax is added, older versions still work as expected.
I also added a few tests for now valid syntax that should be invalid in older versions (and the other way around as well)
These tests run really fast. So even though it does 3x the work for these, I am still able to run the whole test suite in just 11 seconds.
https://github.com/ruby/prism/commit/5191b1aa68
The docs currently say to use `Prism.parse(foo, version: RUBY_VERSION)` for this.
By specifying "current" instead, we can have prism raise a more specifc error.
Note: Does not use `ruby_version` from `ruby/version.h` because writing a test for that is not really possible.
`RUBY_VERSION` is nicely stubbable for both the c-ext and FFI backend.
https://github.com/ruby/prism/commit/9c5cd205cf