Skip to content

Use darray for incoming branches list#48

Merged
maximecb merged 1 commit into
microjitfrom
ujit_inc_branches_darray
Feb 19, 2021
Merged

Use darray for incoming branches list#48
maximecb merged 1 commit into
microjitfrom
ujit_inc_branches_darray

Conversation

@maximecb

Copy link
Copy Markdown

No description provided.

@maximecb maximecb merged commit 362b478 into microjit Feb 19, 2021
@maximecb maximecb deleted the ujit_inc_branches_darray branch March 6, 2021 05:39
tenderlove pushed a commit that referenced this pull request Jun 2, 2021
* Implement gen_newarray

* Implement newhash for n=0

* Add yjit tests for newhash/newarray

* Fix integer size warning on clang

* Save PC and SP in newhash and newarray

Co-authored-by: Maxime Chevalier-Boisvert <maximechevalierb@gmail.com>
maximecb added a commit that referenced this pull request Oct 19, 2021
* Implement gen_newarray

* Implement newhash for n=0

* Add yjit tests for newhash/newarray

* Fix integer size warning on clang

* Save PC and SP in newhash and newarray

Co-authored-by: Maxime Chevalier-Boisvert <maximechevalierb@gmail.com>
noahgibbs pushed a commit that referenced this pull request Oct 20, 2021
* Implement gen_newarray

* Implement newhash for n=0

* Add yjit tests for newhash/newarray

* Fix integer size warning on clang

* Save PC and SP in newhash and newarray

Co-authored-by: Maxime Chevalier-Boisvert <maximechevalierb@gmail.com>
XrXr pushed a commit that referenced this pull request Oct 20, 2021
* Implement gen_newarray

* Implement newhash for n=0

* Add yjit tests for newhash/newarray

* Fix integer size warning on clang

* Save PC and SP in newhash and newarray

Co-authored-by: Maxime Chevalier-Boisvert <maximechevalierb@gmail.com>
XrXr pushed a commit that referenced this pull request Oct 20, 2021
* Implement gen_newarray

* Implement newhash for n=0

* Add yjit tests for newhash/newarray

* Fix integer size warning on clang

* Save PC and SP in newhash and newarray

Co-authored-by: Maxime Chevalier-Boisvert <maximechevalierb@gmail.com>
XrXr pushed a commit that referenced this pull request Oct 21, 2021
* Implement gen_newarray

* Implement newhash for n=0

* Add yjit tests for newhash/newarray

* Fix integer size warning on clang

* Save PC and SP in newhash and newarray

Co-authored-by: Maxime Chevalier-Boisvert <maximechevalierb@gmail.com>
eightbitraptor pushed a commit that referenced this pull request Dec 9, 2022
Remove all shipped files and require path on JRuby until we can
add JRuby's extension to the gem.

Temporary workaround for #48

ruby/date@94c3becef2
XrXr pushed a commit that referenced this pull request May 2, 2025
XrXr added a commit that referenced this pull request Jul 30, 2025
On the ruby side, this fixes a crash for methods with 39 or more
parameters. We used to miscomp those entry points due to Insn::Lea
picking ADDS which cannot reference SP:

    # set method params: 40
    mov x0, #0xfee8
    movk x0, #0xffff, lsl #16
    movk x0, #0xffff, lsl #32
    movk x0, #0xffff, lsl #48
    adds x0, xzr, x0

Have Lea work for all i32 displacements and avoid involving the split
pass. Previously, direct use of Insn::Lea directly from the user (as
opposed to generated by the split pass for some memory operations)
wasn't split, so being able to handle the whole range in arm64_emit()
was implicitly required. Also, not going through split reduces register
pressure.
XrXr added a commit that referenced this pull request Jul 30, 2025
On the ruby side, this fixes a crash for methods with 39 or more
parameters. We used to miscomp those entry points due to Insn::Lea
picking ADDS which cannot reference SP:

    # set method params: 40
    mov x0, #0xfee8
    movk x0, #0xffff, lsl #16
    movk x0, #0xffff, lsl #32
    movk x0, #0xffff, lsl #48
    adds x0, xzr, x0

Have Lea work for all i32 displacements and avoid involving the split
pass. Previously, direct use of Insn::Lea directly from the user (as
opposed to generated by the split pass for some memory operations)
wasn't split, so being able to handle the whole range in arm64_emit()
was implicitly required. Also, not going through split reduces register
pressure.
casperisfine pushed a commit that referenced this pull request Jul 31, 2025
On the ruby side, this fixes a crash for methods with 39 or more
parameters. We used to miscomp those entry points due to Insn::Lea
picking ADDS which cannot reference SP:

    # set method params: 40
    mov x0, #0xfee8
    movk x0, #0xffff, lsl #16
    movk x0, #0xffff, lsl #32
    movk x0, #0xffff, lsl #48
    adds x0, xzr, x0

Have Lea work for all i32 displacements and avoid involving the split
pass. Previously, direct use of Insn::Lea directly from the user (as
opposed to generated by the split pass for some memory operations)
wasn't split, so being able to handle the whole range in arm64_emit()
was implicitly required. Also, not going through split reduces register
pressure.
tekknolagi pushed a commit to tekknolagi/ruby that referenced this pull request Jun 10, 2026
* ZJIT: Use shape id as cache key for object layout

Since ruby#17158, we can use the shape id as our cache key for determining
object layout.  This patch changes ZJIT to use shape id instead of
testing all bits.

Given this program:

```ruby
class Foo
  def initialize; @bar = 123; end
  def read; @bar; end
  def add; @baz = 123; end
end

foo = Foo.new
3.times {
  foo = Foo.new
  foo.read
  foo.add
  foo.read
}
```

We end up with a polymorphic read in the `read` method.  On master, the
HIR looks like this:

```
Optimized HIR:
fn read@../test.rb:9:
bb1():
  EntryPoint interpreter
  v1:HeapBasicObject = LoadSelf
  Jump bb3(v1)
bb2():
  EntryPoint JIT(0)
  v4:HeapBasicObject = LoadArg :self@0
  Jump bb3(v4)
bb3(v6:HeapBasicObject):
  PatchPoint SingleRactorMode
  v12:CUInt64 = LoadField v6, :RBASIC_FLAGS@0x0
  v14:CUInt64[0xffffffff0000001f] = Const CUInt64(0xffffffff0000001f)
  v15:CPtr[CPtr(0x8000800000001)] = Const CPtr(0x8000800000001)
  v16 = RefineType v15, CUInt64
  v17:CInt64 = IntAnd v12, v14
  v18:CBool = IsBitEqual v17, v16
  CondBranch v18, bb5(), bb6()
bb5():
  v20:BasicObject = LoadField v6, :@bar@0x10
  Jump bb4(v20)
bb6():
  v22:CUInt64[0xffffffff0000001f] = Const CUInt64(0xffffffff0000001f)
  v23:CPtr[CPtr(0x8000900000001)] = Const CPtr(0x8000900000001)
  v24 = RefineType v23, CUInt64
  v25:CInt64 = IntAnd v12, v22
  v26:CBool = IsBitEqual v25, v24
  CondBranch v26, bb7(), bb8()
bb7():
  v28:BasicObject = LoadField v6, :@bar@0x10
  Jump bb4(v28)
bb8():
  v30:BasicObject = GetIvar v6, :@bar
  Jump bb4(v30)
bb4(v13:BasicObject):
  CheckInterrupts
  Return v13
```

On this branch, the HIR is like this:

```
Optimized HIR:
fn read@../test.rb:9:
bb1():
  EntryPoint interpreter
  v1:HeapBasicObject = LoadSelf
  Jump bb3(v1)
bb2():
  EntryPoint JIT(0)
  v4:HeapBasicObject = LoadArg :self@0
  Jump bb3(v4)
bb3(v6:HeapBasicObject):
  PatchPoint SingleRactorMode
  v13:CShape = LoadField v6, :shape_id@0x4
  v14:CShape[0x80008] = Const CShape(0x80008)
  v15:CBool = IsBitEqual v14, v13
  CondBranch v15, bb5(), bb6()
bb5():
  v17:BasicObject = LoadField v6, :@bar@0x10
  Jump bb4(v17)
bb6():
  v19:CShape = LoadField v6, :shape_id@0x4
  v20:CShape[0x80009] = Const CShape(0x80009)
  v21:CBool = IsBitEqual v20, v19
  CondBranch v21, bb7(), bb8()
bb7():
  v23:BasicObject = LoadField v6, :@bar@0x10
  Jump bb4(v23)
bb8():
  v25:BasicObject = GetIvar v6, :@bar
  Jump bb4(v25)
bb4(v12:BasicObject):
  CheckInterrupts
  Return v12
```

We're able to avoid loading all of the flags, applying a mask, and the
testing.

The machine code for bb3 looks like this on master (I've removed the nop
buffers for patch points):

```
  # Insn: v12 LoadField v6, :RBASIC_FLAGS@0x0
  # Load field id=RBASIC_FLAGS offset=0
  0x122c50124: ldur x1, [x0]
  # Insn: v14 Const CUInt64(0xffffffff0000001f)
  # Insn: v15 Const CPtr(0x8000800000001)
  # Insn: v16 RefineType v15, CUInt64
  # Insn: v17 IntAnd v12, v14
  0x122c50128: and x2, x1, #0xffffffff0000001f
  # Insn: v18 IsBitEqual v17, v16
  0x122c5012c: mov x3, #1
  0x122c50130: movk x3, #0, lsl Shopify#16
  0x122c50134: movk x3, Shopify#8, lsl Shopify#32
  0x122c50138: movk x3, Shopify#8, lsl Shopify#48
  0x122c5013c: cmp x2, x3
  0x122c50140: mov x2, #1
  0x122c50144: mov x3, #0
  0x122c50148: csel x2, x2, x3, eq
  0x122c5014c: tst x2, x2
  0x122c50150: b.ne #0x122c501e8
```

On this branch it looks like this:

```
  # Insn: v13 LoadField v6, :shape_id@0x4
  # Load field id=shape_id offset=4
  0x124dd0124: ldur w1, [x0, #4]
  # Insn: v14 Const CShape(0x80008)
  # Insn: v15 IsBitEqual v14, v13
  0x124dd0128: mov x2, Shopify#8
  0x124dd012c: movk x2, Shopify#8, lsl Shopify#16
  0x124dd0130: cmp x2, x1
  0x124dd0134: mov x1, #1
  0x124dd0138: mov x2, #0
  0x124dd013c: csel x1, x1, x2, eq
  0x124dd0140: tst x1, x1
  0x124dd0144: b.ne #0x124dd01d4
```

We've eliminated the `and` instruction and only need to do a 32 bit load
for the shape.

* fix operand order

* Remove comments and whitespace

* remove stuttering

* fix variable name to be more clear

* Keep assert

* update snapshots
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant