small increment and fixes
Some checks are pending
CI / Julia ${{ matrix.version }} - ${{ matrix.os }} - ${{ matrix.arch }} - ${{ github.event_name }} (x64, ubuntu-latest, 1.10) (push) Waiting to run
CI / Julia ${{ matrix.version }} - ${{ matrix.os }} - ${{ matrix.arch }} - ${{ github.event_name }} (x64, ubuntu-latest, 1.6) (push) Waiting to run
CI / Julia ${{ matrix.version }} - ${{ matrix.os }} - ${{ matrix.arch }} - ${{ github.event_name }} (x64, ubuntu-latest, pre) (push) Waiting to run
@@ -16,10 +16,10 @@ All Instructions: https://docs.nvidia.com/cuda/parallel-thread-execution/index.h
 )
 {
-.reg .pred %p<2>; -> predicate registers: p1 (needed for branching)
-.reg .f32 %f<4>; -> float registers: f1 - f3
-.reg .b32 %r<6>; -> 32 bits registers: r1 - r5 (bits are actual raw bits without a type)
-.reg .b64 %rd<11>; -> 64 bits registers: rd1 - rd10
+.reg .pred %p<2>; -> predicate registers: p0, p1 (needed for branching)
+.reg .f32 %f<4>; -> float registers: f0 - f3
+.reg .b32 %r<6>; -> 32 bits registers: r0 - r5 (bits are actual raw bits without a type)
+.reg .b64 %rd<11>; -> 64 bits registers: rd0 - rd10

 ld.param.u64 %rd1, [VecAdd_kernel_param_0]; -> rd1 = Data1
 ld.param.u64 %rd2, [VecAdd_kernel_param_1]; -> rd2 = Data2
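The comment fix in this hunk follows the PTX rule for parameterized register names: a declaration of the form `.reg .type %name<N>;` introduces N registers numbered from 0 to N-1. The snippet below is a minimal Python sketch of that expansion. `expand_reg_decl` is a hypothetical helper written for illustration, not part of this repository, and it assumes the single-name form of the declaration that appears in the diff.

```python
import re

# Hypothetical helper (not part of the repository): expands a parameterized
# PTX register declaration into the register names it introduces.
# Assumption: only the single-name form ".reg .type %name<N>;" is handled.
def expand_reg_decl(decl):
    m = re.fullmatch(r"\s*\.reg\s+\.(\w+)\s+%(\w+)<(\d+)>;\s*", decl)
    if m is None:
        raise ValueError(f"not a parameterized .reg declaration: {decl!r}")
    prefix = m.group(2)
    count = int(m.group(3))
    # %name<N> declares N registers, numbered 0 through N-1.
    return [f"%{prefix}{i}" for i in range(count)]


if __name__ == "__main__":
    for decl in (
        ".reg .pred %p<2>;",
        ".reg .f32 %f<4>;",
        ".reg .b32 %r<6>;",
        ".reg .b64 %rd<11>;",
    ):
        names = expand_reg_decl(decl)
        print(decl, "->", names[0], "...", names[-1])
```

Under this reading, `%rd<11>` covers `%rd0` through `%rd10`, which agrees with the corrected comments; the `ld.param` lines then use `%rd1` and `%rd2` from that range.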