Assembly Diffs

osx arm64

Diffs are based on 2,229,935 contexts (927,360 MinOpts, 1,302,575 FullOpts).

MISSED contexts: 6,082 (0.27%)

No diffs found.

Details

Context information

| Collection | Diffed contexts | MinOpts | FullOpts | Missed, base | Missed, diff |
|---|---|---|---|---|---|
| benchmarks.run_pgo.osx.arm64.checked.mch | 84,643 | 48,345 | 36,298 | 183 (0.22%) | 183 (0.22%) |
| benchmarks.run_tiered.osx.arm64.checked.mch | 48,253 | 37,331 | 10,922 | 63 (0.13%) | 63 (0.13%) |
| coreclr_tests.run.osx.arm64.checked.mch | 586,148 | 358,028 | 228,120 | 437 (0.07%) | 437 (0.07%) |
| libraries.crossgen2.osx.arm64.checked.mch | 233,760 | 15 | 233,745 | 0 (0.00%) | 0 (0.00%) |
| libraries.pmi.osx.arm64.checked.mch | 313,588 | 18 | 313,570 | 2,028 (0.64%) | 2,028 (0.64%) |
| libraries_tests.run.osx.arm64.Release.mch | 631,294 | 462,062 | 169,232 | 963 (0.15%) | 963 (0.15%) |
| librariestestsnotieredcompilation.run.osx.arm64.Release.mch | 301,031 | 21,558 | 279,473 | 2,083 (0.69%) | 2,083 (0.69%) |
| realworld.run.osx.arm64.checked.mch | 31,218 | 3 | 31,215 | 325 (1.03%) | 325 (1.03%) |
| | 2,229,935 | 927,360 | 1,302,575 | 6,082 (0.27%) | 6,082 (0.27%) |


windows arm64

Diffs are based on 2,308,464 contexts (929,692 MinOpts, 1,378,772 FullOpts).

MISSED contexts: 6,334 (0.27%)

No diffs found.

Details

Context information

| Collection | Diffed contexts | MinOpts | FullOpts | Missed, base | Missed, diff |
|---|---|---|---|---|---|
| benchmarks.run.windows.arm64.checked.mch | 24,218 | 4 | 24,214 | 229 (0.94%) | 229 (0.94%) |
| benchmarks.run_pgo.windows.arm64.checked.mch | 96,879 | 48,066 | 48,813 | 104 (0.11%) | 104 (0.11%) |
| benchmarks.run_tiered.windows.arm64.checked.mch | 48,412 | 36,693 | 11,719 | 61 (0.13%) | 61 (0.13%) |
| coreclr_tests.run.windows.arm64.checked.mch | 595,265 | 362,539 | 232,726 | 438 (0.07%) | 438 (0.07%) |
| libraries.crossgen2.windows.arm64.checked.mch | 243,831 | 15 | 243,816 | 0 (0.00%) | 0 (0.00%) |
| libraries.pmi.windows.arm64.checked.mch | 302,817 | 6 | 302,811 | 2,054 (0.67%) | 2,054 (0.67%) |
| libraries_tests.run.windows.arm64.Release.mch | 625,154 | 460,799 | 164,355 | 900 (0.14%) | 900 (0.14%) |
| librariestestsnotieredcompilation.run.windows.arm64.Release.mch | 314,858 | 21,559 | 293,299 | 2,179 (0.69%) | 2,179 (0.69%) |
| realworld.run.windows.arm64.checked.mch | 32,878 | 3 | 32,875 | 366 (1.10%) | 366 (1.10%) |
| smoke_tests.nativeaot.windows.arm64.checked.mch | 24,152 | 8 | 24,144 | 3 (0.01%) | 3 (0.01%) |
| | 2,308,464 | 929,692 | 1,378,772 | 6,334 (0.27%) | 6,334 (0.27%) |


windows x64

Diffs are based on 2,366,413 contexts (928,740 MinOpts, 1,437,673 FullOpts).

MISSED contexts: 6,788 (0.29%)

Overall (-2,856 bytes)

| Collection | Base size (bytes) | Diff size (bytes) |
|---|---|---|
| benchmarks.run.windows.x64.checked.mch | 8,538,947 | -28 |
| coreclr_tests.run.windows.x64.checked.mch | 393,235,161 | -528 |
| libraries.pmi.windows.x64.checked.mch | 60,138,217 | -499 |
| libraries_tests.run.windows.x64.Release.mch | 278,258,792 | -1,014 |
| librariestestsnotieredcompilation.run.windows.x64.Release.mch | 135,861,173 | -759 |
| smoke_tests.nativeaot.windows.x64.checked.mch | 5,087,315 | -28 |

MinOpts (-248 bytes)

| Collection | Base size (bytes) | Diff size (bytes) |
|---|---|---|
| coreclr_tests.run.windows.x64.checked.mch | 273,504,444 | -192 |
| libraries_tests.run.windows.x64.Release.mch | 175,002,442 | -56 |

FullOpts (-2,608 bytes)

| Collection | Base size (bytes) | Diff size (bytes) |
|---|---|---|
| benchmarks.run.windows.x64.checked.mch | 8,538,586 | -28 |
| coreclr_tests.run.windows.x64.checked.mch | 119,730,717 | -336 |
| libraries.pmi.windows.x64.checked.mch | 60,024,698 | -499 |
| libraries_tests.run.windows.x64.Release.mch | 103,256,350 | -958 |
| librariestestsnotieredcompilation.run.windows.x64.Release.mch | 124,984,011 | -759 |
| smoke_tests.nativeaot.windows.x64.checked.mch | 5,086,368 | -28 |
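
As a quick consistency check on the windows x64 numbers, the per-collection deltas can be summed and compared with the section totals (-2,856 overall, -248 MinOpts, -2,608 FullOpts). The sketch below is illustrative only: the dictionary names and the `combined` helper are hypothetical, and the collection names are shortened for readability.

```python
# Per-collection size deltas in bytes, copied from the windows x64 tables.
# Collection names are abbreviated; the variable names are illustrative.
overall = {
    "benchmarks.run": -28,
    "coreclr_tests.run": -528,
    "libraries.pmi": -499,
    "libraries_tests.run": -1014,
    "librariestestsnotieredcompilation.run": -759,
    "smoke_tests.nativeaot": -28,
}
min_opts = {
    "coreclr_tests.run": -192,
    "libraries_tests.run": -56,
}
full_opts = {
    "benchmarks.run": -28,
    "coreclr_tests.run": -336,
    "libraries.pmi": -499,
    "libraries_tests.run": -958,
    "librariestestsnotieredcompilation.run": -759,
    "smoke_tests.nativeaot": -28,
}


def combined(a, b):
    """Add two delta tables key by key; a missing key counts as 0."""
    return {k: a.get(k, 0) + b.get(k, 0) for k in set(a) | set(b)}


# (overall, MinOpts, FullOpts) totals recomputed from the rows.
totals = (sum(overall.values()), sum(min_opts.values()), sum(full_opts.values()))
```

Under these inputs the MinOpts and FullOpts rows add up to the Overall row for every collection, and the three sums match the totals stated in the section headings.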

Example diffs

benchmarks.run.windows.x64.checked.mch

-28 (-2.50%) : 24358.dasm - System.Buffers.ProbabilisticMap:IndexOfAnyVectorized(byref,byref,int,System.ReadOnlySpan`1[ushort]):int (FullOpts)

@@ -186,12 +186,11 @@ G_M48875_IG06: ; bbWeight=4, gcrefRegs=0000 {}, byrefRegs=C0C8 {rbx rsi r
vpshufb ymm1, ymm3, ymm1
vpand ymm2, ymm2, ymmword ptr [reloc @RWD96]
vpcmpub k1, ymm2, ymmword ptr [reloc @RWD128], 6
- vpmovm2b ymm3, k1
- vpsubb ymm4, ymm2, ymmword ptr [reloc @RWD160]
- vpshufb ymm4, ymm9, ymm4
+ vpsubb ymm3, ymm2, ymmword ptr [reloc @RWD160]
+ vpshufb ymm3, ymm9, ymm3
vpshufb ymm2, ymm8, ymm2
- vpternlogd ymm3, ymm4, ymm2, -54
- vpand ymm1, ymm3, ymm1
+ vpblendmb ymm2 {k1}, ymm2, ymm3
+ vpand ymm1, ymm2, ymm1
vxorps ymm2, ymm2, ymm2
vpcmpeqb ymm1, ymm1, ymm2
vpcmpeqd ymm2, ymm2, ymm2
@@ -202,12 +201,11 @@ G_M48875_IG06: ; bbWeight=4, gcrefRegs=0000 {}, byrefRegs=C0C8 {rbx rsi r
vpshufb ymm2, ymm3, ymm2
vpand ymm0, ymm0, ymmword ptr [reloc @RWD96]
vpcmpub k1, ymm0, ymmword ptr [reloc @RWD128], 6
- vpmovm2b ymm3, k1
- vpsubb ymm4, ymm0, ymmword ptr [reloc @RWD160]
- vpshufb ymm4, ymm9, ymm4
+ vpsubb ymm3, ymm0, ymmword ptr [reloc @RWD160]
+ vpshufb ymm3, ymm9, ymm3
vpshufb ymm0, ymm8, ymm0
- vpternlogd ymm3, ymm4, ymm0, -54
- vpand ymm0, ymm3, ymm2
+ vpblendmb ymm0 {k1}, ymm0, ymm3
+ vpand ymm0, ymm0, ymm2
vxorps ymm2, ymm2, ymm2
vpcmpeqb ymm0, ymm0, ymm2
vpcmpeqd ymm2, ymm2, ymm2
@@ -215,7 +213,7 @@ G_M48875_IG06: ; bbWeight=4, gcrefRegs=0000 {}, byrefRegs=C0C8 {rbx rsi r
vpand ymm0, ymm1, ymm0
vptest ymm0, ymm0
je SHORT G_M48875_IG09
- ;; size=248 bbWeight=4 PerfScore 317.33
+ ;; size=234 bbWeight=4 PerfScore 313.33
G_M48875_IG07: ; bbWeight=2, gcrefRegs=0000 {}, byrefRegs=C0C8 {rbx rsi rdi r14 r15}, byref
vpermq ymm0, ymm0, -40
vpmovmskb ebp, ymm0
@@ -338,12 +336,11 @@ G_M48875_IG16: ; bbWeight=4, gcrefRegs=0000 {}, byrefRegs=C0C8 {rbx rsi r
vpshufb xmm1, xmm10, xmm1
vpand xmm2, xmm2, xmm11
vpcmpub k1, xmm2, xmm12, 6
- vpmovm2b xmm3, k1
- vpsubb xmm4, xmm2, xmm13
- vpshufb xmm4, xmm7, xmm4
+ vpsubb xmm3, xmm2, xmm13
+ vpshufb xmm3, xmm7, xmm3
vpshufb xmm2, xmm6, xmm2
- vpternlogd xmm3, xmm4, xmm2, -54
- vpand xmm1, xmm3, xmm1
+ vpblendmb xmm2 {k1}, xmm2, xmm3
+ vpand xmm1, xmm2, xmm1
vxorps xmm2, xmm2, xmm2
vpcmpeqb xmm1, xmm1, xmm2
vpcmpeqd xmm2, xmm2, xmm2
@@ -353,12 +350,11 @@ G_M48875_IG16: ; bbWeight=4, gcrefRegs=0000 {}, byrefRegs=C0C8 {rbx rsi r
vpshufb xmm2, xmm10, xmm2
vpand xmm0, xmm0, xmm11
vpcmpub k1, xmm0, xmm12, 6
- vpmovm2b xmm3, k1
- vpsubb xmm4, xmm0, xmm13
- vpshufb xmm4, xmm7, xmm4
+ vpsubb xmm3, xmm0, xmm13
+ vpshufb xmm3, xmm7, xmm3
vpshufb xmm0, xmm6, xmm0
- vpternlogd xmm3, xmm4, xmm0, -54
- vpand xmm0, xmm3, xmm2
+ vpblendmb xmm0 {k1}, xmm0, xmm3
+ vpand xmm0, xmm0, xmm2
vxorps xmm2, xmm2, xmm2
vpcmpeqb xmm0, xmm0, xmm2
vpcmpeqd xmm2, xmm2, xmm2
@@ -366,7 +362,7 @@ G_M48875_IG16: ; bbWeight=4, gcrefRegs=0000 {}, byrefRegs=C0C8 {rbx rsi r
vpand xmm0, xmm1, xmm0
vptest xmm0, xmm0
je SHORT G_M48875_IG19
- ;; size=200 bbWeight=4 PerfScore 168.00
+ ;; size=186 bbWeight=4 PerfScore 164.00
G_M48875_IG17: ; bbWeight=2, gcrefRegs=0000 {}, byrefRegs=C0C8 {rbx rsi rdi r14 r15}, byref
vpmovmskb ebp, xmm0
;; size=4 bbWeight=2 PerfScore 4.00
@@ -433,7 +429,7 @@ RWD128 dq 0F0F0F0F0F0F0F0Fh, 0F0F0F0F0F0F0F0Fh, 0F0F0F0F0F0F0F0Fh, 0F0F0F0F0F0F
RWD160 dq 1010101010101010h, 1010101010101010h, 1010101010101010h, 1010101010101010h
-; Total bytes of code 1118, prolog size 86, PerfScore 1254.58, instruction count 240, allocated bytes for code 1118 (MethodHash=36e94114) for method System.Buffers.ProbabilisticMap:IndexOfAnyVectorized(byref,byref,int,System.ReadOnlySpan`1[ushort]):int (FullOpts)
+; Total bytes of code 1090, prolog size 86, PerfScore 1246.58, instruction count 236, allocated bytes for code 1090 (MethodHash=36e94114) for method System.Buffers.ProbabilisticMap:IndexOfAnyVectorized(byref,byref,int,System.ReadOnlySpan`1[ushort]):int (FullOpts)
; ============================================================ Unwind Info:

coreclr_tests.run.windows.x64.checked.mch

-28 (-1.46%) : 174835.dasm - VectorTest+VectorRelopTest`1[int]:VectorRelOp(int,int):int (FullOpts)

@@ -331,10 +331,9 @@ G_M1266_IG17: ; bbWeight=4, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
;; size=11 bbWeight=4 PerfScore 6.00
G_M1266_IG18: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
vpcmpd k1, ymm8, ymm6, 2
- vpmovm2d ymm9, k1
- vpternlogd ymm9, ymm8, ymm7, -54
+ vpblendmd ymm9 {k1}, ymm7, ymm8
xor ebx, ebx
- ;; size=22 bbWeight=1 PerfScore 4.75
+ ;; size=15 bbWeight=1 PerfScore 3.58
G_M1266_IG19: ; bbWeight=4, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref, isz
mov ecx, ebx
vmovups ymmword ptr [rsp+0x20], ymm9
@@ -389,10 +388,9 @@ G_M1266_IG21: ; bbWeight=4, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
;; size=11 bbWeight=4 PerfScore 6.00
G_M1266_IG22: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
vpcmpd k1, ymm8, ymm7, 2
- vpmovm2d ymm9, k1
- vpternlogd ymm9, ymm8, ymm7, -54
+ vpblendmd ymm9 {k1}, ymm7, ymm8
xor ebx, ebx
- ;; size=22 bbWeight=1 PerfScore 4.75
+ ;; size=15 bbWeight=1 PerfScore 3.58
G_M1266_IG23: ; bbWeight=4, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref, isz
mov ecx, ebx
vmovups ymmword ptr [rsp+0x20], ymm9
@@ -447,10 +445,9 @@ G_M1266_IG25: ; bbWeight=4, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
;; size=11 bbWeight=4 PerfScore 6.00
G_M1266_IG26: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
vpcmpd k1, ymm8, ymm6, 5
- vpmovm2d ymm9, k1
- vpternlogd ymm9, ymm8, ymm7, -54
+ vpblendmd ymm9 {k1}, ymm7, ymm8
xor ebx, ebx
- ;; size=22 bbWeight=1 PerfScore 4.75
+ ;; size=15 bbWeight=1 PerfScore 3.58
G_M1266_IG27: ; bbWeight=4, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref, isz
mov ecx, ebx
vmovups ymmword ptr [rsp+0x20], ymm9
@@ -505,10 +502,9 @@ G_M1266_IG29: ; bbWeight=4, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
;; size=11 bbWeight=4 PerfScore 6.00
G_M1266_IG30: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
vpcmpd k1, ymm7, ymm6, 5
- vpmovm2d ymm9, k1
- vpternlogd ymm9, ymm8, ymm7, -54
+ vpblendmd ymm9 {k1}, ymm7, ymm8
xor ebx, ebx
- ;; size=22 bbWeight=1 PerfScore 4.75
+ ;; size=15 bbWeight=1 PerfScore 3.58
G_M1266_IG31: ; bbWeight=4, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref, isz
mov ecx, ebx
vmovups ymmword ptr [rsp+0x20], ymm9
@@ -687,7 +683,7 @@ G_M1266_IG42: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref,
ret
;; size=73 bbWeight=1 PerfScore 35.25
-; Total bytes of code 1922, prolog size 82, PerfScore 1043.58, instruction count 381, allocated bytes for code 1922 (MethodHash=881ffb0d) for method VectorTest+VectorRelopTest`1[int]:VectorRelOp(int,int):int (FullOpts)
+; Total bytes of code 1894, prolog size 82, PerfScore 1038.92, instruction count 377, allocated bytes for code 1894 (MethodHash=881ffb0d) for method VectorTest+VectorRelopTest`1[int]:VectorRelOp(int,int):int (FullOpts)
; ============================================================ Unwind Info:
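
The recurring change in these diffs replaces a `vpmovm2*` plus `vpternlog*` pair (immediate -54, i.e. 0xCA as an unsigned byte) with a single masked `vpblendm*`. The following is a small Python emulation, not JIT code, sketching why the two sequences select the same elements. It assumes the documented instruction semantics: `vpternlog` indexes the immediate with the (destination, source 1, source 2) bits, and `vpblendm` takes the second source where the mask bit is set and the first source otherwise. All function names here are hypothetical.

```python
def ternlog(a, b, c, imm8, width=32):
    """Bitwise ternary logic: each result bit is bit (a<<2 | b<<1 | c) of imm8."""
    imm8 &= 0xFF  # -54 becomes 0xCA
    out = 0
    for i in range(width):
        idx = (((a >> i) & 1) << 2) | (((b >> i) & 1) << 1) | ((c >> i) & 1)
        out |= ((imm8 >> idx) & 1) << i
    return out


def mask_to_vector(k, lanes, width=32):
    """Emulates vpmovm2d: all-ones in each lane whose mask bit is set, else zero."""
    ones = (1 << width) - 1
    return [ones if (k >> i) & 1 else 0 for i in range(lanes)]


def old_sequence(k, src1, src2, width=32):
    """vpmovm2d dst, k ; vpternlogd dst, src1, src2, -54"""
    dst = mask_to_vector(k, len(src1), width)
    return [ternlog(m, x, y, -54, width) for m, x, y in zip(dst, src1, src2)]


def new_sequence(k, if_clear, if_set):
    """vpblendmd dst {k}, if_clear, if_set"""
    return [s if (k >> i) & 1 else c
            for i, (c, s) in enumerate(zip(if_clear, if_set))]


# Mirrors the operand order in the diff above:
#   vpternlogd ymm9, ymm8, ymm7, -54   vs.   vpblendmd ymm9 {k1}, ymm7, ymm8
ymm8 = [0x11111111 * (i + 1) for i in range(8)]
ymm7 = [0xF0F0F0F0 ^ i for i in range(8)]
equivalent = all(old_sequence(k, ymm8, ymm7) == new_sequence(k, ymm7, ymm8)
                 for k in range(256))
```

With immediate 0xCA the ternary operation reduces to "destination ? source 1 : source 2" per bit, which is why the blend's two sources appear in swapped order relative to the `vpternlog` operands.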

-28 (-1.43%) : 174836.dasm - VectorTest+VectorRelopTest`1[long]:VectorRelOp(long,long):int (FullOpts)

@@ -331,10 +331,9 @@ G_M59915_IG17: ; bbWeight=4, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
;; size=11 bbWeight=4 PerfScore 6.00
G_M59915_IG18: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
vpcmpq k1, ymm8, ymm6, 2
- vpmovm2q ymm9, k1
- vpternlogq ymm9, ymm8, ymm7, -54
+ vpblendmq ymm9 {k1}, ymm7, ymm8
xor ebx, ebx
- ;; size=22 bbWeight=1 PerfScore 4.75
+ ;; size=15 bbWeight=1 PerfScore 3.58
G_M59915_IG19: ; bbWeight=4, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref, isz
mov ecx, ebx
vmovups ymmword ptr [rsp+0x20], ymm9
@@ -389,10 +388,9 @@ G_M59915_IG21: ; bbWeight=4, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
;; size=11 bbWeight=4 PerfScore 6.00
G_M59915_IG22: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
vpcmpq k1, ymm8, ymm7, 2
- vpmovm2q ymm9, k1
- vpternlogq ymm9, ymm8, ymm7, -54
+ vpblendmq ymm9 {k1}, ymm7, ymm8
xor ebx, ebx
- ;; size=22 bbWeight=1 PerfScore 4.75
+ ;; size=15 bbWeight=1 PerfScore 3.58
G_M59915_IG23: ; bbWeight=4, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref, isz
mov ecx, ebx
vmovups ymmword ptr [rsp+0x20], ymm9
@@ -447,10 +445,9 @@ G_M59915_IG25: ; bbWeight=4, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
;; size=11 bbWeight=4 PerfScore 6.00
G_M59915_IG26: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
vpcmpq k1, ymm8, ymm6, 5
- vpmovm2q ymm9, k1
- vpternlogq ymm9, ymm8, ymm7, -54
+ vpblendmq ymm9 {k1}, ymm7, ymm8
xor ebx, ebx
- ;; size=22 bbWeight=1 PerfScore 4.75
+ ;; size=15 bbWeight=1 PerfScore 3.58
G_M59915_IG27: ; bbWeight=4, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref, isz
mov ecx, ebx
vmovups ymmword ptr [rsp+0x20], ymm9
@@ -505,10 +502,9 @@ G_M59915_IG29: ; bbWeight=4, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
;; size=11 bbWeight=4 PerfScore 6.00
G_M59915_IG30: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
vpcmpq k1, ymm7, ymm6, 5
- vpmovm2q ymm9, k1
- vpternlogq ymm9, ymm8, ymm7, -54
+ vpblendmq ymm9 {k1}, ymm7, ymm8
xor ebx, ebx
- ;; size=22 bbWeight=1 PerfScore 4.75
+ ;; size=15 bbWeight=1 PerfScore 3.58
G_M59915_IG31: ; bbWeight=4, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref, isz
mov ecx, ebx
vmovups ymmword ptr [rsp+0x20], ymm9
@@ -687,7 +683,7 @@ G_M59915_IG42: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref,
ret
;; size=73 bbWeight=1 PerfScore 35.25
-; Total bytes of code 1956, prolog size 82, PerfScore 1049.58, instruction count 381, allocated bytes for code 1956 (MethodHash=e2e315f4) for method VectorTest+VectorRelopTest`1[long]:VectorRelOp(long,long):int (FullOpts)
+; Total bytes of code 1928, prolog size 82, PerfScore 1044.92, instruction count 377, allocated bytes for code 1928 (MethodHash=e2e315f4) for method VectorTest+VectorRelopTest`1[long]:VectorRelOp(long,long):int (FullOpts)
; ============================================================ Unwind Info:

-28 (-1.42%) : 174839.dasm - VectorTest+VectorRelopTest`1[short]:VectorRelOp(short,short):int (FullOpts)

@@ -334,10 +334,9 @@ G_M8563_IG17: ; bbWeight=4, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
;; size=11 bbWeight=4 PerfScore 6.00
G_M8563_IG18: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
vpcmpw k1, ymm6, ymm7, 2
- vpmovm2w ymm9, k1
- vpternlogd ymm9, ymm6, ymm8, -54
+ vpblendmw ymm9 {k1}, ymm8, ymm6
xor ebx, ebx
- ;; size=22 bbWeight=1 PerfScore 5.75
+ ;; size=15 bbWeight=1 PerfScore 5.25
G_M8563_IG19: ; bbWeight=4, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref, isz
mov ecx, ebx
vmovups ymmword ptr [rsp+0x20], ymm9
@@ -392,10 +391,9 @@ G_M8563_IG21: ; bbWeight=4, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
;; size=11 bbWeight=4 PerfScore 6.00
G_M8563_IG22: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
vpcmpw k1, ymm6, ymm8, 2
- vpmovm2w ymm9, k1
- vpternlogd ymm9, ymm6, ymm8, -54
+ vpblendmw ymm9 {k1}, ymm8, ymm6
xor ebx, ebx
- ;; size=22 bbWeight=1 PerfScore 5.75
+ ;; size=15 bbWeight=1 PerfScore 5.25
G_M8563_IG23: ; bbWeight=4, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref, isz
mov ecx, ebx
vmovups ymmword ptr [rsp+0x20], ymm9
@@ -450,10 +448,9 @@ G_M8563_IG25: ; bbWeight=4, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
;; size=11 bbWeight=4 PerfScore 6.00
G_M8563_IG26: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
vpcmpw k1, ymm6, ymm7, 5
- vpmovm2w ymm9, k1
- vpternlogd ymm9, ymm6, ymm8, -54
+ vpblendmw ymm9 {k1}, ymm8, ymm6
xor ebx, ebx
- ;; size=22 bbWeight=1 PerfScore 5.75
+ ;; size=15 bbWeight=1 PerfScore 5.25
G_M8563_IG27: ; bbWeight=4, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref, isz
mov ecx, ebx
vmovups ymmword ptr [rsp+0x20], ymm9
@@ -508,10 +505,9 @@ G_M8563_IG29: ; bbWeight=4, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
;; size=11 bbWeight=4 PerfScore 6.00
G_M8563_IG30: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
vpcmpw k1, ymm8, ymm7, 5
- vpmovm2w ymm9, k1
- vpternlogd ymm9, ymm6, ymm8, -54
+ vpblendmw ymm9 {k1}, ymm8, ymm6
xor ebx, ebx
- ;; size=22 bbWeight=1 PerfScore 5.75
+ ;; size=15 bbWeight=1 PerfScore 5.25
G_M8563_IG31: ; bbWeight=4, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref, isz
mov ecx, ebx
vmovups ymmword ptr [rsp+0x20], ymm9
@@ -690,7 +686,7 @@ G_M8563_IG42: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref,
ret
;; size=73 bbWeight=1 PerfScore 35.25
-; Total bytes of code 1968, prolog size 82, PerfScore 1206.33, instruction count 383, allocated bytes for code 1968 (MethodHash=33a1de8c) for method VectorTest+VectorRelopTest`1[short]:VectorRelOp(short,short):int (FullOpts)
+; Total bytes of code 1940, prolog size 82, PerfScore 1204.33, instruction count 379, allocated bytes for code 1940 (MethodHash=33a1de8c) for method VectorTest+VectorRelopTest`1[short]:VectorRelOp(short,short):int (FullOpts)
; ============================================================ Unwind Info:

-16 (-0.39%) : 429089.dasm - VectorTest+VectorRelopTest`1[long]:VectorRelOp(long,long):int (Instrumented Tier0)

@@ -437,14 +437,13 @@ G_M59915_IG22: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
jl G_M59915_IG18
vmovups ymm0, ymmword ptr [rbp-0x70]
vpcmpq k1, ymm0, ymmword ptr [rbp-0xB0], 2
- vpmovm2q ymm0, k1
- vmovups ymm1, ymmword ptr [rbp-0x70]
- vpternlogq ymm0, ymm1, ymmword ptr [rbp-0x90], -54
+ vmovups ymm0, ymmword ptr [rbp-0x90]
+ vpblendmq ymm0 {k1}, ymm0, ymmword ptr [rbp-0x70]
vmovups ymmword ptr [rbp-0xD0], ymm0
xor ecx, ecx
mov dword ptr [rbp-0x100], ecx
jmp G_M59915_IG25
- ;; size=72 bbWeight=1 PerfScore 23.25
+ ;; size=68 bbWeight=1 PerfScore 22.25
G_M59915_IG23: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
cmp dword ptr [rbp-0x100], 4
jae G_M59915_IG54
@@ -523,14 +522,13 @@ G_M59915_IG27: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
jl G_M59915_IG23
vmovups ymm0, ymmword ptr [rbp-0x70]
vpcmpq k1, ymm0, ymmword ptr [rbp-0x90], 2
- vpmovm2q ymm0, k1
- vmovups ymm1, ymmword ptr [rbp-0x70]
- vpternlogq ymm0, ymm1, ymmword ptr [rbp-0x90], -54
+ vmovups ymm0, ymmword ptr [rbp-0x90]
+ vpblendmq ymm0 {k1}, ymm0, ymmword ptr [rbp-0x70]
vmovups ymmword ptr [rbp-0xD0], ymm0
xor ecx, ecx
mov dword ptr [rbp-0x108], ecx
jmp G_M59915_IG30
- ;; size=72 bbWeight=1 PerfScore 23.25
+ ;; size=68 bbWeight=1 PerfScore 22.25
G_M59915_IG28: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
cmp dword ptr [rbp-0x108], 4
jae G_M59915_IG54
@@ -609,14 +607,13 @@ G_M59915_IG32: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
jl G_M59915_IG28
vmovups ymm0, ymmword ptr [rbp-0x70]
vpcmpq k1, ymm0, ymmword ptr [rbp-0xB0], 5
- vpmovm2q ymm0, k1
- vmovups ymm1, ymmword ptr [rbp-0x70]
- vpternlogq ymm0, ymm1, ymmword ptr [rbp-0x90], -54
+ vmovups ymm0, ymmword ptr [rbp-0x90]
+ vpblendmq ymm0 {k1}, ymm0, ymmword ptr [rbp-0x70]
vmovups ymmword ptr [rbp-0xD0], ymm0
xor ecx, ecx
mov dword ptr [rbp-0x110], ecx
jmp G_M59915_IG35
- ;; size=72 bbWeight=1 PerfScore 23.25
+ ;; size=68 bbWeight=1 PerfScore 22.25
G_M59915_IG33: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
cmp dword ptr [rbp-0x110], 4
jae G_M59915_IG54
@@ -695,14 +692,13 @@ G_M59915_IG37: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
jl G_M59915_IG33
vmovups ymm0, ymmword ptr [rbp-0x90]
vpcmpq k1, ymm0, ymmword ptr [rbp-0xB0], 5
- vpmovm2q ymm0, k1
- vmovups ymm1, ymmword ptr [rbp-0x70]
- vpternlogq ymm0, ymm1, ymmword ptr [rbp-0x90], -54
+ vmovups ymm0, ymmword ptr [rbp-0x90]
+ vpblendmq ymm0 {k1}, ymm0, ymmword ptr [rbp-0x70]
vmovups ymmword ptr [rbp-0xD0], ymm0
xor ecx, ecx
mov dword ptr [rbp-0x118], ecx
jmp G_M59915_IG40
- ;; size=75 bbWeight=1 PerfScore 23.25
+ ;; size=71 bbWeight=1 PerfScore 22.25
G_M59915_IG38: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
cmp dword ptr [rbp-0x118], 4
jae G_M59915_IG54
@@ -964,7 +960,7 @@ G_M59915_IG54: ; bbWeight=0, gcVars=0000000000000000 {}, gcrefRegs=0000 {
int3
;; size=6 bbWeight=0 PerfScore 0.00
-; Total bytes of code 4100, prolog size 77, PerfScore 831.26, instruction count 657, allocated bytes for code 4100 (MethodHash=e2e315f4) for method VectorTest+VectorRelopTest`1[long]:VectorRelOp(long,long):int (Instrumented Tier0)
+; Total bytes of code 4084, prolog size 77, PerfScore 827.26, instruction count 653, allocated bytes for code 4088 (MethodHash=e2e315f4) for method VectorTest+VectorRelopTest`1[long]:VectorRelOp(long,long):int (Instrumented Tier0)
; ============================================================ Unwind Info:

-16 (-0.39%) : 429093.dasm - VectorTest+VectorRelopTest`1[byte]:VectorRelOp(byte,byte):int (Instrumented Tier0)

@@ -443,14 +443,13 @@ G_M44299_IG22: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
jl G_M44299_IG18
vmovups ymm0, ymmword ptr [rbp-0x70]
vpcmpb k1, ymm0, ymmword ptr [rbp-0xB0], 2
- vpmovm2b ymm0, k1
- vmovups ymm1, ymmword ptr [rbp-0x70]
- vpternlogd ymm0, ymm1, ymmword ptr [rbp-0x90], -54
+ vmovups ymm0, ymmword ptr [rbp-0x90]
+ vpblendmb ymm0 {k1}, ymm0, ymmword ptr [rbp-0x70]
vmovups ymmword ptr [rbp-0xD0], ymm0
xor ecx, ecx
mov dword ptr [rbp-0x100], ecx
jmp G_M44299_IG25
- ;; size=72 bbWeight=1 PerfScore 24.25
+ ;; size=68 bbWeight=1 PerfScore 24.25
G_M44299_IG23: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
cmp dword ptr [rbp-0x100], 32
jae G_M44299_IG54
@@ -529,14 +528,13 @@ G_M44299_IG27: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
jl G_M44299_IG23
vmovups ymm0, ymmword ptr [rbp-0x70]
vpcmpb k1, ymm0, ymmword ptr [rbp-0x90], 2
- vpmovm2b ymm0, k1
- vmovups ymm1, ymmword ptr [rbp-0x70]
- vpternlogd ymm0, ymm1, ymmword ptr [rbp-0x90], -54
+ vmovups ymm0, ymmword ptr [rbp-0x90]
+ vpblendmb ymm0 {k1}, ymm0, ymmword ptr [rbp-0x70]
vmovups ymmword ptr [rbp-0xD0], ymm0
xor ecx, ecx
mov dword ptr [rbp-0x108], ecx
jmp G_M44299_IG30
- ;; size=72 bbWeight=1 PerfScore 24.25
+ ;; size=68 bbWeight=1 PerfScore 24.25
G_M44299_IG28: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
cmp dword ptr [rbp-0x108], 32
jae G_M44299_IG54
@@ -615,14 +613,13 @@ G_M44299_IG32: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
jl G_M44299_IG28
vmovups ymm0, ymmword ptr [rbp-0x70]
vpcmpb k1, ymm0, ymmword ptr [rbp-0xB0], 5
- vpmovm2b ymm0, k1
- vmovups ymm1, ymmword ptr [rbp-0x70]
- vpternlogd ymm0, ymm1, ymmword ptr [rbp-0x90], -54
+ vmovups ymm0, ymmword ptr [rbp-0x90]
+ vpblendmb ymm0 {k1}, ymm0, ymmword ptr [rbp-0x70]
vmovups ymmword ptr [rbp-0xD0], ymm0
xor ecx, ecx
mov dword ptr [rbp-0x110], ecx
jmp G_M44299_IG35
- ;; size=72 bbWeight=1 PerfScore 24.25
+ ;; size=68 bbWeight=1 PerfScore 24.25
G_M44299_IG33: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
cmp dword ptr [rbp-0x110], 32
jae G_M44299_IG54
@@ -701,14 +698,13 @@ G_M44299_IG37: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
jl G_M44299_IG33
vmovups ymm0, ymmword ptr [rbp-0x90]
vpcmpb k1, ymm0, ymmword ptr [rbp-0xB0], 5
- vpmovm2b ymm0, k1
- vmovups ymm1, ymmword ptr [rbp-0x70]
- vpternlogd ymm0, ymm1, ymmword ptr [rbp-0x90], -54
+ vmovups ymm0, ymmword ptr [rbp-0x90]
+ vpblendmb ymm0 {k1}, ymm0, ymmword ptr [rbp-0x70]
vmovups ymmword ptr [rbp-0xD0], ymm0
xor ecx, ecx
mov dword ptr [rbp-0x118], ecx
jmp G_M44299_IG40
- ;; size=75 bbWeight=1 PerfScore 24.25
+ ;; size=71 bbWeight=1 PerfScore 24.25
G_M44299_IG38: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
cmp dword ptr [rbp-0x118], 32
jae G_M44299_IG54
@@ -970,7 +966,7 @@ G_M44299_IG54: ; bbWeight=0, gcVars=0000000000000000 {}, gcrefRegs=0000 {
int3
;; size=6 bbWeight=0 PerfScore 0.00
-; Total bytes of code 4123, prolog size 77, PerfScore 865.01, instruction count 663, allocated bytes for code 4123 (MethodHash=09db52f4) for method VectorTest+VectorRelopTest`1[byte]:VectorRelOp(byte,byte):int (Instrumented Tier0)
+; Total bytes of code 4107, prolog size 77, PerfScore 865.01, instruction count 659, allocated bytes for code 4111 (MethodHash=09db52f4) for method VectorTest+VectorRelopTest`1[byte]:VectorRelOp(byte,byte):int (Instrumented Tier0)
; ============================================================ Unwind Info:

-16 (-0.39%) : 429092.dasm - VectorTest+VectorRelopTest`1[short]:VectorRelOp(short,short):int (Instrumented Tier0)

@@ -443,14 +443,13 @@ G_M8563_IG22: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
jl G_M8563_IG18
vmovups ymm0, ymmword ptr [rbp-0x70]
vpcmpw k1, ymm0, ymmword ptr [rbp-0xB0], 2
- vpmovm2w ymm0, k1
- vmovups ymm1, ymmword ptr [rbp-0x70]
- vpternlogd ymm0, ymm1, ymmword ptr [rbp-0x90], -54
+ vmovups ymm0, ymmword ptr [rbp-0x90]
+ vpblendmw ymm0 {k1}, ymm0, ymmword ptr [rbp-0x70]
vmovups ymmword ptr [rbp-0xD0], ymm0
xor ecx, ecx
mov dword ptr [rbp-0x100], ecx
jmp G_M8563_IG25
- ;; size=72 bbWeight=1 PerfScore 24.25
+ ;; size=68 bbWeight=1 PerfScore 24.25
G_M8563_IG23: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
cmp dword ptr [rbp-0x100], 16
jae G_M8563_IG54
@@ -529,14 +528,13 @@ G_M8563_IG27: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
jl G_M8563_IG23
vmovups ymm0, ymmword ptr [rbp-0x70]
vpcmpw k1, ymm0, ymmword ptr [rbp-0x90], 2
- vpmovm2w ymm0, k1
- vmovups ymm1, ymmword ptr [rbp-0x70]
- vpternlogd ymm0, ymm1, ymmword ptr [rbp-0x90], -54
+ vmovups ymm0, ymmword ptr [rbp-0x90]
+ vpblendmw ymm0 {k1}, ymm0, ymmword ptr [rbp-0x70]
vmovups ymmword ptr [rbp-0xD0], ymm0
xor ecx, ecx
mov dword ptr [rbp-0x108], ecx
jmp G_M8563_IG30
- ;; size=72 bbWeight=1 PerfScore 24.25
+ ;; size=68 bbWeight=1 PerfScore 24.25
G_M8563_IG28: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
cmp dword ptr [rbp-0x108], 16
jae G_M8563_IG54
@@ -615,14 +613,13 @@ G_M8563_IG32: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
jl G_M8563_IG28
vmovups ymm0, ymmword ptr [rbp-0x70]
vpcmpw k1, ymm0, ymmword ptr [rbp-0xB0], 5
- vpmovm2w ymm0, k1
- vmovups ymm1, ymmword ptr [rbp-0x70]
- vpternlogd ymm0, ymm1, ymmword ptr [rbp-0x90], -54
+ vmovups ymm0, ymmword ptr [rbp-0x90]
+ vpblendmw ymm0 {k1}, ymm0, ymmword ptr [rbp-0x70]
vmovups ymmword ptr [rbp-0xD0], ymm0
xor ecx, ecx
mov dword ptr [rbp-0x110], ecx
jmp G_M8563_IG35
- ;; size=72 bbWeight=1 PerfScore 24.25
+ ;; size=68 bbWeight=1 PerfScore 24.25
G_M8563_IG33: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
cmp dword ptr [rbp-0x110], 16
jae G_M8563_IG54
@@ -701,14 +698,13 @@ G_M8563_IG37: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
jl G_M8563_IG33
vmovups ymm0, ymmword ptr [rbp-0x90]
vpcmpw k1, ymm0, ymmword ptr [rbp-0xB0], 5
- vpmovm2w ymm0, k1
- vmovups ymm1, ymmword ptr [rbp-0x70]
- vpternlogd ymm0, ymm1, ymmword ptr [rbp-0x90], -54
+ vmovups ymm0, ymmword ptr [rbp-0x90]
+ vpblendmw ymm0 {k1}, ymm0, ymmword ptr [rbp-0x70]
vmovups ymmword ptr [rbp-0xD0], ymm0
xor ecx, ecx
mov dword ptr [rbp-0x118], ecx
jmp G_M8563_IG40
- ;; size=75 bbWeight=1 PerfScore 24.25
+ ;; size=71 bbWeight=1 PerfScore 24.25
G_M8563_IG38: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
cmp dword ptr [rbp-0x118], 16
jae G_M8563_IG54
@@ -970,7 +966,7 @@ G_M8563_IG54: ; bbWeight=0, gcVars=0000000000000000 {}, gcrefRegs=0000 {}
int3
;; size=6 bbWeight=0 PerfScore 0.00
-; Total bytes of code 4123, prolog size 77, PerfScore 865.01, instruction count 663, allocated bytes for code 4123 (MethodHash=33a1de8c) for method VectorTest+VectorRelopTest`1[short]:VectorRelOp(short,short):int (Instrumented Tier0)
+; Total bytes of code 4107, prolog size 77, PerfScore 865.01, instruction count 659, allocated bytes for code 4111 (MethodHash=33a1de8c) for method VectorTest+VectorRelopTest`1[short]:VectorRelOp(short,short):int (Instrumented Tier0)
; ============================================================ Unwind Info:

libraries.pmi.windows.x64.checked.mch

-21 (-20.39%) : 294005.dasm - System.Numerics.Tensors.TensorPrimitives+MaxMagnitudePropagateNaNOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector512`1[ubyte],System.Runtime.Intrinsics.Vector512`1[ubyte]):System.Runtime.Intrinsics.Vector512`1[ubyte] (FullOpts)

@@ -17,7 +17,7 @@
; V06 tmp1 [V06,T05] ( 3, 3 ) simd64 -> mm1 "Inline stloc first use temp" <System.Runtime.Intrinsics.Vector512`1[ubyte]>
; V07 tmp2 [V07,T06] ( 3, 3 ) simd64 -> mm3 "Inline stloc first use temp" <System.Runtime.Intrinsics.Vector512`1[ubyte]>
;* V08 tmp3 [V08 ] ( 0, 0 ) simd64 -> zero-ref "spilled call-like call argument"
-; V09 tmp4 [V09,T07] ( 2, 2 ) simd64 -> mm4 "Inline stloc first use temp" <System.Runtime.Intrinsics.Vector512`1[ubyte]>
+; V09 tmp4 [V09,T07] ( 2, 2 ) simd64 -> mm0 "Inline stloc first use temp" <System.Runtime.Intrinsics.Vector512`1[ubyte]>
;* V10 tmp5 [V10 ] ( 0, 0 ) simd64 -> zero-ref "Inline return value spill temp" <System.Runtime.Intrinsics.Vector512`1[ubyte]>
; V11 cse0 [V11,T03] ( 5, 5 ) simd64 -> mm0 "CSE - aggressive"
; V12 cse1 [V12,T04] ( 4, 4 ) simd64 -> mm2 "CSE - aggressive"
@@ -34,25 +34,22 @@ G_M27576_IG02: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0106 {rcx rdx r
vmovups zmm2, zmmword ptr [r8]
vmovaps zmm3, zmm2
vpcmpeqb k1, zmm1, zmm3
- vpmovm2b zmm4, k1
- vxorps ymm5, ymm5, ymm5
- vpcmpub k1, zmm0, zmm5, 1
- vpmovm2b zmm5, k1
- vpternlogd zmm5, zmm2, zmm0, -54
- vpcmpub k1, zmm1, zmm3, 6
- vpmovm2b zmm1, k1
- vpternlogd zmm1, zmm0, zmm2, -54
- vpternlogd zmm4, zmm5, zmm1, -54
- vmovups zmmword ptr [rcx], zmm4
+ vxorps ymm4, ymm4, ymm4
+ vpcmpub k2, zmm0, zmm4, 1
+ vpblendmb zmm4 {k2}, zmm0, zmm2
+ vpcmpub k2, zmm1, zmm3, 6
+ vpblendmb zmm0 {k2}, zmm2, zmm0
+ vpblendmb zmm0 {k1}, zmm0, zmm4
+ vmovups zmmword ptr [rcx], zmm0
mov rax, rcx
; byrRegs +[rax]
- ;; size=96 bbWeight=1 PerfScore 25.08
+ ;; size=75 bbWeight=1 PerfScore 23.58
G_M27576_IG03: ; bbWeight=1, epilog, nogc, extend
vzeroupper
ret
;; size=4 bbWeight=1 PerfScore 2.00
-; Total bytes of code 103, prolog size 3, PerfScore 28.08, instruction count 19, allocated bytes for code 103 (MethodHash=a5449447) for method System.Numerics.Tensors.TensorPrimitives+MaxMagnitudePropagateNaNOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector512`1[ubyte],System.Runtime.Intrinsics.Vector512`1[ubyte]):System.Runtime.Intrinsics.Vector512`1[ubyte] (FullOpts)
+; Total bytes of code 82, prolog size 3, PerfScore 26.58, instruction count 16, allocated bytes for code 82 (MethodHash=a5449447) for method System.Numerics.Tensors.TensorPrimitives+MaxMagnitudePropagateNaNOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector512`1[ubyte],System.Runtime.Intrinsics.Vector512`1[ubyte]):System.Runtime.Intrinsics.Vector512`1[ubyte] (FullOpts)
; ============================================================ Unwind Info:
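
Each example heading, such as `-21 (-20.39%)`, can be reproduced from the two "Total bytes of code" lines of its diff. A minimal sketch, assuming the percentage is the byte delta relative to the base size; the helper name `size_change` is hypothetical and not part of any tool:

```python
def size_change(base_bytes: int, diff_bytes: int) -> str:
    """Format a code-size delta the way the example headings show it."""
    delta = diff_bytes - base_bytes
    # Signed integer delta followed by the signed relative change in percent.
    return f"{delta:+d} ({delta / base_bytes:+.2%})"


# Base and diff sizes taken from "Total bytes of code" lines in this report.
pmi_example = size_change(103, 82)
benchmark_example = size_change(1118, 1090)
```

The same arithmetic applies to the Instrumented Tier0 examples, where 4100 bytes shrinking to 4084 bytes gives the reported -16 (-0.39%).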

-21 (-20.39%) : 294062.dasm - System.Numerics.Tensors.TensorPrimitives+MinMagnitudePropagateNaNOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector512`1[ubyte],System.Runtime.Intrinsics.Vector512`1[ubyte]):System.Runtime.Intrinsics.Vector512`1[ubyte] (FullOpts)

@@ -17,7 +17,7 @@
; V06 tmp1 [V06,T05] ( 3, 3 ) simd64 -> mm1 "Inline stloc first use temp" <System.Runtime.Intrinsics.Vector512`1[ubyte]>
; V07 tmp2 [V07,T06] ( 3, 3 ) simd64 -> mm3 "Inline stloc first use temp" <System.Runtime.Intrinsics.Vector512`1[ubyte]>
;* V08 tmp3 [V08 ] ( 0, 0 ) simd64 -> zero-ref "spilled call-like call argument"
-; V09 tmp4 [V09,T07] ( 2, 2 ) simd64 -> mm4 "Inline stloc first use temp" <System.Runtime.Intrinsics.Vector512`1[ubyte]>
+; V09 tmp4 [V09,T07] ( 2, 2 ) simd64 -> mm0 "Inline stloc first use temp" <System.Runtime.Intrinsics.Vector512`1[ubyte]>
;* V10 tmp5 [V10 ] ( 0, 0 ) simd64 -> zero-ref "Inline return value spill temp" <System.Runtime.Intrinsics.Vector512`1[ubyte]>
; V11 cse0 [V11,T03] ( 5, 5 ) simd64 -> mm2 "CSE - aggressive"
; V12 cse1 [V12,T04] ( 4, 4 ) simd64 -> mm0 "CSE - aggressive"
@@ -34,25 +34,22 @@ G_M10214_IG02: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0106 {rcx rdx r
vmovups zmm2, zmmword ptr [r8]
vmovaps zmm3, zmm2
vpcmpeqb k1, zmm3, zmm1
- vpmovm2b zmm4, k1
- vxorps ymm5, ymm5, ymm5
- vpcmpub k1, zmm2, zmm5, 1
- vpmovm2b zmm5, k1
- vpternlogd zmm5, zmm2, zmm0, -54
- vpcmpub k1, zmm3, zmm1, 1
- vpmovm2b zmm1, k1
- vpternlogd zmm1, zmm2, zmm0, -54
- vpternlogd zmm4, zmm5, zmm1, -54
- vmovups zmmword ptr [rcx], zmm4
+ vxorps ymm4, ymm4, ymm4
+ vpcmpub k2, zmm2, zmm4, 1
+ vpblendmb zmm4 {k2}, zmm0, zmm2
+ vpcmpub k2, zmm3, zmm1, 1
+ vpblendmb zmm0 {k2}, zmm0, zmm2
+ vpblendmb zmm0 {k1}, zmm0, zmm4
+ vmovups zmmword ptr [rcx], zmm0
mov rax, rcx
; byrRegs +[rax]
- ;; size=96 bbWeight=1 PerfScore 25.08
+ ;; size=75 bbWeight=1 PerfScore 23.58
G_M10214_IG03: ; bbWeight=1, epilog, nogc, extend
vzeroupper
ret
;; size=4 bbWeight=1 PerfScore 2.00
-; Total bytes of code 103, prolog size 3, PerfScore 28.08, instruction count 19, allocated bytes for code 103 (MethodHash=6846d819) for method System.Numerics.Tensors.TensorPrimitives+MinMagnitudePropagateNaNOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector512`1[ubyte],System.Runtime.Intrinsics.Vector512`1[ubyte]):System.Runtime.Intrinsics.Vector512`1[ubyte] (FullOpts)
+; Total bytes of code 82, prolog size 3, PerfScore 26.58, instruction count 16, allocated bytes for code 82 (MethodHash=6846d819) for method System.Numerics.Tensors.TensorPrimitives+MinMagnitudePropagateNaNOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector512`1[ubyte],System.Runtime.Intrinsics.Vector512`1[ubyte]):System.Runtime.Intrinsics.Vector512`1[ubyte] (FullOpts)
; ============================================================ Unwind Info:
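
The recurring change in these diffs is that a `vpmovm2*` + `vpternlog* ..., -54` pair becomes a single masked `vpblendm*`. The following is a minimal scalar sketch of why the two forms should agree, assuming the usual semantics of these instructions (imm8 `-54` is `0xCA`, i.e. "first operand selects between second and third"). The helper names `old_sequence` and `new_sequence` are hypothetical and are not part of the JIT; this models lanes as Python integers and is not how the JIT performs the transformation.

```python
def vpmovm2b(k, lanes):
    """Expand mask bits into byte lanes: 0xFF where the bit is set, else 0x00."""
    return [0xFF if (k >> i) & 1 else 0x00 for i in range(lanes)]


def vpternlog(a, b, c, imm8, width=8):
    """Bitwise ternary logic: bit (a<<2 | b<<1 | c) of imm8 gives each result bit."""
    imm8 &= 0xFF  # -54 becomes 0xCA
    out = []
    for x, y, z in zip(a, b, c):
        r = 0
        for bit in range(width):
            idx = (((x >> bit) & 1) << 2) | (((y >> bit) & 1) << 1) | ((z >> bit) & 1)
            r |= ((imm8 >> idx) & 1) << bit
        out.append(r)
    return out


def vpblendm(k, src1, src2):
    """Masked blend: take src2 where the mask bit is set, otherwise src1."""
    return [s2 if (k >> i) & 1 else s1 for i, (s1, s2) in enumerate(zip(src1, src2))]


def old_sequence(k, if_set, if_clear):
    # Hypothetical model of: vpmovm2b tmp, k ; vpternlogd tmp, if_set, if_clear, -54
    mask = vpmovm2b(k, len(if_set))
    return vpternlog(mask, if_set, if_clear, -54)


def new_sequence(k, if_set, if_clear):
    # Hypothetical model of: vpblendmb dst {k}, if_clear, if_set
    return vpblendm(k, if_clear, if_set)
```

Because the ternary-logic operation is purely bitwise, the element width of `vpternlogd` versus `vpternlogq` does not matter for this model; only the mask expansion and the blend are per element.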

-21 (-20.39%) : 293983.dasm - System.Numerics.Tensors.TensorPrimitives+MaxMagnitudeOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector512`1[ubyte],System.Runtime.Intrinsics.Vector512`1[ubyte]):System.Runtime.Intrinsics.Vector512`1[ubyte] (FullOpts)

@@ -13,7 +13,7 @@ ; V02 arg1 [V02,T01] ( 3, 6 ) byref -> r8 single-def ; V03 loc0 [V03,T05] ( 3, 3 ) simd64 -> mm1 <System.Runtime.Intrinsics.Vector512`1[ubyte]> ; V04 loc1 [V04,T06] ( 3, 3 ) simd64 -> mm3 <System.Runtime.Intrinsics.Vector512`1[ubyte]>
-; V05 loc2 [V05,T07] ( 2, 2 ) simd64 -> mm4 <System.Runtime.Intrinsics.Vector512`1[ubyte]>
+; V05 loc2 [V05,T07] ( 2, 2 ) simd64 -> mm0 <System.Runtime.Intrinsics.Vector512`1[ubyte]>
;* V06 loc3 [V06 ] ( 0, 0 ) simd64 -> zero-ref <System.Runtime.Intrinsics.Vector512`1[ubyte]> ;* V07 loc4 [V07 ] ( 0, 0 ) simd64 -> zero-ref <System.Runtime.Intrinsics.Vector512`1[ubyte]> ;# V08 OutArgs [V08 ] ( 1, 1 ) struct ( 0) [rsp+0x00] do-not-enreg[XS] addr-exposed "OutgoingArgSpace" @@ -34,25 +34,22 @@ G_M22834_IG02: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0106 {rcx rdx r vmovups zmm2, zmmword ptr [r8] vmovaps zmm3, zmm2 vpcmpeqb k1, zmm1, zmm3
- vpmovm2b zmm4, k1
- vxorps ymm5, ymm5, ymm5
- vpcmpub k1, zmm0, zmm5, 1
- vpmovm2b zmm5, k1
- vpternlogd zmm5, zmm2, zmm0, -54
- vpcmpub k1, zmm1, zmm3, 6
- vpmovm2b zmm1, k1
- vpternlogd zmm1, zmm0, zmm2, -54
- vpternlogd zmm4, zmm5, zmm1, -54
- vmovups zmmword ptr [rcx], zmm4
+ vxorps ymm4, ymm4, ymm4
+ vpcmpub k2, zmm0, zmm4, 1
+ vpblendmb zmm4 {k2}, zmm0, zmm2
+ vpcmpub k2, zmm1, zmm3, 6
+ vpblendmb zmm0 {k2}, zmm2, zmm0
+ vpblendmb zmm0 {k1}, zmm0, zmm4
+ vmovups zmmword ptr [rcx], zmm0
mov rax, rcx ; byrRegs +[rax]
- ;; size=96 bbWeight=1 PerfScore 25.08
+ ;; size=75 bbWeight=1 PerfScore 23.58
G_M22834_IG03: ; bbWeight=1, epilog, nogc, extend vzeroupper ret ;; size=4 bbWeight=1 PerfScore 2.00
-; Total bytes of code 103, prolog size 3, PerfScore 28.08, instruction count 19, allocated bytes for code 103 (MethodHash=885fa6cd) for method System.Numerics.Tensors.TensorPrimitives+MaxMagnitudeOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector512`1[ubyte],System.Runtime.Intrinsics.Vector512`1[ubyte]):System.Runtime.Intrinsics.Vector512`1[ubyte] (FullOpts)
+; Total bytes of code 82, prolog size 3, PerfScore 26.58, instruction count 16, allocated bytes for code 82 (MethodHash=885fa6cd) for method System.Numerics.Tensors.TensorPrimitives+MaxMagnitudeOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector512`1[ubyte],System.Runtime.Intrinsics.Vector512`1[ubyte]):System.Runtime.Intrinsics.Vector512`1[ubyte] (FullOpts)
; ============================================================ Unwind Info:

-7 (-5.65%) : 27696.dasm - System.Buffers.ProbabilisticMap:IsCharBitSetAvx2(System.Runtime.Intrinsics.Vector256`1[ubyte],System.Runtime.Intrinsics.Vector256`1[ubyte],System.Runtime.Intrinsics.Vector256`1[ubyte]):System.Runtime.Intrinsics.Vector256`1[ubyte] (FullOpts)

@@ -35,14 +35,13 @@ G_M53822_IG02: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0306 {rcx rdx r vpshufb ymm1, ymm2, ymm1 vpand ymm0, ymm0, ymmword ptr [reloc @RWD64] vpcmpub k1, ymm0, ymmword ptr [reloc @RWD96], 6
- vpmovm2b ymm2, k1
- vmovups ymm3, ymmword ptr [r8]
- vpsubb ymm4, ymm0, ymmword ptr [reloc @RWD128]
- vpshufb ymm3, ymm3, ymm4
- vmovups ymm4, ymmword ptr [rdx]
- vpshufb ymm0, ymm4, ymm0
- vpternlogd ymm2, ymm3, ymm0, -54
- vpand ymm0, ymm2, ymm1
+ vmovups ymm2, ymmword ptr [r8]
+ vpsubb ymm3, ymm0, ymmword ptr [reloc @RWD128]
+ vpshufb ymm2, ymm2, ymm3
+ vmovups ymm3, ymmword ptr [rdx]
+ vpshufb ymm0, ymm3, ymm0
+ vpblendmb ymm0 {k1}, ymm0, ymm2
+ vpand ymm0, ymm0, ymm1
vxorps ymm1, ymm1, ymm1 vpcmpeqb ymm0, ymm0, ymm1 vpcmpeqd ymm1, ymm1, ymm1 @@ -50,7 +49,7 @@ G_M53822_IG02: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0306 {rcx rdx r vmovups ymmword ptr [rcx], ymm0 mov rax, rcx ; byrRegs +[rax]
- ;; size=117 bbWeight=1 PerfScore 42.75
+ ;; size=110 bbWeight=1 PerfScore 42.25
G_M53822_IG03: ; bbWeight=1, epilog, nogc, extend vzeroupper ret @@ -62,7 +61,7 @@ RWD96 dq 0F0F0F0F0F0F0F0Fh, 0F0F0F0F0F0F0F0Fh, 0F0F0F0F0F0F0F0Fh, 0F0F0F0F0F0F RWD128 dq 1010101010101010h, 1010101010101010h, 1010101010101010h, 1010101010101010h
-; Total bytes of code 124, prolog size 3, PerfScore 45.75, instruction count 24, allocated bytes for code 124 (MethodHash=47dc2dc1) for method System.Buffers.ProbabilisticMap:IsCharBitSetAvx2(System.Runtime.Intrinsics.Vector256`1[ubyte],System.Runtime.Intrinsics.Vector256`1[ubyte],System.Runtime.Intrinsics.Vector256`1[ubyte]):System.Runtime.Intrinsics.Vector256`1[ubyte] (FullOpts)
+; Total bytes of code 117, prolog size 3, PerfScore 45.25, instruction count 23, allocated bytes for code 117 (MethodHash=47dc2dc1) for method System.Buffers.ProbabilisticMap:IsCharBitSetAvx2(System.Runtime.Intrinsics.Vector256`1[ubyte],System.Runtime.Intrinsics.Vector256`1[ubyte],System.Runtime.Intrinsics.Vector256`1[ubyte]):System.Runtime.Intrinsics.Vector256`1[ubyte] (FullOpts)
; ============================================================ Unwind Info:

-7 (-4.58%) : 293786.dasm - System.Numerics.Tensors.TensorPrimitives:<ConvertToSingle>g__HalfAsWidenedUInt32ToSingle_Vector512|210_2(System.Runtime.Intrinsics.Vector512`1[uint]):System.Runtime.Intrinsics.Vector512`1[float] (FullOpts)

@@ -39,9 +39,8 @@ G_M58105_IG02: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0006 {rcx rdx}, vpandd zmm3, zmm4, dword ptr [reloc @RWD128] {1to16} vpord zmm4, zmm3, dword ptr [reloc @RWD132] {1to16} vptestnmd k1, zmm2, zmm2
- vpmovm2d zmm2, k1
- vpslld zmm5, zmm4, 1
- vpternlogd zmm2, zmm4, zmm5, -54
+ vpslld zmm2, zmm4, 1
+ vpblendmd zmm2 {k1}, zmm2, zmm4
vpslld zmm0, zmm0, 13 vpandd zmm0, zmm0, dword ptr [reloc @RWD136] {1to16} vpaddd zmm0, zmm0, zmm2 @@ -50,7 +49,7 @@ G_M58105_IG02: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0006 {rcx rdx}, vmovups zmmword ptr [rcx], zmm0 mov rax, rcx ; byrRegs +[rax]
- ;; size=146 bbWeight=1 PerfScore 33.25
+ ;; size=139 bbWeight=1 PerfScore 32.25
G_M58105_IG03: ; bbWeight=1, epilog, nogc, extend vzeroupper ret @@ -65,7 +64,7 @@ RWD132 dd 38000000h RWD136 dd 0FFFE000h
-; Total bytes of code 153, prolog size 3, PerfScore 36.25, instruction count 24, allocated bytes for code 153 (MethodHash=e6ab1d06) for method System.Numerics.Tensors.TensorPrimitives:<ConvertToSingle>g__HalfAsWidenedUInt32ToSingle_Vector512|210_2(System.Runtime.Intrinsics.Vector512`1[uint]):System.Runtime.Intrinsics.Vector512`1[float] (FullOpts)
+; Total bytes of code 146, prolog size 3, PerfScore 35.25, instruction count 23, allocated bytes for code 146 (MethodHash=e6ab1d06) for method System.Numerics.Tensors.TensorPrimitives:<ConvertToSingle>g__HalfAsWidenedUInt32ToSingle_Vector512|210_2(System.Runtime.Intrinsics.Vector512`1[uint]):System.Runtime.Intrinsics.Vector512`1[float] (FullOpts)
; ============================================================ Unwind Info:

-28 (-2.50%) : 27702.dasm - System.Buffers.ProbabilisticMap:IndexOfAnyVectorized(byref,byref,int,System.ReadOnlySpan`1[ushort]):int (FullOpts)

@@ -186,12 +186,11 @@ G_M48875_IG06: ; bbWeight=4, gcrefRegs=0000 {}, byrefRegs=C0C8 {rbx rsi r vpshufb ymm1, ymm3, ymm1 vpand ymm2, ymm2, ymmword ptr [reloc @RWD96] vpcmpub k1, ymm2, ymmword ptr [reloc @RWD128], 6
- vpmovm2b ymm3, k1
- vpsubb ymm4, ymm2, ymmword ptr [reloc @RWD160]
- vpshufb ymm4, ymm9, ymm4
+ vpsubb ymm3, ymm2, ymmword ptr [reloc @RWD160]
+ vpshufb ymm3, ymm9, ymm3
vpshufb ymm2, ymm8, ymm2
- vpternlogd ymm3, ymm4, ymm2, -54
- vpand ymm1, ymm3, ymm1
+ vpblendmb ymm2 {k1}, ymm2, ymm3
+ vpand ymm1, ymm2, ymm1
vxorps ymm2, ymm2, ymm2 vpcmpeqb ymm1, ymm1, ymm2 vpcmpeqd ymm2, ymm2, ymm2 @@ -202,12 +201,11 @@ G_M48875_IG06: ; bbWeight=4, gcrefRegs=0000 {}, byrefRegs=C0C8 {rbx rsi r vpshufb ymm2, ymm3, ymm2 vpand ymm0, ymm0, ymmword ptr [reloc @RWD96] vpcmpub k1, ymm0, ymmword ptr [reloc @RWD128], 6
- vpmovm2b ymm3, k1
- vpsubb ymm4, ymm0, ymmword ptr [reloc @RWD160]
- vpshufb ymm4, ymm9, ymm4
+ vpsubb ymm3, ymm0, ymmword ptr [reloc @RWD160]
+ vpshufb ymm3, ymm9, ymm3
vpshufb ymm0, ymm8, ymm0
- vpternlogd ymm3, ymm4, ymm0, -54
- vpand ymm0, ymm3, ymm2
+ vpblendmb ymm0 {k1}, ymm0, ymm3
+ vpand ymm0, ymm0, ymm2
vxorps ymm2, ymm2, ymm2 vpcmpeqb ymm0, ymm0, ymm2 vpcmpeqd ymm2, ymm2, ymm2 @@ -215,7 +213,7 @@ G_M48875_IG06: ; bbWeight=4, gcrefRegs=0000 {}, byrefRegs=C0C8 {rbx rsi r vpand ymm0, ymm1, ymm0 vptest ymm0, ymm0 je SHORT G_M48875_IG09
- ;; size=248 bbWeight=4 PerfScore 317.33
+ ;; size=234 bbWeight=4 PerfScore 313.33
G_M48875_IG07: ; bbWeight=2, gcrefRegs=0000 {}, byrefRegs=C0C8 {rbx rsi rdi r14 r15}, byref vpermq ymm0, ymm0, -40 vpmovmskb ebp, ymm0 @@ -338,12 +336,11 @@ G_M48875_IG16: ; bbWeight=4, gcrefRegs=0000 {}, byrefRegs=C0C8 {rbx rsi r vpshufb xmm1, xmm10, xmm1 vpand xmm2, xmm2, xmm11 vpcmpub k1, xmm2, xmm12, 6
- vpmovm2b xmm3, k1
- vpsubb xmm4, xmm2, xmm13
- vpshufb xmm4, xmm7, xmm4
+ vpsubb xmm3, xmm2, xmm13
+ vpshufb xmm3, xmm7, xmm3
vpshufb xmm2, xmm6, xmm2
- vpternlogd xmm3, xmm4, xmm2, -54
- vpand xmm1, xmm3, xmm1
+ vpblendmb xmm2 {k1}, xmm2, xmm3
+ vpand xmm1, xmm2, xmm1
vxorps xmm2, xmm2, xmm2 vpcmpeqb xmm1, xmm1, xmm2 vpcmpeqd xmm2, xmm2, xmm2 @@ -353,12 +350,11 @@ G_M48875_IG16: ; bbWeight=4, gcrefRegs=0000 {}, byrefRegs=C0C8 {rbx rsi r vpshufb xmm2, xmm10, xmm2 vpand xmm0, xmm0, xmm11 vpcmpub k1, xmm0, xmm12, 6
- vpmovm2b xmm3, k1
- vpsubb xmm4, xmm0, xmm13
- vpshufb xmm4, xmm7, xmm4
+ vpsubb xmm3, xmm0, xmm13
+ vpshufb xmm3, xmm7, xmm3
vpshufb xmm0, xmm6, xmm0
- vpternlogd xmm3, xmm4, xmm0, -54
- vpand xmm0, xmm3, xmm2
+ vpblendmb xmm0 {k1}, xmm0, xmm3
+ vpand xmm0, xmm0, xmm2
vxorps xmm2, xmm2, xmm2 vpcmpeqb xmm0, xmm0, xmm2 vpcmpeqd xmm2, xmm2, xmm2 @@ -366,7 +362,7 @@ G_M48875_IG16: ; bbWeight=4, gcrefRegs=0000 {}, byrefRegs=C0C8 {rbx rsi r vpand xmm0, xmm1, xmm0 vptest xmm0, xmm0 je SHORT G_M48875_IG19
- ;; size=200 bbWeight=4 PerfScore 168.00
+ ;; size=186 bbWeight=4 PerfScore 164.00
G_M48875_IG17: ; bbWeight=2, gcrefRegs=0000 {}, byrefRegs=C0C8 {rbx rsi rdi r14 r15}, byref vpmovmskb ebp, xmm0 ;; size=4 bbWeight=2 PerfScore 4.00 @@ -433,7 +429,7 @@ RWD128 dq 0F0F0F0F0F0F0F0Fh, 0F0F0F0F0F0F0F0Fh, 0F0F0F0F0F0F0F0Fh, 0F0F0F0F0F0F RWD160 dq 1010101010101010h, 1010101010101010h, 1010101010101010h, 1010101010101010h
-; Total bytes of code 1118, prolog size 86, PerfScore 1254.58, instruction count 240, allocated bytes for code 1118 (MethodHash=36e94114) for method System.Buffers.ProbabilisticMap:IndexOfAnyVectorized(byref,byref,int,System.ReadOnlySpan`1[ushort]):int (FullOpts)
+; Total bytes of code 1090, prolog size 86, PerfScore 1246.58, instruction count 236, allocated bytes for code 1090 (MethodHash=36e94114) for method System.Buffers.ProbabilisticMap:IndexOfAnyVectorized(byref,byref,int,System.ReadOnlySpan`1[ushort]):int (FullOpts)
; ============================================================ Unwind Info:

libraries_tests.run.windows.x64.Release.mch

-14 (-16.67%) : 386233.dasm - System.Numerics.Tensors.TensorPrimitives+MaxMagnitudeOperator`1[ulong]:Invoke(System.Runtime.Intrinsics.Vector128`1[ulong],System.Runtime.Intrinsics.Vector128`1[ulong]):System.Runtime.Intrinsics.Vector128`1[ulong] (Tier1)

@@ -37,21 +37,19 @@ G_M11551_IG02: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0106 {rcx rdx r vpcmpeqq xmm4, xmm1, xmm3 vxorps xmm5, xmm5, xmm5 vpcmpuq k1, xmm0, xmm5, 1
- vpmovm2q xmm5, k1
- vpternlogq xmm5, xmm2, xmm0, -54
+ vpblendmq xmm5 {k1}, xmm0, xmm2
vpcmpuq k1, xmm1, xmm3, 6
- vpmovm2q xmm1, k1
- vpternlogq xmm1, xmm0, xmm2, -54
- vpternlogq xmm4, xmm5, xmm1, -54
+ vpblendmq xmm0 {k1}, xmm2, xmm0
+ vpternlogq xmm4, xmm5, xmm0, -54
vmovups xmmword ptr [rcx], xmm4 mov rax, rcx ; byrRegs +[rax]
- ;; size=80 bbWeight=1 PerfScore 21.08
+ ;; size=66 bbWeight=1 PerfScore 18.75
G_M11551_IG03: ; bbWeight=1, epilog, nogc, extend ret ;; size=1 bbWeight=1 PerfScore 1.00
-; Total bytes of code 84, prolog size 3, PerfScore 23.08, instruction count 17, allocated bytes for code 84 (MethodHash=e91fd2e0) for method System.Numerics.Tensors.TensorPrimitives+MaxMagnitudeOperator`1[ulong]:Invoke(System.Runtime.Intrinsics.Vector128`1[ulong],System.Runtime.Intrinsics.Vector128`1[ulong]):System.Runtime.Intrinsics.Vector128`1[ulong] (Tier1)
+; Total bytes of code 70, prolog size 3, PerfScore 20.75, instruction count 15, allocated bytes for code 70 (MethodHash=e91fd2e0) for method System.Numerics.Tensors.TensorPrimitives+MaxMagnitudeOperator`1[ulong]:Invoke(System.Runtime.Intrinsics.Vector128`1[ulong],System.Runtime.Intrinsics.Vector128`1[ulong]):System.Runtime.Intrinsics.Vector128`1[ulong] (Tier1)
; ============================================================ Unwind Info:

-14 (-16.67%) : 393190.dasm - System.Numerics.Tensors.TensorPrimitives+MaxMagnitudePropagateNaNOperator`1[ulong]:Invoke(System.Runtime.Intrinsics.Vector128`1[ulong],System.Runtime.Intrinsics.Vector128`1[ulong]):System.Runtime.Intrinsics.Vector128`1[ulong] (Tier1)

@@ -36,21 +36,19 @@ G_M1813_IG02: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0106 {rcx rdx r8 vpcmpeqq xmm4, xmm1, xmm3 vxorps xmm5, xmm5, xmm5 vpcmpuq k1, xmm0, xmm5, 1
- vpmovm2q xmm5, k1
- vpternlogq xmm5, xmm2, xmm0, -54
+ vpblendmq xmm5 {k1}, xmm0, xmm2
vpcmpuq k1, xmm1, xmm3, 6
- vpmovm2q xmm1, k1
- vpternlogq xmm1, xmm0, xmm2, -54
- vpternlogq xmm4, xmm5, xmm1, -54
+ vpblendmq xmm0 {k1}, xmm2, xmm0
+ vpternlogq xmm4, xmm5, xmm0, -54
vmovups xmmword ptr [rcx], xmm4 mov rax, rcx ; byrRegs +[rax]
- ;; size=80 bbWeight=1 PerfScore 21.08
+ ;; size=66 bbWeight=1 PerfScore 18.75
G_M1813_IG03: ; bbWeight=1, epilog, nogc, extend ret ;; size=1 bbWeight=1 PerfScore 1.00
-; Total bytes of code 84, prolog size 3, PerfScore 23.08, instruction count 17, allocated bytes for code 84 (MethodHash=f881f8ea) for method System.Numerics.Tensors.TensorPrimitives+MaxMagnitudePropagateNaNOperator`1[ulong]:Invoke(System.Runtime.Intrinsics.Vector128`1[ulong],System.Runtime.Intrinsics.Vector128`1[ulong]):System.Runtime.Intrinsics.Vector128`1[ulong] (Tier1)
+; Total bytes of code 70, prolog size 3, PerfScore 20.75, instruction count 15, allocated bytes for code 70 (MethodHash=f881f8ea) for method System.Numerics.Tensors.TensorPrimitives+MaxMagnitudePropagateNaNOperator`1[ulong]:Invoke(System.Runtime.Intrinsics.Vector128`1[ulong],System.Runtime.Intrinsics.Vector128`1[ulong]):System.Runtime.Intrinsics.Vector128`1[ulong] (Tier1)
; ============================================================ Unwind Info:

-14 (-16.67%) : 393191.dasm - System.Numerics.Tensors.TensorPrimitives+MaxMagnitudeOperator`1[ulong]:Invoke(System.Runtime.Intrinsics.Vector128`1[ulong],System.Runtime.Intrinsics.Vector128`1[ulong]):System.Runtime.Intrinsics.Vector128`1[ulong] (Tier1)

@@ -37,21 +37,19 @@ G_M11551_IG02: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0106 {rcx rdx r vpcmpeqq xmm4, xmm1, xmm3 vxorps xmm5, xmm5, xmm5 vpcmpuq k1, xmm0, xmm5, 1
- vpmovm2q xmm5, k1
- vpternlogq xmm5, xmm2, xmm0, -54
+ vpblendmq xmm5 {k1}, xmm0, xmm2
vpcmpuq k1, xmm1, xmm3, 6
- vpmovm2q xmm1, k1
- vpternlogq xmm1, xmm0, xmm2, -54
- vpternlogq xmm4, xmm5, xmm1, -54
+ vpblendmq xmm0 {k1}, xmm2, xmm0
+ vpternlogq xmm4, xmm5, xmm0, -54
vmovups xmmword ptr [rcx], xmm4 mov rax, rcx ; byrRegs +[rax]
- ;; size=80 bbWeight=1 PerfScore 21.08
+ ;; size=66 bbWeight=1 PerfScore 18.75
G_M11551_IG03: ; bbWeight=1, epilog, nogc, extend ret ;; size=1 bbWeight=1 PerfScore 1.00
-; Total bytes of code 84, prolog size 3, PerfScore 23.08, instruction count 17, allocated bytes for code 84 (MethodHash=e91fd2e0) for method System.Numerics.Tensors.TensorPrimitives+MaxMagnitudeOperator`1[ulong]:Invoke(System.Runtime.Intrinsics.Vector128`1[ulong],System.Runtime.Intrinsics.Vector128`1[ulong]):System.Runtime.Intrinsics.Vector128`1[ulong] (Tier1)
+; Total bytes of code 70, prolog size 3, PerfScore 20.75, instruction count 15, allocated bytes for code 70 (MethodHash=e91fd2e0) for method System.Numerics.Tensors.TensorPrimitives+MaxMagnitudeOperator`1[ulong]:Invoke(System.Runtime.Intrinsics.Vector128`1[ulong],System.Runtime.Intrinsics.Vector128`1[ulong]):System.Runtime.Intrinsics.Vector128`1[ulong] (Tier1)
; ============================================================ Unwind Info:

-7 (-2.52%) : 392581.dasm - System.Numerics.Tensors.TensorPrimitives+MinMagnitudeOperator`1[ulong]:Invoke(System.Runtime.Intrinsics.Vector256`1[ulong],System.Runtime.Intrinsics.Vector256`1[ulong]):System.Runtime.Intrinsics.Vector256`1[ulong] (Instrumented Tier0)

@@ -59,16 +59,17 @@ G_M12395_IG02: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref vpternlogq ymm1, ymm2, ymmword ptr [rcx], -54 vmovups ymm2, ymmword ptr [rbp-0x50] vpcmpuq k1, ymm2, ymmword ptr [rbp-0x30], 1
- vpmovm2q ymm2, k1
mov rcx, bword ptr [rbp+0x20]
- vmovups ymm3, ymmword ptr [rcx]
- mov rcx, bword ptr [rbp+0x18]
- vpternlogq ymm2, ymm3, ymmword ptr [rcx], -54
+ mov rax, bword ptr [rbp+0x18]
+ ; byrRegs +[rax]
+ vmovups ymm2, ymmword ptr [rax]
+ vpblendmq ymm2 {k1}, ymm2, ymmword ptr [rcx]
vpternlogq ymm0, ymm1, ymm2, -54 vmovups ymmword ptr [rbp-0x70], ymm0 mov rcx, 0xD1FFAB1E ; byrRegs -[rcx] call CORINFO_HELP_COUNTPROFILE32
+ ; byrRegs -[rax]
mov rcx, 0xD1FFAB1E call CORINFO_HELP_COUNTPROFILE32 mov rax, bword ptr [rbp+0x10] @@ -76,7 +77,7 @@ G_M12395_IG02: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref vmovups ymm0, ymmword ptr [rbp-0x70] vmovups ymmword ptr [rax], ymm0 mov rax, bword ptr [rbp+0x10]
- ;; size=200 bbWeight=1 PerfScore 77.00
+ ;; size=193 bbWeight=1 PerfScore 76.00
G_M12395_IG03: ; bbWeight=1, epilog, nogc, extend vzeroupper add rsp, 272 @@ -84,7 +85,7 @@ G_M12395_IG03: ; bbWeight=1, epilog, nogc, extend ret ;; size=12 bbWeight=1 PerfScore 2.75
-; Total bytes of code 278, prolog size 54, PerfScore 95.83, instruction count 53, allocated bytes for code 278 (MethodHash=3d67cf94) for method System.Numerics.Tensors.TensorPrimitives+MinMagnitudeOperator`1[ulong]:Invoke(System.Runtime.Intrinsics.Vector256`1[ulong],System.Runtime.Intrinsics.Vector256`1[ulong]):System.Runtime.Intrinsics.Vector256`1[ulong] (Instrumented Tier0)
+; Total bytes of code 271, prolog size 54, PerfScore 94.83, instruction count 52, allocated bytes for code 272 (MethodHash=3d67cf94) for method System.Numerics.Tensors.TensorPrimitives+MinMagnitudeOperator`1[ulong]:Invoke(System.Runtime.Intrinsics.Vector256`1[ulong],System.Runtime.Intrinsics.Vector256`1[ulong]):System.Runtime.Intrinsics.Vector256`1[ulong] (Instrumented Tier0)
; ============================================================ Unwind Info:

-7 (-2.52%) : 392539.dasm - System.Numerics.Tensors.TensorPrimitives+MaxMagnitudeOperator`1[ulong]:Invoke(System.Runtime.Intrinsics.Vector256`1[ulong],System.Runtime.Intrinsics.Vector256`1[ulong]):System.Runtime.Intrinsics.Vector256`1[ulong] (Instrumented Tier0)

@@ -59,16 +59,17 @@ G_M63669_IG02: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref vpternlogq ymm1, ymm2, ymmword ptr [rcx], -54 vmovups ymm2, ymmword ptr [rbp-0x30] vpcmpuq k1, ymm2, ymmword ptr [rbp-0x50], 6
- vpmovm2q ymm2, k1
mov rcx, bword ptr [rbp+0x18]
- vmovups ymm3, ymmword ptr [rcx]
- mov rcx, bword ptr [rbp+0x20]
- vpternlogq ymm2, ymm3, ymmword ptr [rcx], -54
+ mov rax, bword ptr [rbp+0x20]
+ ; byrRegs +[rax]
+ vmovups ymm2, ymmword ptr [rax]
+ vpblendmq ymm2 {k1}, ymm2, ymmword ptr [rcx]
vpternlogq ymm0, ymm1, ymm2, -54 vmovups ymmword ptr [rbp-0x70], ymm0 mov rcx, 0xD1FFAB1E ; byrRegs -[rcx] call CORINFO_HELP_COUNTPROFILE32
+ ; byrRegs -[rax]
mov rcx, 0xD1FFAB1E call CORINFO_HELP_COUNTPROFILE32 mov rax, bword ptr [rbp+0x10] @@ -76,7 +77,7 @@ G_M63669_IG02: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref vmovups ymm0, ymmword ptr [rbp-0x70] vmovups ymmword ptr [rax], ymm0 mov rax, bword ptr [rbp+0x10]
- ;; size=200 bbWeight=1 PerfScore 77.00
+ ;; size=193 bbWeight=1 PerfScore 76.00
G_M63669_IG03: ; bbWeight=1, epilog, nogc, extend vzeroupper add rsp, 272 @@ -84,7 +85,7 @@ G_M63669_IG03: ; bbWeight=1, epilog, nogc, extend ret ;; size=12 bbWeight=1 PerfScore 2.75
-; Total bytes of code 278, prolog size 54, PerfScore 95.83, instruction count 53, allocated bytes for code 278 (MethodHash=dc23074a) for method System.Numerics.Tensors.TensorPrimitives+MaxMagnitudeOperator`1[ulong]:Invoke(System.Runtime.Intrinsics.Vector256`1[ulong],System.Runtime.Intrinsics.Vector256`1[ulong]):System.Runtime.Intrinsics.Vector256`1[ulong] (Instrumented Tier0)
+; Total bytes of code 271, prolog size 54, PerfScore 94.83, instruction count 52, allocated bytes for code 272 (MethodHash=dc23074a) for method System.Numerics.Tensors.TensorPrimitives+MaxMagnitudeOperator`1[ulong]:Invoke(System.Runtime.Intrinsics.Vector256`1[ulong],System.Runtime.Intrinsics.Vector256`1[ulong]):System.Runtime.Intrinsics.Vector256`1[ulong] (Instrumented Tier0)
; ============================================================ Unwind Info:

-28 (-2.40%) : 342446.dasm - System.Buffers.ProbabilisticMap:IndexOfAnyVectorized(byref,byref,int,System.ReadOnlySpan`1[ushort]):int (Tier1)

@@ -172,12 +172,11 @@ G_M48875_IG04: ; bbWeight=4.37, gcrefRegs=0000 {}, byrefRegs=C0C8 {rbx rs vpshufb ymm1, ymm3, ymm1 vpand ymm2, ymm2, ymmword ptr [reloc @RWD96] vpcmpub k1, ymm2, ymmword ptr [reloc @RWD128], 6
- vpmovm2b ymm3, k1
- vpsubb ymm4, ymm2, ymmword ptr [reloc @RWD160]
- vpshufb ymm4, ymm9, ymm4
+ vpsubb ymm3, ymm2, ymmword ptr [reloc @RWD160]
+ vpshufb ymm3, ymm9, ymm3
vpshufb ymm2, ymm8, ymm2
- vpternlogd ymm3, ymm4, ymm2, -54
- vpand ymm1, ymm3, ymm1
+ vpblendmb ymm2 {k1}, ymm2, ymm3
+ vpand ymm1, ymm2, ymm1
vxorps ymm2, ymm2, ymm2 vpcmpeqb ymm1, ymm1, ymm2 vpcmpeqd ymm2, ymm2, ymm2 @@ -188,12 +187,11 @@ G_M48875_IG04: ; bbWeight=4.37, gcrefRegs=0000 {}, byrefRegs=C0C8 {rbx rs vpshufb ymm2, ymm3, ymm2 vpand ymm0, ymm0, ymmword ptr [reloc @RWD96] vpcmpub k1, ymm0, ymmword ptr [reloc @RWD128], 6
- vpmovm2b ymm3, k1
- vpsubb ymm4, ymm0, ymmword ptr [reloc @RWD160]
- vpshufb ymm4, ymm9, ymm4
+ vpsubb ymm3, ymm0, ymmword ptr [reloc @RWD160]
+ vpshufb ymm3, ymm9, ymm3
vpshufb ymm0, ymm8, ymm0
- vpternlogd ymm3, ymm4, ymm0, -54
- vpand ymm0, ymm3, ymm2
+ vpblendmb ymm0 {k1}, ymm0, ymm3
+ vpand ymm0, ymm0, ymm2
vxorps ymm2, ymm2, ymm2 vpcmpeqb ymm0, ymm0, ymm2 vpcmpeqd ymm2, ymm2, ymm2 @@ -201,7 +199,7 @@ G_M48875_IG04: ; bbWeight=4.37, gcrefRegs=0000 {}, byrefRegs=C0C8 {rbx rs vpand ymm10, ymm1, ymm0 vptest ymm10, ymm10 jne SHORT G_M48875_IG07
- ;; size=248 bbWeight=4.37 PerfScore 346.74
+ ;; size=234 bbWeight=4.37 PerfScore 342.37
G_M48875_IG05: ; bbWeight=3.45, gcrefRegs=0000 {}, byrefRegs=C0C8 {rbx rsi rdi r14 r15}, byref add r15, 64 cmp r15, rsi @@ -318,12 +316,11 @@ G_M48875_IG17: ; bbWeight=0.08, gcrefRegs=0000 {}, byrefRegs=C0C8 {rbx rs vpshufb xmm1, xmm3, xmm1 vpand xmm2, xmm2, xmmword ptr [reloc @RWD96] vpcmpub k1, xmm2, xmmword ptr [reloc @RWD128], 6
- vpmovm2b xmm3, k1
- vpsubb xmm4, xmm2, xmmword ptr [reloc @RWD160]
- vpshufb xmm4, xmm7, xmm4
+ vpsubb xmm3, xmm2, xmmword ptr [reloc @RWD160]
+ vpshufb xmm3, xmm7, xmm3
vpshufb xmm2, xmm6, xmm2
- vpternlogd xmm3, xmm4, xmm2, -54
- vpand xmm1, xmm3, xmm1
+ vpblendmb xmm2 {k1}, xmm2, xmm3
+ vpand xmm1, xmm2, xmm1
vxorps xmm2, xmm2, xmm2 vpcmpeqb xmm1, xmm1, xmm2 vpcmpeqd xmm2, xmm2, xmm2 @@ -334,12 +331,11 @@ G_M48875_IG17: ; bbWeight=0.08, gcrefRegs=0000 {}, byrefRegs=C0C8 {rbx rs vpshufb xmm2, xmm3, xmm2 vpand xmm0, xmm0, xmmword ptr [reloc @RWD96] vpcmpub k1, xmm0, xmmword ptr [reloc @RWD128], 6
- vpmovm2b xmm3, k1
- vpsubb xmm4, xmm0, xmmword ptr [reloc @RWD160]
- vpshufb xmm4, xmm7, xmm4
+ vpsubb xmm3, xmm0, xmmword ptr [reloc @RWD160]
+ vpshufb xmm3, xmm7, xmm3
vpshufb xmm0, xmm6, xmm0
- vpternlogd xmm3, xmm4, xmm0, -54
- vpand xmm0, xmm3, xmm2
+ vpblendmb xmm0 {k1}, xmm0, xmm3
+ vpand xmm0, xmm0, xmm2
vxorps xmm2, xmm2, xmm2 vpcmpeqb xmm0, xmm0, xmm2 vpcmpeqd xmm2, xmm2, xmm2 @@ -347,7 +343,7 @@ G_M48875_IG17: ; bbWeight=0.08, gcrefRegs=0000 {}, byrefRegs=C0C8 {rbx rs vpand xmm0, xmm1, xmm0 vptest xmm0, xmm0 je SHORT G_M48875_IG21
- ;; size=248 bbWeight=0.08 PerfScore 5.05
+ ;; size=234 bbWeight=0.08 PerfScore 4.97
G_M48875_IG18: ; bbWeight=0.07, gcrefRegs=0000 {}, byrefRegs=C0C8 {rbx rsi rdi r14 r15}, byref vpmovmskb ebp, xmm0 ;; size=4 bbWeight=0.07 PerfScore 0.14 @@ -444,7 +440,7 @@ RWD128 dq 0F0F0F0F0F0F0F0Fh, 0F0F0F0F0F0F0F0Fh, 0F0F0F0F0F0F0F0Fh, 0F0F0F0F0F0F RWD160 dq 1010101010101010h, 1010101010101010h, 1010101010101010h, 1010101010101010h
-; Total bytes of code 1165, prolog size 86, PerfScore 510.92, instruction count 247, allocated bytes for code 1165 (MethodHash=36e94114) for method System.Buffers.ProbabilisticMap:IndexOfAnyVectorized(byref,byref,int,System.ReadOnlySpan`1[ushort]):int (Tier1)
+; Total bytes of code 1137, prolog size 86, PerfScore 506.47, instruction count 243, allocated bytes for code 1137 (MethodHash=36e94114) for method System.Buffers.ProbabilisticMap:IndexOfAnyVectorized(byref,byref,int,System.ReadOnlySpan`1[ushort]):int (Tier1)
; ============================================================ Unwind Info:

librariestestsnotieredcompilation.run.windows.x64.Release.mch

-28 (-2.50%) : 150728.dasm - System.Buffers.ProbabilisticMap:IndexOfAnyVectorized(byref,byref,int,System.ReadOnlySpan`1[ushort]):int (FullOpts)

@@ -186,12 +186,11 @@ G_M48875_IG06: ; bbWeight=4, gcrefRegs=0000 {}, byrefRegs=C0C8 {rbx rsi r vpshufb ymm1, ymm3, ymm1 vpand ymm2, ymm2, ymmword ptr [reloc @RWD96] vpcmpub k1, ymm2, ymmword ptr [reloc @RWD128], 6
- vpmovm2b ymm3, k1
- vpsubb ymm4, ymm2, ymmword ptr [reloc @RWD160]
- vpshufb ymm4, ymm9, ymm4
+ vpsubb ymm3, ymm2, ymmword ptr [reloc @RWD160]
+ vpshufb ymm3, ymm9, ymm3
vpshufb ymm2, ymm8, ymm2
- vpternlogd ymm3, ymm4, ymm2, -54
- vpand ymm1, ymm3, ymm1
+ vpblendmb ymm2 {k1}, ymm2, ymm3
+ vpand ymm1, ymm2, ymm1
vxorps ymm2, ymm2, ymm2 vpcmpeqb ymm1, ymm1, ymm2 vpcmpeqd ymm2, ymm2, ymm2 @@ -202,12 +201,11 @@ G_M48875_IG06: ; bbWeight=4, gcrefRegs=0000 {}, byrefRegs=C0C8 {rbx rsi r vpshufb ymm2, ymm3, ymm2 vpand ymm0, ymm0, ymmword ptr [reloc @RWD96] vpcmpub k1, ymm0, ymmword ptr [reloc @RWD128], 6
- vpmovm2b ymm3, k1
- vpsubb ymm4, ymm0, ymmword ptr [reloc @RWD160]
- vpshufb ymm4, ymm9, ymm4
+ vpsubb ymm3, ymm0, ymmword ptr [reloc @RWD160]
+ vpshufb ymm3, ymm9, ymm3
vpshufb ymm0, ymm8, ymm0
- vpternlogd ymm3, ymm4, ymm0, -54
- vpand ymm0, ymm3, ymm2
+ vpblendmb ymm0 {k1}, ymm0, ymm3
+ vpand ymm0, ymm0, ymm2
vxorps ymm2, ymm2, ymm2 vpcmpeqb ymm0, ymm0, ymm2 vpcmpeqd ymm2, ymm2, ymm2 @@ -215,7 +213,7 @@ G_M48875_IG06: ; bbWeight=4, gcrefRegs=0000 {}, byrefRegs=C0C8 {rbx rsi r vpand ymm0, ymm1, ymm0 vptest ymm0, ymm0 je SHORT G_M48875_IG09
- ;; size=248 bbWeight=4 PerfScore 317.33
+ ;; size=234 bbWeight=4 PerfScore 313.33
G_M48875_IG07: ; bbWeight=2, gcrefRegs=0000 {}, byrefRegs=C0C8 {rbx rsi rdi r14 r15}, byref vpermq ymm0, ymm0, -40 vpmovmskb ebp, ymm0 @@ -338,12 +336,11 @@ G_M48875_IG16: ; bbWeight=4, gcrefRegs=0000 {}, byrefRegs=C0C8 {rbx rsi r vpshufb xmm1, xmm10, xmm1 vpand xmm2, xmm2, xmm11 vpcmpub k1, xmm2, xmm12, 6
- vpmovm2b xmm3, k1
- vpsubb xmm4, xmm2, xmm13
- vpshufb xmm4, xmm7, xmm4
+ vpsubb xmm3, xmm2, xmm13
+ vpshufb xmm3, xmm7, xmm3
vpshufb xmm2, xmm6, xmm2
- vpternlogd xmm3, xmm4, xmm2, -54
- vpand xmm1, xmm3, xmm1
+ vpblendmb xmm2 {k1}, xmm2, xmm3
+ vpand xmm1, xmm2, xmm1
vxorps xmm2, xmm2, xmm2 vpcmpeqb xmm1, xmm1, xmm2 vpcmpeqd xmm2, xmm2, xmm2 @@ -353,12 +350,11 @@ G_M48875_IG16: ; bbWeight=4, gcrefRegs=0000 {}, byrefRegs=C0C8 {rbx rsi r vpshufb xmm2, xmm10, xmm2 vpand xmm0, xmm0, xmm11 vpcmpub k1, xmm0, xmm12, 6
- vpmovm2b xmm3, k1
- vpsubb xmm4, xmm0, xmm13
- vpshufb xmm4, xmm7, xmm4
+ vpsubb xmm3, xmm0, xmm13
+ vpshufb xmm3, xmm7, xmm3
vpshufb xmm0, xmm6, xmm0
- vpternlogd xmm3, xmm4, xmm0, -54
- vpand xmm0, xmm3, xmm2
+ vpblendmb xmm0 {k1}, xmm0, xmm3
+ vpand xmm0, xmm0, xmm2
vxorps xmm2, xmm2, xmm2 vpcmpeqb xmm0, xmm0, xmm2 vpcmpeqd xmm2, xmm2, xmm2 @@ -366,7 +362,7 @@ G_M48875_IG16: ; bbWeight=4, gcrefRegs=0000 {}, byrefRegs=C0C8 {rbx rsi r vpand xmm0, xmm1, xmm0 vptest xmm0, xmm0 je SHORT G_M48875_IG19
- ;; size=200 bbWeight=4 PerfScore 168.00
+ ;; size=186 bbWeight=4 PerfScore 164.00
G_M48875_IG17: ; bbWeight=2, gcrefRegs=0000 {}, byrefRegs=C0C8 {rbx rsi rdi r14 r15}, byref vpmovmskb ebp, xmm0 ;; size=4 bbWeight=2 PerfScore 4.00 @@ -433,7 +429,7 @@ RWD128 dq 0F0F0F0F0F0F0F0Fh, 0F0F0F0F0F0F0F0Fh, 0F0F0F0F0F0F0F0Fh, 0F0F0F0F0F0F RWD160 dq 1010101010101010h, 1010101010101010h, 1010101010101010h, 1010101010101010h
-; Total bytes of code 1118, prolog size 86, PerfScore 1254.58, instruction count 240, allocated bytes for code 1118 (MethodHash=36e94114) for method System.Buffers.ProbabilisticMap:IndexOfAnyVectorized(byref,byref,int,System.ReadOnlySpan`1[ushort]):int (FullOpts)
+; Total bytes of code 1090, prolog size 86, PerfScore 1246.58, instruction count 236, allocated bytes for code 1090 (MethodHash=36e94114) for method System.Buffers.ProbabilisticMap:IndexOfAnyVectorized(byref,byref,int,System.ReadOnlySpan`1[ushort]):int (FullOpts)
; ============================================================ Unwind Info:

-17 (-1.75%) : 169934.dasm - System.Numerics.Tests.GenericVectorTests:TestConditionalSelect[uint]():this (FullOpts)

@@ -15,7 +15,7 @@ ;* V04 loc3 [V04 ] ( 0, 0 ) simd32 -> zero-ref <System.Numerics.Vector`1[uint]> ; V05 loc4 [V05,T16] ( 2, 2 ) simd32 -> mm8 <System.Numerics.Vector`1[uint]> ;* V06 loc5 [V06 ] ( 0, 0 ) simd32 -> zero-ref <System.Numerics.Vector`1[uint]>
-; V07 loc6 [V07,T17] ( 2, 2 ) simd32 -> mm8 <System.Numerics.Vector`1[uint]>
+; V07 loc6 [V07,T17] ( 2, 2 ) simd32 -> mm6 <System.Numerics.Vector`1[uint]>
;* V08 loc7 [V08 ] ( 0, 0 ) struct ( 8) zero-ref ld-addr-op <System.Nullable`1[int]> ; V09 OutArgs [V09 ] ( 1, 1 ) struct (32) [rsp+0x00] do-not-enreg[XS] addr-exposed "OutgoingArgSpace" ;* V10 tmp1 [V10 ] ( 0, 0 ) ref -> zero-ref class-hnd exact "NewObj constructor temp" <System.Numerics.Tests.GenericVectorTests+<>c__DisplayClass670_0`1[uint]> @@ -26,7 +26,7 @@ ;* V15 tmp6 [V15,T08] ( 0, 0 ) int -> zero-ref "Inline stloc first use temp" ; V16 tmp7 [V16,T12] ( 9, 18 ) simd32 -> mm8 ld-addr-op "Inlining Arg" <System.Numerics.Vector`1[uint]> ;* V17 tmp8 [V17,T09] ( 0, 0 ) int -> zero-ref "Inline stloc first use temp"
-; V18 tmp9 [V18,T13] ( 9, 18 ) simd32 -> mm8 ld-addr-op "Inlining Arg" <System.Numerics.Vector`1[uint]>
+; V18 tmp9 [V18,T13] ( 9, 18 ) simd32 -> mm6 ld-addr-op "Inlining Arg" <System.Numerics.Vector`1[uint]>
;* V19 tmp10 [V19,T10] ( 0, 0 ) ubyte -> zero-ref single-def "field V08.hasValue (fldOffset=0x0)" P-INDEP ;* V20 tmp11 [V20,T11] ( 0, 0 ) int -> zero-ref single-def "field V08.value (fldOffset=0x4)" P-INDEP ; V21 tmp12 [V21,T02] ( 4, 8 ) struct ( 8) [rsp+0x20] do-not-enreg[SF] "by-value struct argument" <System.Nullable`1[int]> @@ -104,8 +104,7 @@ G_M21446_IG02: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref jl G_M21446_IG06 vmovups ymm7, ymmword ptr [rcx+0x10] vpcmpud k1, ymm6, ymm7, 6
- vpmovm2d ymm8, k1
- vpternlogd ymm8, ymm6, ymm7, -54
+ vpblendmd ymm8 {k1}, ymm7, ymm6
mov rcx, 0xD1FFAB1E ; System.Action`2[int,uint] ; gcrRegs -[rcx] vextractf128 xmm9, ymm6, 1 @@ -151,7 +150,7 @@ G_M21446_IG02: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref mov rcx, gword ptr [rsi+0x08] ; gcrRegs +[rcx] vextractf128 xmm12, ymm8, 1
- ;; size=312 bbWeight=1 PerfScore 88.00
+ ;; size=305 bbWeight=1 PerfScore 86.83
G_M21446_IG03: ; bbWeight=1, extend call [rsi+0x18]<unknown method> ; gcrRegs -[rcx] @@ -208,10 +207,9 @@ G_M21446_IG03: ; bbWeight=1, extend vinsertf128 ymm6, ymm6, xmm9, 1 vinsertf128 ymm7, ymm7, xmm10, 1 vpcmpud k1, ymm6, ymm7, 2
- vpmovm2d ymm8, k1
- vpternlogd ymm8, ymm6, ymm7, -54
+ vpblendmd ymm6 {k1}, ymm7, ymm6
mov rcx, 0xD1FFAB1E ; System.Action`2[int,uint]
- vextractf128 xmm6, ymm8, 1
+ vextractf128 xmm7, ymm6, 1
call CORINFO_HELP_NEWSFAST ; gcrRegs +[rax] ; gcr arg pop 0 @@ -226,79 +224,79 @@ G_M21446_IG03: ; bbWeight=1, extend ; byrRegs -[rcx] mov r8, 0xD1FFAB1E ; code for <unknown method> mov qword ptr [rsi+0x18], r8
- vinsertf128 ymm8, ymm8, xmm6, 1 - vmovd r8d, xmm8
+ vinsertf128 ymm6, ymm6, xmm7, 1 + vmovd r8d, xmm6
xor edx, edx mov rcx, gword ptr [rsi+0x08] ; gcrRegs +[rcx]
- vextractf128 xmm7, ymm8, 1
+ vextractf128 xmm8, ymm6, 1
call [rsi+0x18]<unknown method> ; gcrRegs -[rcx] ; gcr arg pop 0
- vinsertf128 ymm8, ymm8, xmm7, 1 - vmovaps ymm0, ymm8
+ vinsertf128 ymm6, ymm6, xmm8, 1 + vmovaps ymm0, ymm6
vpextrd r8d, xmm0, 1 mov edx, 1 mov rcx, gword ptr [rsi+0x08] ; gcrRegs +[rcx]
- vextractf128 xmm7, ymm8, 1
+ vextractf128 xmm8, ymm6, 1
call [rsi+0x18]<unknown method> ; gcrRegs -[rcx] ; gcr arg pop 0
- vinsertf128 ymm8, ymm8, xmm7, 1 - vmovaps ymm0, ymm8
+ vinsertf128 ymm6, ymm6, xmm8, 1 + vmovaps ymm0, ymm6
vpextrd r8d, xmm0, 2 mov edx, 2 mov rcx, gword ptr [rsi+0x08] ; gcrRegs +[rcx]
- vextractf128 xmm7, ymm8, 1
+ vextractf128 xmm8, ymm6, 1
call [rsi+0x18]<unknown method> ; gcrRegs -[rcx] ; gcr arg pop 0
- ;; size=353 bbWeight=1 PerfScore 120.75
+ vinsertf128 ymm6, ymm6, xmm8, 1 + ;; size=350 bbWeight=1 PerfScore 121.58
G_M21446_IG04: ; bbWeight=1, extend
- vinsertf128 ymm8, ymm8, xmm7, 1 - vmovaps ymm0, ymm8
+ vmovaps ymm0, ymm6
vpextrd r8d, xmm0, 3 mov edx, 3 mov rcx, gword ptr [rsi+0x08] ; gcrRegs +[rcx]
- vextractf128 xmm7, ymm8, 1
+ vextractf128 xmm8, ymm6, 1
call [rsi+0x18]<unknown method> ; gcrRegs -[rcx] ; gcr arg pop 0
- vinsertf128 ymm8, ymm8, xmm7, 1 - vextracti128 xmm0, ymm8, 1
+ vinsertf128 ymm6, ymm6, xmm8, 1 + vextracti128 xmm0, ymm6, 1
vmovd r8d, xmm0 mov edx, 4 mov rcx, gword ptr [rsi+0x08] ; gcrRegs +[rcx]
- vextractf128 xmm7, ymm8, 1
+ vextractf128 xmm8, ymm6, 1
call [rsi+0x18]<unknown method> ; gcrRegs -[rcx] ; gcr arg pop 0
- vinsertf128 ymm8, ymm8, xmm7, 1 - vextracti128 xmm0, ymm8, 1
+ vinsertf128 ymm6, ymm6, xmm8, 1 + vextracti128 xmm0, ymm6, 1
vpextrd r8d, xmm0, 1 mov edx, 5 mov rcx, gword ptr [rsi+0x08] ; gcrRegs +[rcx]
- vextractf128 xmm7, ymm8, 1
+ vextractf128 xmm8, ymm6, 1
call [rsi+0x18]<unknown method> ; gcrRegs -[rcx] ; gcr arg pop 0
- vinsertf128 ymm8, ymm8, xmm7, 1 - vextracti128 xmm0, ymm8, 1
+ vinsertf128 ymm6, ymm6, xmm8, 1 + vextracti128 xmm0, ymm6, 1
vpextrd r8d, xmm0, 2 mov edx, 6 mov rcx, gword ptr [rsi+0x08] ; gcrRegs +[rcx]
- vextractf128 xmm7, ymm8, 1
+ vextractf128 xmm8, ymm6, 1
call [rsi+0x18]<unknown method> ; gcrRegs -[rcx] ; gcr arg pop 0
- vinsertf128 ymm8, ymm8, xmm7, 1 - vextracti128 xmm0, ymm8, 1
+ vinsertf128 ymm6, ymm6, xmm8, 1 + vextracti128 xmm0, ymm6, 1
vpextrd r8d, xmm0, 3 mov edx, 7 mov rcx, gword ptr [rsi+0x08] @@ -307,7 +305,7 @@ G_M21446_IG04: ; bbWeight=1, extend ; gcrRegs -[rcx rsi] ; gcr arg pop 0 nop
- ;; size=173 bbWeight=1 PerfScore 66.75
+ ;; size=166 bbWeight=1 PerfScore 64.75
G_M21446_IG05: ; bbWeight=1, epilog, nogc, extend vmovaps xmm6, xmmword ptr [rsp+0x90] vmovaps xmm7, xmmword ptr [rsp+0x80] @@ -328,7 +326,7 @@ G_M21446_IG06: ; bbWeight=0, gcVars=0000000000000000 {}, gcrefRegs=0000 { int3 ;; size=7 bbWeight=0 PerfScore 0.00
-; Total bytes of code 973, prolog size 67, PerfScore 325.25, instruction count 194, allocated bytes for code 973 (MethodHash=8544ac39) for method System.Numerics.Tests.GenericVectorTests:TestConditionalSelect[uint]():this (FullOpts)
+; Total bytes of code 956, prolog size 67, PerfScore 322.92, instruction count 192, allocated bytes for code 956 (MethodHash=8544ac39) for method System.Numerics.Tests.GenericVectorTests:TestConditionalSelect[uint]():this (FullOpts)
; ============================================================ Unwind Info:
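
The recurring change in the diff above replaces a `vpmovm2d` + `vpternlogd ..., -54` pair with a single masked `vpblendmd`. The following is a minimal sketch, not JIT code, that models both sequences on plain Python integers to show why they select the same lanes. It assumes the usual truth-table reading of `vpternlogd` (each result bit is looked up in the immediate using the destination, first-source, and second-source bits as a 3-bit index); the function names are hypothetical.

```python
LANE_BITS = 32
ALL_ONES = (1 << LANE_BITS) - 1


def ternlog(dst, src1, src2, imm8):
    """Model one vpternlogd lane: result bit = imm8[(dst << 2) | (src1 << 1) | src2]."""
    out = 0
    for i in range(LANE_BITS):
        idx = (((dst >> i) & 1) << 2) | (((src1 >> i) & 1) << 1) | ((src2 >> i) & 1)
        out |= ((imm8 >> idx) & 1) << i
    return out


def movm2d(k, lanes):
    """Model vpmovm2d: expand each mask bit to an all-ones or all-zeros lane."""
    return [ALL_ONES if (k >> i) & 1 else 0 for i in range(lanes)]


def base_sequence(k, a, b):
    """vpmovm2d tmp, k ; vpternlogd tmp, a, b, -54  (imm8 0xCA means dst ? src1 : src2)."""
    imm8 = -54 & 0xFF  # the disassembly prints the immediate as a signed byte
    return [ternlog(m, x, y, imm8) for m, x, y in zip(movm2d(k, len(a)), a, b)]


def diff_sequence(k, a, b):
    """vpblendmd dst {k}, b, a: take the lane from a where the mask bit is set, else from b."""
    return [x if (k >> i) & 1 else y for i, (x, y) in enumerate(zip(a, b))]
```

Because the expanded mask is all-ones or all-zeros per lane, the bitwise select collapses to a per-lane select, which is what the masked blend computes directly from `k1`.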

smoke_tests.nativeaot.windows.x64.checked.mch

-28 (-2.52%) : 19903.dasm - System.Buffers.ProbabilisticMap:IndexOfAnyVectorized(byref,byref,int,System.ReadOnlySpan`1[ushort]):int (FullOpts)

@@ -185,12 +185,11 @@ G_M48875_IG06: ; bbWeight=4, gcrefRegs=0000 {}, byrefRegs=C0C8 {rbx rsi r
  vpshufb ymm1, ymm3, ymm1
  vpand ymm2, ymm2, ymmword ptr [reloc @RWD96]
  vpcmpub k1, ymm2, ymmword ptr [reloc @RWD128], 6
- vpmovm2b ymm3, k1
- vpsubb ymm4, ymm2, ymmword ptr [reloc @RWD160]
- vpshufb ymm4, ymm9, ymm4
+ vpsubb ymm3, ymm2, ymmword ptr [reloc @RWD160]
+ vpshufb ymm3, ymm9, ymm3
  vpshufb ymm2, ymm8, ymm2
- vpternlogd ymm3, ymm4, ymm2, -54
- vpand ymm1, ymm3, ymm1
+ vpblendmb ymm2 {k1}, ymm2, ymm3
+ vpand ymm1, ymm2, ymm1
  vxorps ymm2, ymm2, ymm2
  vpcmpeqb ymm1, ymm1, ymm2
  vpcmpeqd ymm2, ymm2, ymm2
@@ -201,12 +200,11 @@ G_M48875_IG06: ; bbWeight=4, gcrefRegs=0000 {}, byrefRegs=C0C8 {rbx rsi r
  vpshufb ymm2, ymm3, ymm2
  vpand ymm0, ymm0, ymmword ptr [reloc @RWD96]
  vpcmpub k1, ymm0, ymmword ptr [reloc @RWD128], 6
- vpmovm2b ymm3, k1
- vpsubb ymm4, ymm0, ymmword ptr [reloc @RWD160]
- vpshufb ymm4, ymm9, ymm4
+ vpsubb ymm3, ymm0, ymmword ptr [reloc @RWD160]
+ vpshufb ymm3, ymm9, ymm3
  vpshufb ymm0, ymm8, ymm0
- vpternlogd ymm3, ymm4, ymm0, -54
- vpand ymm0, ymm3, ymm2
+ vpblendmb ymm0 {k1}, ymm0, ymm3
+ vpand ymm0, ymm0, ymm2
  vxorps ymm2, ymm2, ymm2
  vpcmpeqb ymm0, ymm0, ymm2
  vpcmpeqd ymm2, ymm2, ymm2
@@ -214,7 +212,7 @@ G_M48875_IG06: ; bbWeight=4, gcrefRegs=0000 {}, byrefRegs=C0C8 {rbx rsi r
  vpand ymm0, ymm1, ymm0
  vptest ymm0, ymm0
  je SHORT G_M48875_IG09
- ;; size=248 bbWeight=4 PerfScore 317.33
+ ;; size=234 bbWeight=4 PerfScore 313.33
 G_M48875_IG07: ; bbWeight=2, gcrefRegs=0000 {}, byrefRegs=C0C8 {rbx rsi rdi r14 r15}, byref
  vpermq ymm0, ymm0, -40
  vpmovmskb ebp, ymm0
@@ -337,12 +335,11 @@ G_M48875_IG16: ; bbWeight=4, gcrefRegs=0000 {}, byrefRegs=C0C8 {rbx rsi r
  vpshufb xmm1, xmm10, xmm1
  vpand xmm2, xmm2, xmm11
  vpcmpub k1, xmm2, xmm12, 6
- vpmovm2b xmm3, k1
- vpsubb xmm4, xmm2, xmm13
- vpshufb xmm4, xmm7, xmm4
+ vpsubb xmm3, xmm2, xmm13
+ vpshufb xmm3, xmm7, xmm3
  vpshufb xmm2, xmm6, xmm2
- vpternlogd xmm3, xmm4, xmm2, -54
- vpand xmm1, xmm3, xmm1
+ vpblendmb xmm2 {k1}, xmm2, xmm3
+ vpand xmm1, xmm2, xmm1
  vxorps xmm2, xmm2, xmm2
  vpcmpeqb xmm1, xmm1, xmm2
  vpcmpeqd xmm2, xmm2, xmm2
@@ -352,12 +349,11 @@ G_M48875_IG16: ; bbWeight=4, gcrefRegs=0000 {}, byrefRegs=C0C8 {rbx rsi r
  vpshufb xmm2, xmm10, xmm2
  vpand xmm0, xmm0, xmm11
  vpcmpub k1, xmm0, xmm12, 6
- vpmovm2b xmm3, k1
- vpsubb xmm4, xmm0, xmm13
- vpshufb xmm4, xmm7, xmm4
+ vpsubb xmm3, xmm0, xmm13
+ vpshufb xmm3, xmm7, xmm3
  vpshufb xmm0, xmm6, xmm0
- vpternlogd xmm3, xmm4, xmm0, -54
- vpand xmm0, xmm3, xmm2
+ vpblendmb xmm0 {k1}, xmm0, xmm3
+ vpand xmm0, xmm0, xmm2
  vxorps xmm2, xmm2, xmm2
  vpcmpeqb xmm0, xmm0, xmm2
  vpcmpeqd xmm2, xmm2, xmm2
@@ -365,7 +361,7 @@ G_M48875_IG16: ; bbWeight=4, gcrefRegs=0000 {}, byrefRegs=C0C8 {rbx rsi r
  vpand xmm0, xmm1, xmm0
  vptest xmm0, xmm0
  je SHORT G_M48875_IG19
- ;; size=200 bbWeight=4 PerfScore 168.00
+ ;; size=186 bbWeight=4 PerfScore 164.00
 G_M48875_IG17: ; bbWeight=2, gcrefRegs=0000 {}, byrefRegs=C0C8 {rbx rsi rdi r14 r15}, byref
  vpmovmskb ebp, xmm0
  ;; size=4 bbWeight=2 PerfScore 4.00
@@ -432,7 +428,7 @@ RWD128 dq 0F0F0F0F0F0F0F0Fh, 0F0F0F0F0F0F0F0Fh, 0F0F0F0F0F0F0F0Fh, 0F0F0F0F0F0F
 RWD160 dq 1010101010101010h, 1010101010101010h, 1010101010101010h, 1010101010101010h
-; Total bytes of code 1109, prolog size 86, PerfScore 1189.83, instruction count 240, allocated bytes for code 1109 (MethodHash=36e94114) for method System.Buffers.ProbabilisticMap:IndexOfAnyVectorized(byref,byref,int,System.ReadOnlySpan`1[ushort]):int (FullOpts)
+; Total bytes of code 1081, prolog size 86, PerfScore 1181.83, instruction count 236, allocated bytes for code 1081 (MethodHash=36e94114) for method System.Buffers.ProbabilisticMap:IndexOfAnyVectorized(byref,byref,int,System.ReadOnlySpan`1[ushort]):int (FullOpts)
 ; ============================================================
 Unwind Info:
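
As a quick cross-check of the `IndexOfAnyVectorized` diff above, the per-block `size=` and `PerfScore` deltas should add up to the method-level totals in the final `Total bytes of code` lines. The sketch below copies the figures from the diff; the variable names are illustrative only.

```python
# (base, diff) pairs copied from the ";; size=... PerfScore ..." lines above.
block_sizes = {"G_M48875_IG06": (248, 234), "G_M48875_IG16": (200, 186)}
block_perf = {"G_M48875_IG06": (317.33, 313.33), "G_M48875_IG16": (168.00, 164.00)}

# Method-level totals from the "; Total bytes of code" lines.
method_size = (1109, 1081)
method_perf = (1189.83, 1181.83)

size_delta_blocks = sum(diff - base for base, diff in block_sizes.values())
size_delta_method = method_size[1] - method_size[0]

perf_delta_blocks = round(sum(diff - base for base, diff in block_perf.values()), 2)
perf_delta_method = round(method_perf[1] - method_perf[0], 2)

print(size_delta_blocks, size_delta_method)  # both -28
print(perf_delta_blocks, perf_delta_method)  # both -8.0
```

Only the two loop bodies (`IG06` and `IG16`, each with `bbWeight=4`) changed, so they account for the whole 28-byte reduction.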

Details

Improvements/regressions per collection

| Collection | Contexts with diffs | Improvements | Regressions | Same size | Improvements (bytes) | Regressions (bytes) |
|---|--:|--:|--:|--:|--:|--:|
| benchmarks.run.windows.x64.checked.mch | 1 | 1 | 0 | 0 | -28 | +0 |
| benchmarks.run_pgo.windows.x64.checked.mch | 0 | 0 | 0 | 0 | -0 | +0 |
| benchmarks.run_tiered.windows.x64.checked.mch | 0 | 0 | 0 | 0 | -0 | +0 |
| coreclr_tests.run.windows.x64.checked.mch | 16 | 16 | 0 | 0 | -528 | +0 |
| libraries.crossgen2.windows.x64.checked.mch | 0 | 0 | 0 | 0 | -0 | +0 |
| libraries.pmi.windows.x64.checked.mch | 24 | 24 | 0 | 0 | -499 | +0 |
| libraries_tests.run.windows.x64.Release.mch | 29 | 29 | 0 | 0 | -1,014 | +0 |
| librariestestsnotieredcompilation.run.windows.x64.Release.mch | 7 | 7 | 0 | 0 | -759 | +0 |
| realworld.run.windows.x64.checked.mch | 0 | 0 | 0 | 0 | -0 | +0 |
| smoke_tests.nativeaot.windows.x64.checked.mch | 1 | 1 | 0 | 0 | -28 | +0 |
| | 78 | 78 | 0 | 0 | -2,856 | +0 |
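
The unlabeled last row of the table above is the sum over all collections. A small sketch, with the per-collection figures copied from the table (the dictionary layout is an illustrative choice), confirms the totals:

```python
# collection -> (contexts with diffs, improvement bytes), copied from the table above
per_collection = {
    "benchmarks.run": (1, -28),
    "benchmarks.run_pgo": (0, 0),
    "benchmarks.run_tiered": (0, 0),
    "coreclr_tests.run": (16, -528),
    "libraries.crossgen2": (0, 0),
    "libraries.pmi": (24, -499),
    "libraries_tests.run": (29, -1014),
    "librariestestsnotieredcompilation.run": (7, -759),
    "realworld.run": (0, 0),
    "smoke_tests.nativeaot": (1, -28),
}

total_contexts = sum(contexts for contexts, _ in per_collection.values())
total_bytes = sum(nbytes for _, nbytes in per_collection.values())
print(total_contexts, total_bytes)  # 78 -2856
```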

Context information

| Collection | Diffed contexts | MinOpts | FullOpts | Missed, base | Missed, diff |
|---|--:|--:|--:|--:|--:|
| benchmarks.run.windows.x64.checked.mch | 27,854 | 4 | 27,850 | 232 (0.83%) | 232 (0.83%) |
| benchmarks.run_pgo.windows.x64.checked.mch | 101,589 | 49,789 | 51,800 | 129 (0.13%) | 129 (0.13%) |
| benchmarks.run_tiered.windows.x64.checked.mch | 54,309 | 36,842 | 17,467 | 76 (0.14%) | 76 (0.14%) |
| coreclr_tests.run.windows.x64.checked.mch | 573,547 | 340,982 | 232,565 | 442 (0.08%) | 442 (0.08%) |
| libraries.crossgen2.windows.x64.checked.mch | 243,425 | 15 | 243,410 | 0 (0.00%) | 0 (0.00%) |
| libraries.pmi.windows.x64.checked.mch | 306,302 | 6 | 306,296 | 2,196 (0.71%) | 2,196 (0.71%) |
| libraries_tests.run.windows.x64.Release.mch | 672,162 | 479,203 | 192,959 | 1,125 (0.17%) | 1,125 (0.17%) |
| librariestestsnotieredcompilation.run.windows.x64.Release.mch | 318,324 | 21,885 | 296,439 | 2,187 (0.68%) | 2,187 (0.68%) |
| realworld.run.windows.x64.checked.mch | 36,492 | 3 | 36,489 | 398 (1.08%) | 398 (1.08%) |
| smoke_tests.nativeaot.windows.x64.checked.mch | 32,409 | 11 | 32,398 | 3 (0.01%) | 3 (0.01%) |
| | 2,366,413 | 928,740 | 1,437,673 | 6,788 (0.29%) | 6,788 (0.29%) |

jit-analyze output

benchmarks.run.windows.x64.checked.mch

To reproduce these diffs on Windows x64: superpmi.py asmdiffs -target_os windows -target_arch x64 -arch x64


Summary of Code Size diffs:
(Lower is better)

Total bytes of base: 8538947 (overridden on cmd)
Total bytes of diff: 8538919 (overridden on cmd)
Total bytes of delta: -28 (-0.00 % of base)
    diff is an improvement.
    relative diff is an improvement.
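
The `-0.00 % of base` figure is a rounding effect: the delta is nonzero but far below one hundredth of a percent of the collection's total code size. A minimal sketch, assuming the percentage is simply `delta / base` printed with two decimals (the helper name `format_delta` is hypothetical and not part of jit-analyze):

```python
def format_delta(base_bytes, diff_bytes):
    """Format a code-size delta the way the summary lines above display it."""
    delta = diff_bytes - base_bytes
    pct = 100.0 * delta / base_bytes
    return f"{delta} ({pct:.2f} % of base)"


# Figures from the benchmarks.run.windows.x64.checked.mch summary above.
summary = format_delta(8538947, 8538919)
print(summary)  # -28 (-0.00 % of base)
```

The per-method percentages further down (for example `-2.50 % of base`) are computed against a single method's size, which is why they are visible while the collection-level figure rounds to zero.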

Detail diffs



Top file improvements (bytes):
         -28 : 24358.dasm (-2.50 % of base)

1 total files with Code Size differences (1 improved, 0 regressed), 0 unchanged.

Top method improvements (bytes):
         -28 (-2.50 % of base) : 24358.dasm - System.Buffers.ProbabilisticMap:IndexOfAnyVectorized(byref,byref,int,System.ReadOnlySpan`1[ushort]):int (FullOpts)

Top method improvements (percentages):
         -28 (-2.50 % of base) : 24358.dasm - System.Buffers.ProbabilisticMap:IndexOfAnyVectorized(byref,byref,int,System.ReadOnlySpan`1[ushort]):int (FullOpts)

1 total methods with Code Size differences (1 improved, 0 regressed).


coreclr_tests.run.windows.x64.checked.mch

To reproduce these diffs on Windows x64: superpmi.py asmdiffs -target_os windows -target_arch x64 -arch x64


Summary of Code Size diffs:
(Lower is better)

Total bytes of base: 393235161 (overridden on cmd)
Total bytes of diff: 393234633 (overridden on cmd)
Total bytes of delta: -528 (-0.00 % of base)
    diff is an improvement.
    relative diff is an improvement.

Detail diffs



Top file improvements (bytes):
         -56 : 174838.dasm (-2.83 % of base)
         -56 : 174841.dasm (-2.86 % of base)
         -56 : 174837.dasm (-2.83 % of base)
         -56 : 174842.dasm (-2.82 % of base)
         -32 : 429091.dasm (-0.77 % of base)
         -32 : 429094.dasm (-0.78 % of base)
         -32 : 429095.dasm (-0.77 % of base)
         -32 : 429090.dasm (-0.77 % of base)
         -28 : 174840.dasm (-1.42 % of base)
         -28 : 174835.dasm (-1.46 % of base)
         -28 : 174836.dasm (-1.43 % of base)
         -28 : 174839.dasm (-1.42 % of base)
         -16 : 429089.dasm (-0.39 % of base)
         -16 : 429092.dasm (-0.39 % of base)
         -16 : 429093.dasm (-0.39 % of base)
         -16 : 429088.dasm (-0.39 % of base)

16 total files with Code Size differences (16 improved, 0 regressed), 0 unchanged.

Top method improvements (bytes):
         -56 (-2.83 % of base) : 174838.dasm - VectorTest+VectorRelopTest`1[ubyte]:VectorRelOp(ubyte,ubyte):int (FullOpts)
         -56 (-2.86 % of base) : 174841.dasm - VectorTest+VectorRelopTest`1[uint]:VectorRelOp(uint,uint):int (FullOpts)
         -56 (-2.82 % of base) : 174842.dasm - VectorTest+VectorRelopTest`1[ulong]:VectorRelOp(ulong,ulong):int (FullOpts)
         -56 (-2.83 % of base) : 174837.dasm - VectorTest+VectorRelopTest`1[ushort]:VectorRelOp(ushort,ushort):int (FullOpts)
         -32 (-0.77 % of base) : 429091.dasm - VectorTest+VectorRelopTest`1[ubyte]:VectorRelOp(ubyte,ubyte):int (Instrumented Tier0)
         -32 (-0.78 % of base) : 429094.dasm - VectorTest+VectorRelopTest`1[uint]:VectorRelOp(uint,uint):int (Instrumented Tier0)
         -32 (-0.77 % of base) : 429095.dasm - VectorTest+VectorRelopTest`1[ulong]:VectorRelOp(ulong,ulong):int (Instrumented Tier0)
         -32 (-0.77 % of base) : 429090.dasm - VectorTest+VectorRelopTest`1[ushort]:VectorRelOp(ushort,ushort):int (Instrumented Tier0)
         -28 (-1.42 % of base) : 174840.dasm - VectorTest+VectorRelopTest`1[byte]:VectorRelOp(byte,byte):int (FullOpts)
         -28 (-1.46 % of base) : 174835.dasm - VectorTest+VectorRelopTest`1[int]:VectorRelOp(int,int):int (FullOpts)
         -28 (-1.43 % of base) : 174836.dasm - VectorTest+VectorRelopTest`1[long]:VectorRelOp(long,long):int (FullOpts)
         -28 (-1.42 % of base) : 174839.dasm - VectorTest+VectorRelopTest`1[short]:VectorRelOp(short,short):int (FullOpts)
         -16 (-0.39 % of base) : 429093.dasm - VectorTest+VectorRelopTest`1[byte]:VectorRelOp(byte,byte):int (Instrumented Tier0)
         -16 (-0.39 % of base) : 429088.dasm - VectorTest+VectorRelopTest`1[int]:VectorRelOp(int,int):int (Instrumented Tier0)
         -16 (-0.39 % of base) : 429089.dasm - VectorTest+VectorRelopTest`1[long]:VectorRelOp(long,long):int (Instrumented Tier0)
         -16 (-0.39 % of base) : 429092.dasm - VectorTest+VectorRelopTest`1[short]:VectorRelOp(short,short):int (Instrumented Tier0)

Top method improvements (percentages):
         -56 (-2.86 % of base) : 174841.dasm - VectorTest+VectorRelopTest`1[uint]:VectorRelOp(uint,uint):int (FullOpts)
         -56 (-2.83 % of base) : 174838.dasm - VectorTest+VectorRelopTest`1[ubyte]:VectorRelOp(ubyte,ubyte):int (FullOpts)
         -56 (-2.83 % of base) : 174837.dasm - VectorTest+VectorRelopTest`1[ushort]:VectorRelOp(ushort,ushort):int (FullOpts)
         -56 (-2.82 % of base) : 174842.dasm - VectorTest+VectorRelopTest`1[ulong]:VectorRelOp(ulong,ulong):int (FullOpts)
         -28 (-1.46 % of base) : 174835.dasm - VectorTest+VectorRelopTest`1[int]:VectorRelOp(int,int):int (FullOpts)
         -28 (-1.43 % of base) : 174836.dasm - VectorTest+VectorRelopTest`1[long]:VectorRelOp(long,long):int (FullOpts)
         -28 (-1.42 % of base) : 174840.dasm - VectorTest+VectorRelopTest`1[byte]:VectorRelOp(byte,byte):int (FullOpts)
         -28 (-1.42 % of base) : 174839.dasm - VectorTest+VectorRelopTest`1[short]:VectorRelOp(short,short):int (FullOpts)
         -32 (-0.78 % of base) : 429094.dasm - VectorTest+VectorRelopTest`1[uint]:VectorRelOp(uint,uint):int (Instrumented Tier0)
         -32 (-0.77 % of base) : 429095.dasm - VectorTest+VectorRelopTest`1[ulong]:VectorRelOp(ulong,ulong):int (Instrumented Tier0)
         -32 (-0.77 % of base) : 429091.dasm - VectorTest+VectorRelopTest`1[ubyte]:VectorRelOp(ubyte,ubyte):int (Instrumented Tier0)
         -32 (-0.77 % of base) : 429090.dasm - VectorTest+VectorRelopTest`1[ushort]:VectorRelOp(ushort,ushort):int (Instrumented Tier0)
         -16 (-0.39 % of base) : 429088.dasm - VectorTest+VectorRelopTest`1[int]:VectorRelOp(int,int):int (Instrumented Tier0)
         -16 (-0.39 % of base) : 429089.dasm - VectorTest+VectorRelopTest`1[long]:VectorRelOp(long,long):int (Instrumented Tier0)
         -16 (-0.39 % of base) : 429093.dasm - VectorTest+VectorRelopTest`1[byte]:VectorRelOp(byte,byte):int (Instrumented Tier0)
         -16 (-0.39 % of base) : 429092.dasm - VectorTest+VectorRelopTest`1[short]:VectorRelOp(short,short):int (Instrumented Tier0)

16 total methods with Code Size differences (16 improved, 0 regressed).


libraries.pmi.windows.x64.checked.mch

To reproduce these diffs on Windows x64: superpmi.py asmdiffs -target_os windows -target_arch x64 -arch x64


Summary of Code Size diffs:
(Lower is better)

Total bytes of base: 60138217 (overridden on cmd)
Total bytes of diff: 60137718 (overridden on cmd)
Total bytes of delta: -499 (-0.00 % of base)
    diff is an improvement.
    relative diff is an improvement.

Detail diffs



Top file improvements (bytes):
         -56 : 294041.dasm (-20.66 % of base)
         -56 : 293984.dasm (-20.66 % of base)
         -32 : 27695.dasm (-10.85 % of base)
         -28 : 293986.dasm (-16.28 % of base)
         -28 : 27702.dasm (-2.50 % of base)
         -28 : 294043.dasm (-16.28 % of base)
         -26 : 27697.dasm (-8.78 % of base)
         -21 : 293983.dasm (-20.39 % of base)
         -21 : 294005.dasm (-20.39 % of base)
         -21 : 294040.dasm (-20.39 % of base)
         -21 : 294062.dasm (-20.39 % of base)
         -14 : 293982.dasm (-16.28 % of base)
         -14 : 294042.dasm (-13.73 % of base)
         -14 : 293985.dasm (-13.73 % of base)
         -14 : 294003.dasm (-16.87 % of base)
         -14 : 294038.dasm (-16.87 % of base)
         -14 : 294060.dasm (-16.87 % of base)
         -14 : 294061.dasm (-16.28 % of base)
         -14 : 293981.dasm (-16.87 % of base)
         -14 : 294004.dasm (-16.28 % of base)

24 total files with Code Size differences (24 improved, 0 regressed), 0 unchanged.

Top method improvements (bytes):
         -56 (-20.66 % of base) : 293984.dasm - System.Numerics.Tensors.TensorPrimitives+MaxMagnitudeOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector128`1[ubyte]):ubyte (FullOpts)
         -56 (-20.66 % of base) : 294041.dasm - System.Numerics.Tensors.TensorPrimitives+MinMagnitudeOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector128`1[ubyte]):ubyte (FullOpts)
         -32 (-10.85 % of base) : 27695.dasm - System.Buffers.ProbabilisticMap:ContainsMask32CharsAvx2(System.Runtime.Intrinsics.Vector256`1[ubyte],System.Runtime.Intrinsics.Vector256`1[ubyte],byref):System.Runtime.Intrinsics.Vector256`1[ubyte] (FullOpts)
         -28 (-2.50 % of base) : 27702.dasm - System.Buffers.ProbabilisticMap:IndexOfAnyVectorized(byref,byref,int,System.ReadOnlySpan`1[ushort]):int (FullOpts)
         -28 (-16.28 % of base) : 293986.dasm - System.Numerics.Tensors.TensorPrimitives+MaxMagnitudeOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector512`1[ubyte]):ubyte (FullOpts)
         -28 (-16.28 % of base) : 294043.dasm - System.Numerics.Tensors.TensorPrimitives+MinMagnitudeOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector512`1[ubyte]):ubyte (FullOpts)
         -26 (-8.78 % of base) : 27697.dasm - System.Buffers.ProbabilisticMap:ContainsMask16Chars(System.Runtime.Intrinsics.Vector128`1[ubyte],System.Runtime.Intrinsics.Vector128`1[ubyte],byref):System.Runtime.Intrinsics.Vector128`1[ubyte] (FullOpts)
         -21 (-20.39 % of base) : 293983.dasm - System.Numerics.Tensors.TensorPrimitives+MaxMagnitudeOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector512`1[ubyte],System.Runtime.Intrinsics.Vector512`1[ubyte]):System.Runtime.Intrinsics.Vector512`1[ubyte] (FullOpts)
         -21 (-20.39 % of base) : 294005.dasm - System.Numerics.Tensors.TensorPrimitives+MaxMagnitudePropagateNaNOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector512`1[ubyte],System.Runtime.Intrinsics.Vector512`1[ubyte]):System.Runtime.Intrinsics.Vector512`1[ubyte] (FullOpts)
         -21 (-20.39 % of base) : 294040.dasm - System.Numerics.Tensors.TensorPrimitives+MinMagnitudeOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector512`1[ubyte],System.Runtime.Intrinsics.Vector512`1[ubyte]):System.Runtime.Intrinsics.Vector512`1[ubyte] (FullOpts)
         -21 (-20.39 % of base) : 294062.dasm - System.Numerics.Tensors.TensorPrimitives+MinMagnitudePropagateNaNOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector512`1[ubyte],System.Runtime.Intrinsics.Vector512`1[ubyte]):System.Runtime.Intrinsics.Vector512`1[ubyte] (FullOpts)
         -14 (-16.87 % of base) : 293981.dasm - System.Numerics.Tensors.TensorPrimitives+MaxMagnitudeOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector128`1[ubyte],System.Runtime.Intrinsics.Vector128`1[ubyte]):System.Runtime.Intrinsics.Vector128`1[ubyte] (FullOpts)
         -14 (-13.73 % of base) : 293985.dasm - System.Numerics.Tensors.TensorPrimitives+MaxMagnitudeOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector256`1[ubyte]):ubyte (FullOpts)
         -14 (-16.28 % of base) : 293982.dasm - System.Numerics.Tensors.TensorPrimitives+MaxMagnitudeOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector256`1[ubyte],System.Runtime.Intrinsics.Vector256`1[ubyte]):System.Runtime.Intrinsics.Vector256`1[ubyte] (FullOpts)
         -14 (-16.87 % of base) : 294003.dasm - System.Numerics.Tensors.TensorPrimitives+MaxMagnitudePropagateNaNOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector128`1[ubyte],System.Runtime.Intrinsics.Vector128`1[ubyte]):System.Runtime.Intrinsics.Vector128`1[ubyte] (FullOpts)
         -14 (-16.28 % of base) : 294004.dasm - System.Numerics.Tensors.TensorPrimitives+MaxMagnitudePropagateNaNOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector256`1[ubyte],System.Runtime.Intrinsics.Vector256`1[ubyte]):System.Runtime.Intrinsics.Vector256`1[ubyte] (FullOpts)
         -14 (-16.87 % of base) : 294038.dasm - System.Numerics.Tensors.TensorPrimitives+MinMagnitudeOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector128`1[ubyte],System.Runtime.Intrinsics.Vector128`1[ubyte]):System.Runtime.Intrinsics.Vector128`1[ubyte] (FullOpts)
         -14 (-13.73 % of base) : 294042.dasm - System.Numerics.Tensors.TensorPrimitives+MinMagnitudeOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector256`1[ubyte]):ubyte (FullOpts)
         -14 (-16.28 % of base) : 294039.dasm - System.Numerics.Tensors.TensorPrimitives+MinMagnitudeOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector256`1[ubyte],System.Runtime.Intrinsics.Vector256`1[ubyte]):System.Runtime.Intrinsics.Vector256`1[ubyte] (FullOpts)
         -14 (-16.87 % of base) : 294060.dasm - System.Numerics.Tensors.TensorPrimitives+MinMagnitudePropagateNaNOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector128`1[ubyte],System.Runtime.Intrinsics.Vector128`1[ubyte]):System.Runtime.Intrinsics.Vector128`1[ubyte] (FullOpts)

Top method improvements (percentages):
         -56 (-20.66 % of base) : 293984.dasm - System.Numerics.Tensors.TensorPrimitives+MaxMagnitudeOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector128`1[ubyte]):ubyte (FullOpts)
         -56 (-20.66 % of base) : 294041.dasm - System.Numerics.Tensors.TensorPrimitives+MinMagnitudeOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector128`1[ubyte]):ubyte (FullOpts)
         -21 (-20.39 % of base) : 293983.dasm - System.Numerics.Tensors.TensorPrimitives+MaxMagnitudeOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector512`1[ubyte],System.Runtime.Intrinsics.Vector512`1[ubyte]):System.Runtime.Intrinsics.Vector512`1[ubyte] (FullOpts)
         -21 (-20.39 % of base) : 294005.dasm - System.Numerics.Tensors.TensorPrimitives+MaxMagnitudePropagateNaNOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector512`1[ubyte],System.Runtime.Intrinsics.Vector512`1[ubyte]):System.Runtime.Intrinsics.Vector512`1[ubyte] (FullOpts)
         -21 (-20.39 % of base) : 294040.dasm - System.Numerics.Tensors.TensorPrimitives+MinMagnitudeOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector512`1[ubyte],System.Runtime.Intrinsics.Vector512`1[ubyte]):System.Runtime.Intrinsics.Vector512`1[ubyte] (FullOpts)
         -21 (-20.39 % of base) : 294062.dasm - System.Numerics.Tensors.TensorPrimitives+MinMagnitudePropagateNaNOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector512`1[ubyte],System.Runtime.Intrinsics.Vector512`1[ubyte]):System.Runtime.Intrinsics.Vector512`1[ubyte] (FullOpts)
         -14 (-16.87 % of base) : 293981.dasm - System.Numerics.Tensors.TensorPrimitives+MaxMagnitudeOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector128`1[ubyte],System.Runtime.Intrinsics.Vector128`1[ubyte]):System.Runtime.Intrinsics.Vector128`1[ubyte] (FullOpts)
         -14 (-16.87 % of base) : 294003.dasm - System.Numerics.Tensors.TensorPrimitives+MaxMagnitudePropagateNaNOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector128`1[ubyte],System.Runtime.Intrinsics.Vector128`1[ubyte]):System.Runtime.Intrinsics.Vector128`1[ubyte] (FullOpts)
         -14 (-16.87 % of base) : 294038.dasm - System.Numerics.Tensors.TensorPrimitives+MinMagnitudeOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector128`1[ubyte],System.Runtime.Intrinsics.Vector128`1[ubyte]):System.Runtime.Intrinsics.Vector128`1[ubyte] (FullOpts)
         -14 (-16.87 % of base) : 294060.dasm - System.Numerics.Tensors.TensorPrimitives+MinMagnitudePropagateNaNOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector128`1[ubyte],System.Runtime.Intrinsics.Vector128`1[ubyte]):System.Runtime.Intrinsics.Vector128`1[ubyte] (FullOpts)
         -14 (-16.28 % of base) : 293982.dasm - System.Numerics.Tensors.TensorPrimitives+MaxMagnitudeOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector256`1[ubyte],System.Runtime.Intrinsics.Vector256`1[ubyte]):System.Runtime.Intrinsics.Vector256`1[ubyte] (FullOpts)
         -28 (-16.28 % of base) : 293986.dasm - System.Numerics.Tensors.TensorPrimitives+MaxMagnitudeOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector512`1[ubyte]):ubyte (FullOpts)
         -14 (-16.28 % of base) : 294004.dasm - System.Numerics.Tensors.TensorPrimitives+MaxMagnitudePropagateNaNOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector256`1[ubyte],System.Runtime.Intrinsics.Vector256`1[ubyte]):System.Runtime.Intrinsics.Vector256`1[ubyte] (FullOpts)
         -14 (-16.28 % of base) : 294039.dasm - System.Numerics.Tensors.TensorPrimitives+MinMagnitudeOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector256`1[ubyte],System.Runtime.Intrinsics.Vector256`1[ubyte]):System.Runtime.Intrinsics.Vector256`1[ubyte] (FullOpts)
         -28 (-16.28 % of base) : 294043.dasm - System.Numerics.Tensors.TensorPrimitives+MinMagnitudeOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector512`1[ubyte]):ubyte (FullOpts)
         -14 (-16.28 % of base) : 294061.dasm - System.Numerics.Tensors.TensorPrimitives+MinMagnitudePropagateNaNOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector256`1[ubyte],System.Runtime.Intrinsics.Vector256`1[ubyte]):System.Runtime.Intrinsics.Vector256`1[ubyte] (FullOpts)
         -14 (-13.73 % of base) : 293985.dasm - System.Numerics.Tensors.TensorPrimitives+MaxMagnitudeOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector256`1[ubyte]):ubyte (FullOpts)
         -14 (-13.73 % of base) : 294042.dasm - System.Numerics.Tensors.TensorPrimitives+MinMagnitudeOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector256`1[ubyte]):ubyte (FullOpts)
         -32 (-10.85 % of base) : 27695.dasm - System.Buffers.ProbabilisticMap:ContainsMask32CharsAvx2(System.Runtime.Intrinsics.Vector256`1[ubyte],System.Runtime.Intrinsics.Vector256`1[ubyte],byref):System.Runtime.Intrinsics.Vector256`1[ubyte] (FullOpts)
         -26 (-8.78 % of base) : 27697.dasm - System.Buffers.ProbabilisticMap:ContainsMask16Chars(System.Runtime.Intrinsics.Vector128`1[ubyte],System.Runtime.Intrinsics.Vector128`1[ubyte],byref):System.Runtime.Intrinsics.Vector128`1[ubyte] (FullOpts)

24 total methods with Code Size differences (24 improved, 0 regressed).


libraries_tests.run.windows.x64.Release.mch

To reproduce these diffs on Windows x64: superpmi.py asmdiffs -target_os windows -target_arch x64 -arch x64


Summary of Code Size diffs:
(Lower is better)

Total bytes of base: 278258792 (overridden on cmd)
Total bytes of diff: 278257778 (overridden on cmd)
Total bytes of delta: -1014 (-0.00 % of base)
    diff is an improvement.
    relative diff is an improvement.

Detail diffs



Top file improvements (bytes):
        -368 : 393123.dasm (-15.51 % of base)
         -84 : 393118.dasm (-10.81 % of base)
         -84 : 386805.dasm (-10.98 % of base)
         -84 : 393473.dasm (-11.02 % of base)
         -84 : 386605.dasm (-10.98 % of base)
         -32 : 342443.dasm (-10.85 % of base)
         -28 : 342446.dasm (-2.40 % of base)
         -26 : 342451.dasm (-8.78 % of base)
         -14 : 385680.dasm (-16.09 % of base)
         -14 : 386233.dasm (-16.67 % of base)
         -14 : 393193.dasm (-16.09 % of base)
         -14 : 385681.dasm (-16.09 % of base)
         -14 : 385965.dasm (-16.09 % of base)
         -14 : 393122.dasm (-16.09 % of base)
         -14 : 385964.dasm (-16.67 % of base)
         -14 : 393121.dasm (-16.09 % of base)
         -14 : 393190.dasm (-16.67 % of base)
         -14 : 393191.dasm (-16.67 % of base)
         -14 : 393192.dasm (-16.09 % of base)
          -7 : 342450.dasm (-5.79 % of base)

29 total files with Code Size differences (29 improved, 0 regressed), 0 unchanged.

Top method improvements (bytes):
        -368 (-15.51 % of base) : 393123.dasm - System.Numerics.Tensors.TensorPrimitives:<InvokeSpanSpanIntoSpan>g__Vectorized256|220_2[ulong,System.Numerics.Tensors.TensorPrimitives+MinMagnitudePropagateNaNOperator`1[ulong]](byref,byref,byref,ulong) (Tier1)
         -84 (-11.02 % of base) : 393473.dasm - System.Numerics.Tensors.TensorPrimitives:InvokeSpanSpanIntoSpan[ulong,System.Numerics.Tensors.TensorPrimitives+MaxMagnitudePropagateNaNOperator`1[ulong]](System.ReadOnlySpan`1[ulong],System.ReadOnlySpan`1[ulong],System.Span`1[ulong]) (Tier1)
         -84 (-10.98 % of base) : 386605.dasm - System.Numerics.Tensors.TensorPrimitives:InvokeSpanSpanIntoSpan[ulong,System.Numerics.Tensors.TensorPrimitives+MaxMagnitudePropagateNaNOperator`1[ulong]](System.ReadOnlySpan`1[ulong],System.ReadOnlySpan`1[ulong],System.Span`1[ulong]) (Tier1)
         -84 (-10.81 % of base) : 393118.dasm - System.Numerics.Tensors.TensorPrimitives:InvokeSpanSpanIntoSpan[ulong,System.Numerics.Tensors.TensorPrimitives+MinMagnitudePropagateNaNOperator`1[ulong]](System.ReadOnlySpan`1[ulong],System.ReadOnlySpan`1[ulong],System.Span`1[ulong]) (Tier1)
         -84 (-10.98 % of base) : 386805.dasm - System.Numerics.Tensors.TensorPrimitives:InvokeSpanSpanIntoSpan[ulong,System.Numerics.Tensors.TensorPrimitives+MinMagnitudePropagateNaNOperator`1[ulong]](System.ReadOnlySpan`1[ulong],System.ReadOnlySpan`1[ulong],System.Span`1[ulong]) (Tier1)
         -32 (-10.85 % of base) : 342443.dasm - System.Buffers.ProbabilisticMap:ContainsMask32CharsAvx2(System.Runtime.Intrinsics.Vector256`1[ubyte],System.Runtime.Intrinsics.Vector256`1[ubyte],byref):System.Runtime.Intrinsics.Vector256`1[ubyte] (Tier1)
         -28 (-2.40 % of base) : 342446.dasm - System.Buffers.ProbabilisticMap:IndexOfAnyVectorized(byref,byref,int,System.ReadOnlySpan`1[ushort]):int (Tier1)
         -26 (-8.78 % of base) : 342451.dasm - System.Buffers.ProbabilisticMap:ContainsMask16Chars(System.Runtime.Intrinsics.Vector128`1[ubyte],System.Runtime.Intrinsics.Vector128`1[ubyte],byref):System.Runtime.Intrinsics.Vector128`1[ubyte] (Tier1)
         -14 (-16.67 % of base) : 386233.dasm - System.Numerics.Tensors.TensorPrimitives+MaxMagnitudeOperator`1[ulong]:Invoke(System.Runtime.Intrinsics.Vector128`1[ulong],System.Runtime.Intrinsics.Vector128`1[ulong]):System.Runtime.Intrinsics.Vector128`1[ulong] (Tier1)
         -14 (-16.67 % of base) : 393191.dasm - System.Numerics.Tensors.TensorPrimitives+MaxMagnitudeOperator`1[ulong]:Invoke(System.Runtime.Intrinsics.Vector128`1[ulong],System.Runtime.Intrinsics.Vector128`1[ulong]):System.Runtime.Intrinsics.Vector128`1[ulong] (Tier1)
         -14 (-16.09 % of base) : 393193.dasm - System.Numerics.Tensors.TensorPrimitives+MaxMagnitudeOperator`1[ulong]:Invoke(System.Runtime.Intrinsics.Vector256`1[ulong],System.Runtime.Intrinsics.Vector256`1[ulong]):System.Runtime.Intrinsics.Vector256`1[ulong] (Tier1)
         -14 (-16.09 % of base) : 385681.dasm - System.Numerics.Tensors.TensorPrimitives+MaxMagnitudeOperator`1[ulong]:Invoke(System.Runtime.Intrinsics.Vector256`1[ulong],System.Runtime.Intrinsics.Vector256`1[ulong]):System.Runtime.Intrinsics.Vector256`1[ulong] (Tier1)
         -14 (-16.67 % of base) : 393190.dasm - System.Numerics.Tensors.TensorPrimitives+MaxMagnitudePropagateNaNOperator`1[ulong]:Invoke(System.Runtime.Intrinsics.Vector128`1[ulong],System.Runtime.Intrinsics.Vector128`1[ulong]):System.Runtime.Intrinsics.Vector128`1[ulong] (Tier1)
         -14 (-16.09 % of base) : 385680.dasm - System.Numerics.Tensors.TensorPrimitives+MaxMagnitudePropagateNaNOperator`1[ulong]:Invoke(System.Runtime.Intrinsics.Vector256`1[ulong],System.Runtime.Intrinsics.Vector256`1[ulong]):System.Runtime.Intrinsics.Vector256`1[ulong] (Tier1)
         -14 (-16.09 % of base) : 393192.dasm - System.Numerics.Tensors.TensorPrimitives+MaxMagnitudePropagateNaNOperator`1[ulong]:Invoke(System.Runtime.Intrinsics.Vector256`1[ulong],System.Runtime.Intrinsics.Vector256`1[ulong]):System.Runtime.Intrinsics.Vector256`1[ulong] (Tier1)
         -14 (-16.67 % of base) : 385964.dasm - System.Numerics.Tensors.TensorPrimitives+MinMagnitudeOperator`1[ulong]:Invoke(System.Runtime.Intrinsics.Vector128`1[ulong],System.Runtime.Intrinsics.Vector128`1[ulong]):System.Runtime.Intrinsics.Vector128`1[ulong] (Tier1)
         -14 (-16.09 % of base) : 385965.dasm - System.Numerics.Tensors.TensorPrimitives+MinMagnitudeOperator`1[ulong]:Invoke(System.Runtime.Intrinsics.Vector256`1[ulong],System.Runtime.Intrinsics.Vector256`1[ulong]):System.Runtime.Intrinsics.Vector256`1[ulong] (Tier1)
         -14 (-16.09 % of base) : 393122.dasm - System.Numerics.Tensors.TensorPrimitives+MinMagnitudeOperator`1[ulong]:Invoke(System.Runtime.Intrinsics.Vector256`1[ulong],System.Runtime.Intrinsics.Vector256`1[ulong]):System.Runtime.Intrinsics.Vector256`1[ulong] (Tier1)
         -14 (-16.09 % of base) : 393121.dasm - System.Numerics.Tensors.TensorPrimitives+MinMagnitudePropagateNaNOperator`1[ulong]:Invoke(System.Runtime.Intrinsics.Vector256`1[ulong],System.Runtime.Intrinsics.Vector256`1[ulong]):System.Runtime.Intrinsics.Vector256`1[ulong] (Tier1)
          -7 (-5.79 % of base) : 342450.dasm - System.Buffers.ProbabilisticMap:IsCharBitSet(System.Runtime.Intrinsics.Vector128`1[ubyte],System.Runtime.Intrinsics.Vector128`1[ubyte],System.Runtime.Intrinsics.Vector128`1[ubyte]):System.Runtime.Intrinsics.Vector128`1[ubyte] (Tier1)

Top method improvements (percentages):
         -14 (-16.67 % of base) : 386233.dasm - System.Numerics.Tensors.TensorPrimitives+MaxMagnitudeOperator`1[ulong]:Invoke(System.Runtime.Intrinsics.Vector128`1[ulong],System.Runtime.Intrinsics.Vector128`1[ulong]):System.Runtime.Intrinsics.Vector128`1[ulong] (Tier1)
         -14 (-16.67 % of base) : 393191.dasm - System.Numerics.Tensors.TensorPrimitives+MaxMagnitudeOperator`1[ulong]:Invoke(System.Runtime.Intrinsics.Vector128`1[ulong],System.Runtime.Intrinsics.Vector128`1[ulong]):System.Runtime.Intrinsics.Vector128`1[ulong] (Tier1)
         -14 (-16.67 % of base) : 393190.dasm - System.Numerics.Tensors.TensorPrimitives+MaxMagnitudePropagateNaNOperator`1[ulong]:Invoke(System.Runtime.Intrinsics.Vector128`1[ulong],System.Runtime.Intrinsics.Vector128`1[ulong]):System.Runtime.Intrinsics.Vector128`1[ulong] (Tier1)
         -14 (-16.67 % of base) : 385964.dasm - System.Numerics.Tensors.TensorPrimitives+MinMagnitudeOperator`1[ulong]:Invoke(System.Runtime.Intrinsics.Vector128`1[ulong],System.Runtime.Intrinsics.Vector128`1[ulong]):System.Runtime.Intrinsics.Vector128`1[ulong] (Tier1)
         -14 (-16.09 % of base) : 393193.dasm - System.Numerics.Tensors.TensorPrimitives+MaxMagnitudeOperator`1[ulong]:Invoke(System.Runtime.Intrinsics.Vector256`1[ulong],System.Runtime.Intrinsics.Vector256`1[ulong]):System.Runtime.Intrinsics.Vector256`1[ulong] (Tier1)
         -14 (-16.09 % of base) : 385681.dasm - System.Numerics.Tensors.TensorPrimitives+MaxMagnitudeOperator`1[ulong]:Invoke(System.Runtime.Intrinsics.Vector256`1[ulong],System.Runtime.Intrinsics.Vector256`1[ulong]):System.Runtime.Intrinsics.Vector256`1[ulong] (Tier1)
         -14 (-16.09 % of base) : 385680.dasm - System.Numerics.Tensors.TensorPrimitives+MaxMagnitudePropagateNaNOperator`1[ulong]:Invoke(System.Runtime.Intrinsics.Vector256`1[ulong],System.Runtime.Intrinsics.Vector256`1[ulong]):System.Runtime.Intrinsics.Vector256`1[ulong] (Tier1)
         -14 (-16.09 % of base) : 393192.dasm - System.Numerics.Tensors.TensorPrimitives+MaxMagnitudePropagateNaNOperator`1[ulong]:Invoke(System.Runtime.Intrinsics.Vector256`1[ulong],System.Runtime.Intrinsics.Vector256`1[ulong]):System.Runtime.Intrinsics.Vector256`1[ulong] (Tier1)
         -14 (-16.09 % of base) : 385965.dasm - System.Numerics.Tensors.TensorPrimitives+MinMagnitudeOperator`1[ulong]:Invoke(System.Runtime.Intrinsics.Vector256`1[ulong],System.Runtime.Intrinsics.Vector256`1[ulong]):System.Runtime.Intrinsics.Vector256`1[ulong] (Tier1)
         -14 (-16.09 % of base) : 393122.dasm - System.Numerics.Tensors.TensorPrimitives+MinMagnitudeOperator`1[ulong]:Invoke(System.Runtime.Intrinsics.Vector256`1[ulong],System.Runtime.Intrinsics.Vector256`1[ulong]):System.Runtime.Intrinsics.Vector256`1[ulong] (Tier1)
         -14 (-16.09 % of base) : 393121.dasm - System.Numerics.Tensors.TensorPrimitives+MinMagnitudePropagateNaNOperator`1[ulong]:Invoke(System.Runtime.Intrinsics.Vector256`1[ulong],System.Runtime.Intrinsics.Vector256`1[ulong]):System.Runtime.Intrinsics.Vector256`1[ulong] (Tier1)
        -368 (-15.51 % of base) : 393123.dasm - System.Numerics.Tensors.TensorPrimitives:<InvokeSpanSpanIntoSpan>g__Vectorized256|220_2[ulong,System.Numerics.Tensors.TensorPrimitives+MinMagnitudePropagateNaNOperator`1[ulong]](byref,byref,byref,ulong) (Tier1)
         -84 (-11.02 % of base) : 393473.dasm - System.Numerics.Tensors.TensorPrimitives:InvokeSpanSpanIntoSpan[ulong,System.Numerics.Tensors.TensorPrimitives+MaxMagnitudePropagateNaNOperator`1[ulong]](System.ReadOnlySpan`1[ulong],System.ReadOnlySpan`1[ulong],System.Span`1[ulong]) (Tier1)
         -84 (-10.98 % of base) : 386605.dasm - System.Numerics.Tensors.TensorPrimitives:InvokeSpanSpanIntoSpan[ulong,System.Numerics.Tensors.TensorPrimitives+MaxMagnitudePropagateNaNOperator`1[ulong]](System.ReadOnlySpan`1[ulong],System.ReadOnlySpan`1[ulong],System.Span`1[ulong]) (Tier1)
         -84 (-10.98 % of base) : 386805.dasm - System.Numerics.Tensors.TensorPrimitives:InvokeSpanSpanIntoSpan[ulong,System.Numerics.Tensors.TensorPrimitives+MinMagnitudePropagateNaNOperator`1[ulong]](System.ReadOnlySpan`1[ulong],System.ReadOnlySpan`1[ulong],System.Span`1[ulong]) (Tier1)
         -32 (-10.85 % of base) : 342443.dasm - System.Buffers.ProbabilisticMap:ContainsMask32CharsAvx2(System.Runtime.Intrinsics.Vector256`1[ubyte],System.Runtime.Intrinsics.Vector256`1[ubyte],byref):System.Runtime.Intrinsics.Vector256`1[ubyte] (Tier1)
         -84 (-10.81 % of base) : 393118.dasm - System.Numerics.Tensors.TensorPrimitives:InvokeSpanSpanIntoSpan[ulong,System.Numerics.Tensors.TensorPrimitives+MinMagnitudePropagateNaNOperator`1[ulong]](System.ReadOnlySpan`1[ulong],System.ReadOnlySpan`1[ulong],System.Span`1[ulong]) (Tier1)
         -26 (-8.78 % of base) : 342451.dasm - System.Buffers.ProbabilisticMap:ContainsMask16Chars(System.Runtime.Intrinsics.Vector128`1[ubyte],System.Runtime.Intrinsics.Vector128`1[ubyte],byref):System.Runtime.Intrinsics.Vector128`1[ubyte] (Tier1)
          -7 (-5.79 % of base) : 342450.dasm - System.Buffers.ProbabilisticMap:IsCharBitSet(System.Runtime.Intrinsics.Vector128`1[ubyte],System.Runtime.Intrinsics.Vector128`1[ubyte],System.Runtime.Intrinsics.Vector128`1[ubyte]):System.Runtime.Intrinsics.Vector128`1[ubyte] (Tier1)
          -7 (-5.65 % of base) : 342442.dasm - System.Buffers.ProbabilisticMap:IsCharBitSetAvx2(System.Runtime.Intrinsics.Vector256`1[ubyte],System.Runtime.Intrinsics.Vector256`1[ubyte],System.Runtime.Intrinsics.Vector256`1[ubyte]):System.Runtime.Intrinsics.Vector256`1[ubyte] (Tier1)

29 total methods with Code Size differences (29 improved, 0 regressed).


librariestestsnotieredcompilation.run.windows.x64.Release.mch

To reproduce these diffs on Windows x64: superpmi.py asmdiffs -target_os windows -target_arch x64 -arch x64


Summary of Code Size diffs:
(Lower is better)

Total bytes of base: 135861173 (overridden on cmd)
Total bytes of diff: 135860414 (overridden on cmd)
Total bytes of delta: -759 (-0.00 % of base)
    diff is an improvement.
    relative diff is an improvement.
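
The summary arithmetic above can be checked by hand. The following is a minimal sketch, not part of superpmi.py; the variable names are hypothetical and the two-decimal rounding is an assumption inferred from the report's formatting. It shows why a 759-byte improvement is displayed as -0.00 %.

```python
# Totals copied from the summary above.
base_bytes = 135861173
diff_bytes = 135860414

# Delta is diff minus base; a negative value means the diff is smaller.
delta = diff_bytes - base_bytes

# Percentage relative to base, rounded to two decimals
# (rounding precision assumed from the report's output).
pct = delta / base_bytes * 100
formatted = f"{delta} ({pct:.2f} % of base)"

print(formatted)
```

The delta is real but small relative to roughly 136 MB of base code, so it vanishes at two decimal places; the per-method tables are the more informative view.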

Detail diffs



Top file improvements (bytes):
        -182 : 168894.dasm (-17.50 % of base)
        -182 : 168616.dasm (-17.50 % of base)
        -154 : 169032.dasm (-17.09 % of base)
         -98 : 168947.dasm (-15.15 % of base)
         -98 : 168825.dasm (-15.15 % of base)
         -28 : 150728.dasm (-2.50 % of base)
         -17 : 169934.dasm (-1.75 % of base)

7 total files with Code Size differences (7 improved, 0 regressed), 0 unchanged.

Top method improvements (bytes):
        -182 (-17.50 % of base) : 168894.dasm - System.Numerics.Tensors.TensorPrimitives:MinMaxCore[ubyte,System.Numerics.Tensors.TensorPrimitives+MaxMagnitudeOperator`1[ubyte]](System.ReadOnlySpan`1[ubyte]):ubyte (FullOpts)
        -182 (-17.50 % of base) : 168616.dasm - System.Numerics.Tensors.TensorPrimitives:MinMaxCore[ubyte,System.Numerics.Tensors.TensorPrimitives+MinMagnitudeOperator`1[ubyte]](System.ReadOnlySpan`1[ubyte]):ubyte (FullOpts)
        -154 (-17.09 % of base) : 169032.dasm - System.Numerics.Tensors.TensorPrimitives:MinMaxCore[ushort,System.Numerics.Tensors.TensorPrimitives+MinMagnitudeOperator`1[ushort]](System.ReadOnlySpan`1[ushort]):ushort (FullOpts)
         -98 (-15.15 % of base) : 168947.dasm - System.Numerics.Tensors.TensorPrimitives:MinMaxCore[ulong,System.Numerics.Tensors.TensorPrimitives+MinMagnitudeOperator`1[ulong]](System.ReadOnlySpan`1[ulong]):ulong (FullOpts)
         -98 (-15.15 % of base) : 168825.dasm - System.Numerics.Tensors.TensorPrimitives:MinMaxCore[ulong,System.Numerics.Tensors.TensorPrimitives+MinMagnitudeOperator`1[ulong]](System.ReadOnlySpan`1[ulong]):ulong (FullOpts)
         -28 (-2.50 % of base) : 150728.dasm - System.Buffers.ProbabilisticMap:IndexOfAnyVectorized(byref,byref,int,System.ReadOnlySpan`1[ushort]):int (FullOpts)
         -17 (-1.75 % of base) : 169934.dasm - System.Numerics.Tests.GenericVectorTests:TestConditionalSelect[uint]():this (FullOpts)

Top method improvements (percentages):
        -182 (-17.50 % of base) : 168894.dasm - System.Numerics.Tensors.TensorPrimitives:MinMaxCore[ubyte,System.Numerics.Tensors.TensorPrimitives+MaxMagnitudeOperator`1[ubyte]](System.ReadOnlySpan`1[ubyte]):ubyte (FullOpts)
        -182 (-17.50 % of base) : 168616.dasm - System.Numerics.Tensors.TensorPrimitives:MinMaxCore[ubyte,System.Numerics.Tensors.TensorPrimitives+MinMagnitudeOperator`1[ubyte]](System.ReadOnlySpan`1[ubyte]):ubyte (FullOpts)
        -154 (-17.09 % of base) : 169032.dasm - System.Numerics.Tensors.TensorPrimitives:MinMaxCore[ushort,System.Numerics.Tensors.TensorPrimitives+MinMagnitudeOperator`1[ushort]](System.ReadOnlySpan`1[ushort]):ushort (FullOpts)
         -98 (-15.15 % of base) : 168947.dasm - System.Numerics.Tensors.TensorPrimitives:MinMaxCore[ulong,System.Numerics.Tensors.TensorPrimitives+MinMagnitudeOperator`1[ulong]](System.ReadOnlySpan`1[ulong]):ulong (FullOpts)
         -98 (-15.15 % of base) : 168825.dasm - System.Numerics.Tensors.TensorPrimitives:MinMaxCore[ulong,System.Numerics.Tensors.TensorPrimitives+MinMagnitudeOperator`1[ulong]](System.ReadOnlySpan`1[ulong]):ulong (FullOpts)
         -28 (-2.50 % of base) : 150728.dasm - System.Buffers.ProbabilisticMap:IndexOfAnyVectorized(byref,byref,int,System.ReadOnlySpan`1[ushort]):int (FullOpts)
         -17 (-1.75 % of base) : 169934.dasm - System.Numerics.Tests.GenericVectorTests:TestConditionalSelect[uint]():this (FullOpts)

7 total methods with Code Size differences (7 improved, 0 regressed).
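
For post-processing, a row of the method tables above can be split into its fields. This is a rough sketch under stated assumptions: the regular expression and the helper name parse_row are hypothetical, inferred only from the rows shown in this report, and the base size is an estimate because the printed percentage is already rounded.

```python
import re

# Row layout inferred from the tables above (an assumption, not taken
# from superpmi.py): "<delta> (<pct> % of base) : <file> - <method> (<tier>)"
ROW = re.compile(
    r"^\s*(?P<delta>-?\d+) \((?P<pct>-?\d+\.\d+) % of base\) : "
    r"(?P<file>\d+\.dasm) - (?P<method>.+) \((?P<tier>\w+)\)$"
)

def parse_row(line):
    """Hypothetical helper: return the fields of one row, or None."""
    m = ROW.match(line)
    if m is None:
        return None
    delta = int(m.group("delta"))
    pct = float(m.group("pct"))
    return {
        "delta": delta,
        "pct": pct,
        "file": m.group("file"),
        "method": m.group("method"),
        "tier": m.group("tier"),
        # Approximate base size in bytes, recovered from delta and the
        # rounded percentage; None when the percentage prints as zero.
        "approx_base": round(delta / pct * 100) if pct != 0 else None,
    }

row = parse_row(
    "         -28 (-2.50 % of base) : 150728.dasm - "
    "System.Buffers.ProbabilisticMap:IndexOfAnyVectorized"
    "(byref,byref,int,System.ReadOnlySpan`1[ushort]):int (FullOpts)"
)
print(row["file"], row["tier"], row["delta"], row["approx_base"])
```

The greedy method group means the trailing parenthesized token is always taken as the tier, even though the method signature itself contains parentheses.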


smoke_tests.nativeaot.windows.x64.checked.mch

To reproduce these diffs on Windows x64: superpmi.py asmdiffs -target_os windows -target_arch x64 -arch x64


Summary of Code Size diffs:
(Lower is better)

Total bytes of base: 5087315 (overridden on cmd)
Total bytes of diff: 5087287 (overridden on cmd)
Total bytes of delta: -28 (-0.00 % of base)
    diff is an improvement.
    relative diff is an improvement.

Detail diffs



Top file improvements (bytes):
         -28 : 19903.dasm (-2.52 % of base)

1 total files with Code Size differences (1 improved, 0 regressed), 0 unchanged.

Top method improvements (bytes):
         -28 (-2.52 % of base) : 19903.dasm - System.Buffers.ProbabilisticMap:IndexOfAnyVectorized(byref,byref,int,System.ReadOnlySpan`1[ushort]):int (FullOpts)

Top method improvements (percentages):
         -28 (-2.52 % of base) : 19903.dasm - System.Buffers.ProbabilisticMap:IndexOfAnyVectorized(byref,byref,int,System.ReadOnlySpan`1[ushort]):int (FullOpts)

1 total methods with Code Size differences (1 improved, 0 regressed).