Assembly Diffs
osx arm64
Diffs are based on 2,229,935 contexts (927,360 MinOpts, 1,302,575 FullOpts).
MISSED contexts: 6,082 (0.27%)
No diffs found.
Details
Context information
| Collection | Diffed contexts | MinOpts | FullOpts | Missed, base | Missed, diff |
|---|--:|--:|--:|--:|--:|
| benchmarks.run_pgo.osx.arm64.checked.mch | 84,643 | 48,345 | 36,298 | 183 (0.22%) | 183 (0.22%) |
| benchmarks.run_tiered.osx.arm64.checked.mch | 48,253 | 37,331 | 10,922 | 63 (0.13%) | 63 (0.13%) |
| coreclr_tests.run.osx.arm64.checked.mch | 586,148 | 358,028 | 228,120 | 437 (0.07%) | 437 (0.07%) |
| libraries.crossgen2.osx.arm64.checked.mch | 233,760 | 15 | 233,745 | 0 (0.00%) | 0 (0.00%) |
| libraries.pmi.osx.arm64.checked.mch | 313,588 | 18 | 313,570 | 2,028 (0.64%) | 2,028 (0.64%) |
| libraries_tests.run.osx.arm64.Release.mch | 631,294 | 462,062 | 169,232 | 963 (0.15%) | 963 (0.15%) |
| librariestestsnotieredcompilation.run.osx.arm64.Release.mch | 301,031 | 21,558 | 279,473 | 2,083 (0.69%) | 2,083 (0.69%) |
| realworld.run.osx.arm64.checked.mch | 31,218 | 3 | 31,215 | 325 (1.03%) | 325 (1.03%) |
| | 2,229,935 | 927,360 | 1,302,575 | 6,082 (0.27%) | 6,082 (0.27%) |
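The "Missed" percentages in the table can be re-derived from the counts. The sketch below assumes the percentage is `missed / (diffed + missed)`; that denominator is inferred from the numbers (it reproduces rows such as `325 (1.03%)`) and is not stated in the report itself. The helper name `missed_pct` is hypothetical.

```python
# Sketch only: recompute the "Missed" column from the context counts.
# Assumption (inferred from the data, not stated in the report):
#   percentage = missed / (diffed + missed)

def missed_pct(diffed: int, missed: int) -> str:
    """Format a missed-context count the way the report prints it."""
    total = diffed + missed
    return f"{missed:,} ({100.0 * missed / total:.2f}%)"

# (diffed contexts, missed contexts) copied from the osx arm64 table.
rows = {
    "realworld.run.osx.arm64.checked.mch": (31_218, 325),
    "libraries.pmi.osx.arm64.checked.mch": (313_588, 2_028),
    "total": (2_229_935, 6_082),
}

for name, (diffed, missed) in rows.items():
    print(f"{name}: {missed_pct(diffed, missed)}")
```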
windows arm64
Diffs are based on 2,308,464 contexts (929,692 MinOpts, 1,378,772 FullOpts).
MISSED contexts: 6,334 (0.27%)
No diffs found.
Details
Context information
| Collection | Diffed contexts | MinOpts | FullOpts | Missed, base | Missed, diff |
|---|--:|--:|--:|--:|--:|
| benchmarks.run.windows.arm64.checked.mch | 24,218 | 4 | 24,214 | 229 (0.94%) | 229 (0.94%) |
| benchmarks.run_pgo.windows.arm64.checked.mch | 96,879 | 48,066 | 48,813 | 104 (0.11%) | 104 (0.11%) |
| benchmarks.run_tiered.windows.arm64.checked.mch | 48,412 | 36,693 | 11,719 | 61 (0.13%) | 61 (0.13%) |
| coreclr_tests.run.windows.arm64.checked.mch | 595,265 | 362,539 | 232,726 | 438 (0.07%) | 438 (0.07%) |
| libraries.crossgen2.windows.arm64.checked.mch | 243,831 | 15 | 243,816 | 0 (0.00%) | 0 (0.00%) |
| libraries.pmi.windows.arm64.checked.mch | 302,817 | 6 | 302,811 | 2,054 (0.67%) | 2,054 (0.67%) |
| libraries_tests.run.windows.arm64.Release.mch | 625,154 | 460,799 | 164,355 | 900 (0.14%) | 900 (0.14%) |
| librariestestsnotieredcompilation.run.windows.arm64.Release.mch | 314,858 | 21,559 | 293,299 | 2,179 (0.69%) | 2,179 (0.69%) |
| realworld.run.windows.arm64.checked.mch | 32,878 | 3 | 32,875 | 366 (1.10%) | 366 (1.10%) |
| smoke_tests.nativeaot.windows.arm64.checked.mch | 24,152 | 8 | 24,144 | 3 (0.01%) | 3 (0.01%) |
| | 2,308,464 | 929,692 | 1,378,772 | 6,334 (0.27%) | 6,334 (0.27%) |
windows x64
Diffs are based on 2,366,413 contexts (928,740 MinOpts, 1,437,673 FullOpts).
MISSED contexts: 6,788 (0.29%)
Overall (-2,856 bytes)
| Collection | Base size (bytes) | Diff size (bytes) |
|---|--:|--:|
| benchmarks.run.windows.x64.checked.mch | 8,538,947 | -28 |
| coreclr_tests.run.windows.x64.checked.mch | 393,235,161 | -528 |
| libraries.pmi.windows.x64.checked.mch | 60,138,217 | -499 |
| libraries_tests.run.windows.x64.Release.mch | 278,258,792 | -1,014 |
| librariestestsnotieredcompilation.run.windows.x64.Release.mch | 135,861,173 | -759 |
| smoke_tests.nativeaot.windows.x64.checked.mch | 5,087,315 | -28 |
MinOpts (-248 bytes)
| Collection | Base size (bytes) | Diff size (bytes) |
|---|--:|--:|
| coreclr_tests.run.windows.x64.checked.mch | 273,504,444 | -192 |
| libraries_tests.run.windows.x64.Release.mch | 175,002,442 | -56 |
FullOpts (-2,608 bytes)
| Collection | Base size (bytes) | Diff size (bytes) |
|---|--:|--:|
| benchmarks.run.windows.x64.checked.mch | 8,538,586 | -28 |
| coreclr_tests.run.windows.x64.checked.mch | 119,730,717 | -336 |
| libraries.pmi.windows.x64.checked.mch | 60,024,698 | -499 |
| libraries_tests.run.windows.x64.Release.mch | 103,256,350 | -958 |
| librariestestsnotieredcompilation.run.windows.x64.Release.mch | 124,984,011 | -759 |
| smoke_tests.nativeaot.windows.x64.checked.mch | 5,086,368 | -28 |
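As a quick consistency check, the per-collection deltas in the three windows x64 tables can be summed and compared with the headline figures (-2,856 overall, -248 MinOpts, -2,608 FullOpts). This is a minimal sketch with the numbers copied by hand from the tables; the dictionary and function names are illustrative only.

```python
# Sketch only: verify that the windows x64 size tables add up.
# Byte deltas are copied from the Overall / MinOpts / FullOpts tables.

overall = {
    "benchmarks.run": -28,
    "coreclr_tests.run": -528,
    "libraries.pmi": -499,
    "libraries_tests.run": -1014,
    "librariestestsnotieredcompilation.run": -759,
    "smoke_tests.nativeaot": -28,
}
minopts = {
    "coreclr_tests.run": -192,
    "libraries_tests.run": -56,
}
fullopts = {
    "benchmarks.run": -28,
    "coreclr_tests.run": -336,
    "libraries.pmi": -499,
    "libraries_tests.run": -958,
    "librariestestsnotieredcompilation.run": -759,
    "smoke_tests.nativeaot": -28,
}

def total(deltas: dict) -> int:
    """Sum the byte deltas of one table."""
    return sum(deltas.values())

def split_matches(name: str) -> bool:
    """Check that Overall == MinOpts + FullOpts for one collection."""
    return overall[name] == minopts.get(name, 0) + fullopts.get(name, 0)

print(total(overall), total(minopts), total(fullopts))
print(all(split_matches(name) for name in overall))
```

Collections absent from the MinOpts table (for example libraries.pmi) contribute their whole delta under FullOpts.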
Example diffs
benchmarks.run.windows.x64.checked.mch
-28 (-2.50%) : 24358.dasm - System.Buffers.ProbabilisticMap:IndexOfAnyVectorized(byref,byref,int,System.ReadOnlySpan`1[ushort]):int (FullOpts)
@@ -186,12 +186,11 @@ G_M48875_IG06: ; bbWeight=4, gcrefRegs=0000 {}, byrefRegs=C0C8 {rbx rsi r
vpshufb ymm1, ymm3, ymm1
vpand ymm2, ymm2, ymmword ptr [reloc @RWD96]
vpcmpub k1, ymm2, ymmword ptr [reloc @RWD128], 6
- vpmovm2b ymm3, k1
- vpsubb ymm4, ymm2, ymmword ptr [reloc @RWD160]
- vpshufb ymm4, ymm9, ymm4
+ vpsubb ymm3, ymm2, ymmword ptr [reloc @RWD160]
+ vpshufb ymm3, ymm9, ymm3
vpshufb ymm2, ymm8, ymm2
- vpternlogd ymm3, ymm4, ymm2, -54
- vpand ymm1, ymm3, ymm1
+ vpblendmb ymm2 {k1}, ymm2, ymm3
+ vpand ymm1, ymm2, ymm1
vxorps ymm2, ymm2, ymm2
vpcmpeqb ymm1, ymm1, ymm2
vpcmpeqd ymm2, ymm2, ymm2
@@ -202,12 +201,11 @@ G_M48875_IG06: ; bbWeight=4, gcrefRegs=0000 {}, byrefRegs=C0C8 {rbx rsi r
vpshufb ymm2, ymm3, ymm2
vpand ymm0, ymm0, ymmword ptr [reloc @RWD96]
vpcmpub k1, ymm0, ymmword ptr [reloc @RWD128], 6
- vpmovm2b ymm3, k1
- vpsubb ymm4, ymm0, ymmword ptr [reloc @RWD160]
- vpshufb ymm4, ymm9, ymm4
+ vpsubb ymm3, ymm0, ymmword ptr [reloc @RWD160]
+ vpshufb ymm3, ymm9, ymm3
vpshufb ymm0, ymm8, ymm0
- vpternlogd ymm3, ymm4, ymm0, -54
- vpand ymm0, ymm3, ymm2
+ vpblendmb ymm0 {k1}, ymm0, ymm3
+ vpand ymm0, ymm0, ymm2
vxorps ymm2, ymm2, ymm2
vpcmpeqb ymm0, ymm0, ymm2
vpcmpeqd ymm2, ymm2, ymm2
@@ -215,7 +213,7 @@ G_M48875_IG06: ; bbWeight=4, gcrefRegs=0000 {}, byrefRegs=C0C8 {rbx rsi r
vpand ymm0, ymm1, ymm0
vptest ymm0, ymm0
je SHORT G_M48875_IG09
- ;; size=248 bbWeight=4 PerfScore 317.33
+ ;; size=234 bbWeight=4 PerfScore 313.33
G_M48875_IG07: ; bbWeight=2, gcrefRegs=0000 {}, byrefRegs=C0C8 {rbx rsi rdi r14 r15}, byref
vpermq ymm0, ymm0, -40
vpmovmskb ebp, ymm0
@@ -338,12 +336,11 @@ G_M48875_IG16: ; bbWeight=4, gcrefRegs=0000 {}, byrefRegs=C0C8 {rbx rsi r
vpshufb xmm1, xmm10, xmm1
vpand xmm2, xmm2, xmm11
vpcmpub k1, xmm2, xmm12, 6
- vpmovm2b xmm3, k1
- vpsubb xmm4, xmm2, xmm13
- vpshufb xmm4, xmm7, xmm4
+ vpsubb xmm3, xmm2, xmm13
+ vpshufb xmm3, xmm7, xmm3
vpshufb xmm2, xmm6, xmm2
- vpternlogd xmm3, xmm4, xmm2, -54
- vpand xmm1, xmm3, xmm1
+ vpblendmb xmm2 {k1}, xmm2, xmm3
+ vpand xmm1, xmm2, xmm1
vxorps xmm2, xmm2, xmm2
vpcmpeqb xmm1, xmm1, xmm2
vpcmpeqd xmm2, xmm2, xmm2
@@ -353,12 +350,11 @@ G_M48875_IG16: ; bbWeight=4, gcrefRegs=0000 {}, byrefRegs=C0C8 {rbx rsi r
vpshufb xmm2, xmm10, xmm2
vpand xmm0, xmm0, xmm11
vpcmpub k1, xmm0, xmm12, 6
- vpmovm2b xmm3, k1
- vpsubb xmm4, xmm0, xmm13
- vpshufb xmm4, xmm7, xmm4
+ vpsubb xmm3, xmm0, xmm13
+ vpshufb xmm3, xmm7, xmm3
vpshufb xmm0, xmm6, xmm0
- vpternlogd xmm3, xmm4, xmm0, -54
- vpand xmm0, xmm3, xmm2
+ vpblendmb xmm0 {k1}, xmm0, xmm3
+ vpand xmm0, xmm0, xmm2
vxorps xmm2, xmm2, xmm2
vpcmpeqb xmm0, xmm0, xmm2
vpcmpeqd xmm2, xmm2, xmm2
@@ -366,7 +362,7 @@ G_M48875_IG16: ; bbWeight=4, gcrefRegs=0000 {}, byrefRegs=C0C8 {rbx rsi r
vpand xmm0, xmm1, xmm0
vptest xmm0, xmm0
je SHORT G_M48875_IG19
- ;; size=200 bbWeight=4 PerfScore 168.00
+ ;; size=186 bbWeight=4 PerfScore 164.00
G_M48875_IG17: ; bbWeight=2, gcrefRegs=0000 {}, byrefRegs=C0C8 {rbx rsi rdi r14 r15}, byref
vpmovmskb ebp, xmm0
;; size=4 bbWeight=2 PerfScore 4.00
@@ -433,7 +429,7 @@ RWD128 dq 0F0F0F0F0F0F0F0Fh, 0F0F0F0F0F0F0F0Fh, 0F0F0F0F0F0F0F0Fh, 0F0F0F0F0F0F
RWD160 dq 1010101010101010h, 1010101010101010h, 1010101010101010h, 1010101010101010h
-; Total bytes of code 1118, prolog size 86, PerfScore 1254.58, instruction count 240, allocated bytes for code 1118 (MethodHash=36e94114) for method System.Buffers.ProbabilisticMap:IndexOfAnyVectorized(byref,byref,int,System.ReadOnlySpan`1[ushort]):int (FullOpts)
+; Total bytes of code 1090, prolog size 86, PerfScore 1246.58, instruction count 236, allocated bytes for code 1090 (MethodHash=36e94114) for method System.Buffers.ProbabilisticMap:IndexOfAnyVectorized(byref,byref,int,System.ReadOnlySpan`1[ushort]):int (FullOpts)
; ============================================================
Unwind Info:
coreclr_tests.run.windows.x64.checked.mch
-28 (-1.46%) : 174835.dasm - VectorTest+VectorRelopTest`1[int]:VectorRelOp(int,int):int (FullOpts)
@@ -331,10 +331,9 @@ G_M1266_IG17: ; bbWeight=4, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
;; size=11 bbWeight=4 PerfScore 6.00
G_M1266_IG18: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
vpcmpd k1, ymm8, ymm6, 2
- vpmovm2d ymm9, k1
- vpternlogd ymm9, ymm8, ymm7, -54
+ vpblendmd ymm9 {k1}, ymm7, ymm8
xor ebx, ebx
- ;; size=22 bbWeight=1 PerfScore 4.75
+ ;; size=15 bbWeight=1 PerfScore 3.58
G_M1266_IG19: ; bbWeight=4, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref, isz
mov ecx, ebx
vmovups ymmword ptr [rsp+0x20], ymm9
@@ -389,10 +388,9 @@ G_M1266_IG21: ; bbWeight=4, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
;; size=11 bbWeight=4 PerfScore 6.00
G_M1266_IG22: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
vpcmpd k1, ymm8, ymm7, 2
- vpmovm2d ymm9, k1
- vpternlogd ymm9, ymm8, ymm7, -54
+ vpblendmd ymm9 {k1}, ymm7, ymm8
xor ebx, ebx
- ;; size=22 bbWeight=1 PerfScore 4.75
+ ;; size=15 bbWeight=1 PerfScore 3.58
G_M1266_IG23: ; bbWeight=4, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref, isz
mov ecx, ebx
vmovups ymmword ptr [rsp+0x20], ymm9
@@ -447,10 +445,9 @@ G_M1266_IG25: ; bbWeight=4, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
;; size=11 bbWeight=4 PerfScore 6.00
G_M1266_IG26: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
vpcmpd k1, ymm8, ymm6, 5
- vpmovm2d ymm9, k1
- vpternlogd ymm9, ymm8, ymm7, -54
+ vpblendmd ymm9 {k1}, ymm7, ymm8
xor ebx, ebx
- ;; size=22 bbWeight=1 PerfScore 4.75
+ ;; size=15 bbWeight=1 PerfScore 3.58
G_M1266_IG27: ; bbWeight=4, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref, isz
mov ecx, ebx
vmovups ymmword ptr [rsp+0x20], ymm9
@@ -505,10 +502,9 @@ G_M1266_IG29: ; bbWeight=4, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
;; size=11 bbWeight=4 PerfScore 6.00
G_M1266_IG30: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
vpcmpd k1, ymm7, ymm6, 5
- vpmovm2d ymm9, k1
- vpternlogd ymm9, ymm8, ymm7, -54
+ vpblendmd ymm9 {k1}, ymm7, ymm8
xor ebx, ebx
- ;; size=22 bbWeight=1 PerfScore 4.75
+ ;; size=15 bbWeight=1 PerfScore 3.58
G_M1266_IG31: ; bbWeight=4, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref, isz
mov ecx, ebx
vmovups ymmword ptr [rsp+0x20], ymm9
@@ -687,7 +683,7 @@ G_M1266_IG42: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref,
ret
;; size=73 bbWeight=1 PerfScore 35.25
-; Total bytes of code 1922, prolog size 82, PerfScore 1043.58, instruction count 381, allocated bytes for code 1922 (MethodHash=881ffb0d) for method VectorTest+VectorRelopTest`1[int]:VectorRelOp(int,int):int (FullOpts)
+; Total bytes of code 1894, prolog size 82, PerfScore 1038.92, instruction count 377, allocated bytes for code 1894 (MethodHash=881ffb0d) for method VectorTest+VectorRelopTest`1[int]:VectorRelOp(int,int):int (FullOpts)
; ============================================================
Unwind Info:
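The recurring change in these diffs replaces a `vpmovm2*` + `vpternlog*` pair with a single masked `vpblendm*`. The sketch below is a plain-Python model (not JIT code) of why the two forms should agree: the immediate `-54` is `0xCA` as an unsigned byte, which is the ternary-logic truth table for a bitwise select `A ? B : C`. The function names are hypothetical, and the operand order follows the `ymm9`/`ymm8`/`ymm7` example above.

```python
# Sketch only: model vpmovm2d + vpternlogd(imm8 = -54) against vpblendmd.

LANE_BITS = 32
ALL_ONES = (1 << LANE_BITS) - 1

def ternlog(a: int, b: int, c: int, imm8: int) -> int:
    """Bitwise ternary logic: each result bit is imm8[(a << 2) | (b << 1) | c]."""
    out = 0
    for bit in range(LANE_BITS):
        idx = (((a >> bit) & 1) << 2) | (((b >> bit) & 1) << 1) | ((c >> bit) & 1)
        out |= ((imm8 >> idx) & 1) << bit
    return out

def mask_to_vector(k: int, lanes: int) -> list:
    """Model of vpmovm2d: a lane is all-ones if its mask bit is set, else zero."""
    return [ALL_ONES if (k >> i) & 1 else 0 for i in range(lanes)]

def old_sequence(k: int, if_set: list, if_clear: list) -> list:
    """vpmovm2d dst, k ; vpternlogd dst, if_set, if_clear, -54"""
    imm8 = -54 & 0xFF  # 0xCA
    mask_vec = mask_to_vector(k, len(if_set))
    return [ternlog(m, t, f, imm8) for m, t, f in zip(mask_vec, if_set, if_clear)]

def new_sequence(k: int, if_clear: list, if_set: list) -> list:
    """vpblendmd dst {k}, if_clear, if_set: take if_set where the mask bit is 1."""
    return [t if (k >> i) & 1 else f
            for i, (f, t) in enumerate(zip(if_clear, if_set))]

ymm8 = [i * 0x01010101 for i in range(8)]   # arbitrary test data
ymm7 = [ALL_ONES - i for i in range(8)]     # arbitrary test data
print(old_sequence(0b10100101, ymm8, ymm7) == new_sequence(0b10100101, ymm7, ymm8))
```

Because the select is bitwise, the same reasoning applies to the byte, word, and quadword variants shown in the other diffs.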
-28 (-1.43%) : 174836.dasm - VectorTest+VectorRelopTest`1[long]:VectorRelOp(long,long):int (FullOpts)
@@ -331,10 +331,9 @@ G_M59915_IG17: ; bbWeight=4, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
;; size=11 bbWeight=4 PerfScore 6.00
G_M59915_IG18: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
vpcmpq k1, ymm8, ymm6, 2
- vpmovm2q ymm9, k1
- vpternlogq ymm9, ymm8, ymm7, -54
+ vpblendmq ymm9 {k1}, ymm7, ymm8
xor ebx, ebx
- ;; size=22 bbWeight=1 PerfScore 4.75
+ ;; size=15 bbWeight=1 PerfScore 3.58
G_M59915_IG19: ; bbWeight=4, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref, isz
mov ecx, ebx
vmovups ymmword ptr [rsp+0x20], ymm9
@@ -389,10 +388,9 @@ G_M59915_IG21: ; bbWeight=4, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
;; size=11 bbWeight=4 PerfScore 6.00
G_M59915_IG22: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
vpcmpq k1, ymm8, ymm7, 2
- vpmovm2q ymm9, k1
- vpternlogq ymm9, ymm8, ymm7, -54
+ vpblendmq ymm9 {k1}, ymm7, ymm8
xor ebx, ebx
- ;; size=22 bbWeight=1 PerfScore 4.75
+ ;; size=15 bbWeight=1 PerfScore 3.58
G_M59915_IG23: ; bbWeight=4, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref, isz
mov ecx, ebx
vmovups ymmword ptr [rsp+0x20], ymm9
@@ -447,10 +445,9 @@ G_M59915_IG25: ; bbWeight=4, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
;; size=11 bbWeight=4 PerfScore 6.00
G_M59915_IG26: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
vpcmpq k1, ymm8, ymm6, 5
- vpmovm2q ymm9, k1
- vpternlogq ymm9, ymm8, ymm7, -54
+ vpblendmq ymm9 {k1}, ymm7, ymm8
xor ebx, ebx
- ;; size=22 bbWeight=1 PerfScore 4.75
+ ;; size=15 bbWeight=1 PerfScore 3.58
G_M59915_IG27: ; bbWeight=4, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref, isz
mov ecx, ebx
vmovups ymmword ptr [rsp+0x20], ymm9
@@ -505,10 +502,9 @@ G_M59915_IG29: ; bbWeight=4, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
;; size=11 bbWeight=4 PerfScore 6.00
G_M59915_IG30: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
vpcmpq k1, ymm7, ymm6, 5
- vpmovm2q ymm9, k1
- vpternlogq ymm9, ymm8, ymm7, -54
+ vpblendmq ymm9 {k1}, ymm7, ymm8
xor ebx, ebx
- ;; size=22 bbWeight=1 PerfScore 4.75
+ ;; size=15 bbWeight=1 PerfScore 3.58
G_M59915_IG31: ; bbWeight=4, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref, isz
mov ecx, ebx
vmovups ymmword ptr [rsp+0x20], ymm9
@@ -687,7 +683,7 @@ G_M59915_IG42: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref,
ret
;; size=73 bbWeight=1 PerfScore 35.25
-; Total bytes of code 1956, prolog size 82, PerfScore 1049.58, instruction count 381, allocated bytes for code 1956 (MethodHash=e2e315f4) for method VectorTest+VectorRelopTest`1[long]:VectorRelOp(long,long):int (FullOpts)
+; Total bytes of code 1928, prolog size 82, PerfScore 1044.92, instruction count 377, allocated bytes for code 1928 (MethodHash=e2e315f4) for method VectorTest+VectorRelopTest`1[long]:VectorRelOp(long,long):int (FullOpts)
; ============================================================
Unwind Info:
-28 (-1.42%) : 174839.dasm - VectorTest+VectorRelopTest`1[short]:VectorRelOp(short,short):int (FullOpts)
@@ -334,10 +334,9 @@ G_M8563_IG17: ; bbWeight=4, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
;; size=11 bbWeight=4 PerfScore 6.00
G_M8563_IG18: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
vpcmpw k1, ymm6, ymm7, 2
- vpmovm2w ymm9, k1
- vpternlogd ymm9, ymm6, ymm8, -54
+ vpblendmw ymm9 {k1}, ymm8, ymm6
xor ebx, ebx
- ;; size=22 bbWeight=1 PerfScore 5.75
+ ;; size=15 bbWeight=1 PerfScore 5.25
G_M8563_IG19: ; bbWeight=4, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref, isz
mov ecx, ebx
vmovups ymmword ptr [rsp+0x20], ymm9
@@ -392,10 +391,9 @@ G_M8563_IG21: ; bbWeight=4, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
;; size=11 bbWeight=4 PerfScore 6.00
G_M8563_IG22: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
vpcmpw k1, ymm6, ymm8, 2
- vpmovm2w ymm9, k1
- vpternlogd ymm9, ymm6, ymm8, -54
+ vpblendmw ymm9 {k1}, ymm8, ymm6
xor ebx, ebx
- ;; size=22 bbWeight=1 PerfScore 5.75
+ ;; size=15 bbWeight=1 PerfScore 5.25
G_M8563_IG23: ; bbWeight=4, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref, isz
mov ecx, ebx
vmovups ymmword ptr [rsp+0x20], ymm9
@@ -450,10 +448,9 @@ G_M8563_IG25: ; bbWeight=4, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
;; size=11 bbWeight=4 PerfScore 6.00
G_M8563_IG26: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
vpcmpw k1, ymm6, ymm7, 5
- vpmovm2w ymm9, k1
- vpternlogd ymm9, ymm6, ymm8, -54
+ vpblendmw ymm9 {k1}, ymm8, ymm6
xor ebx, ebx
- ;; size=22 bbWeight=1 PerfScore 5.75
+ ;; size=15 bbWeight=1 PerfScore 5.25
G_M8563_IG27: ; bbWeight=4, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref, isz
mov ecx, ebx
vmovups ymmword ptr [rsp+0x20], ymm9
@@ -508,10 +505,9 @@ G_M8563_IG29: ; bbWeight=4, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
;; size=11 bbWeight=4 PerfScore 6.00
G_M8563_IG30: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
vpcmpw k1, ymm8, ymm7, 5
- vpmovm2w ymm9, k1
- vpternlogd ymm9, ymm6, ymm8, -54
+ vpblendmw ymm9 {k1}, ymm8, ymm6
xor ebx, ebx
- ;; size=22 bbWeight=1 PerfScore 5.75
+ ;; size=15 bbWeight=1 PerfScore 5.25
G_M8563_IG31: ; bbWeight=4, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref, isz
mov ecx, ebx
vmovups ymmword ptr [rsp+0x20], ymm9
@@ -690,7 +686,7 @@ G_M8563_IG42: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref,
ret
;; size=73 bbWeight=1 PerfScore 35.25
-; Total bytes of code 1968, prolog size 82, PerfScore 1206.33, instruction count 383, allocated bytes for code 1968 (MethodHash=33a1de8c) for method VectorTest+VectorRelopTest`1[short]:VectorRelOp(short,short):int (FullOpts)
+; Total bytes of code 1940, prolog size 82, PerfScore 1204.33, instruction count 379, allocated bytes for code 1940 (MethodHash=33a1de8c) for method VectorTest+VectorRelopTest`1[short]:VectorRelOp(short,short):int (FullOpts)
; ============================================================
Unwind Info:
-16 (-0.39%) : 429089.dasm - VectorTest+VectorRelopTest`1[long]:VectorRelOp(long,long):int (Instrumented Tier0)
@@ -437,14 +437,13 @@ G_M59915_IG22: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
jl G_M59915_IG18
vmovups ymm0, ymmword ptr [rbp-0x70]
vpcmpq k1, ymm0, ymmword ptr [rbp-0xB0], 2
- vpmovm2q ymm0, k1
- vmovups ymm1, ymmword ptr [rbp-0x70]
- vpternlogq ymm0, ymm1, ymmword ptr [rbp-0x90], -54
+ vmovups ymm0, ymmword ptr [rbp-0x90]
+ vpblendmq ymm0 {k1}, ymm0, ymmword ptr [rbp-0x70]
vmovups ymmword ptr [rbp-0xD0], ymm0
xor ecx, ecx
mov dword ptr [rbp-0x100], ecx
jmp G_M59915_IG25
- ;; size=72 bbWeight=1 PerfScore 23.25
+ ;; size=68 bbWeight=1 PerfScore 22.25
G_M59915_IG23: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
cmp dword ptr [rbp-0x100], 4
jae G_M59915_IG54
@@ -523,14 +522,13 @@ G_M59915_IG27: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
jl G_M59915_IG23
vmovups ymm0, ymmword ptr [rbp-0x70]
vpcmpq k1, ymm0, ymmword ptr [rbp-0x90], 2
- vpmovm2q ymm0, k1
- vmovups ymm1, ymmword ptr [rbp-0x70]
- vpternlogq ymm0, ymm1, ymmword ptr [rbp-0x90], -54
+ vmovups ymm0, ymmword ptr [rbp-0x90]
+ vpblendmq ymm0 {k1}, ymm0, ymmword ptr [rbp-0x70]
vmovups ymmword ptr [rbp-0xD0], ymm0
xor ecx, ecx
mov dword ptr [rbp-0x108], ecx
jmp G_M59915_IG30
- ;; size=72 bbWeight=1 PerfScore 23.25
+ ;; size=68 bbWeight=1 PerfScore 22.25
G_M59915_IG28: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
cmp dword ptr [rbp-0x108], 4
jae G_M59915_IG54
@@ -609,14 +607,13 @@ G_M59915_IG32: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
jl G_M59915_IG28
vmovups ymm0, ymmword ptr [rbp-0x70]
vpcmpq k1, ymm0, ymmword ptr [rbp-0xB0], 5
- vpmovm2q ymm0, k1
- vmovups ymm1, ymmword ptr [rbp-0x70]
- vpternlogq ymm0, ymm1, ymmword ptr [rbp-0x90], -54
+ vmovups ymm0, ymmword ptr [rbp-0x90]
+ vpblendmq ymm0 {k1}, ymm0, ymmword ptr [rbp-0x70]
vmovups ymmword ptr [rbp-0xD0], ymm0
xor ecx, ecx
mov dword ptr [rbp-0x110], ecx
jmp G_M59915_IG35
- ;; size=72 bbWeight=1 PerfScore 23.25
+ ;; size=68 bbWeight=1 PerfScore 22.25
G_M59915_IG33: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
cmp dword ptr [rbp-0x110], 4
jae G_M59915_IG54
@@ -695,14 +692,13 @@ G_M59915_IG37: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
jl G_M59915_IG33
vmovups ymm0, ymmword ptr [rbp-0x90]
vpcmpq k1, ymm0, ymmword ptr [rbp-0xB0], 5
- vpmovm2q ymm0, k1
- vmovups ymm1, ymmword ptr [rbp-0x70]
- vpternlogq ymm0, ymm1, ymmword ptr [rbp-0x90], -54
+ vmovups ymm0, ymmword ptr [rbp-0x90]
+ vpblendmq ymm0 {k1}, ymm0, ymmword ptr [rbp-0x70]
vmovups ymmword ptr [rbp-0xD0], ymm0
xor ecx, ecx
mov dword ptr [rbp-0x118], ecx
jmp G_M59915_IG40
- ;; size=75 bbWeight=1 PerfScore 23.25
+ ;; size=71 bbWeight=1 PerfScore 22.25
G_M59915_IG38: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
cmp dword ptr [rbp-0x118], 4
jae G_M59915_IG54
@@ -964,7 +960,7 @@ G_M59915_IG54: ; bbWeight=0, gcVars=0000000000000000 {}, gcrefRegs=0000 {
int3
;; size=6 bbWeight=0 PerfScore 0.00
-; Total bytes of code 4100, prolog size 77, PerfScore 831.26, instruction count 657, allocated bytes for code 4100 (MethodHash=e2e315f4) for method VectorTest+VectorRelopTest`1[long]:VectorRelOp(long,long):int (Instrumented Tier0)
+; Total bytes of code 4084, prolog size 77, PerfScore 827.26, instruction count 653, allocated bytes for code 4088 (MethodHash=e2e315f4) for method VectorTest+VectorRelopTest`1[long]:VectorRelOp(long,long):int (Instrumented Tier0)
; ============================================================
Unwind Info:
-16 (-0.39%) : 429093.dasm - VectorTest+VectorRelopTest`1[byte]:VectorRelOp(byte,byte):int (Instrumented Tier0)
@@ -443,14 +443,13 @@ G_M44299_IG22: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
jl G_M44299_IG18
vmovups ymm0, ymmword ptr [rbp-0x70]
vpcmpb k1, ymm0, ymmword ptr [rbp-0xB0], 2
- vpmovm2b ymm0, k1
- vmovups ymm1, ymmword ptr [rbp-0x70]
- vpternlogd ymm0, ymm1, ymmword ptr [rbp-0x90], -54
+ vmovups ymm0, ymmword ptr [rbp-0x90]
+ vpblendmb ymm0 {k1}, ymm0, ymmword ptr [rbp-0x70]
vmovups ymmword ptr [rbp-0xD0], ymm0
xor ecx, ecx
mov dword ptr [rbp-0x100], ecx
jmp G_M44299_IG25
- ;; size=72 bbWeight=1 PerfScore 24.25
+ ;; size=68 bbWeight=1 PerfScore 24.25
G_M44299_IG23: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
cmp dword ptr [rbp-0x100], 32
jae G_M44299_IG54
@@ -529,14 +528,13 @@ G_M44299_IG27: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
jl G_M44299_IG23
vmovups ymm0, ymmword ptr [rbp-0x70]
vpcmpb k1, ymm0, ymmword ptr [rbp-0x90], 2
- vpmovm2b ymm0, k1
- vmovups ymm1, ymmword ptr [rbp-0x70]
- vpternlogd ymm0, ymm1, ymmword ptr [rbp-0x90], -54
+ vmovups ymm0, ymmword ptr [rbp-0x90]
+ vpblendmb ymm0 {k1}, ymm0, ymmword ptr [rbp-0x70]
vmovups ymmword ptr [rbp-0xD0], ymm0
xor ecx, ecx
mov dword ptr [rbp-0x108], ecx
jmp G_M44299_IG30
- ;; size=72 bbWeight=1 PerfScore 24.25
+ ;; size=68 bbWeight=1 PerfScore 24.25
G_M44299_IG28: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
cmp dword ptr [rbp-0x108], 32
jae G_M44299_IG54
@@ -615,14 +613,13 @@ G_M44299_IG32: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
jl G_M44299_IG28
vmovups ymm0, ymmword ptr [rbp-0x70]
vpcmpb k1, ymm0, ymmword ptr [rbp-0xB0], 5
- vpmovm2b ymm0, k1
- vmovups ymm1, ymmword ptr [rbp-0x70]
- vpternlogd ymm0, ymm1, ymmword ptr [rbp-0x90], -54
+ vmovups ymm0, ymmword ptr [rbp-0x90]
+ vpblendmb ymm0 {k1}, ymm0, ymmword ptr [rbp-0x70]
vmovups ymmword ptr [rbp-0xD0], ymm0
xor ecx, ecx
mov dword ptr [rbp-0x110], ecx
jmp G_M44299_IG35
- ;; size=72 bbWeight=1 PerfScore 24.25
+ ;; size=68 bbWeight=1 PerfScore 24.25
G_M44299_IG33: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
cmp dword ptr [rbp-0x110], 32
jae G_M44299_IG54
@@ -701,14 +698,13 @@ G_M44299_IG37: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
jl G_M44299_IG33
vmovups ymm0, ymmword ptr [rbp-0x90]
vpcmpb k1, ymm0, ymmword ptr [rbp-0xB0], 5
- vpmovm2b ymm0, k1
- vmovups ymm1, ymmword ptr [rbp-0x70]
- vpternlogd ymm0, ymm1, ymmword ptr [rbp-0x90], -54
+ vmovups ymm0, ymmword ptr [rbp-0x90]
+ vpblendmb ymm0 {k1}, ymm0, ymmword ptr [rbp-0x70]
vmovups ymmword ptr [rbp-0xD0], ymm0
xor ecx, ecx
mov dword ptr [rbp-0x118], ecx
jmp G_M44299_IG40
- ;; size=75 bbWeight=1 PerfScore 24.25
+ ;; size=71 bbWeight=1 PerfScore 24.25
G_M44299_IG38: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
cmp dword ptr [rbp-0x118], 32
jae G_M44299_IG54
@@ -970,7 +966,7 @@ G_M44299_IG54: ; bbWeight=0, gcVars=0000000000000000 {}, gcrefRegs=0000 {
int3
;; size=6 bbWeight=0 PerfScore 0.00
-; Total bytes of code 4123, prolog size 77, PerfScore 865.01, instruction count 663, allocated bytes for code 4123 (MethodHash=09db52f4) for method VectorTest+VectorRelopTest`1[byte]:VectorRelOp(byte,byte):int (Instrumented Tier0)
+; Total bytes of code 4107, prolog size 77, PerfScore 865.01, instruction count 659, allocated bytes for code 4111 (MethodHash=09db52f4) for method VectorTest+VectorRelopTest`1[byte]:VectorRelOp(byte,byte):int (Instrumented Tier0)
; ============================================================
Unwind Info:
-16 (-0.39%) : 429092.dasm - VectorTest+VectorRelopTest`1[short]:VectorRelOp(short,short):int (Instrumented Tier0)
@@ -443,14 +443,13 @@ G_M8563_IG22: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
jl G_M8563_IG18
vmovups ymm0, ymmword ptr [rbp-0x70]
vpcmpw k1, ymm0, ymmword ptr [rbp-0xB0], 2
- vpmovm2w ymm0, k1
- vmovups ymm1, ymmword ptr [rbp-0x70]
- vpternlogd ymm0, ymm1, ymmword ptr [rbp-0x90], -54
+ vmovups ymm0, ymmword ptr [rbp-0x90]
+ vpblendmw ymm0 {k1}, ymm0, ymmword ptr [rbp-0x70]
vmovups ymmword ptr [rbp-0xD0], ymm0
xor ecx, ecx
mov dword ptr [rbp-0x100], ecx
jmp G_M8563_IG25
- ;; size=72 bbWeight=1 PerfScore 24.25
+ ;; size=68 bbWeight=1 PerfScore 24.25
G_M8563_IG23: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
cmp dword ptr [rbp-0x100], 16
jae G_M8563_IG54
@@ -529,14 +528,13 @@ G_M8563_IG27: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
jl G_M8563_IG23
vmovups ymm0, ymmword ptr [rbp-0x70]
vpcmpw k1, ymm0, ymmword ptr [rbp-0x90], 2
- vpmovm2w ymm0, k1
- vmovups ymm1, ymmword ptr [rbp-0x70]
- vpternlogd ymm0, ymm1, ymmword ptr [rbp-0x90], -54
+ vmovups ymm0, ymmword ptr [rbp-0x90]
+ vpblendmw ymm0 {k1}, ymm0, ymmword ptr [rbp-0x70]
vmovups ymmword ptr [rbp-0xD0], ymm0
xor ecx, ecx
mov dword ptr [rbp-0x108], ecx
jmp G_M8563_IG30
- ;; size=72 bbWeight=1 PerfScore 24.25
+ ;; size=68 bbWeight=1 PerfScore 24.25
G_M8563_IG28: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
cmp dword ptr [rbp-0x108], 16
jae G_M8563_IG54
@@ -615,14 +613,13 @@ G_M8563_IG32: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
jl G_M8563_IG28
vmovups ymm0, ymmword ptr [rbp-0x70]
vpcmpw k1, ymm0, ymmword ptr [rbp-0xB0], 5
- vpmovm2w ymm0, k1
- vmovups ymm1, ymmword ptr [rbp-0x70]
- vpternlogd ymm0, ymm1, ymmword ptr [rbp-0x90], -54
+ vmovups ymm0, ymmword ptr [rbp-0x90]
+ vpblendmw ymm0 {k1}, ymm0, ymmword ptr [rbp-0x70]
vmovups ymmword ptr [rbp-0xD0], ymm0
xor ecx, ecx
mov dword ptr [rbp-0x110], ecx
jmp G_M8563_IG35
- ;; size=72 bbWeight=1 PerfScore 24.25
+ ;; size=68 bbWeight=1 PerfScore 24.25
G_M8563_IG33: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
cmp dword ptr [rbp-0x110], 16
jae G_M8563_IG54
@@ -701,14 +698,13 @@ G_M8563_IG37: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
jl G_M8563_IG33
vmovups ymm0, ymmword ptr [rbp-0x90]
vpcmpw k1, ymm0, ymmword ptr [rbp-0xB0], 5
- vpmovm2w ymm0, k1
- vmovups ymm1, ymmword ptr [rbp-0x70]
- vpternlogd ymm0, ymm1, ymmword ptr [rbp-0x90], -54
+ vmovups ymm0, ymmword ptr [rbp-0x90]
+ vpblendmw ymm0 {k1}, ymm0, ymmword ptr [rbp-0x70]
vmovups ymmword ptr [rbp-0xD0], ymm0
xor ecx, ecx
mov dword ptr [rbp-0x118], ecx
jmp G_M8563_IG40
- ;; size=75 bbWeight=1 PerfScore 24.25
+ ;; size=71 bbWeight=1 PerfScore 24.25
G_M8563_IG38: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
cmp dword ptr [rbp-0x118], 16
jae G_M8563_IG54
@@ -970,7 +966,7 @@ G_M8563_IG54: ; bbWeight=0, gcVars=0000000000000000 {}, gcrefRegs=0000 {}
int3
;; size=6 bbWeight=0 PerfScore 0.00
-; Total bytes of code 4123, prolog size 77, PerfScore 865.01, instruction count 663, allocated bytes for code 4123 (MethodHash=33a1de8c) for method VectorTest+VectorRelopTest`1[short]:VectorRelOp(short,short):int (Instrumented Tier0)
+; Total bytes of code 4107, prolog size 77, PerfScore 865.01, instruction count 659, allocated bytes for code 4111 (MethodHash=33a1de8c) for method VectorTest+VectorRelopTest`1[short]:VectorRelOp(short,short):int (Instrumented Tier0)
; ============================================================
Unwind Info:
libraries.pmi.windows.x64.checked.mch
-21 (-20.39%) : 294005.dasm - System.Numerics.Tensors.TensorPrimitives+MaxMagnitudePropagateNaNOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector512`1[ubyte],System.Runtime.Intrinsics.Vector512`1[ubyte]):System.Runtime.Intrinsics.Vector512`1[ubyte] (FullOpts)
@@ -17,7 +17,7 @@
; V06 tmp1 [V06,T05] ( 3, 3 ) simd64 -> mm1 "Inline stloc first use temp" <System.Runtime.Intrinsics.Vector512`1[ubyte]>
; V07 tmp2 [V07,T06] ( 3, 3 ) simd64 -> mm3 "Inline stloc first use temp" <System.Runtime.Intrinsics.Vector512`1[ubyte]>
;* V08 tmp3 [V08 ] ( 0, 0 ) simd64 -> zero-ref "spilled call-like call argument"
-; V09 tmp4 [V09,T07] ( 2, 2 ) simd64 -> mm4 "Inline stloc first use temp" <System.Runtime.Intrinsics.Vector512`1[ubyte]>
+; V09 tmp4 [V09,T07] ( 2, 2 ) simd64 -> mm0 "Inline stloc first use temp" <System.Runtime.Intrinsics.Vector512`1[ubyte]>
;* V10 tmp5 [V10 ] ( 0, 0 ) simd64 -> zero-ref "Inline return value spill temp" <System.Runtime.Intrinsics.Vector512`1[ubyte]>
; V11 cse0 [V11,T03] ( 5, 5 ) simd64 -> mm0 "CSE - aggressive"
; V12 cse1 [V12,T04] ( 4, 4 ) simd64 -> mm2 "CSE - aggressive"
@@ -34,25 +34,22 @@ G_M27576_IG02: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0106 {rcx rdx r
vmovups zmm2, zmmword ptr [r8]
vmovaps zmm3, zmm2
vpcmpeqb k1, zmm1, zmm3
- vpmovm2b zmm4, k1
- vxorps ymm5, ymm5, ymm5
- vpcmpub k1, zmm0, zmm5, 1
- vpmovm2b zmm5, k1
- vpternlogd zmm5, zmm2, zmm0, -54
- vpcmpub k1, zmm1, zmm3, 6
- vpmovm2b zmm1, k1
- vpternlogd zmm1, zmm0, zmm2, -54
- vpternlogd zmm4, zmm5, zmm1, -54
- vmovups zmmword ptr [rcx], zmm4
+ vxorps ymm4, ymm4, ymm4
+ vpcmpub k2, zmm0, zmm4, 1
+ vpblendmb zmm4 {k2}, zmm0, zmm2
+ vpcmpub k2, zmm1, zmm3, 6
+ vpblendmb zmm0 {k2}, zmm2, zmm0
+ vpblendmb zmm0 {k1}, zmm0, zmm4
+ vmovups zmmword ptr [rcx], zmm0
mov rax, rcx
; byrRegs +[rax]
- ;; size=96 bbWeight=1 PerfScore 25.08
+ ;; size=75 bbWeight=1 PerfScore 23.58
G_M27576_IG03: ; bbWeight=1, epilog, nogc, extend
vzeroupper
ret
;; size=4 bbWeight=1 PerfScore 2.00
-; Total bytes of code 103, prolog size 3, PerfScore 28.08, instruction count 19, allocated bytes for code 103 (MethodHash=a5449447) for method System.Numerics.Tensors.TensorPrimitives+MaxMagnitudePropagateNaNOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector512`1[ubyte],System.Runtime.Intrinsics.Vector512`1[ubyte]):System.Runtime.Intrinsics.Vector512`1[ubyte] (FullOpts)
+; Total bytes of code 82, prolog size 3, PerfScore 26.58, instruction count 16, allocated bytes for code 82 (MethodHash=a5449447) for method System.Numerics.Tensors.TensorPrimitives+MaxMagnitudePropagateNaNOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector512`1[ubyte],System.Runtime.Intrinsics.Vector512`1[ubyte]):System.Runtime.Intrinsics.Vector512`1[ubyte] (FullOpts)
; ============================================================
Unwind Info:
-21 (-20.39%) : 294062.dasm - System.Numerics.Tensors.TensorPrimitives+MinMagnitudePropagateNaNOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector512`1[ubyte],System.Runtime.Intrinsics.Vector512`1[ubyte]):System.Runtime.Intrinsics.Vector512`1[ubyte] (FullOpts)
@@ -17,7 +17,7 @@
; V06 tmp1 [V06,T05] ( 3, 3 ) simd64 -> mm1 "Inline stloc first use temp" <System.Runtime.Intrinsics.Vector512`1[ubyte]>
; V07 tmp2 [V07,T06] ( 3, 3 ) simd64 -> mm3 "Inline stloc first use temp" <System.Runtime.Intrinsics.Vector512`1[ubyte]>
;* V08 tmp3 [V08 ] ( 0, 0 ) simd64 -> zero-ref "spilled call-like call argument"
-; V09 tmp4 [V09,T07] ( 2, 2 ) simd64 -> mm4 "Inline stloc first use temp" <System.Runtime.Intrinsics.Vector512`1[ubyte]>
+; V09 tmp4 [V09,T07] ( 2, 2 ) simd64 -> mm0 "Inline stloc first use temp" <System.Runtime.Intrinsics.Vector512`1[ubyte]>
;* V10 tmp5 [V10 ] ( 0, 0 ) simd64 -> zero-ref "Inline return value spill temp" <System.Runtime.Intrinsics.Vector512`1[ubyte]>
; V11 cse0 [V11,T03] ( 5, 5 ) simd64 -> mm2 "CSE - aggressive"
; V12 cse1 [V12,T04] ( 4, 4 ) simd64 -> mm0 "CSE - aggressive"
@@ -34,25 +34,22 @@ G_M10214_IG02: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0106 {rcx rdx r
vmovups zmm2, zmmword ptr [r8]
vmovaps zmm3, zmm2
vpcmpeqb k1, zmm3, zmm1
- vpmovm2b zmm4, k1
- vxorps ymm5, ymm5, ymm5
- vpcmpub k1, zmm2, zmm5, 1
- vpmovm2b zmm5, k1
- vpternlogd zmm5, zmm2, zmm0, -54
- vpcmpub k1, zmm3, zmm1, 1
- vpmovm2b zmm1, k1
- vpternlogd zmm1, zmm2, zmm0, -54
- vpternlogd zmm4, zmm5, zmm1, -54
- vmovups zmmword ptr [rcx], zmm4
+ vxorps ymm4, ymm4, ymm4
+ vpcmpub k2, zmm2, zmm4, 1
+ vpblendmb zmm4 {k2}, zmm0, zmm2
+ vpcmpub k2, zmm3, zmm1, 1
+ vpblendmb zmm0 {k2}, zmm0, zmm2
+ vpblendmb zmm0 {k1}, zmm0, zmm4
+ vmovups zmmword ptr [rcx], zmm0
mov rax, rcx
; byrRegs +[rax]
- ;; size=96 bbWeight=1 PerfScore 25.08
+ ;; size=75 bbWeight=1 PerfScore 23.58
G_M10214_IG03: ; bbWeight=1, epilog, nogc, extend
vzeroupper
ret
;; size=4 bbWeight=1 PerfScore 2.00
-; Total bytes of code 103, prolog size 3, PerfScore 28.08, instruction count 19, allocated bytes for code 103 (MethodHash=6846d819) for method System.Numerics.Tensors.TensorPrimitives+MinMagnitudePropagateNaNOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector512`1[ubyte],System.Runtime.Intrinsics.Vector512`1[ubyte]):System.Runtime.Intrinsics.Vector512`1[ubyte] (FullOpts)
+; Total bytes of code 82, prolog size 3, PerfScore 26.58, instruction count 16, allocated bytes for code 82 (MethodHash=6846d819) for method System.Numerics.Tensors.TensorPrimitives+MinMagnitudePropagateNaNOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector512`1[ubyte],System.Runtime.Intrinsics.Vector512`1[ubyte]):System.Runtime.Intrinsics.Vector512`1[ubyte] (FullOpts)
; ============================================================
Unwind Info:
-21 (-20.39%) : 293983.dasm - System.Numerics.Tensors.TensorPrimitives+MaxMagnitudeOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector512`1[ubyte],System.Runtime.Intrinsics.Vector512`1[ubyte]):System.Runtime.Intrinsics.Vector512`1[ubyte] (FullOpts)
@@ -13,7 +13,7 @@
; V02 arg1 [V02,T01] ( 3, 6 ) byref -> r8 single-def
; V03 loc0 [V03,T05] ( 3, 3 ) simd64 -> mm1 <System.Runtime.Intrinsics.Vector512`1[ubyte]>
; V04 loc1 [V04,T06] ( 3, 3 ) simd64 -> mm3 <System.Runtime.Intrinsics.Vector512`1[ubyte]>
-; V05 loc2 [V05,T07] ( 2, 2 ) simd64 -> mm4 <System.Runtime.Intrinsics.Vector512`1[ubyte]>
+; V05 loc2 [V05,T07] ( 2, 2 ) simd64 -> mm0 <System.Runtime.Intrinsics.Vector512`1[ubyte]>
;* V06 loc3 [V06 ] ( 0, 0 ) simd64 -> zero-ref <System.Runtime.Intrinsics.Vector512`1[ubyte]>
;* V07 loc4 [V07 ] ( 0, 0 ) simd64 -> zero-ref <System.Runtime.Intrinsics.Vector512`1[ubyte]>
;# V08 OutArgs [V08 ] ( 1, 1 ) struct ( 0) [rsp+0x00] do-not-enreg[XS] addr-exposed "OutgoingArgSpace"
@@ -34,25 +34,22 @@ G_M22834_IG02: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0106 {rcx rdx r
vmovups zmm2, zmmword ptr [r8]
vmovaps zmm3, zmm2
vpcmpeqb k1, zmm1, zmm3
- vpmovm2b zmm4, k1
- vxorps ymm5, ymm5, ymm5
- vpcmpub k1, zmm0, zmm5, 1
- vpmovm2b zmm5, k1
- vpternlogd zmm5, zmm2, zmm0, -54
- vpcmpub k1, zmm1, zmm3, 6
- vpmovm2b zmm1, k1
- vpternlogd zmm1, zmm0, zmm2, -54
- vpternlogd zmm4, zmm5, zmm1, -54
- vmovups zmmword ptr [rcx], zmm4
+ vxorps ymm4, ymm4, ymm4
+ vpcmpub k2, zmm0, zmm4, 1
+ vpblendmb zmm4 {k2}, zmm0, zmm2
+ vpcmpub k2, zmm1, zmm3, 6
+ vpblendmb zmm0 {k2}, zmm2, zmm0
+ vpblendmb zmm0 {k1}, zmm0, zmm4
+ vmovups zmmword ptr [rcx], zmm0
mov rax, rcx
; byrRegs +[rax]
- ;; size=96 bbWeight=1 PerfScore 25.08
+ ;; size=75 bbWeight=1 PerfScore 23.58
G_M22834_IG03: ; bbWeight=1, epilog, nogc, extend
vzeroupper
ret
;; size=4 bbWeight=1 PerfScore 2.00
-; Total bytes of code 103, prolog size 3, PerfScore 28.08, instruction count 19, allocated bytes for code 103 (MethodHash=885fa6cd) for method System.Numerics.Tensors.TensorPrimitives+MaxMagnitudeOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector512`1[ubyte],System.Runtime.Intrinsics.Vector512`1[ubyte]):System.Runtime.Intrinsics.Vector512`1[ubyte] (FullOpts)
+; Total bytes of code 82, prolog size 3, PerfScore 26.58, instruction count 16, allocated bytes for code 82 (MethodHash=885fa6cd) for method System.Numerics.Tensors.TensorPrimitives+MaxMagnitudeOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector512`1[ubyte],System.Runtime.Intrinsics.Vector512`1[ubyte]):System.Runtime.Intrinsics.Vector512`1[ubyte] (FullOpts)
; ============================================================
Unwind Info:
-7 (-5.65%) : 27696.dasm - System.Buffers.ProbabilisticMap:IsCharBitSetAvx2(System.Runtime.Intrinsics.Vector256`1[ubyte],System.Runtime.Intrinsics.Vector256`1[ubyte],System.Runtime.Intrinsics.Vector256`1[ubyte]):System.Runtime.Intrinsics.Vector256`1[ubyte] (FullOpts)
@@ -35,14 +35,13 @@ G_M53822_IG02: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0306 {rcx rdx r
vpshufb ymm1, ymm2, ymm1
vpand ymm0, ymm0, ymmword ptr [reloc @RWD64]
vpcmpub k1, ymm0, ymmword ptr [reloc @RWD96], 6
- vpmovm2b ymm2, k1
- vmovups ymm3, ymmword ptr [r8]
- vpsubb ymm4, ymm0, ymmword ptr [reloc @RWD128]
- vpshufb ymm3, ymm3, ymm4
- vmovups ymm4, ymmword ptr [rdx]
- vpshufb ymm0, ymm4, ymm0
- vpternlogd ymm2, ymm3, ymm0, -54
- vpand ymm0, ymm2, ymm1
+ vmovups ymm2, ymmword ptr [r8]
+ vpsubb ymm3, ymm0, ymmword ptr [reloc @RWD128]
+ vpshufb ymm2, ymm2, ymm3
+ vmovups ymm3, ymmword ptr [rdx]
+ vpshufb ymm0, ymm3, ymm0
+ vpblendmb ymm0 {k1}, ymm0, ymm2
+ vpand ymm0, ymm0, ymm1
vxorps ymm1, ymm1, ymm1
vpcmpeqb ymm0, ymm0, ymm1
vpcmpeqd ymm1, ymm1, ymm1
@@ -50,7 +49,7 @@ G_M53822_IG02: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0306 {rcx rdx r
vmovups ymmword ptr [rcx], ymm0
mov rax, rcx
; byrRegs +[rax]
- ;; size=117 bbWeight=1 PerfScore 42.75
+ ;; size=110 bbWeight=1 PerfScore 42.25
G_M53822_IG03: ; bbWeight=1, epilog, nogc, extend
vzeroupper
ret
@@ -62,7 +61,7 @@ RWD96 dq 0F0F0F0F0F0F0F0Fh, 0F0F0F0F0F0F0F0Fh, 0F0F0F0F0F0F0F0Fh, 0F0F0F0F0F0F
RWD128 dq 1010101010101010h, 1010101010101010h, 1010101010101010h, 1010101010101010h
-; Total bytes of code 124, prolog size 3, PerfScore 45.75, instruction count 24, allocated bytes for code 124 (MethodHash=47dc2dc1) for method System.Buffers.ProbabilisticMap:IsCharBitSetAvx2(System.Runtime.Intrinsics.Vector256`1[ubyte],System.Runtime.Intrinsics.Vector256`1[ubyte],System.Runtime.Intrinsics.Vector256`1[ubyte]):System.Runtime.Intrinsics.Vector256`1[ubyte] (FullOpts)
+; Total bytes of code 117, prolog size 3, PerfScore 45.25, instruction count 23, allocated bytes for code 117 (MethodHash=47dc2dc1) for method System.Buffers.ProbabilisticMap:IsCharBitSetAvx2(System.Runtime.Intrinsics.Vector256`1[ubyte],System.Runtime.Intrinsics.Vector256`1[ubyte],System.Runtime.Intrinsics.Vector256`1[ubyte]):System.Runtime.Intrinsics.Vector256`1[ubyte] (FullOpts)
; ============================================================
Unwind Info:
-7 (-4.58%) : 293786.dasm - System.Numerics.Tensors.TensorPrimitives:<ConvertToSingle>g__HalfAsWidenedUInt32ToSingle_Vector512|210_2(System.Runtime.Intrinsics.Vector512`1[uint]):System.Runtime.Intrinsics.Vector512`1[float] (FullOpts)
@@ -39,9 +39,8 @@ G_M58105_IG02: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0006 {rcx rdx},
vpandd zmm3, zmm4, dword ptr [reloc @RWD128] {1to16}
vpord zmm4, zmm3, dword ptr [reloc @RWD132] {1to16}
vptestnmd k1, zmm2, zmm2
- vpmovm2d zmm2, k1
- vpslld zmm5, zmm4, 1
- vpternlogd zmm2, zmm4, zmm5, -54
+ vpslld zmm2, zmm4, 1
+ vpblendmd zmm2 {k1}, zmm2, zmm4
vpslld zmm0, zmm0, 13
vpandd zmm0, zmm0, dword ptr [reloc @RWD136] {1to16}
vpaddd zmm0, zmm0, zmm2
@@ -50,7 +49,7 @@ G_M58105_IG02: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0006 {rcx rdx},
vmovups zmmword ptr [rcx], zmm0
mov rax, rcx
; byrRegs +[rax]
- ;; size=146 bbWeight=1 PerfScore 33.25
+ ;; size=139 bbWeight=1 PerfScore 32.25
G_M58105_IG03: ; bbWeight=1, epilog, nogc, extend
vzeroupper
ret
@@ -65,7 +64,7 @@ RWD132 dd 38000000h
RWD136 dd 0FFFE000h
-; Total bytes of code 153, prolog size 3, PerfScore 36.25, instruction count 24, allocated bytes for code 153 (MethodHash=e6ab1d06) for method System.Numerics.Tensors.TensorPrimitives:<ConvertToSingle>g__HalfAsWidenedUInt32ToSingle_Vector512|210_2(System.Runtime.Intrinsics.Vector512`1[uint]):System.Runtime.Intrinsics.Vector512`1[float] (FullOpts)
+; Total bytes of code 146, prolog size 3, PerfScore 35.25, instruction count 23, allocated bytes for code 146 (MethodHash=e6ab1d06) for method System.Numerics.Tensors.TensorPrimitives:<ConvertToSingle>g__HalfAsWidenedUInt32ToSingle_Vector512|210_2(System.Runtime.Intrinsics.Vector512`1[uint]):System.Runtime.Intrinsics.Vector512`1[float] (FullOpts)
; ============================================================
Unwind Info:
-28 (-2.50%) : 27702.dasm - System.Buffers.ProbabilisticMap:IndexOfAnyVectorized(byref,byref,int,System.ReadOnlySpan`1[ushort]):int (FullOpts)
@@ -186,12 +186,11 @@ G_M48875_IG06: ; bbWeight=4, gcrefRegs=0000 {}, byrefRegs=C0C8 {rbx rsi r
vpshufb ymm1, ymm3, ymm1
vpand ymm2, ymm2, ymmword ptr [reloc @RWD96]
vpcmpub k1, ymm2, ymmword ptr [reloc @RWD128], 6
- vpmovm2b ymm3, k1
- vpsubb ymm4, ymm2, ymmword ptr [reloc @RWD160]
- vpshufb ymm4, ymm9, ymm4
+ vpsubb ymm3, ymm2, ymmword ptr [reloc @RWD160]
+ vpshufb ymm3, ymm9, ymm3
vpshufb ymm2, ymm8, ymm2
- vpternlogd ymm3, ymm4, ymm2, -54
- vpand ymm1, ymm3, ymm1
+ vpblendmb ymm2 {k1}, ymm2, ymm3
+ vpand ymm1, ymm2, ymm1
vxorps ymm2, ymm2, ymm2
vpcmpeqb ymm1, ymm1, ymm2
vpcmpeqd ymm2, ymm2, ymm2
@@ -202,12 +201,11 @@ G_M48875_IG06: ; bbWeight=4, gcrefRegs=0000 {}, byrefRegs=C0C8 {rbx rsi r
vpshufb ymm2, ymm3, ymm2
vpand ymm0, ymm0, ymmword ptr [reloc @RWD96]
vpcmpub k1, ymm0, ymmword ptr [reloc @RWD128], 6
- vpmovm2b ymm3, k1
- vpsubb ymm4, ymm0, ymmword ptr [reloc @RWD160]
- vpshufb ymm4, ymm9, ymm4
+ vpsubb ymm3, ymm0, ymmword ptr [reloc @RWD160]
+ vpshufb ymm3, ymm9, ymm3
vpshufb ymm0, ymm8, ymm0
- vpternlogd ymm3, ymm4, ymm0, -54
- vpand ymm0, ymm3, ymm2
+ vpblendmb ymm0 {k1}, ymm0, ymm3
+ vpand ymm0, ymm0, ymm2
vxorps ymm2, ymm2, ymm2
vpcmpeqb ymm0, ymm0, ymm2
vpcmpeqd ymm2, ymm2, ymm2
@@ -215,7 +213,7 @@ G_M48875_IG06: ; bbWeight=4, gcrefRegs=0000 {}, byrefRegs=C0C8 {rbx rsi r
vpand ymm0, ymm1, ymm0
vptest ymm0, ymm0
je SHORT G_M48875_IG09
- ;; size=248 bbWeight=4 PerfScore 317.33
+ ;; size=234 bbWeight=4 PerfScore 313.33
G_M48875_IG07: ; bbWeight=2, gcrefRegs=0000 {}, byrefRegs=C0C8 {rbx rsi rdi r14 r15}, byref
vpermq ymm0, ymm0, -40
vpmovmskb ebp, ymm0
@@ -338,12 +336,11 @@ G_M48875_IG16: ; bbWeight=4, gcrefRegs=0000 {}, byrefRegs=C0C8 {rbx rsi r
vpshufb xmm1, xmm10, xmm1
vpand xmm2, xmm2, xmm11
vpcmpub k1, xmm2, xmm12, 6
- vpmovm2b xmm3, k1
- vpsubb xmm4, xmm2, xmm13
- vpshufb xmm4, xmm7, xmm4
+ vpsubb xmm3, xmm2, xmm13
+ vpshufb xmm3, xmm7, xmm3
vpshufb xmm2, xmm6, xmm2
- vpternlogd xmm3, xmm4, xmm2, -54
- vpand xmm1, xmm3, xmm1
+ vpblendmb xmm2 {k1}, xmm2, xmm3
+ vpand xmm1, xmm2, xmm1
vxorps xmm2, xmm2, xmm2
vpcmpeqb xmm1, xmm1, xmm2
vpcmpeqd xmm2, xmm2, xmm2
@@ -353,12 +350,11 @@ G_M48875_IG16: ; bbWeight=4, gcrefRegs=0000 {}, byrefRegs=C0C8 {rbx rsi r
vpshufb xmm2, xmm10, xmm2
vpand xmm0, xmm0, xmm11
vpcmpub k1, xmm0, xmm12, 6
- vpmovm2b xmm3, k1
- vpsubb xmm4, xmm0, xmm13
- vpshufb xmm4, xmm7, xmm4
+ vpsubb xmm3, xmm0, xmm13
+ vpshufb xmm3, xmm7, xmm3
vpshufb xmm0, xmm6, xmm0
- vpternlogd xmm3, xmm4, xmm0, -54
- vpand xmm0, xmm3, xmm2
+ vpblendmb xmm0 {k1}, xmm0, xmm3
+ vpand xmm0, xmm0, xmm2
vxorps xmm2, xmm2, xmm2
vpcmpeqb xmm0, xmm0, xmm2
vpcmpeqd xmm2, xmm2, xmm2
@@ -366,7 +362,7 @@ G_M48875_IG16: ; bbWeight=4, gcrefRegs=0000 {}, byrefRegs=C0C8 {rbx rsi r
vpand xmm0, xmm1, xmm0
vptest xmm0, xmm0
je SHORT G_M48875_IG19
- ;; size=200 bbWeight=4 PerfScore 168.00
+ ;; size=186 bbWeight=4 PerfScore 164.00
G_M48875_IG17: ; bbWeight=2, gcrefRegs=0000 {}, byrefRegs=C0C8 {rbx rsi rdi r14 r15}, byref
vpmovmskb ebp, xmm0
;; size=4 bbWeight=2 PerfScore 4.00
@@ -433,7 +429,7 @@ RWD128 dq 0F0F0F0F0F0F0F0Fh, 0F0F0F0F0F0F0F0Fh, 0F0F0F0F0F0F0F0Fh, 0F0F0F0F0F0F
RWD160 dq 1010101010101010h, 1010101010101010h, 1010101010101010h, 1010101010101010h
-; Total bytes of code 1118, prolog size 86, PerfScore 1254.58, instruction count 240, allocated bytes for code 1118 (MethodHash=36e94114) for method System.Buffers.ProbabilisticMap:IndexOfAnyVectorized(byref,byref,int,System.ReadOnlySpan`1[ushort]):int (FullOpts)
+; Total bytes of code 1090, prolog size 86, PerfScore 1246.58, instruction count 236, allocated bytes for code 1090 (MethodHash=36e94114) for method System.Buffers.ProbabilisticMap:IndexOfAnyVectorized(byref,byref,int,System.ReadOnlySpan`1[ushort]):int (FullOpts)
; ============================================================
Unwind Info:
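Reader's note on the pattern in these diffs: in each hunk a `vpmovm2*` (mask to vector) followed by `vpternlog*` with immediate -54 is replaced by a single `vpblendm*` under the same mask register. The immediate -54 is the signed-byte spelling of 0xCA, which encodes the bitwise select `A ? B : C`. The following is a minimal scalar sketch of why the two sequences agree, assuming byte lanes and modelling vectors as Python lists; the function names are hypothetical and this is not the JIT's implementation.

```python
# Illustrative scalar model only: each vector is a list of byte lanes,
# and a mask register is an int whose bit i controls lane i.
TERNLOG_SELECT = 0xCA  # printed as -54 in the disassembly (signed byte)


def ternlog(a, b, c, imm8):
    """Model of vpternlog: for every bit position, the three input bits
    form the index (a<<2 | b<<1 | c) into imm8, which yields the result bit."""
    out = []
    for x, y, z in zip(a, b, c):
        r = 0
        for bit in range(8):
            idx = (((x >> bit) & 1) << 2) | (((y >> bit) & 1) << 1) | ((z >> bit) & 1)
            r |= ((imm8 >> idx) & 1) << bit
        out.append(r)
    return out


def mask_to_vector(k, lanes):
    """Model of vpmovm2b: all-ones lane where the mask bit is set, else zero."""
    return [0xFF if (k >> i) & 1 else 0x00 for i in range(lanes)]


def blend(k, src1, src2):
    """Model of vpblendmb dst {k}, src1, src2: take src2 where the mask bit
    is set, src1 otherwise."""
    return [s2 if (k >> i) & 1 else s1 for i, (s1, s2) in enumerate(zip(src1, src2))]


def old_sequence(k, if_set, if_clear):
    # vpmovm2b tmp, k ; vpternlogd tmp, if_set, if_clear, -54
    return ternlog(mask_to_vector(k, len(if_set)), if_set, if_clear, TERNLOG_SELECT)


def new_sequence(k, if_set, if_clear):
    # vpblendmb dst {k}, if_clear, if_set
    return blend(k, if_clear, if_set)
```

Note the operand order: the blend takes the "mask clear" value as its first source and the "mask set" value as its second, which is why the source registers appear swapped relative to the `vpternlog*` they replace.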
libraries_tests.run.windows.x64.Release.mch
-14 (-16.67%) : 386233.dasm - System.Numerics.Tensors.TensorPrimitives+MaxMagnitudeOperator`1[ulong]:Invoke(System.Runtime.Intrinsics.Vector128`1[ulong],System.Runtime.Intrinsics.Vector128`1[ulong]):System.Runtime.Intrinsics.Vector128`1[ulong] (Tier1)
@@ -37,21 +37,19 @@ G_M11551_IG02: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0106 {rcx rdx r
vpcmpeqq xmm4, xmm1, xmm3
vxorps xmm5, xmm5, xmm5
vpcmpuq k1, xmm0, xmm5, 1
- vpmovm2q xmm5, k1
- vpternlogq xmm5, xmm2, xmm0, -54
+ vpblendmq xmm5 {k1}, xmm0, xmm2
vpcmpuq k1, xmm1, xmm3, 6
- vpmovm2q xmm1, k1
- vpternlogq xmm1, xmm0, xmm2, -54
- vpternlogq xmm4, xmm5, xmm1, -54
+ vpblendmq xmm0 {k1}, xmm2, xmm0
+ vpternlogq xmm4, xmm5, xmm0, -54
vmovups xmmword ptr [rcx], xmm4
mov rax, rcx
; byrRegs +[rax]
- ;; size=80 bbWeight=1 PerfScore 21.08
+ ;; size=66 bbWeight=1 PerfScore 18.75
G_M11551_IG03: ; bbWeight=1, epilog, nogc, extend
ret
;; size=1 bbWeight=1 PerfScore 1.00
-; Total bytes of code 84, prolog size 3, PerfScore 23.08, instruction count 17, allocated bytes for code 84 (MethodHash=e91fd2e0) for method System.Numerics.Tensors.TensorPrimitives+MaxMagnitudeOperator`1[ulong]:Invoke(System.Runtime.Intrinsics.Vector128`1[ulong],System.Runtime.Intrinsics.Vector128`1[ulong]):System.Runtime.Intrinsics.Vector128`1[ulong] (Tier1)
+; Total bytes of code 70, prolog size 3, PerfScore 20.75, instruction count 15, allocated bytes for code 70 (MethodHash=e91fd2e0) for method System.Numerics.Tensors.TensorPrimitives+MaxMagnitudeOperator`1[ulong]:Invoke(System.Runtime.Intrinsics.Vector128`1[ulong],System.Runtime.Intrinsics.Vector128`1[ulong]):System.Runtime.Intrinsics.Vector128`1[ulong] (Tier1)
; ============================================================
Unwind Info:
-14 (-16.67%) : 393190.dasm - System.Numerics.Tensors.TensorPrimitives+MaxMagnitudePropagateNaNOperator`1[ulong]:Invoke(System.Runtime.Intrinsics.Vector128`1[ulong],System.Runtime.Intrinsics.Vector128`1[ulong]):System.Runtime.Intrinsics.Vector128`1[ulong] (Tier1)
@@ -36,21 +36,19 @@ G_M1813_IG02: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0106 {rcx rdx r8
vpcmpeqq xmm4, xmm1, xmm3
vxorps xmm5, xmm5, xmm5
vpcmpuq k1, xmm0, xmm5, 1
- vpmovm2q xmm5, k1
- vpternlogq xmm5, xmm2, xmm0, -54
+ vpblendmq xmm5 {k1}, xmm0, xmm2
vpcmpuq k1, xmm1, xmm3, 6
- vpmovm2q xmm1, k1
- vpternlogq xmm1, xmm0, xmm2, -54
- vpternlogq xmm4, xmm5, xmm1, -54
+ vpblendmq xmm0 {k1}, xmm2, xmm0
+ vpternlogq xmm4, xmm5, xmm0, -54
vmovups xmmword ptr [rcx], xmm4
mov rax, rcx
; byrRegs +[rax]
- ;; size=80 bbWeight=1 PerfScore 21.08
+ ;; size=66 bbWeight=1 PerfScore 18.75
G_M1813_IG03: ; bbWeight=1, epilog, nogc, extend
ret
;; size=1 bbWeight=1 PerfScore 1.00
-; Total bytes of code 84, prolog size 3, PerfScore 23.08, instruction count 17, allocated bytes for code 84 (MethodHash=f881f8ea) for method System.Numerics.Tensors.TensorPrimitives+MaxMagnitudePropagateNaNOperator`1[ulong]:Invoke(System.Runtime.Intrinsics.Vector128`1[ulong],System.Runtime.Intrinsics.Vector128`1[ulong]):System.Runtime.Intrinsics.Vector128`1[ulong] (Tier1)
+; Total bytes of code 70, prolog size 3, PerfScore 20.75, instruction count 15, allocated bytes for code 70 (MethodHash=f881f8ea) for method System.Numerics.Tensors.TensorPrimitives+MaxMagnitudePropagateNaNOperator`1[ulong]:Invoke(System.Runtime.Intrinsics.Vector128`1[ulong],System.Runtime.Intrinsics.Vector128`1[ulong]):System.Runtime.Intrinsics.Vector128`1[ulong] (Tier1)
; ============================================================
Unwind Info:
-14 (-16.67%) : 393191.dasm - System.Numerics.Tensors.TensorPrimitives+MaxMagnitudeOperator`1[ulong]:Invoke(System.Runtime.Intrinsics.Vector128`1[ulong],System.Runtime.Intrinsics.Vector128`1[ulong]):System.Runtime.Intrinsics.Vector128`1[ulong] (Tier1)
@@ -37,21 +37,19 @@ G_M11551_IG02: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0106 {rcx rdx r
vpcmpeqq xmm4, xmm1, xmm3
vxorps xmm5, xmm5, xmm5
vpcmpuq k1, xmm0, xmm5, 1
- vpmovm2q xmm5, k1
- vpternlogq xmm5, xmm2, xmm0, -54
+ vpblendmq xmm5 {k1}, xmm0, xmm2
vpcmpuq k1, xmm1, xmm3, 6
- vpmovm2q xmm1, k1
- vpternlogq xmm1, xmm0, xmm2, -54
- vpternlogq xmm4, xmm5, xmm1, -54
+ vpblendmq xmm0 {k1}, xmm2, xmm0
+ vpternlogq xmm4, xmm5, xmm0, -54
vmovups xmmword ptr [rcx], xmm4
mov rax, rcx
; byrRegs +[rax]
- ;; size=80 bbWeight=1 PerfScore 21.08
+ ;; size=66 bbWeight=1 PerfScore 18.75
G_M11551_IG03: ; bbWeight=1, epilog, nogc, extend
ret
;; size=1 bbWeight=1 PerfScore 1.00
-; Total bytes of code 84, prolog size 3, PerfScore 23.08, instruction count 17, allocated bytes for code 84 (MethodHash=e91fd2e0) for method System.Numerics.Tensors.TensorPrimitives+MaxMagnitudeOperator`1[ulong]:Invoke(System.Runtime.Intrinsics.Vector128`1[ulong],System.Runtime.Intrinsics.Vector128`1[ulong]):System.Runtime.Intrinsics.Vector128`1[ulong] (Tier1)
+; Total bytes of code 70, prolog size 3, PerfScore 20.75, instruction count 15, allocated bytes for code 70 (MethodHash=e91fd2e0) for method System.Numerics.Tensors.TensorPrimitives+MaxMagnitudeOperator`1[ulong]:Invoke(System.Runtime.Intrinsics.Vector128`1[ulong],System.Runtime.Intrinsics.Vector128`1[ulong]):System.Runtime.Intrinsics.Vector128`1[ulong] (Tier1)
; ============================================================
Unwind Info:
-7 (-2.52%) : 392581.dasm - System.Numerics.Tensors.TensorPrimitives+MinMagnitudeOperator`1[ulong]:Invoke(System.Runtime.Intrinsics.Vector256`1[ulong],System.Runtime.Intrinsics.Vector256`1[ulong]):System.Runtime.Intrinsics.Vector256`1[ulong] (Instrumented Tier0)
@@ -59,16 +59,17 @@ G_M12395_IG02: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
vpternlogq ymm1, ymm2, ymmword ptr [rcx], -54
vmovups ymm2, ymmword ptr [rbp-0x50]
vpcmpuq k1, ymm2, ymmword ptr [rbp-0x30], 1
- vpmovm2q ymm2, k1
mov rcx, bword ptr [rbp+0x20]
- vmovups ymm3, ymmword ptr [rcx]
- mov rcx, bword ptr [rbp+0x18]
- vpternlogq ymm2, ymm3, ymmword ptr [rcx], -54
+ mov rax, bword ptr [rbp+0x18]
+ ; byrRegs +[rax]
+ vmovups ymm2, ymmword ptr [rax]
+ vpblendmq ymm2 {k1}, ymm2, ymmword ptr [rcx]
vpternlogq ymm0, ymm1, ymm2, -54
vmovups ymmword ptr [rbp-0x70], ymm0
mov rcx, 0xD1FFAB1E
; byrRegs -[rcx]
call CORINFO_HELP_COUNTPROFILE32
+ ; byrRegs -[rax]
mov rcx, 0xD1FFAB1E
call CORINFO_HELP_COUNTPROFILE32
mov rax, bword ptr [rbp+0x10]
@@ -76,7 +77,7 @@ G_M12395_IG02: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
vmovups ymm0, ymmword ptr [rbp-0x70]
vmovups ymmword ptr [rax], ymm0
mov rax, bword ptr [rbp+0x10]
- ;; size=200 bbWeight=1 PerfScore 77.00
+ ;; size=193 bbWeight=1 PerfScore 76.00
G_M12395_IG03: ; bbWeight=1, epilog, nogc, extend
vzeroupper
add rsp, 272
@@ -84,7 +85,7 @@ G_M12395_IG03: ; bbWeight=1, epilog, nogc, extend
ret
;; size=12 bbWeight=1 PerfScore 2.75
-; Total bytes of code 278, prolog size 54, PerfScore 95.83, instruction count 53, allocated bytes for code 278 (MethodHash=3d67cf94) for method System.Numerics.Tensors.TensorPrimitives+MinMagnitudeOperator`1[ulong]:Invoke(System.Runtime.Intrinsics.Vector256`1[ulong],System.Runtime.Intrinsics.Vector256`1[ulong]):System.Runtime.Intrinsics.Vector256`1[ulong] (Instrumented Tier0)
+; Total bytes of code 271, prolog size 54, PerfScore 94.83, instruction count 52, allocated bytes for code 272 (MethodHash=3d67cf94) for method System.Numerics.Tensors.TensorPrimitives+MinMagnitudeOperator`1[ulong]:Invoke(System.Runtime.Intrinsics.Vector256`1[ulong],System.Runtime.Intrinsics.Vector256`1[ulong]):System.Runtime.Intrinsics.Vector256`1[ulong] (Instrumented Tier0)
; ============================================================
Unwind Info:
-7 (-2.52%) : 392539.dasm - System.Numerics.Tensors.TensorPrimitives+MaxMagnitudeOperator`1[ulong]:Invoke(System.Runtime.Intrinsics.Vector256`1[ulong],System.Runtime.Intrinsics.Vector256`1[ulong]):System.Runtime.Intrinsics.Vector256`1[ulong] (Instrumented Tier0)
@@ -59,16 +59,17 @@ G_M63669_IG02: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
vpternlogq ymm1, ymm2, ymmword ptr [rcx], -54
vmovups ymm2, ymmword ptr [rbp-0x30]
vpcmpuq k1, ymm2, ymmword ptr [rbp-0x50], 6
- vpmovm2q ymm2, k1
mov rcx, bword ptr [rbp+0x18]
- vmovups ymm3, ymmword ptr [rcx]
- mov rcx, bword ptr [rbp+0x20]
- vpternlogq ymm2, ymm3, ymmword ptr [rcx], -54
+ mov rax, bword ptr [rbp+0x20]
+ ; byrRegs +[rax]
+ vmovups ymm2, ymmword ptr [rax]
+ vpblendmq ymm2 {k1}, ymm2, ymmword ptr [rcx]
vpternlogq ymm0, ymm1, ymm2, -54
vmovups ymmword ptr [rbp-0x70], ymm0
mov rcx, 0xD1FFAB1E
; byrRegs -[rcx]
call CORINFO_HELP_COUNTPROFILE32
+ ; byrRegs -[rax]
mov rcx, 0xD1FFAB1E
call CORINFO_HELP_COUNTPROFILE32
mov rax, bword ptr [rbp+0x10]
@@ -76,7 +77,7 @@ G_M63669_IG02: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
vmovups ymm0, ymmword ptr [rbp-0x70]
vmovups ymmword ptr [rax], ymm0
mov rax, bword ptr [rbp+0x10]
- ;; size=200 bbWeight=1 PerfScore 77.00
+ ;; size=193 bbWeight=1 PerfScore 76.00
G_M63669_IG03: ; bbWeight=1, epilog, nogc, extend
vzeroupper
add rsp, 272
@@ -84,7 +85,7 @@ G_M63669_IG03: ; bbWeight=1, epilog, nogc, extend
ret
;; size=12 bbWeight=1 PerfScore 2.75
-; Total bytes of code 278, prolog size 54, PerfScore 95.83, instruction count 53, allocated bytes for code 278 (MethodHash=dc23074a) for method System.Numerics.Tensors.TensorPrimitives+MaxMagnitudeOperator`1[ulong]:Invoke(System.Runtime.Intrinsics.Vector256`1[ulong],System.Runtime.Intrinsics.Vector256`1[ulong]):System.Runtime.Intrinsics.Vector256`1[ulong] (Instrumented Tier0)
+; Total bytes of code 271, prolog size 54, PerfScore 94.83, instruction count 52, allocated bytes for code 272 (MethodHash=dc23074a) for method System.Numerics.Tensors.TensorPrimitives+MaxMagnitudeOperator`1[ulong]:Invoke(System.Runtime.Intrinsics.Vector256`1[ulong],System.Runtime.Intrinsics.Vector256`1[ulong]):System.Runtime.Intrinsics.Vector256`1[ulong] (Instrumented Tier0)
; ============================================================
Unwind Info:
-28 (-2.40%) : 342446.dasm - System.Buffers.ProbabilisticMap:IndexOfAnyVectorized(byref,byref,int,System.ReadOnlySpan`1[ushort]):int (Tier1)
@@ -172,12 +172,11 @@ G_M48875_IG04: ; bbWeight=4.37, gcrefRegs=0000 {}, byrefRegs=C0C8 {rbx rs
vpshufb ymm1, ymm3, ymm1
vpand ymm2, ymm2, ymmword ptr [reloc @RWD96]
vpcmpub k1, ymm2, ymmword ptr [reloc @RWD128], 6
- vpmovm2b ymm3, k1
- vpsubb ymm4, ymm2, ymmword ptr [reloc @RWD160]
- vpshufb ymm4, ymm9, ymm4
+ vpsubb ymm3, ymm2, ymmword ptr [reloc @RWD160]
+ vpshufb ymm3, ymm9, ymm3
vpshufb ymm2, ymm8, ymm2
- vpternlogd ymm3, ymm4, ymm2, -54
- vpand ymm1, ymm3, ymm1
+ vpblendmb ymm2 {k1}, ymm2, ymm3
+ vpand ymm1, ymm2, ymm1
vxorps ymm2, ymm2, ymm2
vpcmpeqb ymm1, ymm1, ymm2
vpcmpeqd ymm2, ymm2, ymm2
@@ -188,12 +187,11 @@ G_M48875_IG04: ; bbWeight=4.37, gcrefRegs=0000 {}, byrefRegs=C0C8 {rbx rs
vpshufb ymm2, ymm3, ymm2
vpand ymm0, ymm0, ymmword ptr [reloc @RWD96]
vpcmpub k1, ymm0, ymmword ptr [reloc @RWD128], 6
- vpmovm2b ymm3, k1
- vpsubb ymm4, ymm0, ymmword ptr [reloc @RWD160]
- vpshufb ymm4, ymm9, ymm4
+ vpsubb ymm3, ymm0, ymmword ptr [reloc @RWD160]
+ vpshufb ymm3, ymm9, ymm3
vpshufb ymm0, ymm8, ymm0
- vpternlogd ymm3, ymm4, ymm0, -54
- vpand ymm0, ymm3, ymm2
+ vpblendmb ymm0 {k1}, ymm0, ymm3
+ vpand ymm0, ymm0, ymm2
vxorps ymm2, ymm2, ymm2
vpcmpeqb ymm0, ymm0, ymm2
vpcmpeqd ymm2, ymm2, ymm2
@@ -201,7 +199,7 @@ G_M48875_IG04: ; bbWeight=4.37, gcrefRegs=0000 {}, byrefRegs=C0C8 {rbx rs
vpand ymm10, ymm1, ymm0
vptest ymm10, ymm10
jne SHORT G_M48875_IG07
- ;; size=248 bbWeight=4.37 PerfScore 346.74
+ ;; size=234 bbWeight=4.37 PerfScore 342.37
G_M48875_IG05: ; bbWeight=3.45, gcrefRegs=0000 {}, byrefRegs=C0C8 {rbx rsi rdi r14 r15}, byref
add r15, 64
cmp r15, rsi
@@ -318,12 +316,11 @@ G_M48875_IG17: ; bbWeight=0.08, gcrefRegs=0000 {}, byrefRegs=C0C8 {rbx rs
vpshufb xmm1, xmm3, xmm1
vpand xmm2, xmm2, xmmword ptr [reloc @RWD96]
vpcmpub k1, xmm2, xmmword ptr [reloc @RWD128], 6
- vpmovm2b xmm3, k1
- vpsubb xmm4, xmm2, xmmword ptr [reloc @RWD160]
- vpshufb xmm4, xmm7, xmm4
+ vpsubb xmm3, xmm2, xmmword ptr [reloc @RWD160]
+ vpshufb xmm3, xmm7, xmm3
vpshufb xmm2, xmm6, xmm2
- vpternlogd xmm3, xmm4, xmm2, -54
- vpand xmm1, xmm3, xmm1
+ vpblendmb xmm2 {k1}, xmm2, xmm3
+ vpand xmm1, xmm2, xmm1
vxorps xmm2, xmm2, xmm2
vpcmpeqb xmm1, xmm1, xmm2
vpcmpeqd xmm2, xmm2, xmm2
@@ -334,12 +331,11 @@ G_M48875_IG17: ; bbWeight=0.08, gcrefRegs=0000 {}, byrefRegs=C0C8 {rbx rs
vpshufb xmm2, xmm3, xmm2
vpand xmm0, xmm0, xmmword ptr [reloc @RWD96]
vpcmpub k1, xmm0, xmmword ptr [reloc @RWD128], 6
- vpmovm2b xmm3, k1
- vpsubb xmm4, xmm0, xmmword ptr [reloc @RWD160]
- vpshufb xmm4, xmm7, xmm4
+ vpsubb xmm3, xmm0, xmmword ptr [reloc @RWD160]
+ vpshufb xmm3, xmm7, xmm3
vpshufb xmm0, xmm6, xmm0
- vpternlogd xmm3, xmm4, xmm0, -54
- vpand xmm0, xmm3, xmm2
+ vpblendmb xmm0 {k1}, xmm0, xmm3
+ vpand xmm0, xmm0, xmm2
vxorps xmm2, xmm2, xmm2
vpcmpeqb xmm0, xmm0, xmm2
vpcmpeqd xmm2, xmm2, xmm2
@@ -347,7 +343,7 @@ G_M48875_IG17: ; bbWeight=0.08, gcrefRegs=0000 {}, byrefRegs=C0C8 {rbx rs
vpand xmm0, xmm1, xmm0
vptest xmm0, xmm0
je SHORT G_M48875_IG21
- ;; size=248 bbWeight=0.08 PerfScore 5.05
+ ;; size=234 bbWeight=0.08 PerfScore 4.97
G_M48875_IG18: ; bbWeight=0.07, gcrefRegs=0000 {}, byrefRegs=C0C8 {rbx rsi rdi r14 r15}, byref
vpmovmskb ebp, xmm0
;; size=4 bbWeight=0.07 PerfScore 0.14
@@ -444,7 +440,7 @@ RWD128 dq 0F0F0F0F0F0F0F0Fh, 0F0F0F0F0F0F0F0Fh, 0F0F0F0F0F0F0F0Fh, 0F0F0F0F0F0F
RWD160 dq 1010101010101010h, 1010101010101010h, 1010101010101010h, 1010101010101010h
-; Total bytes of code 1165, prolog size 86, PerfScore 510.92, instruction count 247, allocated bytes for code 1165 (MethodHash=36e94114) for method System.Buffers.ProbabilisticMap:IndexOfAnyVectorized(byref,byref,int,System.ReadOnlySpan`1[ushort]):int (Tier1)
+; Total bytes of code 1137, prolog size 86, PerfScore 506.47, instruction count 243, allocated bytes for code 1137 (MethodHash=36e94114) for method System.Buffers.ProbabilisticMap:IndexOfAnyVectorized(byref,byref,int,System.ReadOnlySpan`1[ushort]):int (Tier1)
; ============================================================
Unwind Info:
librariestestsnotieredcompilation.run.windows.x64.Release.mch
-28 (-2.50%) : 150728.dasm - System.Buffers.ProbabilisticMap:IndexOfAnyVectorized(byref,byref,int,System.ReadOnlySpan`1[ushort]):int (FullOpts)
@@ -186,12 +186,11 @@ G_M48875_IG06: ; bbWeight=4, gcrefRegs=0000 {}, byrefRegs=C0C8 {rbx rsi r
vpshufb ymm1, ymm3, ymm1
vpand ymm2, ymm2, ymmword ptr [reloc @RWD96]
vpcmpub k1, ymm2, ymmword ptr [reloc @RWD128], 6
- vpmovm2b ymm3, k1
- vpsubb ymm4, ymm2, ymmword ptr [reloc @RWD160]
- vpshufb ymm4, ymm9, ymm4
+ vpsubb ymm3, ymm2, ymmword ptr [reloc @RWD160]
+ vpshufb ymm3, ymm9, ymm3
vpshufb ymm2, ymm8, ymm2
- vpternlogd ymm3, ymm4, ymm2, -54
- vpand ymm1, ymm3, ymm1
+ vpblendmb ymm2 {k1}, ymm2, ymm3
+ vpand ymm1, ymm2, ymm1
vxorps ymm2, ymm2, ymm2
vpcmpeqb ymm1, ymm1, ymm2
vpcmpeqd ymm2, ymm2, ymm2
@@ -202,12 +201,11 @@ G_M48875_IG06: ; bbWeight=4, gcrefRegs=0000 {}, byrefRegs=C0C8 {rbx rsi r
vpshufb ymm2, ymm3, ymm2
vpand ymm0, ymm0, ymmword ptr [reloc @RWD96]
vpcmpub k1, ymm0, ymmword ptr [reloc @RWD128], 6
- vpmovm2b ymm3, k1
- vpsubb ymm4, ymm0, ymmword ptr [reloc @RWD160]
- vpshufb ymm4, ymm9, ymm4
+ vpsubb ymm3, ymm0, ymmword ptr [reloc @RWD160]
+ vpshufb ymm3, ymm9, ymm3
vpshufb ymm0, ymm8, ymm0
- vpternlogd ymm3, ymm4, ymm0, -54
- vpand ymm0, ymm3, ymm2
+ vpblendmb ymm0 {k1}, ymm0, ymm3
+ vpand ymm0, ymm0, ymm2
vxorps ymm2, ymm2, ymm2
vpcmpeqb ymm0, ymm0, ymm2
vpcmpeqd ymm2, ymm2, ymm2
@@ -215,7 +213,7 @@ G_M48875_IG06: ; bbWeight=4, gcrefRegs=0000 {}, byrefRegs=C0C8 {rbx rsi r
vpand ymm0, ymm1, ymm0
vptest ymm0, ymm0
je SHORT G_M48875_IG09
- ;; size=248 bbWeight=4 PerfScore 317.33
+ ;; size=234 bbWeight=4 PerfScore 313.33
G_M48875_IG07: ; bbWeight=2, gcrefRegs=0000 {}, byrefRegs=C0C8 {rbx rsi rdi r14 r15}, byref
vpermq ymm0, ymm0, -40
vpmovmskb ebp, ymm0
@@ -338,12 +336,11 @@ G_M48875_IG16: ; bbWeight=4, gcrefRegs=0000 {}, byrefRegs=C0C8 {rbx rsi r
vpshufb xmm1, xmm10, xmm1
vpand xmm2, xmm2, xmm11
vpcmpub k1, xmm2, xmm12, 6
- vpmovm2b xmm3, k1
- vpsubb xmm4, xmm2, xmm13
- vpshufb xmm4, xmm7, xmm4
+ vpsubb xmm3, xmm2, xmm13
+ vpshufb xmm3, xmm7, xmm3
vpshufb xmm2, xmm6, xmm2
- vpternlogd xmm3, xmm4, xmm2, -54
- vpand xmm1, xmm3, xmm1
+ vpblendmb xmm2 {k1}, xmm2, xmm3
+ vpand xmm1, xmm2, xmm1
vxorps xmm2, xmm2, xmm2
vpcmpeqb xmm1, xmm1, xmm2
vpcmpeqd xmm2, xmm2, xmm2
@@ -353,12 +350,11 @@ G_M48875_IG16: ; bbWeight=4, gcrefRegs=0000 {}, byrefRegs=C0C8 {rbx rsi r
vpshufb xmm2, xmm10, xmm2
vpand xmm0, xmm0, xmm11
vpcmpub k1, xmm0, xmm12, 6
- vpmovm2b xmm3, k1
- vpsubb xmm4, xmm0, xmm13
- vpshufb xmm4, xmm7, xmm4
+ vpsubb xmm3, xmm0, xmm13
+ vpshufb xmm3, xmm7, xmm3
vpshufb xmm0, xmm6, xmm0
- vpternlogd xmm3, xmm4, xmm0, -54
- vpand xmm0, xmm3, xmm2
+ vpblendmb xmm0 {k1}, xmm0, xmm3
+ vpand xmm0, xmm0, xmm2
vxorps xmm2, xmm2, xmm2
vpcmpeqb xmm0, xmm0, xmm2
vpcmpeqd xmm2, xmm2, xmm2
@@ -366,7 +362,7 @@ G_M48875_IG16: ; bbWeight=4, gcrefRegs=0000 {}, byrefRegs=C0C8 {rbx rsi r
vpand xmm0, xmm1, xmm0
vptest xmm0, xmm0
je SHORT G_M48875_IG19
- ;; size=200 bbWeight=4 PerfScore 168.00
+ ;; size=186 bbWeight=4 PerfScore 164.00
G_M48875_IG17: ; bbWeight=2, gcrefRegs=0000 {}, byrefRegs=C0C8 {rbx rsi rdi r14 r15}, byref
vpmovmskb ebp, xmm0
;; size=4 bbWeight=2 PerfScore 4.00
@@ -433,7 +429,7 @@ RWD128 dq 0F0F0F0F0F0F0F0Fh, 0F0F0F0F0F0F0F0Fh, 0F0F0F0F0F0F0F0Fh, 0F0F0F0F0F0F
RWD160 dq 1010101010101010h, 1010101010101010h, 1010101010101010h, 1010101010101010h
-; Total bytes of code 1118, prolog size 86, PerfScore 1254.58, instruction count 240, allocated bytes for code 1118 (MethodHash=36e94114) for method System.Buffers.ProbabilisticMap:IndexOfAnyVectorized(byref,byref,int,System.ReadOnlySpan`1[ushort]):int (FullOpts)
+; Total bytes of code 1090, prolog size 86, PerfScore 1246.58, instruction count 236, allocated bytes for code 1090 (MethodHash=36e94114) for method System.Buffers.ProbabilisticMap:IndexOfAnyVectorized(byref,byref,int,System.ReadOnlySpan`1[ushort]):int (FullOpts)
; ============================================================
Unwind Info:
-17 (-1.75%) : 169934.dasm - System.Numerics.Tests.GenericVectorTests:TestConditionalSelect[uint]():this (FullOpts)
@@ -15,7 +15,7 @@
;* V04 loc3 [V04 ] ( 0, 0 ) simd32 -> zero-ref <System.Numerics.Vector`1[uint]>
; V05 loc4 [V05,T16] ( 2, 2 ) simd32 -> mm8 <System.Numerics.Vector`1[uint]>
;* V06 loc5 [V06 ] ( 0, 0 ) simd32 -> zero-ref <System.Numerics.Vector`1[uint]>
-; V07 loc6 [V07,T17] ( 2, 2 ) simd32 -> mm8 <System.Numerics.Vector`1[uint]>
+; V07 loc6 [V07,T17] ( 2, 2 ) simd32 -> mm6 <System.Numerics.Vector`1[uint]>
;* V08 loc7 [V08 ] ( 0, 0 ) struct ( 8) zero-ref ld-addr-op <System.Nullable`1[int]>
; V09 OutArgs [V09 ] ( 1, 1 ) struct (32) [rsp+0x00] do-not-enreg[XS] addr-exposed "OutgoingArgSpace"
;* V10 tmp1 [V10 ] ( 0, 0 ) ref -> zero-ref class-hnd exact "NewObj constructor temp" <System.Numerics.Tests.GenericVectorTests+<>c__DisplayClass670_0`1[uint]>
@@ -26,7 +26,7 @@
;* V15 tmp6 [V15,T08] ( 0, 0 ) int -> zero-ref "Inline stloc first use temp"
; V16 tmp7 [V16,T12] ( 9, 18 ) simd32 -> mm8 ld-addr-op "Inlining Arg" <System.Numerics.Vector`1[uint]>
;* V17 tmp8 [V17,T09] ( 0, 0 ) int -> zero-ref "Inline stloc first use temp"
-; V18 tmp9 [V18,T13] ( 9, 18 ) simd32 -> mm8 ld-addr-op "Inlining Arg" <System.Numerics.Vector`1[uint]>
+; V18 tmp9 [V18,T13] ( 9, 18 ) simd32 -> mm6 ld-addr-op "Inlining Arg" <System.Numerics.Vector`1[uint]>
;* V19 tmp10 [V19,T10] ( 0, 0 ) ubyte -> zero-ref single-def "field V08.hasValue (fldOffset=0x0)" P-INDEP
;* V20 tmp11 [V20,T11] ( 0, 0 ) int -> zero-ref single-def "field V08.value (fldOffset=0x4)" P-INDEP
; V21 tmp12 [V21,T02] ( 4, 8 ) struct ( 8) [rsp+0x20] do-not-enreg[SF] "by-value struct argument" <System.Nullable`1[int]>
@@ -104,8 +104,7 @@ G_M21446_IG02: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
jl G_M21446_IG06
vmovups ymm7, ymmword ptr [rcx+0x10]
vpcmpud k1, ymm6, ymm7, 6
- vpmovm2d ymm8, k1
- vpternlogd ymm8, ymm6, ymm7, -54
+ vpblendmd ymm8 {k1}, ymm7, ymm6
mov rcx, 0xD1FFAB1E ; System.Action`2[int,uint]
; gcrRegs -[rcx]
vextractf128 xmm9, ymm6, 1
@@ -151,7 +150,7 @@ G_M21446_IG02: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
mov rcx, gword ptr [rsi+0x08]
; gcrRegs +[rcx]
vextractf128 xmm12, ymm8, 1
- ;; size=312 bbWeight=1 PerfScore 88.00
+ ;; size=305 bbWeight=1 PerfScore 86.83
G_M21446_IG03: ; bbWeight=1, extend
call [rsi+0x18]<unknown method>
; gcrRegs -[rcx]
@@ -208,10 +207,9 @@ G_M21446_IG03: ; bbWeight=1, extend
vinsertf128 ymm6, ymm6, xmm9, 1
vinsertf128 ymm7, ymm7, xmm10, 1
vpcmpud k1, ymm6, ymm7, 2
- vpmovm2d ymm8, k1
- vpternlogd ymm8, ymm6, ymm7, -54
+ vpblendmd ymm6 {k1}, ymm7, ymm6
mov rcx, 0xD1FFAB1E ; System.Action`2[int,uint]
- vextractf128 xmm6, ymm8, 1
+ vextractf128 xmm7, ymm6, 1
call CORINFO_HELP_NEWSFAST
; gcrRegs +[rax]
; gcr arg pop 0
@@ -226,79 +224,79 @@ G_M21446_IG03: ; bbWeight=1, extend
; byrRegs -[rcx]
mov r8, 0xD1FFAB1E ; code for <unknown method>
mov qword ptr [rsi+0x18], r8
- vinsertf128 ymm8, ymm8, xmm6, 1
- vmovd r8d, xmm8
+ vinsertf128 ymm6, ymm6, xmm7, 1
+ vmovd r8d, xmm6
xor edx, edx
mov rcx, gword ptr [rsi+0x08]
; gcrRegs +[rcx]
- vextractf128 xmm7, ymm8, 1
+ vextractf128 xmm8, ymm6, 1
call [rsi+0x18]<unknown method>
; gcrRegs -[rcx]
; gcr arg pop 0
- vinsertf128 ymm8, ymm8, xmm7, 1
- vmovaps ymm0, ymm8
+ vinsertf128 ymm6, ymm6, xmm8, 1
+ vmovaps ymm0, ymm6
vpextrd r8d, xmm0, 1
mov edx, 1
mov rcx, gword ptr [rsi+0x08]
; gcrRegs +[rcx]
- vextractf128 xmm7, ymm8, 1
+ vextractf128 xmm8, ymm6, 1
call [rsi+0x18]<unknown method>
; gcrRegs -[rcx]
; gcr arg pop 0
- vinsertf128 ymm8, ymm8, xmm7, 1
- vmovaps ymm0, ymm8
+ vinsertf128 ymm6, ymm6, xmm8, 1
+ vmovaps ymm0, ymm6
vpextrd r8d, xmm0, 2
mov edx, 2
mov rcx, gword ptr [rsi+0x08]
; gcrRegs +[rcx]
- vextractf128 xmm7, ymm8, 1
+ vextractf128 xmm8, ymm6, 1
call [rsi+0x18]<unknown method>
; gcrRegs -[rcx]
; gcr arg pop 0
- ;; size=353 bbWeight=1 PerfScore 120.75
+ vinsertf128 ymm6, ymm6, xmm8, 1
+ ;; size=350 bbWeight=1 PerfScore 121.58
G_M21446_IG04: ; bbWeight=1, extend
- vinsertf128 ymm8, ymm8, xmm7, 1
- vmovaps ymm0, ymm8
+ vmovaps ymm0, ymm6
vpextrd r8d, xmm0, 3
mov edx, 3
mov rcx, gword ptr [rsi+0x08]
; gcrRegs +[rcx]
- vextractf128 xmm7, ymm8, 1
+ vextractf128 xmm8, ymm6, 1
call [rsi+0x18]<unknown method>
; gcrRegs -[rcx]
; gcr arg pop 0
- vinsertf128 ymm8, ymm8, xmm7, 1
- vextracti128 xmm0, ymm8, 1
+ vinsertf128 ymm6, ymm6, xmm8, 1
+ vextracti128 xmm0, ymm6, 1
vmovd r8d, xmm0
mov edx, 4
mov rcx, gword ptr [rsi+0x08]
; gcrRegs +[rcx]
- vextractf128 xmm7, ymm8, 1
+ vextractf128 xmm8, ymm6, 1
call [rsi+0x18]<unknown method>
; gcrRegs -[rcx]
; gcr arg pop 0
- vinsertf128 ymm8, ymm8, xmm7, 1
- vextracti128 xmm0, ymm8, 1
+ vinsertf128 ymm6, ymm6, xmm8, 1
+ vextracti128 xmm0, ymm6, 1
vpextrd r8d, xmm0, 1
mov edx, 5
mov rcx, gword ptr [rsi+0x08]
; gcrRegs +[rcx]
- vextractf128 xmm7, ymm8, 1
+ vextractf128 xmm8, ymm6, 1
call [rsi+0x18]<unknown method>
; gcrRegs -[rcx]
; gcr arg pop 0
- vinsertf128 ymm8, ymm8, xmm7, 1
- vextracti128 xmm0, ymm8, 1
+ vinsertf128 ymm6, ymm6, xmm8, 1
+ vextracti128 xmm0, ymm6, 1
vpextrd r8d, xmm0, 2
mov edx, 6
mov rcx, gword ptr [rsi+0x08]
; gcrRegs +[rcx]
- vextractf128 xmm7, ymm8, 1
+ vextractf128 xmm8, ymm6, 1
call [rsi+0x18]<unknown method>
; gcrRegs -[rcx]
; gcr arg pop 0
- vinsertf128 ymm8, ymm8, xmm7, 1
- vextracti128 xmm0, ymm8, 1
+ vinsertf128 ymm6, ymm6, xmm8, 1
+ vextracti128 xmm0, ymm6, 1
vpextrd r8d, xmm0, 3
mov edx, 7
mov rcx, gword ptr [rsi+0x08]
@@ -307,7 +305,7 @@ G_M21446_IG04: ; bbWeight=1, extend
; gcrRegs -[rcx rsi]
; gcr arg pop 0
nop
- ;; size=173 bbWeight=1 PerfScore 66.75
+ ;; size=166 bbWeight=1 PerfScore 64.75
G_M21446_IG05: ; bbWeight=1, epilog, nogc, extend
vmovaps xmm6, xmmword ptr [rsp+0x90]
vmovaps xmm7, xmmword ptr [rsp+0x80]
@@ -328,7 +326,7 @@ G_M21446_IG06: ; bbWeight=0, gcVars=0000000000000000 {}, gcrefRegs=0000 {
int3
;; size=7 bbWeight=0 PerfScore 0.00
-; Total bytes of code 973, prolog size 67, PerfScore 325.25, instruction count 194, allocated bytes for code 973 (MethodHash=8544ac39) for method System.Numerics.Tests.GenericVectorTests:TestConditionalSelect[uint]():this (FullOpts)
+; Total bytes of code 956, prolog size 67, PerfScore 322.92, instruction count 192, allocated bytes for code 956 (MethodHash=8544ac39) for method System.Numerics.Tests.GenericVectorTests:TestConditionalSelect[uint]():this (FullOpts)
; ============================================================
Unwind Info:
smoke_tests.nativeaot.windows.x64.checked.mch
-28 (-2.52%) : 19903.dasm - System.Buffers.ProbabilisticMap:IndexOfAnyVectorized(byref,byref,int,System.ReadOnlySpan`1[ushort]):int (FullOpts)
@@ -185,12 +185,11 @@ G_M48875_IG06: ; bbWeight=4, gcrefRegs=0000 {}, byrefRegs=C0C8 {rbx rsi r
vpshufb ymm1, ymm3, ymm1
vpand ymm2, ymm2, ymmword ptr [reloc @RWD96]
vpcmpub k1, ymm2, ymmword ptr [reloc @RWD128], 6
- vpmovm2b ymm3, k1
- vpsubb ymm4, ymm2, ymmword ptr [reloc @RWD160]
- vpshufb ymm4, ymm9, ymm4
+ vpsubb ymm3, ymm2, ymmword ptr [reloc @RWD160]
+ vpshufb ymm3, ymm9, ymm3
vpshufb ymm2, ymm8, ymm2
- vpternlogd ymm3, ymm4, ymm2, -54
- vpand ymm1, ymm3, ymm1
+ vpblendmb ymm2 {k1}, ymm2, ymm3
+ vpand ymm1, ymm2, ymm1
vxorps ymm2, ymm2, ymm2
vpcmpeqb ymm1, ymm1, ymm2
vpcmpeqd ymm2, ymm2, ymm2
@@ -201,12 +200,11 @@ G_M48875_IG06: ; bbWeight=4, gcrefRegs=0000 {}, byrefRegs=C0C8 {rbx rsi r
vpshufb ymm2, ymm3, ymm2
vpand ymm0, ymm0, ymmword ptr [reloc @RWD96]
vpcmpub k1, ymm0, ymmword ptr [reloc @RWD128], 6
- vpmovm2b ymm3, k1
- vpsubb ymm4, ymm0, ymmword ptr [reloc @RWD160]
- vpshufb ymm4, ymm9, ymm4
+ vpsubb ymm3, ymm0, ymmword ptr [reloc @RWD160]
+ vpshufb ymm3, ymm9, ymm3
vpshufb ymm0, ymm8, ymm0
- vpternlogd ymm3, ymm4, ymm0, -54
- vpand ymm0, ymm3, ymm2
+ vpblendmb ymm0 {k1}, ymm0, ymm3
+ vpand ymm0, ymm0, ymm2
vxorps ymm2, ymm2, ymm2
vpcmpeqb ymm0, ymm0, ymm2
vpcmpeqd ymm2, ymm2, ymm2
@@ -214,7 +212,7 @@ G_M48875_IG06: ; bbWeight=4, gcrefRegs=0000 {}, byrefRegs=C0C8 {rbx rsi r
vpand ymm0, ymm1, ymm0
vptest ymm0, ymm0
je SHORT G_M48875_IG09
- ;; size=248 bbWeight=4 PerfScore 317.33
+ ;; size=234 bbWeight=4 PerfScore 313.33
G_M48875_IG07: ; bbWeight=2, gcrefRegs=0000 {}, byrefRegs=C0C8 {rbx rsi rdi r14 r15}, byref
vpermq ymm0, ymm0, -40
vpmovmskb ebp, ymm0
@@ -337,12 +335,11 @@ G_M48875_IG16: ; bbWeight=4, gcrefRegs=0000 {}, byrefRegs=C0C8 {rbx rsi r
vpshufb xmm1, xmm10, xmm1
vpand xmm2, xmm2, xmm11
vpcmpub k1, xmm2, xmm12, 6
- vpmovm2b xmm3, k1
- vpsubb xmm4, xmm2, xmm13
- vpshufb xmm4, xmm7, xmm4
+ vpsubb xmm3, xmm2, xmm13
+ vpshufb xmm3, xmm7, xmm3
vpshufb xmm2, xmm6, xmm2
- vpternlogd xmm3, xmm4, xmm2, -54
- vpand xmm1, xmm3, xmm1
+ vpblendmb xmm2 {k1}, xmm2, xmm3
+ vpand xmm1, xmm2, xmm1
vxorps xmm2, xmm2, xmm2
vpcmpeqb xmm1, xmm1, xmm2
vpcmpeqd xmm2, xmm2, xmm2
@@ -352,12 +349,11 @@ G_M48875_IG16: ; bbWeight=4, gcrefRegs=0000 {}, byrefRegs=C0C8 {rbx rsi r
vpshufb xmm2, xmm10, xmm2
vpand xmm0, xmm0, xmm11
vpcmpub k1, xmm0, xmm12, 6
- vpmovm2b xmm3, k1
- vpsubb xmm4, xmm0, xmm13
- vpshufb xmm4, xmm7, xmm4
+ vpsubb xmm3, xmm0, xmm13
+ vpshufb xmm3, xmm7, xmm3
vpshufb xmm0, xmm6, xmm0
- vpternlogd xmm3, xmm4, xmm0, -54
- vpand xmm0, xmm3, xmm2
+ vpblendmb xmm0 {k1}, xmm0, xmm3
+ vpand xmm0, xmm0, xmm2
vxorps xmm2, xmm2, xmm2
vpcmpeqb xmm0, xmm0, xmm2
vpcmpeqd xmm2, xmm2, xmm2
@@ -365,7 +361,7 @@ G_M48875_IG16: ; bbWeight=4, gcrefRegs=0000 {}, byrefRegs=C0C8 {rbx rsi r
vpand xmm0, xmm1, xmm0
vptest xmm0, xmm0
je SHORT G_M48875_IG19
- ;; size=200 bbWeight=4 PerfScore 168.00
+ ;; size=186 bbWeight=4 PerfScore 164.00
G_M48875_IG17: ; bbWeight=2, gcrefRegs=0000 {}, byrefRegs=C0C8 {rbx rsi rdi r14 r15}, byref
vpmovmskb ebp, xmm0
;; size=4 bbWeight=2 PerfScore 4.00
@@ -432,7 +428,7 @@ RWD128 dq 0F0F0F0F0F0F0F0Fh, 0F0F0F0F0F0F0F0Fh, 0F0F0F0F0F0F0F0Fh, 0F0F0F0F0F0F
RWD160 dq 1010101010101010h, 1010101010101010h, 1010101010101010h, 1010101010101010h
-; Total bytes of code 1109, prolog size 86, PerfScore 1189.83, instruction count 240, allocated bytes for code 1109 (MethodHash=36e94114) for method System.Buffers.ProbabilisticMap:IndexOfAnyVectorized(byref,byref,int,System.ReadOnlySpan`1[ushort]):int (FullOpts)
+; Total bytes of code 1081, prolog size 86, PerfScore 1181.83, instruction count 236, allocated bytes for code 1081 (MethodHash=36e94114) for method System.Buffers.ProbabilisticMap:IndexOfAnyVectorized(byref,byref,int,System.ReadOnlySpan`1[ushort]):int (FullOpts)
; ============================================================
Unwind Info:
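The hunks above replace a `vpmovm2d`/`vpmovm2b` plus `vpternlogd ..., -54` pair with a single masked `vpblendmd`/`vpblendmb`. The following is a minimal sketch, not JIT code, of why the two forms agree. It assumes the usual `vpternlogd` truth-table convention, in which the destination bit is the most significant index bit and the second source the least significant. The helper names `ternlog` and `select` are hypothetical and exist only for this illustration.

```python
import random


def ternlog(a, b, c, imm8, width=32):
    """Model one lane of a ternary-logic op: each result bit is imm8[(a<<2)|(b<<1)|c]."""
    result = 0
    for i in range(width):
        idx = (((a >> i) & 1) << 2) | (((b >> i) & 1) << 1) | ((c >> i) & 1)
        result |= ((imm8 >> idx) & 1) << i
    return result


def select(mask, x, y, width=32):
    """Bitwise select: take bits of x where mask is 1, bits of y where mask is 0."""
    full = (1 << width) - 1
    return (mask & x) | (~mask & full & y)


# The disassembly prints the immediate as a signed byte; -54 is 0xCA.
IMM = -54 & 0xFF

# Spot-check the equivalence on random 32-bit lanes.
rng = random.Random(0)
for _ in range(1000):
    m, x, y = (rng.getrandbits(32) for _ in range(3))
    assert ternlog(m, x, y, IMM) == select(m, x, y)
```

Because `vpmovm2d` expands each mask bit into an all-ones or all-zeros lane, the bitwise select on that expanded mask picks whole lanes, which is the same per-lane choice a blend under `k1` makes directly.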
Details
Improvements/regressions per collection
| Collection | Contexts with diffs | Improvements | Regressions | Same size | Improvements (bytes) | Regressions (bytes) |
|---|---|---|---|---|---|---|
| benchmarks.run.windows.x64.checked.mch | 1 | 1 | 0 | 0 | -28 | +0 |
| benchmarks.run_pgo.windows.x64.checked.mch | 0 | 0 | 0 | 0 | -0 | +0 |
| benchmarks.run_tiered.windows.x64.checked.mch | 0 | 0 | 0 | 0 | -0 | +0 |
| coreclr_tests.run.windows.x64.checked.mch | 16 | 16 | 0 | 0 | -528 | +0 |
| libraries.crossgen2.windows.x64.checked.mch | 0 | 0 | 0 | 0 | -0 | +0 |
| libraries.pmi.windows.x64.checked.mch | 24 | 24 | 0 | 0 | -499 | +0 |
| libraries_tests.run.windows.x64.Release.mch | 29 | 29 | 0 | 0 | -1,014 | +0 |
| librariestestsnotieredcompilation.run.windows.x64.Release.mch | 7 | 7 | 0 | 0 | -759 | +0 |
| realworld.run.windows.x64.checked.mch | 0 | 0 | 0 | 0 | -0 | +0 |
| smoke_tests.nativeaot.windows.x64.checked.mch | 1 | 1 | 0 | 0 | -28 | +0 |
| | 78 | 78 | 0 | 0 | -2,856 | +0 |
Context information
| Collection | Diffed contexts | MinOpts | FullOpts | Missed, base | Missed, diff |
|---|---|---|---|---|---|
| benchmarks.run.windows.x64.checked.mch | 27,854 | 4 | 27,850 | 232 (0.83%) | 232 (0.83%) |
| benchmarks.run_pgo.windows.x64.checked.mch | 101,589 | 49,789 | 51,800 | 129 (0.13%) | 129 (0.13%) |
| benchmarks.run_tiered.windows.x64.checked.mch | 54,309 | 36,842 | 17,467 | 76 (0.14%) | 76 (0.14%) |
| coreclr_tests.run.windows.x64.checked.mch | 573,547 | 340,982 | 232,565 | 442 (0.08%) | 442 (0.08%) |
| libraries.crossgen2.windows.x64.checked.mch | 243,425 | 15 | 243,410 | 0 (0.00%) | 0 (0.00%) |
| libraries.pmi.windows.x64.checked.mch | 306,302 | 6 | 306,296 | 2,196 (0.71%) | 2,196 (0.71%) |
| libraries_tests.run.windows.x64.Release.mch | 672,162 | 479,203 | 192,959 | 1,125 (0.17%) | 1,125 (0.17%) |
| librariestestsnotieredcompilation.run.windows.x64.Release.mch | 318,324 | 21,885 | 296,439 | 2,187 (0.68%) | 2,187 (0.68%) |
| realworld.run.windows.x64.checked.mch | 36,492 | 3 | 36,489 | 398 (1.08%) | 398 (1.08%) |
| smoke_tests.nativeaot.windows.x64.checked.mch | 32,409 | 11 | 32,398 | 3 (0.01%) | 3 (0.01%) |
| | 2,366,413 | 928,740 | 1,437,673 | 6,788 (0.29%) | 6,788 (0.29%) |
jit-analyze output
benchmarks.run.windows.x64.checked.mch
To reproduce these diffs on Windows x64:
superpmi.py asmdiffs -target_os windows -target_arch x64 -arch x64
Summary of Code Size diffs:
(Lower is better)
Total bytes of base: 8538947 (overridden on cmd)
Total bytes of diff: 8538919 (overridden on cmd)
Total bytes of delta: -28 (-0.00 % of base)
diff is an improvement.
relative diff is an improvement.
Detail diffs
Top file improvements (bytes):
-28 : 24358.dasm (-2.50 % of base)
1 total files with Code Size differences (1 improved, 0 regressed), 0 unchanged.
Top method improvements (bytes):
-28 (-2.50 % of base) : 24358.dasm - System.Buffers.ProbabilisticMap:IndexOfAnyVectorized(byref,byref,int,System.ReadOnlySpan`1[ushort]):int (FullOpts)
Top method improvements (percentages):
-28 (-2.50 % of base) : 24358.dasm - System.Buffers.ProbabilisticMap:IndexOfAnyVectorized(byref,byref,int,System.ReadOnlySpan`1[ushort]):int (FullOpts)
1 total methods with Code Size differences (1 improved, 0 regressed).
coreclr_tests.run.windows.x64.checked.mch
To reproduce these diffs on Windows x64:
superpmi.py asmdiffs -target_os windows -target_arch x64 -arch x64
Summary of Code Size diffs:
(Lower is better)
Total bytes of base: 393235161 (overridden on cmd)
Total bytes of diff: 393234633 (overridden on cmd)
Total bytes of delta: -528 (-0.00 % of base)
diff is an improvement.
relative diff is an improvement.
Detail diffs
Top file improvements (bytes):
-56 : 174838.dasm (-2.83 % of base)
-56 : 174841.dasm (-2.86 % of base)
-56 : 174837.dasm (-2.83 % of base)
-56 : 174842.dasm (-2.82 % of base)
-32 : 429091.dasm (-0.77 % of base)
-32 : 429094.dasm (-0.78 % of base)
-32 : 429095.dasm (-0.77 % of base)
-32 : 429090.dasm (-0.77 % of base)
-28 : 174840.dasm (-1.42 % of base)
-28 : 174835.dasm (-1.46 % of base)
-28 : 174836.dasm (-1.43 % of base)
-28 : 174839.dasm (-1.42 % of base)
-16 : 429089.dasm (-0.39 % of base)
-16 : 429092.dasm (-0.39 % of base)
-16 : 429093.dasm (-0.39 % of base)
-16 : 429088.dasm (-0.39 % of base)
16 total files with Code Size differences (16 improved, 0 regressed), 0 unchanged.
Top method improvements (bytes):
-56 (-2.83 % of base) : 174838.dasm - VectorTest+VectorRelopTest`1[ubyte]:VectorRelOp(ubyte,ubyte):int (FullOpts)
-56 (-2.86 % of base) : 174841.dasm - VectorTest+VectorRelopTest`1[uint]:VectorRelOp(uint,uint):int (FullOpts)
-56 (-2.82 % of base) : 174842.dasm - VectorTest+VectorRelopTest`1[ulong]:VectorRelOp(ulong,ulong):int (FullOpts)
-56 (-2.83 % of base) : 174837.dasm - VectorTest+VectorRelopTest`1[ushort]:VectorRelOp(ushort,ushort):int (FullOpts)
-32 (-0.77 % of base) : 429091.dasm - VectorTest+VectorRelopTest`1[ubyte]:VectorRelOp(ubyte,ubyte):int (Instrumented Tier0)
-32 (-0.78 % of base) : 429094.dasm - VectorTest+VectorRelopTest`1[uint]:VectorRelOp(uint,uint):int (Instrumented Tier0)
-32 (-0.77 % of base) : 429095.dasm - VectorTest+VectorRelopTest`1[ulong]:VectorRelOp(ulong,ulong):int (Instrumented Tier0)
-32 (-0.77 % of base) : 429090.dasm - VectorTest+VectorRelopTest`1[ushort]:VectorRelOp(ushort,ushort):int (Instrumented Tier0)
-28 (-1.42 % of base) : 174840.dasm - VectorTest+VectorRelopTest`1[byte]:VectorRelOp(byte,byte):int (FullOpts)
-28 (-1.46 % of base) : 174835.dasm - VectorTest+VectorRelopTest`1[int]:VectorRelOp(int,int):int (FullOpts)
-28 (-1.43 % of base) : 174836.dasm - VectorTest+VectorRelopTest`1[long]:VectorRelOp(long,long):int (FullOpts)
-28 (-1.42 % of base) : 174839.dasm - VectorTest+VectorRelopTest`1[short]:VectorRelOp(short,short):int (FullOpts)
-16 (-0.39 % of base) : 429093.dasm - VectorTest+VectorRelopTest`1[byte]:VectorRelOp(byte,byte):int (Instrumented Tier0)
-16 (-0.39 % of base) : 429088.dasm - VectorTest+VectorRelopTest`1[int]:VectorRelOp(int,int):int (Instrumented Tier0)
-16 (-0.39 % of base) : 429089.dasm - VectorTest+VectorRelopTest`1[long]:VectorRelOp(long,long):int (Instrumented Tier0)
-16 (-0.39 % of base) : 429092.dasm - VectorTest+VectorRelopTest`1[short]:VectorRelOp(short,short):int (Instrumented Tier0)
Top method improvements (percentages):
-56 (-2.86 % of base) : 174841.dasm - VectorTest+VectorRelopTest`1[uint]:VectorRelOp(uint,uint):int (FullOpts)
-56 (-2.83 % of base) : 174838.dasm - VectorTest+VectorRelopTest`1[ubyte]:VectorRelOp(ubyte,ubyte):int (FullOpts)
-56 (-2.83 % of base) : 174837.dasm - VectorTest+VectorRelopTest`1[ushort]:VectorRelOp(ushort,ushort):int (FullOpts)
-56 (-2.82 % of base) : 174842.dasm - VectorTest+VectorRelopTest`1[ulong]:VectorRelOp(ulong,ulong):int (FullOpts)
-28 (-1.46 % of base) : 174835.dasm - VectorTest+VectorRelopTest`1[int]:VectorRelOp(int,int):int (FullOpts)
-28 (-1.43 % of base) : 174836.dasm - VectorTest+VectorRelopTest`1[long]:VectorRelOp(long,long):int (FullOpts)
-28 (-1.42 % of base) : 174840.dasm - VectorTest+VectorRelopTest`1[byte]:VectorRelOp(byte,byte):int (FullOpts)
-28 (-1.42 % of base) : 174839.dasm - VectorTest+VectorRelopTest`1[short]:VectorRelOp(short,short):int (FullOpts)
-32 (-0.78 % of base) : 429094.dasm - VectorTest+VectorRelopTest`1[uint]:VectorRelOp(uint,uint):int (Instrumented Tier0)
-32 (-0.77 % of base) : 429095.dasm - VectorTest+VectorRelopTest`1[ulong]:VectorRelOp(ulong,ulong):int (Instrumented Tier0)
-32 (-0.77 % of base) : 429091.dasm - VectorTest+VectorRelopTest`1[ubyte]:VectorRelOp(ubyte,ubyte):int (Instrumented Tier0)
-32 (-0.77 % of base) : 429090.dasm - VectorTest+VectorRelopTest`1[ushort]:VectorRelOp(ushort,ushort):int (Instrumented Tier0)
-16 (-0.39 % of base) : 429088.dasm - VectorTest+VectorRelopTest`1[int]:VectorRelOp(int,int):int (Instrumented Tier0)
-16 (-0.39 % of base) : 429089.dasm - VectorTest+VectorRelopTest`1[long]:VectorRelOp(long,long):int (Instrumented Tier0)
-16 (-0.39 % of base) : 429093.dasm - VectorTest+VectorRelopTest`1[byte]:VectorRelOp(byte,byte):int (Instrumented Tier0)
-16 (-0.39 % of base) : 429092.dasm - VectorTest+VectorRelopTest`1[short]:VectorRelOp(short,short):int (Instrumented Tier0)
16 total methods with Code Size differences (16 improved, 0 regressed).
libraries.pmi.windows.x64.checked.mch
To reproduce these diffs on Windows x64:
superpmi.py asmdiffs -target_os windows -target_arch x64 -arch x64
Summary of Code Size diffs:
(Lower is better)
Total bytes of base: 60138217 (overridden on cmd)
Total bytes of diff: 60137718 (overridden on cmd)
Total bytes of delta: -499 (-0.00 % of base)
diff is an improvement.
relative diff is an improvement.
Detail diffs
Top file improvements (bytes):
-56 : 294041.dasm (-20.66 % of base)
-56 : 293984.dasm (-20.66 % of base)
-32 : 27695.dasm (-10.85 % of base)
-28 : 293986.dasm (-16.28 % of base)
-28 : 27702.dasm (-2.50 % of base)
-28 : 294043.dasm (-16.28 % of base)
-26 : 27697.dasm (-8.78 % of base)
-21 : 293983.dasm (-20.39 % of base)
-21 : 294005.dasm (-20.39 % of base)
-21 : 294040.dasm (-20.39 % of base)
-21 : 294062.dasm (-20.39 % of base)
-14 : 293982.dasm (-16.28 % of base)
-14 : 294042.dasm (-13.73 % of base)
-14 : 293985.dasm (-13.73 % of base)
-14 : 294003.dasm (-16.87 % of base)
-14 : 294038.dasm (-16.87 % of base)
-14 : 294060.dasm (-16.87 % of base)
-14 : 294061.dasm (-16.28 % of base)
-14 : 293981.dasm (-16.87 % of base)
-14 : 294004.dasm (-16.28 % of base)
24 total files with Code Size differences (24 improved, 0 regressed), 0 unchanged.
Top method improvements (bytes):
-56 (-20.66 % of base) : 293984.dasm - System.Numerics.Tensors.TensorPrimitives+MaxMagnitudeOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector128`1[ubyte]):ubyte (FullOpts)
-56 (-20.66 % of base) : 294041.dasm - System.Numerics.Tensors.TensorPrimitives+MinMagnitudeOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector128`1[ubyte]):ubyte (FullOpts)
-32 (-10.85 % of base) : 27695.dasm - System.Buffers.ProbabilisticMap:ContainsMask32CharsAvx2(System.Runtime.Intrinsics.Vector256`1[ubyte],System.Runtime.Intrinsics.Vector256`1[ubyte],byref):System.Runtime.Intrinsics.Vector256`1[ubyte] (FullOpts)
-28 (-2.50 % of base) : 27702.dasm - System.Buffers.ProbabilisticMap:IndexOfAnyVectorized(byref,byref,int,System.ReadOnlySpan`1[ushort]):int (FullOpts)
-28 (-16.28 % of base) : 293986.dasm - System.Numerics.Tensors.TensorPrimitives+MaxMagnitudeOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector512`1[ubyte]):ubyte (FullOpts)
-28 (-16.28 % of base) : 294043.dasm - System.Numerics.Tensors.TensorPrimitives+MinMagnitudeOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector512`1[ubyte]):ubyte (FullOpts)
-26 (-8.78 % of base) : 27697.dasm - System.Buffers.ProbabilisticMap:ContainsMask16Chars(System.Runtime.Intrinsics.Vector128`1[ubyte],System.Runtime.Intrinsics.Vector128`1[ubyte],byref):System.Runtime.Intrinsics.Vector128`1[ubyte] (FullOpts)
-21 (-20.39 % of base) : 293983.dasm - System.Numerics.Tensors.TensorPrimitives+MaxMagnitudeOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector512`1[ubyte],System.Runtime.Intrinsics.Vector512`1[ubyte]):System.Runtime.Intrinsics.Vector512`1[ubyte] (FullOpts)
-21 (-20.39 % of base) : 294005.dasm - System.Numerics.Tensors.TensorPrimitives+MaxMagnitudePropagateNaNOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector512`1[ubyte],System.Runtime.Intrinsics.Vector512`1[ubyte]):System.Runtime.Intrinsics.Vector512`1[ubyte] (FullOpts)
-21 (-20.39 % of base) : 294040.dasm - System.Numerics.Tensors.TensorPrimitives+MinMagnitudeOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector512`1[ubyte],System.Runtime.Intrinsics.Vector512`1[ubyte]):System.Runtime.Intrinsics.Vector512`1[ubyte] (FullOpts)
-21 (-20.39 % of base) : 294062.dasm - System.Numerics.Tensors.TensorPrimitives+MinMagnitudePropagateNaNOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector512`1[ubyte],System.Runtime.Intrinsics.Vector512`1[ubyte]):System.Runtime.Intrinsics.Vector512`1[ubyte] (FullOpts)
-14 (-16.87 % of base) : 293981.dasm - System.Numerics.Tensors.TensorPrimitives+MaxMagnitudeOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector128`1[ubyte],System.Runtime.Intrinsics.Vector128`1[ubyte]):System.Runtime.Intrinsics.Vector128`1[ubyte] (FullOpts)
-14 (-13.73 % of base) : 293985.dasm - System.Numerics.Tensors.TensorPrimitives+MaxMagnitudeOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector256`1[ubyte]):ubyte (FullOpts)
-14 (-16.28 % of base) : 293982.dasm - System.Numerics.Tensors.TensorPrimitives+MaxMagnitudeOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector256`1[ubyte],System.Runtime.Intrinsics.Vector256`1[ubyte]):System.Runtime.Intrinsics.Vector256`1[ubyte] (FullOpts)
-14 (-16.87 % of base) : 294003.dasm - System.Numerics.Tensors.TensorPrimitives+MaxMagnitudePropagateNaNOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector128`1[ubyte],System.Runtime.Intrinsics.Vector128`1[ubyte]):System.Runtime.Intrinsics.Vector128`1[ubyte] (FullOpts)
-14 (-16.28 % of base) : 294004.dasm - System.Numerics.Tensors.TensorPrimitives+MaxMagnitudePropagateNaNOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector256`1[ubyte],System.Runtime.Intrinsics.Vector256`1[ubyte]):System.Runtime.Intrinsics.Vector256`1[ubyte] (FullOpts)
-14 (-16.87 % of base) : 294038.dasm - System.Numerics.Tensors.TensorPrimitives+MinMagnitudeOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector128`1[ubyte],System.Runtime.Intrinsics.Vector128`1[ubyte]):System.Runtime.Intrinsics.Vector128`1[ubyte] (FullOpts)
-14 (-13.73 % of base) : 294042.dasm - System.Numerics.Tensors.TensorPrimitives+MinMagnitudeOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector256`1[ubyte]):ubyte (FullOpts)
-14 (-16.28 % of base) : 294039.dasm - System.Numerics.Tensors.TensorPrimitives+MinMagnitudeOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector256`1[ubyte],System.Runtime.Intrinsics.Vector256`1[ubyte]):System.Runtime.Intrinsics.Vector256`1[ubyte] (FullOpts)
-14 (-16.87 % of base) : 294060.dasm - System.Numerics.Tensors.TensorPrimitives+MinMagnitudePropagateNaNOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector128`1[ubyte],System.Runtime.Intrinsics.Vector128`1[ubyte]):System.Runtime.Intrinsics.Vector128`1[ubyte] (FullOpts)
Top method improvements (percentages):
-56 (-20.66 % of base) : 293984.dasm - System.Numerics.Tensors.TensorPrimitives+MaxMagnitudeOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector128`1[ubyte]):ubyte (FullOpts)
-56 (-20.66 % of base) : 294041.dasm - System.Numerics.Tensors.TensorPrimitives+MinMagnitudeOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector128`1[ubyte]):ubyte (FullOpts)
-21 (-20.39 % of base) : 293983.dasm - System.Numerics.Tensors.TensorPrimitives+MaxMagnitudeOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector512`1[ubyte],System.Runtime.Intrinsics.Vector512`1[ubyte]):System.Runtime.Intrinsics.Vector512`1[ubyte] (FullOpts)
-21 (-20.39 % of base) : 294005.dasm - System.Numerics.Tensors.TensorPrimitives+MaxMagnitudePropagateNaNOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector512`1[ubyte],System.Runtime.Intrinsics.Vector512`1[ubyte]):System.Runtime.Intrinsics.Vector512`1[ubyte] (FullOpts)
-21 (-20.39 % of base) : 294040.dasm - System.Numerics.Tensors.TensorPrimitives+MinMagnitudeOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector512`1[ubyte],System.Runtime.Intrinsics.Vector512`1[ubyte]):System.Runtime.Intrinsics.Vector512`1[ubyte] (FullOpts)
-21 (-20.39 % of base) : 294062.dasm - System.Numerics.Tensors.TensorPrimitives+MinMagnitudePropagateNaNOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector512`1[ubyte],System.Runtime.Intrinsics.Vector512`1[ubyte]):System.Runtime.Intrinsics.Vector512`1[ubyte] (FullOpts)
-14 (-16.87 % of base) : 293981.dasm - System.Numerics.Tensors.TensorPrimitives+MaxMagnitudeOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector128`1[ubyte],System.Runtime.Intrinsics.Vector128`1[ubyte]):System.Runtime.Intrinsics.Vector128`1[ubyte] (FullOpts)
-14 (-16.87 % of base) : 294003.dasm - System.Numerics.Tensors.TensorPrimitives+MaxMagnitudePropagateNaNOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector128`1[ubyte],System.Runtime.Intrinsics.Vector128`1[ubyte]):System.Runtime.Intrinsics.Vector128`1[ubyte] (FullOpts)
-14 (-16.87 % of base) : 294038.dasm - System.Numerics.Tensors.TensorPrimitives+MinMagnitudeOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector128`1[ubyte],System.Runtime.Intrinsics.Vector128`1[ubyte]):System.Runtime.Intrinsics.Vector128`1[ubyte] (FullOpts)
-14 (-16.87 % of base) : 294060.dasm - System.Numerics.Tensors.TensorPrimitives+MinMagnitudePropagateNaNOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector128`1[ubyte],System.Runtime.Intrinsics.Vector128`1[ubyte]):System.Runtime.Intrinsics.Vector128`1[ubyte] (FullOpts)
-14 (-16.28 % of base) : 293982.dasm - System.Numerics.Tensors.TensorPrimitives+MaxMagnitudeOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector256`1[ubyte],System.Runtime.Intrinsics.Vector256`1[ubyte]):System.Runtime.Intrinsics.Vector256`1[ubyte] (FullOpts)
-28 (-16.28 % of base) : 293986.dasm - System.Numerics.Tensors.TensorPrimitives+MaxMagnitudeOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector512`1[ubyte]):ubyte (FullOpts)
-14 (-16.28 % of base) : 294004.dasm - System.Numerics.Tensors.TensorPrimitives+MaxMagnitudePropagateNaNOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector256`1[ubyte],System.Runtime.Intrinsics.Vector256`1[ubyte]):System.Runtime.Intrinsics.Vector256`1[ubyte] (FullOpts)
-14 (-16.28 % of base) : 294039.dasm - System.Numerics.Tensors.TensorPrimitives+MinMagnitudeOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector256`1[ubyte],System.Runtime.Intrinsics.Vector256`1[ubyte]):System.Runtime.Intrinsics.Vector256`1[ubyte] (FullOpts)
-28 (-16.28 % of base) : 294043.dasm - System.Numerics.Tensors.TensorPrimitives+MinMagnitudeOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector512`1[ubyte]):ubyte (FullOpts)
-14 (-16.28 % of base) : 294061.dasm - System.Numerics.Tensors.TensorPrimitives+MinMagnitudePropagateNaNOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector256`1[ubyte],System.Runtime.Intrinsics.Vector256`1[ubyte]):System.Runtime.Intrinsics.Vector256`1[ubyte] (FullOpts)
-14 (-13.73 % of base) : 293985.dasm - System.Numerics.Tensors.TensorPrimitives+MaxMagnitudeOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector256`1[ubyte]):ubyte (FullOpts)
-14 (-13.73 % of base) : 294042.dasm - System.Numerics.Tensors.TensorPrimitives+MinMagnitudeOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector256`1[ubyte]):ubyte (FullOpts)
-32 (-10.85 % of base) : 27695.dasm - System.Buffers.ProbabilisticMap:ContainsMask32CharsAvx2(System.Runtime.Intrinsics.Vector256`1[ubyte],System.Runtime.Intrinsics.Vector256`1[ubyte],byref):System.Runtime.Intrinsics.Vector256`1[ubyte] (FullOpts)
-26 (-8.78 % of base) : 27697.dasm - System.Buffers.ProbabilisticMap:ContainsMask16Chars(System.Runtime.Intrinsics.Vector128`1[ubyte],System.Runtime.Intrinsics.Vector128`1[ubyte],byref):System.Runtime.Intrinsics.Vector128`1[ubyte] (FullOpts)
24 total methods with Code Size differences (24 improved, 0 regressed).
libraries_tests.run.windows.x64.Release.mch
To reproduce these diffs on Windows x64:
superpmi.py asmdiffs -target_os windows -target_arch x64 -arch x64
Summary of Code Size diffs:
(Lower is better)
Total bytes of base: 278258792 (overridden on cmd)
Total bytes of diff: 278257778 (overridden on cmd)
Total bytes of delta: -1014 (-0.00 % of base)
diff is an improvement.
relative diff is an improvement.
Detail diffs
Top file improvements (bytes):
-368 : 393123.dasm (-15.51 % of base)
-84 : 393118.dasm (-10.81 % of base)
-84 : 386805.dasm (-10.98 % of base)
-84 : 393473.dasm (-11.02 % of base)
-84 : 386605.dasm (-10.98 % of base)
-32 : 342443.dasm (-10.85 % of base)
-28 : 342446.dasm (-2.40 % of base)
-26 : 342451.dasm (-8.78 % of base)
-14 : 385680.dasm (-16.09 % of base)
-14 : 386233.dasm (-16.67 % of base)
-14 : 393193.dasm (-16.09 % of base)
-14 : 385681.dasm (-16.09 % of base)
-14 : 385965.dasm (-16.09 % of base)
-14 : 393122.dasm (-16.09 % of base)
-14 : 385964.dasm (-16.67 % of base)
-14 : 393121.dasm (-16.09 % of base)
-14 : 393190.dasm (-16.67 % of base)
-14 : 393191.dasm (-16.67 % of base)
-14 : 393192.dasm (-16.09 % of base)
-7 : 342450.dasm (-5.79 % of base)
29 total files with Code Size differences (29 improved, 0 regressed), 0 unchanged.
Top method improvements (bytes):
-368 (-15.51 % of base) : 393123.dasm - System.Numerics.Tensors.TensorPrimitives:<InvokeSpanSpanIntoSpan>g__Vectorized256|220_2[ulong,System.Numerics.Tensors.TensorPrimitives+MinMagnitudePropagateNaNOperator`1[ulong]](byref,byref,byref,ulong) (Tier1)
-84 (-11.02 % of base) : 393473.dasm - System.Numerics.Tensors.TensorPrimitives:InvokeSpanSpanIntoSpan[ulong,System.Numerics.Tensors.TensorPrimitives+MaxMagnitudePropagateNaNOperator`1[ulong]](System.ReadOnlySpan`1[ulong],System.ReadOnlySpan`1[ulong],System.Span`1[ulong]) (Tier1)
-84 (-10.98 % of base) : 386605.dasm - System.Numerics.Tensors.TensorPrimitives:InvokeSpanSpanIntoSpan[ulong,System.Numerics.Tensors.TensorPrimitives+MaxMagnitudePropagateNaNOperator`1[ulong]](System.ReadOnlySpan`1[ulong],System.ReadOnlySpan`1[ulong],System.Span`1[ulong]) (Tier1)
-84 (-10.81 % of base) : 393118.dasm - System.Numerics.Tensors.TensorPrimitives:InvokeSpanSpanIntoSpan[ulong,System.Numerics.Tensors.TensorPrimitives+MinMagnitudePropagateNaNOperator`1[ulong]](System.ReadOnlySpan`1[ulong],System.ReadOnlySpan`1[ulong],System.Span`1[ulong]) (Tier1)
-84 (-10.98 % of base) : 386805.dasm - System.Numerics.Tensors.TensorPrimitives:InvokeSpanSpanIntoSpan[ulong,System.Numerics.Tensors.TensorPrimitives+MinMagnitudePropagateNaNOperator`1[ulong]](System.ReadOnlySpan`1[ulong],System.ReadOnlySpan`1[ulong],System.Span`1[ulong]) (Tier1)
-32 (-10.85 % of base) : 342443.dasm - System.Buffers.ProbabilisticMap:ContainsMask32CharsAvx2(System.Runtime.Intrinsics.Vector256`1[ubyte],System.Runtime.Intrinsics.Vector256`1[ubyte],byref):System.Runtime.Intrinsics.Vector256`1[ubyte] (Tier1)
-28 (-2.40 % of base) : 342446.dasm - System.Buffers.ProbabilisticMap:IndexOfAnyVectorized(byref,byref,int,System.ReadOnlySpan`1[ushort]):int (Tier1)
-26 (-8.78 % of base) : 342451.dasm - System.Buffers.ProbabilisticMap:ContainsMask16Chars(System.Runtime.Intrinsics.Vector128`1[ubyte],System.Runtime.Intrinsics.Vector128`1[ubyte],byref):System.Runtime.Intrinsics.Vector128`1[ubyte] (Tier1)
-14 (-16.67 % of base) : 386233.dasm - System.Numerics.Tensors.TensorPrimitives+MaxMagnitudeOperator`1[ulong]:Invoke(System.Runtime.Intrinsics.Vector128`1[ulong],System.Runtime.Intrinsics.Vector128`1[ulong]):System.Runtime.Intrinsics.Vector128`1[ulong] (Tier1)
-14 (-16.67 % of base) : 393191.dasm - System.Numerics.Tensors.TensorPrimitives+MaxMagnitudeOperator`1[ulong]:Invoke(System.Runtime.Intrinsics.Vector128`1[ulong],System.Runtime.Intrinsics.Vector128`1[ulong]):System.Runtime.Intrinsics.Vector128`1[ulong] (Tier1)
-14 (-16.09 % of base) : 393193.dasm - System.Numerics.Tensors.TensorPrimitives+MaxMagnitudeOperator`1[ulong]:Invoke(System.Runtime.Intrinsics.Vector256`1[ulong],System.Runtime.Intrinsics.Vector256`1[ulong]):System.Runtime.Intrinsics.Vector256`1[ulong] (Tier1)
-14 (-16.09 % of base) : 385681.dasm - System.Numerics.Tensors.TensorPrimitives+MaxMagnitudeOperator`1[ulong]:Invoke(System.Runtime.Intrinsics.Vector256`1[ulong],System.Runtime.Intrinsics.Vector256`1[ulong]):System.Runtime.Intrinsics.Vector256`1[ulong] (Tier1)
-14 (-16.67 % of base) : 393190.dasm - System.Numerics.Tensors.TensorPrimitives+MaxMagnitudePropagateNaNOperator`1[ulong]:Invoke(System.Runtime.Intrinsics.Vector128`1[ulong],System.Runtime.Intrinsics.Vector128`1[ulong]):System.Runtime.Intrinsics.Vector128`1[ulong] (Tier1)
-14 (-16.09 % of base) : 385680.dasm - System.Numerics.Tensors.TensorPrimitives+MaxMagnitudePropagateNaNOperator`1[ulong]:Invoke(System.Runtime.Intrinsics.Vector256`1[ulong],System.Runtime.Intrinsics.Vector256`1[ulong]):System.Runtime.Intrinsics.Vector256`1[ulong] (Tier1)
-14 (-16.09 % of base) : 393192.dasm - System.Numerics.Tensors.TensorPrimitives+MaxMagnitudePropagateNaNOperator`1[ulong]:Invoke(System.Runtime.Intrinsics.Vector256`1[ulong],System.Runtime.Intrinsics.Vector256`1[ulong]):System.Runtime.Intrinsics.Vector256`1[ulong] (Tier1)
-14 (-16.67 % of base) : 385964.dasm - System.Numerics.Tensors.TensorPrimitives+MinMagnitudeOperator`1[ulong]:Invoke(System.Runtime.Intrinsics.Vector128`1[ulong],System.Runtime.Intrinsics.Vector128`1[ulong]):System.Runtime.Intrinsics.Vector128`1[ulong] (Tier1)
-14 (-16.09 % of base) : 385965.dasm - System.Numerics.Tensors.TensorPrimitives+MinMagnitudeOperator`1[ulong]:Invoke(System.Runtime.Intrinsics.Vector256`1[ulong],System.Runtime.Intrinsics.Vector256`1[ulong]):System.Runtime.Intrinsics.Vector256`1[ulong] (Tier1)
-14 (-16.09 % of base) : 393122.dasm - System.Numerics.Tensors.TensorPrimitives+MinMagnitudeOperator`1[ulong]:Invoke(System.Runtime.Intrinsics.Vector256`1[ulong],System.Runtime.Intrinsics.Vector256`1[ulong]):System.Runtime.Intrinsics.Vector256`1[ulong] (Tier1)
-14 (-16.09 % of base) : 393121.dasm - System.Numerics.Tensors.TensorPrimitives+MinMagnitudePropagateNaNOperator`1[ulong]:Invoke(System.Runtime.Intrinsics.Vector256`1[ulong],System.Runtime.Intrinsics.Vector256`1[ulong]):System.Runtime.Intrinsics.Vector256`1[ulong] (Tier1)
-7 (-5.79 % of base) : 342450.dasm - System.Buffers.ProbabilisticMap:IsCharBitSet(System.Runtime.Intrinsics.Vector128`1[ubyte],System.Runtime.Intrinsics.Vector128`1[ubyte],System.Runtime.Intrinsics.Vector128`1[ubyte]):System.Runtime.Intrinsics.Vector128`1[ubyte] (Tier1)
Top method improvements (percentages):
-14 (-16.67 % of base) : 386233.dasm - System.Numerics.Tensors.TensorPrimitives+MaxMagnitudeOperator`1[ulong]:Invoke(System.Runtime.Intrinsics.Vector128`1[ulong],System.Runtime.Intrinsics.Vector128`1[ulong]):System.Runtime.Intrinsics.Vector128`1[ulong] (Tier1)
-14 (-16.67 % of base) : 393191.dasm - System.Numerics.Tensors.TensorPrimitives+MaxMagnitudeOperator`1[ulong]:Invoke(System.Runtime.Intrinsics.Vector128`1[ulong],System.Runtime.Intrinsics.Vector128`1[ulong]):System.Runtime.Intrinsics.Vector128`1[ulong] (Tier1)
-14 (-16.67 % of base) : 393190.dasm - System.Numerics.Tensors.TensorPrimitives+MaxMagnitudePropagateNaNOperator`1[ulong]:Invoke(System.Runtime.Intrinsics.Vector128`1[ulong],System.Runtime.Intrinsics.Vector128`1[ulong]):System.Runtime.Intrinsics.Vector128`1[ulong] (Tier1)
-14 (-16.67 % of base) : 385964.dasm - System.Numerics.Tensors.TensorPrimitives+MinMagnitudeOperator`1[ulong]:Invoke(System.Runtime.Intrinsics.Vector128`1[ulong],System.Runtime.Intrinsics.Vector128`1[ulong]):System.Runtime.Intrinsics.Vector128`1[ulong] (Tier1)
-14 (-16.09 % of base) : 393193.dasm - System.Numerics.Tensors.TensorPrimitives+MaxMagnitudeOperator`1[ulong]:Invoke(System.Runtime.Intrinsics.Vector256`1[ulong],System.Runtime.Intrinsics.Vector256`1[ulong]):System.Runtime.Intrinsics.Vector256`1[ulong] (Tier1)
-14 (-16.09 % of base) : 385681.dasm - System.Numerics.Tensors.TensorPrimitives+MaxMagnitudeOperator`1[ulong]:Invoke(System.Runtime.Intrinsics.Vector256`1[ulong],System.Runtime.Intrinsics.Vector256`1[ulong]):System.Runtime.Intrinsics.Vector256`1[ulong] (Tier1)
-14 (-16.09 % of base) : 385680.dasm - System.Numerics.Tensors.TensorPrimitives+MaxMagnitudePropagateNaNOperator`1[ulong]:Invoke(System.Runtime.Intrinsics.Vector256`1[ulong],System.Runtime.Intrinsics.Vector256`1[ulong]):System.Runtime.Intrinsics.Vector256`1[ulong] (Tier1)
-14 (-16.09 % of base) : 393192.dasm - System.Numerics.Tensors.TensorPrimitives+MaxMagnitudePropagateNaNOperator`1[ulong]:Invoke(System.Runtime.Intrinsics.Vector256`1[ulong],System.Runtime.Intrinsics.Vector256`1[ulong]):System.Runtime.Intrinsics.Vector256`1[ulong] (Tier1)
-14 (-16.09 % of base) : 385965.dasm - System.Numerics.Tensors.TensorPrimitives+MinMagnitudeOperator`1[ulong]:Invoke(System.Runtime.Intrinsics.Vector256`1[ulong],System.Runtime.Intrinsics.Vector256`1[ulong]):System.Runtime.Intrinsics.Vector256`1[ulong] (Tier1)
-14 (-16.09 % of base) : 393122.dasm - System.Numerics.Tensors.TensorPrimitives+MinMagnitudeOperator`1[ulong]:Invoke(System.Runtime.Intrinsics.Vector256`1[ulong],System.Runtime.Intrinsics.Vector256`1[ulong]):System.Runtime.Intrinsics.Vector256`1[ulong] (Tier1)
-14 (-16.09 % of base) : 393121.dasm - System.Numerics.Tensors.TensorPrimitives+MinMagnitudePropagateNaNOperator`1[ulong]:Invoke(System.Runtime.Intrinsics.Vector256`1[ulong],System.Runtime.Intrinsics.Vector256`1[ulong]):System.Runtime.Intrinsics.Vector256`1[ulong] (Tier1)
-368 (-15.51 % of base) : 393123.dasm - System.Numerics.Tensors.TensorPrimitives:<InvokeSpanSpanIntoSpan>g__Vectorized256|220_2[ulong,System.Numerics.Tensors.TensorPrimitives+MinMagnitudePropagateNaNOperator`1[ulong]](byref,byref,byref,ulong) (Tier1)
-84 (-11.02 % of base) : 393473.dasm - System.Numerics.Tensors.TensorPrimitives:InvokeSpanSpanIntoSpan[ulong,System.Numerics.Tensors.TensorPrimitives+MaxMagnitudePropagateNaNOperator`1[ulong]](System.ReadOnlySpan`1[ulong],System.ReadOnlySpan`1[ulong],System.Span`1[ulong]) (Tier1)
-84 (-10.98 % of base) : 386605.dasm - System.Numerics.Tensors.TensorPrimitives:InvokeSpanSpanIntoSpan[ulong,System.Numerics.Tensors.TensorPrimitives+MaxMagnitudePropagateNaNOperator`1[ulong]](System.ReadOnlySpan`1[ulong],System.ReadOnlySpan`1[ulong],System.Span`1[ulong]) (Tier1)
-84 (-10.98 % of base) : 386805.dasm - System.Numerics.Tensors.TensorPrimitives:InvokeSpanSpanIntoSpan[ulong,System.Numerics.Tensors.TensorPrimitives+MinMagnitudePropagateNaNOperator`1[ulong]](System.ReadOnlySpan`1[ulong],System.ReadOnlySpan`1[ulong],System.Span`1[ulong]) (Tier1)
-32 (-10.85 % of base) : 342443.dasm - System.Buffers.ProbabilisticMap:ContainsMask32CharsAvx2(System.Runtime.Intrinsics.Vector256`1[ubyte],System.Runtime.Intrinsics.Vector256`1[ubyte],byref):System.Runtime.Intrinsics.Vector256`1[ubyte] (Tier1)
-84 (-10.81 % of base) : 393118.dasm - System.Numerics.Tensors.TensorPrimitives:InvokeSpanSpanIntoSpan[ulong,System.Numerics.Tensors.TensorPrimitives+MinMagnitudePropagateNaNOperator`1[ulong]](System.ReadOnlySpan`1[ulong],System.ReadOnlySpan`1[ulong],System.Span`1[ulong]) (Tier1)
-26 (-8.78 % of base) : 342451.dasm - System.Buffers.ProbabilisticMap:ContainsMask16Chars(System.Runtime.Intrinsics.Vector128`1[ubyte],System.Runtime.Intrinsics.Vector128`1[ubyte],byref):System.Runtime.Intrinsics.Vector128`1[ubyte] (Tier1)
-7 (-5.79 % of base) : 342450.dasm - System.Buffers.ProbabilisticMap:IsCharBitSet(System.Runtime.Intrinsics.Vector128`1[ubyte],System.Runtime.Intrinsics.Vector128`1[ubyte],System.Runtime.Intrinsics.Vector128`1[ubyte]):System.Runtime.Intrinsics.Vector128`1[ubyte] (Tier1)
-7 (-5.65 % of base) : 342442.dasm - System.Buffers.ProbabilisticMap:IsCharBitSetAvx2(System.Runtime.Intrinsics.Vector256`1[ubyte],System.Runtime.Intrinsics.Vector256`1[ubyte],System.Runtime.Intrinsics.Vector256`1[ubyte]):System.Runtime.Intrinsics.Vector256`1[ubyte] (Tier1)
29 total methods with Code Size differences (29 improved, 0 regressed).
librariestestsnotieredcompilation.run.windows.x64.Release.mch
To reproduce these diffs on Windows x64:
superpmi.py asmdiffs -target_os windows -target_arch x64 -arch x64
Summary of Code Size diffs:
(Lower is better)
Total bytes of base: 135861173 (overridden on cmd)
Total bytes of diff: 135860414 (overridden on cmd)
Total bytes of delta: -759 (-0.00 % of base)
diff is an improvement.
relative diff is an improvement.
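The summary figures above can be rechecked by hand. The following is a minimal sketch and not part of `superpmi.py`; the function name `code_size_delta` is hypothetical, and the two-decimal formatting is an assumption inferred from how the report prints its percentages.

```python
def code_size_delta(base_bytes: int, diff_bytes: int) -> tuple[int, float]:
    """Return (delta in bytes, delta as a percentage of base)."""
    delta = diff_bytes - base_bytes
    return delta, 100.0 * delta / base_bytes


# Totals copied from the summary for this collection.
delta, pct = code_size_delta(135861173, 135860414)
print(delta)          # -759
print(f"{pct:.2f}")   # -0.00
```

A delta this small relative to the whole collection rounds to -0.00 %, so the per-method percentages in the detail lists are the more informative numbers.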
Detail diffs
Top file improvements (bytes):
-182 : 168894.dasm (-17.50 % of base)
-182 : 168616.dasm (-17.50 % of base)
-154 : 169032.dasm (-17.09 % of base)
-98 : 168947.dasm (-15.15 % of base)
-98 : 168825.dasm (-15.15 % of base)
-28 : 150728.dasm (-2.50 % of base)
-17 : 169934.dasm (-1.75 % of base)
7 total files with Code Size differences (7 improved, 0 regressed), 0 unchanged.
Top method improvements (bytes):
-182 (-17.50 % of base) : 168894.dasm - System.Numerics.Tensors.TensorPrimitives:MinMaxCore[ubyte,System.Numerics.Tensors.TensorPrimitives+MaxMagnitudeOperator`1[ubyte]](System.ReadOnlySpan`1[ubyte]):ubyte (FullOpts)
-182 (-17.50 % of base) : 168616.dasm - System.Numerics.Tensors.TensorPrimitives:MinMaxCore[ubyte,System.Numerics.Tensors.TensorPrimitives+MinMagnitudeOperator`1[ubyte]](System.ReadOnlySpan`1[ubyte]):ubyte (FullOpts)
-154 (-17.09 % of base) : 169032.dasm - System.Numerics.Tensors.TensorPrimitives:MinMaxCore[ushort,System.Numerics.Tensors.TensorPrimitives+MinMagnitudeOperator`1[ushort]](System.ReadOnlySpan`1[ushort]):ushort (FullOpts)
-98 (-15.15 % of base) : 168947.dasm - System.Numerics.Tensors.TensorPrimitives:MinMaxCore[ulong,System.Numerics.Tensors.TensorPrimitives+MinMagnitudeOperator`1[ulong]](System.ReadOnlySpan`1[ulong]):ulong (FullOpts)
-98 (-15.15 % of base) : 168825.dasm - System.Numerics.Tensors.TensorPrimitives:MinMaxCore[ulong,System.Numerics.Tensors.TensorPrimitives+MinMagnitudeOperator`1[ulong]](System.ReadOnlySpan`1[ulong]):ulong (FullOpts)
-28 (-2.50 % of base) : 150728.dasm - System.Buffers.ProbabilisticMap:IndexOfAnyVectorized(byref,byref,int,System.ReadOnlySpan`1[ushort]):int (FullOpts)
-17 (-1.75 % of base) : 169934.dasm - System.Numerics.Tests.GenericVectorTests:TestConditionalSelect[uint]():this (FullOpts)
Top method improvements (percentages):
-182 (-17.50 % of base) : 168894.dasm - System.Numerics.Tensors.TensorPrimitives:MinMaxCore[ubyte,System.Numerics.Tensors.TensorPrimitives+MaxMagnitudeOperator`1[ubyte]](System.ReadOnlySpan`1[ubyte]):ubyte (FullOpts)
-182 (-17.50 % of base) : 168616.dasm - System.Numerics.Tensors.TensorPrimitives:MinMaxCore[ubyte,System.Numerics.Tensors.TensorPrimitives+MinMagnitudeOperator`1[ubyte]](System.ReadOnlySpan`1[ubyte]):ubyte (FullOpts)
-154 (-17.09 % of base) : 169032.dasm - System.Numerics.Tensors.TensorPrimitives:MinMaxCore[ushort,System.Numerics.Tensors.TensorPrimitives+MinMagnitudeOperator`1[ushort]](System.ReadOnlySpan`1[ushort]):ushort (FullOpts)
-98 (-15.15 % of base) : 168947.dasm - System.Numerics.Tensors.TensorPrimitives:MinMaxCore[ulong,System.Numerics.Tensors.TensorPrimitives+MinMagnitudeOperator`1[ulong]](System.ReadOnlySpan`1[ulong]):ulong (FullOpts)
-98 (-15.15 % of base) : 168825.dasm - System.Numerics.Tensors.TensorPrimitives:MinMaxCore[ulong,System.Numerics.Tensors.TensorPrimitives+MinMagnitudeOperator`1[ulong]](System.ReadOnlySpan`1[ulong]):ulong (FullOpts)
-28 (-2.50 % of base) : 150728.dasm - System.Buffers.ProbabilisticMap:IndexOfAnyVectorized(byref,byref,int,System.ReadOnlySpan`1[ushort]):int (FullOpts)
-17 (-1.75 % of base) : 169934.dasm - System.Numerics.Tests.GenericVectorTests:TestConditionalSelect[uint]():this (FullOpts)
7 total methods with Code Size differences (7 improved, 0 regressed).
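For anyone post-processing these lists, the per-method lines can be split into fields with a regular expression. This is a sketch under stated assumptions: the line layout is inferred only from the entries shown in this report, `parse_method_diff` is a hypothetical helper, and the recovered base size is an estimate because the percentage is rounded to two decimals.

```python
import re

# Assumed layout: "<delta> (<pct> % of base) : <file>.dasm - <method> (<tier>)"
LINE_RE = re.compile(
    r"^(?P<delta>[+-]?\d+) \((?P<pct>[+-]?\d+\.\d+) % of base\) : "
    r"(?P<file>\d+\.dasm) - (?P<method>.+) \((?P<tier>[^()]+)\)$"
)


def parse_method_diff(line: str) -> dict:
    """Split one 'Top method improvements' line into its fields."""
    m = LINE_RE.match(line.strip())
    if m is None:
        raise ValueError(f"unrecognized line: {line!r}")
    delta = int(m["delta"])
    pct = float(m["pct"])
    return {
        "delta": delta,
        "pct": pct,
        "file": m["file"],
        "method": m["method"],
        "tier": m["tier"],
        # Estimated base size in bytes; None when the rounded percentage is 0.
        "base_bytes": round(100 * delta / pct) if pct else None,
    }


sample = (
    "-182 (-17.50 % of base) : 168894.dasm - "
    "System.Numerics.Tensors.TensorPrimitives:MinMaxCore[ubyte,"
    "System.Numerics.Tensors.TensorPrimitives+MaxMagnitudeOperator`1[ubyte]]"
    "(System.ReadOnlySpan`1[ubyte]):ubyte (FullOpts)"
)
row = parse_method_diff(sample)
print(row["delta"], row["tier"], row["base_bytes"])  # -182 FullOpts 1040
```

The greedy `method` group relies on the tier being the last parenthesized token on the line, which holds for every entry shown here (`Tier1`, `FullOpts`).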
smoke_tests.nativeaot.windows.x64.checked.mch
To reproduce these diffs on Windows x64:
superpmi.py asmdiffs -target_os windows -target_arch x64 -arch x64
Summary of Code Size diffs:
(Lower is better)
Total bytes of base: 5087315 (overridden on cmd)
Total bytes of diff: 5087287 (overridden on cmd)
Total bytes of delta: -28 (-0.00 % of base)
diff is an improvement.
relative diff is an improvement.
Detail diffs
Top file improvements (bytes):
-28 : 19903.dasm (-2.52 % of base)
1 total files with Code Size differences (1 improved, 0 regressed), 0 unchanged.
Top method improvements (bytes):
-28 (-2.52 % of base) : 19903.dasm - System.Buffers.ProbabilisticMap:IndexOfAnyVectorized(byref,byref,int,System.ReadOnlySpan`1[ushort]):int (FullOpts)
Top method improvements (percentages):
-28 (-2.52 % of base) : 19903.dasm - System.Buffers.ProbabilisticMap:IndexOfAnyVectorized(byref,byref,int,System.ReadOnlySpan`1[ushort]):int (FullOpts)
1 total methods with Code Size differences (1 improved, 0 regressed).