Assembly Diffs
linux arm
Diffs are based on 2,237,081 contexts (825,130 MinOpts, 1,411,951 FullOpts).
MISSED contexts: 70,976 (3.08%)
No diffs found.
Details
Context information
| Collection | Diffed contexts | MinOpts | FullOpts | Missed, base | Missed, diff |
|---|--:|--:|--:|--:|--:|
| benchmarks.run.linux.arm.checked.mch | 46,198 | 5,279 | 40,919 | 1,202 (2.54%) | 1,202 (2.54%) |
| benchmarks.run_pgo.linux.arm.checked.mch | 159,584 | 58,093 | 101,491 | 3,243 (1.99%) | 3,243 (1.99%) |
| benchmarks.run_tiered.linux.arm.checked.mch | 71,534 | 38,077 | 33,457 | 945 (1.30%) | 945 (1.30%) |
| coreclr_tests.run.linux.arm.checked.mch | 471,885 | 259,093 | 212,792 | 7,156 (1.49%) | 7,156 (1.49%) |
| libraries.crossgen2.linux.arm.checked.mch | 195,441 | 14 | 195,427 | 0 (0.00%) | 0 (0.00%) |
| libraries.pmi.linux.arm.checked.mch | 271,663 | 6 | 271,657 | 7,766 (2.78%) | 7,766 (2.78%) |
| libraries_tests.run.linux.arm.Release.mch | 709,797 | 442,850 | 266,947 | 15,984 (2.20%) | 15,984 (2.20%) |
| librariestestsnotieredcompilation.run.linux.arm.Release.mch | 274,582 | 21,565 | 253,017 | 33,273 (10.81%) | 33,273 (10.81%) |
| realworld.run.linux.arm.checked.mch | 36,397 | 153 | 36,244 | 1,407 (3.72%) | 1,407 (3.72%) |
| | 2,237,081 | 825,130 | 1,411,951 | 70,976 (3.08%) | 70,976 (3.08%) |
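
The report does not say how the "Missed" percentages are computed. The figures in the table are consistent with `missed / (diffed + missed)` rather than `missed / diffed`. That reading is an inference from the numbers, not a documented formula. A minimal Python sketch that checks it against three rows (the helper name `missed_pct` is hypothetical):

```python
def missed_pct(diffed: int, missed: int) -> float:
    """Missed contexts as a percentage, ASSUMING the denominator is diffed + missed."""
    return round(100.0 * missed / (diffed + missed), 2)


# (diffed contexts, missed contexts, percentage printed in the report)
ROWS = {
    "benchmarks.run.linux.arm.checked.mch": (46_198, 1_202, 2.54),
    "librariestestsnotieredcompilation.run.linux.arm.Release.mch": (274_582, 33_273, 10.81),
    "total": (2_237_081, 70_976, 3.08),
}

for name, (diffed, missed, reported) in ROWS.items():
    computed = missed_pct(diffed, missed)
    # A small tolerance avoids depending on exact float equality.
    status = "ok" if abs(computed - reported) < 0.005 else "MISMATCH"
    print(f"{name}: computed {computed:.2f}%, reported {reported:.2f}% ({status})")
```
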
windows x86
Diffs are based on 2,299,121 contexts (840,463 MinOpts, 1,458,658 FullOpts).
MISSED contexts: 7 (0.00%)
Overall (-3,338 bytes)
| Collection | Base size (bytes) | Diff size (bytes) |
|---|--:|--:|
| benchmarks.run.windows.x86.checked.mch | 7,123,696 | -117 |
| benchmarks.run_pgo.windows.x86.checked.mch | 45,854,626 | -126 |
| benchmarks.run_tiered.windows.x86.checked.mch | 9,444,502 | -117 |
| coreclr_tests.run.windows.x86.checked.mch | 309,424,823 | -672 |
| libraries.pmi.windows.x86.checked.mch | 49,148,609 | -565 |
| libraries_tests.run.windows.x86.Release.mch | 188,553,323 | -896 |
| librariestestsnotieredcompilation.run.windows.x86.Release.mch | 103,930,242 | -845 |
FullOpts (-3,338 bytes)
| Collection | Base size (bytes) | Diff size (bytes) |
|---|--:|--:|
| benchmarks.run.windows.x86.checked.mch | 7,123,417 | -117 |
| benchmarks.run_pgo.windows.x86.checked.mch | 39,241,495 | -126 |
| benchmarks.run_tiered.windows.x86.checked.mch | 5,176,811 | -117 |
| coreclr_tests.run.windows.x86.checked.mch | 107,730,518 | -672 |
| libraries.pmi.windows.x86.checked.mch | 49,053,295 | -565 |
| libraries_tests.run.windows.x86.Release.mch | 90,396,173 | -896 |
| librariestestsnotieredcompilation.run.windows.x86.Release.mch | 95,260,534 | -845 |
Example diffs
benchmarks.run.windows.x86.checked.mch
-117 (-11.17%) : 22326.dasm - System.Buffers.ProbabilisticMap:IndexOfAnyVectorized(byref,byref,int,System.ReadOnlySpan`1[ushort]):int (FullOpts)
@@ -9,23 +9,23 @@
; Final local variable assignments
;
; V00 arg0 [V00,T13] ( 4, 4 ) byref -> edi single-def
-; V01 arg1 [V01,T12] ( 6, 5 ) byref -> [ebp-0xB4] single-def
+; V01 arg1 [V01,T12] ( 6, 5 ) byref -> [ebp-0x74] single-def
; V02 arg2 [V02,T14] ( 3, 3 ) int -> [ebp+0x10] single-def
; V03 arg3 [V03,T15] ( 2, 2 ) struct ( 8) [ebp+0x08] do-not-enreg[S] single-def <System.ReadOnlySpan`1[ushort]>
-; V04 loc0 [V04,T09] ( 7, 14.50) byref -> [ebp-0xB8] spill-single-def
+; V04 loc0 [V04,T09] ( 7, 14.50) byref -> [ebp-0x78] spill-single-def
; V05 loc1 [V05,T00] ( 19, 93.50) byref -> ebx
; V06 loc2 [V06,T30] ( 5, 10 ) simd16 -> [ebp-0x1C] spill-single-def <System.Runtime.Intrinsics.Vector128`1[ubyte]>
; V07 loc3 [V07,T31] ( 5, 10 ) simd16 -> [ebp-0x2C] spill-single-def <System.Runtime.Intrinsics.Vector128`1[ubyte]>
-; V08 loc4 [V08,T10] ( 3, 8.50) byref -> [ebp-0xBC] single-def
+; V08 loc4 [V08,T10] ( 3, 8.50) byref -> [ebp-0x7C] single-def
; V09 loc5 [V09,T32] ( 3, 8.50) simd32 -> [ebp-0x4C] spill-single-def <System.Runtime.Intrinsics.Vector256`1[ubyte]>
; V10 loc6 [V10,T33] ( 3, 8.50) simd32 -> [ebp-0x6C] spill-single-def <System.Runtime.Intrinsics.Vector256`1[ubyte]>
-; V11 loc7 [V11,T11] ( 3, 8.50) byref -> [ebp-0xC0] spill-single-def
+; V11 loc7 [V11,T11] ( 3, 8.50) byref -> [ebp-0x80] spill-single-def
; V12 loc8 [V12,T20] ( 4, 14 ) simd32 -> mm4 <System.Runtime.Intrinsics.Vector256`1[ubyte]>
; V13 loc9 [V13,T01] ( 5, 66 ) int -> esi
; V14 loc10 [V14,T07] ( 3, 32.50) byref -> edi
; V15 loc11 [V15,T21] ( 4, 14 ) simd16 -> mm2 <System.Runtime.Intrinsics.Vector128`1[ubyte]>
; V16 loc12 [V16,T02] ( 5, 66 ) int -> esi
-; V17 loc13 [V17,T08] ( 3, 32.50) byref -> [ebp-0xC4] spill-single-def
+; V17 loc13 [V17,T08] ( 3, 32.50) byref -> [ebp-0x84] spill-single-def
;* V18 tmp0 [V18 ] ( 0, 0 ) int -> zero-ref
;* V19 tmp1 [V19 ] ( 0, 0 ) ubyte -> zero-ref "Inlining Arg"
;* V20 tmp2 [V20 ] ( 0, 0 ) ubyte -> zero-ref "Inlining Arg"
@@ -39,10 +39,10 @@
; V28 tmp10 [V28,T23] ( 3, 12 ) simd32 -> mm5 "Inline stloc first use temp" <System.Runtime.Intrinsics.Vector256`1[ushort]>
; V29 tmp11 [V29,T24] ( 3, 12 ) simd32 -> mm6 "Inline stloc first use temp" <System.Runtime.Intrinsics.Vector256`1[ubyte]>
; V30 tmp12 [V30,T25] ( 3, 12 ) simd32 -> mm4 "Inline stloc first use temp" <System.Runtime.Intrinsics.Vector256`1[ubyte]>
-; V31 tmp13 [V31,T34] ( 2, 8 ) simd32 -> [ebp-0x8C] spill-single-def "Inline stloc first use temp" <System.Runtime.Intrinsics.Vector256`1[ubyte]>
+; V31 tmp13 [V31,T34] ( 2, 8 ) simd32 -> mm5 "Inline stloc first use temp" <System.Runtime.Intrinsics.Vector256`1[ubyte]>
; V32 tmp14 [V32,T35] ( 2, 8 ) simd32 -> mm4 "Inline stloc first use temp" <System.Runtime.Intrinsics.Vector256`1[ubyte]>
;* V33 tmp15 [V33 ] ( 0, 0 ) simd32 -> zero-ref "Inline stloc first use temp" <System.Runtime.Intrinsics.Vector256`1[ubyte]>
-; V34 tmp16 [V34,T36] ( 2, 8 ) simd32 -> [ebp-0xAC] spill-single-def "Inline stloc first use temp" <System.Runtime.Intrinsics.Vector256`1[ubyte]>
+; V34 tmp16 [V34,T36] ( 2, 8 ) simd32 -> mm5 "Inline stloc first use temp" <System.Runtime.Intrinsics.Vector256`1[ubyte]>
; V35 tmp17 [V35,T16] ( 4, 16 ) simd32 -> mm6 "Inline stloc first use temp" <System.Runtime.Intrinsics.Vector256`1[ubyte]>
;* V36 tmp18 [V36 ] ( 0, 0 ) simd32 -> zero-ref "Inline stloc first use temp" <System.Runtime.Intrinsics.Vector256`1[ubyte]>
;* V37 tmp19 [V37 ] ( 0, 0 ) simd32 -> zero-ref "Inline stloc first use temp" <System.Runtime.Intrinsics.Vector256`1[ubyte]>
@@ -99,10 +99,10 @@
;* V88 tmp70 [V88 ] ( 0, 0 ) int -> zero-ref "field V80._length (fldOffset=0x4)" P-INDEP
;* V89 tmp71 [V89 ] ( 0, 0 ) byref -> zero-ref "field V82._reference (fldOffset=0x0)" P-INDEP
;* V90 tmp72 [V90 ] ( 0, 0 ) int -> zero-ref "field V82._length (fldOffset=0x4)" P-INDEP
-; V91 tmp73 [V91,T05] ( 3, 33 ) byref -> [ebp-0xC8] spill-single-def "V03.[000..004)"
-; V92 tmp74 [V92,T06] ( 3, 33 ) int -> [ebp-0xB0] spill-single-def "V03.[004..008)"
+; V91 tmp73 [V91,T05] ( 3, 33 ) byref -> [ebp-0x88] spill-single-def "V03.[000..004)"
+; V92 tmp74 [V92,T06] ( 3, 33 ) int -> [ebp-0x70] spill-single-def "V03.[004..008)"
;
-; Lcl frame size = 188
+; Lcl frame size = 124
G_M48875_IG01: ; bbWeight=1, gcrefRegs=00000000 {}, byrefRegs=00000000 {}, byref, nogc <-- Prolog IG
push ebp
@@ -110,25 +110,25 @@ G_M48875_IG01: ; bbWeight=1, gcrefRegs=00000000 {}, byrefRegs=00000000 {}
push edi
push esi
push ebx
- sub esp, 188
+ sub esp, 124
vzeroupper
mov edi, ecx
; byrRegs +[edi]
mov esi, edx
; byrRegs +[esi]
mov ebx, dword ptr [ebp+0x10]
- ;; size=22 bbWeight=1 PerfScore 7.00
+ ;; size=19 bbWeight=1 PerfScore 7.00
G_M48875_IG02: ; bbWeight=1, gcrefRegs=00000000 {}, byrefRegs=000000C0 {esi edi}, byref, isz
mov eax, bword ptr [ebp+0x08]
; byrRegs +[eax]
- mov bword ptr [ebp-0xC8], eax
+ mov bword ptr [ebp-0x88], eax
; GC ptr vars +{V91}
mov edx, dword ptr [ebp+0x0C]
- mov dword ptr [ebp-0xB0], edx
- mov eax, bword ptr [ebp-0xC8]
+ mov dword ptr [ebp-0x70], edx
+ mov eax, bword ptr [ebp-0x88]
cmp ebx, 16
jge SHORT G_M48875_IG04
- ;; size=29 bbWeight=1 PerfScore 6.25
+ ;; size=26 bbWeight=1 PerfScore 6.25
G_M48875_IG03: ; bbWeight=0.50, gcVars=0000000000000020 {V91}, gcrefRegs=00000000 {}, byrefRegs=000000C1 {eax esi edi}, gcvars, byref
mov ecx, 0xD1FFAB1E
; gcrRegs +[ecx]
@@ -137,16 +137,16 @@ G_M48875_IG03: ; bbWeight=0.50, gcVars=0000000000000020 {V91}, gcrefRegs=
call [<unknown method>]
; gcrRegs -[ecx edx]
; byrRegs -[eax]
- mov eax, bword ptr [ebp-0xC8]
+ mov eax, bword ptr [ebp-0x88]
; byrRegs +[eax]
;; size=22 bbWeight=0.50 PerfScore 2.25
G_M48875_IG04: ; bbWeight=1, gcrefRegs=00000000 {}, byrefRegs=000000C1 {eax esi edi}, byref
mov dword ptr [ebp+0x10], ebx
lea ecx, bword ptr [esi+2*ebx]
; byrRegs +[ecx]
- mov bword ptr [ebp-0xB8], ecx
+ mov bword ptr [ebp-0x78], ecx
; GC ptr vars +{V04}
- mov bword ptr [ebp-0xB4], esi
+ mov bword ptr [ebp-0x74], esi
; GC ptr vars +{V01}
mov ebx, esi
; byrRegs +[ebx]
@@ -156,7 +156,7 @@ G_M48875_IG04: ; bbWeight=1, gcrefRegs=00000000 {}, byrefRegs=000000C1 {e
vmovups xmmword ptr [ebp-0x2C], xmm1
cmp dword ptr [ebp+0x10], 32
jl G_M48875_IG17
- ;; size=49 bbWeight=1 PerfScore 16.75
+ ;; size=43 bbWeight=1 PerfScore 16.75
G_M48875_IG05: ; bbWeight=0.50, gcVars=0000000000001220 {V01 V04 V91}, gcrefRegs=00000000 {}, byrefRegs=0000000B {eax ecx ebx}, gcvars, byref
; byrRegs -[esi edi]
vmovaps ymm2, ymm0
@@ -167,9 +167,9 @@ G_M48875_IG05: ; bbWeight=0.50, gcVars=0000000000001220 {V01 V04 V91}, gc
vmovups ymmword ptr [ebp-0x6C], ymm3
lea edi, bword ptr [ecx-0x40]
; byrRegs +[edi]
- mov bword ptr [ebp-0xC0], edi
+ mov bword ptr [ebp-0x80], edi
; GC ptr vars +{V11}
- ;; size=39 bbWeight=0.50 PerfScore 4.00
+ ;; size=36 bbWeight=0.50 PerfScore 4.00
G_M48875_IG06: ; bbWeight=4, gcVars=0000000000001A20 {V01 V04 V11 V91}, gcrefRegs=00000000 {}, byrefRegs=00000009 {eax ebx}, gcvars, byref, isz
; byrRegs -[ecx edi]
vmovups ymm4, ymmword ptr [ebx]
@@ -184,41 +184,36 @@ G_M48875_IG06: ; bbWeight=4, gcVars=0000000000001A20 {V01 V04 V11 V91}, g
vpand ymm5, ymm5, ymmword ptr [@RWD32]
vmovups ymm7, ymmword ptr [@RWD64]
vpshufb ymm5, ymm7, ymm5
- vmovups ymmword ptr [ebp-0xAC], ymm5
vpand ymm6, ymm6, ymmword ptr [@RWD96]
vpcmpub k1, ymm6, ymmword ptr [@RWD128], 6
- vpmovm2b ymm7, k1
- vpsubb ymm5, ymm6, ymmword ptr [@RWD160]
- vpshufb ymm5, ymm3, ymm5
+ vpsubb ymm7, ymm6, ymmword ptr [@RWD160]
+ vpshufb ymm7, ymm3, ymm7
vpshufb ymm6, ymm2, ymm6
- vpternlogd ymm7, ymm5, ymm6, -54
- vpand ymm5, ymm7, ymmword ptr [ebp-0xAC]
+ vpblendmb ymm6 {k1}, ymm6, ymm7
+ vpand ymm5, ymm6, ymm5
vxorps ymm6, ymm6, ymm6
vpcmpeqb ymm5, ymm5, ymm6
vpcmpeqd ymm6, ymm6, ymm6
vpxor ymm5, ymm5, ymm6
- vmovups ymmword ptr [ebp-0x8C], ymm5
vpsrld ymm6, ymm4, 5
vpand ymm6, ymm6, ymmword ptr [@RWD32]
vmovups ymm7, ymmword ptr [@RWD64]
vpshufb ymm6, ymm7, ymm6
vpand ymm4, ymm4, ymmword ptr [@RWD96]
vpcmpub k1, ymm4, ymmword ptr [@RWD128], 6
- vpmovm2b ymm7, k1
- vpsubb ymm5, ymm4, ymmword ptr [@RWD160]
- vpshufb ymm5, ymm3, ymm5
+ vpsubb ymm7, ymm4, ymmword ptr [@RWD160]
+ vpshufb ymm7, ymm3, ymm7
vpshufb ymm4, ymm2, ymm4
- vpternlogd ymm7, ymm5, ymm4, -54
- vpand ymm4, ymm7, ymm6
- vxorps ymm5, ymm5, ymm5
- vpcmpeqb ymm4, ymm4, ymm5
- vpcmpeqd ymm5, ymm5, ymm5
- vpxor ymm4, ymm4, ymm5
- vmovups ymm5, ymmword ptr [ebp-0x8C]
+ vpblendmb ymm4 {k1}, ymm4, ymm7
+ vpand ymm4, ymm4, ymm6
+ vxorps ymm6, ymm6, ymm6
+ vpcmpeqb ymm4, ymm4, ymm6
+ vpcmpeqd ymm6, ymm6, ymm6
+ vpxor ymm4, ymm4, ymm6
vpand ymm4, ymm5, ymm4
vptest ymm4, ymm4
je SHORT G_M48875_IG11
- ;; size=274 bbWeight=4 PerfScore 348.00
+ ;; size=232 bbWeight=4 PerfScore 313.33
G_M48875_IG07: ; bbWeight=2, gcrefRegs=00000000 {}, byrefRegs=00000009 {eax ebx}, byref
vpermq ymm4, ymm4, -40
vpmovmskb esi, ymm4
@@ -229,7 +224,7 @@ G_M48875_IG08: ; bbWeight=16, gcrefRegs=00000000 {}, byrefRegs=00000009 {
lea edi, bword ptr [ebx+2*edi]
; byrRegs +[edi]
movzx ecx, word ptr [edi]
- push dword ptr [ebp-0xB0]
+ push dword ptr [ebp-0x70]
movsx edx, cx
mov ecx, eax
; byrRegs +[ecx]
@@ -239,19 +234,19 @@ G_M48875_IG08: ; bbWeight=16, gcrefRegs=00000000 {}, byrefRegs=00000009 {
jne SHORT G_M48875_IG12
blsr esi, esi
jne SHORT G_M48875_IG10
- ;; size=40 bbWeight=16 PerfScore 192.00
+ ;; size=37 bbWeight=16 PerfScore 192.00
G_M48875_IG09: ; bbWeight=4, gcrefRegs=00000000 {}, byrefRegs=00000008 {ebx}, byref, isz
; byrRegs -[edi]
add ebx, 64
- mov edi, bword ptr [ebp-0xC0]
+ mov edi, bword ptr [ebp-0x80]
; byrRegs +[edi]
cmp ebx, edi
- mov eax, bword ptr [ebp-0xC8]
+ mov eax, bword ptr [ebp-0x88]
; byrRegs +[eax]
vmovups ymm2, ymmword ptr [ebp-0x4C]
vmovups ymm3, ymmword ptr [ebp-0x6C]
jbe G_M48875_IG06
- mov ecx, bword ptr [ebp-0xB8]
+ mov ecx, bword ptr [ebp-0x78]
; byrRegs +[ecx]
cmp ebx, ecx
je SHORT G_M48875_IG14
@@ -261,13 +256,13 @@ G_M48875_IG09: ; bbWeight=4, gcrefRegs=00000000 {}, byrefRegs=00000008 {e
; byrRegs -[esi]
cmp esi, 32
jle SHORT G_M48875_IG16
- mov edi, bword ptr [ebp-0xC0]
+ mov edi, bword ptr [ebp-0x80]
mov ebx, edi
jmp G_M48875_IG06
- ;; size=65 bbWeight=4 PerfScore 75.00
+ ;; size=56 bbWeight=4 PerfScore 75.00
G_M48875_IG10: ; bbWeight=8, gcrefRegs=00000000 {}, byrefRegs=00000008 {ebx}, byref, isz
; byrRegs -[eax ecx edi]
- mov eax, bword ptr [ebp-0xC8]
+ mov eax, bword ptr [ebp-0x88]
; byrRegs +[eax]
jmp SHORT G_M48875_IG08
;; size=8 bbWeight=8 PerfScore 24.00
@@ -279,10 +274,10 @@ G_M48875_IG12: ; bbWeight=0.50, gcVars=0000000000001000 {V01}, gcrefRegs=
; GC ptr vars -{V04 V11 V91}
mov eax, edi
; byrRegs +[eax]
- sub eax, dword ptr [ebp-0xB4]
+ sub eax, dword ptr [ebp-0x74]
; byrRegs -[eax]
shr eax, 1
- ;; size=10 bbWeight=0.50 PerfScore 1.38
+ ;; size=7 bbWeight=0.50 PerfScore 1.38
G_M48875_IG13: ; bbWeight=0.50, epilog, nogc, extend
vzeroupper
lea esp, [ebp-0x0C]
@@ -317,10 +312,10 @@ G_M48875_IG16: ; bbWeight=0.50, gcVars=0000000000001220 {V01 V04 V91}, gc
G_M48875_IG17: ; bbWeight=0.50, gcrefRegs=00000000 {}, byrefRegs=0000000B {eax ecx ebx}, byref
lea edi, bword ptr [ecx-0x20]
...
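
In the hot loop of the diff above, each `vpmovm2b` + `vpternlogd ..., -54` pair is replaced by a single `vpblendmb` under mask `k1`. The immediate -54 is 0xCA as an unsigned byte, which is the ternary-logic truth table for a bitwise select (`A ? B : C`, with the destination register as `A`). That is why a masked blend can stand in for it once the mask stays in `k1`. The sketch below is a plain-Python model of that truth table, not JIT code, and its function names are illustrative only:

```python
def ternlog_bit(imm8: int, a: int, b: int, c: int) -> int:
    """One result bit of vpternlogd: imm8 is indexed by (a << 2) | (b << 1) | c,
    where a comes from the destination operand, b and c from the two sources."""
    return (imm8 >> ((a << 2) | (b << 1) | c)) & 1


def select_bit(mask: int, if_set: int, if_clear: int) -> int:
    """Bitwise select: what a masked blend computes for each element."""
    return if_set if mask else if_clear


IMM = -54 & 0xFF  # the signed immediate from the disassembly, as an unsigned byte

# Exhaustively compare the two functions over all eight input combinations.
matches = all(
    ternlog_bit(IMM, a, b, c) == select_bit(a, b, c)
    for a in (0, 1)
    for b in (0, 1)
    for c in (0, 1)
)
print(hex(IMM), matches)
```

Because `vpmovm2b` expands each mask bit to an all-ones or all-zeros byte, the select in the base code operates on whole bytes, which matches the per-byte granularity of `vpblendmb`.
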
benchmarks.run_pgo.windows.x86.checked.mch
-126 (-11.73%) : 94556.dasm - System.Buffers.ProbabilisticMap:IndexOfAnyVectorized(byref,byref,int,System.ReadOnlySpan`1[ushort]):int (Tier0-FullOpts)
@@ -9,23 +9,23 @@
; Final local variable assignments
;
; V00 arg0 [V00,T13] ( 4, 4 ) byref -> edi single-def
-; V01 arg1 [V01,T12] ( 6, 5 ) byref -> [ebp-0xB4] single-def
+; V01 arg1 [V01,T12] ( 6, 5 ) byref -> [ebp-0x74] single-def
; V02 arg2 [V02,T14] ( 3, 3 ) int -> [ebp+0x10] single-def
; V03 arg3 [V03,T15] ( 2, 2 ) struct ( 8) [ebp+0x08] do-not-enreg[S] single-def <System.ReadOnlySpan`1[ushort]>
-; V04 loc0 [V04,T09] ( 7, 14.50) byref -> [ebp-0xB8] spill-single-def
+; V04 loc0 [V04,T09] ( 7, 14.50) byref -> [ebp-0x78] spill-single-def
; V05 loc1 [V05,T00] ( 19, 93.50) byref -> ebx
; V06 loc2 [V06,T30] ( 5, 10 ) simd16 -> [ebp-0x1C] spill-single-def <System.Runtime.Intrinsics.Vector128`1[ubyte]>
; V07 loc3 [V07,T31] ( 5, 10 ) simd16 -> [ebp-0x2C] spill-single-def <System.Runtime.Intrinsics.Vector128`1[ubyte]>
-; V08 loc4 [V08,T10] ( 3, 8.50) byref -> [ebp-0xBC] single-def
+; V08 loc4 [V08,T10] ( 3, 8.50) byref -> [ebp-0x7C] single-def
; V09 loc5 [V09,T32] ( 3, 8.50) simd32 -> [ebp-0x4C] spill-single-def <System.Runtime.Intrinsics.Vector256`1[ubyte]>
; V10 loc6 [V10,T33] ( 3, 8.50) simd32 -> [ebp-0x6C] spill-single-def <System.Runtime.Intrinsics.Vector256`1[ubyte]>
-; V11 loc7 [V11,T11] ( 3, 8.50) byref -> [ebp-0xC0] spill-single-def
+; V11 loc7 [V11,T11] ( 3, 8.50) byref -> [ebp-0x80] spill-single-def
; V12 loc8 [V12,T20] ( 4, 14 ) simd32 -> mm4 <System.Runtime.Intrinsics.Vector256`1[ubyte]>
; V13 loc9 [V13,T01] ( 5, 66 ) int -> esi
; V14 loc10 [V14,T07] ( 3, 32.50) byref -> edi
; V15 loc11 [V15,T21] ( 4, 14 ) simd16 -> mm2 <System.Runtime.Intrinsics.Vector128`1[ubyte]>
; V16 loc12 [V16,T02] ( 5, 66 ) int -> esi
-; V17 loc13 [V17,T08] ( 3, 32.50) byref -> [ebp-0xC4] spill-single-def
+; V17 loc13 [V17,T08] ( 3, 32.50) byref -> [ebp-0x84] spill-single-def
;* V18 tmp0 [V18 ] ( 0, 0 ) int -> zero-ref
;* V19 tmp1 [V19 ] ( 0, 0 ) ubyte -> zero-ref "Inlining Arg"
;* V20 tmp2 [V20 ] ( 0, 0 ) ubyte -> zero-ref "Inlining Arg"
@@ -39,10 +39,10 @@
; V28 tmp10 [V28,T23] ( 3, 12 ) simd32 -> mm5 "Inline stloc first use temp" <System.Runtime.Intrinsics.Vector256`1[ushort]>
; V29 tmp11 [V29,T24] ( 3, 12 ) simd32 -> mm6 "Inline stloc first use temp" <System.Runtime.Intrinsics.Vector256`1[ubyte]>
; V30 tmp12 [V30,T25] ( 3, 12 ) simd32 -> mm4 "Inline stloc first use temp" <System.Runtime.Intrinsics.Vector256`1[ubyte]>
-; V31 tmp13 [V31,T34] ( 2, 8 ) simd32 -> [ebp-0x8C] spill-single-def "Inline stloc first use temp" <System.Runtime.Intrinsics.Vector256`1[ubyte]>
+; V31 tmp13 [V31,T34] ( 2, 8 ) simd32 -> mm5 "Inline stloc first use temp" <System.Runtime.Intrinsics.Vector256`1[ubyte]>
; V32 tmp14 [V32,T35] ( 2, 8 ) simd32 -> mm4 "Inline stloc first use temp" <System.Runtime.Intrinsics.Vector256`1[ubyte]>
;* V33 tmp15 [V33 ] ( 0, 0 ) simd32 -> zero-ref "Inline stloc first use temp" <System.Runtime.Intrinsics.Vector256`1[ubyte]>
-; V34 tmp16 [V34,T36] ( 2, 8 ) simd32 -> [ebp-0xAC] spill-single-def "Inline stloc first use temp" <System.Runtime.Intrinsics.Vector256`1[ubyte]>
+; V34 tmp16 [V34,T36] ( 2, 8 ) simd32 -> mm5 "Inline stloc first use temp" <System.Runtime.Intrinsics.Vector256`1[ubyte]>
; V35 tmp17 [V35,T16] ( 4, 16 ) simd32 -> mm6 "Inline stloc first use temp" <System.Runtime.Intrinsics.Vector256`1[ubyte]>
;* V36 tmp18 [V36 ] ( 0, 0 ) simd32 -> zero-ref "Inline stloc first use temp" <System.Runtime.Intrinsics.Vector256`1[ubyte]>
;* V37 tmp19 [V37 ] ( 0, 0 ) simd32 -> zero-ref "Inline stloc first use temp" <System.Runtime.Intrinsics.Vector256`1[ubyte]>
@@ -99,10 +99,10 @@
;* V88 tmp70 [V88 ] ( 0, 0 ) int -> zero-ref "field V80._length (fldOffset=0x4)" P-INDEP
;* V89 tmp71 [V89 ] ( 0, 0 ) byref -> zero-ref "field V82._reference (fldOffset=0x0)" P-INDEP
;* V90 tmp72 [V90 ] ( 0, 0 ) int -> zero-ref "field V82._length (fldOffset=0x4)" P-INDEP
-; V91 tmp73 [V91,T05] ( 3, 33 ) byref -> [ebp-0xC8] spill-single-def "V03.[000..004)"
-; V92 tmp74 [V92,T06] ( 3, 33 ) int -> [ebp-0xB0] spill-single-def "V03.[004..008)"
+; V91 tmp73 [V91,T05] ( 3, 33 ) byref -> [ebp-0x88] spill-single-def "V03.[000..004)"
+; V92 tmp74 [V92,T06] ( 3, 33 ) int -> [ebp-0x70] spill-single-def "V03.[004..008)"
;
-; Lcl frame size = 188
+; Lcl frame size = 124
G_M48875_IG01: ; bbWeight=1, gcrefRegs=00000000 {}, byrefRegs=00000000 {}, byref, nogc <-- Prolog IG
push ebp
@@ -110,31 +110,31 @@ G_M48875_IG01: ; bbWeight=1, gcrefRegs=00000000 {}, byrefRegs=00000000 {}
push edi
push esi
push ebx
- sub esp, 188
+ sub esp, 124
vzeroupper
mov edi, ecx
; byrRegs +[edi]
mov esi, edx
; byrRegs +[esi]
mov ebx, dword ptr [ebp+0x10]
- ;; size=22 bbWeight=1 PerfScore 7.00
+ ;; size=19 bbWeight=1 PerfScore 7.00
G_M48875_IG02: ; bbWeight=1, gcrefRegs=00000000 {}, byrefRegs=000000C0 {esi edi}, byref
mov eax, bword ptr [ebp+0x08]
; byrRegs +[eax]
- mov bword ptr [ebp-0xC8], eax
+ mov bword ptr [ebp-0x88], eax
; GC ptr vars +{V91}
mov ecx, dword ptr [ebp+0x0C]
- mov dword ptr [ebp-0xB0], ecx
+ mov dword ptr [ebp-0x70], ecx
cmp ebx, 16
jl G_M48875_IG25
- ;; size=27 bbWeight=1 PerfScore 5.25
+ ;; size=24 bbWeight=1 PerfScore 5.25
G_M48875_IG03: ; bbWeight=1, gcVars=0000000000000020 {V91}, gcrefRegs=00000000 {}, byrefRegs=000000C1 {eax esi edi}, gcvars, byref
mov dword ptr [ebp+0x10], ebx
lea edx, bword ptr [esi+2*ebx]
; byrRegs +[edx]
- mov bword ptr [ebp-0xB8], edx
+ mov bword ptr [ebp-0x78], edx
; GC ptr vars +{V04}
- mov bword ptr [ebp-0xB4], esi
+ mov bword ptr [ebp-0x74], esi
; GC ptr vars +{V01}
mov ebx, esi
; byrRegs +[ebx]
@@ -144,7 +144,7 @@ G_M48875_IG03: ; bbWeight=1, gcVars=0000000000000020 {V91}, gcrefRegs=000
vmovups xmmword ptr [ebp-0x2C], xmm1
cmp dword ptr [ebp+0x10], 32
jl G_M48875_IG16
- ;; size=49 bbWeight=1 PerfScore 16.75
+ ;; size=43 bbWeight=1 PerfScore 16.75
G_M48875_IG04: ; bbWeight=0.50, gcVars=0000000000001220 {V01 V04 V91}, gcrefRegs=00000000 {}, byrefRegs=0000000D {eax edx ebx}, gcvars, byref
; byrRegs -[esi edi]
vmovaps ymm2, ymm0
@@ -155,10 +155,10 @@ G_M48875_IG04: ; bbWeight=0.50, gcVars=0000000000001220 {V01 V04 V91}, gc
vmovups ymmword ptr [ebp-0x6C], ymm3
lea edi, bword ptr [edx-0x40]
; byrRegs +[edi]
- mov bword ptr [ebp-0xC0], edi
+ mov bword ptr [ebp-0x80], edi
; GC ptr vars +{V11}
- ;; size=39 bbWeight=0.50 PerfScore 4.00
-G_M48875_IG05: ; bbWeight=4, gcVars=0000000000001A20 {V01 V04 V11 V91}, gcrefRegs=00000000 {}, byrefRegs=00000009 {eax ebx}, gcvars, byref
+ ;; size=36 bbWeight=0.50 PerfScore 4.00
+G_M48875_IG05: ; bbWeight=4, gcVars=0000000000001A20 {V01 V04 V11 V91}, gcrefRegs=00000000 {}, byrefRegs=00000009 {eax ebx}, gcvars, byref, isz
; byrRegs -[edx edi]
vmovups ymm4, ymmword ptr [ebx]
vmovups ymm5, ymmword ptr [ebx+0x20]
@@ -172,41 +172,36 @@ G_M48875_IG05: ; bbWeight=4, gcVars=0000000000001A20 {V01 V04 V11 V91}, g
vpand ymm5, ymm5, ymmword ptr [@RWD32]
vmovups ymm7, ymmword ptr [@RWD64]
vpshufb ymm5, ymm7, ymm5
- vmovups ymmword ptr [ebp-0xAC], ymm5
vpand ymm6, ymm6, ymmword ptr [@RWD96]
vpcmpub k1, ymm6, ymmword ptr [@RWD128], 6
- vpmovm2b ymm7, k1
- vpsubb ymm5, ymm6, ymmword ptr [@RWD160]
- vpshufb ymm5, ymm3, ymm5
+ vpsubb ymm7, ymm6, ymmword ptr [@RWD160]
+ vpshufb ymm7, ymm3, ymm7
vpshufb ymm6, ymm2, ymm6
- vpternlogd ymm7, ymm5, ymm6, -54
- vpand ymm5, ymm7, ymmword ptr [ebp-0xAC]
+ vpblendmb ymm6 {k1}, ymm6, ymm7
+ vpand ymm5, ymm6, ymm5
vxorps ymm6, ymm6, ymm6
vpcmpeqb ymm5, ymm5, ymm6
vpcmpeqd ymm6, ymm6, ymm6
vpxor ymm5, ymm5, ymm6
- vmovups ymmword ptr [ebp-0x8C], ymm5
vpsrld ymm6, ymm4, 5
vpand ymm6, ymm6, ymmword ptr [@RWD32]
vmovups ymm7, ymmword ptr [@RWD64]
vpshufb ymm6, ymm7, ymm6
vpand ymm4, ymm4, ymmword ptr [@RWD96]
vpcmpub k1, ymm4, ymmword ptr [@RWD128], 6
- vpmovm2b ymm7, k1
- vpsubb ymm5, ymm4, ymmword ptr [@RWD160]
- vpshufb ymm5, ymm3, ymm5
+ vpsubb ymm7, ymm4, ymmword ptr [@RWD160]
+ vpshufb ymm7, ymm3, ymm7
vpshufb ymm4, ymm2, ymm4
- vpternlogd ymm7, ymm5, ymm4, -54
- vpand ymm4, ymm7, ymm6
- vxorps ymm5, ymm5, ymm5
- vpcmpeqb ymm4, ymm4, ymm5
- vpcmpeqd ymm5, ymm5, ymm5
- vpxor ymm4, ymm4, ymm5
- vmovups ymm5, ymmword ptr [ebp-0x8C]
+ vpblendmb ymm4 {k1}, ymm4, ymm7
+ vpand ymm4, ymm4, ymm6
+ vxorps ymm6, ymm6, ymm6
+ vpcmpeqb ymm4, ymm4, ymm6
+ vpcmpeqd ymm6, ymm6, ymm6
+ vpxor ymm4, ymm4, ymm6
vpand ymm4, ymm5, ymm4
vptest ymm4, ymm4
- je G_M48875_IG10
- ;; size=278 bbWeight=4 PerfScore 348.00
+ je SHORT G_M48875_IG10
+ ;; size=232 bbWeight=4 PerfScore 313.33
G_M48875_IG06: ; bbWeight=2, gcrefRegs=00000000 {}, byrefRegs=00000009 {eax ebx}, byref
vpermq ymm4, ymm4, -40
vpmovmskb esi, ymm4
@@ -231,16 +226,16 @@ G_M48875_IG07: ; bbWeight=16, gcrefRegs=00000000 {}, byrefRegs=00000009 {
G_M48875_IG08: ; bbWeight=4, gcrefRegs=00000000 {}, byrefRegs=00000008 {ebx}, byref, isz
; byrRegs -[edi]
add ebx, 64
- mov edi, bword ptr [ebp-0xC0]
+ mov edi, bword ptr [ebp-0x80]
; byrRegs +[edi]
cmp ebx, edi
- mov eax, bword ptr [ebp-0xC8]
+ mov eax, bword ptr [ebp-0x88]
; byrRegs +[eax]
- mov ecx, dword ptr [ebp-0xB0]
+ mov ecx, dword ptr [ebp-0x70]
vmovups ymm2, ymmword ptr [ebp-0x4C]
vmovups ymm3, ymmword ptr [ebp-0x6C]
jbe G_M48875_IG05
- mov edx, bword ptr [ebp-0xB8]
+ mov edx, bword ptr [ebp-0x78]
; byrRegs +[edx]
cmp ebx, edx
je SHORT G_M48875_IG13
@@ -250,17 +245,17 @@ G_M48875_IG08: ; bbWeight=4, gcrefRegs=00000000 {}, byrefRegs=00000008 {e
; byrRegs -[esi]
cmp esi, 32
jle SHORT G_M48875_IG15
- mov edi, bword ptr [ebp-0xC0]
+ mov edi, bword ptr [ebp-0x80]
mov ebx, edi
jmp G_M48875_IG05
- ;; size=71 bbWeight=4 PerfScore 79.00
+ ;; size=59 bbWeight=4 PerfScore 79.00
G_M48875_IG09: ; bbWeight=8, gcrefRegs=00000000 {}, byrefRegs=00000008 {ebx}, byref, isz
; byrRegs -[eax edx edi]
- mov eax, bword ptr [ebp-0xC8]
+ mov eax, bword ptr [ebp-0x88]
; byrRegs +[eax]
- mov ecx, dword ptr [ebp-0xB0]
+ mov ecx, dword ptr [ebp-0x70]
jmp SHORT G_M48875_IG07
- ;; size=14 bbWeight=8 PerfScore 32.00
+ ;; size=11 bbWeight=8 PerfScore 32.00
G_M48875_IG10: ; bbWeight=2, gcrefRegs=00000000 {}, byrefRegs=00000009 {eax ebx}, byref, isz
jmp SHORT G_M48875_IG08
;; size=2 bbWeight=2 PerfScore 4.00
@@ -269,10 +264,10 @@ G_M48875_IG11: ; bbWeight=0.50, gcVars=0000000000001000 {V01}, gcrefRegs=
; GC ptr vars -{V04 V11 V91}
mov eax, edi
; byrRegs +[eax]
- sub eax, dword ptr [ebp-0xB4]
+ sub eax, dword ptr [ebp-0x74]
; byrRegs -[eax]
shr eax, 1
- ;; size=10 bbWeight=0.50 PerfScore 1.38
+ ;; size=7 bbWeight=0.50 PerfScore 1.38
G_M48875_IG12: ; bbWeight=0.50, epilog, nogc, extend
vzeroupper
lea esp, [ebp-0x0C]
@@ -307,9 +302,9 @@ G_M48875_IG15: ; bbWeight=0.50, gcVars=0000000000001220 {V01 V04 V91}, gc
G_M48875_IG16: ; bbWeight=0.50, gcrefRegs=00000000 {}, byrefRegs=0000000D {eax edx ebx}, byref
lea edi, bword ptr [edx-0x20]
; byrRegs +[edi]
- mov bword ptr [ebp-0xBC], edi
+ mov bword ptr [ebp-0x7C], edi
; GC ptr vars +{V08}
- ;; size=9 bbWeight=0.50 PerfScore 0.75
+ ;; size=6 bbWeight=0.50 PerfScore 0.75
G_M48875_IG17: ; bbWeight=4, gcVars=0000000000001620 {V01 V04 V08 V91}, gcrefRegs=00000000 {}, byrefRegs=00000009 {eax ebx}, gcvars, byref
; byrRegs -[edx edi]
; GC ptr vars -{V05 V09 V12}
@@ -327,12 +322,11 @@ G_M48875_IG17: ; bbWeight=4, gcVars=0000000000001620 {V01 V04 V08 V91}, g
vpshufb xmm3, xmm5, xmm3
vpand xmm4, xmm4, xmmword ptr [@RWD96]
vpcmpub k1, xmm4, xmmword ptr [@RWD128], 6
- vpmovm2b xmm5, k1
- vpsubb xmm6, xmm4, xmmword ptr [@RWD160]
- vpshufb xmm6, xmm1, xmm6
...
benchmarks.run_tiered.windows.x86.checked.mch
-117 (-11.17%) : 44440.dasm - System.Buffers.ProbabilisticMap:IndexOfAnyVectorized(byref,byref,int,System.ReadOnlySpan`1[ushort]):int (Tier0-FullOpts)
@@ -9,23 +9,23 @@
; Final local variable assignments
;
; V00 arg0 [V00,T13] ( 4, 4 ) byref -> edi single-def
-; V01 arg1 [V01,T12] ( 6, 5 ) byref -> [ebp-0xB4] single-def
+; V01 arg1 [V01,T12] ( 6, 5 ) byref -> [ebp-0x74] single-def
; V02 arg2 [V02,T14] ( 3, 3 ) int -> [ebp+0x10] single-def
; V03 arg3 [V03,T15] ( 2, 2 ) struct ( 8) [ebp+0x08] do-not-enreg[S] single-def <System.ReadOnlySpan`1[ushort]>
-; V04 loc0 [V04,T09] ( 7, 14.50) byref -> [ebp-0xB8] spill-single-def
+; V04 loc0 [V04,T09] ( 7, 14.50) byref -> [ebp-0x78] spill-single-def
; V05 loc1 [V05,T00] ( 19, 93.50) byref -> ebx
; V06 loc2 [V06,T30] ( 5, 10 ) simd16 -> [ebp-0x1C] spill-single-def <System.Runtime.Intrinsics.Vector128`1[ubyte]>
; V07 loc3 [V07,T31] ( 5, 10 ) simd16 -> [ebp-0x2C] spill-single-def <System.Runtime.Intrinsics.Vector128`1[ubyte]>
-; V08 loc4 [V08,T10] ( 3, 8.50) byref -> [ebp-0xBC] single-def
+; V08 loc4 [V08,T10] ( 3, 8.50) byref -> [ebp-0x7C] single-def
; V09 loc5 [V09,T32] ( 3, 8.50) simd32 -> [ebp-0x4C] spill-single-def <System.Runtime.Intrinsics.Vector256`1[ubyte]>
; V10 loc6 [V10,T33] ( 3, 8.50) simd32 -> [ebp-0x6C] spill-single-def <System.Runtime.Intrinsics.Vector256`1[ubyte]>
-; V11 loc7 [V11,T11] ( 3, 8.50) byref -> [ebp-0xC0] spill-single-def
+; V11 loc7 [V11,T11] ( 3, 8.50) byref -> [ebp-0x80] spill-single-def
; V12 loc8 [V12,T20] ( 4, 14 ) simd32 -> mm4 <System.Runtime.Intrinsics.Vector256`1[ubyte]>
; V13 loc9 [V13,T01] ( 5, 66 ) int -> esi
; V14 loc10 [V14,T07] ( 3, 32.50) byref -> edi
; V15 loc11 [V15,T21] ( 4, 14 ) simd16 -> mm2 <System.Runtime.Intrinsics.Vector128`1[ubyte]>
; V16 loc12 [V16,T02] ( 5, 66 ) int -> esi
-; V17 loc13 [V17,T08] ( 3, 32.50) byref -> [ebp-0xC4] spill-single-def
+; V17 loc13 [V17,T08] ( 3, 32.50) byref -> [ebp-0x84] spill-single-def
;* V18 tmp0 [V18 ] ( 0, 0 ) int -> zero-ref
;* V19 tmp1 [V19 ] ( 0, 0 ) ubyte -> zero-ref "Inlining Arg"
;* V20 tmp2 [V20 ] ( 0, 0 ) ubyte -> zero-ref "Inlining Arg"
@@ -39,10 +39,10 @@
; V28 tmp10 [V28,T23] ( 3, 12 ) simd32 -> mm5 "Inline stloc first use temp" <System.Runtime.Intrinsics.Vector256`1[ushort]>
; V29 tmp11 [V29,T24] ( 3, 12 ) simd32 -> mm6 "Inline stloc first use temp" <System.Runtime.Intrinsics.Vector256`1[ubyte]>
; V30 tmp12 [V30,T25] ( 3, 12 ) simd32 -> mm4 "Inline stloc first use temp" <System.Runtime.Intrinsics.Vector256`1[ubyte]>
-; V31 tmp13 [V31,T34] ( 2, 8 ) simd32 -> [ebp-0x8C] spill-single-def "Inline stloc first use temp" <System.Runtime.Intrinsics.Vector256`1[ubyte]>
+; V31 tmp13 [V31,T34] ( 2, 8 ) simd32 -> mm5 "Inline stloc first use temp" <System.Runtime.Intrinsics.Vector256`1[ubyte]>
; V32 tmp14 [V32,T35] ( 2, 8 ) simd32 -> mm4 "Inline stloc first use temp" <System.Runtime.Intrinsics.Vector256`1[ubyte]>
;* V33 tmp15 [V33 ] ( 0, 0 ) simd32 -> zero-ref "Inline stloc first use temp" <System.Runtime.Intrinsics.Vector256`1[ubyte]>
-; V34 tmp16 [V34,T36] ( 2, 8 ) simd32 -> [ebp-0xAC] spill-single-def "Inline stloc first use temp" <System.Runtime.Intrinsics.Vector256`1[ubyte]>
+; V34 tmp16 [V34,T36] ( 2, 8 ) simd32 -> mm5 "Inline stloc first use temp" <System.Runtime.Intrinsics.Vector256`1[ubyte]>
; V35 tmp17 [V35,T16] ( 4, 16 ) simd32 -> mm6 "Inline stloc first use temp" <System.Runtime.Intrinsics.Vector256`1[ubyte]>
;* V36 tmp18 [V36 ] ( 0, 0 ) simd32 -> zero-ref "Inline stloc first use temp" <System.Runtime.Intrinsics.Vector256`1[ubyte]>
;* V37 tmp19 [V37 ] ( 0, 0 ) simd32 -> zero-ref "Inline stloc first use temp" <System.Runtime.Intrinsics.Vector256`1[ubyte]>
@@ -99,10 +99,10 @@
;* V88 tmp70 [V88 ] ( 0, 0 ) int -> zero-ref "field V80._length (fldOffset=0x4)" P-INDEP
;* V89 tmp71 [V89 ] ( 0, 0 ) byref -> zero-ref "field V82._reference (fldOffset=0x0)" P-INDEP
;* V90 tmp72 [V90 ] ( 0, 0 ) int -> zero-ref "field V82._length (fldOffset=0x4)" P-INDEP
-; V91 tmp73 [V91,T05] ( 3, 33 ) byref -> [ebp-0xC8] spill-single-def "V03.[000..004)"
-; V92 tmp74 [V92,T06] ( 3, 33 ) int -> [ebp-0xB0] spill-single-def "V03.[004..008)"
+; V91 tmp73 [V91,T05] ( 3, 33 ) byref -> [ebp-0x88] spill-single-def "V03.[000..004)"
+; V92 tmp74 [V92,T06] ( 3, 33 ) int -> [ebp-0x70] spill-single-def "V03.[004..008)"
;
-; Lcl frame size = 188
+; Lcl frame size = 124
G_M48875_IG01: ; bbWeight=1, gcrefRegs=00000000 {}, byrefRegs=00000000 {}, byref, nogc <-- Prolog IG
push ebp
@@ -110,25 +110,25 @@ G_M48875_IG01: ; bbWeight=1, gcrefRegs=00000000 {}, byrefRegs=00000000 {}
push edi
push esi
push ebx
- sub esp, 188
+ sub esp, 124
vzeroupper
mov edi, ecx
; byrRegs +[edi]
mov esi, edx
; byrRegs +[esi]
mov ebx, dword ptr [ebp+0x10]
- ;; size=22 bbWeight=1 PerfScore 7.00
+ ;; size=19 bbWeight=1 PerfScore 7.00
G_M48875_IG02: ; bbWeight=1, gcrefRegs=00000000 {}, byrefRegs=000000C0 {esi edi}, byref, isz
mov eax, bword ptr [ebp+0x08]
; byrRegs +[eax]
- mov bword ptr [ebp-0xC8], eax
+ mov bword ptr [ebp-0x88], eax
; GC ptr vars +{V91}
mov edx, dword ptr [ebp+0x0C]
- mov dword ptr [ebp-0xB0], edx
- mov eax, bword ptr [ebp-0xC8]
+ mov dword ptr [ebp-0x70], edx
+ mov eax, bword ptr [ebp-0x88]
cmp ebx, 16
jge SHORT G_M48875_IG04
- ;; size=29 bbWeight=1 PerfScore 6.25
+ ;; size=26 bbWeight=1 PerfScore 6.25
G_M48875_IG03: ; bbWeight=0.50, gcVars=0000000000000020 {V91}, gcrefRegs=00000000 {}, byrefRegs=000000C1 {eax esi edi}, gcvars, byref
mov ecx, 0xD1FFAB1E
; gcrRegs +[ecx]
@@ -137,16 +137,16 @@ G_M48875_IG03: ; bbWeight=0.50, gcVars=0000000000000020 {V91}, gcrefRegs=
call [<unknown method>]
; gcrRegs -[ecx edx]
; byrRegs -[eax]
- mov eax, bword ptr [ebp-0xC8]
+ mov eax, bword ptr [ebp-0x88]
; byrRegs +[eax]
;; size=22 bbWeight=0.50 PerfScore 2.25
G_M48875_IG04: ; bbWeight=1, gcrefRegs=00000000 {}, byrefRegs=000000C1 {eax esi edi}, byref
mov dword ptr [ebp+0x10], ebx
lea ecx, bword ptr [esi+2*ebx]
; byrRegs +[ecx]
- mov bword ptr [ebp-0xB8], ecx
+ mov bword ptr [ebp-0x78], ecx
; GC ptr vars +{V04}
- mov bword ptr [ebp-0xB4], esi
+ mov bword ptr [ebp-0x74], esi
; GC ptr vars +{V01}
mov ebx, esi
; byrRegs +[ebx]
@@ -156,7 +156,7 @@ G_M48875_IG04: ; bbWeight=1, gcrefRegs=00000000 {}, byrefRegs=000000C1 {e
vmovups xmmword ptr [ebp-0x2C], xmm1
cmp dword ptr [ebp+0x10], 32
jl G_M48875_IG17
- ;; size=49 bbWeight=1 PerfScore 16.75
+ ;; size=43 bbWeight=1 PerfScore 16.75
G_M48875_IG05: ; bbWeight=0.50, gcVars=0000000000001220 {V01 V04 V91}, gcrefRegs=00000000 {}, byrefRegs=0000000B {eax ecx ebx}, gcvars, byref
; byrRegs -[esi edi]
vmovaps ymm2, ymm0
@@ -167,9 +167,9 @@ G_M48875_IG05: ; bbWeight=0.50, gcVars=0000000000001220 {V01 V04 V91}, gc
vmovups ymmword ptr [ebp-0x6C], ymm3
lea edi, bword ptr [ecx-0x40]
; byrRegs +[edi]
- mov bword ptr [ebp-0xC0], edi
+ mov bword ptr [ebp-0x80], edi
; GC ptr vars +{V11}
- ;; size=39 bbWeight=0.50 PerfScore 4.00
+ ;; size=36 bbWeight=0.50 PerfScore 4.00
G_M48875_IG06: ; bbWeight=4, gcVars=0000000000001A20 {V01 V04 V11 V91}, gcrefRegs=00000000 {}, byrefRegs=00000009 {eax ebx}, gcvars, byref, isz
; byrRegs -[ecx edi]
vmovups ymm4, ymmword ptr [ebx]
@@ -184,41 +184,36 @@ G_M48875_IG06: ; bbWeight=4, gcVars=0000000000001A20 {V01 V04 V11 V91}, g
vpand ymm5, ymm5, ymmword ptr [@RWD32]
vmovups ymm7, ymmword ptr [@RWD64]
vpshufb ymm5, ymm7, ymm5
- vmovups ymmword ptr [ebp-0xAC], ymm5
vpand ymm6, ymm6, ymmword ptr [@RWD96]
vpcmpub k1, ymm6, ymmword ptr [@RWD128], 6
- vpmovm2b ymm7, k1
- vpsubb ymm5, ymm6, ymmword ptr [@RWD160]
- vpshufb ymm5, ymm3, ymm5
+ vpsubb ymm7, ymm6, ymmword ptr [@RWD160]
+ vpshufb ymm7, ymm3, ymm7
vpshufb ymm6, ymm2, ymm6
- vpternlogd ymm7, ymm5, ymm6, -54
- vpand ymm5, ymm7, ymmword ptr [ebp-0xAC]
+ vpblendmb ymm6 {k1}, ymm6, ymm7
+ vpand ymm5, ymm6, ymm5
vxorps ymm6, ymm6, ymm6
vpcmpeqb ymm5, ymm5, ymm6
vpcmpeqd ymm6, ymm6, ymm6
vpxor ymm5, ymm5, ymm6
- vmovups ymmword ptr [ebp-0x8C], ymm5
vpsrld ymm6, ymm4, 5
vpand ymm6, ymm6, ymmword ptr [@RWD32]
vmovups ymm7, ymmword ptr [@RWD64]
vpshufb ymm6, ymm7, ymm6
vpand ymm4, ymm4, ymmword ptr [@RWD96]
vpcmpub k1, ymm4, ymmword ptr [@RWD128], 6
- vpmovm2b ymm7, k1
- vpsubb ymm5, ymm4, ymmword ptr [@RWD160]
- vpshufb ymm5, ymm3, ymm5
+ vpsubb ymm7, ymm4, ymmword ptr [@RWD160]
+ vpshufb ymm7, ymm3, ymm7
vpshufb ymm4, ymm2, ymm4
- vpternlogd ymm7, ymm5, ymm4, -54
- vpand ymm4, ymm7, ymm6
- vxorps ymm5, ymm5, ymm5
- vpcmpeqb ymm4, ymm4, ymm5
- vpcmpeqd ymm5, ymm5, ymm5
- vpxor ymm4, ymm4, ymm5
- vmovups ymm5, ymmword ptr [ebp-0x8C]
+ vpblendmb ymm4 {k1}, ymm4, ymm7
+ vpand ymm4, ymm4, ymm6
+ vxorps ymm6, ymm6, ymm6
+ vpcmpeqb ymm4, ymm4, ymm6
+ vpcmpeqd ymm6, ymm6, ymm6
+ vpxor ymm4, ymm4, ymm6
vpand ymm4, ymm5, ymm4
vptest ymm4, ymm4
je SHORT G_M48875_IG11
- ;; size=274 bbWeight=4 PerfScore 348.00
+ ;; size=232 bbWeight=4 PerfScore 313.33
G_M48875_IG07: ; bbWeight=2, gcrefRegs=00000000 {}, byrefRegs=00000009 {eax ebx}, byref
vpermq ymm4, ymm4, -40
vpmovmskb esi, ymm4
@@ -229,7 +224,7 @@ G_M48875_IG08: ; bbWeight=16, gcrefRegs=00000000 {}, byrefRegs=00000009 {
lea edi, bword ptr [ebx+2*edi]
; byrRegs +[edi]
movzx ecx, word ptr [edi]
- push dword ptr [ebp-0xB0]
+ push dword ptr [ebp-0x70]
movsx edx, cx
mov ecx, eax
; byrRegs +[ecx]
@@ -239,19 +234,19 @@ G_M48875_IG08: ; bbWeight=16, gcrefRegs=00000000 {}, byrefRegs=00000009 {
jne SHORT G_M48875_IG12
blsr esi, esi
jne SHORT G_M48875_IG10
- ;; size=40 bbWeight=16 PerfScore 192.00
+ ;; size=37 bbWeight=16 PerfScore 192.00
G_M48875_IG09: ; bbWeight=4, gcrefRegs=00000000 {}, byrefRegs=00000008 {ebx}, byref, isz
; byrRegs -[edi]
add ebx, 64
- mov edi, bword ptr [ebp-0xC0]
+ mov edi, bword ptr [ebp-0x80]
; byrRegs +[edi]
cmp ebx, edi
- mov eax, bword ptr [ebp-0xC8]
+ mov eax, bword ptr [ebp-0x88]
; byrRegs +[eax]
vmovups ymm2, ymmword ptr [ebp-0x4C]
vmovups ymm3, ymmword ptr [ebp-0x6C]
jbe G_M48875_IG06
- mov ecx, bword ptr [ebp-0xB8]
+ mov ecx, bword ptr [ebp-0x78]
; byrRegs +[ecx]
cmp ebx, ecx
je SHORT G_M48875_IG14
@@ -261,13 +256,13 @@ G_M48875_IG09: ; bbWeight=4, gcrefRegs=00000000 {}, byrefRegs=00000008 {e
; byrRegs -[esi]
cmp esi, 32
jle SHORT G_M48875_IG16
- mov edi, bword ptr [ebp-0xC0]
+ mov edi, bword ptr [ebp-0x80]
mov ebx, edi
jmp G_M48875_IG06
- ;; size=65 bbWeight=4 PerfScore 75.00
+ ;; size=56 bbWeight=4 PerfScore 75.00
G_M48875_IG10: ; bbWeight=8, gcrefRegs=00000000 {}, byrefRegs=00000008 {ebx}, byref, isz
; byrRegs -[eax ecx edi]
- mov eax, bword ptr [ebp-0xC8]
+ mov eax, bword ptr [ebp-0x88]
; byrRegs +[eax]
jmp SHORT G_M48875_IG08
;; size=8 bbWeight=8 PerfScore 24.00
@@ -279,10 +274,10 @@ G_M48875_IG12: ; bbWeight=0.50, gcVars=0000000000001000 {V01}, gcrefRegs=
; GC ptr vars -{V04 V11 V91}
mov eax, edi
; byrRegs +[eax]
- sub eax, dword ptr [ebp-0xB4]
+ sub eax, dword ptr [ebp-0x74]
; byrRegs -[eax]
shr eax, 1
- ;; size=10 bbWeight=0.50 PerfScore 1.38
+ ;; size=7 bbWeight=0.50 PerfScore 1.38
G_M48875_IG13: ; bbWeight=0.50, epilog, nogc, extend
vzeroupper
lea esp, [ebp-0x0C]
@@ -317,10 +312,10 @@ G_M48875_IG16: ; bbWeight=0.50, gcVars=0000000000001220 {V01 V04 V91}, gc
G_M48875_IG17: ; bbWeight=0.50, gcrefRegs=00000000 {}, byrefRegs=0000000B {eax ecx ebx}, byref
lea edi, bword ptr [ecx-0x20]
...
coreclr_tests.run.windows.x86.checked.mch
-28 (-1.89%) : 469361.dasm - VectorTest+VectorRelopTest`1[int]:VectorRelOp(int,int):int (FullOpts)
@@ -289,10 +289,9 @@ G_M1266_IG17: ; bbWeight=4, gcrefRegs=00000000 {}, byrefRegs=00000000 {},
G_M1266_IG18: ; bbWeight=1, gcrefRegs=00000000 {}, byrefRegs=00000000 {}, byref
vmovups ymm2, ymmword ptr [ebp-0x68]
vpcmpd k1, ymm2, ymmword ptr [ebp-0x28], 2
- vpmovm2d ymm3, k1
- vpternlogd ymm3, ymm2, ymm1, -54
+ vpblendmd ymm3 {k1}, ymm1, ymm2
xor esi, esi
- ;; size=31 bbWeight=1 PerfScore 10.75
+ ;; size=24 bbWeight=1 PerfScore 9.58
G_M1266_IG19: ; bbWeight=4, gcrefRegs=00000000 {}, byrefRegs=00000000 {}, byref, isz
mov ecx, esi
vmovups ymmword ptr [ebp-0x88], ymm3
@@ -340,10 +339,9 @@ G_M1266_IG21: ; bbWeight=4, gcrefRegs=00000000 {}, byrefRegs=00000000 {},
G_M1266_IG22: ; bbWeight=1, gcrefRegs=00000000 {}, byrefRegs=00000000 {}, byref
vmovups ymm1, ymmword ptr [ebp-0x48]
vpcmpd k1, ymm2, ymm1, 2
- vpmovm2d ymm3, k1
- vpternlogd ymm3, ymm2, ymm1, -54
+ vpblendmd ymm3 {k1}, ymm1, ymm2
xor esi, esi
- ;; size=27 bbWeight=1 PerfScore 8.75
+ ;; size=20 bbWeight=1 PerfScore 7.58
G_M1266_IG23: ; bbWeight=4, gcrefRegs=00000000 {}, byrefRegs=00000000 {}, byref, isz
mov ecx, esi
vmovups ymmword ptr [ebp-0x88], ymm3
@@ -391,10 +389,9 @@ G_M1266_IG25: ; bbWeight=4, gcrefRegs=00000000 {}, byrefRegs=00000000 {},
G_M1266_IG26: ; bbWeight=1, gcrefRegs=00000000 {}, byrefRegs=00000000 {}, byref
vmovups ymm2, ymmword ptr [ebp-0x68]
vpcmpd k1, ymm2, ymmword ptr [ebp-0x28], 5
- vpmovm2d ymm3, k1
- vpternlogd ymm3, ymm2, ymm1, -54
+ vpblendmd ymm3 {k1}, ymm1, ymm2
xor esi, esi
- ;; size=31 bbWeight=1 PerfScore 10.75
+ ;; size=24 bbWeight=1 PerfScore 9.58
G_M1266_IG27: ; bbWeight=4, gcrefRegs=00000000 {}, byrefRegs=00000000 {}, byref, isz
mov ecx, esi
vmovups ymmword ptr [ebp-0x88], ymm3
@@ -442,10 +439,9 @@ G_M1266_IG29: ; bbWeight=4, gcrefRegs=00000000 {}, byrefRegs=00000000 {},
G_M1266_IG30: ; bbWeight=1, gcrefRegs=00000000 {}, byrefRegs=00000000 {}, byref
vmovups ymm1, ymmword ptr [ebp-0x48]
vpcmpd k1, ymm1, ymmword ptr [ebp-0x28], 5
- vpmovm2d ymm3, k1
- vpternlogd ymm3, ymm2, ymm1, -54
+ vpblendmd ymm3 {k1}, ymm1, ymm2
xor esi, esi
- ;; size=31 bbWeight=1 PerfScore 10.75
+ ;; size=24 bbWeight=1 PerfScore 9.58
G_M1266_IG31: ; bbWeight=4, gcrefRegs=00000000 {}, byrefRegs=00000000 {}, byref, isz
mov ecx, esi
vmovups ymmword ptr [ebp-0x88], ymm3
@@ -600,6 +596,6 @@ G_M1266_IG42: ; bbWeight=1, gcrefRegs=00000000 {}, byrefRegs=00000000 {},
ret
;; size=10 bbWeight=1 PerfScore 4.00
-; Total bytes of code 1483, prolog size 26, PerfScore 1134.08, instruction count 335, allocated bytes for code 1483 (MethodHash=881ffb0d) for method VectorTest+VectorRelopTest`1[int]:VectorRelOp(int,int):int (FullOpts)
+; Total bytes of code 1455, prolog size 26, PerfScore 1129.42, instruction count 331, allocated bytes for code 1455 (MethodHash=881ffb0d) for method VectorTest+VectorRelopTest`1[int]:VectorRelOp(int,int):int (FullOpts)
; ============================================================
-28 (-1.89%) : 207935.dasm - VectorTest+VectorRelopTest`1[int]:VectorRelOp(int,int):int (Tier0-FullOpts)
@@ -289,10 +289,9 @@ G_M1266_IG17: ; bbWeight=4, gcrefRegs=00000000 {}, byrefRegs=00000000 {},
G_M1266_IG18: ; bbWeight=1, gcrefRegs=00000000 {}, byrefRegs=00000000 {}, byref
vmovups ymm2, ymmword ptr [ebp-0x68]
vpcmpd k1, ymm2, ymmword ptr [ebp-0x28], 2
- vpmovm2d ymm3, k1
- vpternlogd ymm3, ymm2, ymm1, -54
+ vpblendmd ymm3 {k1}, ymm1, ymm2
xor esi, esi
- ;; size=31 bbWeight=1 PerfScore 10.75
+ ;; size=24 bbWeight=1 PerfScore 9.58
G_M1266_IG19: ; bbWeight=4, gcrefRegs=00000000 {}, byrefRegs=00000000 {}, byref, isz
mov ecx, esi
vmovups ymmword ptr [ebp-0x88], ymm3
@@ -340,10 +339,9 @@ G_M1266_IG21: ; bbWeight=4, gcrefRegs=00000000 {}, byrefRegs=00000000 {},
G_M1266_IG22: ; bbWeight=1, gcrefRegs=00000000 {}, byrefRegs=00000000 {}, byref
vmovups ymm1, ymmword ptr [ebp-0x48]
vpcmpd k1, ymm2, ymm1, 2
- vpmovm2d ymm3, k1
- vpternlogd ymm3, ymm2, ymm1, -54
+ vpblendmd ymm3 {k1}, ymm1, ymm2
xor esi, esi
- ;; size=27 bbWeight=1 PerfScore 8.75
+ ;; size=20 bbWeight=1 PerfScore 7.58
G_M1266_IG23: ; bbWeight=4, gcrefRegs=00000000 {}, byrefRegs=00000000 {}, byref, isz
mov ecx, esi
vmovups ymmword ptr [ebp-0x88], ymm3
@@ -391,10 +389,9 @@ G_M1266_IG25: ; bbWeight=4, gcrefRegs=00000000 {}, byrefRegs=00000000 {},
G_M1266_IG26: ; bbWeight=1, gcrefRegs=00000000 {}, byrefRegs=00000000 {}, byref
vmovups ymm2, ymmword ptr [ebp-0x68]
vpcmpd k1, ymm2, ymmword ptr [ebp-0x28], 5
- vpmovm2d ymm3, k1
- vpternlogd ymm3, ymm2, ymm1, -54
+ vpblendmd ymm3 {k1}, ymm1, ymm2
xor esi, esi
- ;; size=31 bbWeight=1 PerfScore 10.75
+ ;; size=24 bbWeight=1 PerfScore 9.58
G_M1266_IG27: ; bbWeight=4, gcrefRegs=00000000 {}, byrefRegs=00000000 {}, byref, isz
mov ecx, esi
vmovups ymmword ptr [ebp-0x88], ymm3
@@ -442,10 +439,9 @@ G_M1266_IG29: ; bbWeight=4, gcrefRegs=00000000 {}, byrefRegs=00000000 {},
G_M1266_IG30: ; bbWeight=1, gcrefRegs=00000000 {}, byrefRegs=00000000 {}, byref
vmovups ymm1, ymmword ptr [ebp-0x48]
vpcmpd k1, ymm1, ymmword ptr [ebp-0x28], 5
- vpmovm2d ymm3, k1
- vpternlogd ymm3, ymm2, ymm1, -54
+ vpblendmd ymm3 {k1}, ymm1, ymm2
xor esi, esi
- ;; size=31 bbWeight=1 PerfScore 10.75
+ ;; size=24 bbWeight=1 PerfScore 9.58
G_M1266_IG31: ; bbWeight=4, gcrefRegs=00000000 {}, byrefRegs=00000000 {}, byref, isz
mov ecx, esi
vmovups ymmword ptr [ebp-0x88], ymm3
@@ -600,6 +596,6 @@ G_M1266_IG42: ; bbWeight=1, gcrefRegs=00000000 {}, byrefRegs=00000000 {},
ret
;; size=10 bbWeight=1 PerfScore 4.00
-; Total bytes of code 1483, prolog size 26, PerfScore 1134.08, instruction count 335, allocated bytes for code 1483 (MethodHash=881ffb0d) for method VectorTest+VectorRelopTest`1[int]:VectorRelOp(int,int):int (Tier0-FullOpts)
+; Total bytes of code 1455, prolog size 26, PerfScore 1129.42, instruction count 331, allocated bytes for code 1455 (MethodHash=881ffb0d) for method VectorTest+VectorRelopTest`1[int]:VectorRelOp(int,int):int (Tier0-FullOpts)
; ============================================================
-28 (-1.86%) : 207945.dasm - VectorTest+VectorRelopTest`1[byte]:VectorRelOp(byte,byte):int (Tier0-FullOpts)
@@ -291,10 +291,9 @@ G_M44299_IG17: ; bbWeight=4, gcrefRegs=00000000 {}, byrefRegs=00000000 {}
G_M44299_IG18: ; bbWeight=1, gcrefRegs=00000000 {}, byrefRegs=00000000 {}, byref
vmovups ymm0, ymmword ptr [ebp-0x68]
vpcmpb k1, ymm0, ymmword ptr [ebp-0x28], 2
- vpmovm2b ymm3, k1
- vpternlogd ymm3, ymm0, ymm2, -54
+ vpblendmb ymm3 {k1}, ymm2, ymm0
xor esi, esi
- ;; size=31 bbWeight=1 PerfScore 11.75
+ ;; size=24 bbWeight=1 PerfScore 11.25
G_M44299_IG19: ; bbWeight=4, gcrefRegs=00000000 {}, byrefRegs=00000000 {}, byref, isz
mov ecx, esi
vmovups ymmword ptr [ebp-0x88], ymm3
@@ -342,10 +341,9 @@ G_M44299_IG21: ; bbWeight=4, gcrefRegs=00000000 {}, byrefRegs=00000000 {}
G_M44299_IG22: ; bbWeight=1, gcrefRegs=00000000 {}, byrefRegs=00000000 {}, byref
vmovups ymm2, ymmword ptr [ebp-0x48]
vpcmpb k1, ymm0, ymm2, 2
- vpmovm2b ymm3, k1
- vpternlogd ymm3, ymm0, ymm2, -54
+ vpblendmb ymm3 {k1}, ymm2, ymm0
xor esi, esi
- ;; size=27 bbWeight=1 PerfScore 9.75
+ ;; size=20 bbWeight=1 PerfScore 9.25
G_M44299_IG23: ; bbWeight=4, gcrefRegs=00000000 {}, byrefRegs=00000000 {}, byref, isz
mov ecx, esi
vmovups ymmword ptr [ebp-0x88], ymm3
@@ -393,10 +391,9 @@ G_M44299_IG25: ; bbWeight=4, gcrefRegs=00000000 {}, byrefRegs=00000000 {}
G_M44299_IG26: ; bbWeight=1, gcrefRegs=00000000 {}, byrefRegs=00000000 {}, byref
vmovups ymm0, ymmword ptr [ebp-0x68]
vpcmpb k1, ymm0, ymmword ptr [ebp-0x28], 5
- vpmovm2b ymm3, k1
- vpternlogd ymm3, ymm0, ymm2, -54
+ vpblendmb ymm3 {k1}, ymm2, ymm0
xor esi, esi
- ;; size=31 bbWeight=1 PerfScore 11.75
+ ;; size=24 bbWeight=1 PerfScore 11.25
G_M44299_IG27: ; bbWeight=4, gcrefRegs=00000000 {}, byrefRegs=00000000 {}, byref, isz
mov ecx, esi
vmovups ymmword ptr [ebp-0x88], ymm3
@@ -444,10 +441,9 @@ G_M44299_IG29: ; bbWeight=4, gcrefRegs=00000000 {}, byrefRegs=00000000 {}
G_M44299_IG30: ; bbWeight=1, gcrefRegs=00000000 {}, byrefRegs=00000000 {}, byref
vmovups ymm2, ymmword ptr [ebp-0x48]
vpcmpb k1, ymm2, ymmword ptr [ebp-0x28], 5
- vpmovm2b ymm3, k1
- vpternlogd ymm3, ymm0, ymm2, -54
+ vpblendmb ymm3 {k1}, ymm2, ymm0
xor esi, esi
- ;; size=31 bbWeight=1 PerfScore 11.75
+ ;; size=24 bbWeight=1 PerfScore 11.25
G_M44299_IG31: ; bbWeight=4, gcrefRegs=00000000 {}, byrefRegs=00000000 {}, byref, isz
mov ecx, esi
vmovups ymmword ptr [ebp-0x88], ymm3
@@ -602,6 +598,6 @@ G_M44299_IG42: ; bbWeight=1, gcrefRegs=00000000 {}, byrefRegs=00000000 {}
ret
;; size=10 bbWeight=1 PerfScore 4.00
-; Total bytes of code 1503, prolog size 26, PerfScore 1296.58, instruction count 336, allocated bytes for code 1503 (MethodHash=09db52f4) for method VectorTest+VectorRelopTest`1[byte]:VectorRelOp(byte,byte):int (Tier0-FullOpts)
+; Total bytes of code 1475, prolog size 26, PerfScore 1294.58, instruction count 332, allocated bytes for code 1475 (MethodHash=09db52f4) for method VectorTest+VectorRelopTest`1[byte]:VectorRelOp(byte,byte):int (Tier0-FullOpts)
; ============================================================
-28 (-1.86%) : 207944.dasm - VectorTest+VectorRelopTest`1[short]:VectorRelOp(short,short):int (Tier0-FullOpts)
@@ -291,10 +291,9 @@ G_M8563_IG17: ; bbWeight=4, gcrefRegs=00000000 {}, byrefRegs=00000000 {},
G_M8563_IG18: ; bbWeight=1, gcrefRegs=00000000 {}, byrefRegs=00000000 {}, byref
vmovups ymm0, ymmword ptr [ebp-0x68]
vpcmpw k1, ymm0, ymmword ptr [ebp-0x28], 2
- vpmovm2w ymm3, k1
- vpternlogd ymm3, ymm0, ymm2, -54
+ vpblendmw ymm3 {k1}, ymm2, ymm0
xor esi, esi
- ;; size=31 bbWeight=1 PerfScore 11.75
+ ;; size=24 bbWeight=1 PerfScore 11.25
G_M8563_IG19: ; bbWeight=4, gcrefRegs=00000000 {}, byrefRegs=00000000 {}, byref, isz
mov ecx, esi
vmovups ymmword ptr [ebp-0x88], ymm3
@@ -342,10 +341,9 @@ G_M8563_IG21: ; bbWeight=4, gcrefRegs=00000000 {}, byrefRegs=00000000 {},
G_M8563_IG22: ; bbWeight=1, gcrefRegs=00000000 {}, byrefRegs=00000000 {}, byref
vmovups ymm2, ymmword ptr [ebp-0x48]
vpcmpw k1, ymm0, ymm2, 2
- vpmovm2w ymm3, k1
- vpternlogd ymm3, ymm0, ymm2, -54
+ vpblendmw ymm3 {k1}, ymm2, ymm0
xor esi, esi
- ;; size=27 bbWeight=1 PerfScore 9.75
+ ;; size=20 bbWeight=1 PerfScore 9.25
G_M8563_IG23: ; bbWeight=4, gcrefRegs=00000000 {}, byrefRegs=00000000 {}, byref, isz
mov ecx, esi
vmovups ymmword ptr [ebp-0x88], ymm3
@@ -393,10 +391,9 @@ G_M8563_IG25: ; bbWeight=4, gcrefRegs=00000000 {}, byrefRegs=00000000 {},
G_M8563_IG26: ; bbWeight=1, gcrefRegs=00000000 {}, byrefRegs=00000000 {}, byref
vmovups ymm0, ymmword ptr [ebp-0x68]
vpcmpw k1, ymm0, ymmword ptr [ebp-0x28], 5
- vpmovm2w ymm3, k1
- vpternlogd ymm3, ymm0, ymm2, -54
+ vpblendmw ymm3 {k1}, ymm2, ymm0
xor esi, esi
- ;; size=31 bbWeight=1 PerfScore 11.75
+ ;; size=24 bbWeight=1 PerfScore 11.25
G_M8563_IG27: ; bbWeight=4, gcrefRegs=00000000 {}, byrefRegs=00000000 {}, byref, isz
mov ecx, esi
vmovups ymmword ptr [ebp-0x88], ymm3
@@ -444,10 +441,9 @@ G_M8563_IG29: ; bbWeight=4, gcrefRegs=00000000 {}, byrefRegs=00000000 {},
G_M8563_IG30: ; bbWeight=1, gcrefRegs=00000000 {}, byrefRegs=00000000 {}, byref
vmovups ymm2, ymmword ptr [ebp-0x48]
vpcmpw k1, ymm2, ymmword ptr [ebp-0x28], 5
- vpmovm2w ymm3, k1
- vpternlogd ymm3, ymm0, ymm2, -54
+ vpblendmw ymm3 {k1}, ymm2, ymm0
xor esi, esi
- ;; size=31 bbWeight=1 PerfScore 11.75
+ ;; size=24 bbWeight=1 PerfScore 11.25
G_M8563_IG31: ; bbWeight=4, gcrefRegs=00000000 {}, byrefRegs=00000000 {}, byref, isz
mov ecx, esi
vmovups ymmword ptr [ebp-0x88], ymm3
@@ -602,6 +598,6 @@ G_M8563_IG42: ; bbWeight=1, gcrefRegs=00000000 {}, byrefRegs=00000000 {},
ret
;; size=10 bbWeight=1 PerfScore 4.00
-; Total bytes of code 1503, prolog size 26, PerfScore 1296.58, instruction count 336, allocated bytes for code 1503 (MethodHash=33a1de8c) for method VectorTest+VectorRelopTest`1[short]:VectorRelOp(short,short):int (Tier0-FullOpts)
+; Total bytes of code 1475, prolog size 26, PerfScore 1294.58, instruction count 332, allocated bytes for code 1475 (MethodHash=33a1de8c) for method VectorTest+VectorRelopTest`1[short]:VectorRelOp(short,short):int (Tier0-FullOpts)
; ============================================================
-28 (-0.63%) : 207938.dasm - VectorTest+VectorRelopTest`1[long]:VectorRelOp(long,long):int (Tier0-FullOpts)
@@ -699,9 +699,8 @@ G_M59915_IG22: ; bbWeight=0.50, gcrefRegs=00000000 {}, byrefRegs=00000000
G_M59915_IG23: ; bbWeight=1, gcrefRegs=00000000 {}, byrefRegs=00000000 {}, byref, isz
vmovups ymm0, ymmword ptr [ebp-0x24]
vpcmpq k1, ymm0, ymmword ptr [ebp-0x64], 2
- vpmovm2q ymm0, k1
- vmovups ymm1, ymmword ptr [ebp-0x24]
- vpternlogq ymm0, ymm1, ymmword ptr [ebp-0x44], -54
+ vmovups ymm0, ymmword ptr [ebp-0x44]
+ vpblendmq ymm0 {k1}, ymm0, ymmword ptr [ebp-0x24]
vmovd ecx, xmm0
vmovups ymmword ptr [ebp-0x84], ymm0
vmovaps ymm1, ymm0
@@ -710,7 +709,7 @@ G_M59915_IG23: ; bbWeight=1, gcrefRegs=00000000 {}, byrefRegs=00000000 {}
xor edx, dword ptr [ebp-0x20]
or ecx, edx
je SHORT G_M59915_IG25
- ;; size=70 bbWeight=1 PerfScore 27.50
+ ;; size=63 bbWeight=1 PerfScore 26.50
G_M59915_IG24: ; bbWeight=0.50, gcrefRegs=00000000 {}, byrefRegs=00000000 {}, byref
mov ecx, 0xD1FFAB1E
; gcrRegs +[ecx]
@@ -839,9 +838,8 @@ G_M59915_IG27: ; bbWeight=0.50, gcrefRegs=00000000 {}, byrefRegs=00000000
G_M59915_IG28: ; bbWeight=1, gcrefRegs=00000000 {}, byrefRegs=00000000 {}, byref, isz
vmovups ymm0, ymmword ptr [ebp-0x24]
vpcmpq k1, ymm0, ymmword ptr [ebp-0x44], 2
- vpmovm2q ymm0, k1
- vmovups ymm1, ymmword ptr [ebp-0x24]
- vpternlogq ymm0, ymm1, ymmword ptr [ebp-0x44], -54
+ vmovups ymm0, ymmword ptr [ebp-0x44]
+ vpblendmq ymm0 {k1}, ymm0, ymmword ptr [ebp-0x24]
vmovd ecx, xmm0
vmovups ymmword ptr [ebp-0x84], ymm0
vmovaps ymm1, ymm0
@@ -850,7 +848,7 @@ G_M59915_IG28: ; bbWeight=1, gcrefRegs=00000000 {}, byrefRegs=00000000 {}
xor edx, dword ptr [ebp-0x40]
or ecx, edx
je SHORT G_M59915_IG30
- ;; size=70 bbWeight=1 PerfScore 27.50
+ ;; size=63 bbWeight=1 PerfScore 26.50
G_M59915_IG29: ; bbWeight=0.50, gcrefRegs=00000000 {}, byrefRegs=00000000 {}, byref
mov ecx, 0xD1FFAB1E
; gcrRegs +[ecx]
@@ -979,9 +977,8 @@ G_M59915_IG32: ; bbWeight=0.50, gcrefRegs=00000000 {}, byrefRegs=00000000
G_M59915_IG33: ; bbWeight=1, gcrefRegs=00000000 {}, byrefRegs=00000000 {}, byref, isz
vmovups ymm0, ymmword ptr [ebp-0x24]
vpcmpq k1, ymm0, ymmword ptr [ebp-0x64], 5
- vpmovm2q ymm0, k1
- vmovups ymm1, ymmword ptr [ebp-0x24]
- vpternlogq ymm0, ymm1, ymmword ptr [ebp-0x44], -54
+ vmovups ymm0, ymmword ptr [ebp-0x44]
+ vpblendmq ymm0 {k1}, ymm0, ymmword ptr [ebp-0x24]
vmovd ecx, xmm0
vmovups ymmword ptr [ebp-0x84], ymm0
vmovaps ymm1, ymm0
@@ -990,7 +987,7 @@ G_M59915_IG33: ; bbWeight=1, gcrefRegs=00000000 {}, byrefRegs=00000000 {}
xor edx, dword ptr [ebp-0x20]
or ecx, edx
je SHORT G_M59915_IG35
- ;; size=70 bbWeight=1 PerfScore 27.50
+ ;; size=63 bbWeight=1 PerfScore 26.50
G_M59915_IG34: ; bbWeight=0.50, gcrefRegs=00000000 {}, byrefRegs=00000000 {}, byref
mov ecx, 0xD1FFAB1E
; gcrRegs +[ecx]
@@ -1119,9 +1116,8 @@ G_M59915_IG37: ; bbWeight=0.50, gcrefRegs=00000000 {}, byrefRegs=00000000
G_M59915_IG38: ; bbWeight=1, gcrefRegs=00000000 {}, byrefRegs=00000000 {}, byref, isz
vmovups ymm0, ymmword ptr [ebp-0x44]
vpcmpq k1, ymm0, ymmword ptr [ebp-0x64], 5
- vpmovm2q ymm0, k1
- vmovups ymm1, ymmword ptr [ebp-0x24]
- vpternlogq ymm0, ymm1, ymmword ptr [ebp-0x44], -54
+ vmovups ymm0, ymmword ptr [ebp-0x44]
+ vpblendmq ymm0 {k1}, ymm0, ymmword ptr [ebp-0x24]
vmovd ecx, xmm0
vmovups ymmword ptr [ebp-0x84], ymm0
vmovaps ymm1, ymm0
@@ -1130,7 +1126,7 @@ G_M59915_IG38: ; bbWeight=1, gcrefRegs=00000000 {}, byrefRegs=00000000 {}
xor edx, dword ptr [ebp-0x40]
or ecx, edx
je SHORT G_M59915_IG40
- ;; size=70 bbWeight=1 PerfScore 27.50
+ ;; size=63 bbWeight=1 PerfScore 26.50
G_M59915_IG39: ; bbWeight=0.50, gcrefRegs=00000000 {}, byrefRegs=00000000 {}, byref
mov ecx, 0xD1FFAB1E
; gcrRegs +[ecx]
@@ -1542,6 +1538,6 @@ G_M59915_IG53: ; bbWeight=1, gcrefRegs=00000000 {}, byrefRegs=00000000 {}
ret 16
;; size=11 bbWeight=1 PerfScore 4.50
-; Total bytes of code 4476, prolog size 32, PerfScore 917.83, instruction count 1037, allocated bytes for code 4476 (MethodHash=e2e315f4) for method VectorTest+VectorRelopTest`1[long]:VectorRelOp(long,long):int (Tier0-FullOpts)
+; Total bytes of code 4448, prolog size 32, PerfScore 913.83, instruction count 1033, allocated bytes for code 4452 (MethodHash=e2e315f4) for method VectorTest+VectorRelopTest`1[long]:VectorRelOp(long,long):int (Tier0-FullOpts)
; ============================================================
-28 (-0.63%) : 469362.dasm - VectorTest+VectorRelopTest`1[long]:VectorRelOp(long,long):int (FullOpts)
@@ -699,9 +699,8 @@ G_M59915_IG22: ; bbWeight=0.50, gcrefRegs=00000000 {}, byrefRegs=00000000
G_M59915_IG23: ; bbWeight=1, gcrefRegs=00000000 {}, byrefRegs=00000000 {}, byref, isz
vmovups ymm0, ymmword ptr [ebp-0x24]
vpcmpq k1, ymm0, ymmword ptr [ebp-0x64], 2
- vpmovm2q ymm0, k1
- vmovups ymm1, ymmword ptr [ebp-0x24]
- vpternlogq ymm0, ymm1, ymmword ptr [ebp-0x44], -54
+ vmovups ymm0, ymmword ptr [ebp-0x44]
+ vpblendmq ymm0 {k1}, ymm0, ymmword ptr [ebp-0x24]
vmovd ecx, xmm0
vmovups ymmword ptr [ebp-0x84], ymm0
vmovaps ymm1, ymm0
@@ -710,7 +709,7 @@ G_M59915_IG23: ; bbWeight=1, gcrefRegs=00000000 {}, byrefRegs=00000000 {}
xor edx, dword ptr [ebp-0x20]
or ecx, edx
je SHORT G_M59915_IG25
- ;; size=70 bbWeight=1 PerfScore 27.50
+ ;; size=63 bbWeight=1 PerfScore 26.50
G_M59915_IG24: ; bbWeight=0.50, gcrefRegs=00000000 {}, byrefRegs=00000000 {}, byref
mov ecx, 0xD1FFAB1E
; gcrRegs +[ecx]
@@ -839,9 +838,8 @@ G_M59915_IG27: ; bbWeight=0.50, gcrefRegs=00000000 {}, byrefRegs=00000000
G_M59915_IG28: ; bbWeight=1, gcrefRegs=00000000 {}, byrefRegs=00000000 {}, byref, isz
vmovups ymm0, ymmword ptr [ebp-0x24]
vpcmpq k1, ymm0, ymmword ptr [ebp-0x44], 2
- vpmovm2q ymm0, k1
- vmovups ymm1, ymmword ptr [ebp-0x24]
- vpternlogq ymm0, ymm1, ymmword ptr [ebp-0x44], -54
+ vmovups ymm0, ymmword ptr [ebp-0x44]
+ vpblendmq ymm0 {k1}, ymm0, ymmword ptr [ebp-0x24]
vmovd ecx, xmm0
vmovups ymmword ptr [ebp-0x84], ymm0
vmovaps ymm1, ymm0
@@ -850,7 +848,7 @@ G_M59915_IG28: ; bbWeight=1, gcrefRegs=00000000 {}, byrefRegs=00000000 {}
xor edx, dword ptr [ebp-0x40]
or ecx, edx
je SHORT G_M59915_IG30
- ;; size=70 bbWeight=1 PerfScore 27.50
+ ;; size=63 bbWeight=1 PerfScore 26.50
G_M59915_IG29: ; bbWeight=0.50, gcrefRegs=00000000 {}, byrefRegs=00000000 {}, byref
mov ecx, 0xD1FFAB1E
; gcrRegs +[ecx]
@@ -979,9 +977,8 @@ G_M59915_IG32: ; bbWeight=0.50, gcrefRegs=00000000 {}, byrefRegs=00000000
G_M59915_IG33: ; bbWeight=1, gcrefRegs=00000000 {}, byrefRegs=00000000 {}, byref, isz
vmovups ymm0, ymmword ptr [ebp-0x24]
vpcmpq k1, ymm0, ymmword ptr [ebp-0x64], 5
- vpmovm2q ymm0, k1
- vmovups ymm1, ymmword ptr [ebp-0x24]
- vpternlogq ymm0, ymm1, ymmword ptr [ebp-0x44], -54
+ vmovups ymm0, ymmword ptr [ebp-0x44]
+ vpblendmq ymm0 {k1}, ymm0, ymmword ptr [ebp-0x24]
vmovd ecx, xmm0
vmovups ymmword ptr [ebp-0x84], ymm0
vmovaps ymm1, ymm0
@@ -990,7 +987,7 @@ G_M59915_IG33: ; bbWeight=1, gcrefRegs=00000000 {}, byrefRegs=00000000 {}
xor edx, dword ptr [ebp-0x20]
or ecx, edx
je SHORT G_M59915_IG35
- ;; size=70 bbWeight=1 PerfScore 27.50
+ ;; size=63 bbWeight=1 PerfScore 26.50
G_M59915_IG34: ; bbWeight=0.50, gcrefRegs=00000000 {}, byrefRegs=00000000 {}, byref
mov ecx, 0xD1FFAB1E
; gcrRegs +[ecx]
@@ -1119,9 +1116,8 @@ G_M59915_IG37: ; bbWeight=0.50, gcrefRegs=00000000 {}, byrefRegs=00000000
G_M59915_IG38: ; bbWeight=1, gcrefRegs=00000000 {}, byrefRegs=00000000 {}, byref, isz
vmovups ymm0, ymmword ptr [ebp-0x44]
vpcmpq k1, ymm0, ymmword ptr [ebp-0x64], 5
- vpmovm2q ymm0, k1
- vmovups ymm1, ymmword ptr [ebp-0x24]
- vpternlogq ymm0, ymm1, ymmword ptr [ebp-0x44], -54
+ vmovups ymm0, ymmword ptr [ebp-0x44]
+ vpblendmq ymm0 {k1}, ymm0, ymmword ptr [ebp-0x24]
vmovd ecx, xmm0
vmovups ymmword ptr [ebp-0x84], ymm0
vmovaps ymm1, ymm0
@@ -1130,7 +1126,7 @@ G_M59915_IG38: ; bbWeight=1, gcrefRegs=00000000 {}, byrefRegs=00000000 {}
xor edx, dword ptr [ebp-0x40]
or ecx, edx
je SHORT G_M59915_IG40
- ;; size=70 bbWeight=1 PerfScore 27.50
+ ;; size=63 bbWeight=1 PerfScore 26.50
G_M59915_IG39: ; bbWeight=0.50, gcrefRegs=00000000 {}, byrefRegs=00000000 {}, byref
mov ecx, 0xD1FFAB1E
; gcrRegs +[ecx]
@@ -1542,6 +1538,6 @@ G_M59915_IG53: ; bbWeight=1, gcrefRegs=00000000 {}, byrefRegs=00000000 {}
ret 16
;; size=11 bbWeight=1 PerfScore 4.50
-; Total bytes of code 4476, prolog size 32, PerfScore 917.83, instruction count 1037, allocated bytes for code 4476 (MethodHash=e2e315f4) for method VectorTest+VectorRelopTest`1[long]:VectorRelOp(long,long):int (FullOpts)
+; Total bytes of code 4448, prolog size 32, PerfScore 913.83, instruction count 1033, allocated bytes for code 4452 (MethodHash=e2e315f4) for method VectorTest+VectorRelopTest`1[long]:VectorRelOp(long,long):int (FullOpts)
; ============================================================
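The hunks above repeatedly replace a `vpmovm2*` + `vpternlog*` pair (immediate -54) with a single `vpblendm*`. As a rough sanity check of why the two forms should agree, the sketch below models the 32-bit-lane case in Python. It assumes the standard ternary-logic truth-table indexing (destination bit as the high index bit) and the operand order shown in the listings. The helper names are hypothetical and this is not the JIT's own code.

```python
import random

# The disassembly prints the immediate as a signed byte; -54 is 0xCA unsigned.
IMM8 = -54 & 0xFF

def ternlog(a, b, c, imm8, width=32):
    """Bitwise ternary logic: each result bit is imm8[(a << 2) | (b << 1) | c]."""
    out = 0
    for bit in range(width):
        idx = (((a >> bit) & 1) << 2) | (((b >> bit) & 1) << 1) | ((c >> bit) & 1)
        out |= ((imm8 >> idx) & 1) << bit
    return out

def old_sequence(k, x, y):
    """Model of: vpmovm2d v, k ; vpternlogd v, x, y, -54 (assumed operand roles)."""
    # vpmovm2d expands each mask bit into an all-ones or all-zeros lane.
    lanes = [0xFFFFFFFF if (k >> i) & 1 else 0 for i in range(len(x))]
    return [ternlog(m, xi, yi, IMM8) for m, xi, yi in zip(lanes, x, y)]

def new_sequence(k, y, x):
    """Model of: vpblendmd v {k}, y, x (mask bit set selects the last operand)."""
    return [xi if (k >> i) & 1 else yi for i, (yi, xi) in enumerate(zip(y, x))]

def check(trials=1000, lanes=8, seed=0):
    """Compare both models on pseudo-random inputs; True if they always agree."""
    rng = random.Random(seed)
    for _ in range(trials):
        k = rng.getrandbits(lanes)
        x = [rng.getrandbits(32) for _ in range(lanes)]
        y = [rng.getrandbits(32) for _ in range(lanes)]
        if old_sequence(k, x, y) != new_sequence(k, y, x):
            return False
    return True
```

With immediate 0xCA the ternary operation reduces to a bitwise select (`a ? b : c`), so once the mask is expanded to full lanes it picks whole elements, which is what the masked blend does directly without materializing the mask in a vector register.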
libraries.pmi.windows.x86.checked.mch
-21 (-20.59%) : 273965.dasm - System.Numerics.Tensors.TensorPrimitives+MaxMagnitudePropagateNaNOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector512`1[ubyte],System.Runtime.Intrinsics.Vector512`1[ubyte]):System.Runtime.Intrinsics.Vector512`1[ubyte] (FullOpts)
@@ -16,7 +16,7 @@
;* V05 loc2 [V05 ] ( 0, 0 ) simd64 -> zero-ref single-def "Inline stloc first use temp" <System.Runtime.Intrinsics.Vector512`1[ubyte]>
;* V06 tmp1 [V06 ] ( 0, 0 ) simd64 -> zero-ref single-def "Inline stloc first use temp" <System.Runtime.Intrinsics.Vector512`1[ubyte]>
;* V07 tmp2 [V07 ] ( 0, 0 ) simd64 -> zero-ref "spilled call-like call argument"
-; V08 tmp3 [V08,T03] ( 2, 2 ) simd64 -> mm2 single-def "Inline stloc first use temp" <System.Runtime.Intrinsics.Vector512`1[ubyte]>
+; V08 tmp3 [V08,T03] ( 2, 2 ) simd64 -> mm0 single-def "Inline stloc first use temp" <System.Runtime.Intrinsics.Vector512`1[ubyte]>
;* V09 tmp4 [V09 ] ( 0, 0 ) simd64 -> zero-ref "Inline return value spill temp" <System.Runtime.Intrinsics.Vector512`1[ubyte]>
;
; Lcl frame size = 0
@@ -31,23 +31,20 @@ G_M27576_IG01: ; bbWeight=1, gcrefRegs=00000000 {}, byrefRegs=00000000 {}
G_M27576_IG02: ; bbWeight=1, gcrefRegs=00000000 {}, byrefRegs=00000002 {ecx}, byref
; byrRegs +[ecx]
vpcmpeqb k1, zmm0, zmm1
- vpmovm2b zmm2, k1
- vxorps ymm3, ymm3, ymm3
- vpcmpub k1, zmm0, zmm3, 1
- vpmovm2b zmm3, k1
- vpternlogd zmm3, zmm1, zmm0, -54
- vpcmpub k1, zmm0, zmm1, 6
- vpmovm2b zmm4, k1
- vpternlogd zmm4, zmm0, zmm1, -54
- vpternlogd zmm2, zmm3, zmm4, -54
- vmovups zmmword ptr [ecx], zmm2
- ;; size=69 bbWeight=1 PerfScore 16.33
+ vxorps ymm2, ymm2, ymm2
+ vpcmpub k2, zmm0, zmm2, 1
+ vpblendmb zmm2 {k2}, zmm0, zmm1
+ vpcmpub k2, zmm0, zmm1, 6
+ vpblendmb zmm0 {k2}, zmm1, zmm0
+ vpblendmb zmm0 {k1}, zmm0, zmm2
+ vmovups zmmword ptr [ecx], zmm0
+ ;; size=48 bbWeight=1 PerfScore 14.83
G_M27576_IG03: ; bbWeight=1, epilog, nogc, extend
vzeroupper
pop ebp
ret 128
;; size=7 bbWeight=1 PerfScore 3.50
-; Total bytes of code 102, prolog size 6, PerfScore 28.08, instruction count 19, allocated bytes for code 102 (MethodHash=a5449447) for method System.Numerics.Tensors.TensorPrimitives+MaxMagnitudePropagateNaNOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector512`1[ubyte],System.Runtime.Intrinsics.Vector512`1[ubyte]):System.Runtime.Intrinsics.Vector512`1[ubyte] (FullOpts)
+; Total bytes of code 81, prolog size 6, PerfScore 26.58, instruction count 16, allocated bytes for code 81 (MethodHash=a5449447) for method System.Numerics.Tensors.TensorPrimitives+MaxMagnitudePropagateNaNOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector512`1[ubyte],System.Runtime.Intrinsics.Vector512`1[ubyte]):System.Runtime.Intrinsics.Vector512`1[ubyte] (FullOpts)
; ============================================================
-21 (-20.59%) : 274022.dasm - System.Numerics.Tensors.TensorPrimitives+MinMagnitudePropagateNaNOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector512`1[ubyte],System.Runtime.Intrinsics.Vector512`1[ubyte]):System.Runtime.Intrinsics.Vector512`1[ubyte] (FullOpts)
@@ -16,7 +16,7 @@
;* V05 loc2 [V05 ] ( 0, 0 ) simd64 -> zero-ref single-def "Inline stloc first use temp" <System.Runtime.Intrinsics.Vector512`1[ubyte]>
;* V06 tmp1 [V06 ] ( 0, 0 ) simd64 -> zero-ref single-def "Inline stloc first use temp" <System.Runtime.Intrinsics.Vector512`1[ubyte]>
;* V07 tmp2 [V07 ] ( 0, 0 ) simd64 -> zero-ref "spilled call-like call argument"
-; V08 tmp3 [V08,T03] ( 2, 2 ) simd64 -> mm2 single-def "Inline stloc first use temp" <System.Runtime.Intrinsics.Vector512`1[ubyte]>
+; V08 tmp3 [V08,T03] ( 2, 2 ) simd64 -> mm0 single-def "Inline stloc first use temp" <System.Runtime.Intrinsics.Vector512`1[ubyte]>
;* V09 tmp4 [V09 ] ( 0, 0 ) simd64 -> zero-ref "Inline return value spill temp" <System.Runtime.Intrinsics.Vector512`1[ubyte]>
;
; Lcl frame size = 0
@@ -31,23 +31,20 @@ G_M10214_IG01: ; bbWeight=1, gcrefRegs=00000000 {}, byrefRegs=00000000 {}
G_M10214_IG02: ; bbWeight=1, gcrefRegs=00000000 {}, byrefRegs=00000002 {ecx}, byref
; byrRegs +[ecx]
vpcmpeqb k1, zmm0, zmm1
- vpmovm2b zmm2, k1
- vxorps ymm3, ymm3, ymm3
- vpcmpub k1, zmm0, zmm3, 1
- vpmovm2b zmm3, k1
- vpternlogd zmm3, zmm0, zmm1, -54
- vpcmpub k1, zmm0, zmm1, 1
- vpmovm2b zmm4, k1
- vpternlogd zmm4, zmm0, zmm1, -54
- vpternlogd zmm2, zmm3, zmm4, -54
- vmovups zmmword ptr [ecx], zmm2
- ;; size=69 bbWeight=1 PerfScore 16.33
+ vxorps ymm2, ymm2, ymm2
+ vpcmpub k2, zmm0, zmm2, 1
+ vpblendmb zmm2 {k2}, zmm1, zmm0
+ vpcmpub k2, zmm0, zmm1, 1
+ vpblendmb zmm0 {k2}, zmm1, zmm0
+ vpblendmb zmm0 {k1}, zmm0, zmm2
+ vmovups zmmword ptr [ecx], zmm0
+ ;; size=48 bbWeight=1 PerfScore 14.83
G_M10214_IG03: ; bbWeight=1, epilog, nogc, extend
vzeroupper
pop ebp
ret 128
;; size=7 bbWeight=1 PerfScore 3.50
-; Total bytes of code 102, prolog size 6, PerfScore 28.08, instruction count 19, allocated bytes for code 102 (MethodHash=6846d819) for method System.Numerics.Tensors.TensorPrimitives+MinMagnitudePropagateNaNOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector512`1[ubyte],System.Runtime.Intrinsics.Vector512`1[ubyte]):System.Runtime.Intrinsics.Vector512`1[ubyte] (FullOpts)
+; Total bytes of code 81, prolog size 6, PerfScore 26.58, instruction count 16, allocated bytes for code 81 (MethodHash=6846d819) for method System.Numerics.Tensors.TensorPrimitives+MinMagnitudePropagateNaNOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector512`1[ubyte],System.Runtime.Intrinsics.Vector512`1[ubyte]):System.Runtime.Intrinsics.Vector512`1[ubyte] (FullOpts)
; ============================================================
-21 (-20.59%) : 273943.dasm - System.Numerics.Tensors.TensorPrimitives+MaxMagnitudeOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector512`1[ubyte],System.Runtime.Intrinsics.Vector512`1[ubyte]):System.Runtime.Intrinsics.Vector512`1[ubyte] (FullOpts)
@@ -13,7 +13,7 @@
; V02 arg1 [V02,T02] ( 4, 4 ) simd64 -> mm1 single-def <System.Runtime.Intrinsics.Vector512`1[ubyte]>
;* V03 loc0 [V03 ] ( 0, 0 ) simd64 -> zero-ref single-def <System.Runtime.Intrinsics.Vector512`1[ubyte]>
;* V04 loc1 [V04 ] ( 0, 0 ) simd64 -> zero-ref single-def <System.Runtime.Intrinsics.Vector512`1[ubyte]>
-; V05 loc2 [V05,T03] ( 2, 2 ) simd64 -> mm2 single-def <System.Runtime.Intrinsics.Vector512`1[ubyte]>
+; V05 loc2 [V05,T03] ( 2, 2 ) simd64 -> mm0 single-def <System.Runtime.Intrinsics.Vector512`1[ubyte]>
;* V06 loc3 [V06 ] ( 0, 0 ) simd64 -> zero-ref <System.Runtime.Intrinsics.Vector512`1[ubyte]>
;* V07 loc4 [V07 ] ( 0, 0 ) simd64 -> zero-ref <System.Runtime.Intrinsics.Vector512`1[ubyte]>
;* V08 loc5 [V08 ] ( 0, 0 ) simd64 -> zero-ref "spilled call-like call argument"
@@ -31,23 +31,20 @@ G_M22834_IG01: ; bbWeight=1, gcrefRegs=00000000 {}, byrefRegs=00000000 {}
G_M22834_IG02: ; bbWeight=1, gcrefRegs=00000000 {}, byrefRegs=00000002 {ecx}, byref
; byrRegs +[ecx]
vpcmpeqb k1, zmm0, zmm1
- vpmovm2b zmm2, k1
- vxorps ymm3, ymm3, ymm3
- vpcmpub k1, zmm0, zmm3, 1
- vpmovm2b zmm3, k1
- vpternlogd zmm3, zmm1, zmm0, -54
- vpcmpub k1, zmm0, zmm1, 6
- vpmovm2b zmm4, k1
- vpternlogd zmm4, zmm0, zmm1, -54
- vpternlogd zmm2, zmm3, zmm4, -54
- vmovups zmmword ptr [ecx], zmm2
- ;; size=69 bbWeight=1 PerfScore 16.33
+ vxorps ymm2, ymm2, ymm2
+ vpcmpub k2, zmm0, zmm2, 1
+ vpblendmb zmm2 {k2}, zmm0, zmm1
+ vpcmpub k2, zmm0, zmm1, 6
+ vpblendmb zmm0 {k2}, zmm1, zmm0
+ vpblendmb zmm0 {k1}, zmm0, zmm2
+ vmovups zmmword ptr [ecx], zmm0
+ ;; size=48 bbWeight=1 PerfScore 14.83
G_M22834_IG03: ; bbWeight=1, epilog, nogc, extend
vzeroupper
pop ebp
ret 128
;; size=7 bbWeight=1 PerfScore 3.50
-; Total bytes of code 102, prolog size 6, PerfScore 28.08, instruction count 19, allocated bytes for code 102 (MethodHash=885fa6cd) for method System.Numerics.Tensors.TensorPrimitives+MaxMagnitudeOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector512`1[ubyte],System.Runtime.Intrinsics.Vector512`1[ubyte]):System.Runtime.Intrinsics.Vector512`1[ubyte] (FullOpts)
+; Total bytes of code 81, prolog size 6, PerfScore 26.58, instruction count 16, allocated bytes for code 81 (MethodHash=885fa6cd) for method System.Numerics.Tensors.TensorPrimitives+MaxMagnitudeOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector512`1[ubyte],System.Runtime.Intrinsics.Vector512`1[ubyte]):System.Runtime.Intrinsics.Vector512`1[ubyte] (FullOpts)
; ============================================================
-7 (-5.47%) : 4816.dasm - System.Buffers.ProbabilisticMap:IsCharBitSetAvx2(System.Runtime.Intrinsics.Vector256`1[ubyte],System.Runtime.Intrinsics.Vector256`1[ubyte],System.Runtime.Intrinsics.Vector256`1[ubyte]):System.Runtime.Intrinsics.Vector256`1[ubyte] (FullOpts)
@@ -35,20 +35,19 @@ G_M53822_IG02: ; bbWeight=1, gcrefRegs=00000000 {}, byrefRegs=00000002 {e
vpshufb ymm1, ymm2, ymm1
vpand ymm0, ymm0, ymmword ptr [@RWD64]
vpcmpub k1, ymm0, ymmword ptr [@RWD96], 6
- vpmovm2b ymm2, k1
- vpsubb ymm3, ymm0, ymmword ptr [@RWD128]
- vmovups ymm4, ymmword ptr [ebp+0x28]
- vpshufb ymm3, ymm4, ymm3
- vmovups ymm4, ymmword ptr [ebp+0x48]
- vpshufb ymm0, ymm4, ymm0
- vpternlogd ymm2, ymm3, ymm0, -54
- vpand ymm0, ymm2, ymm1
+ vpsubb ymm2, ymm0, ymmword ptr [@RWD128]
+ vmovups ymm3, ymmword ptr [ebp+0x28]
+ vpshufb ymm2, ymm3, ymm2
+ vmovups ymm3, ymmword ptr [ebp+0x48]
+ vpshufb ymm0, ymm3, ymm0
+ vpblendmb ymm0 {k1}, ymm0, ymm2
+ vpand ymm0, ymm0, ymm1
vxorps ymm1, ymm1, ymm1
vpcmpeqb ymm0, ymm0, ymm1
vpcmpeqd ymm1, ymm1, ymm1
vpxor ymm0, ymm0, ymm1
vmovups ymmword ptr [ecx], ymm0
- ;; size=110 bbWeight=1 PerfScore 35.50
+ ;; size=103 bbWeight=1 PerfScore 35.00
G_M53822_IG03: ; bbWeight=1, epilog, nogc, extend
vzeroupper
pop ebp
@@ -61,6 +60,6 @@ RWD96 dq 0F0F0F0F0F0F0F0Fh, 0F0F0F0F0F0F0F0Fh, 0F0F0F0F0F0F0F0Fh, 0F0F0F0F0F0F
RWD128 dq 1010101010101010h, 1010101010101010h, 1010101010101010h, 1010101010101010h
-; Total bytes of code 128, prolog size 6, PerfScore 45.25, instruction count 26, allocated bytes for code 128 (MethodHash=47dc2dc1) for method System.Buffers.ProbabilisticMap:IsCharBitSetAvx2(System.Runtime.Intrinsics.Vector256`1[ubyte],System.Runtime.Intrinsics.Vector256`1[ubyte],System.Runtime.Intrinsics.Vector256`1[ubyte]):System.Runtime.Intrinsics.Vector256`1[ubyte] (FullOpts)
+; Total bytes of code 121, prolog size 6, PerfScore 44.75, instruction count 25, allocated bytes for code 121 (MethodHash=47dc2dc1) for method System.Buffers.ProbabilisticMap:IsCharBitSetAvx2(System.Runtime.Intrinsics.Vector256`1[ubyte],System.Runtime.Intrinsics.Vector256`1[ubyte],System.Runtime.Intrinsics.Vector256`1[ubyte]):System.Runtime.Intrinsics.Vector256`1[ubyte] (FullOpts)
; ============================================================
-15 (-5.21%) : 4815.dasm - System.Buffers.ProbabilisticMap:ContainsMask32CharsAvx2(System.Runtime.Intrinsics.Vector256`1[ubyte],System.Runtime.Intrinsics.Vector256`1[ubyte],byref):System.Runtime.Intrinsics.Vector256`1[ubyte] (FullOpts)
@@ -16,8 +16,8 @@
; V05 loc1 [V05,T05] ( 3, 3 ) simd32 -> mm3 <System.Runtime.Intrinsics.Vector256`1[ushort]>
; V06 loc2 [V06,T06] ( 3, 3 ) simd32 -> mm4 <System.Runtime.Intrinsics.Vector256`1[ubyte]>
; V07 loc3 [V07,T07] ( 3, 3 ) simd32 -> mm2 <System.Runtime.Intrinsics.Vector256`1[ubyte]>
-; V08 loc4 [V08,T15] ( 2, 2 ) simd32 -> mm1 <System.Runtime.Intrinsics.Vector256`1[ubyte]>
-; V09 loc5 [V09,T16] ( 2, 2 ) simd32 -> mm0 <System.Runtime.Intrinsics.Vector256`1[ubyte]>
+; V08 loc4 [V08,T15] ( 2, 2 ) simd32 -> mm0 <System.Runtime.Intrinsics.Vector256`1[ubyte]>
+; V09 loc5 [V09,T16] ( 2, 2 ) simd32 -> mm1 <System.Runtime.Intrinsics.Vector256`1[ubyte]>
;* V10 loc6 [V10 ] ( 0, 0 ) simd32 -> zero-ref "Inline stloc first use temp" <System.Runtime.Intrinsics.Vector256`1[ubyte]>
; V11 tmp1 [V11,T17] ( 2, 2 ) simd32 -> [ebp-0x20] spill-single-def "Inline stloc first use temp" <System.Runtime.Intrinsics.Vector256`1[ubyte]>
; V12 tmp2 [V12,T02] ( 4, 4 ) simd32 -> mm4 "Inline stloc first use temp" <System.Runtime.Intrinsics.Vector256`1[ubyte]>
@@ -26,7 +26,7 @@
;* V15 tmp5 [V15 ] ( 0, 0 ) simd32 -> zero-ref "Inline stloc first use temp" <System.Runtime.Intrinsics.Vector256`1[ubyte]>
;* V16 tmp6 [V16 ] ( 0, 0 ) simd32 -> zero-ref "Inline stloc first use temp" <System.Runtime.Intrinsics.Vector256`1[ubyte]>
;* V17 tmp7 [V17 ] ( 0, 0 ) simd32 -> zero-ref "Inline stloc first use temp" <System.Runtime.Intrinsics.Vector256`1[ubyte]>
-; V18 tmp8 [V18,T18] ( 2, 2 ) simd32 -> mm3 "Inline stloc first use temp" <System.Runtime.Intrinsics.Vector256`1[ubyte]>
+; V18 tmp8 [V18,T18] ( 2, 2 ) simd32 -> mm4 "Inline stloc first use temp" <System.Runtime.Intrinsics.Vector256`1[ubyte]>
; V19 tmp9 [V19,T03] ( 4, 4 ) simd32 -> mm2 "Inline stloc first use temp" <System.Runtime.Intrinsics.Vector256`1[ubyte]>
;* V20 tmp10 [V20 ] ( 0, 0 ) simd32 -> zero-ref "Inline stloc first use temp" <System.Runtime.Intrinsics.Vector256`1[ubyte]>
;* V21 tmp11 [V21 ] ( 0, 0 ) simd32 -> zero-ref "Inline stloc first use temp" <System.Runtime.Intrinsics.Vector256`1[ubyte]>
@@ -36,16 +36,17 @@
; V25 cse1 [V25,T09] ( 3, 3 ) simd32 -> mm5 "CSE - moderate"
; V26 cse2 [V26,T10] ( 3, 3 ) simd32 -> mm6 "CSE - moderate"
; V27 cse3 [V27,T11] ( 3, 3 ) simd32 -> mm7 "CSE - moderate"
-; V28 cse4 [V28,T12] ( 3, 3 ) simd32 -> [ebp-0x40] spill-single-def "CSE - moderate"
+; V28 cse4 [V28,T12] ( 3, 3 ) simd32 -> mm3 "CSE - moderate"
;
-; Lcl frame size = 64
+; Lcl frame size = 32
G_M59405_IG01: ; bbWeight=1, gcrefRegs=00000000 {}, byrefRegs=00000000 {}, byref, nogc <-- Prolog IG
push ebp
mov ebp, esp
- sub esp, 64
+ sub esp, 32
vzeroupper
- ;; size=9 bbWeight=1 PerfScore 2.50
+ vmovups ymm1, ymmword ptr [ebp+0x08]
+ ;; size=14 bbWeight=1 PerfScore 6.50
G_M59405_IG02: ; bbWeight=1, gcrefRegs=00000000 {}, byrefRegs=00000006 {ecx edx}, byref
; byrRegs +[ecx edx]
vmovups ymm2, ymmword ptr [edx]
@@ -67,40 +68,37 @@ G_M59405_IG02: ; bbWeight=1, gcrefRegs=00000000 {}, byrefRegs=00000006 {e
vpand ymm4, ymm4, ymm6
vmovups ymm7, ymmword ptr [@RWD128]
vpcmpub k1, ymm4, ymm7, 6
- vpmovm2b ymm3, k1
- vmovups ymm0, ymmword ptr [@RWD160]
- vmovups ymmword ptr [ebp-0x40], ymm0
- vpsubb ymm1, ymm4, ymm0
- vmovups ymm0, ymmword ptr [ebp+0x08]
- vpshufb ymm1, ymm0, ymm1
- vmovups ymm0, ymmword ptr [ebp+0x28]
- vpshufb ymm4, ymm0, ymm4
- vpternlogd ymm3, ymm1, ymm4, -54
- vpand ymm1, ymm3, ymmword ptr [ebp-0x20]
- vxorps ymm3, ymm3, ymm3
- vpcmpeqb ymm1, ymm1, ymm3
- vpcmpeqd ymm3, ymm3, ymm3
- vpxor ymm1, ymm1, ymm3
- vpsrld ymm3, ymm2, 5
- vpand ymm3, ymm3, ymm5
- vmovups ymm4, ymmword ptr [@RWD64]
- vpshufb ymm3, ymm4, ymm3
+ vmovups ymm3, ymmword ptr [@RWD160]
+ vpsubb ymm0, ymm4, ymm3
+ vmovups ymmword ptr [ebp+0x08], ymm1
+ vpshufb ymm0, ymm1, ymm0
+ vmovups ymm1, ymmword ptr [ebp+0x28]
+ vpshufb ymm4, ymm1, ymm4
+ vpblendmb ymm0 {k1}, ymm4, ymm0
+ vpand ymm0, ymm0, ymmword ptr [ebp-0x20]
+ vxorps ymm4, ymm4, ymm4
+ vpcmpeqb ymm0, ymm0, ymm4
+ vpcmpeqd ymm4, ymm4, ymm4
+ vpxor ymm0, ymm0, ymm4
+ vpsrld ymm4, ymm2, 5
+ vpand ymm4, ymm4, ymm5
+ vmovups ymm5, ymmword ptr [@RWD64]
+ vpshufb ymm4, ymm5, ymm4
vpand ymm2, ymm2, ymm6
vpcmpub k1, ymm2, ymm7, 6
- vpmovm2b ymm4, k1
- vpsubb ymm5, ymm2, ymmword ptr [ebp-0x40]
- vmovups ymm6, ymmword ptr [ebp+0x08]
- vpshufb ymm5, ymm6, ymm5
- vpshufb ymm0, ymm0, ymm2
- vpternlogd ymm4, ymm5, ymm0, -54
- vpand ymm0, ymm4, ymm3
+ vpsubb ymm3, ymm2, ymm3
+ vmovups ymm5, ymmword ptr [ebp+0x08]
+ vpshufb ymm3, ymm5, ymm3
+ vpshufb ymm1, ymm1, ymm2
+ vpblendmb ymm1 {k1}, ymm1, ymm3
+ vpand ymm1, ymm1, ymm4
vxorps ymm2, ymm2, ymm2
- vpcmpeqb ymm0, ymm0, ymm2
+ vpcmpeqb ymm1, ymm1, ymm2
vpcmpeqd ymm2, ymm2, ymm2
- vpxor ymm0, ymm0, ymm2
- vpand ymm0, ymm1, ymm0
+ vpxor ymm1, ymm1, ymm2
+ vpand ymm0, ymm0, ymm1
vmovups ymmword ptr [ecx], ymm0
- ;; size=270 bbWeight=1 PerfScore 95.33
+ ;; size=250 bbWeight=1 PerfScore 88.67
G_M59405_IG03: ; bbWeight=1, epilog, nogc, extend
vzeroupper
mov esp, ebp
@@ -115,6 +113,6 @@ RWD128 dq 0F0F0F0F0F0F0F0Fh, 0F0F0F0F0F0F0F0Fh, 0F0F0F0F0F0F0F0Fh, 0F0F0F0F0F0F
RWD160 dq 1010101010101010h, 1010101010101010h, 1010101010101010h, 1010101010101010h
-; Total bytes of code 288, prolog size 9, PerfScore 101.58, instruction count 60, allocated bytes for code 288 (MethodHash=e39717f2) for method System.Buffers.ProbabilisticMap:ContainsMask32CharsAvx2(System.Runtime.Intrinsics.Vector256`1[ubyte],System.Runtime.Intrinsics.Vector256`1[ubyte],byref):System.Runtime.Intrinsics.Vector256`1[ubyte] (FullOpts)
+; Total bytes of code 273, prolog size 9, PerfScore 98.92, instruction count 58, allocated bytes for code 273 (MethodHash=e39717f2) for method System.Buffers.ProbabilisticMap:ContainsMask32CharsAvx2(System.Runtime.Intrinsics.Vector256`1[ubyte],System.Runtime.Intrinsics.Vector256`1[ubyte],byref):System.Runtime.Intrinsics.Vector256`1[ubyte] (FullOpts)
; ============================================================
-7 (-4.38%) : 273746.dasm - System.Numerics.Tensors.TensorPrimitives:<ConvertToSingle>g__HalfAsWidenedUInt32ToSingle_Vector512|210_2(System.Runtime.Intrinsics.Vector512`1[uint]):System.Runtime.Intrinsics.Vector512`1[float] (FullOpts)
@@ -39,16 +39,15 @@ G_M58105_IG02: ; bbWeight=1, gcrefRegs=00000000 {}, byrefRegs=00000002 {e
vpandd zmm3, zmm4, dword ptr [@RWD128] {1to16}
vpord zmm4, zmm3, dword ptr [@RWD132] {1to16}
vptestnmd k1, zmm2, zmm2
- vpmovm2d zmm2, k1
- vpslld zmm5, zmm4, 1
- vpternlogd zmm2, zmm4, zmm5, -54
+ vpslld zmm2, zmm4, 1
+ vpblendmd zmm2 {k1}, zmm2, zmm4
vpslld zmm0, zmm0, 13
vpandd zmm0, zmm0, dword ptr [@RWD136] {1to16}
vpaddd zmm0, zmm0, zmm2
vsubps zmm0, zmm0, zmm3
vpord zmm0, zmm0, zmm1
vmovups zmmword ptr [ecx], zmm0
- ;; size=137 bbWeight=1 PerfScore 29.00
+ ;; size=130 bbWeight=1 PerfScore 28.00
G_M58105_IG03: ; bbWeight=1, epilog, nogc, extend
vzeroupper
pop ebp
@@ -64,6 +63,6 @@ RWD132 dd 38000000h
RWD136 dd 0FFFE000h
-; Total bytes of code 160, prolog size 6, PerfScore 37.75, instruction count 26, allocated bytes for code 160 (MethodHash=e6ab1d06) for method System.Numerics.Tensors.TensorPrimitives:<ConvertToSingle>g__HalfAsWidenedUInt32ToSingle_Vector512|210_2(System.Runtime.Intrinsics.Vector512`1[uint]):System.Runtime.Intrinsics.Vector512`1[float] (FullOpts)
+; Total bytes of code 153, prolog size 6, PerfScore 36.75, instruction count 25, allocated bytes for code 153 (MethodHash=e6ab1d06) for method System.Numerics.Tensors.TensorPrimitives:<ConvertToSingle>g__HalfAsWidenedUInt32ToSingle_Vector512|210_2(System.Runtime.Intrinsics.Vector512`1[uint]):System.Runtime.Intrinsics.Vector512`1[float] (FullOpts)
; ============================================================
libraries_tests.run.windows.x86.Release.mch
-14 (-17.72%) : 370873.dasm - System.Numerics.Tensors.TensorPrimitives+MinMagnitudeOperator`1[uint]:Invoke(System.Runtime.Intrinsics.Vector128`1[uint],System.Runtime.Intrinsics.Vector128`1[uint]):System.Runtime.Intrinsics.Vector128`1[uint] (Tier1)
@@ -34,19 +34,17 @@ G_M10273_IG02: ; bbWeight=1, gcrefRegs=00000000 {}, byrefRegs=00000002 {e
vpcmpeqd xmm2, xmm0, xmm1
vxorps xmm3, xmm3, xmm3
vpcmpud k1, xmm0, xmm3, 1
- vpmovm2d xmm3, k1
- vpternlogd xmm3, xmm0, xmm1, -54
+ vpblendmd xmm3 {k1}, xmm1, xmm0
vpcmpud k1, xmm0, xmm1, 1
- vpmovm2d xmm4, k1
- vpternlogd xmm4, xmm0, xmm1, -54
- vpternlogd xmm2, xmm3, xmm4, -54
+ vpblendmd xmm0 {k1}, xmm1, xmm0
+ vpternlogd xmm2, xmm3, xmm0, -54
vmovups xmmword ptr [ecx], xmm2
- ;; size=59 bbWeight=1 PerfScore 12.33
+ ;; size=45 bbWeight=1 PerfScore 10.00
G_M10273_IG03: ; bbWeight=1, epilog, nogc, extend
pop ebp
ret 32
;; size=4 bbWeight=1 PerfScore 2.50
-; Total bytes of code 79, prolog size 6, PerfScore 23.08, instruction count 17, allocated bytes for code 79 (MethodHash=7471d7de) for method System.Numerics.Tensors.TensorPrimitives+MinMagnitudeOperator`1[uint]:Invoke(System.Runtime.Intrinsics.Vector128`1[uint],System.Runtime.Intrinsics.Vector128`1[uint]):System.Runtime.Intrinsics.Vector128`1[uint] (Tier1)
+; Total bytes of code 65, prolog size 6, PerfScore 20.75, instruction count 15, allocated bytes for code 65 (MethodHash=7471d7de) for method System.Numerics.Tensors.TensorPrimitives+MinMagnitudeOperator`1[uint]:Invoke(System.Runtime.Intrinsics.Vector128`1[uint],System.Runtime.Intrinsics.Vector128`1[uint]):System.Runtime.Intrinsics.Vector128`1[uint] (Tier1)
; ============================================================
-14 (-17.72%) : 366898.dasm - System.Numerics.Tensors.TensorPrimitives+MaxMagnitudeOperator`1[uint]:Invoke(System.Runtime.Intrinsics.Vector128`1[uint],System.Runtime.Intrinsics.Vector128`1[uint]):System.Runtime.Intrinsics.Vector128`1[uint] (Tier1)
@@ -34,19 +34,17 @@ G_M23551_IG02: ; bbWeight=1, gcrefRegs=00000000 {}, byrefRegs=00000002 {e
vpcmpeqd xmm2, xmm0, xmm1
vxorps xmm3, xmm3, xmm3
vpcmpud k1, xmm0, xmm3, 1
- vpmovm2d xmm3, k1
- vpternlogd xmm3, xmm1, xmm0, -54
+ vpblendmd xmm3 {k1}, xmm0, xmm1
vpcmpud k1, xmm0, xmm1, 6
- vpmovm2d xmm4, k1
- vpternlogd xmm4, xmm0, xmm1, -54
- vpternlogd xmm2, xmm3, xmm4, -54
+ vpblendmd xmm0 {k1}, xmm1, xmm0
+ vpternlogd xmm2, xmm3, xmm0, -54
vmovups xmmword ptr [ecx], xmm2
- ;; size=59 bbWeight=1 PerfScore 12.33
+ ;; size=45 bbWeight=1 PerfScore 10.00
G_M23551_IG03: ; bbWeight=1, epilog, nogc, extend
pop ebp
ret 32
;; size=4 bbWeight=1 PerfScore 2.50
-; Total bytes of code 79, prolog size 6, PerfScore 23.08, instruction count 17, allocated bytes for code 79 (MethodHash=3243a400) for method System.Numerics.Tensors.TensorPrimitives+MaxMagnitudeOperator`1[uint]:Invoke(System.Runtime.Intrinsics.Vector128`1[uint],System.Runtime.Intrinsics.Vector128`1[uint]):System.Runtime.Intrinsics.Vector128`1[uint] (Tier1)
+; Total bytes of code 65, prolog size 6, PerfScore 20.75, instruction count 15, allocated bytes for code 65 (MethodHash=3243a400) for method System.Numerics.Tensors.TensorPrimitives+MaxMagnitudeOperator`1[uint]:Invoke(System.Runtime.Intrinsics.Vector128`1[uint],System.Runtime.Intrinsics.Vector128`1[uint]):System.Runtime.Intrinsics.Vector128`1[uint] (Tier1)
; ============================================================
-14 (-17.72%) : 366792.dasm - System.Numerics.Tensors.TensorPrimitives+MinMagnitudeOperator`1[uint]:Invoke(System.Runtime.Intrinsics.Vector128`1[uint],System.Runtime.Intrinsics.Vector128`1[uint]):System.Runtime.Intrinsics.Vector128`1[uint] (Tier1)
@@ -34,19 +34,17 @@ G_M10273_IG02: ; bbWeight=1, gcrefRegs=00000000 {}, byrefRegs=00000002 {e
vpcmpeqd xmm2, xmm0, xmm1
vxorps xmm3, xmm3, xmm3
vpcmpud k1, xmm0, xmm3, 1
- vpmovm2d xmm3, k1
- vpternlogd xmm3, xmm0, xmm1, -54
+ vpblendmd xmm3 {k1}, xmm1, xmm0
vpcmpud k1, xmm0, xmm1, 1
- vpmovm2d xmm4, k1
- vpternlogd xmm4, xmm0, xmm1, -54
- vpternlogd xmm2, xmm3, xmm4, -54
+ vpblendmd xmm0 {k1}, xmm1, xmm0
+ vpternlogd xmm2, xmm3, xmm0, -54
vmovups xmmword ptr [ecx], xmm2
- ;; size=59 bbWeight=1 PerfScore 12.33
+ ;; size=45 bbWeight=1 PerfScore 10.00
G_M10273_IG03: ; bbWeight=1, epilog, nogc, extend
pop ebp
ret 32
;; size=4 bbWeight=1 PerfScore 2.50
-; Total bytes of code 79, prolog size 6, PerfScore 23.08, instruction count 17, allocated bytes for code 79 (MethodHash=7471d7de) for method System.Numerics.Tensors.TensorPrimitives+MinMagnitudeOperator`1[uint]:Invoke(System.Runtime.Intrinsics.Vector128`1[uint],System.Runtime.Intrinsics.Vector128`1[uint]):System.Runtime.Intrinsics.Vector128`1[uint] (Tier1)
+; Total bytes of code 65, prolog size 6, PerfScore 20.75, instruction count 15, allocated bytes for code 65 (MethodHash=7471d7de) for method System.Numerics.Tensors.TensorPrimitives+MinMagnitudeOperator`1[uint]:Invoke(System.Runtime.Intrinsics.Vector128`1[uint],System.Runtime.Intrinsics.Vector128`1[uint]):System.Runtime.Intrinsics.Vector128`1[uint] (Tier1)
; ============================================================
librariestestsnotieredcompilation.run.windows.x86.Release.mch
-14 (-1.97%) : 167868.dasm - System.Numerics.Tests.GenericVectorTests:TestConditionalSelect[uint]():this (FullOpts)
@@ -78,8 +78,7 @@ G_M21446_IG02: ; bbWeight=1, gcrefRegs=00000000 {}, byrefRegs=00000000 {}
vmovups ymm1, ymmword ptr [ecx+0x08]
vmovups ymmword ptr [ebp-0x48], ymm1
vpcmpud k1, ymm0, ymm1, 6
- vpmovm2d ymm2, k1
- vpternlogd ymm2, ymm0, ymm1, -54
+ vpblendmd ymm2 {k1}, ymm1, ymm0
vmovups ymmword ptr [ebp-0x68], ymm2
mov ecx, 0xD1FFAB1E ; System.Action`2[int,uint]
; gcrRegs -[ecx]
@@ -129,9 +128,9 @@ G_M21446_IG02: ; bbWeight=1, gcrefRegs=00000000 {}, byrefRegs=00000000 {}
vmovups ymm2, ymmword ptr [ebp-0xA8]
vextracti128 xmm0, ymm2, 1
vmovd edx, xmm0
- ;; size=272 bbWeight=1 PerfScore 104.50
-G_M21446_IG03: ; bbWeight=1, extend
push edx
+ ;; size=266 bbWeight=1 PerfScore 104.33
+G_M21446_IG03: ; bbWeight=1, extend
mov edx, 4
mov ecx, gword ptr [edi+0x04]
; gcrRegs +[ecx]
@@ -167,9 +166,8 @@ G_M21446_IG03: ; bbWeight=1, extend
vmovups ymm0, ymmword ptr [ebp-0x28]
vmovups ymm1, ymmword ptr [ebp-0x48]
vpcmpud k1, ymm0, ymm1, 2
- vpmovm2d ymm2, k1
- vpternlogd ymm2, ymm0, ymm1, -54
- vmovups ymmword ptr [ebp-0x88], ymm2
+ vpblendmd ymm0 {k1}, ymm1, ymm0
+ vmovups ymmword ptr [ebp-0x88], ymm0
mov ecx, 0xD1FFAB1E ; System.Action`2[int,uint]
call CORINFO_HELP_NEWSFAST
; gcrRegs +[eax]
@@ -181,70 +179,70 @@ G_M21446_IG03: ; bbWeight=1, extend
; gcrRegs -[eax esi]
; byrRegs -[edx]
mov dword ptr [edi+0x0C], 0xD1FFAB1E
- vmovups ymm2, ymmword ptr [ebp-0x88]
- vmovups ymmword ptr [ebp-0xC8], ymm2
- vmovd edx, xmm2
+ vmovups ymm0, ymmword ptr [ebp-0x88]
+ vmovups ymmword ptr [ebp-0xC8], ymm0
+ vmovd edx, xmm0
push edx
xor edx, edx
mov ecx, gword ptr [edi+0x04]
; gcrRegs +[ecx]
call [edi+0x0C]<unknown method>
; gcrRegs -[ecx]
- vmovdqu xmm0, xmmword ptr [ebp-0xC8]
- vpextrd edx, xmm0, 1
+ vmovdqu xmm1, xmmword ptr [ebp-0xC8]
+ vpextrd edx, xmm1, 1
push edx
mov edx, 1
mov ecx, gword ptr [edi+0x04]
; gcrRegs +[ecx]
call [edi+0x0C]<unknown method>
; gcrRegs -[ecx]
- vmovdqu xmm0, xmmword ptr [ebp-0xC8]
- vpextrd edx, xmm0, 2
+ vmovdqu xmm1, xmmword ptr [ebp-0xC8]
+ vpextrd edx, xmm1, 2
push edx
mov edx, 2
mov ecx, gword ptr [edi+0x04]
; gcrRegs +[ecx]
call [edi+0x0C]<unknown method>
; gcrRegs -[ecx]
- vmovdqu xmm0, xmmword ptr [ebp-0xC8]
- vpextrd edx, xmm0, 3
+ vmovdqu xmm1, xmmword ptr [ebp-0xC8]
+ vpextrd edx, xmm1, 3
push edx
mov edx, 3
mov ecx, gword ptr [edi+0x04]
; gcrRegs +[ecx]
call [edi+0x0C]<unknown method>
; gcrRegs -[ecx]
- vmovups ymm2, ymmword ptr [ebp-0xC8]
- vextracti128 xmm0, ymm2, 1
- vmovd edx, xmm0
+ vmovups ymm0, ymmword ptr [ebp-0xC8]
+ vextracti128 xmm1, ymm0, 1
+ vmovd edx, xmm1
push edx
mov edx, 4
- ;; size=304 bbWeight=1 PerfScore 128.75
-G_M21446_IG04: ; bbWeight=1, extend
mov ecx, gword ptr [edi+0x04]
; gcrRegs +[ecx]
call [edi+0x0C]<unknown method>
; gcrRegs -[ecx]
- vmovups ymm2, ymmword ptr [ebp-0xC8]
- vextracti128 xmm0, ymm2, 1
- vpextrd edx, xmm0, 1
+ ;; size=302 bbWeight=1 PerfScore 131.58
+G_M21446_IG04: ; bbWeight=1, extend
+ vmovups ymm0, ymmword ptr [ebp-0xC8]
+ vextracti128 xmm1, ymm0, 1
+ vpextrd edx, xmm1, 1
push edx
mov edx, 5
mov ecx, gword ptr [edi+0x04]
; gcrRegs +[ecx]
call [edi+0x0C]<unknown method>
; gcrRegs -[ecx]
- vmovups ymm2, ymmword ptr [ebp-0xC8]
- vextracti128 xmm0, ymm2, 1
- vpextrd edx, xmm0, 2
+ vmovups ymm0, ymmword ptr [ebp-0xC8]
+ vextracti128 xmm1, ymm0, 1
+ vpextrd edx, xmm1, 2
push edx
mov edx, 6
mov ecx, gword ptr [edi+0x04]
; gcrRegs +[ecx]
call [edi+0x0C]<unknown method>
; gcrRegs -[ecx]
- vmovups ymm2, ymmword ptr [ebp-0xC8]
- vextracti128 xmm0, ymm2, 1
+ vmovups ymm0, ymmword ptr [ebp-0xC8]
+ vextracti128 xmm0, ymm0, 1
vpextrd edx, xmm0, 3
push edx
mov edx, 7
@@ -252,7 +250,7 @@ G_M21446_IG04: ; bbWeight=1, extend
; gcrRegs +[ecx]
call [edi+0x0C]<unknown method>
; gcrRegs -[ecx edi]
- ;; size=102 bbWeight=1 PerfScore 50.75
+ ;; size=96 bbWeight=1 PerfScore 45.75
G_M21446_IG05: ; bbWeight=1, epilog, nogc, extend
vzeroupper
lea esp, [ebp-0x08]
@@ -266,6 +264,6 @@ G_M21446_IG06: ; bbWeight=0, gcVars=00000000 {}, gcrefRegs=00000000 {}, b
int3
;; size=7 bbWeight=0 PerfScore 0.00
-; Total bytes of code 709, prolog size 14, PerfScore 292.50, instruction count 167, allocated bytes for code 709 (MethodHash=8544ac39) for method System.Numerics.Tests.GenericVectorTests:TestConditionalSelect[uint]():this (FullOpts)
+; Total bytes of code 695, prolog size 14, PerfScore 290.17, instruction count 165, allocated bytes for code 695 (MethodHash=8544ac39) for method System.Numerics.Tests.GenericVectorTests:TestConditionalSelect[uint]():this (FullOpts)
; ============================================================
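The recurring change in the diffs above is that a `vpmovm2*` + `vpternlogd ..., -54` pair becomes a single `vpblendm*` that uses the mask register directly. As a rough illustration of why the two forms select the same lanes, here is a small Python model of the instruction semantics. It is only a sketch: the function names are hypothetical, lanes are modelled as 32-bit integers, and it says nothing about how the JIT itself performs the transformation.

```python
import random

LANE_BITS = 32
LANE_MASK = (1 << LANE_BITS) - 1


def ternlog(a, b, c, imm8):
    """Model of vpternlogd on one lane: for each bit position, the three
    input bits form an index (a<<2 | b<<1 | c) into the 8-bit immediate."""
    imm8 &= 0xFF  # -54 as a signed byte is 0xCA
    out = 0
    for i in range(LANE_BITS):
        idx = (((a >> i) & 1) << 2) | (((b >> i) & 1) << 1) | ((c >> i) & 1)
        out |= ((imm8 >> idx) & 1) << i
    return out


def movm2d(k, lanes):
    """Model of vpmovm2d: expand each mask bit to an all-ones or all-zero lane."""
    return [LANE_MASK if (k >> i) & 1 else 0 for i in range(lanes)]


def blendmd(k, src1, src2):
    """Model of vpblendmd dst {k}, src1, src2: take src2 where the mask bit is set."""
    return [s2 if (k >> i) & 1 else s1 for i, (s1, s2) in enumerate(zip(src1, src2))]


def base_sequence(k, x, y):
    """vpmovm2d t, k ; vpternlogd t, x, y, -54  ->  t = mask ? x : y."""
    mask = movm2d(k, len(x))
    return [ternlog(m, a, b, -54) for m, a, b in zip(mask, x, y)]


def diff_sequence(k, x, y):
    """vpblendmd t {k}, y, x  ->  t = k ? x : y."""
    return blendmd(k, y, x)


# Quick randomized comparison over 4-lane (xmm-sized) inputs.
rng = random.Random(0)
for _ in range(100):
    k = rng.randrange(1 << 4)
    x = [rng.getrandbits(LANE_BITS) for _ in range(4)]
    y = [rng.getrandbits(LANE_BITS) for _ in range(4)]
    assert base_sequence(k, x, y) == diff_sequence(k, x, y)
print("base and diff sequences agree")
```

The immediate `-54` is `0xCA`, the truth table for "first operand ? second operand : third operand". Once the first operand is an all-ones/all-zero lane mask, that is exactly a per-lane blend.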
Details
Improvements/regressions per collection
| Collection | Contexts with diffs | Improvements | Regressions | Same size | Improvements (bytes) | Regressions (bytes) |
|---|---|---|---|---|---|---|
| benchmarks.run.windows.x86.checked.mch | 1 | 1 | 0 | 0 | -117 | +0 |
| benchmarks.run_pgo.windows.x86.checked.mch | 1 | 1 | 0 | 0 | -126 | +0 |
| benchmarks.run_tiered.windows.x86.checked.mch | 1 | 1 | 0 | 0 | -117 | +0 |
| coreclr_tests.run.windows.x86.checked.mch | 16 | 16 | 0 | 0 | -672 | +0 |
| libraries.crossgen2.windows.x86.checked.mch | 0 | 0 | 0 | 0 | -0 | +0 |
| libraries.pmi.windows.x86.checked.mch | 24 | 24 | 0 | 0 | -565 | +0 |
| libraries_tests.run.windows.x86.Release.mch | 10 | 10 | 0 | 0 | -896 | +0 |
| librariestestsnotieredcompilation.run.windows.x86.Release.mch | 7 | 7 | 0 | 0 | -845 | +0 |
| realworld.run.windows.x86.checked.mch | 0 | 0 | 0 | 0 | -0 | +0 |
|  | 60 | 60 | 0 | 0 | -3,338 | +0 |
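As a sanity check on the totals row, the per-collection figures from the table above can be summed directly. This is a minimal sketch with the numbers copied by hand from the table; the shortened collection keys are illustrative only.

```python
# (contexts with diffs, improvement bytes) per collection, copied from the table.
rows = {
    "benchmarks.run": (1, -117),
    "benchmarks.run_pgo": (1, -126),
    "benchmarks.run_tiered": (1, -117),
    "coreclr_tests.run": (16, -672),
    "libraries.crossgen2": (0, 0),
    "libraries.pmi": (24, -565),
    "libraries_tests.run": (10, -896),
    "librariestestsnotieredcompilation.run": (7, -845),
    "realworld.run": (0, 0),
}

total_contexts = sum(contexts for contexts, _ in rows.values())
total_bytes = sum(delta for _, delta in rows.values())

print(total_contexts, total_bytes)  # 60 -3338
```

Both sums match the totals row (60 contexts, -3,338 bytes) and the "Overall (-3,338 bytes)" figure reported for windows x86.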
Context information
| Collection | Diffed contexts | MinOpts | FullOpts | Missed, base | Missed, diff |
|---|---|---|---|---|---|
| benchmarks.run.windows.x86.checked.mch | 24,486 | 4 | 24,482 | 0 (0.00%) | 0 (0.00%) |
| benchmarks.run_pgo.windows.x86.checked.mch | 119,833 | 41,887 | 77,946 | 0 (0.00%) | 0 (0.00%) |
| benchmarks.run_tiered.windows.x86.checked.mch | 47,980 | 28,727 | 19,253 | 0 (0.00%) | 0 (0.00%) |
| coreclr_tests.run.windows.x86.checked.mch | 574,728 | 320,026 | 254,702 | 7 (0.00%) | 7 (0.00%) |
| libraries.crossgen2.windows.x86.checked.mch | 242,344 | 15 | 242,329 | 0 (0.00%) | 0 (0.00%) |
| libraries.pmi.windows.x86.checked.mch | 305,049 | 6 | 305,043 | 0 (0.00%) | 0 (0.00%) |
| libraries_tests.run.windows.x86.Release.mch | 632,286 | 427,924 | 204,362 | 0 (0.00%) | 0 (0.00%) |
| librariestestsnotieredcompilation.run.windows.x86.Release.mch | 316,428 | 21,871 | 294,557 | 0 (0.00%) | 0 (0.00%) |
| realworld.run.windows.x86.checked.mch | 35,987 | 3 | 35,984 | 0 (0.00%) | 0 (0.00%) |
|  | 2,299,121 | 840,463 | 1,458,658 | 7 (0.00%) | 7 (0.00%) |
jit-analyze output
benchmarks.run.windows.x86.checked.mch
To reproduce these diffs on Windows x86:
superpmi.py asmdiffs -target_os windows -target_arch x86 -arch x86
Summary of Code Size diffs:
(Lower is better)
Total bytes of base: 7123696 (overridden on cmd)
Total bytes of diff: 7123579 (overridden on cmd)
Total bytes of delta: -117 (-0.00 % of base)
diff is an improvement.
relative diff is an improvement.
Detail diffs
Top file improvements (bytes):
-117 : 22326.dasm (-11.17 % of base)
1 total files with Code Size differences (1 improved, 0 regressed), 0 unchanged.
Top method improvements (bytes):
-117 (-11.17 % of base) : 22326.dasm - System.Buffers.ProbabilisticMap:IndexOfAnyVectorized(byref,byref,int,System.ReadOnlySpan`1[ushort]):int (FullOpts)
Top method improvements (percentages):
-117 (-11.17 % of base) : 22326.dasm - System.Buffers.ProbabilisticMap:IndexOfAnyVectorized(byref,byref,int,System.ReadOnlySpan`1[ushort]):int (FullOpts)
1 total methods with Code Size differences (1 improved, 0 regressed).
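The "-0.00 % of base" in the summary above reads oddly next to a non-zero delta. A short sketch of the arithmetic, assuming the percentage is simply the delta over the base size rounded to two decimals (that rounding rule is an assumption about the report, not something it states):

```python
def summarize(base_bytes, diff_bytes):
    """Return the byte delta and its percentage of base, formatted to two decimals."""
    delta = diff_bytes - base_bytes
    pct = 100.0 * delta / base_bytes
    return delta, f"{pct:.2f}"


# Totals from the benchmarks.run.windows.x86.checked.mch summary.
delta, pct = summarize(7123696, 7123579)
print(delta, pct)  # -117 -0.00
```

The delta is real but roughly 0.0016 % of the collection, so it disappears at two decimal places. The per-method figure (-11.17 % of base) is the more informative number here.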
benchmarks.run_pgo.windows.x86.checked.mch
To reproduce these diffs on Windows x86:
superpmi.py asmdiffs -target_os windows -target_arch x86 -arch x86
Summary of Code Size diffs:
(Lower is better)
Total bytes of base: 45854626 (overridden on cmd)
Total bytes of diff: 45854500 (overridden on cmd)
Total bytes of delta: -126 (-0.00 % of base)
diff is an improvement.
relative diff is an improvement.
Detail diffs
Top file improvements (bytes):
-126 : 94556.dasm (-11.73 % of base)
1 total files with Code Size differences (1 improved, 0 regressed), 0 unchanged.
Top method improvements (bytes):
-126 (-11.73 % of base) : 94556.dasm - System.Buffers.ProbabilisticMap:IndexOfAnyVectorized(byref,byref,int,System.ReadOnlySpan`1[ushort]):int (Tier0-FullOpts)
Top method improvements (percentages):
-126 (-11.73 % of base) : 94556.dasm - System.Buffers.ProbabilisticMap:IndexOfAnyVectorized(byref,byref,int,System.ReadOnlySpan`1[ushort]):int (Tier0-FullOpts)
1 total methods with Code Size differences (1 improved, 0 regressed).
benchmarks.run_tiered.windows.x86.checked.mch
To reproduce these diffs on Windows x86:
superpmi.py asmdiffs -target_os windows -target_arch x86 -arch x86
Summary of Code Size diffs:
(Lower is better)
Total bytes of base: 9444502 (overridden on cmd)
Total bytes of diff: 9444385 (overridden on cmd)
Total bytes of delta: -117 (-0.00 % of base)
diff is an improvement.
relative diff is an improvement.
Detail diffs
Top file improvements (bytes):
-117 : 44440.dasm (-11.17 % of base)
1 total files with Code Size differences (1 improved, 0 regressed), 0 unchanged.
Top method improvements (bytes):
-117 (-11.17 % of base) : 44440.dasm - System.Buffers.ProbabilisticMap:IndexOfAnyVectorized(byref,byref,int,System.ReadOnlySpan`1[ushort]):int (Tier0-FullOpts)
Top method improvements (percentages):
-117 (-11.17 % of base) : 44440.dasm - System.Buffers.ProbabilisticMap:IndexOfAnyVectorized(byref,byref,int,System.ReadOnlySpan`1[ushort]):int (Tier0-FullOpts)
1 total methods with Code Size differences (1 improved, 0 regressed).
coreclr_tests.run.windows.x86.checked.mch
To reproduce these diffs on Windows x86:
superpmi.py asmdiffs -target_os windows -target_arch x86 -arch x86
Summary of Code Size diffs:
(Lower is better)
Total bytes of base: 309424823 (overridden on cmd)
Total bytes of diff: 309424151 (overridden on cmd)
Total bytes of delta: -672 (-0.00 % of base)
diff is an improvement.
relative diff is an improvement.
Detail diffs
Top file improvements (bytes):
-56 : 207941.dasm (-3.64 % of base)
-56 : 469367.dasm (-3.69 % of base)
-56 : 469368.dasm (-1.24 % of base)
-56 : 207939.dasm (-3.64 % of base)
-56 : 207946.dasm (-3.69 % of base)
-56 : 207947.dasm (-1.24 % of base)
-56 : 469363.dasm (-3.64 % of base)
-56 : 469364.dasm (-3.64 % of base)
-28 : 207938.dasm (-0.63 % of base)
-28 : 469366.dasm (-1.86 % of base)
-28 : 207935.dasm (-1.89 % of base)
-28 : 207944.dasm (-1.86 % of base)
-28 : 207945.dasm (-1.86 % of base)
-28 : 469361.dasm (-1.89 % of base)
-28 : 469362.dasm (-0.63 % of base)
-28 : 469365.dasm (-1.86 % of base)
16 total files with Code Size differences (16 improved, 0 regressed), 0 unchanged.
Top method improvements (bytes):
-56 (-3.64 % of base) : 469364.dasm - VectorTest+VectorRelopTest`1[ubyte]:VectorRelOp(ubyte,ubyte):int (FullOpts)
-56 (-3.64 % of base) : 207941.dasm - VectorTest+VectorRelopTest`1[ubyte]:VectorRelOp(ubyte,ubyte):int (Tier0-FullOpts)
-56 (-3.69 % of base) : 469367.dasm - VectorTest+VectorRelopTest`1[uint]:VectorRelOp(uint,uint):int (FullOpts)
-56 (-3.69 % of base) : 207946.dasm - VectorTest+VectorRelopTest`1[uint]:VectorRelOp(uint,uint):int (Tier0-FullOpts)
-56 (-1.24 % of base) : 469368.dasm - VectorTest+VectorRelopTest`1[ulong]:VectorRelOp(ulong,ulong):int (FullOpts)
-56 (-1.24 % of base) : 207947.dasm - VectorTest+VectorRelopTest`1[ulong]:VectorRelOp(ulong,ulong):int (Tier0-FullOpts)
-56 (-3.64 % of base) : 469363.dasm - VectorTest+VectorRelopTest`1[ushort]:VectorRelOp(ushort,ushort):int (FullOpts)
-56 (-3.64 % of base) : 207939.dasm - VectorTest+VectorRelopTest`1[ushort]:VectorRelOp(ushort,ushort):int (Tier0-FullOpts)
-28 (-1.86 % of base) : 469366.dasm - VectorTest+VectorRelopTest`1[byte]:VectorRelOp(byte,byte):int (FullOpts)
-28 (-1.86 % of base) : 207945.dasm - VectorTest+VectorRelopTest`1[byte]:VectorRelOp(byte,byte):int (Tier0-FullOpts)
-28 (-1.89 % of base) : 469361.dasm - VectorTest+VectorRelopTest`1[int]:VectorRelOp(int,int):int (FullOpts)
-28 (-1.89 % of base) : 207935.dasm - VectorTest+VectorRelopTest`1[int]:VectorRelOp(int,int):int (Tier0-FullOpts)
-28 (-0.63 % of base) : 469362.dasm - VectorTest+VectorRelopTest`1[long]:VectorRelOp(long,long):int (FullOpts)
-28 (-0.63 % of base) : 207938.dasm - VectorTest+VectorRelopTest`1[long]:VectorRelOp(long,long):int (Tier0-FullOpts)
-28 (-1.86 % of base) : 469365.dasm - VectorTest+VectorRelopTest`1[short]:VectorRelOp(short,short):int (FullOpts)
-28 (-1.86 % of base) : 207944.dasm - VectorTest+VectorRelopTest`1[short]:VectorRelOp(short,short):int (Tier0-FullOpts)
Top method improvements (percentages):
-56 (-3.69 % of base) : 469367.dasm - VectorTest+VectorRelopTest`1[uint]:VectorRelOp(uint,uint):int (FullOpts)
-56 (-3.69 % of base) : 207946.dasm - VectorTest+VectorRelopTest`1[uint]:VectorRelOp(uint,uint):int (Tier0-FullOpts)
-56 (-3.64 % of base) : 469364.dasm - VectorTest+VectorRelopTest`1[ubyte]:VectorRelOp(ubyte,ubyte):int (FullOpts)
-56 (-3.64 % of base) : 207941.dasm - VectorTest+VectorRelopTest`1[ubyte]:VectorRelOp(ubyte,ubyte):int (Tier0-FullOpts)
-56 (-3.64 % of base) : 469363.dasm - VectorTest+VectorRelopTest`1[ushort]:VectorRelOp(ushort,ushort):int (FullOpts)
-56 (-3.64 % of base) : 207939.dasm - VectorTest+VectorRelopTest`1[ushort]:VectorRelOp(ushort,ushort):int (Tier0-FullOpts)
-28 (-1.89 % of base) : 469361.dasm - VectorTest+VectorRelopTest`1[int]:VectorRelOp(int,int):int (FullOpts)
-28 (-1.89 % of base) : 207935.dasm - VectorTest+VectorRelopTest`1[int]:VectorRelOp(int,int):int (Tier0-FullOpts)
-28 (-1.86 % of base) : 469366.dasm - VectorTest+VectorRelopTest`1[byte]:VectorRelOp(byte,byte):int (FullOpts)
-28 (-1.86 % of base) : 207945.dasm - VectorTest+VectorRelopTest`1[byte]:VectorRelOp(byte,byte):int (Tier0-FullOpts)
-28 (-1.86 % of base) : 469365.dasm - VectorTest+VectorRelopTest`1[short]:VectorRelOp(short,short):int (FullOpts)
-28 (-1.86 % of base) : 207944.dasm - VectorTest+VectorRelopTest`1[short]:VectorRelOp(short,short):int (Tier0-FullOpts)
-56 (-1.24 % of base) : 469368.dasm - VectorTest+VectorRelopTest`1[ulong]:VectorRelOp(ulong,ulong):int (FullOpts)
-56 (-1.24 % of base) : 207947.dasm - VectorTest+VectorRelopTest`1[ulong]:VectorRelOp(ulong,ulong):int (Tier0-FullOpts)
-28 (-0.63 % of base) : 469362.dasm - VectorTest+VectorRelopTest`1[long]:VectorRelOp(long,long):int (FullOpts)
-28 (-0.63 % of base) : 207938.dasm - VectorTest+VectorRelopTest`1[long]:VectorRelOp(long,long):int (Tier0-FullOpts)
16 total methods with Code Size differences (16 improved, 0 regressed).
libraries.pmi.windows.x86.checked.mch
To reproduce these diffs on Windows x86:
superpmi.py asmdiffs -target_os windows -target_arch x86 -arch x86
Summary of Code Size diffs:
(Lower is better)
Total bytes of base: 49148609 (overridden on cmd)
Total bytes of diff: 49148044 (overridden on cmd)
Total bytes of delta: -565 (-0.00 % of base)
diff is an improvement.
relative diff is an improvement.
Detail diffs
Top file improvements (bytes):
-117 : 4822.dasm (-11.17 % of base)
-56 : 273944.dasm (-20.36 % of base)
-56 : 274001.dasm (-20.36 % of base)
-28 : 274003.dasm (-16.47 % of base)
-28 : 273946.dasm (-16.47 % of base)
-21 : 273965.dasm (-20.59 % of base)
-21 : 274022.dasm (-20.59 % of base)
-21 : 273943.dasm (-20.59 % of base)
-21 : 274000.dasm (-20.59 % of base)
-20 : 4817.dasm (-7.07 % of base)
-15 : 4815.dasm (-5.21 % of base)
-14 : 273945.dasm (-14.43 % of base)
-14 : 273941.dasm (-17.72 % of base)
-14 : 273963.dasm (-17.72 % of base)
-14 : 273999.dasm (-17.07 % of base)
-14 : 273964.dasm (-17.07 % of base)
-14 : 274002.dasm (-14.43 % of base)
-14 : 274021.dasm (-17.07 % of base)
-14 : 273942.dasm (-17.07 % of base)
-14 : 273998.dasm (-17.72 % of base)
24 total files with Code Size differences (24 improved, 0 regressed), 0 unchanged.
Top method improvements (bytes):
-117 (-11.17 % of base) : 4822.dasm - System.Buffers.ProbabilisticMap:IndexOfAnyVectorized(byref,byref,int,System.ReadOnlySpan`1[ushort]):int (FullOpts)
-56 (-20.36 % of base) : 273944.dasm - System.Numerics.Tensors.TensorPrimitives+MaxMagnitudeOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector128`1[ubyte]):ubyte (FullOpts)
-56 (-20.36 % of base) : 274001.dasm - System.Numerics.Tensors.TensorPrimitives+MinMagnitudeOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector128`1[ubyte]):ubyte (FullOpts)
-28 (-16.47 % of base) : 273946.dasm - System.Numerics.Tensors.TensorPrimitives+MaxMagnitudeOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector512`1[ubyte]):ubyte (FullOpts)
-28 (-16.47 % of base) : 274003.dasm - System.Numerics.Tensors.TensorPrimitives+MinMagnitudeOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector512`1[ubyte]):ubyte (FullOpts)
-21 (-20.59 % of base) : 273943.dasm - System.Numerics.Tensors.TensorPrimitives+MaxMagnitudeOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector512`1[ubyte],System.Runtime.Intrinsics.Vector512`1[ubyte]):System.Runtime.Intrinsics.Vector512`1[ubyte] (FullOpts)
-21 (-20.59 % of base) : 273965.dasm - System.Numerics.Tensors.TensorPrimitives+MaxMagnitudePropagateNaNOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector512`1[ubyte],System.Runtime.Intrinsics.Vector512`1[ubyte]):System.Runtime.Intrinsics.Vector512`1[ubyte] (FullOpts)
-21 (-20.59 % of base) : 274000.dasm - System.Numerics.Tensors.TensorPrimitives+MinMagnitudeOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector512`1[ubyte],System.Runtime.Intrinsics.Vector512`1[ubyte]):System.Runtime.Intrinsics.Vector512`1[ubyte] (FullOpts)
-21 (-20.59 % of base) : 274022.dasm - System.Numerics.Tensors.TensorPrimitives+MinMagnitudePropagateNaNOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector512`1[ubyte],System.Runtime.Intrinsics.Vector512`1[ubyte]):System.Runtime.Intrinsics.Vector512`1[ubyte] (FullOpts)
-20 (-7.07 % of base) : 4817.dasm - System.Buffers.ProbabilisticMap:ContainsMask16Chars(System.Runtime.Intrinsics.Vector128`1[ubyte],System.Runtime.Intrinsics.Vector128`1[ubyte],byref):System.Runtime.Intrinsics.Vector128`1[ubyte] (FullOpts)
-15 (-5.21 % of base) : 4815.dasm - System.Buffers.ProbabilisticMap:ContainsMask32CharsAvx2(System.Runtime.Intrinsics.Vector256`1[ubyte],System.Runtime.Intrinsics.Vector256`1[ubyte],byref):System.Runtime.Intrinsics.Vector256`1[ubyte] (FullOpts)
-14 (-17.72 % of base) : 273941.dasm - System.Numerics.Tensors.TensorPrimitives+MaxMagnitudeOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector128`1[ubyte],System.Runtime.Intrinsics.Vector128`1[ubyte]):System.Runtime.Intrinsics.Vector128`1[ubyte] (FullOpts)
-14 (-14.43 % of base) : 273945.dasm - System.Numerics.Tensors.TensorPrimitives+MaxMagnitudeOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector256`1[ubyte]):ubyte (FullOpts)
-14 (-17.07 % of base) : 273942.dasm - System.Numerics.Tensors.TensorPrimitives+MaxMagnitudeOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector256`1[ubyte],System.Runtime.Intrinsics.Vector256`1[ubyte]):System.Runtime.Intrinsics.Vector256`1[ubyte] (FullOpts)
-14 (-17.72 % of base) : 273963.dasm - System.Numerics.Tensors.TensorPrimitives+MaxMagnitudePropagateNaNOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector128`1[ubyte],System.Runtime.Intrinsics.Vector128`1[ubyte]):System.Runtime.Intrinsics.Vector128`1[ubyte] (FullOpts)
-14 (-17.07 % of base) : 273964.dasm - System.Numerics.Tensors.TensorPrimitives+MaxMagnitudePropagateNaNOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector256`1[ubyte],System.Runtime.Intrinsics.Vector256`1[ubyte]):System.Runtime.Intrinsics.Vector256`1[ubyte] (FullOpts)
-14 (-17.72 % of base) : 273998.dasm - System.Numerics.Tensors.TensorPrimitives+MinMagnitudeOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector128`1[ubyte],System.Runtime.Intrinsics.Vector128`1[ubyte]):System.Runtime.Intrinsics.Vector128`1[ubyte] (FullOpts)
-14 (-14.43 % of base) : 274002.dasm - System.Numerics.Tensors.TensorPrimitives+MinMagnitudeOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector256`1[ubyte]):ubyte (FullOpts)
-14 (-17.07 % of base) : 273999.dasm - System.Numerics.Tensors.TensorPrimitives+MinMagnitudeOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector256`1[ubyte],System.Runtime.Intrinsics.Vector256`1[ubyte]):System.Runtime.Intrinsics.Vector256`1[ubyte] (FullOpts)
-14 (-17.72 % of base) : 274020.dasm - System.Numerics.Tensors.TensorPrimitives+MinMagnitudePropagateNaNOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector128`1[ubyte],System.Runtime.Intrinsics.Vector128`1[ubyte]):System.Runtime.Intrinsics.Vector128`1[ubyte] (FullOpts)
Top method improvements (percentages):
-21 (-20.59 % of base) : 273943.dasm - System.Numerics.Tensors.TensorPrimitives+MaxMagnitudeOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector512`1[ubyte],System.Runtime.Intrinsics.Vector512`1[ubyte]):System.Runtime.Intrinsics.Vector512`1[ubyte] (FullOpts)
-21 (-20.59 % of base) : 273965.dasm - System.Numerics.Tensors.TensorPrimitives+MaxMagnitudePropagateNaNOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector512`1[ubyte],System.Runtime.Intrinsics.Vector512`1[ubyte]):System.Runtime.Intrinsics.Vector512`1[ubyte] (FullOpts)
-21 (-20.59 % of base) : 274000.dasm - System.Numerics.Tensors.TensorPrimitives+MinMagnitudeOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector512`1[ubyte],System.Runtime.Intrinsics.Vector512`1[ubyte]):System.Runtime.Intrinsics.Vector512`1[ubyte] (FullOpts)
-21 (-20.59 % of base) : 274022.dasm - System.Numerics.Tensors.TensorPrimitives+MinMagnitudePropagateNaNOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector512`1[ubyte],System.Runtime.Intrinsics.Vector512`1[ubyte]):System.Runtime.Intrinsics.Vector512`1[ubyte] (FullOpts)
-56 (-20.36 % of base) : 273944.dasm - System.Numerics.Tensors.TensorPrimitives+MaxMagnitudeOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector128`1[ubyte]):ubyte (FullOpts)
-56 (-20.36 % of base) : 274001.dasm - System.Numerics.Tensors.TensorPrimitives+MinMagnitudeOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector128`1[ubyte]):ubyte (FullOpts)
-14 (-17.72 % of base) : 273941.dasm - System.Numerics.Tensors.TensorPrimitives+MaxMagnitudeOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector128`1[ubyte],System.Runtime.Intrinsics.Vector128`1[ubyte]):System.Runtime.Intrinsics.Vector128`1[ubyte] (FullOpts)
-14 (-17.72 % of base) : 273963.dasm - System.Numerics.Tensors.TensorPrimitives+MaxMagnitudePropagateNaNOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector128`1[ubyte],System.Runtime.Intrinsics.Vector128`1[ubyte]):System.Runtime.Intrinsics.Vector128`1[ubyte] (FullOpts)
-14 (-17.72 % of base) : 273998.dasm - System.Numerics.Tensors.TensorPrimitives+MinMagnitudeOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector128`1[ubyte],System.Runtime.Intrinsics.Vector128`1[ubyte]):System.Runtime.Intrinsics.Vector128`1[ubyte] (FullOpts)
-14 (-17.72 % of base) : 274020.dasm - System.Numerics.Tensors.TensorPrimitives+MinMagnitudePropagateNaNOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector128`1[ubyte],System.Runtime.Intrinsics.Vector128`1[ubyte]):System.Runtime.Intrinsics.Vector128`1[ubyte] (FullOpts)
-14 (-17.07 % of base) : 273942.dasm - System.Numerics.Tensors.TensorPrimitives+MaxMagnitudeOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector256`1[ubyte],System.Runtime.Intrinsics.Vector256`1[ubyte]):System.Runtime.Intrinsics.Vector256`1[ubyte] (FullOpts)
-14 (-17.07 % of base) : 273964.dasm - System.Numerics.Tensors.TensorPrimitives+MaxMagnitudePropagateNaNOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector256`1[ubyte],System.Runtime.Intrinsics.Vector256`1[ubyte]):System.Runtime.Intrinsics.Vector256`1[ubyte] (FullOpts)
-14 (-17.07 % of base) : 273999.dasm - System.Numerics.Tensors.TensorPrimitives+MinMagnitudeOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector256`1[ubyte],System.Runtime.Intrinsics.Vector256`1[ubyte]):System.Runtime.Intrinsics.Vector256`1[ubyte] (FullOpts)
-14 (-17.07 % of base) : 274021.dasm - System.Numerics.Tensors.TensorPrimitives+MinMagnitudePropagateNaNOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector256`1[ubyte],System.Runtime.Intrinsics.Vector256`1[ubyte]):System.Runtime.Intrinsics.Vector256`1[ubyte] (FullOpts)
-28 (-16.47 % of base) : 273946.dasm - System.Numerics.Tensors.TensorPrimitives+MaxMagnitudeOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector512`1[ubyte]):ubyte (FullOpts)
-28 (-16.47 % of base) : 274003.dasm - System.Numerics.Tensors.TensorPrimitives+MinMagnitudeOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector512`1[ubyte]):ubyte (FullOpts)
-14 (-14.43 % of base) : 273945.dasm - System.Numerics.Tensors.TensorPrimitives+MaxMagnitudeOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector256`1[ubyte]):ubyte (FullOpts)
-14 (-14.43 % of base) : 274002.dasm - System.Numerics.Tensors.TensorPrimitives+MinMagnitudeOperator`1[ubyte]:Invoke(System.Runtime.Intrinsics.Vector256`1[ubyte]):ubyte (FullOpts)
-117 (-11.17 % of base) : 4822.dasm - System.Buffers.ProbabilisticMap:IndexOfAnyVectorized(byref,byref,int,System.ReadOnlySpan`1[ushort]):int (FullOpts)
-20 (-7.07 % of base) : 4817.dasm - System.Buffers.ProbabilisticMap:ContainsMask16Chars(System.Runtime.Intrinsics.Vector128`1[ubyte],System.Runtime.Intrinsics.Vector128`1[ubyte],byref):System.Runtime.Intrinsics.Vector128`1[ubyte] (FullOpts)
24 total methods with Code Size differences (24 improved, 0 regressed).
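Each per-method entry above has the shape `<delta> (<pct> % of base) : <file>.dasm - <method>`, so the approximate base size of a method can be recovered from the delta and the percentage. The sketch below assumes that line shape. The regular expression and the helper names `parse_entry` and `approx_base_size` are hypothetical, not something the SuperPMI tooling provides, and the recovered size is only an estimate because the percentage is rounded to two decimals.

```python
import re

# Assumed shape of one entry: "-117 (-11.17 % of base) : 4822.dasm - Method (Tier)"
ENTRY = re.compile(
    r"^(?P<delta>[+-]?\d[\d,]*) "
    r"\((?P<pct>[+-]?\d+(?:\.\d+)?) % of base\) : "
    r"(?P<file>\S+\.dasm) - (?P<method>.+)$"
)


def parse_entry(line: str):
    """Split one method-diff line into its fields, or return None if it does not match."""
    m = ENTRY.match(line.strip())
    if m is None:
        return None
    return {
        "delta": int(m["delta"].replace(",", "")),
        "pct": float(m["pct"]),
        "file": m["file"],
        "method": m["method"],
    }


def approx_base_size(entry):
    """Estimate the base method size in bytes from delta and rounded percentage."""
    if entry["pct"] == 0:
        return None  # percentage rounded to zero, so the base cannot be estimated
    return round(entry["delta"] / (entry["pct"] / 100.0))


line = (
    "-117 (-11.17 % of base) : 4822.dasm - "
    "System.Buffers.ProbabilisticMap:IndexOfAnyVectorized"
    "(byref,byref,int,System.ReadOnlySpan`1[ushort]):int (FullOpts)"
)
entry = parse_entry(line)
estimated_base = approx_base_size(entry)
```

For the `IndexOfAnyVectorized` entry this gives an estimated base of roughly 1,047 bytes, which puts the 117-byte saving in context.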
libraries_tests.run.windows.x86.Release.mch
To reproduce these diffs on Windows x86:
superpmi.py asmdiffs -target_os windows -target_arch x86 -arch x86
Summary of Code Size diffs:
(Lower is better)
Total bytes of base: 188553323 (overridden on cmd)
Total bytes of diff: 188552427 (overridden on cmd)
Total bytes of delta: -896 (-0.00 % of base)
diff is an improvement.
relative diff is an improvement.
Detail diffs
Top file improvements (bytes):
-182 : 367511.dasm (-18.00 % of base)
-182 : 369103.dasm (-18.00 % of base)
-154 : 369537.dasm (-17.44 % of base)
-98 : 363521.dasm (-14.78 % of base)
-98 : 369394.dasm (-14.78 % of base)
-84 : 370885.dasm (-10.05 % of base)
-56 : 318312.dasm (-5.28 % of base)
-14 : 366898.dasm (-17.72 % of base)
-14 : 370873.dasm (-17.72 % of base)
-14 : 366792.dasm (-17.72 % of base)
10 total files with Code Size differences (10 improved, 0 regressed), 0 unchanged.
Top method improvements (bytes):
-182 (-18.00 % of base) : 369103.dasm - System.Numerics.Tensors.TensorPrimitives:MinMaxCore[ubyte,System.Numerics.Tensors.TensorPrimitives+MaxMagnitudeOperator`1[ubyte]](System.ReadOnlySpan`1[ubyte]):ubyte (Tier0-FullOpts)
-182 (-18.00 % of base) : 367511.dasm - System.Numerics.Tensors.TensorPrimitives:MinMaxCore[ubyte,System.Numerics.Tensors.TensorPrimitives+MinMagnitudeOperator`1[ubyte]](System.ReadOnlySpan`1[ubyte]):ubyte (Tier0-FullOpts)
-154 (-17.44 % of base) : 369537.dasm - System.Numerics.Tensors.TensorPrimitives:MinMaxCore[ushort,System.Numerics.Tensors.TensorPrimitives+MinMagnitudeOperator`1[ushort]](System.ReadOnlySpan`1[ushort]):ushort (Tier0-FullOpts)
-98 (-14.78 % of base) : 363521.dasm - System.Numerics.Tensors.TensorPrimitives:MinMaxCore[ulong,System.Numerics.Tensors.TensorPrimitives+MinMagnitudeOperator`1[ulong]](System.ReadOnlySpan`1[ulong]):ulong (Tier0-FullOpts)
-98 (-14.78 % of base) : 369394.dasm - System.Numerics.Tensors.TensorPrimitives:MinMaxCore[ulong,System.Numerics.Tensors.TensorPrimitives+MinMagnitudeOperator`1[ulong]](System.ReadOnlySpan`1[ulong]):ulong (Tier0-FullOpts)
-84 (-10.05 % of base) : 370885.dasm - System.Numerics.Tensors.TensorPrimitives:InvokeSpanSpanIntoSpan[uint,System.Numerics.Tensors.TensorPrimitives+MinMagnitudePropagateNaNOperator`1[uint]](System.ReadOnlySpan`1[uint],System.ReadOnlySpan`1[uint],System.Span`1[uint]) (Tier1)
-56 (-5.28 % of base) : 318312.dasm - System.Buffers.ProbabilisticMap:IndexOfAnyVectorized(byref,byref,int,System.ReadOnlySpan`1[ushort]):int (Tier0-FullOpts)
-14 (-17.72 % of base) : 366898.dasm - System.Numerics.Tensors.TensorPrimitives+MaxMagnitudeOperator`1[uint]:Invoke(System.Runtime.Intrinsics.Vector128`1[uint],System.Runtime.Intrinsics.Vector128`1[uint]):System.Runtime.Intrinsics.Vector128`1[uint] (Tier1)
-14 (-17.72 % of base) : 370873.dasm - System.Numerics.Tensors.TensorPrimitives+MinMagnitudeOperator`1[uint]:Invoke(System.Runtime.Intrinsics.Vector128`1[uint],System.Runtime.Intrinsics.Vector128`1[uint]):System.Runtime.Intrinsics.Vector128`1[uint] (Tier1)
-14 (-17.72 % of base) : 366792.dasm - System.Numerics.Tensors.TensorPrimitives+MinMagnitudeOperator`1[uint]:Invoke(System.Runtime.Intrinsics.Vector128`1[uint],System.Runtime.Intrinsics.Vector128`1[uint]):System.Runtime.Intrinsics.Vector128`1[uint] (Tier1)
Top method improvements (percentages):
-182 (-18.00 % of base) : 369103.dasm - System.Numerics.Tensors.TensorPrimitives:MinMaxCore[ubyte,System.Numerics.Tensors.TensorPrimitives+MaxMagnitudeOperator`1[ubyte]](System.ReadOnlySpan`1[ubyte]):ubyte (Tier0-FullOpts)
-182 (-18.00 % of base) : 367511.dasm - System.Numerics.Tensors.TensorPrimitives:MinMaxCore[ubyte,System.Numerics.Tensors.TensorPrimitives+MinMagnitudeOperator`1[ubyte]](System.ReadOnlySpan`1[ubyte]):ubyte (Tier0-FullOpts)
-14 (-17.72 % of base) : 366898.dasm - System.Numerics.Tensors.TensorPrimitives+MaxMagnitudeOperator`1[uint]:Invoke(System.Runtime.Intrinsics.Vector128`1[uint],System.Runtime.Intrinsics.Vector128`1[uint]):System.Runtime.Intrinsics.Vector128`1[uint] (Tier1)
-14 (-17.72 % of base) : 370873.dasm - System.Numerics.Tensors.TensorPrimitives+MinMagnitudeOperator`1[uint]:Invoke(System.Runtime.Intrinsics.Vector128`1[uint],System.Runtime.Intrinsics.Vector128`1[uint]):System.Runtime.Intrinsics.Vector128`1[uint] (Tier1)
-14 (-17.72 % of base) : 366792.dasm - System.Numerics.Tensors.TensorPrimitives+MinMagnitudeOperator`1[uint]:Invoke(System.Runtime.Intrinsics.Vector128`1[uint],System.Runtime.Intrinsics.Vector128`1[uint]):System.Runtime.Intrinsics.Vector128`1[uint] (Tier1)
-154 (-17.44 % of base) : 369537.dasm - System.Numerics.Tensors.TensorPrimitives:MinMaxCore[ushort,System.Numerics.Tensors.TensorPrimitives+MinMagnitudeOperator`1[ushort]](System.ReadOnlySpan`1[ushort]):ushort (Tier0-FullOpts)
-98 (-14.78 % of base) : 363521.dasm - System.Numerics.Tensors.TensorPrimitives:MinMaxCore[ulong,System.Numerics.Tensors.TensorPrimitives+MinMagnitudeOperator`1[ulong]](System.ReadOnlySpan`1[ulong]):ulong (Tier0-FullOpts)
-98 (-14.78 % of base) : 369394.dasm - System.Numerics.Tensors.TensorPrimitives:MinMaxCore[ulong,System.Numerics.Tensors.TensorPrimitives+MinMagnitudeOperator`1[ulong]](System.ReadOnlySpan`1[ulong]):ulong (Tier0-FullOpts)
-84 (-10.05 % of base) : 370885.dasm - System.Numerics.Tensors.TensorPrimitives:InvokeSpanSpanIntoSpan[uint,System.Numerics.Tensors.TensorPrimitives+MinMagnitudePropagateNaNOperator`1[uint]](System.ReadOnlySpan`1[uint],System.ReadOnlySpan`1[uint],System.Span`1[uint]) (Tier1)
-56 (-5.28 % of base) : 318312.dasm - System.Buffers.ProbabilisticMap:IndexOfAnyVectorized(byref,byref,int,System.ReadOnlySpan`1[ushort]):int (Tier0-FullOpts)
10 total methods with Code Size differences (10 improved, 0 regressed).
librariestestsnotieredcompilation.run.windows.x86.Release.mch
To reproduce these diffs on Windows x86:
superpmi.py asmdiffs -target_os windows -target_arch x86 -arch x86
Summary of Code Size diffs:
(Lower is better)
Total bytes of base: 103930242 (overridden on cmd)
Total bytes of diff: 103929397 (overridden on cmd)
Total bytes of delta: -845 (-0.00 % of base)
diff is an improvement.
relative diff is an improvement.
Detail diffs
Top file improvements (bytes):
-182 : 167191.dasm (-18.00 % of base)
-182 : 165267.dasm (-18.00 % of base)
-154 : 167215.dasm (-17.44 % of base)
-117 : 149348.dasm (-11.17 % of base)
-98 : 167075.dasm (-14.78 % of base)
-98 : 166467.dasm (-14.78 % of base)
-14 : 167868.dasm (-1.97 % of base)
7 total files with Code Size differences (7 improved, 0 regressed), 0 unchanged.
Top method improvements (bytes):
-182 (-18.00 % of base) : 167191.dasm - System.Numerics.Tensors.TensorPrimitives:MinMaxCore[ubyte,System.Numerics.Tensors.TensorPrimitives+MaxMagnitudeOperator`1[ubyte]](System.ReadOnlySpan`1[ubyte]):ubyte (FullOpts)
-182 (-18.00 % of base) : 165267.dasm - System.Numerics.Tensors.TensorPrimitives:MinMaxCore[ubyte,System.Numerics.Tensors.TensorPrimitives+MinMagnitudeOperator`1[ubyte]](System.ReadOnlySpan`1[ubyte]):ubyte (FullOpts)
-154 (-17.44 % of base) : 167215.dasm - System.Numerics.Tensors.TensorPrimitives:MinMaxCore[ushort,System.Numerics.Tensors.TensorPrimitives+MinMagnitudeOperator`1[ushort]](System.ReadOnlySpan`1[ushort]):ushort (FullOpts)
-117 (-11.17 % of base) : 149348.dasm - System.Buffers.ProbabilisticMap:IndexOfAnyVectorized(byref,byref,int,System.ReadOnlySpan`1[ushort]):int (FullOpts)
-98 (-14.78 % of base) : 167075.dasm - System.Numerics.Tensors.TensorPrimitives:MinMaxCore[ulong,System.Numerics.Tensors.TensorPrimitives+MinMagnitudeOperator`1[ulong]](System.ReadOnlySpan`1[ulong]):ulong (FullOpts)
-98 (-14.78 % of base) : 166467.dasm - System.Numerics.Tensors.TensorPrimitives:MinMaxCore[ulong,System.Numerics.Tensors.TensorPrimitives+MinMagnitudeOperator`1[ulong]](System.ReadOnlySpan`1[ulong]):ulong (FullOpts)
-14 (-1.97 % of base) : 167868.dasm - System.Numerics.Tests.GenericVectorTests:TestConditionalSelect[uint]():this (FullOpts)
Top method improvements (percentages):
-182 (-18.00 % of base) : 167191.dasm - System.Numerics.Tensors.TensorPrimitives:MinMaxCore[ubyte,System.Numerics.Tensors.TensorPrimitives+MaxMagnitudeOperator`1[ubyte]](System.ReadOnlySpan`1[ubyte]):ubyte (FullOpts)
-182 (-18.00 % of base) : 165267.dasm - System.Numerics.Tensors.TensorPrimitives:MinMaxCore[ubyte,System.Numerics.Tensors.TensorPrimitives+MinMagnitudeOperator`1[ubyte]](System.ReadOnlySpan`1[ubyte]):ubyte (FullOpts)
-154 (-17.44 % of base) : 167215.dasm - System.Numerics.Tensors.TensorPrimitives:MinMaxCore[ushort,System.Numerics.Tensors.TensorPrimitives+MinMagnitudeOperator`1[ushort]](System.ReadOnlySpan`1[ushort]):ushort (FullOpts)
-98 (-14.78 % of base) : 167075.dasm - System.Numerics.Tensors.TensorPrimitives:MinMaxCore[ulong,System.Numerics.Tensors.TensorPrimitives+MinMagnitudeOperator`1[ulong]](System.ReadOnlySpan`1[ulong]):ulong (FullOpts)
-98 (-14.78 % of base) : 166467.dasm - System.Numerics.Tensors.TensorPrimitives:MinMaxCore[ulong,System.Numerics.Tensors.TensorPrimitives+MinMagnitudeOperator`1[ulong]](System.ReadOnlySpan`1[ulong]):ulong (FullOpts)
-117 (-11.17 % of base) : 149348.dasm - System.Buffers.ProbabilisticMap:IndexOfAnyVectorized(byref,byref,int,System.ReadOnlySpan`1[ushort]):int (FullOpts)
-14 (-1.97 % of base) : 167868.dasm - System.Numerics.Tests.GenericVectorTests:TestConditionalSelect[uint]():this (FullOpts)
7 total methods with Code Size differences (7 improved, 0 regressed).