Re: Challenge For The "Expert" Tyrone

Subject: Re: Challenge For The "Expert" Tyrone
From: physfitfreak (at) *nospam* gmail.com (Physfitfreak)
Newsgroups: comp.os.linux.advocacy
Date: 28 Jan 2025, 17:43:16
Organization: individual
Message-ID: <vnb1f4$1tgcb$1@dont-email.me>
References: 1
User-Agent: Mozilla Thunderbird
On 1/28/25 10:14 AM, Farley Flud wrote:
Poor tired, exhausted Tyrone.  He must have spent days of futile
searching in an attempt to find a copy somewhere of my absolutely
perfect AVX-512 assembly code.
 (Ha, ha, ha, ha, ha, ha, ha, ha, ha, ha!)
 Of course, all of his efforts were in total vain, because no such
copy exists anywhere, except right here on C.O.L.A.
 Poor tired, exhausted Tyrone (not to mention poor, dumb bastard).
 (Ha, ha, ha, ha, ha, ha, ha, ha, ha, ha!)
 Well, I have a challenge for the "expert" Tyrone.
 I have ever so slightly modified my absolutely perfect AVX-512 code
so that it no longer will execute.  Instead it will crash horribly.
 The ever-so-slightly modified code follows.
 Let's allow the "expert" Tyrone to discover and clearly report
the fault.
 Anyone want to take bets?
 Ha, ha, ha, ha, ha, ha, ha, ha, ha, ha, ha, ha, ha!
 I recommend that Tyrone invest his extensive and exhaustive search
time in a search for his own stupidity.
 Ha, ha, ha, ha, ha, ha, ha, ha, ha, ha, ha, ha, ha!
============================================
Begin AVX-512 NASM Assembly (Modified)
============================================

BITS 64

segment .text
global _start

_start:
    mov r8, data_in
    mov r9, data_out
    mov rbx, qword [stride]
    xor rdx, rdx
    mov rax, qword [N]
    div rbx                 ; rax = quotient, rdx = remainder
load:
    vmovdqa32 zmm1, zword [r8]
    vmovdqa32 zword [r9], zmm1
    add r8, 64              ; increment data pointers
    add r9, 64
    dec rax
    jnz load
    xor r11, r11            ; load mask, i.e. only rdx left over to load
    mov r10, -1
    mov rcx, rdx
    shld r11, r10, cl
    kmovq k1, r11;
    vmovdqa32 zmm1{k1}{z}, zword [r8]
    vmovdqa32 zword [r9], zmm1
exit:
    xor edi,edi
    mov eax,60
    syscall

segment .data
align 64
N:       dq 37              ;set length of block and stride
stride:  dq 16
data_in: dd 16 dup (0xefbeadde) ;dummy data
         dd 16 dup (0xfecaafde)
         dd 5 dup (0xefbeadde)

segment .bss
alignb 64
data_out: resd 37

========================================
End AVX-512 NASM Assembly (Modified)
========================================
   
No Farley, not after the advent of DeepSeek.
If I, who don't know a _hoot_ about assembly language, can find the answer within about 10 seconds of your post, then any "Tyrone" can too.
Output from DeepSeek:
1. Incorrect Handling of Remainder:
     The code calculates the quotient and remainder of N / stride using the div instruction. The quotient (rax) is used to determine how many full 64-byte blocks to copy, and the remainder (rdx) is used to handle the remaining elements.
     However, the code assumes that the remainder (rdx) will always be less than or equal to 16 (the stride). If N is not a multiple of stride, the remainder could be larger than 16, which would lead to incorrect behavior when creating the mask for the final partial load.
2. Mask Calculation:
     The mask calculation for the final partial load is incorrect. The shld instruction is used to create a mask, but it doesn't correctly handle the case where the remainder is zero. If the remainder is zero, the mask should be zero, but the current code will still attempt to load and store data, which could lead to accessing out-of-bounds memory.
3. Alignment Issues:
     The data_in and data_out arrays are aligned to 64 bytes, which is correct for AVX-512 operations. However, the code does not ensure that the final partial load is also aligned, which could lead to performance issues or crashes on some architectures.
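
For anyone who wants to check the three points above against the posted listing, the arithmetic can be modeled in a few lines of Python. This is only a sketch, not a replacement for assembling and running the code. The helper names (shld64, tail_mask) are made up for illustration, and the offsets assume the usual NASM/ELF layout in which the .data and .bss sections each start on a 64-byte boundary because of the align 64 and alignb 64 directives.

```python
# Model of the integer arithmetic in the posted NASM listing.
# Helper names are hypothetical; offsets assume .data and .bss each
# start at a 64-byte-aligned address.

MASK64 = (1 << 64) - 1

def shld64(dest, src, count):
    """Model of 64-bit SHLD: shift dest left, filling from the top bits of src."""
    count &= 63                      # the CPU masks the count to 6 bits
    if count == 0:
        return dest                  # a zero count leaves dest unchanged
    return ((dest << count) | (src >> (64 - count))) & MASK64

def tail_mask(remainder):
    """Mask built by: xor r11,r11 / mov r10,-1 / shld r11,r10,cl."""
    return shld64(0, MASK64, remainder)

N, STRIDE = 37, 16                   # values from the .data segment
full_blocks, remainder = divmod(N, STRIDE)   # what div leaves in rax, rdx

# Layout of .data after "align 64": N (8 bytes), stride (8 bytes), data_in.
DATA_IN_OFFSET = 8 + 8

# The last store is unmasked, so it writes a full 64 bytes at data_out + 128.
tail_store_end = full_blocks * 64 + 64
reserved_bytes = N * 4               # resd 37

print(full_blocks, remainder)        # 2 5
print(hex(tail_mask(remainder)))     # 0x1f
print(hex(tail_mask(0)))             # 0x0
print(DATA_IN_OFFSET % 64)           # 16
print(tail_store_end - reserved_bytes)  # 44
```

Under those assumptions the model gives two full blocks and a remainder of 5 (a div remainder is always smaller than the divisor), a tail mask of 0x1f, and a zero mask when the remainder is zero. It also puts data_in 16 bytes past the aligned start of .data, since the align 64 directive comes before N and stride rather than before data_in. vmovdqa32 with a 64-byte memory operand requires a 64-byte-aligned address and raises a general-protection fault otherwise. The final unmasked store reaches 44 bytes past the 148 bytes reserved for data_out.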

