What does this SSE code do?
Output of optimizing GCC 4.8.2 (x86-64, Intel syntax):
f:
        xor     rax, rax
.L4:
        movdqu  xmm0, XMMWORD PTR [rsi+rax]
        movdqu  xmm1, XMMWORD PTR [rdx+rax]
        pmaxub  xmm0, xmm1
        movdqu  XMMWORD PTR [rdi+rax], xmm0
        add     rax, 16
        cmp     rax, 1024
        jne     .L4
        rep ret
More challenges: challenges.re; about solutions: challenges.re/#Solutions.