The prototypes for Intel® Streaming SIMD Extensions (Intel® SSE) intrinsics for load operations are in the xmmintrin.h header file.
The result of each intrinsic operation is placed in a 128-bit register, illustrated for each intrinsic as R0-R3. R0, R1, R2 and R3 each represent one of the four 32-bit pieces of the result register, with R0 being the lowest.
Intrinsic Name | Operation | Corresponding Instruction
---|---|---
_mm_loadh_pi | Load high | MOVHPS reg, mem
_mm_loadl_pi | Load low | MOVLPS reg, mem
_mm_load_ss | Load the low value and clear the three high values | MOVSS
_mm_load1_ps | Load one value into all four words | MOVSS + Shuffling
_mm_load_ps | Load four values, address aligned | MOVAPS
_mm_loadu_ps | Load four values, address unaligned | MOVUPS
_mm_loadr_ps | Load four values in reverse | MOVAPS + Shuffling
__m128 _mm_loadh_pi(__m128 a, __m64 const *p)
Sets the upper two single-precision floating-point (SP FP) values with 64 bits of data loaded from the address p; the lower two values are passed through from a.
R0 | R1 | R2 | R3
---|---|---|---
a0 | a1 | *p0 | *p1
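The lane layout above can be illustrated with a minimal sketch. The helper name `load_high_pair` is hypothetical, and the cast from `float *` to `__m64 const *` assumes the two floats are stored contiguously in memory:

```c
#include <assert.h>
#include <xmmintrin.h>

/* Hypothetical helper: keep the lower two floats of `a`, replace the
   upper two with the pair stored at `hi`, and write all four to `out`. */
static void load_high_pair(const float a[4], const float hi[2], float out[4])
{
    assert(a && hi && out);
    __m128 v = _mm_loadu_ps(a);               /* v = a[0] a[1] a[2] a[3]   */
    v = _mm_loadh_pi(v, (const __m64 *)hi);   /* v = a[0] a[1] hi[0] hi[1] */
    _mm_storeu_ps(out, v);
}
```

With `a = {1, 2, 3, 4}` and `hi = {10, 20}`, the output array holds `{1, 2, 10, 20}`.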
__m128 _mm_loadl_pi(__m128 a, __m64 const *p)
Sets the lower two SP FP values with 64 bits of data loaded from the address p; the upper two values are passed through from a.
R0 | R1 | R2 | R3
---|---|---|---
*p0 | *p1 | a2 | a3
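One possible use of the two half-register loads together is assembling a full vector from two separate pairs of floats. The sketch below assumes each pair is contiguous in memory; `combine_pairs` is a hypothetical name chosen for illustration:

```c
#include <assert.h>
#include <xmmintrin.h>

/* Hypothetical helper: build {lo[0], lo[1], hi[0], hi[1]} from two
   separately stored pairs of floats. */
static void combine_pairs(const float lo[2], const float hi[2], float out[4])
{
    assert(lo && hi && out);
    __m128 v = _mm_setzero_ps();              /* v = 0 0 0 0                 */
    v = _mm_loadl_pi(v, (const __m64 *)lo);   /* v = lo[0] lo[1] 0 0         */
    v = _mm_loadh_pi(v, (const __m64 *)hi);   /* v = lo[0] lo[1] hi[0] hi[1] */
    _mm_storeu_ps(out, v);
}
```

Starting from a zeroed register is only one choice here; any existing `__m128` value could be passed as the first argument, in which case its untouched half is preserved.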
__m128 _mm_load_ss(float *p)
Loads an SP FP value into the low word and clears the upper three words.
R0 | R1 | R2 | R3
---|---|---|---
*p | 0.0 | 0.0 | 0.0
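A minimal sketch of the scalar load follows; the helper `load_scalar` is a hypothetical name, and the result is written back to memory with `_mm_storeu_ps` only so that the cleared upper lanes can be inspected:

```c
#include <assert.h>
#include <xmmintrin.h>

/* Hypothetical helper: load one float into the low lane and expose the
   whole register so the zeroed upper lanes are visible. */
static void load_scalar(const float *p, float out[4])
{
    assert(p && out);
    __m128 v = _mm_load_ss(p);   /* v = *p 0.0 0.0 0.0 */
    _mm_storeu_ps(out, v);
}
```

Calling this with a pointer to `3.5f` leaves `{3.5, 0, 0, 0}` in `out`.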
__m128 _mm_load1_ps(float *p)
Loads a single SP FP value, copying it into all four words.
R0 | R1 | R2 | R3
---|---|---|---
*p | *p | *p | *p
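Loading one value into all four words with `_mm_load1_ps` is commonly used to apply a scalar to a whole vector. The following sketch assumes a hypothetical helper `scale4` that multiplies four floats in place by a single factor:

```c
#include <assert.h>
#include <xmmintrin.h>

/* Hypothetical helper: multiply four floats in place by one scalar. */
static void scale4(const float *scale, float v[4])
{
    assert(scale && v);
    __m128 s = _mm_load1_ps(scale);        /* s = *scale in all four lanes */
    __m128 x = _mm_loadu_ps(v);            /* x = v[0] v[1] v[2] v[3]      */
    _mm_storeu_ps(v, _mm_mul_ps(x, s));    /* v[i] = v[i] * *scale         */
}
```

The unaligned load and store are used so the sketch makes no assumption about how `v` is allocated.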
__m128 _mm_load_ps(float *p)
Loads four SP FP values. The address must be 16-byte-aligned.
R0 | R1 | R2 | R3
---|---|---|---
p[0] | p[1] | p[2] | p[3]
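The alignment requirement can be made explicit in code. This sketch assumes a C11 compiler (for `_Alignas`) and uses a hypothetical helper, `sum4_aligned`, that checks the address before the aligned load:

```c
#include <assert.h>
#include <stdint.h>
#include <xmmintrin.h>

/* Hypothetical helper: sum four floats stored at a 16-byte-aligned address. */
static float sum4_aligned(const float *p)
{
    /* _mm_load_ps requires a 16-byte-aligned address. */
    assert(((uintptr_t)p & 15u) == 0);

    __m128 v = _mm_load_ps(p);       /* v = p[0] p[1] p[2] p[3] */

    _Alignas(16) float lanes[4];     /* aligned scratch buffer  */
    _mm_store_ps(lanes, v);
    return lanes[0] + lanes[1] + lanes[2] + lanes[3];
}
```

A caller would typically declare its data as `_Alignas(16) float data[4]` or obtain it from an allocator that guarantees 16-byte alignment.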
__m128 _mm_loadu_ps(float *p)
Loads four SP FP values. The address need not be 16-byte-aligned.
R0 | R1 | R2 | R3
---|---|---|---
p[0] | p[1] | p[2] | p[3]
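The unaligned load `_mm_loadu_ps` is useful when four values start at an arbitrary element of a larger buffer. In this sketch, `copy4_from` is a hypothetical helper, and the caller is assumed to guarantee that `buf` holds at least `offset + 4` floats:

```c
#include <assert.h>
#include <stddef.h>
#include <xmmintrin.h>

/* Hypothetical helper: copy four consecutive floats starting at
   buf[offset], with no alignment requirement on the address. */
static void copy4_from(const float *buf, size_t offset, float out[4])
{
    assert(buf && out);
    __m128 v = _mm_loadu_ps(buf + offset);   /* v = buf[offset .. offset+3] */
    _mm_storeu_ps(out, v);
}
```

With an offset of 1, the source address is generally not 16-byte-aligned, which is the case the aligned load cannot handle.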
__m128 _mm_loadr_ps(float *p)
Loads four SP FP values in reverse order. The address must be 16-byte-aligned.
R0 | R1 | R2 | R3
---|---|---|---
p[3] | p[2] | p[1] | p[0]
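As a minimal sketch of the reversed load, the hypothetical helper `reverse4` below copies four floats in reverse order. It assumes both pointers are 16-byte-aligned and checks this before use:

```c
#include <assert.h>
#include <stdint.h>
#include <xmmintrin.h>

/* Hypothetical helper: write src[3], src[2], src[1], src[0] to dst.
   Both addresses must be 16-byte-aligned. */
static void reverse4(const float *src, float *dst)
{
    assert(((uintptr_t)src & 15u) == 0);
    assert(((uintptr_t)dst & 15u) == 0);

    __m128 v = _mm_loadr_ps(src);   /* v = src[3] src[2] src[1] src[0] */
    _mm_store_ps(dst, v);
}
```

With `src = {1, 2, 3, 4}`, `dst` receives `{4, 3, 2, 1}`.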
Copyright © 1996-2011, Intel Corporation. All rights reserved.