In Intel® processors, the flush-to-zero (FTZ) and denormals-are-zero (DAZ) flags in the MXCSR register control how floating-point calculations handle denormal values. Intel® Streaming SIMD (Single Instruction Multiple Data) Extensions (Intel® SSE) and Intel® SSE2 instructions, both scalar and vector, benefit from enabling the FTZ and DAZ flags. Floating-point computations that use these instructions run faster when the flags are enabled, which improves application performance.
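As a minimal sketch of how a program might inspect these flags, the following C fragment reads MXCSR with the `_mm_getcsr` intrinsic and tests the FTZ bit (bit 15) and the DAZ bit (bit 6). The helper names `ftz_enabled` and `daz_enabled` and the two mask macros are hypothetical and introduced only for illustration.

```c
#include <stdbool.h>
#include <xmmintrin.h>  /* _mm_getcsr, _mm_setcsr */

/* MXCSR bit masks (names are illustrative, not from any header). */
#define MXCSR_FTZ_BIT 0x8000u  /* bit 15: flush-to-zero */
#define MXCSR_DAZ_BIT 0x0040u  /* bit 6: denormals-are-zero */

/* Returns true if the FTZ flag is set for the calling thread. */
static bool ftz_enabled(void) {
    return (_mm_getcsr() & MXCSR_FTZ_BIT) != 0;
}

/* Returns true if the DAZ flag is set for the calling thread. */
static bool daz_enabled(void) {
    return (_mm_getcsr() & MXCSR_DAZ_BIT) != 0;
}
```

Calling `ftz_enabled()` and `daz_enabled()` at the start of `main` is one way to check which mode a given build actually starts in.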
You can use the -ftz (Linux* and Mac OS* X) or /Qftz (Windows*) option to flush denormal results to zero when the application is in the gradual underflow mode. This option may improve performance if the denormal values are not critical to your application's behavior. The -ftz and /Qftz options, when applied to the main program, set the FTZ and the DAZ hardware flags. The -no-ftz and /Qftz- options leave the flags as they are.
The following table describes how the compiler processes denormal values based on the status of the FTZ and DAZ flags:
Flag | When set to ON, the compiler... | When set to OFF, the compiler... | Supported on |
---|---|---|---|
FTZ (flush-to-zero) | Sets denormal results from floating-point calculations to zero | Does not change the denormal results | Intel® 64 architecture and some IA-32 architectures |
DAZ (denormals-are-zero) | Treats denormal values used as input to floating-point instructions as zero | Does not change the denormal instruction inputs | Intel® 64 architecture and some IA-32 architectures |
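The difference between the two flags can be sketched in C. This example assumes SSE floating-point code generation (the default on Intel® 64) and a processor that supports DAZ. The helpers `mul_at_runtime`, `tiny_result`, and `denormal_input` are hypothetical names; `volatile` is used only to keep the compiler from evaluating the products at compile time or moving them across the MXCSR updates.

```c
#include <float.h>
#include <xmmintrin.h>  /* _MM_SET_FLUSH_ZERO_MODE, _mm_getcsr, _mm_setcsr */
#include <pmmintrin.h>  /* _MM_SET_DENORMALS_ZERO_MODE */

/* Multiply at run time; volatile forces real loads, a real multiply,
   and a store before any later MXCSR change. */
static float mul_at_runtime(float a, float b) {
    volatile float x = a, y = b;
    volatile float r = x * y;
    return r;
}

/* FTZ case: normal inputs, denormal result (FLT_MIN * 0.3f is below FLT_MIN). */
static float tiny_result(int ftz_on) {
    unsigned int saved = _mm_getcsr();
    _MM_SET_FLUSH_ZERO_MODE(ftz_on ? _MM_FLUSH_ZERO_ON : _MM_FLUSH_ZERO_OFF);
    _MM_SET_DENORMALS_ZERO_MODE(_MM_DENORMALS_ZERO_OFF);
    float r = mul_at_runtime(FLT_MIN, 0.3f);
    _mm_setcsr(saved);  /* restore the caller's mode */
    return r;
}

/* DAZ case: denormal input (1.0e-40f), normal result (about 1.0e-30f). */
static float denormal_input(int daz_on) {
    unsigned int saved = _mm_getcsr();
    _MM_SET_FLUSH_ZERO_MODE(_MM_FLUSH_ZERO_OFF);
    _MM_SET_DENORMALS_ZERO_MODE(daz_on ? _MM_DENORMALS_ZERO_ON
                                       : _MM_DENORMALS_ZERO_OFF);
    float r = mul_at_runtime(1.0e-40f, 1.0e10f);
    _mm_setcsr(saved);  /* restore the caller's mode */
    return r;
}
```

With FTZ on, `tiny_result(1)` should return zero because the result is flushed. With DAZ on, `denormal_input(1)` should return zero because the denormal operand is read as zero, even though the mathematically exact product is a normal number.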
The DAZ and FTZ flags are not compatible with IEEE Standard 754, so you should only consider enabling them when strict compliance with the IEEE standard is not required.
Options -ftz and /Qftz are performance options. Setting these options does not guarantee that all denormals in a program are flushed to zero. They only cause denormals generated at run-time to be flushed to zero.
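To illustrate why some denormals can survive, here is a hedged sketch in C: a denormal constant that is only loaded and stored, with no arithmetic applied, keeps its bit pattern even when both flags are set. The helpers `float_bits`, `is_denormal_bits`, and `copy_constant_with_ftz_daz` are hypothetical, and the example assumes IEEE 754 single-precision `float` and a processor that supports DAZ.

```c
#include <stdint.h>
#include <string.h>
#include <xmmintrin.h>  /* _MM_SET_FLUSH_ZERO_MODE, _mm_getcsr, _mm_setcsr */
#include <pmmintrin.h>  /* _MM_SET_DENORMALS_ZERO_MODE */

/* Reinterpret a float as its 32-bit pattern without arithmetic. */
static uint32_t float_bits(float f) {
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}

/* Denormal: exponent field is zero and fraction field is nonzero. */
static int is_denormal_bits(uint32_t u) {
    return (u & 0x7F800000u) == 0 && (u & 0x007FFFFFu) != 0;
}

/* Copy a denormal constant while FTZ and DAZ are both enabled,
   then return the bit pattern of the copy. */
static uint32_t copy_constant_with_ftz_daz(void) {
    unsigned int saved = _mm_getcsr();
    _MM_SET_FLUSH_ZERO_MODE(_MM_FLUSH_ZERO_ON);
    _MM_SET_DENORMALS_ZERO_MODE(_MM_DENORMALS_ZERO_ON);
    volatile float constant = 1.0e-40f;  /* denormal literal */
    float copy = constant;               /* plain load, no arithmetic */
    uint32_t bits = float_bits(copy);
    _mm_setcsr(saved);                   /* restore the caller's mode */
    return bits;
}
```

In this sketch the copied value is still a denormal; it would only be treated as zero once it is used as an operand of a floating-point arithmetic instruction.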
On Intel® 64 and IA-32 systems, the compiler, by default, inserts code into the main routine to set the FTZ and DAZ flags. When the -ftz or /Qftz option is used on IA-32 systems with the option -msse2 or /arch:sse2, the compiler will insert code to conditionally set FTZ/DAZ flags based on a run-time processor check. The -no-ftz (Linux* and Mac OS* X) or /Qftz- (Windows*) option will prevent the compiler from inserting any code that might set FTZ or DAZ flags.
When -ftz or /Qftz is used in combination with an Intel® SSE-enabling option on systems based on the IA-32 architecture (for example, -msse2 or /arch:sse2), the compiler will insert code in the main routine to set FTZ and DAZ. When -ftz or /Qftz is used without such an option, the compiler will insert code to conditionally set FTZ/DAZ based on a run-time processor check.
The -ftz or /Qftz option only has an effect when the main program is being compiled. It sets the FTZ/DAZ mode for the process. The initial thread and any threads subsequently created by that process will operate in the FTZ/DAZ mode.
On systems based on the IA-32 and Intel® 64 architectures, every O optimization level except O0 sets -ftz (Linux* and Mac OS* X) or /Qftz (Windows*).
If this option produces undesirable results in the numerical behavior of your program, you can turn the FTZ/DAZ mode off by using -no-ftz or /Qftz- on the command line while still benefiting from the O3 optimizations.
For some non-Intel processors, you can set the flags manually with the following macros:
Feature | Examples |
---|---|
Enable FTZ | _MM_SET_FLUSH_ZERO_MODE(_MM_FLUSH_ZERO_ON) |
Enable DAZ | _MM_SET_DENORMALS_ZERO_MODE(_MM_DENORMALS_ZERO_ON) |
The prototypes for these macros are in xmmintrin.h (FTZ) and pmmintrin.h (DAZ).
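A minimal sketch of setting the flags manually from C, using the `_MM_SET_FLUSH_ZERO_MODE` macro from xmmintrin.h and the `_MM_SET_DENORMALS_ZERO_MODE` macro from pmmintrin.h, might look like the following. The wrapper `enable_ftz_daz` is a hypothetical helper. It assumes the processor supports DAZ; on a processor without DAZ support, setting that bit may fault.

```c
#include <xmmintrin.h>  /* _MM_SET_FLUSH_ZERO_MODE, _MM_GET_FLUSH_ZERO_MODE */
#include <pmmintrin.h>  /* _MM_SET_DENORMALS_ZERO_MODE, _MM_GET_DENORMALS_ZERO_MODE */

/* Enable FTZ and DAZ for the calling thread.
   Returns 1 if both modes read back as enabled, 0 otherwise. */
static int enable_ftz_daz(void) {
    _MM_SET_FLUSH_ZERO_MODE(_MM_FLUSH_ZERO_ON);
    _MM_SET_DENORMALS_ZERO_MODE(_MM_DENORMALS_ZERO_ON);
    return _MM_GET_FLUSH_ZERO_MODE() == _MM_FLUSH_ZERO_ON
        && _MM_GET_DENORMALS_ZERO_MODE() == _MM_DENORMALS_ZERO_ON;
}
```

Because MXCSR is per-thread state, a helper like this affects only the thread that calls it, so one reasonable design is to call it at the start of `main` before any worker threads are created.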
Intel® 64 and IA-32 Architectures Software Developer's Manual, Volume 1: Basic Architecture
Copyright © 1996-2011, Intel Corporation. All rights reserved.