Intel® C++ Compiler 16.0 User and Reference Guide
Provides the ability to vectorize user-defined functions and loops.
Windows* OS: __declspec(vector(clauses))
Linux* OS: __attribute__((vector(clauses)))
clauses
Is one of the following:
processor clause, in the form processor(cpuid). This clause creates a vector version of the function for the given target processor (cpuid). See cpu_specific for a list of supported values. The default processor is determined by the implicit or explicit processor- or architecture-specific flag on the compiler command line.
vector length clause, in the form vectorlength(n), where n is the vector length (vl) and must be an integer with the value 2, 4, 8, or 16. This clause tells the compiler that each routine invocation at the call site should execute the computation equivalent to n times the scalar function execution.
linear clause, in the form linear(param1:step1 [, param2:step2]…), where param is a scalar variable and step is a compile-time integer constant expression. This clause tells the compiler that for each consecutive invocation of the routine in a serial execution, the value of param1 is incremented by step1, param2 is incremented by step2, and so on. If more than one step is specified for a particular variable, a compile-time error occurs. Multiple linear clauses are merged as a union.
uniform clause, in the form uniform(param [, param]…), where param is a formal parameter of the specified function. This clause tells the compiler that the values of the specified arguments can be broadcast to all iterations as a performance optimization.
mask clause, in the form [no]mask. This clause tells the compiler to generate a masked vector version of the routine.
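The uniform and linear clauses described above can be illustrated with a minimal sketch. The names VECTOR_FN, scaled_element, and scale_array are hypothetical and do not come from the compiler documentation. The sketch assumes that several clauses may be written as one comma-separated list inside a single vector declaration, and that the __INTEL_COMPILER macro identifies the compiler that supports this keyword. On any other compiler the macro expands to nothing, so the function is an ordinary scalar function.

```cpp
#include <cassert>

// Hypothetical helper macro (not part of the compiler): applies the vector
// declaration only where the Intel compiler is detected.
#if defined(__INTEL_COMPILER) && defined(_WIN32)
#  define VECTOR_FN(...) __declspec(vector(__VA_ARGS__))
#elif defined(__INTEL_COMPILER)
#  define VECTOR_FN(...) __attribute__((vector(__VA_ARGS__)))
#else
#  define VECTOR_FN(...)
#endif

// uniform(src, scale): every invocation receives the same pointer and factor,
//                      so the values can be broadcast.
// linear(i:1):         i increases by 1 from one consecutive invocation to
//                      the next.
VECTOR_FN(uniform(src, scale), linear(i:1))
float scaled_element(const float* src, float scale, int i) {
    return src[i] * scale;
}

// Call site whose arguments match the declared clauses: src and scale stay
// fixed across iterations, and i advances by exactly 1.
void scale_array(const float* src, float* dst, float scale, int n) {
    assert(n >= 0);
    for (int i = 0; i < n; ++i) {
        dst[i] = scaled_element(src, scale, i);
    }
}
```

Passing a value that changes between iterations for a parameter declared uniform, or an index that does not advance by the declared step, would not match the declaration, so the clauses should describe how the function is really called.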
This keyword combines with the map operation at the call site to provide data-parallel semantics. When multiple instances of the vector declaration are invoked in a parallel context, the execution order among them is not sequenced.
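As a sketch of what this means at a call site, the loop below applies a vector-declared function to every element of an array. It assumes that such a loop is one form the map operation can take. The names VECTOR_FN, square_plus_one, and map_square_plus_one are hypothetical, and the #pragma simd line is emitted only when __INTEL_COMPILER is defined. Because the order of invocations is not sequenced, the function is written without shared state, so its result does not depend on which element is processed first.

```cpp
#include <cassert>

// Hypothetical helper macro (not part of the compiler): applies the vector
// declaration only where the Intel compiler is detected.
#if defined(__INTEL_COMPILER) && defined(_WIN32)
#  define VECTOR_FN(...) __declspec(vector(__VA_ARGS__))
#elif defined(__INTEL_COMPILER)
#  define VECTOR_FN(...) __attribute__((vector(__VA_ARGS__)))
#else
#  define VECTOR_FN(...)
#endif

// Pure function: reads only its argument and writes no shared state, so
// invocations may run in any order or side by side in vector lanes.
VECTOR_FN(vectorlength(4))
float square_plus_one(float x) {
    return x * x + 1.0f;
}

// Call site: each iteration writes a distinct element of out, so the result
// is the same whatever order the invocations execute in.
void map_square_plus_one(const float* in, float* out, int n) {
    assert(n >= 0);
#if defined(__INTEL_COMPILER)
#pragma simd
#endif
    for (int i = 0; i < n; ++i) {
        out[i] = square_plus_one(in[i]);
    }
}
```

A function that, for example, appended to a shared buffer or updated a global counter would not be safe here, since nothing in the declaration guarantees the order in which its instances run.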