Intel® C++ Compiler 16.0 User and Reference Guide
This topic only applies to Intel® Many Integrated Core Architecture (Intel® MIC Architecture).
When using Intel's language extensions for offload to Intel® MIC Architecture, the values of pointer variables used in #pragma offload are never modified on the CPU by the process of offloading. The CPU addresses in the pointer variables are used to keep track of dynamic memory allocation on Intel® MIC Architecture. The offload library maintains the association between each CPU address and its corresponding memory on Intel® MIC Architecture, and uses this mapping during data transfer operations.
However, this requires that memory allocated on Intel® MIC Architecture have a matching memory range allocated on the CPU. This can lead to CPU memory being allocated unnecessarily.
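The role of the CPU address as a lookup key can be pictured with a small model. The following is a minimal sketch in plain C, not the offload library's actual data structure: the names range_entry, range_table, table_add, and table_translate are hypothetical and exist only for illustration. It shows why a CPU range is needed in this scheme: without one, there is no key under which the memory on Intel® MIC Architecture could be found.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model only: one entry associates a CPU address range with
   the base address of the matching memory on the coprocessor. */
typedef struct {
    uintptr_t cpu_base;   /* start of the CPU range (the lookup key) */
    size_t    length;     /* size of the range in bytes */
    uintptr_t mic_base;   /* start of the associated coprocessor memory */
} range_entry;

#define MAX_ENTRIES 16

typedef struct {
    range_entry entries[MAX_ENTRIES];
    size_t      count;
} range_table;

/* Record an association; returns 0 on success, -1 if the table is full. */
static int table_add(range_table *t, uintptr_t cpu_base, size_t length,
                     uintptr_t mic_base)
{
    assert(t != NULL);
    if (t->count == MAX_ENTRIES)
        return -1;
    t->entries[t->count].cpu_base = cpu_base;
    t->entries[t->count].length   = length;
    t->entries[t->count].mic_base = mic_base;
    t->count++;
    return 0;
}

/* Translate a CPU address into the associated coprocessor address.
   Returns 0 when the address lies in no recorded range. */
static uintptr_t table_translate(const range_table *t, uintptr_t cpu_addr)
{
    size_t i;
    assert(t != NULL);
    for (i = 0; i < t->count; ++i) {
        const range_entry *e = &t->entries[i];
        if (cpu_addr >= e->cpu_base && cpu_addr - e->cpu_base < e->length)
            return e->mic_base + (cpu_addr - e->cpu_base);
    }
    return 0;
}
```

In this model, a data transfer for a pointer would first call table_translate on the CPU address to find where the data lives on the coprocessor. The addresses used to exercise it are arbitrary numbers, not real allocations.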
The targetptr and preallocated modifiers allow dynamic memory allocation on Intel® MIC Architecture without matching memory allocation on the CPU.
The targetptr modifier permits allocating memory for Intel® MIC Architecture only.
If the targetptr modifier is used when memory is allocated using alloc_if(1), then the CPU variable is updated with the address of the memory allocated on Intel® MIC Architecture. Since this value is not meaningful on the CPU, such a variable can only be used as a source or destination of data transfer in offload pragmas until a valid CPU address is assigned to it.
You can only use the targetptr modifier with mandatory offloads. Other modifiers you can use with targetptr are the offload modifiers align and alloc.
The alloc_if and free_if modifiers control both memory allocation on Intel® MIC Architecture and the addition or removal of the allocated memory ranges in the tables maintained by the offload runtime.
The following example shows how to allocate memory using #pragma offload_transfer, #pragma offload, and modifier targetptr:
    #include <stdlib.h>   // malloc

    #define ALLOC alloc_if(1) free_if(0)
    #define REUSE alloc_if(0) free_if(0)
    #define FREE  alloc_if(0) free_if(1)

    int* cpu_p = (int*)malloc(1000*sizeof(int));
    __declspec(target(mic)) int* mic_p;

    // Allocate memory on Intel® MIC Architecture using targetptr
    // mic_p is undefined before, has MIC memory address after
    #pragma offload_transfer target(mic) \
        nocopy(mic_p[0:1000] : ALLOC targetptr)

    // Transfer data from CPU to Intel® MIC Architecture and do some work
    #pragma offload target(mic) \
        in(cpu_p[0:1000] : into(mic_p[0:1000]) REUSE targetptr)
    {
        … = *mic_p;    // Use data pointed to by mic_p
        *mic_p = …;    // Update values pointed to by mic_p
    }

    // Return data from Intel® MIC Architecture to CPU and free the memory
    #pragma offload_transfer target(mic) \
        out(mic_p[0:1000] : into(cpu_p[0:1000]) FREE targetptr)
You can use the preallocated modifier together with the targetptr modifier to manage memory allocation on Intel® MIC Architecture yourself. It specifies that the memory on Intel® MIC Architecture has already been allocated by the user and only has to be made available for data transfer.
With preallocated, the alloc_if and free_if modifiers only add or remove entries in the offload library tables; they do not control memory allocation. The programmer manages the allocation.
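The difference between targetptr alone and preallocated targetptr can be summarized as a small state model. This is an illustrative sketch in plain C, not runtime code: range_state, apply_alloc_if, and apply_free_if are hypothetical names, and the two flags are an assumed simplification of what the offload runtime tracks.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one memory range on the coprocessor. */
typedef struct {
    bool runtime_allocated;  /* memory was allocated by the offload runtime */
    bool registered;         /* range is present in the runtime tables */
} range_state;

/* State after the alloc_if part of an offload has been applied. */
static range_state apply_alloc_if(range_state s, bool preallocated,
                                  bool alloc_if)
{
    /* In this model a preallocated range is never runtime-allocated. */
    assert(!(preallocated && s.runtime_allocated));
    if (alloc_if) {
        if (!preallocated)
            s.runtime_allocated = true;  /* targetptr alone: runtime allocates */
        s.registered = true;             /* both cases: range enters the tables */
    }
    return s;
}

/* State after the free_if part of an offload has been applied. */
static range_state apply_free_if(range_state s, bool preallocated,
                                 bool free_if)
{
    assert(!(preallocated && s.runtime_allocated));
    if (free_if) {
        if (!preallocated)
            s.runtime_allocated = false; /* targetptr alone: runtime frees */
        s.registered = false;            /* both cases: range leaves the tables */
    }
    return s;
}
```

With preallocated set, the runtime_allocated flag never changes, which mirrors the statement that the programmer, not the runtime, owns the allocation.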
The combination preallocated targetptr alloc_if(1) cannot be used with an in clause because although the memory may already be allocated on Intel® MIC Architecture, the memory address is not yet available on the CPU for data transfer. The modifiers preallocated targetptr alloc_if(1) are permissible with clause out or nocopy. At the end of the offload, the memory address on Intel® MIC Architecture is brought to the CPU. Once on the CPU, the variable can be used in future offloads to access that memory.
The following table explains the actions taken for preallocated targetptr with various combinations of clauses in, out, inout, and nocopy, and modifiers alloc_if and free_if.
Clause | Modifiers | Effect |
---|---|---|
in | alloc_if(0) free_if(0) | Data is transferred into the variable previously registered as preallocated targetptr. |
in | alloc_if(0) free_if(1) | Data is transferred into the variable previously registered as preallocated targetptr, and registration is removed after the offload. |
in | alloc_if(1) free_if(0) | Disallowed. |
in | alloc_if(1) free_if(1) | Disallowed. |
out | alloc_if(0) free_if(0) | Data is transferred from the variable previously registered as preallocated targetptr. |
out | alloc_if(0) free_if(1) | Data is transferred from the variable previously registered as preallocated targetptr, and registration is removed after the offload. |
out | alloc_if(1) free_if(0) | At the end of offload, the variable is registered as preallocated targetptr and data is transferred from it. |
out | alloc_if(1) free_if(1) | At the end of offload, the variable is registered as preallocated targetptr, data is transferred from it, and then the registration is removed. |
inout | alloc_if(0) free_if(0) | Data is transferred into the variable previously registered as preallocated targetptr at the start of offload, and from the variable at the end of offload. |
inout | alloc_if(0) free_if(1) | Data is transferred into the variable previously registered as preallocated targetptr at the start of offload. At the end of offload, data is transferred from the variable and registration is removed. |
inout | alloc_if(1) free_if(0) | Disallowed. |
inout | alloc_if(1) free_if(1) | Disallowed. |
nocopy | alloc_if(0) free_if(0) | No data transfer or registration changes occur. |
nocopy | alloc_if(0) free_if(1) | No data transfer is done. At the end of offload, registration is removed. |
nocopy | alloc_if(1) free_if(0) | No data transfer is done. At the end of offload, registration is added. |
nocopy | alloc_if(1) free_if(1) | No data transfer or registration changes occur. |
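The table can also be written down as a lookup structure, which may be convenient when checking a combination of clause and modifiers by hand. The sketch below is plain C that transcribes the rows above; it is not part of the compiler or the offload library, and the names clause_kind, offload_effect, and preallocated_effect are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum { CLAUSE_IN, CLAUSE_OUT, CLAUSE_INOUT, CLAUSE_NOCOPY } clause_kind;

/* One row of the table, expressed as flags. */
typedef struct {
    bool allowed;               /* false for the rows marked "Disallowed" */
    bool copy_to_mic;           /* data is transferred into the variable */
    bool copy_from_mic;         /* data is transferred from the variable */
    bool adds_registration;     /* variable is registered at end of offload */
    bool removes_registration;  /* registration is removed at end of offload */
} offload_effect;

/* Indexed as EFFECTS[clause][alloc_if][free_if]. */
static const offload_effect EFFECTS[4][2][2] = {
    /* in */
    { { { true,  true,  false, false, false },
        { true,  true,  false, false, true  } },
      { { false, false, false, false, false },    /* disallowed */
        { false, false, false, false, false } } },
    /* out */
    { { { true,  false, true,  false, false },
        { true,  false, true,  false, true  } },
      { { true,  false, true,  true,  false },
        { true,  false, true,  true,  true  } } },
    /* inout */
    { { { true,  true,  true,  false, false },
        { true,  true,  true,  false, true  } },
      { { false, false, false, false, false },    /* disallowed */
        { false, false, false, false, false } } },
    /* nocopy */
    { { { true,  false, false, false, false },
        { true,  false, false, false, true  } },
      { { true,  false, false, true,  false },
        { true,  false, false, false, false } } } /* no changes, per the table */
};

/* Look up the effect of preallocated targetptr for one combination. */
static offload_effect preallocated_effect(clause_kind c, bool alloc_if,
                                          bool free_if)
{
    assert(c >= CLAUSE_IN && c <= CLAUSE_NOCOPY);
    return EFFECTS[c][alloc_if ? 1 : 0][free_if ? 1 : 0];
}
```

For example, preallocated_effect(CLAUSE_IN, true, false).allowed is false, matching the disallowed in rows, while the out clause with alloc_if(1) free_if(0) reports both a registration and a transfer from the variable.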
The preallocated modifier is useful only with mandatory offloads. With optional offloads, if the offload that allocates memory does not occur, then subsequent offloads that attempt to use the preallocated memory will fail.
The align modifier cannot be used with preallocated because the memory allocation is done by the programmer.
You can use the alloc modifier with preallocated.
The following example shows how to allocate memory using #pragma offload and modifiers preallocated targetptr:
    // Allocate memory on Intel® MIC Architecture using malloc and register it
    #pragma offload target(mic) \
        nocopy(mic_p[0:1000] : ALLOC preallocated targetptr)
    {
        mic_p = user_malloc(1000*sizeof(int));
    }

    // Transfer data from CPU to Intel® MIC Architecture and do some work
    #pragma offload target(mic) \
        in(cpu_p[0:1000] : into(mic_p[0:1000]) REUSE targetptr)
    {
        … = *mic_p;
        *mic_p = …;
    }

    // Remove MIC memory registration and free Intel® MIC Architecture memory
    #pragma offload target(mic) \
        nocopy(mic_p : FREE preallocated targetptr)
    {
        free(mic_p);
    }