Intel® C++ Compiler 16.0 User and Reference Guide

offload_wait

Specifies a wait for a previously initiated asynchronous data transfer. This pragma only applies to Intel® MIC Architecture.

Syntax

#pragma offload_wait clause[, clause ...]

Arguments

Required Clauses

target ( target-name [ :target-number ] )

target-name represents the target. Use mic for Intel® Xeon Phi™ products.

target-number is an integer expression whose value is interpreted as follows:

>=0

Executes the statement on a specified target according to the following formula:

target = target-number % number_of_targets

For example, in a system with four targets:

  • Specifying 2 or 6 tells the runtime system to execute the code on target 2, because both 2 % 4 and 6 % 4 equal 2.

  • Specifying 1000 tells the runtime system to execute the code on target 0, because 1000 % 4 equals 0.

<-1

Reserved.
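
The mapping for non-negative target-number values can be checked with a small host-side helper. This is a minimal sketch: the function name resolve_target is hypothetical and is not part of the offload runtime; it only reproduces the formula target = target-number % number_of_targets.

```c
#include <assert.h>

/* Hypothetical helper (not a runtime API): maps a non-negative
   target-number to a device index using
   target = target-number % number_of_targets. */
int resolve_target(int target_number, int number_of_targets) {
  assert(target_number >= 0);     /* values below -1 are reserved */
  assert(number_of_targets > 0);  /* at least one target must exist */
  return target_number % number_of_targets;
}
```

With four targets, resolve_target(2, 4) and resolve_target(6, 4) both yield 2, and resolve_target(1000, 4) yields 0, matching the cases listed above.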

wait ( tag [, tag, ...] )

Specifies a wait until a previously initiated asynchronous data transfer or asynchronous computation is completed.

tag is an expression that is a pointer-size value in the baseline language. This expression serves as a handle on a previously initiated asynchronous activity which used the same expression value in a signal clause. The activity could be an asynchronous computation or asynchronous data transfer.

This clause refers to a specific target device, so you must specify a target-number in the target clause that is greater than or equal to zero.

The signal associated with the handle is cleared following completion of the previously initiated asynchronous data transfer or asynchronous computation. Querying a signal before the signal has been initiated results in undefined behavior and a runtime abort of the application. For example, querying a signal on target:0 that was initiated for target:1 results in a runtime abort of the application because the signal was initiated for target:1, so there is no signal associated with target:0.

Optional Clauses

if (if-clause)

A Boolean expression.

If the expression evaluates to true, the runtime waits until the tags are signaled.

If the expression evaluates to false, the clause is ignored.

Use the same expression that you used to start the asynchronous computation or data transfer with offload or offload_transfer.

Note

Use this clause to control whether offload is enabled. A set of related pragmas should use this clause in a coordinated fashion, so that either all or none of the related offload statements are enabled.
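
The coordinated use described in this note can be sketched as follows. This is an illustrative sketch only: the function name send_and_wait and the parameter use_offload are hypothetical, and the clause values (alloc_if, free_if) are chosen for illustration. Compilers without offload support ignore the pragmas, in which case the function does nothing.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: starts an asynchronous input transfer and waits
   for it. The same condition, use_offload, gates BOTH pragmas so that
   they are enabled or disabled together. */
void send_and_wait(float *data, int n, int use_offload) {
  assert(data != NULL && n > 0);

  /* Start the transfer only when offload is enabled; the pointer data
     also serves as the signal tag. */
  #pragma offload_transfer target(mic:0) if(use_offload) in(data : length(n) alloc_if(1) free_if(0)) signal(data)

  /* Same expression as above: wait only if the transfer was started. */
  #pragma offload_wait target(mic:0) if(use_offload) wait(data)
}
```

Because both pragmas test the same expression, the wait never queries a signal that was not initiated.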

status ( statusvarname )

Determines the status of the execution of an offloading construct. The statusvarname variable contains the value that explains the status of the execution. The Description section below explains how to initialize a status variable and the possible values for this variable.

stream ( handle )

Causes a wait for all offloads to the stream specified by handle to complete.

If handle is 0 and the target clause specifies a generic Intel® MIC Architecture device, for example target(mic), then a wait occurs for all offloads on all devices to complete.

If handle is 0 but the target clause specifies a specific Intel® MIC Architecture device, then a wait occurs for all streams on that device to complete. For example, if target(mic:1) is specified, then a wait occurs for all offloads to device 1 to complete.

For more information about streams, see Offload Using Streams.

Description

This pragma specifies a wait for the completion of a previously initiated asynchronous data transfer done by offload_transfer, or an asynchronous computation and return data transfer, if any, done by offload.

The status clause requires a statusvarname variable. Initialize the statusvarname variable by using the OFFLOAD_STATUS_INIT( statusvarname ) macro. The possible values of the status variable are defined in offload.h:

Value

Description

OFFLOAD_SUCCESS = 0

The statements were successfully executed on the target.

OFFLOAD_DISABLED

The statements were not executed on the target. If you specified if-clause and the value of this clause is false, the statements were successfully executed on the host.

OFFLOAD_UNAVAILABLE

The statements were not executed on the target because the target was unavailable.

OFFLOAD_OUT_OF_MEMORY

The statements were not executed on the target because there was not enough memory available for offload-parameter.

OFFLOAD_PROCESS_DIED

The statements were not executed on the target because a runtime error occurred on the target that caused the target process to terminate.

OFFLOAD_ERROR

The statements were not executed on the target because of an error.
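
One way to act on these values is a small host-side helper that turns a status value into a message and a yes/no decision. The sketch below is self-contained and makes assumptions: the demo_ names are hypothetical stand-ins, and apart from OFFLOAD_SUCCESS = 0 the numeric values are arbitrary. Real code would include offload.h and use the definitions it provides.

```c
#include <assert.h>

/* Hypothetical stand-in for the status values in offload.h. Only
   OFFLOAD_SUCCESS = 0 is stated in this guide; the other numeric
   values below are arbitrary and for illustration only. */
typedef enum {
  DEMO_OFFLOAD_SUCCESS = 0,
  DEMO_OFFLOAD_DISABLED,
  DEMO_OFFLOAD_UNAVAILABLE,
  DEMO_OFFLOAD_OUT_OF_MEMORY,
  DEMO_OFFLOAD_PROCESS_DIED,
  DEMO_OFFLOAD_ERROR
} demo_offload_result;

/* Returns a short description that follows the table above. */
const char *demo_describe(demo_offload_result r) {
  assert((int)r >= 0 && (int)r <= (int)DEMO_OFFLOAD_ERROR);
  switch (r) {
    case DEMO_OFFLOAD_SUCCESS:       return "executed on the target";
    case DEMO_OFFLOAD_DISABLED:      return "not executed on the target (offload disabled)";
    case DEMO_OFFLOAD_UNAVAILABLE:   return "target unavailable";
    case DEMO_OFFLOAD_OUT_OF_MEMORY: return "not enough memory on the target";
    case DEMO_OFFLOAD_PROCESS_DIED:  return "target process terminated";
    case DEMO_OFFLOAD_ERROR:         return "error";
  }
  return "unknown";
}

/* Returns 1 only when the statements ran on the target. */
int demo_ran_on_target(demo_offload_result r) {
  return r == DEMO_OFFLOAD_SUCCESS;
}
```

A caller could log demo_describe(...) and fall back to host-side work whenever demo_ran_on_target(...) returns 0.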

Examples

Example: Double buffer inputs to an offload

#pragma offload_attribute(push, target(mic))
int count = 25000000;
int iter = 10;
float *in1, *out1;
float *in2, *out2;
#pragma offload_attribute(pop)


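void do_async_in() {
  int i;
  /* Start the first asynchronous transfer of in1; in1 also serves as the signal tag. */
  #pragma offload_transfer target(mic:0) in(in1 : length(count) alloc_if(0) free_if(0) ) signal(in1)
  for (i=0; i<iter; i++) {
    if (i%2 == 0) {
      /* Start sending in2 for the next iteration, except on the last iteration. */
      #pragma offload_transfer target(mic:0) if(i!=iter-1) in(in2 : length(count) alloc_if(0) free_if(0) ) signal(in2)
      /* Wait for in1 to arrive, then compute on it while in2 is in flight. */
      #pragma offload target(mic:0) nocopy(in1) wait(in1) out(out1 : length(count) alloc_if(0) free_if(0) )
      compute(in1, out1);
    } else {
      /* Start sending in1 for the next iteration, except on the last iteration. */
      #pragma offload_transfer target(mic:0) if(i!=iter-1) in(in1 : length(count) alloc_if(0) free_if(0) ) signal(in1)
      /* Wait for in2 to arrive, then compute on it while in1 is in flight. */
      #pragma offload target(mic:0) nocopy(in2) wait(in2) out(out2 : length(count) alloc_if(0) free_if(0) )
      compute(in2, out2);
    }
  }
}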

Example: Output double-buffered results of an offload

#pragma offload_attribute(push, target(mic))
int count = 25000000;
int iter = 10;
float *in1, *out1;
float *in2, *out2;
#pragma offload_attribute(pop)

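void do_async_out() {
  int i;
  /* One extra iteration drains the result of the final computation. */
  for (i=0; i<iter+1; i++) {
    if (i%2 == 0) {
      if (i<iter) {
        /* Compute into out1 on the target, then start copying it back asynchronously. */
        #pragma offload target(mic:0) in(in1 : length(count) alloc_if(0) free_if(0) ) nocopy(out1)
        compute(in1, out1);
        #pragma offload_transfer target(mic:0) out(out1 : length(count) alloc_if(0) free_if(0) ) signal(out1)
      }
      if (i>0) {
        /* Wait for the out2 transfer started in the previous iteration. */
        #pragma offload_wait target(mic:0) wait(out2)
        use_result(out2);
      }
    } else {
      if (i<iter) {
        /* Compute into out2 on the target, then start copying it back asynchronously. */
        #pragma offload target(mic:0) in(in2 : length(count) alloc_if(0) free_if(0) ) nocopy(out2)
        compute(in2, out2);
        #pragma offload_transfer target(mic:0) out(out2 : length(count) alloc_if(0) free_if(0) ) signal(out2)
      }
      /* Odd iterations always follow an even one, so an out1 transfer is pending. */
      #pragma offload_wait target(mic:0) wait(out1)
      use_result(out1);
    }
  }
}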
