Load-on-Demand AIE

If a design has more AIE code than will fit in the AIE array, but some portion of the code can stay idle while another portion runs, load-on-demand (LOD) grouping can be used to swap out an unused portion for the portion that must run at a given time.

Prerequisite

  • This is an advanced use-case for AIE application design. It is assumed that the user is already familiar with basic AIE design using the VSI tool.

Changing the application on the AIE at run time is possible by defining each application to be switched between in its own load-on-demand hierarchy within the AIE context. There are two general options for handling the application switch: PS-only or PL-control. The PS-only strategy places fewer restrictions on the layout of the AIE designs, but is slower to execute an application switch than PL-control. PL-controlled application switching is faster, but its main requirement is that the AIE graph layouts are isomorphic between the switched applications.

More information about utilizing PL-controlled switching can be found here: PLAIC LOD

Both switching strategies require that every LOD app uses the same AIE array interfaces (the interfaces to the outside of the AIE array). For the PS-only strategy, an LOD app may use a subset of the interfaces used by another LOD app. This “sharing” of interfaces outside the array is achieved using “LOD mux” GUI blocks. These blocks are not implemented in logic when the system is generated; they exist only to visually organize the connections during system development. They show where a particular data connection will go depending on which LOD app is running at a given time.
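The interface-sharing rule can be checked on paper before the design is built. The following is a minimal Python sketch and is not part of VSI: the function name `check_lod_interfaces`, the dictionary layout, and the "PS"/"PL" labels are all hypothetical. It assumes each LOD app is described by the set of AIE array interfaces it uses, and it interprets the subset allowance as "every app uses a subset of the interfaces of the widest app".

```python
def check_lod_interfaces(apps, strategy):
    """Return a list of problems with how LOD apps share AIE array interfaces.

    apps:     dict mapping LOD app name -> set of array interface names it uses.
    strategy: "PS" (an app may use a subset of another app's interfaces) or
              "PL" (all apps are assumed to need the identical interface set).
    This is an illustrative model only; it does not call any VSI API.
    """
    if strategy not in ("PS", "PL"):
        raise ValueError("strategy must be 'PS' or 'PL'")
    if not apps:
        return []

    # Use the app with the most interfaces as the reference set.
    widest_name = max(apps, key=lambda name: len(apps[name]))
    widest = apps[widest_name]

    problems = []
    for name, used in sorted(apps.items()):
        if strategy == "PL" and used != widest:
            problems.append(f"{name}: interfaces differ from {widest_name}")
        elif strategy == "PS" and not used <= widest:
            extra = ", ".join(sorted(used - widest))
            problems.append(
                f"{name}: uses interfaces missing from {widest_name}: {extra}"
            )
    return problems


if __name__ == "__main__":
    apps = {
        "lod_app_0": {"aximm_to_aie_0.M00_AXIS", "aie_to_aximm_0.S00_AXIS"},
        "lod_app_1": {"aximm_to_aie_0.M00_AXIS"},
    }
    print(check_lod_interfaces(apps, "PS"))
    print(check_lod_interfaces(apps, "PL"))
```

In this sketch the second app is accepted under the PS-only rule because it uses a subset of the first app's interfaces, while the stricter PL check reports it.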

Load-on-demand AIE apps can be created either from the VSI GUI or with the VSI constraints language. The constraints language has the added advantage of automatically inferring and placing the LOD mux blocks and LOD config blocks, which suits users who prefer to design their entire system in advance.

Creating the LOD Design

The steps below describe how to place a basic LOD AIE app via the GUI, followed by a code snippet showing the equivalent constraints.

  1. To start, it is assumed there are some data-mover IPs in your PL context such as the following.
    alt text
  2. Place the input LOD mux. Right-click on the AIE context (versal_aie) and select “Add IP”. In the selection menu, choose “LOD Mux”. The block will be named lod_mux_0.
    alt text
  3. Repeat the last step to place the output LOD mux. The block will be named lod_mux_1. The AIE context should now look as follows:
    alt text
  4. Double-click on lod_mux_0 to configure it. In the field “Number of load-on-demand groups”, enter 2.
    alt text
  5. Double-click on lod_mux_1 to configure it. In the field “Number of load-on-demand groups”, enter 2. Since this block will be an output, also uncheck “Check for mux, uncheck for demux”.
    alt text
  6. The AIE context should now look as follows:
    alt text
  7. Implement the cross-context connections. Connect aximm_to_aie_0.M00_AXIS to lod_mux_0.S00_00 and aie_to_aximm_0.S00_AXIS to lod_mux_1.M00_00.
    alt text
  8. Place the LOD hierarchy for the first LOD app. Right-click on the AIE context (versal_aie) and select “Create hierarchy”. A config window will pop up; name the hierarchy “lod_app_0” and uncheck “Move selected block to new hierarchy”.
    alt text alt text
  9. Repeat the last step to place the LOD hierarchy for the second LOD app, naming it “lod_app_1” instead. The AIE context should now look as follows:
    alt text
  10. Place the LOD config block for app 0. Click once on lod_app_0 to select it, then right-click on it and select “Add IP”. In the selection menu, choose “Load On Demand”. The block will be named pr_set_0.
    alt text
  11. Place the LOD config block for app 1 by repeating the last step, this time using hierarchy “lod_app_1”. The AIE context should now look as follows:
    alt text
  12. When the LOD config block was placed for LOD app 0, it was automatically assigned LOD Id 0. LOD app 1 must be given a different Id. Double-click on pr_set_0 in lod_app_1 and enter 1 in the “Load On Demand ID” field.
    alt text
  13. At this point the AIE kernel programs can be added to the LOD apps. It is assumed that the user is already familiar with using the VSI Software Import Wizards (vsi_gen_ip) to import kernel code. The next image shows a vsi_gen_ip added to each LOD hierarchy. The “source search paths” setting for both was:
    $VSI_INSTALL/target/common/hls_examples/aie_example_kernels/simple_kernels
    For LOD app 0, function compute0 was selected as the vsi_gen_ip function. For LOD app 1, compute1 was picked.
    alt text
  14. Lastly, complete the design by making the following connections:
    lod_mux_0.M00_00 -> lod_app_0/vsi_gen_ip_0.instream
    lod_mux_0.M00_01 -> lod_app_1/vsi_gen_ip_0.instream
    lod_app_0/vsi_gen_ip_0.outstream -> lod_mux_1.S00_00
    lod_app_1/vsi_gen_ip_0.outstream -> lod_mux_1.S00_01
    alt text
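Because the LOD mux blocks are only a visual aid, it can help to write down which connections are live for each LOD app. The following Python sketch models the connections from steps 7 and 14 as lookup tables. It is an illustration only: the names `INPUT_MUX`, `OUTPUT_MUX`, and `live_path` are hypothetical, and it assumes that mux port index N is the one in use when the app with LOD Id N is loaded, as in this walkthrough.

```python
# lod_mux_0: fed by aximm_to_aie_0.M00_AXIS on S00_00 (step 7).
# Keys are LOD Ids; values are the kernel input reached through that mux port.
INPUT_MUX = {
    0: "lod_app_0/vsi_gen_ip_0.instream",   # via lod_mux_0 port M00_00
    1: "lod_app_1/vsi_gen_ip_0.instream",   # via lod_mux_0 port M00_01
}

# lod_mux_1: drives aie_to_aximm_0.S00_AXIS from M00_00 (step 7).
# Keys are LOD Ids; values are the kernel output feeding that mux port.
OUTPUT_MUX = {
    0: "lod_app_0/vsi_gen_ip_0.outstream",  # via lod_mux_1 port S00_00
    1: "lod_app_1/vsi_gen_ip_0.outstream",  # via lod_mux_1 port S00_01
}


def live_path(lod_id):
    """Return the (source, sink) pairs that carry data while lod_id is loaded."""
    if lod_id not in INPUT_MUX or lod_id not in OUTPUT_MUX:
        raise ValueError(f"no LOD app configured with LOD Id {lod_id}")
    return [
        ("aximm_to_aie_0.M00_AXIS", INPUT_MUX[lod_id]),
        (OUTPUT_MUX[lod_id], "aie_to_aximm_0.S00_AXIS"),
    ]


if __name__ == "__main__":
    for lod_id in (0, 1):
        for source, sink in live_path(lod_id):
            print(f"LOD Id {lod_id}: {source} -> {sink}")
```

The same PL data movers appear in both paths; only the kernel endpoint changes with the loaded app.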

If designing the LOD app using VSI’s constraints language, the placement of the LOD muxes, hierarchies, and config blocks is abstracted away. The process above could be written as follows:

...
# Given some PL data-mover IPs with master/slave IOs, create interface arrays for using them.
pl_to_aie = mm2aie.get_interface("M%_AXIS")
aie_to_pl = aie2mm.get_interface("S%_AXIS")
# Add first LOD app. 0 arg is for lod_region. Being called first will give it lod_id 0.
lod_app_0 = versal_system.versal_aie.add_lod("lod_app_0", 0)
lod_app_0_kern = lod_app_0.add_kernels(compute0)
# Add second LOD app. 0 arg is for lod_region. Being called second will give it lod_id 1.
lod_app_1 = versal_system.versal_aie.add_lod("lod_app_1", 0)
lod_app_1_kern = lod_app_1.add_kernels(compute1)
# Make connections from outside AIE to kernel in LOD app 0.
pl_to_aie[0].connect(lod_app_0_kern.A)
aie_to_pl[0].connect(lod_app_0_kern.B)
# Make connections from outside AIE to kernel in LOD app 1.
pl_to_aie[0].connect(lod_app_1_kern.A)
aie_to_pl[0].connect(lod_app_1_kern.B)
...
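The comments in the snippet above note that the order of the add_lod calls determines each app's lod_id. The toy Python model below illustrates that bookkeeping so the Id passed at activation time can be predicted. It is not VSI code: the class `LodRegistry` is hypothetical, and it assumes that Ids are counted separately per lod_region, which the snippet does not state.

```python
from collections import defaultdict


class LodRegistry:
    """Toy stand-in for call-order Id assignment (illustrative, not the VSI API)."""

    def __init__(self):
        self._next_id = defaultdict(int)  # lod_region -> next free lod_id (assumed per region)
        self.apps = {}                    # app name -> (lod_region, lod_id)

    def add_lod(self, name, lod_region):
        """Register an LOD app and return the lod_id implied by call order."""
        if name in self.apps:
            raise ValueError(f"duplicate LOD app name: {name}")
        lod_id = self._next_id[lod_region]
        self._next_id[lod_region] += 1
        self.apps[name] = (lod_region, lod_id)
        return lod_id


if __name__ == "__main__":
    registry = LodRegistry()
    print(registry.add_lod("lod_app_0", 0))  # first call in region 0
    print(registry.add_lod("lod_app_1", 0))  # second call in region 0
```

Reordering the two add_lod calls in this model swaps the Ids, which is why the call order matters when the Ids are later used to select an app.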

Activating the AIE Software

To drive AIE LOD apps using the PS-only strategy, one must call the activate_aie function within user code imported into a vsi_gen_ip placed in the PS context.
activate_aie prototype:

int activate_aie(string target_cpu, 
                string context, 
                string lod_config_name, 
                int lod_id);

To access this function, include the activators.h header. For example, running LOD apps 0 then 1 from above would require the following:

#include <activators.h>
...
void lod_aie_driver() // Function in PS context vsi_gen_ip
{
    ...
    activate_aie("aie", "versal_aie", "pr_set_0", 0);
    // run any desired data movement
    activate_aie("aie", "versal_aie", "pr_set_0", 1);
    // run any desired data movement
    ...
}

Note: For brevity, the LOD config blocks kept their default name “pr_set_0” in the example above; however, the user can choose any descriptive name they wish.

Load-on-demand applications can also be much more complex. Below is an example with broadcasts and differing numbers of IOs between apps:
alt text