CHAMELEON  1.3.0
Chameleon, a dense linear algebra library for scalable multi-core architectures and GPGPUs

This document describes Chameleon's API, its internal functions, and the organization of the source code.

For information about the Chameleon project, the installation guide, and usage examples, please refer to the user's guide.

Chameleon's user API is mostly composed of linear algebra routines of the form CHAMELEON_name[_Tile[_Async]], where name follows the LAPACK naming scheme and designates a routine available in the Chameleon library. These routines are described individually in the section Linear algebra routines.
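The bracket notation above expands to three possible names per routine. The following is a minimal sketch of that expansion, given only to make the pattern concrete. The helper routine_name and the enum iface_t are hypothetical and are not part of Chameleon; "dpotrf" (the LAPACK double-precision Cholesky factorization) is used purely as an example name.

```c
#include <stdio.h>
#include <stddef.h>

/* Interface variants implied by the pattern CHAMELEON_name[_Tile[_Async]]. */
typedef enum {
    IFACE_LAPACK,     /* CHAMELEON_name            */
    IFACE_TILE,       /* CHAMELEON_name_Tile       */
    IFACE_TILE_ASYNC  /* CHAMELEON_name_Tile_Async */
} iface_t;

/* Hypothetical helper (NOT a Chameleon function): writes the routine name
 * built from a LAPACK-style name such as "dpotrf" into buf.
 * Returns 0 on success, -1 on invalid arguments or if buf is too small. */
static int routine_name(char *buf, size_t len, const char *name, iface_t iface)
{
    static const char *const suffix[] = { "", "_Tile", "_Tile_Async" };

    if (buf == NULL || name == NULL ||
        (int)iface < 0 || (int)iface > (int)IFACE_TILE_ASYNC) {
        return -1;
    }
    int n = snprintf(buf, len, "CHAMELEON_%s%s", name, suffix[iface]);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

Calling routine_name(buf, sizeof buf, "dpotrf", IFACE_TILE_ASYNC) fills buf with the name of the asynchronous tile variant. Note that _Async never appears without _Tile, which is why the pattern nests the brackets.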

In addition to the algorithmic routines, Chameleon provides a set of functions to control the overall process; see Auxiliary and control routines.

Linear algebra routines

Auxiliary and control routines

  • Control : to initialize and finalize Chameleon (these calls are required in order to use any routine), pause/resume task execution, etc.
  • Options : to set/get some parameters such as the number of CPUs and GPUs to use
  • Descriptor : to handle the data structure used with the Tile interface
  • Tile : to convert (copy) LAPACK-type data into Descriptor data and conversely
  • Sequences : to handle asynchronous task execution, to be used with the Tile_Async interface
  • Workspace : to handle the specific workspaces required by some of the algorithms
  • Auxiliary : to get some extra information such as the version, the data byte size, the MPI rank
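To make the idea behind the Tile conversion concrete, here is a minimal, generic sketch of copying a column-major (LAPACK-style) matrix into tile storage and back. It does not use or reproduce Chameleon's descriptor structure or its conversion routines. The function names and the chosen tile layout (tiles stored one after another in column-major tile order, each tile a full mb-by-nb column-major block) are assumptions made for illustration only.

```c
#include <stddef.h>

/* Number of tiles of size b needed to cover n elements. */
static size_t tile_count(size_t n, size_t b) { return (n + b - 1) / b; }

/* Illustrative only (NOT Chameleon's API): copy the m-by-n column-major
 * matrix A (leading dimension lda) into tile storage T.
 * T must hold tile_count(m,mb) * tile_count(n,nb) * mb * nb doubles;
 * padding entries of edge tiles are left untouched. */
static void lapack_to_tile(size_t m, size_t n, const double *A, size_t lda,
                           double *T, size_t mb, size_t nb)
{
    size_t mt = tile_count(m, mb);            /* tiles per tile column */
    for (size_t j = 0; j < n; j++) {
        for (size_t i = 0; i < m; i++) {
            double *tile = T + ((j / nb) * mt + (i / mb)) * mb * nb;
            tile[(j % nb) * mb + (i % mb)] = A[j * lda + i];
        }
    }
}

/* Inverse copy: tile storage T back into the column-major matrix A. */
static void tile_to_lapack(size_t m, size_t n, const double *T,
                           size_t mb, size_t nb, double *A, size_t lda)
{
    size_t mt = tile_count(m, mb);
    for (size_t j = 0; j < n; j++) {
        for (size_t i = 0; i < m; i++) {
            const double *tile = T + ((j / nb) * mt + (i / mb)) * mb * nb;
            A[j * lda + i] = tile[(j % nb) * mb + (i % mb)];
        }
    }
}
```

Storing each tile contiguously is what lets a task-based algorithm hand a whole tile to a kernel as a single unit of data. The actual layout used by Chameleon descriptors may differ from this sketch.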

Libraries and source code organization

The Chameleon project consists of several C libraries, plus executables whose compilation is optional (examples and testing).

The libraries are organized as follows:

  • chameleon : user's API and task-based algorithms, depends on "chameleon_quark|openmp|parsec|starpu", "coreblas", "hqr"
  • chameleon_quark|openmp|parsec|starpu : interface to the different runtimes, depends on "coreblas", optionally on "gpucublas" or "gpuhipblas", and on a runtime system library
  • coreblas and gpucublas or gpuhipblas : interfaces to the CPU and GPU kernels
  • hqr : HQR is a C library providing tools to generate hierarchical trees adapted to 2D block-cyclic data distribution and to algorithms based on tiled QR

Let's have a look at how the source code is organized into directories.

  • cmake_modules : CMake scripts for setting variables; morse_cmake is used for system introspection
  • compute : the task-based algorithms as well as the different interfaces (LAPACK, Tile, Async) to call them
  • control : general Chameleon routines to control the library: initialization, finalization, option setting, descriptor and sequence handling, etc.
  • coreblas : the Chameleon interface to CPU linear algebra kernels
  • distrib : some hints to install Chameleon's dependencies
  • doc : user and developer documentation
  • example : a couple of C files showing how to use Chameleon
  • gpucublas : the Chameleon interface to GPU linear algebra kernels (cublas)
  • gpuhipblas : the Chameleon interface to GPU linear algebra kernels (hipblas)
  • hqr : HQR is a C library providing tools to generate hierarchical trees adapted to 2D block-cyclic data distribution and to algorithms based on tiled QR
  • include : Chameleon's header files needed by users
  • lapack_api : the CBLAS/LAPACKE-like Chameleon interface
  • lib : material related to the distribution
  • plasma-conversion : scripts to convert PLASMA task-based algorithms into Chameleon ones
  • runtime : interfaces to runtime systems
  • simucore : data needed to simulate Chameleon executions through StarPU+SimGrid
  • testing : source files for testing executables (timing and numerical checks)
  • tools : other scripts for testing (software development quality)