This document describes the most important changes to the base package since release 2.23. In addition, it lists the known open issues and limitations in release 2.24 of the base package.
For more information regarding the status and workarounds related to any of these issues, please contact ClearSpeed support quoting the relevant CTS number.
Beta 2.24 contains a number of bug fixes since release 2.23.
The following is an overview of the major changes in this release:
CTS 2065: The overlap in components between the Base and Developer packages has been removed. You must now install the runtime from the Base package before installing the Developer package.
CTS 2403: The DGETRF function in this release of the CSXL library is a preview release available for Linux platforms only. Furthermore, to use this function the old DGEMM implementation must be used. To do this, set the environment variable CS_GEN1_BLAS to a nonzero value.
CTS 2860: This release includes a new implementation of the DGEMM function in the BLAS library which achieves higher performance than the previous one.
The library name has changed to:
libcsxl_blas.so (Linux)
csxl_blas.dll (Windows)
All references to the old library name (libblas_cs.so or blas_cs.dll) must be replaced with the new library name.
When using the new DGEMM implementation, you must set the environment variable CS_HOST_BLAS.
To use the previous BLAS library implementation, set the environment variable CS_GEN1_BLAS to a nonzero value.
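As a minimal sketch for a Linux shell session, the previous implementation could be selected as follows. The value 1 is just one possible nonzero value, and the commented-out unset line assumes that an unset variable selects the new implementation, which these notes do not state explicitly.

```shell
# Select the previous (GEN1) BLAS implementation for this shell session
# and for every program started from it. Any nonzero value should do.
export CS_GEN1_BLAS=1
echo "CS_GEN1_BLAS=${CS_GEN1_BLAS}"

# Assumption: removing the variable returns to the new implementation.
# unset CS_GEN1_BLAS
```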
CTS 2974: For consistency with the latest implementation of the BLAS library in CSXL, the name of the FFT library has changed:
from libcsdft.so to libcsxl_fft.so (Linux)
from csdft.dll to csxl_fft.dll (Windows)
In the short term, both file names will be available. The older file name (libcsdft.so or csdft.dll) is deprecated and will be removed in a future release.
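To locate remaining references to the old BLAS and FFT library names in a source tree, a search along the following lines may help. This is only a sketch: the function name find_old_cs_libs is hypothetical, and the -lblas_cs and -lcsdft patterns assume that your build files link the libraries with the usual -l<name> linker flag.

```shell
# Print every line under the given directory (default: current directory)
# that still mentions an old library file name or its assumed -l flag.
# Exit status is nonzero when nothing is found (grep convention).
find_old_cs_libs() {
    grep -rnE 'libblas_cs\.so|blas_cs\.dll|libcsdft\.so|csdft\.dll|-lblas_cs|-lcsdft' "${1:-.}"
}
```

For example, find_old_cs_libs src/ lists the file name, line number and text of each match under src/.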
The following issues have been fixed in this release:
CTS 1195: The driver may produce a warning message of the form:
HalfBridge_WaitforDMA: pre_enable_mask = 7, should be 0
or HalfBridge_waitForDMA: post_dma_int_mask not zero = 7
This message is caused by an internal check failing; however, it does not affect the operation of the driver.
CTS 2287: Large data segments (for example, .bss) fail to load on the Instruction Set Simulator (isim) on Windows XP.
CTS 2364: The example dgemmtest.c program in the CSXL User Guide, and provided as source code, may not compile on all platforms.
CTS 2428: On page 7 of the CSDFT Reference Manual, the example describing the use of the CSDFT_system_to_memory_descriptor is wrong.
The argument list of the CSDFT_system_to_memory_descriptor should be
((void*) source_data_pointer, sizeof(DoubleComplex) * n * num_ffts)
and not
((void*) source_data_pointer, sizeof(DoubleComplex) * n, num_ffts)
This applies to both data_in and data_out.
CTS 2433: The documentation for CSDFT states that it is pre-alpha. It is actually more stable than this and should more correctly be called pre-beta. It is still true that some functionality is unimplemented.
CTS 2634: When building the CSDFT library examples on Windows XP using Visual C++ 2005, the following warning is displayed:
cl : Command line warning D9002 : ignoring unknown option `-g'
CTS 2680: When the status code of INVALID_SYMBOL is passed to CSDFT_return_error_message(), the error message Error: Invalid size is returned, rather than the correct message.
The following issues are currently open.
CTS 239: csrun or host client applications cannot check whether the CSX processor has been reset. Running code on a processor that has not been reset should not be attempted. It is the responsibility of the user to reset the processor before running code (using csreset -s).
The functions CSAPI_read_mono_memory_async_wait and CSAPI_read_mono_memory_async_poll, and their CSAPI_write counterparts, will not return an error code if the asynchronous transfer failed.
CTS 1982: The kernel driver for 2.4 kernels (RHEL 3) may cause the kernel's memory space to become fragmented, resulting in out of memory failures after a very long period of continuous use. This can only be recovered by rebooting the system.
CTS 2004: This release includes a script for resetting the Advance Accelerator board when csreset fails to do so. The script performs a 'hard' reset of the processors. This functionality will be incorporated into csreset in a future release. Before using the reset script, gather any diagnostic or debugging information (for example, the output from csreset -v), because all state is lost in the hard reset.
Before running the script, set up your environment if you have not already done so. Under Linux, source the bashrc file (usually present in /opt/clearspeed/csx600_m512_le/bin). For Windows, start a command prompt using the shortcut from the ClearSpeed Start menu item. If you have more than one board, set the environment variable LLDINST to the instance number of the board to be recovered. For example, to reset only the first board under Linux, enter: export LLDINST=0
To run the script, type the command recover_board. You should then see some output like this:
- when csreset fails to reset your board
- after any useful diagnostic information has been gathered (e.g. the output from csreset -v).
If you wish to continue, press the return key. Otherwise, press control-c to exit.
If you are happy to proceed, press the return key. You will then see the following output:
Board recovery attempted - you can now re-run csreset.
To be safe, the recover_board script and csreset should be run whenever the board is powered up.
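The recovery sequence described for CTS 2004 can be sketched as a small Linux shell function. This is an illustration only: the function name recover_cs_board and the log file argument are hypothetical, and the sketch assumes the ClearSpeed environment has already been set up so that csreset and recover_board are on the PATH.

```shell
# Usage: recover_cs_board <logfile> [board-instance]
recover_cs_board() {
    cs_log="$1"

    # 1. Save diagnostics first: the hard reset discards all board state.
    #    csreset is expected to fail here, so its exit status is ignored.
    csreset -v > "$cs_log" 2>&1

    # 2. With more than one board, select the instance to recover.
    if [ -n "${2:-}" ]; then
        export LLDINST="$2"
    fi

    # 3. Run the recovery script (it waits for the return key).
    recover_board || return 1

    # 4. Re-run csreset, as the script output suggests.
    csreset
}
```

For example, recover_cs_board csreset_diag.log 0 saves the diagnostics to csreset_diag.log and then recovers only the first board.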
CTS 2859: The test application app_mandelbrot, included in the release package, will fail to run on isim unless the -b (--boards) option is specified in addition to the --host option. For example:
app_mandelbrot --host 0 localhost -b 1
Use the --help option to app_mandelbrot for more information on the use of --host and --boards.
CTS 2997: If you are using Microsoft Windows and you install the SDK and then uninstall it, some files will be deleted which are required by the runtime software. The result of this is that it will no longer be possible to run any software on the board.
To fix this, you will need to reinstall the runtime software.
CTS 3093: The host library functions for reading and writing memory on the Advance board, CSAPI_read_mono_memory and CSAPI_write_mono_memory, are not thread safe. You should ensure that only a single thread attempts to call these functions.
CTS 1108: If a host application program using the CSXL library is terminated abnormally (for example, by using [Ctrl]+[C]), the Advance Accelerator board may be left in an undefined state. It may be necessary to reset the board (using the csreset command) before restarting the application.
CTS 2204: There is an upper size limit to the matrix arguments for the DGEMM function. If a matrix exceeds this size, the host DGEMM will be called rather than the accelerated DGEMM. The limit depends on the values of n and k. The limit is reached if:
(3*(ceil(k/192)+1)+192*ceil(k/192)*ceil(n/192))*8*192
is greater than 0x1F800000 bytes (504 MB).
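To check in advance whether a given problem size stays within this limit, the expression above can be evaluated with integer shell arithmetic. The sketch below assumes n and k are positive integers and a shell with 64-bit arithmetic; the function names are hypothetical, and ceil(x/192) is computed as (x + 191) / 192.

```shell
# Evaluate the CTS 2204 size expression, in bytes, for arguments n and k.
dgemm_size_bytes() {
    n_blocks=$(( ($1 + 191) / 192 ))   # ceil(n/192)
    k_blocks=$(( ($2 + 191) / 192 ))   # ceil(k/192)
    echo $(( (3 * (k_blocks + 1) + 192 * k_blocks * n_blocks) * 8 * 192 ))
}

# Succeeds (exit status 0) when the size does not exceed the limit,
# that is, when the accelerated DGEMM would still be used.
dgemm_within_limit() {
    limit=528482304                    # 0x1F800000 bytes (504 MB)
    [ "$(dgemm_size_bytes "$1" "$2")" -le "$limit" ]
}
```

For example, dgemm_size_bytes 192 192 prints 304128. With n = k, the expression stays within the limit at 8064 (520422912 bytes) and exceeds it at 8256 (545495040 bytes).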
CTS 3003: If the new CSXL BLAS library is used with the Goto host BLAS library, then the environment variable GOTO_NUM_THREADS must be set to the value 1. Any other value may cause an error message referring to MAX_THREADS, OMP_NUM_THREADS and GOTO_NUM_THREADS to be generated, or it may cause the program to hang.
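On Linux, the variable can be set in the shell before the application is started. The guard function below is an optional, hypothetical addition for launch scripts; it is not part of the release.

```shell
# Required when the new CSXL BLAS library is used with Goto BLAS (CTS 3003).
export GOTO_NUM_THREADS=1

# Hypothetical guard: fail early instead of risking an error or a hang.
check_goto_threads() {
    if [ "${GOTO_NUM_THREADS:-}" != "1" ]; then
        echo "GOTO_NUM_THREADS must be set to 1 (see CTS 3003)" >&2
        return 1
    fi
}
```

A launch script could call check_goto_threads || exit 1 immediately before starting the application.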
CTS 3038: If you are using the previous implementation of DGEMM (that is, when CS_GEN1_BLAS is set to a nonzero value) on Microsoft Windows then the environment variable CS_BLAS_HOST_ASSIST_PERCENTAGE will have no effect. A warning message that the feature is not supported will be displayed.
CTS 3047: If you use MKL as your host library you must link your application with both MKL and CSXL, putting CSXL first. If this is not done, linking will fail with errors due to symbols not being found.
CTS 3062: If your program links to a host BLAS library such as MKL or ACML, and you want to accelerate calls to DGEMM, you need to dynamically load ClearSpeed's BLAS library. This can be done on Linux platforms using the environment variable LD_PRELOAD. For example, if your executable is called a.out, you would call it with:
LD_PRELOAD=/opt/clearspeed/csx600_m512_le/lib/libcsxl_blas.so ./a.out
By default, this will accelerate calls to DGEMM using the new implementation.
As noted for CTS 2860 under What's New in Release 2.24, you can also set the environment variable CS_GEN1_BLAS to 1 if you wish to use the previous implementation of DGEMM.
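The preload step can be wrapped in a small shell function that first checks that the library exists. This is a sketch only: the function name run_with_csxl_blas and the CSXL_BLAS_LIB variable are hypothetical, and the default path is the one used in the example above.

```shell
# Location of the ClearSpeed BLAS library; override for other install paths.
CSXL_BLAS_LIB="${CSXL_BLAS_LIB:-/opt/clearspeed/csx600_m512_le/lib/libcsxl_blas.so}"

# Run the given command with the ClearSpeed BLAS library preloaded.
run_with_csxl_blas() {
    if [ ! -f "$CSXL_BLAS_LIB" ]; then
        echo "CSXL BLAS library not found: $CSXL_BLAS_LIB" >&2
        return 1
    fi
    LD_PRELOAD="$CSXL_BLAS_LIB" "$@"
}
```

For example, run_with_csxl_blas ./a.out uses the new implementation. To use the previous one, run export CS_GEN1_BLAS=1 first.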
CTS 3073: If you are using the previous implementation of DGEMM (that is, when CS_GEN1_BLAS is set to a nonzero value) then the inv() function in MATLAB will fail with the following error:
CTS 1430: The board side plan (CSDFT_create_plan_<1|2|3>d and CSDFT_create_convolution_plan_<1|2>d) and execute (CSDFT_execute and CSDFT_execute_convolution) functions for CSDFT do little checking on correctness of input. It is possible to get bus errors if unsupported values are used.
CTS 1807: The CSDFT Library is at a pre-beta stage and as such there are a number of incomplete or unsupported features. Please refer to the README included in the package for a more complete description of what features are currently available.
CTS 2483: Using printfp in conjunction with the CSDFT library will fail at link time with the error message:
Definition for the symbol 'PRINT_AREA_CONTROL' already found in module default.cso
CTS 2666: When the environment variable CS_CSAPI_DEBUGGER is set, the CSDFT host library assumes that the .csx file to be loaded has _debug appended to the file name. If this file does not exist on the CSPATH, the library will fail to find the .csx file.