This page covers the basics of the bottom layer of our word-based self-indexes (WSIs). We adapted the char-based versions of the CSA and the SSA to handle a large uint32-based alphabet, so that they can manage the vocabulary that arises when a text is parsed into words in our WSIs. We refer to the adapted versions as i-CSA and i-SSA, respectively.

Our integer-based self-indexes (ISIs) self-index the sequence of uint32 word-ids, SID, produced by the presentation layer of the WSI. They were nevertheless developed against a general interface, so that the bottom layer is reusable and easy to exchange. Basically, they are built over the sequence SID and provide search functions that perform count/locate/extract/display operations for a pattern composed of a single uint32 or of a sequence of uint32 values (a phrase-pattern). To expose those functions we created the interfaceIntIndex interface, which follows the style of the pizza-chili API and which our ISIs must implement.
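To make the semantics of count and locate over SID concrete, the following is a minimal brute-force sketch in C. It is not the interfaceIntIndex API and not how the iCSA or iSSA work internally: the function names naive_count and naive_locate are hypothetical, and the code scans a plain uint32 array instead of querying a compressed self-index. It only illustrates what answer a phrase-pattern query is expected to return.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of interfaceIntIndex.h):
 * returns 1 if pat[0..m-1] occurs in seq[0..n-1] starting at pos. */
static int naive_match_at(const uint32_t *seq, size_t n,
                          const uint32_t *pat, size_t m, size_t pos)
{
    if (m > n || pos > n - m)
        return 0;
    return memcmp(seq + pos, pat, m * sizeof(uint32_t)) == 0;
}

/* count: number of occurrences of the phrase-pattern in the id sequence. */
size_t naive_count(const uint32_t *seq, size_t n,
                   const uint32_t *pat, size_t m)
{
    assert(seq != NULL && pat != NULL && m > 0);
    size_t occ = 0;
    for (size_t i = 0; i + m <= n; i++)
        occ += (size_t)naive_match_at(seq, n, pat, m, i);
    return occ;
}

/* locate: positions (0-based, in increasing order) of every occurrence.
 * The caller frees the returned array. Returns NULL and sets *numocc
 * to 0 when there is no occurrence or the allocation fails. */
size_t *naive_locate(const uint32_t *seq, size_t n,
                     const uint32_t *pat, size_t m, size_t *numocc)
{
    assert(numocc != NULL);
    *numocc = naive_count(seq, n, pat, m);
    if (*numocc == 0)
        return NULL;

    size_t *positions = malloc(*numocc * sizeof *positions);
    if (positions == NULL) {
        *numocc = 0;
        return NULL;
    }

    size_t k = 0;
    for (size_t i = 0; i + m <= n; i++)
        if (naive_match_at(seq, n, pat, m, i))
            positions[k++] = i;
    return positions;
}
```

For the sequence {3, 1, 2, 1, 2, 4} and the phrase-pattern {1, 2}, naive_count returns 2 and naive_locate reports positions 1 and 3. A self-index answers the same queries without keeping the sequence in plain form.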

The interfaceIntIndex.h is defined as follows:

ISI Interface.h

Next, we present our two implementations of interfaceIntIndex.h: the iCSA and the iSSA. They are completely different self-indexes, yet both hide their internals: using them basically requires only specifying suitable parameters, which control the space/time trade-offs they offer.


The i-CSA

The parameters required to build the i-CSA are those passed to the WCSA. They are all set in the build_options string given to the buildIntIndex() function. The options are enclosed in double quotes ("...") and, if more than one is set, must be separated by commas. Any option that is not set takes its default value automatically.
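As an illustration of how a comma-separated option string with defaults can be handled, here is a minimal sketch in C. Everything in it is an assumption made for the example: the key=value syntax, the option names sample_a and sample_b, their default values, and the function parse_build_options are hypothetical and are not taken from the iCSA or iSSA sources. The actual option names are the ones listed in the parameter tables of each index.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical settings: these names and defaults are placeholders,
 * not the real iCSA/iSSA parameters. */
typedef struct {
    unsigned sample_a;
    unsigned sample_b;
} demo_options;

/* Parses a string such as "sample_a=8,sample_b=64" (assumed syntax).
 * Options that do not appear keep their default values; unknown keys
 * are ignored. Returns the number of recognized options, or -1 if a
 * token is too long or has no '='. */
int parse_build_options(const char *build_options, demo_options *out)
{
    assert(out != NULL);
    out->sample_a = 16;               /* default (illustrative value) */
    out->sample_b = 32;               /* default (illustrative value) */
    if (build_options == NULL)
        return 0;

    int recognized = 0;
    const char *p = build_options;
    while (*p != '\0') {
        while (*p == ' ')             /* skip blanks before a token */
            p++;
        size_t len = strcspn(p, ","); /* token length up to next comma */
        if (len > 0) {
            char token[64];
            if (len >= sizeof token)
                return -1;
            memcpy(token, p, len);
            token[len] = '\0';

            char *eq = strchr(token, '=');
            if (eq == NULL)
                return -1;
            *eq = '\0';               /* split into key and value */
            unsigned value = (unsigned)strtoul(eq + 1, NULL, 10);

            if (strcmp(token, "sample_a") == 0) {
                out->sample_a = value;
                recognized++;
            } else if (strcmp(token, "sample_b") == 0) {
                out->sample_b = value;
                recognized++;
            }
        }
        p += len;
        if (*p == ',')                /* step over the separator */
            p++;
    }
    return recognized;
}
```

With this sketch, an empty string leaves both fields at their defaults, while "sample_a=8, sample_b=64" overrides both. The sketch does not validate numeric values; a real parser would need to reject malformed numbers.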

Parameters for the iCSA

The source code of the iCSA is available here: icsa.tar.gz.

You can download it and compile it independently by entering the folder "intIndex_larson" or "intIndex_qsort" and typing "make". Compilation creates the file "icsa.a", which contains the implementation of the iCSA.


The i-SSA

The parameters required to build the i-SSA are those passed to the WSSA. They are all set in the build_options string given to the buildIntIndex() function. The options are enclosed in double quotes ("...") and, if more than one is set, must be separated by commas. Any option that is not set takes its default value automatically.

Parameters for the iSSA

The source code of the iSSA is available here: issa.tar.gz.

You can download it and compile it independently by entering the folder "intIndex_SSA" and typing "make". Compilation creates the file "issa.a", which contains the implementation of the iSSA.


Further details

Although the iCSA and iSSA self-indexes can be compiled independently, if you plan to use them to build WCSA and WSSA self-indexes you should instead download the complete WCSA and WSSA implementations directly. They already include the iCSA and the iSSA, respectively, together with Makefiles that compile the whole system and scripts for testing its functionality.

Also, if you want to create a new word-based self-index W-your_index (by providing a new int-based self-index for the bottom layer), you will probably prefer to download the source code of the WCSA or the WSSA and replace the bottom layer there with your own int-based self-index.

For example, if you download and extract the wssa.tar.gz package, you only need to look at the "WSSA/src/intIndex_SSA" folder. Place the implementation of your int-based self-index there, and everything should work properly after some minor modifications to the "WSSA/Makefile" and "WSSA/src/intIndex_SSA/Makefile" files.

Supported in part by MCIIN (PGE and FEDER) grants (TIN2006-15071-C03-03, TIN2009-14560-C03-02, TIN2010-21246-C02-01, and CDTI CEN-20091048); Xunta de Galicia grants (Feder) 2010/17 and (Agrupación Estratéxica) CN 2012/211; and AECI grant (A/8065/07).