Class PaigeTarjan


  • public class PaigeTarjan
    extends Object
    An implementation of the Paige/Tarjan partition refinement algorithm.

    To ensure maximal performance, this class is designed in a very low-level fashion, exposing most of its internal fields directly. It should only ever be used directly, and its use should be hidden behind a facade such that neither this class nor any of its methods/referenced objects are exposed at an API level.

    This class stores most of its internal data in several, or possibly even a single, (mostly int) array(s), to achieve maximal cache efficiency. The layout of these arrays is described in the documentation of the respective public fields:

    The PaigeTarjanInitializers provides methods for initializing this data structure for common cases (e.g., DFA minimization). Similarly, the PaigeTarjanExtractors class provides methods for transforming the resulting data structure.
    • Field Detail

      • numInputs

        public int numInputs
        The number of input symbols.
      • numStates

        public int numStates
        The number of states.
      • blockData

        public int[] blockData
        The array storing the raw block data, i.e., the states contained in a certain block. It is assumed that the positions Block.low and Block.high refer to this array.
      • posData

        public int[] posData
        The array storing the position data, i.e., for each state, its index in the blockData array.

        The layout of this array is assumed to be the following: for the state i, where 0 <= i < numStates, the index of i in blockData is posData[posDataLow + i].

      • posDataLow

        public int posDataLow
        The lowest index storing position data in the posData array.
      • predOfsData

        public int[] predOfsData
        The array storing the predecessor offset data, i.e., for each state and input symbol, the delimiting offsets of the respective predecessor list. The offsets are assumed to refer to the predData array.

        The layout of this array is assumed to be the following: for state i and input symbol j, where 0 <= i < numStates and 0 <= j < numInputs, the offset (in the predData array) of the first j-predecessor of i is predOfsData[predOfsDataLow + j*numStates + i], and the last j-predecessor of i is predOfsData[predOfsDataLow + j*numStates + i + 1] - 1. Note that this requires the index predOfsDataLow + numInputs * numStates to be valid, and the contents of the predOfsData array at this index must be the highest offset of any predecessor plus one.

      • predOfsDataLow

        public int predOfsDataLow
        The lowest index storing predecessor offset data in the predOfsData array.
      • predData

        public int[] predData
        The array storing the predecessor data, i.e., for each state and input symbol, a list of the respective predecessors.

        The layout of this array is assumed to be the following: for state i and input symbol j, where 0 <= i < numStates and 0 <= j < numInputs, the j-predecessors of i are the elements of predData from index predOfsData[predOfsDataLow + j*numStates + i], inclusive, to index predOfsData[predOfsDataLow + j*numStates + i + 1], exclusive.

      • blockForState

        public Block[] blockForState
        The array mapping states (in the range between 0 and numStates) to their containing block.
    • Constructor Detail

      • PaigeTarjan

        public PaigeTarjan()
    • Method Detail

      • setSize

        public void setSize​(int numStates,
                            int numInputs)
      • setBlockForState

        public void setBlockForState​(Block[] blockForState)
      • setBlockData

        public void setBlockData​(int[] blockData)
      • setPosData

        public void setPosData​(int[] posData,
                               int posDataLow)
      • setPredOfsData

        public void setPredOfsData​(int[] predOfsData,
                                   int predOfsDataLow)
      • setPredData

        public void setPredData​(int[] predData)
      • removeEmptyBlocks

        public void removeEmptyBlocks()
        Removes all blocks which are empty from the block list. The IDs of the blocks are adjusted to remain contiguous.

        Note that this method does not modify the worklist, i.e., it should only be called when the worklist is empty.

      • initWorklist

        public void initWorklist​(boolean addAll)
        Initializes the worklist from the block list. This assumes that the worklist is empty.

        If addAll is true, all blocks from the block list will be added to the worklist. Otherwise, all but the largest block will be added.

        Parameters:
        addAll - controls if all blocks are added to the worklist, or all but the largest.
      • computeCoarsestStablePartition

        public void computeCoarsestStablePartition()
        Refines the partition until it stabilizes.
      • createBlock

        public Block createBlock()
        Creates a new block. The Block.low and Block.high fields will be initialized to -1.
        Returns:
        a newly created block.
      • canonizeBlocks

        public void canonizeBlocks()
        Iterates over the current block list and sets every block's low and high pointer to the accumulated sum of its own high pointer and its preceding blocks' high pointers, effectively reducing each block to a single representative padded by its previous range.
      • getBlockForState

        public Block getBlockForState​(int id)
        Retrieves the corresponding block for a given state (ID).
        Parameters:
        id - the state ID
        Returns:
        the block containing the specified state
      • getRepresentative

        public int getRepresentative​(Block b)
        Retrieves a representative state from the given block. This method behaves deterministically.
        Parameters:
        b - the block
        Returns:
        a representative state in the specified block
      • statesInBlockIterator

        public PrimitiveIterator.OfInt statesInBlockIterator​(Block b)
        Retrieves an iterator for the contents of the given block.
        Parameters:
        b - the block
        Returns:
        an iterator for the contents of the specified block
      • statesInBlockSpliterator

        public Spliterator.OfInt statesInBlockSpliterator​(Block b)
        Retrieves a spliterator for the contents of the given block.
        Parameters:
        b - the block
        Returns:
        a spliterator for the contents of the specified block
      • blockListIterator

        public Iterator<Block> blockListIterator()
        Retrieves an iterator for iterating over all blocks in the block list.
        Returns:
        an iterator for iterating over all blocks
      • getNumBlocks

        public int getNumBlocks()
        Retrieves the total number of blocks in the block list.
        Returns:
        the total number of blocks