DOTD: Suffix Tree

3 Jan 2017

This is a very brief and basic definition. 

A suffix tree is a binary tree which, when traversed from the root node (at the very top) to a leaf node (square, numbered), spells out a suffix of the word for which it is built. Suffix trees can be built in O(n) time, where is the length of the string for which we are building it. The figure above shows the suffix tree for the word banana, where n = 6. (Never mind the $ for now!)

 

Why is it useful? Suffix trees are arguably the most used data structures in stringology, and have a vast multitude of uses. For example, if I want to find out whether the string ana occurs in banana, I can attempt to spell out each letter in the suffix tree. Starting at the root node, I successfully spell a (see leftmost solid edge on figure above). From this node, I then successfully spell and a (which have been merged in to one edge, as the above figure represents a compressed suffix tree). An occurrence of ana can now be reported. This process requires O(m) time, where m is the length of the string I am looking for, in this case, ana.

 

For an in depth understanding of suffix trees, read chapter 5 of:

Gusfield, Dan. Algorithms on strings, trees and sequences: computer science and computational biology. Cambridge university press, 1997.

 

[2]

 

Share on Facebook
Share on Twitter
Share on LinkedIn
Please reload

Please reload

Related Posts
PhDomics by Fatima