TRIE

TRIE is a tree data structure that allows string with similar characters prefix to use the same prefix data and store only the tails.

Each level of the tree stores the Array or Linklist of the characters except the ROOT.

Why TRIE?

The traditional data structures process the pattern and then finds the Text to be matched, so the time complexity to search the pattern is proportional to the length of the text.

TRIE can be of two types Standard TRIE and Compressed TRIE

Structure of the Standard TRIE :

  1. TRIE tree is not the binary tree and each node can have maximum 26  

   children(as there are 26 English alphabets) or minimum 0 (if it is the leaf

   node )

  1. Children are lexicographically ordered.
  2. A ROOT node is not labeled by any characters and just contains the link to the children.
  1. The path from the ROOT to the external node yields the pattern.
  2. The height of the Tree can be maximum 26 and minimum 1 (as the ROOT is not labeled).

Example of the Standard TRIE is as Follows

S= { ‘bear’, ‘bell’, ‘bid’, ‘bull’, ‘buy’, ‘sell’, ‘stock’, ‘stop’ }

 



 

Complexity:

Worst case :

Traversal, Insertion, and DeletionO(n) (n=> size of the string)

Best case :

Traversal – O(1) if the search key is immediately after root.

Insertion and Deletion – O(n)((n=> size of the string))

Memory –  O(w)( For the worst case w can be 26 times the number of nodes in the tree.)

PROs

  1. Good for the secondary storage indexing
  2. Good for the static string matching.

CONs

  1. Only applicable to strings and no other data structure.
  2. Standard TRIE is not memory efficient.
  3. A well-structured Hashtable will be more efficient for the searching as the time complexity will be reduced from O(w) to O(1).

 

Compressed TRIE (Radix Tree):

The problem with Standard TRIE is that it is not memory efficient.

The concept behind the Compressed TRIE is to convert the long chain of the single-child intermediate nodes into a single leaf node.

Photo reference (http://www.mathcs.emory.edu/~cheung/Courses/323/Syllabus/Text/trie02.html)

For the worst case scenario discussed above, the height of the Standard TRIE will be 26 as all the nodes have the single child nodes, now using the compressed tree the height of the tree can be reduced to 1.

Photo reference (http://www.mathcs.emory.edu/~cheung/Courses/323/Syllabus/Text/trie02.html)

 

Practical Applications of the TRIE:

  1. Dictionary matching – the Searching time of the TRIE is same as traversing the Linklist or an Array O(n), TRIE  is preferred where the size of the dictionary is very large.
  2. Text stemming – TRIE can be used for the word auto-completion(used by Android) and word Stemming(“add”, “adding”, “added”).
  3. Link analysis – Each leaf of a TRIE denotes a word and refers to the list of ‘URLs’ containing that word, the List can be kept in the external memory and can be used to check that URL is authorized or not.
  4. Internet Routing – Each data packet contains the IP address of its destination host. These data packets are received by the routers and the router forwards it to the destination using the prefix matching. Prefix matching can be done using the TRIE data structure.

Read more about TRIE…

Hope you find it useful. Thanks for Reading.

About CauseCode: We are a technology company specializing in Healthtech related Web and Mobile application development. We collaborate with passionate companies looking to change health and wellness tech for good. If you are a startup, enterprise or generally interested in digital health, we would love to hear from you! Let's connect at bootstrap@causecode.com

Leave a Reply

Your email address will not be published. Required fields are marked *

STAY UPDATED!

Do you want to get articles like these in your inbox?

Email *

Interested groups *
Healthtech
Business
Technical articles

Archives