Data Structures: Tries

Hi, I’m Gayle Laakmann McDowell, author of Cracking the Coding Interview. Today I’m going to talk about tries. This isn’t something
CS students might have spent that much time on in school, but it’s really really
important for interviews. A try is a data structure that is actually a type of tree,
but it’s often used to store characters. And the way that it works is each node
might store a, just a character as its data, but then if we look at the path
from the root down to that node, that node is really representing a
word or a part of a word. And so what it allows us to do is very quick lookups of a
particular kind. So for example, suppose we use this try
to store this very small version of the English language. We could use it to say, hey, are there any words that start with C A R. Yes there are. There’s car, that’s
actually a complete word, but there’s also card. So car is also a prefix. So it allows
you to do very fast lookups. So let’s talk briefly about the implementation of it.
So one, you know, were going to need a class node and then rather than having as we
would in a binary search tree, a pointer to the left and right node, instead we have some
sort of either array of the, of all the children or more likely some sort of
look up table that maps from a character to that node. So we can immediately say
do you have a have a child that is an A, great, give me that immediately. Additionally we need to understand not
just is there a, do you have an R as a child, an R node as a child, but is
that actually a complete word or is it just a prefix. And so in addition
to each node having any, a mapping of from character to node, each
node also has some sort of flag in it called something like is complete word
that represents is that a complete word or not. We see tries in a lot of
problems so if you’re in interview and you get a question that has some sort of
word validation to it so, you know, check if this is you know, walk through this
table and find all the, you know, walk through this scrabble board and find all the
words. Or given this list of strings, find all celebrity names. So when you
see something that has some word list validation think about whether or not a
try might be useful. There’s one optimization that often comes up in try
problems. Suppose you had a situation where a user
was typing a word and as soon as the word became, and it clearly was going to
be an invalid word, you want to say underline it. Someone types C and you say yes that is in fact it potentially going to be a valid word and then
someone types CA. Ok great, looking good. Now they type CAL. Yes okay there’s words that start with CAL, now they type CALP, oops no. That’s an error. So that’s a good place to
use a try. Now the way that I would explain tries so far, you would do each of
these look ups from scratch. You would say is there a C, is that a valid prefix,
yes it is, and then you do is C A a valid prefix and you start from scratch over
at the root. Now in a lot of cases though we actually want these calls to kind of
build on each other so one thing we can do with our try is we can keep state in
various ways so when we look up is C a valid prefix, we actually store where we
are in the tree. So when we, when the user types the very next character, an A,
we just look immediately is A a child rather of C of that last look up rather
than starting from scratch. There’s different ways you can actually
implement that. One way you can do it is to actually keep state within the try.
Another way you could do that is actually return the node back to the
caller. So when I say is C a valid prefix, I get not only that’s a valid
prefix but I also get information that gives me exact reference to that node.
Then rather than saying starting from the root and saying is CA a valid prefix I
just ask is A a child of C, and then I ask is L a child of the CA etc. etc. So again
tries are not something that you would spend a lot of time in algorithms classes or in algorithms textbooks,
and so they’re unfamiliar to some people. But it’s a very very common data
structure for a lot of interview questions so keep an eye out for that,
you know, construct or that frame of a problem that
says something about your validation of words because that’s often where we’re
going to want to use a try. So now that you understood the basics of the try, why don’t you try applying this to a new problem. Good luck.

Leave a Reply

Your email address will not be published. Required fields are marked *