The exercise 2.2 in Purely Functional Data Structure says the original member function performs about 2d comparisons in the worst case, where d is the depth of the tree. According to this paper, the author suggests to rewrite member function to take no more than d + 1 comparisons by keeping track of a candidate element that might be equal to the query element (say, the last element for which < returns false or <= returns true) and checking for equality only when the bottom of the tree is hit.
The following are the codes:
data Tree a = Leaf
| Fork (Tree a) a (Tree a)
deriving (Show, Eq)
-- The original member function.
member :: Ord a => a -> Tree a -> Tree a
member _ Leaf = False
member x (Fork l y r)
| x < y = member x l
| x > y = member x r
| otherwise = True
-- The improved member function that I write.
member' :: Ord a => a -> Tree a -> Tree a
member' v t = loop Nothing v t
where
loop candidate _ Leaf = Just x == candidate
loop candidate x (Fork l y r)
| x < y = loop candidate x l
| otherwise = loop (Just y) x r
Now my questions are:
According to the paper, is my improved
memberfunction right?In the worst case, the improved
memberis more efficient than the original one, since it compares no more thand + 1times. But according to the algorithm, the improvedmemberwill taked + 1comparisons for every query. For the originalmember, if the queried value equals the root value, it only compaires two times, whereas the improvedmemberseems to still taked + 1comparisons. So it seems the improved algorithm is not as efficient as the original one in the best case. Do I misunderstand anything here? Actually the paper about the algorithm reads:
When the searched-for node is reached the algorithm does not terminate but continues in the right subtree. Since all elements in that subtree are larger than the one searched for, the algorithm will make no more right turns after passing that node. Thus if the element searched for is to be found in the tree it has to be in the node where the last right turn was made, which is the node referred to by candidate.
So it seems that's how I understand the algorithm.
- If point 2 is true, then how to improve the algorithm so that it doesn't need to compare
d + 1times in the best case besides using the three-way comparison functoncomparein Haskell?
Thanks for any information!