Searching the tree

Contents

Searching a tree for numerical data

We have been playing with node indices since the beginning, assuming we knew what node corresponded to what index. You may want to search a tree, to get the index of a particular node.

There is no specialized method to search a tree. I tried to stick to the standard MATLAB syntax, and simply wrapped or implemented the MATLAB standard search functions for trees.So if you know how to search an array, you already know everything.

Here is an example: The example static method of the tree class returns two synchronized trees. The first one, lineage contains the standard name of the C.elegans embryo cells. The second one, duration, contains the length, in minutes, of this cell cycle.

[ lineage, duration ] = tree.example; % 1st one is made of strings only, 2nd one of integers
disp(duration.tostring)
                                   2                                   
             +--------------------+--------------+                     
             |                                   |                     
            19                                  19                     
      +------+------+                  +--------+-----------+          
      |             |                  |                    |          
     16            16                 18                    9          
  +---+--+       +--+---+       +-----+-----+         +-----+---+      
  |      |       |      |       |           |         |         |      
 10      6      16     16      11           9         6         4      
 ++-+  +-+-+   ++--+  ++--+    ++---+    +-+---+    ++--+    +-+---+   
 |  |  |   |   |   |  |   |    |    |    |     |    |   |    |     |   
 9  9 10  10  13   8 11   7   12    6    5     5   18   9   18     2   
                             ++--+     ++--+  ++-+         ++--+  ++-+ 
                             |   |     |   |  |  |         |   |  |  | 
                            12   4    17   4  3  5        20   9  5  8 

Here is how find the nodes with a duration longer than 15 minutes:

indices = find ( duration > 15 );
for i = indices
    fprintf('The cell %s has an interphase duration of %d minutes.\n', ...
        lineage.get(i), ...
        duration.get(i) ); % We will speak of synchronized iteration later
end
The cell AB has an interphase duration of 19 minutes.
The cell AB.a has an interphase duration of 16 minutes.
The cell AB.p has an interphase duration of 16 minutes.
The cell AB.pl has an interphase duration of 16 minutes.
The cell AB.pr has an interphase duration of 16 minutes.
The cell P1 has an interphase duration of 19 minutes.
The cell P2 has an interphase duration of 18 minutes.
The cell C.aa has an interphase duration of 17 minutes.
The cell MS.a has an interphase duration of 18 minutes.
The cell E.a has an interphase duration of 18 minutes.
The cell E.al has an interphase duration of 20 minutes.

The function all and any have also been implemented, and serve to interrogate a whole logical tree:

% Are all durations longer than 2 minutes?
all( duration > 2 )
ans =

     0

% Is there any cell which interphase last less than 5 minutes and more than
% 0 minutes?
any( duration < 5 & duration > 0 )
ans =

     1

% What ones?
find( duration < 5 & duration > 0 )
ans =

     1    24    26    31    35    40

% Let's get their name:
for i = find( duration < 5 & duration > 0 )
    disp(lineage.get(i))
end
Zygote
C.ap
C.pa
E
E.p
Z3

Searching a tree for text data

In summary, if you want to search for numerical data within a tree, logical operations and the find method are your friends. If you want now to search for text data, you have to use string comparison and string parsing methods.

Here is a list of the methods that compare two synchronized trees based on their string content, node against node:

These 4 methods offer scalar expansion: if one argument is a string instead of a tree, it is compared against all nodes of the tree. For instance:

% Find the node named 'P3':
id_P3 = find( strcmp(lineage, 'P3') ) %#ok<NOPTS>
id_P3 =

    20

% Find all the descendants of the cell AB (their names start all with
% 'AB'), and display the search result as a tree.
ABdesc = strncmp(lineage, 'AB', 2);
disp(ABdesc.tostring)
                                                                   false                                                                   
                       +--------------------------------------------+------------------------+                                             
                       |                                                                     |                                             
                     true                                                                  false                                           
           +-----------+-----------+                                    +-------------------+------------------------+                     
           |                       |                                    |                                            |                     
         true                    true                                 false                                        false                   
     +-----+-----+           +-----+-----+                +------------+----------+                    +-------------+------+              
     |           |           |           |                |                       |                    |                    |              
   true        true        true        true             false                   false                false                false            
  +--+--+     +--+--+     +--+--+     +--+--+         +--+-------+         +------+------+          +--+---+         +------+------+       
  |     |     |     |     |     |     |     |         |          |         |             |          |      |         |             |       
true  true  true  true  true  true  true  true      false      false     false         false      false  false     false         false     
                                                   +--+---+             +--+---+      +--+---+                    +--+---+      +--+---+   
                                                   |      |             |      |      |      |                    |      |      |      |   
                                                 false  false         false  false  false  false                false  false  false  false 

Other search methods allow to inspect the content of each node:

Regular expression are also implemented, if you need more elaborated searches:

And there is also the basic tools for string manipulation and substitutions:

All these methods return a tree, and their call can be chained. For instance, here we:

  1. Start from the lineage tree
  2. Replace all cell names starting by 'AB' with 'P0a'
  3. Add a point before each lower-case letter
  4. Print the resulting tree
lineage.strrep('AB', 'P0a').regexprep('([a-z])', '\.$1').tostring
ans =

                                                                                           Z.y.g.o.t.e                                                                                           
                                                   +-------------------------------------------+----------------------------------------------------+                                            
                                                   |                                                                                                |                                            
                                                 P0.a                                                                                              P1                                            
                         +-------------------------+-------------------------+                                               +---------------------+---------------------+                       
                         |                                                   |                                               |                                           |                       
                      P0.a..a                                             P0.a..p                                           P2                                          EMS                      
            +------------+------------+                         +------------+------------+                  +--------------+-----+                      +---------------+------+                
            |                         |                         |                         |                  |                    |                      |                      |                
        P0.a..a.l                 P0.a..a.r                 P0.a..p.l                 P0.a..p.r             P3                    C                     MS                      E                
      +-----+------+            +-----+------+            +-----+------+            +-----+------+         ++----+        +-------+-------+           +--+---+          +-------+-------+        
      |            |            |            |            |            |            |            |         |     |        |               |           |      |          |               |        
 P0.a..a.l.a  P0.a..a.l.p  P0.a..a.r.a  P0.a..a.r.p  P0.a..p.l.a  P0.a..p.l.p  P0.a..p.r.a  P0.a..p.r.p   P4     D      C..a            C..p        MS..a  MS..p      E..a            E..p       
                                                                                                         +-+-+        +---+---+       +---+---+                     +---+---+       +---+---+    
                                                                                                         |   |        |       |       |       |                     |       |       |       |    
                                                                                                        Z2  Z3     C..a.a  C..a.p  C..p.a  C..p.p                E..a.l  E..a.r  E..p.l  E..p.r  

Generally, all these methods have the same syntax and options than their core MATLAB counterparts. They only differ in that they operate on and return trees. Therefore, if you want to know more on how to prepare a search, read the corresponding core functions help section.

Back to main page.