All good points - thanks for the clarification!<br><br><div><span class="gmail_quote">On 2/9/07, <b class="gmail_sendername">Krishna Dole</b> <<a href="mailto:dontfall@gmail.com">dontfall@gmail.com</a>> wrote:</span>
<blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">Hi Jeff,<br><br>Thanks for writing us about this, and welcome to the list.<br><br>> First, I think the parent_id field is completely unnecessary. The point of
<br>> the nested set model seems to be that you eliminate the need for the join,<br>> thereby creating a table with a single fact, in a single place. If you<br>> don't have a parent_id, you identify a root as anything with lft=1. You can
<br>> achieve multiple roots by using composite keys (by adding a set_id column,<br>> for example).<br><br>You're right-- strictly speaking the parent_id field is unnecessary,<br>and thus non-normalized. However, we have chosen to keep it thus far
<br>for several reasons:<br><br>1) It provides a good backup to the lft/rgt indexes. Nested sets are<br>inherently both volatile and brittle. I feel good about our code's<br>ability to maintain the left/right values, but if they somehow became
<br>corrupted, repairing them without having parent_id would be a devilish<br>manual process at best and impossible at worst. We provide handy<br>methods for both checking and re-indexing using the parent_id.<br><br>2) It provides significant performance advantages in a number of
<br>situations: the common task of fetching a node's parent is _much_<br>faster using parent_id. Fetching immediate children would require a<br>'level' column if the parent_id was not present, and maintaining a
<br>level column would be a bit of work. (But nice to have-- it's on the<br>to-do list)<br><br>3) It makes reversion to an adjacency-list model trivial (just nuke<br>the lft and rgt cols) if you decide nested sets aren't for you. One of
<br>my goals with the code is to make it easy for people using<br>acts_as_tree to try a nested set model.<br><br>4) The parent_id is required (and used to good effect) in the ez-set branch.<br><br>5) It doesn't really cause any problems, performance or otherwise.
<br><br>><br>> Second, the numbering scheme need not be contiguous. If you use the full<br>> range of floats given in a database you can insert thousands of children<br>> without ever having to recalculate other rows in the database. This will
<br>> drastically increase performance on inserts/updates, and unless you are<br>> working with huge sets, you may never need to recalculate parent boundaries.<br><br>I looked into using floats, but I was turned off by the fact that
<br>trees less than 60 levels deep (if I recall correctly) could overrun<br>the index. That said, if you would like to try implementing something<br>like this, please do-- I'm happy to provide what advice I can.<br><br>
Keeping the index sequential has a few benefits: you can find out how<br>many descendants a node has without hitting the database, and it makes<br>finding 'leaf' nodes easy.<br><br>><br>> For more information (if you haven't already seen it), check out:
<br>><br>> <a href="http://searchoracle.techtarget.com/tip/1,289483,sid13_gci537290,00.html">http://searchoracle.techtarget.com/tip/1,289483,sid13_gci537290,00.html</a><br><br>Thanks for linking to that. I'd seen most of it before (Joe has
<br>recycled that material a lot of times), but some of the information at<br>the bottom was interesting.<br><br>Cheers,<br>Krishna<br>_______________________________________________<br>Betternestedset-talk mailing list<br>
<a href="mailto:Betternestedset-talk@rubyforge.org">Betternestedset-talk@rubyforge.org</a><br><a href="http://rubyforge.org/mailman/listinfo/betternestedset-talk">http://rubyforge.org/mailman/listinfo/betternestedset-talk
</a><br></blockquote></div><br>