Hello,

Hierarchical factors are a very common data structure. For instance, one might have municipalities within states within countries within continents. Other examples include occupational codes, biological species, software types (R within statistical software within analytical software), etc.
Such data structures commonly use hierarchical coding systems. For example, the 2007 North American Industry Classification System (NAICS) <http://www.census.gov/cgi-bin/sssd/naics/naicsrch?chart=2007> has twenty two-digit codes (e.g., 42 = Wholesale Trade). Within each of these are varying numbers of three-digit codes (e.g., 423 = Merchant Wholesalers, Durable Goods), then varying numbers of four-digit codes (e.g., 4231 = Motor Vehicle and Motor Vehicle Parts and Supplies Merchant Wholesalers), then varying numbers of five-digit codes, varying numbers of six-digit codes, etc.

From the lowest-level (longest) code one can readily read off all the higher levels, because each higher-level code is a prefix of the codes beneath it. For example, 441222 is "Boat Dealers," which is part of 44122 (Motorcycle, Boat, and Other Motor Vehicle Dealers), which is part of 4412 (Other Motor Vehicle Dealers), which is part of 441 (Motor Vehicle and Parts Dealers), which is part of 44 (Retail Trade). (The US Census Bureau has extended the 6-digit NAICS to an even more fine-grained 10-digit system.)

I haven't seen any R packages or sample code that handle this kind of data, but I don't want to reinvent the wheel and would rather stand on the shoulders of you giants. Is there any package or other R-based software out there that handles this kind of data structure?

Thanks,
Marsh Feldman

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
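
To make the prefix structure of the NAICS example concrete, here is a minimal base R sketch, not a package or an established method. It assumes that every higher level is simply a prefix of the full code and that the top level has two digits, as in the 441222 example. The helper name naics_ancestors, the data frame levels_df, and its column names are hypothetical, and the codes other than 441222 are only illustrative.

```r
# Hypothetical helper: list every level of one hierarchical code by taking
# prefixes. Assumes each level is a prefix of the next and that the top
# level has `min_digits` digits (2 for NAICS).
naics_ancestors <- function(code, min_digits = 2L) {
  code <- as.character(code)
  stopifnot(length(code) == 1L, nchar(code) >= min_digits)
  widths <- seq(min_digits, nchar(code))
  substring(code, 1L, widths)  # substring() recycles over the end positions
}

naics_ancestors("441222")

# Derive one factor per level for a vector of six-digit codes, so that
# ordinary tools (table(), aggregate(), ...) can work at any level.
codes <- c("441222", "441221", "423110")  # illustrative codes only
levels_df <- data.frame(
  code      = codes,
  sector    = factor(substr(codes, 1, 2)),  # column names are a naming choice
  subsector = factor(substr(codes, 1, 3)),
  group     = factor(substr(codes, 1, 4)),
  stringsAsFactors = FALSE
)

# Count the codes that fall under each two-digit level
table(levels_df$sector)
```

Keeping the codes as character strings rather than numbers avoids losing leading zeros in coding systems that have them. This only covers schemes where the hierarchy is encoded in the prefix; a hierarchy such as municipalities within states would need an explicit lookup table instead.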