Custom fields and the Jazz DW
Hi, is there any way to have custom fields 'data warehoused' so that
they can be used in reports? Sorry if I just haven't found the instructions to do that. Thanks, Jeremy |
One answer
Jeremy Hughes wrote:
Hi, is there any way to have custom fields 'data warehoused' so that Hi Jeremy. There are two ways I could imagine doing this. Here are some comments on each: 1. We already aggregate and store a large amount of data about work items in the data warehouse. You might want to simply add the custom fields to this collected data. However, as you've probably noticed, it's not possible to do this today. This is for a few reasons: (A) We don't know about these fields. The collection code is written in Java and knows only about the core attributes of a work item. (B) Collecting data in a data warehouse isn't as simple as just specifying which fields to store. Adding new fields (in this case, called "dimensions" in data warehouse-speak) can have very serious space implications on the existing tables; allowing our mechanism to be extensible in some way would open the door to people being able to explode the size of the data warehouse by many orders of magnitude. (C) The schema for the data warehouse, once created, is not mutable. So we can't dynamically add new fields to the database as new custom fields are created (what would happen with the existing data?). As a followon to (B) above, imagine a case where you want to add two custom fields to our data collection. One represents a "Platform" field, which has ten possible values (Windows, Linux, etc). Another represents the type of ice cream that the person who fixes the bug gets (Vanilla, Chocolate, etc). There are ten possible ice cream flavours. Adding these two fields to our existing tables has the potential to increase the size of the data warehouse (the work items states table, specifically) 100 times, or 2 orders of magnitude. That's because there are now 10 * 10 = 100 times more permutations of dimensions for each fact, and the states table stores a row for each permutation (omitting those for which the count is zero, of course). So this looks like a trivial thing to want to do but would have a disastrous impact on the performance of both the data collection and of the generation of reports. Having said that... 2. Instead of piggy-backing on the existing tables, create some new tables from scratch. The data warehouse is an extensible mechanism, and you can define your own tables and write your own snapshot collection code. It may turn out that your needs for the tables don't include much of the other stuff that we collect, so you can include your custom fields, and the other fields that you need, without consuming the known universe. There are a few resources that might help you in this task: - Our article on extending the data warehouse, found here: https://jazz.net/wiki/bin/view/Main/DataWarehouseExtending - The code in our work items snapshot, found in com.ibm.team.datawarehouse.service.workitems. It's kind of hairy, but it shows how we collect all work item changes, store them in the changes table (where each row represents a single work item change) and then from the changes table generate the states table (where each row represents a point-in-time fact about the number of work items which satisfy some permutation of criteria). In addition, there's an excellent article on data warehousing in general, which describes the facts and dimensions and discusses space implications: http://ciobriefings.com/Publications/WhitePapers/DesigningtheStarSchemaDatabase/tabid/101/Default.aspx Hope this helps. james Jazz Reports Team Lead |
Your answer
Dashboards and work items are no longer publicly available, so some links may be invalid. We now provide similar information through other means. Learn more here.