Dummy needs help to change thinking [message #366897] |
Thu, 05 August 1999 21:04 |
Jill
Messages: 6 Registered: August 1999
Junior Member |
If anyone could provide me a brief description of PC Express I would appreciate it. I am trying to set up a database and have read the user guide and introduction several times but just don't get 'variables' and 'dimensions'. I keep treating it like a spreadsheet. How do I change my thinking???
Re: Dummy needs help to change thinking [message #366898 is a reply to message #366897] |
Thu, 12 August 1999 11:00 |
Josh
Messages: 9 Registered: August 1999
Junior Member |
I don't know if this is exactly what you need - this email was originally a response to a question from a summer intern working with us, who wanted me to clarify the difference between variables and dimensions. The spreadsheet metaphor isn't a bad one, but I think visualizing the Express cube as a Cartesian coordinate system may be more accurate. (Our prototype Express cube is built off a database of network alarms, with Time, Location, and Operator dimensions. The variables are Alarm Count and Response Time.)
-------------------------------
Two dimensions are easier to start with - so think about a graph with two
axes - say, time and geography. On the time axis, the years mark major
ticks, quarters divide each major segment into four, months into twelve,
etc. The same idea applies on the Geography side.
At the lowest level (timestamp and network element) you may specify a
point in the space - that point is a value - or fact from the fact table.
Some points are filled in, others aren't, depending on the sparsity of the
data.
Now, in our case, both variables are dimensioned by the same axes, and
therefore coexist in the same space. So each "point" in our graph can hold
two entries, one for each variable.
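To make the picture concrete, here is a minimal sketch in Python (not Express syntax). The element names, months, and values are made up for illustration. Each variable is a lookup keyed by a point's coordinates, and both variables use the same coordinates:

```python
# Hypothetical data: each key is a point (month, network_element) in the space.
# A point missing from a dict is simply an empty cell (sparse data).
alarm_count = {
    ("1998-01", "router_a"): 4,
    ("1998-02", "router_a"): 1,
    ("1998-02", "switch_b"): 7,
}
response_time = {
    ("1998-01", "router_a"): 12.5,
    ("1998-02", "router_a"): 30.0,
    ("1998-02", "switch_b"): 8.0,
}


def cell(point):
    """Return the value of each variable at one point, or None where empty."""
    return {
        "alarm_count": alarm_count.get(point),
        "response_time": response_time.get(point),
    }


filled = cell(("1998-02", "switch_b"))  # both variables have a value here
empty = cell(("1998-01", "switch_b"))   # nothing recorded at this point
```

The dimensions are the coordinates you use to find a cell; the variables are what is stored there.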
Totalling is simply selecting an area of this space, and adding all
values of the same variable together. So - if you want the total number of
alarms in 1998 (a major division on the time axis, which contains all the
timestamps inside it) over all network elements, you're just
selecting a cut of that space... you see? You could get tricky and
specify squares, or more complex regions, but that's the idea.
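As a rough sketch of that idea in Python (again not Express, and with invented data), a total is just a sum over every filled point that falls inside the region you select. The `total` helper and its filter arguments are hypothetical names:

```python
# Hypothetical data keyed by (month, network_element).
alarm_count = {
    ("1998-11", "router_a"): 4,
    ("1998-12", "router_a"): 1,
    ("1998-12", "switch_b"): 7,
    ("1999-01", "router_a"): 3,
}


def total(variable, time_filter=lambda t: True, location_filter=lambda loc: True):
    """Sum every filled point whose coordinates fall inside the selected region."""
    return sum(
        value
        for (time, location), value in variable.items()
        if time_filter(time) and location_filter(location)
    )


# A "cut": all of 1998 on the time axis, everything on the location axis.
alarms_1998 = total(alarm_count, time_filter=lambda t: t.startswith("1998"))

# A "square": 1998 on the time axis, one element on the location axis.
alarms_router_a_1998 = total(
    alarm_count,
    time_filter=lambda t: t.startswith("1998"),
    location_filter=lambda loc: loc == "router_a",
)
```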
To do it in 3D, just extend a new axis. For instance - operators.
Operators really only have one level of division, so you don't have to
worry about any tricky aggregation - the axis just gives you another way
to specify a region of space.
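In the same sketch style (Python, invented operator names and counts), a third axis just means one more coordinate in each key and one more way to restrict the region:

```python
# Hypothetical data keyed by (month, network_element, operator).
alarm_count = {
    ("1998-12", "router_a", "alice"): 1,
    ("1998-12", "switch_b", "bob"): 7,
    ("1999-01", "router_a", "alice"): 3,
    ("1999-01", "router_a", "bob"): 2,
}

AXES = ("time", "location", "operator")


def total(variable, **selected):
    """Sum the points inside a region; axes left unspecified are unrestricted.

    Each keyword argument names an axis and supplies a predicate for it.
    """
    def inside(point):
        return all(
            selected[axis](coordinate)
            for axis, coordinate in zip(AXES, point)
            if axis in selected
        )

    return sum(value for point, value in variable.items() if inside(point))


alarms_for_bob = total(alarm_count, operator=lambda op: op == "bob")
alarms_alice_1999 = total(
    alarm_count,
    time=lambda t: t.startswith("1999"),
    operator=lambda op: op == "alice",
)
```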
This make sense?
What an OLAP tool like explorer does is simply precompute and store
aggregate values for each level - but it's a little tricky and quite
storage intensive because each aggregate sum is _also_ dimensioned by the
other axes - which have their own aggregates - etc... But you don't really
need to worry about all that.
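If you are curious why that gets storage intensive, here is a toy illustration in Python. It is not how any particular OLAP product stores its data; the level names ("ALL", year, month) and the data are assumptions for the example. Every base point contributes to one stored total for each combination of levels across the axes:

```python
from collections import defaultdict

# Hypothetical base data keyed by (month, network_element).
alarm_count = {
    ("1998-11", "router_a"): 4,
    ("1998-12", "router_a"): 1,
    ("1998-12", "switch_b"): 7,
    ("1999-01", "router_a"): 3,
}


def time_levels(month):
    """A month rolls up into its year, which rolls up into a grand total."""
    return [month, month[:4], "ALL"]


def location_levels(element):
    """Assume locations have only one level below the grand total."""
    return [element, "ALL"]


def precompute(variable):
    """Store a total for every combination of time level and location level."""
    cube = defaultdict(int)
    for (month, element), value in variable.items():
        for t in time_levels(month):
            for loc in location_levels(element):
                cube[(t, loc)] += value
    return dict(cube)


cube = precompute(alarm_count)
alarms_1998_all_elements = cube[("1998", "ALL")]
stored_cells = len(cube)  # more cells than the base data holds
```

In this toy case four base points turn into fifteen stored cells, and each extra axis or level multiplies the combinations again.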
Now - I don't know exactly what you mean by one variable subtracted from
another - you could subtract regions over a single variable - for
instance, alarm count for 1999 minus alarm count for 1998 - or perform
more complicated operations - std dev of alarms at the month level for the
year 1998, or a t-test on the first quarter of 1998 vs. 1999.
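A small Python sketch of the first two operations, with invented data (the t-test is left out because it would need a statistics package beyond the standard library). The `monthly_totals` helper is a hypothetical name, and months with no recorded alarms are simply absent rather than counted as zero:

```python
from collections import defaultdict
from statistics import stdev

# Hypothetical data keyed by (month, network_element).
alarm_count = {
    ("1998-01", "router_a"): 4,
    ("1998-02", "router_a"): 1,
    ("1998-02", "switch_b"): 7,
    ("1998-03", "switch_b"): 6,
    ("1999-01", "router_a"): 3,
    ("1999-02", "switch_b"): 5,
}


def monthly_totals(variable, year):
    """Total the variable over all elements, month by month, for one year."""
    totals = defaultdict(int)
    for (month, _element), value in variable.items():
        if month.startswith(year):
            totals[month] += value
    return dict(totals)


totals_1998 = monthly_totals(alarm_count, "1998")
totals_1999 = monthly_totals(alarm_count, "1999")

# Region minus region over a single variable: 1999 total minus 1998 total.
year_over_year = sum(totals_1999.values()) - sum(totals_1998.values())

# Sample standard deviation of the month-level totals within 1998.
spread_1998 = stdev(list(totals_1998.values()))
```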
But I have a feeling this isn't exactly what you mean... lemme know.
josh