Oracle FAQ | Your Portal to the Oracle Knowledge Grid |
![]() |
![]() |
Home -> Community -> Usenet -> c.d.o.server -> Re: Computing truncated mean
On Feb 7, 7:50 am, "Charles Hooper" <hooperc2..._at_yahoo.com> wrote:
> On Feb 6, 4:40 pm, Walt <walt_ask..._at_SHOESyahoo.com> wrote:
>
> > A truncated mean or a trimmed average is the average after throwing away
> > the top and bottom n percent.
>
> > I seem to recall a built-in function in Oracle that did this fairly
> > simply, but I can't seem to locate it in the docs. I could cook up my
> > own in PL-SQL, but if there's something already out there I'd prefer to
> > use it. Anybody know a simple way to do it?
>
> > Thanks.
>
> > //Walt
>
> > W2k3, Ora 9.2 if that matters...
>
> Set up a test case:
> CREATE TABLE T1 (N NUMBER(10));
>
> INSERT INTO T1 VALUES (1);
> INSERT INTO T1 VALUES (2);
> INSERT INTO T1 VALUES (8);
> INSERT INTO T1 VALUES (10);
> INSERT INTO T1 VALUES (15);
> INSERT INTO T1 VALUES (20);
> INSERT INTO T1 VALUES (34);
> INSERT INTO T1 VALUES (40);
> INSERT INTO T1 VALUES (55);
> INSERT INTO T1 VALUES (18);
> INSERT INTO T1 VALUES (22);
> INSERT INTO T1 VALUES (28);
> INSERT INTO T1 VALUES (32);
> INSERT INTO T1 VALUES (40);
> INSERT INTO T1 VALUES (49);
> INSERT INTO T1 VALUES (50);
> INSERT INTO T1 VALUES (80);
> INSERT INTO T1 VALUES (200);
> INSERT INTO T1 VALUES (201);
> INSERT INTO T1 VALUES (210);
>
> SELECT
> *
> FROM
> T1
> ORDER BY
> N;
>
> N
> ----------
> 1
> 2
> 8
> 10
> 15
> 18
> 20
> 22
> 28
> 32
> 34
> 40
> 40
> 49
> 50
> 55
> 80
> 200
> 201
> 210
>
> Now, an experiement to find the point in our table that represents the
> bottom 20% (START_POINT) and top 20% (END_POINT):
> SELECT
> PERCENTILE_DISC(0.20) WITHIN GROUP (ORDER BY N) START_POINT,
> PERCENTILE_DISC(0.20) WITHIN GROUP (ORDER BY N DESC) END_POINT
> FROM
> T1
> GROUP BY
> 1;
>
> START_POINT END_POINT
> ----------- ----------
> 10 80
>
> Now that we have the end points defined, we need a way to limit the
> input into the AVG function so that only the rows that contain column
> values between the two extremes are retrieved. Sliding the above into
> an inline view and joining it to our original table solves the
> problem:
> SELECT
> AVG(N) AVERAGE,
> COUNT(*) NUM_RECORDS,
> MIN(N) MINIMUM,
> MAX(N) MAXIMUM,
> PERCENTILE_DISC(0.5) WITHIN GROUP (ORDER BY N) MIDPOINT_1,
> PERCENTILE_DISC(0.5) WITHIN GROUP (ORDER BY N DESC) MIDPOINT_2
> FROM
> T1,
> (SELECT
> PERCENTILE_DISC(0.2) WITHIN GROUP (ORDER BY N) START_POINT,
> PERCENTILE_DISC(0.2) WITHIN GROUP (ORDER BY N DESC) END_POINT
> FROM
> T1
> GROUP BY
> 1) L
> WHERE
> T1.N BETWEEN L.START_POINT AND L.END_POINT;
>
> AVERAGE NUM_RECORDS MINIMUM MAXIMUM MIDPOINT_1 MIDPOINT_2
> ---------- ----------- ---------- ---------- ---------- ----------
> 35.2142857 14 10 80 32 34
>
> After throwing out those values outside the 20% to 80% range, our
> average is 35.21, there are 14 records, the minimum is 10, maximum is
> 80, the first number in the middle of the group sorted from lowest to
> highest is 32, and the first number in the middle of the group sorted
> from highest to lowest is 34.
>
> Charles Hooper
> PC Support Specialist
> K&M Machine-Fabricating, Inc.
Continuing with another example that returns a different result (which
right answer do you want?):
SELECT
N,
PERCENT_RANK() OVER (ORDER BY N) RANKING
FROM
T1
ORDER BY
N;
N RANKING
---------- ----------
2 .052631579 8 .105263158 10 .157894737 15 .210526316 18 .263157895 20 .315789474 22 .368421053 28 .421052632 32 .473684211 34 .526315789 40 .578947368 40 .578947368 49 .684210526 50 .736842105 55 .789473684 80 .842105263 200 .894736842 201 .947368421 210 1
You will note now that the between 20% and 80% rule will include those
values between 15 and 55, if we use the above as a row limiter:
SELECT
AVG(N) AVERAGE,
COUNT(*) NUM_RECORDS,
MIN(N) MINIMUM,
MAX(N) MAXIMUM,
PERCENTILE_DISC(0.5) WITHIN GROUP (ORDER BY N) MIDPOINT_1,
PERCENTILE_DISC(0.5) WITHIN GROUP (ORDER BY N DESC) MIDPOINT_2
FROM
(SELECT
N,
PERCENT_RANK() OVER (ORDER BY N) RANKING
FROM
T1)
WHERE
RANKING BETWEEN .20 AND .80;
AVERAGE NUM_RECORDS MINIMUM MAXIMUM MIDPOINT_1 MIDPOINT_2
---------- ----------- ---------- ---------- ---------- ----------
33.5833333 12 15 55 32 34
After throwing out those values outside the 20% to 80% range {values in () indicate previous results}, our average is 33.58 (35.21), there are 12 (14) records, the minimum is 15 (10), maximum is 55 (80), the first number in the middle of the group sorted from lowest to highest is 32 (32), and the first number in the middle of the group sorted from highest to lowest is 34 (34).
Charles Hooper
PC Support Specialist
K&M Machine-Fabricating, Inc.
Received on Wed Feb 07 2007 - 07:08:43 CST
![]() |
![]() |