Re: Relation subset operators
Date: Tue, 2 Jun 2009 13:34:03 -0700 (PDT)
Message-ID: <b59af2db-cd0c-4b1f-b01e-f9a89d1842bf_at_k8g2000yqn.googlegroups.com>
On 2 juin, 22:24, Cimode <cim..._at_hotmail.com> wrote:
> While working on aggregation within groupping operations on the db
> core I design for relation manipulation, I questionned myself about
> the opportunity of using new operators to simplify relational division
> formulation and make it more systematic. For instance, conside the
> following questions:
>
> suppose CAR_SALE relation represented as
>
> CAR_SALE
> id car salesman price color date
> 1 Buick Henderson 10000 Red 01/01/1990
> 2 Buick Wilkinson 10000 Red 02/01/1990
> 3 Chevrolet Hutchinson 10000 Red 12/01/1990
> 4 Buick Wilkinson 10000 Blue 13/01/1990
> 5 Chevrolet Henderson 10000 Red 14/01/1990
> 6 Buick Henderson 10000 Blue 16/01/1990
> 7 Buick Henderson 10000 Blue 18/01/1990
> 8 Chevrolet Parson 10000 Yellow 18/01/1990
>
> Now consider the following questions
>
> > Question 1: What are the total sales of salesmen who have sold AT LEAST (one) Blue car.
>
> returning...
>
> salesman total_sale
> Wilkinson 20000
> Henderson 30000
>
> > Question 2: What are the total sales of salesman who have sold AT LEAST (one) blue car AND (one) red car.
>
> returning...
>
> salesman total_sale
> Wilkinson 20000
> Henderson 30000
>
> > Question 3: What are the total sales of salesman who have sold AT LEAST (one) blue car OR (one) red car.
>
> returning...
>
> salesman total_sale
> Hutchinson 10000
> Wilkinson 20000
> Henderson 30000
>
> > Question 4: What are the total sales of salesman who have NOT sold ANY blue cars
>
> returning...
>
> salesman total_sale
> Hutchinson 10000
> Parson 10000
>
> > Question 5: What are the total sales of salesman who have NOT sold NEITHER a blue car NOR a red car
>
> returning...
>
> salesman total_sale
> Parson 10000
>
> Even though the above examples are easy to express algebrically (at
> least compared to their SQL expression), they must to be decomposed
> into a minimum of two to three elementary operations to be solved when
> using elementary operators (JOIN, UNION, GROUP BY). I thought about
> creating an specific operator for groupping subsets to simplify the
> expression of such problems. Such operator would allow to operate an
> relation involved in a division operation with one attribute drawn
> from the header subset. I thought about symbolizing the operator as */
> * followed by the operator to be implemented within the groupping
> subset such as */=* */>*, */<*, */<>*, */*. Using such operator
> allows the expression of the questions above in a simpler way.
>
> SYMBOLOGY
> / : relational division
> +: relational union
> /=: equality within the group
>
> DEFINITION
> For a relation R and un-ary relations p and q part of R header, and V
> values drawn from any attribute of R, */=* would be defined such as
>
> R/p WITH q /= V <=> R/p (INNER) JOIN (p WHERE q = V) ON p
>
> Using the above definition, the expression of Question 1 comes to
>
> CARSALE/salesman WITH color /= 'Blue' is a interesting shortcut to
> (CARSALE/salesman) JOIN (salesman WHERE color = 'Blue') ON salesman
>
> to get Question 2, instead of writing (CARSALE/salesman) JOIN
> ((salesman WHERE color = 'Blue') JOIN (salesman WHERE color = 'Red')
> on salesman ) ON salesman, we could simply write...
>
> CARSALE/salesman WITH color /= 'Blue' AND color /= 'Red'
>
> to get Question 3, instead of writing (CARSALE/salesman) JOIN
> ((salesman WHERE color = 'Blue') UNION ALL (salesman WHERE color =
> 'Red') on salesman ) ON salesman, we could write...
>
> CARSALE/salesman WITH color /= 'Blue' OR color /= 'Red'
>
> to get Question 4 is even simpler...instead of writing (CARSALE/
> salesman) JOIN (salesman MINUS (salesman WHERE color = 'Blue')) ON
> salesman
> we keep the coherence of expressing
>
> CARSALE/salesman WITH color /<> 'Blue' (/<> would be the opposite of /
> =)
>
> Finally to get Question 5 expressed as (CARSALE/salesman) JOIN
> (salesman MINUS (salesman WHERE color = 'Blue') MINUS (salesman WHERE
> color = 'Red')) ON salesman
>
> we would simply write
>
> CARSALE/salesman WITH color /<> 'Blue' AND color /<> 'Red' .
>
> The need for such simplification is even more obvious when the queries
> become more complex
>
> We could imagine a Question 6 which would be: What are the total
> sales of salesmen who either have *not* sold a single Blue OR have
> *not* sold a single Red car...
>
> Using traditional operators, Question 6 would be expressed as
> (CARSALE/salesman) JOIN ((salesman MINUS (salesman WHERE color =
> 'Blue')) UNION ALL (salesman MINUS (salesman WHERE color = 'Blue')))
> ON salesman
>
> where they could be written...
>
> CARSALE/salesman WITH color /<> 'Blue' OR color /<> 'Red' .
>
> It seems to me such simplification would be beneficial to express more
> systematically algebric formulation of relational divisions.
>
> Note: I do not mention SQL into this because there would be too much
> verbose to write. Using the db core and subsequent language I am
> working ontoI can write this...
>
> --DEFINITION
> R1 = (1, 'A', 0, 10) + (1, 'A', 1, 10) + (1, 'B', 1, 10) + (1,
> 'B', 1, 10) WITH LEFT TO RIGHT ALIGN (id, groupper, differentiator,
> value)
> Q1 = R1/salesman WITH (color /= 'Blue')
> Q2 = R1/salesman WITH (color /= 'Blue' AND color /= 'Red')
> Q3 = R1/salesman WITH (color /= 'Blue' OR color /= 'Red')
> Q4 = R1/salesman WITH color /<> 'Blue'
> Q5 = R1/salesman WITH color /<> 'Blue' AND color /<> 'Red'
> Q5 = R1/salesman WITH color /<> 'Blue' OR color /<> 'Red'
>
> --PRESENTATION in tabular format
> PRESENT2D(salesman, SUM(price)).Q1
> PRESENT2D(salesman, SUM(price)).Q2
> PRESENT2D(salesman, SUM(price)).Q3
> PRESENT2D(salesman, SUM(price)).Q4
> PRESENT2D(salesman, SUM(price)).Q5
>
> produces previously mentionned results..
>
> Comments would be welcome to evaluate the usefulness of such
> operator....
>
> Thank you.
I also forgot to mention that such operator would allow to express specialization by constraints efficiently by expressing the any constraint a un-ary relation as a subset of its domain. For instance, considering domain I of Integers, a constraint to express un-ary relation R(attribute1) containing only strictly positive values could be expressed simply as
I/R WITH attribute1 /> 0
That way specialization by constraint can be formulated imperatively and without necessarily involving a header definition. Received on Tue Jun 02 2009 - 22:34:03 CEST