Re: Limit Rows in sql query output

From: Kenny Payton <k3nnyp_at_gmail.com>
Date: Thu, 19 Jun 2014 13:58:00 -0400
Message-Id: <65B98DA3-822F-42D0-A39A-BF50E3D901FA_at_gmail.com>



Thanks for pointing out the flaws. I could see how readers might not pick up on the original intent, especially since full disclosure was not included, and make assumptions in the wrong direction.

The row limit was known and accepted by the business.
> "This gave us the sorted top 1k rows of the first 10k rows."
>

We could not afford to wade through 10,000 rows before presenting the first 50. Fairly random block fetches at 8 ms get expensive, and this was the dawn of the online era. The previous approach was to re-execute the query with an ever-increasing ROWNUM limit (up to 900) for each page. Each query threw (50 x page number) rows away, so the last page was much more expensive than the first. The index-only search solution widened the net to 10k rows while still limiting the result set to 1k. The good news is that the original page (work table) population only needed to traverse index blocks and sort the results, which took a couple hundred milliseconds. From that point on, each page was fairly consistent, since it only needed to fetch 50 rows at a time.
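The cost of the old re-execution approach is easy to see in a sketch. This is not the original Oracle code; it is a SQLite stand-in with invented table and column names, where LIMIT plays the role of the ROWNUM stopkey:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, placed_at INTEGER)")
conn.executemany("INSERT INTO orders (placed_at) VALUES (?)",
                 [((i * 7) % 1000,) for i in range(500)])

PAGE = 50

def fetch_page_old(page_no):
    """Re-execute with an ever-increasing stopkey: the database produces
    page_no * PAGE rows, and all but the last 50 are thrown away."""
    rows = conn.execute(
        "SELECT id, placed_at FROM orders ORDER BY placed_at LIMIT ?",
        (page_no * PAGE,)).fetchall()
    return rows[(page_no - 1) * PAGE:]   # discard (page_no - 1) * PAGE rows

page3 = fetch_page_old(3)
print(len(page3))   # 50 rows kept; 100 rows were produced and discarded
```

Each later page repeats all the work of the earlier ones, which is exactly why the last page was so much more expensive than the first.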

As for data updates, this was read-only data; when new data was pushed to the system, via transportable tablespaces, the work table was purged.

On Jun 19, 2014, at 8:50 AM, Mark W. Farnham <mwf_at_rsiz.com> wrote:

> IF I understand you correctly, your 10k/1k method would only sample from the correct 10k rows by luck or by selecting only columns in an index ordered to your desire and used by the plan.
>
> We’d have to see the data, index, query, and plan to know whether your method is reliable.
>
> There is a method based on selecting an offset downward from the max of a numeric key that is reliable.
>
> That looks like:
>
> select c.* from (select b.* from yourtable b, (select max(a.num) high from yourtable a) x where b.num > x.high-10000 order by b.num desc) c where rownum <= 100 order by c.num
>
> By limiting the inner queries to <keys> and rowid, you can make an additional outer query of the form: select d.* from yourtable d where d.rowid in ( ).
>
> Knowing the relative sparseness of “num” is relevant to selecting a reasonable offset. Similar methods work for other subtract-able key types if you have some reasonable way to make a reliably large enough offset subtraction value to exceed the number of rows to be fetched by the final query sufficiently (and ideally with as small an oversample as is practical). If there is a breakdown between your logic and the actual data received, then you are vulnerable to error. And there may indeed be no reasonable algorithm by which to filter from the index to get “enough” of the correct rows, especially when ties must be considered as well as a legitimate underflow of available rows (regardless of stopkey).
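The offset-from-max query above can be exercised end to end. A sketch in SQLite with invented data (a sparse numeric key in steps of 5, offset 10000 as in the post); note that Oracle's rownum stopkey applies before the final ORDER BY, so in SQLite the LIMIT has to sit on the inner query:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourtable (num INTEGER PRIMARY KEY, payload TEXT)")
conn.executemany("INSERT INTO yourtable VALUES (?, ?)",
                 [(n, f"row{n}") for n in range(0, 50000, 5)])  # sparse key

# Offset down from MAX(num), stopkey the descending result at 100 rows,
# then present the final 100 in ascending order.
rows = conn.execute("""
    SELECT c.num, c.payload FROM (
        SELECT b.num, b.payload
        FROM yourtable b,
             (SELECT MAX(a.num) AS high FROM yourtable a) x
        WHERE b.num > x.high - 10000
        ORDER BY b.num DESC
        LIMIT 100
    ) c
    ORDER BY c.num
""").fetchall()

print(len(rows), rows[0][0], rows[-1][0])   # → 100 49500 49995
```

The offset of 10000 over a key spaced every 5 oversamples by a factor of 20, which is the trade-off the post describes: the offset must reliably cover the rows needed, ideally without being much larger.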
>
> Needing tortuous constructions like this to get the right answer quickly was a good reason for Oracle to build the stuff Mr. Hall references in his blog.
>
> As for the worktable solution, that is fine as long as the data being evaluated is static (or you do not want changes after you start analysis to be reflected in the results projected.) Sometimes that is true. Otherwise a materialized view type solution would be required.
>
> I’m not claiming your method is wrong without seeing it in detail. But I have seen a lot of folks think they had a correct solution similar to what you’ve described that was not reliable.
>
> mwf
>
> From: oracle-l-bounce_at_freelists.org [mailto:oracle-l-bounce_at_freelists.org] On Behalf Of Kenny Payton
> Sent: Thursday, June 19, 2014 5:48 AM
> To: krishna000_at_gmail.com
> Cc: oracle-l_at_freelists.org
> Subject: Re: Limit Rows in sql query output
>
> Another approach is to use a work table to store rownum and rowid from an initial query and paginate over the work table. I have done this before and it worked very well. I was able to ensure the initial query that populated the work table was only accessing the index and was very quick. The end result was the actual table rows were only accessed once over the lifetime of the search at the expense of some redo/undo. We also stored the session id from the application so the user session could bounce around app servers and not lose the state. The data was cleaned up nightly as it aged out.
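The work-table idea can be sketched like this, with SQLite's rowid playing the part of Oracle's ROWID and all names invented. The point is that the expensive ordered pass runs once and stores only (rownum, rowid), and every page after that is a cheap range scan plus 50 row fetches:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE em_info (rid INTEGER PRIMARY KEY, dbname TEXT, detail TEXT)")
conn.executemany("INSERT INTO em_info VALUES (?, ?, ?)",
                 [(i, f"db{i:03d}", "x" * 20) for i in range(300)])

# One-time population: store only (session, rownum, rowid) -- the sort
# touches the key, never the wide table rows.
conn.execute("CREATE TABLE work (sid TEXT, rn INTEGER, rid INTEGER)")
conn.execute("""
    INSERT INTO work
    SELECT 'session-42', ROW_NUMBER() OVER (ORDER BY dbname), rid
    FROM em_info
""")

def page(sid, page_no, page_size=50):
    # Each page: range scan on work, then fetch just those rows by rowid.
    lo = (page_no - 1) * page_size + 1
    return conn.execute("""
        SELECT e.dbname, e.detail
        FROM work w JOIN em_info e ON e.rid = w.rid
        WHERE w.sid = ? AND w.rn BETWEEN ? AND ?
        ORDER BY w.rn
    """, (sid, lo, lo + page_size - 1)).fetchall()

p2 = page("session-42", 2)
print(p2[0][0], len(p2))   # → db050 50
```

Keying the work table by a session id is what lets the user session bounce between app servers without losing pagination state, as described above.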
>
> I also created a set of views that matched index definitions to help guide developers down the right path. The view was limited to columns in the indexes to ensure they didn't step off into table block fetches which dramatically slowed things down.
>
> I'd be interested in how efficient the 12c feature is; my approach was developed many years ago on 8i.
>
> Another problem I recall is that we wanted sorted data but needed realistic limits on the result set. We ordered the sub query and limited it to 10k rows, then limited the outer query to 1k rows. This gave us the sorted top 1k rows of the first 10k rows. Fetching the first 10k rows was lightning fast as long as it stayed in the index. The goal was 1 s for the first 50 rows returned to the user, and I believe this pre-work took around 200 ms to complete, back when 8 ms random reads were acceptable.
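The nested-limit trick reads like this in a SQLite sketch (numbers shrunk from 10k/1k to 1000/100 to keep it readable, table and columns invented). The inner LIMIT bounds how far the index is walked; the outer ORDER BY and LIMIT sort that bounded set and cap the result:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, score INTEGER)")
conn.executemany("INSERT INTO events (score) VALUES (?)",
                 [((i * 37) % 5000,) for i in range(2000)])

# Inner query: walk the index order (id here) and stop after the first
# 1000 rows.  Outer query: sort that set by the user's key, keep the top 100.
top = conn.execute("""
    SELECT score FROM (
        SELECT score FROM events ORDER BY id LIMIT 1000
    ) ORDER BY score DESC LIMIT 100
""").fetchall()

print(len(top))
```

The result is the sorted top 100 of the first 1000 rows, not of the whole table, which is exactly the known-and-accepted limitation discussed at the top of this thread.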
>
> Kenny
>
> On Jun 18, 2014 5:32 AM, "Bala Krishna" <krishna000_at_gmail.com> wrote:
> Hi All,
>
> We are interested in fetching output row-wise, for example:
>
> first iteration: 1 .. 50
> second iteration: 51 .. 100
> third iteration: 101 .. 150, and so on.
>
> SELECT * FROM (SELECt m.*,rownum r FROm EM_INFO m ORDER BY DBNAME ASC) WHERE r >=1 AND r <=50
>
> or
>
> SELECT * FROM (SELECt m.*,rownum r FROm EM_INFO m ) WHERE r >=1 AND r <=50 and ORDER BY DBNAME ASC
>
> I've tried both of the above queries, but the output is not in sorted order. Can somebody please help me?
>
> Regards
> Bala
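For completeness: the classic pre-12c fix for the question's queries is to sort in the innermost query, so that the row number is assigned after the ORDER BY rather than before it. A SQLite sketch of that pattern, with ROW_NUMBER standing in for Oracle's ROWNUM and made-up columns on EM_INFO:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE em_info (dbname TEXT, host TEXT)")
conn.executemany("INSERT INTO em_info VALUES (?, ?)",
                 [(f"DB{i:03d}", f"host{i}") for i in range(200)])

# Number the rows *after* sorting, then filter on the row number.
# The mistake in the question was assigning the row number before ORDER BY.
rows = conn.execute("""
    SELECT dbname, host, r FROM (
        SELECT m.*, ROW_NUMBER() OVER (ORDER BY dbname) AS r
        FROM em_info m
    ) WHERE r BETWEEN 51 AND 100
""").fetchall()

print(rows[0][0], rows[-1][0])   # → DB050 DB099
```

In Oracle syntax the same shape is SELECT * FROM (SELECT m.*, ROWNUM r FROM (SELECT * FROM EM_INFO ORDER BY DBNAME) m) WHERE r BETWEEN 51 AND 100, which returns pages in sorted order as the question asks.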

--
http://www.freelists.org/webpage/oracle-l
Received on Thu Jun 19 2014 - 19:58:00 CEST
