Blogs Home
Monday, October 17, 2016 12:00 PM

SQL Tip: The Good, the Bad, and the Ugly of TOP X

Written by Thomas Harlan, Jim McGrath; Reporting Services Team - iatricSystems

As you’re learning SQL, you will invariably come across the TOP X clause to a SELECT, which allows you to get – for example – the first 50 rows of a query:

SELECT TOP 50 ADV.VisitID FROM livedb.dbo.AdmVisits ADV ;

This can be quite handy for testing if tables are populated, or just grabbing some records to review when you’re looking for something. And if you’re an old, grizzled, MEDITECH NPR programmer (as many of our followers are), you’ll be happy to see something that acts like Z.record.limit.

However...

The Ugly

You won’t notice right away (unless you’ve been careful to read the MSDN documentation), but SQL does not guarantee that records coming back from a query will be in a specific order, unless you add the ORDER BY clause to your query.

Which means when using TOP X, that you will get the first X records… which may not be the first 50 in the order you wanted… they could, potentially, be in a semi-random order, depending on which data pages the engine traversed first.

So you need to be mindful of setting that ORDER BY, but that means your query requires sorting after retrieving all records and then picking the first 50. This, as you can imagine, might be unexpectedly slow!

The Bad

In production reporting, a real use of TOP 1 is to check a sub-table and get the first instance of a something, like a query response. With this approach, you have a derived or calculated field, which is generated by a sub-query on a row-by-row basis, like:

( SELECT TOP 1
       RQR.ValueInfo
FROM   livefdb.dbo.RegAcctQuery_Result RQR
WHERE  RQR.SourceID = RAM.SourceID
       AND
       RQR.VisitID = RAM.VisitID
       AND
       RQR.Query_MisQryID = ‘ADM.SMOKING’
       AND
       RQR.ValueInfo <> '~' -- Ignore placeholder values?
ORDER BY RQR.[DateTime] DESC )

NOTE: This code example is simplified. The real data in RegAcctQuery_Result needs cleaning up before you can use it much of the time.

If you look at this code, though, you see that we’re hitting the RegAcctQuery_Result table multiple times:

  • Once per row in the main / outside result-set
  • And we check for this visit
  • And the ADM.SMOKING query

Then we sort the results of that query, and pick off the last documented instance (because of that ORDER BY… DESC clause).

This works. But it can be slow… Particularly as you add more and more of these fields, the whole query gets slower and slower. So it’s valid, but not the approach you want.

The approach you want is to get all of the VisitID’s into a #temp table, then hit all of the supporting tables like RegAcctQuery_Results and get the responses, and line-up/line-down them with a windowing function, then JOIN to that #temp table to get the first or last instance.

The Good

There is, however, one very, very handy use of TOP X that you should keep in mind, for that lucky day when you need it.

Say you have a really big query – hundreds of fields perhaps - which is quite complex and the day comes when you realize you need to store the output of that query in a table… and it needs to be a real table in the db, probably with a whack of indexes on it.

You could dig through the query and figure out the column data-types and widths of all the fields and then write a script to create that new table.

Or… you could do:

SELECT TOP 0 *
INTO <NewTableName>
FROM <ReallyComplexQuery> ;

And be done! Zero (0) is a valid parameter for TOP, and this makes a new, empty table for you in a snap. Then you just have to add your indexes, and you are ready to populate it via some scheduled process.

If you need more help…

Our Report Writing team can help you fix reports, create new ones, make old ones faster, and much more. Simply reach out to your Iatric Systems Account Executive or our NPR report writing team at reportwriting@iatric.com to discuss how we can help support your team!