Monday, December 31, 2018 3:00 PM

SSIS Tip: Using #Temp Tables with Integration Services Packages

Written by Thomas Harlan, Jim McGrath; Reporting Services Team - iatricSystems

RWS-Blog-HeaderImage-report-writing-2018-12b

SQL Server Integration Services is a powerful tool for building a single DTSX package, which can generate a single output file — or dozens of them — and let that set of work be scheduled to run automatically. And it’s included in your MEDITECH Data Repository server license for SQL Server! Instead of a dozen NPR reports, each with its own schedule (or the same setup using SQL and bcp’ing out files), you can have one .DTSX package that calls a stored procedure per file you need.

Which is all fantastic, until you find that you need a #temp table for some kind of multi-step processing inside your stored procedure…

This is useful for many data scenarios, for example:

When you create a calendar, then report on things happening within a given day, and you need a row where nothing happens
When you want to span out an event, like an ADT admit/discharge pair of datetimes, to include all of the days between those events

And then when you run your SSIS package, or when you try and connect that stored procedure to a Data Flow task, you get an error like this:

RWblog_using#temptables_pic1 Why does that happen? An excellent question…

When you connect a Data Flow task to a query or a stored procedure, the Data Tools / Business Intelligence Design Studio attempts to figure out what result set the query returns. Like many Microsoft tools, it thinks it’s smart enough to do something behind the scenes to help you, without you being able to control what it’s doing.

In this case, the tool parses the code, looking for the returned SELECT list. And if there is a #temp table involved, it hits that CREATE TABLE code and stops with an error, because there will be multiple SELECT list(s) in the stored procedure or query.

At this point, you’re stumped, and you go looking on the Internet for a solution. You might find a lot of crazy talk involving using TABLE variables, or SET FMTONLY OFF, or using global ##temp tables, etc.

You don’t want to do any of those things… you want to be able to use #temp tables in your code with the minimum amount of fuss and bother, and not need to do any setup outside of your query / stored procedure.

There are two ways to proceed — one if you’re on SQL 2005-2008, and one if you’re on SQL 2012 or better. (By the way, if you’re still on SQL 2008 R2, it’s time to upgrade your database servers.)

Using SQL 2008

At this level of the database, we need to fool the parser by providing it with a “template” of our final result set. At the very top of your stored procedure, add these two SET lines:

RWblog _using#temptables_pic 2 Then, before you do anything else, you make an IF statement that wraps up a fake SELECT:

RWblog_using#temptables_pic 3 The 1=2 part means everything between the BEGIN and END will never execute, because for SQL 1 does not equal 2.

But the SELECT hidden inside the IF 1=2 is picked up by the parser, and it then thinks the result set of the query / stored procedure is the fields (and data types) that you’ve listed.

At the end of your multi-step processing code with a hundred #temp tables, you make sure you return a result set with the same structure. Then — to your relief — SSIS is happy, you’re happy, and everything works as intended.

Key Points

The structure you define in IF 1=2 BEGIN/END must match the result set structure by number of fields / columns AND datatypes being returned by the stored procedure. Otherwise, the BIDS designer will complain bitterly!

Extra Note
Because you have to define the lengths of the fields in your template, make them big. MEDITECH is always monkeying around with field lengths in the DR. At a minimum, make them the equivalent of a FREE field in NPR — that is, VARCHAR(256), as in the example above. If one of the fields is a comment, set it to VARCHAR(1250) or larger.

Then you are less likely to need to change the result set definition in the future.

Using SQL 2012 and up
Obviously, programmers complained bitterly about having to do this. So, in SQL 2012 Microsoft introduced a new clause for the EXEC statement for a stored procedure: WITH RESULT SETS.

This clause goes after your EXEC and all of its parameters, like this:

RWblog_using#temptables_pic 4 This has the same effect. Essentially, you’re declaring an expected result set from the stored procedure, but it happens outside of the stored procedure.

No changes are needed internally, which is both good and bad. It’s nice not having to monkey around with the actual code, but then you have to remember to change how you call the stored procedure in SSIS to include the WITH RESULT SETS clause.

Key Points Continued…

The custom code database (zcus in this example) must be at compatibility level (CL) 110 or better to support this functionality. Note that your custom code database might be at a different CL than your livedb.
This only works with a stored procedure. Which means it’s a good thing you’re already basing all of your Data Flow tasks on sp’s, right?
The structure you define in WITH RESULT SETS must match the result set structure by number of fields / columns being returned by the stored procedure. Otherwise, kaboom!
It is possible to mismatch the DATA TYPE of a field between the stored procedure returned data and the WITH RESULT SETS definition. This causes an implied conversion operation, which is a performance hit and may blow up unexpectedly at runtime. So, don’t do that.
The result set from the stored procedure and the result set in the clause match positionally, which means you can change the column / field names between the two… This is very, very confusing to debug. So, don’t do that either!
See that (( at the beginning and the end )) of the clause? That’s there because you can send back multiple result sets from a stored procedure with this setup, which looks like this:

RWblog_using#temptables_pic 5

You may include that to support ADO / .NET calls to a stored procedure, where you can process / manipulate the multiple sets. But SSIS cannot do that… So, don’t do that, either. Only one result set definition inside (( )) is what you need.

Note that in testing, you can run just the EXEC without the WITH RESULT SETS clause to make sure the sp is working properly.

But Sometimes…
Very rarely (well, it’s happened to me once…), when you have a stored procedure and SSIS package that you migrate from SQL 2008 to SQL 2012+, the WITH RESULT SETS approach just does not work. In that case, I put the IF 1=2 structure back in place, and it worked fine.

Also, and this is more subtle, if when you schedule the SSIS Package through SQL Agent, and you happen to turn off the 64-bit Runtime option:

RWblog_using#temptables pic 6 The WITH RESULT SETS will not work with your package! This is because the 32-bit version of the SSIS engine is running the SQL 2008 code. If you have to use a 32-bit execution, you will likely need to use the IF 1=2 structure in your code.

Extra Credit
Update your SSIS development workflow to include always creating an Extract Documentation spreadsheet, based on our handy template.

And finally…
Our Report Writing team can help with Data Repository reporting, NPR report writing, data extract or analysis, and much more. Simply reach out to your Iatric Systems Account Executive or our NPR report writing team at reportwriting@iatric.com to discuss how we can help support your team!