Thursday, February 23, 2017 1:00 PM

DR Tip: A Dynamic Extract Documentation Template

Written by Thomas Harlan, Jim McGrath; Reporting Services Team - iatricSystems

This blog post is designed for readers who use MEDITECH’s Data Repository.

As part and parcel of developing ETL extracts for vendors, it is often quite useful to build a technical specification first. In the case of vendor extract files that involve the hospital getting money back from the federal government (think 340B), I recommend that a proper file spec be mandatory.

You should not think of this as a chore. Instead, view it as an opportunity to meet organizational requirements and a chance to make your life easier when coding the extract.

Here are four versions of an Extract Documentation Template:

The layout of each of these templates is the same, but what is different is the Data Dictionary lookup tab, which has been populated with the MEDITECH data definitions for the most recent version of each platform. In case you are not on the latest version of MEDITECH, here are the .sql codes to populate the Data Dictionary tab:

These queries let you repopulate the Data Dictionary tab with detail specific to your DR.

Using the Template

First, when you are preparing to document an ETL file extract, make a copy of the base template into the folder where you have all of your project documentation. (You have project documentation folders for each of your ETL extracts, right?)

Then, in that new spreadsheet, you have a series of pre-created tabs:

The Overview Tab

Update the (Vendor) Extract Specifications cell to reflect this specific vendor.
Fill in the Wiki (or KB) Article URL to keep track of that.
Document the Frequency cell with “Daily”, “Weekly”, etc.
And then document each file in the extract (for example, “Demographics”, “Charges”, etc.) and the basic logic used to pull records for that file when it runs.

Then you're done with that one.

The FileXX Tabs

We’ve pre-created five FileXX tabs, but you can copy them and make as many as you need. Hopefully you don’t have dozens of them…

On each FileXX tab:

Rename the Tab to the base name of the file, like “Encounter.”
Note this updates cell B1 to reflect the tab name.
The Extract By cell gives you another spot to document the logic used to pull records into this file.
The Format cell is to document how the output file will be formatted… CSV, tab-delimited, etc.
The real meat of the tab follows in the main grid:

The File Field Name column is specifically for the field/column name provided by the vendor. Preferably, this column has the exact field names the vendor wants to see in the extracted file.
The DataType column is for the SQL Server data type of the field to be extracted – as the vendor wants to see it! Note that this has a picklist on the field. An example of using this column is where a data element in the DR is a DATETIME, but the vendor wants to see it in a VARCHAR format (like CCYY-MM-DD HH:MM). Here you’d put VARCHAR, then…
In Length you put the field length that the vendor expects to see. In our current example you’d put 19 (as a date in CCYY-MM-DD HH:MM:SS format is 19 characters long).
In DR Table.Field or Value, you’d put either a hard-coded value that will always be the same for each record in the file (not so common…), or the DR Table.Field reference where you’re getting the value. An example of this could be:

AbsAnesthesiaPhysicians.ProviderID
When you do this, the MT-Element column should automatically populate with the data element in MEDITECH that populates the DR table and field. In this case, you could get something like:

ABS.PAT.oper.anesth.dr

This lets you automatically construct the data lineage of the field in the vendor file. In one place, you can now see what the File Field Name is for the vendor, the location in the DR from which you’re extracting, and then the source data element in MEDITECH from which the DR is populated.
The next two columns – Field Definition and Notes – are, respectively, for information provided by the vendor about their definition of the field, and your notes about how you’re getting / formatting / using the field.

A tiny break follows, and then there are four tremendously useful calculated columns (though you probably won’t need all four each time you build an extract):

The SSIS Template (SQL2008<) column builds a definition for use in SQL Server Integration Services DTSX packages of the record-set layout your stored procedure is sending to SSIS. You will only need this column if your stored procedure is using temp tables (which will confuse SSIS) and you happen to still be using SSIS 2008.
The Result Set (SQL 2012+) column gives you the definition for SQL 2012 and higher, to avoid the same problem with temp tables by adding a WITH RESULT SETS clause to your stored procedure EXEC.
The SELECT List column provides a handy starting point for your actual SELECT statement that is going to get data for this file. Note that each line is automatically numbered, which is very helpful for making sure you’ve got everything in the right place in a 500+ field extract, and does an automatic CONVERT() to the datatype and length the vendor wants to see. If you’re using SSIS to generate your output, this is a life-saver… because the one thing that SSIS handles very badly… is changing the datatype and length of a column in the DTSX package. This lets you define both of those attributes up front, as the vendor expects them, and helps automate the conversion.
Finally on the far right is a Possessive Join column. This leverages the data def to pre-build the JOIN statement you will need to bring information from a possessive field into your query. So you can copy and paste those as needed into your main FROM/JOIN structure.

That should be enough to get you started!

You will find that if you use a template like this consistently, your total time to build, test, and deliver extracts will be significantly reduced.

Extra Credit

Update the Data Dictionary tab to automatically connect to your DR and run the data definition query as needed.

If you need more help…

Our Report Writing team can help you fix reports, create new ones, make old ones faster, and much more. Simply reach out to your Iatric Systems Account Executive or our NPR report writing team at reportwriting@iatric.com to discuss how we can help support your team!