SSIS Bulk Load Settings

Andrew Smith Posted on

This post explores the SQL Server bulk load settings that are exposed to SSIS, and measures their effect on a simple SSIS data flow.

It considers bulk loads into heaps and clustered indexes, as well as showing the effect of minimally logged inserts into a database running with the simple recovery model.

Test Conditions

The SSIS package used here is a simple data flow that reads all data from a 5 million row table, and writes it to an empty table on the same server. CHECKPOINT; DBCC DROPCLEANBUFFERS; was executed prior to each test, to prevent the source table being accessed from the buffer pool. OLE DB is used for both the source and destination components, and the underlying database is set to the simple recovery model. For each test, the package was executed via the SSIS catalog using Basic Logging. The average time recorded by the catalog from 3 successive runs was taken. The test machine is a 64-bit i5-6200 dual-core Dell laptop, running SQL Server 2014 SP1.

Fast Load Option

To begin with, the effect of using the Fast Load option on the OLE DB destination is measured:

Enabling it allows the OLE DB provider to make use of SQL Server’s BULK INSERT mechanism.

Test cases: Not using Fast Load is compared with using Fast Load, where in the latter case all Boolean Fast Load options are disabled and the remainder are left as their defaults.

Results:

Without Fast Load 998 secs
With Fast Load 25 secs

So the dramatic effect of the bulk insert can clearly be seen here.

TABLOCK Option

Aside from allowing for concurrent loads on the same table through the use of BU locks and avoiding the overhead of individual row level locks, TABLOCK can provide for minimally logged inserts when used on databases running with the simple recovery model.

Test cases: This test compares the Fast Load from the last test, with and without the TABLOCK option.

Results:

Without TABLOCK 25 secs
With TABLOCK 14 secs

We can see that the use of TABLOCK resulted in a healthy 44% performance improvement here.

Heap vs Clustered Index

Test cases: This test uses the same destination as previously, but now with a clustered index added. TABLOCK is off, and the result is compared to the previous heap load where TABLOCK was also off.

Results:

Heap 25 secs
Clustered Index 28 secs

So the clustered index (B-Tree) insert can be seen to be a little slower here.

Clustered index: with and without TF 610

Trace Flag 610 can provide for minimal logging on a clustered index.

Test cases: This test re-runs the same cluster load as previously, but this time with Trace Flag 610 set. The result is compared to the previous cluster load.

Results:

Without TF 610 28 secs
With TF 610 23 secs

So there’s some improvement here, but it still doesn’t compare with the fastest heap load.

Clustered index with TF 610: with and without a sorted input

Test cases: This test re-runs the same cluster load using TF 610 as previously, but this time informing the bulk loader that the source is already sorted on the cluster key (which it is for this test), by using the FastLoadOptions property of the OLE DB Destination:

FastLoad

Results:

Without sort 23 secs
With sort 20 secs

This tweak results in a modest improvement to the load time.

Maximum Insert Commit Size Option = 0

The Fast Load option for the OLE DB destination has a parameter called Maximum Insert Commit Size (MICS), which defaults to 2GB. When MICS is larger than the SSIS buffer size (default 10MB) then there is 1 commit per buffer. This has been the case for all tests so far.

Test cases: This test re-runs the best heap and clustered index loads, but with MICS = 0. This results in 1 commit for the entire load, instead of 1 per buffer.

Results:

Heap (with TABLOCK) 11 secs
Clustered Index (with TF 610 and ORDER) 11 secs

Whilst both loads are faster, the cluster load was affected the most, with a 45% improvement.

Summary

All of the options used made some difference in the test environment – some being very significant indeed. The fact that they are options is important though – pros and cons exist for practically all bulk load settings, relative to the particular environment in which they are being used.

It must be emphasised also that an SSIS data load never exists in isolation. Decisions such as whether to run a database with the simple recovery model or whether Trace Flag 610 can be set globally, can only be made within the context of the system as a whole.

Finally, the above tests re-enforce the fact that benchmarking is key when developing an ETL package. Optimal performance can only be found by benchmarking candidate solutions within the context of the target environment.