A SQL Server Integration Services (SSIS) package imports daily transactions from several
files into a SQL Server table named Transaction. Each file corresponds to a different store
and is imported in parallel with the other files. The data flow tasks use OLE DB destinations
in fast load data access mode.
The number of daily transactions per store can be very large and is growing- The
Transaction table does not have any indexes.
You need to minimize the package execution time.
What should you do?
A.
Partition the table by day and store.
B.
Create a clustered index on the Transaction table.
C.
Run the package in Performance mode.
D.
Increase the value of the Row per Batch property.
Explanation:
* Data Access Mode – This setting provides the ‘fast load’ option which internally uses a
BULK INSERT statement for uploading data into the destination table instead of a simple
INSERT statement (for each single row) as in the case for other options.
* BULK INSERT parameters include:
ROWS_PER_BATCH =rows_per_batch
Indicates the approximate number of rows of data in the data file.
By default, all the data in the data file is sent to the server as a single transaction, and the
number of rows in the batch is unknown to the query optimizer. If you specify
ROWS_PER_BATCH (with a value > 0) the server uses this value to optimize the bulkimport operation. The value specified for ROWS_PER_BATCH should approximately the
same as the actual number of rows.
I think it’s A. Partition the table by date and store
I think ‘D’ is most appropriate in this situation.
‘A’ would have been correct if you partition the table on “store”. If you partition the table on day (and then store), the daily transaction data which is of next day will either use the one of the existing or a single new partition for load. It will not be doing a parallel load.
To optimize parallel load for BULK INSERT in the above scenario, increasing ‘rows per batch’ is most effective; and, it is particularly very easy to configure when using OLE DB Destination in SSIS.
D. Increase the value of the Row per Batch property