Incremental Aggregation

If the source changes incrementally and you can capture changes, you can configure the session to process those changes. This allows the Integration Service to update the target incrementally, rather than forcing it to process the entire source and recalculate the same data each time you run the session.

For example, you might have a session using a source that receives new data every day. You can capture those incremental changes because you have added a filter condition to the mapping that removes pre-existing data from the flow of data. You then enable incremental aggregation.

When the session runs with incremental aggregation enabled for the first time on March 1, you use the entire source. This allows the Integration Service to read and store the necessary aggregate data. On March 2, when you run the session again, you filter out all the records except those time-stamped March 2. The Integration Service then processes the new data and updates the target accordingly.

Consider using incremental aggregation in the following circumstances:

- You can capture new source data. Use incremental aggregation when you can capture new source data each time you run the session. Use a Stored Procedure or Filter transformation to process new data.
- Incremental changes do not significantly change the target. Use incremental aggregation when the changes do not significantly change the target. If processing the incrementally changed source alters more than half the existing target, the session may not benefit from using incremental aggregation. In this case, drop the table and recreate the target with complete source data.
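As a rough illustration of the two circumstances above, the following Python sketch filters a source down to the records stamped with the current run date and then checks whether the change touches more than half of the existing target groups. It is a conceptual model only, not Integration Service code: the row layout, the `loaded_on` field, the year 2024, and both function names are assumptions made for the example.

```python
from datetime import date


def new_rows(rows, run_date):
    """Keep only the records stamped with run_date (hypothetical 'loaded_on' field)."""
    return [row for row in rows if row["loaded_on"] == run_date]


def incremental_is_worthwhile(changed_groups, target_groups, threshold=0.5):
    """Return True when the change alters at most `threshold` of the target groups.

    The 0.5 default mirrors the "more than half the existing target" guideline.
    An empty target returns False, because there is nothing to update incrementally.
    """
    target = set(target_groups)
    if not target:
        return False
    touched = len(set(changed_groups) & target)
    return touched / len(target) <= threshold


# Hypothetical source rows; the year is an assumption for the example.
source = [
    {"region": "east", "amount": 100, "loaded_on": date(2024, 3, 1)},
    {"region": "west", "amount": 50, "loaded_on": date(2024, 3, 1)},
    {"region": "east", "amount": 10, "loaded_on": date(2024, 3, 2)},
]

march_2 = new_rows(source, date(2024, 3, 2))
changed = {row["region"] for row in march_2}
worthwhile = incremental_is_worthwhile(changed, {"east", "west", "north", "south"})
```

Here only one of four target groups changes, so the check favors an incremental run. If three of the four groups changed, the same check would suggest reloading the target from the complete source instead.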
Integration Service Processing for Incremental Aggregation

The first time you run an incremental aggregation session, the Integration Service processes the entire source. At the end of the session, the Integration Service stores aggregate data from that session run in two files, the index file and the data file. The Integration Service creates the files in the cache directory specified in the Aggregator transformation properties.

Each subsequent time you run the session with incremental aggregation, you use the incremental source changes in the session. For each input record, the Integration Service checks historical information in the index file for a corresponding group. If it finds a corresponding group, the Integration Service performs the aggregate operation incrementally, using the aggregate data for that group, and saves the incremental change. If it does not find a corresponding group, the Integration Service creates a new group and saves the record data.

When writing to the target, the Integration Service applies the changes to the existing target. It saves modified aggregate data in the index and data files to be used as historical data the next time you run the session.
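The per-record logic described above can be pictured with a small Python sketch. It is a conceptual model under stated assumptions, not the Integration Service implementation: a plain dictionary stands in for the index and data files, the aggregate is assumed to be a simple sum, and `run_incremental` is a hypothetical name.

```python
def run_incremental(cache, rows):
    """Fold new rows into per-group running sums.

    cache: dict of group key -> running sum (stand-in for the index and data files)
    rows:  iterable of (group_key, amount) pairs read in this session run
    Returns the groups whose aggregate changed, i.e. the changes to apply to the target.
    """
    changed = {}
    for key, amount in rows:
        if key in cache:
            # Corresponding group found: update the stored aggregate incrementally.
            cache[key] += amount
        else:
            # No corresponding group: create a new group from this record.
            cache[key] = amount
        changed[key] = cache[key]
    return changed


cache = {}
# First run: the entire source is processed and the aggregates are stored.
first = run_incremental(cache, [("east", 100), ("west", 50), ("east", 25)])
# Second run: only the new records are processed against the stored aggregates.
second = run_incremental(cache, [("east", 10), ("north", 7)])
```

After the second run, `second` holds only the `east` and `north` groups, while `cache` still carries the untouched `west` total for future runs. A sum was chosen because it can be updated from the stored value alone; other aggregates may need more state per group than this sketch keeps.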
If the source changes significantly and you want the Integration Service to continue saving aggregate data for future incremental changes, configure the Integration Service to overwrite existing aggregate data with new aggregate data. Each subsequent time you run a session with incremental aggregation, the Integration Service creates a backup of the incremental aggregation files. The cache directory for the Aggregator transformation must contain enough disk space for two sets of the files.
When you partition a session that uses incremental aggregation, the Integration Service creates one set of cache files for each partition.

The Integration Service creates new aggregate data, instead of using historical data, when you perform one of the following tasks:

- Save a new version of the mapping.
- Configure the session to reinitialize the aggregate cache.
- Move the aggregate files without correcting the configured path or directory for the files in the session properties.
- Change the configured path or directory for the aggregate files without moving the files to the new location.
- Delete cache files.
- Decrease the number of partitions.

When the Integration Service rebuilds incremental aggregation files, the data in the previous files is lost.

Note: To protect the incremental aggregation files from file corruption or disk failure, periodically back up the files.

Reinitializing the Aggregate Files

If the source tables change significantly, you might want the Integration Service to create new aggregate data, instead of using historical data. For example, you can reinitialize the aggregate cache if the source for a session changes incrementally every day and completely changes once a month. When you receive the new source data for the month, you might configure the session to reinitialize the aggregate cache, truncate the existing target, and use the new source table during the session.

After you run a session that reinitializes the aggregate cache, edit the session properties to disable the Reinitialize Aggregate Cache option.

Avoid moving or modifying the index and data files that store historical aggregate information. If you move the files into a different directory, the Integration Service rebuilds the files the next time you run the session.
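Because moving the index and data files forces a rebuild, a periodic backup should copy the files and leave the originals in place. The sketch below shows one way to do that with the Python standard library. The function name, the directory arguments, and the `*.idx` and `*.dat` patterns are assumptions for illustration; check the cache directory of your own session for the actual file names before relying on them.

```python
import shutil
from pathlib import Path


def backup_cache_files(cache_dir, backup_dir, patterns=("*.idx", "*.dat")):
    """Copy matching cache files to backup_dir without moving the originals.

    The patterns are assumed extensions for index and data files.
    Returns the names of the files that were copied.
    """
    cache_dir = Path(cache_dir)
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)

    copied = []
    for pattern in patterns:
        for src in sorted(cache_dir.glob(pattern)):
            if src.is_file():
                # copy2 preserves timestamps; the source file stays where it is.
                shutil.copy2(src, backup_dir / src.name)
                copied.append(src.name)
    return copied
```

Running such a copy only while no session is writing to the cache directory avoids capturing a half-written pair of files.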
