
11/13/2017 DEV'S DATASTAGE TUTORIAL,GUIDES,TRAINING AND ONLINE HELP 4 U.

UNIX, ETL, DATABASE RELATED SOLUTIONS: DATASTAGE



DATASTAGE Performance Tuning Tips V1.1
Devendra Kumar Yada
This blog gives complete details on how to improve the performance of DataStage parallel jobs.

You may like these links as well:


1. More DataStage performance tuning tips
2. DataStage Partitioning Methods and Use
3. DataStage Jobs Performance Improvement Tips1
4. Partitioning Considerations

Most Common Points for DataStage Job Performance Tuning:

1. Select a suitable configuration file (number of nodes depending on data volume).
2. Size the buffer memory correctly.
3. Select the proper partitioning method.
4. Turn off Runtime Column Propagation wherever it is not required.
5. Take care with the sorting of the data.
6. Handle null values (use Modify instead of Transformer).
7. Try to decrease the use of Transformer stages (use Copy, Filter, Modify).
8. Use a Data Set instead of a Sequential File in the middle of large jobs.
9. Use at most 20 stages per job for best performance.
10. Choose Join, Lookup, or Merge depending on data volume.
11. Stop propagation of unnecessary metadata between the stages.
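
As a rough illustration of point 3 (selecting a proper partitioning key), the Python sketch below counts how many rows land in each partition under a simple hash partitioner and reports a skew ratio. It is not DataStage code: the function names partition_counts and skew_ratio, the sample keys, and the use of zlib.crc32 as the hash are all assumptions made for this example.

```python
import zlib


def partition_counts(keys, num_partitions):
    """Count how many rows land in each partition under a simple hash partitioner."""
    counts = [0] * num_partitions
    for key in keys:
        # crc32 is used only because it is deterministic across runs;
        # it is an assumption for this sketch, not the hash DataStage uses.
        bucket = zlib.crc32(str(key).encode("utf-8")) % num_partitions
        counts[bucket] += 1
    return counts


def skew_ratio(counts):
    """Largest partition divided by the average partition size (1.0 = balanced)."""
    average = sum(counts) / len(counts)
    return max(counts) / average if average else 0.0


# A high-cardinality key (unique ids) versus a low-cardinality key (country code).
unique_ids = range(10_000)
country_codes = ["US"] * 9_000 + ["IN"] * 600 + ["FR"] * 400

balanced = partition_counts(unique_ids, 4)
skewed = partition_counts(country_codes, 4)
print("unique ids  :", balanced, round(skew_ratio(balanced), 2))
print("country code:", skewed, round(skew_ratio(skewed), 2))
```

A ratio close to 1.0 means the partitions are nearly equal. The country-code key sends at least 90% of the rows to a single partition, so one node would end up doing most of the work.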

Points we need to consider:

1. Stage the data coming from ODBC/OCI/DB2UDB stages or any database on the server using Hash/Sequential files for optimum performance.
2. Tune the OCI stage 'Array Size' and 'Rows per Transaction' numerical values for faster inserts, updates and selects.
3. Tune the 'Project Tunables' in Administrator for better performance.
4. Use sorted data for the Aggregator.
5. Sort the data as much as possible in the database and reduce the use of DS-Sort for better job performance.
6. Remove unused data and columns from the source as early as possible in the job.
7. Work with the DB admin to create appropriate indexes on tables for better performance of DS queries.
8. Convert some of the complex joins/business logic in DS to stored procedures in the database for faster execution of the jobs.
9. If an input file has an excessive number of rows and can be split up, use standard logic to run jobs in parallel.
10. Before writing a routine or a transform, make sure the required functionality is not already available in one of the standard routines supplied in the sdk or ds utilities categories. Constraints are generally CPU intensive and can take a significant amount of time to process. This is the case when the constraint calls routines or external macros; if it is inline code, the overhead is minimal.
11. Try to put the constraints in the 'Selection' criteria of the job itself. This eliminates unnecessary records before joins are made.
12. Tuning should occur on a job-by-job basis.
13. Use the power of the DBMS.
14. Try not to use a Sort stage when you can use an ORDER BY clause in the database.
15. Using a constraint to filter a record set is much slower than performing a SELECT ... WHERE.
16. Make every attempt to use the bulk loader for your particular database. Bulk loaders are generally faster than using ODBC or OLE.
17. Minimize the usage of Transformer (use Copy, Modify, Filter, or Row Generator instead).
18. Use SQL code while extracting the data.
19. Handle nulls properly using the Modify stage.
20. Minimize the warnings.

21. Reduce the number of lookups in a job design.
22. Try not to use more than 20 stages in a job if the expected data volume is high.
23. Use an IPC stage between two passive stages; this reduces processing time.
24. Drop indexes before data loading and recreate them after loading data into tables.
25. Check the write cache of the hash file. If the same hash file is used both for lookup and as a target, disable this option.
26. If the hash file is used only for lookup, enable Preload to memory. This will improve the performance. Also check the order of execution of the routines.
27. Don't use more than 7 lookups in the same Transformer; introduce new Transformers if it exceeds 7 lookups.
28. Use the Preload to memory option in the hash file output.
29. Use Write to cache in the hash file input.
30. Write into the error tables only after all the Transformer stages.
31. Reduce the width of the input record: remove the columns that you would not use.
32. Cache the hash files you are reading from and writing into. Make sure your cache is big enough to hold the hash files.
33. Use ANALYZE.FILE or HASH.HELP to determine the optimal settings for your hash files.
34. Ideally, use a configuration file with fewer nodes when the amount of data to be processed is small, and one with more nodes when the data volume is large.
35. Partitioning should be set so that the data flow is balanced, i.e. the partitions should be nearly equal and data skew should be minimized.
36. In DataStage jobs where a high volume of data is processed, virtual memory settings for the job should be optimized. Jobs often abort in cases where a single lookup has multiple reference links. This happens due to low temp memory space. In such jobs, $APT_BUFFER_MAXIMUM_MEMORY, $APT_MONITOR_SIZE and $APT_MONITOR_TIME should be set to sufficiently large values.
37. Sequential files should be used in the following conditions: when reading a flat file (fixed width or delimited) from a UNIX environment that was transferred by FTP from an external system, or when some UNIX operations have to be done on the file.
38. Don't use a sequential file for intermediate storage between jobs. It causes performance overhead, because data must be converted before writing to and reading from a UNIX file.
39. For faster reading from the Sequential File stage, the number of readers per node can be increased (the default value is one).
40. Using Data Sets gives good performance in a set of linked jobs. They help achieve end-to-end parallelism by writing data in partitioned form and maintaining the sort order.
41. The Lookup stage is faster when the data volume is small. If the reference data volume is large, the Lookup stage should be avoided, because all reference data is pulled into local memory.
42. The sparse lookup type should be chosen only if the primary input data volume is small.
43. Join should be used when the data volume is high. It is a good alternative to the Lookup stage and should be used when handling huge volumes of data.
44. Even though data can be sorted on a link, the Sort stage is used when the data to be sorted is huge. When data is sorted on a link (sort / unique option) and the data size goes beyond the fixed memory limit, I/O to disk takes place, which incurs an overhead. Therefore, if the volume of data is large, an explicit Sort stage should be used instead of a sort on the link. The Sort stage provides an option to increase the buffer memory used for sorting, which means lower I/O and better performance.
45. It is also advisable to reduce the number of Transformers in a job by combining the logic into a single Transformer rather than having multiple Transformers.
46. The presence of a Funnel stage reduces the performance of a job. It can increase the time taken by the job by about 30% (based on observations). When a Funnel stage is to be used in a large job, it is better to isolate it in a job of its own: write the output to Data Sets and funnel them in a new job.
47. The Funnel stage should be run in continuous mode, without hindrance.
48. A single job should not be overloaded with stages. Each extra stage in a job leaves fewer resources available for every other stage, which directly affects the job's performance. If possible, big jobs with a large number of stages should be logically split into smaller units.
49. Unnecessary column propagation should be avoided. As far as possible, RCP (Runtime Column Propagation) should be disabled in the jobs.
50. A frequently neglected option in the Sort stage is "don't sort if previously sorted"; set this option to true. This improves the Sort stage performance a great deal.
51. In the Transformer stage, Preserve Sort Order can be used to maintain the sort order of the data and reduce sorting in the job.
52. Reduce the number of stage variables used.
53. The Copy stage should be used instead of a Transformer for simple operations.
54. The upsert works well if the data is sorted on the primary key column of the table being loaded.
55. Don't read from a Sequential File using SAME partitioning.
56. Using the hash file stage can improve performance. For a hash file stage we can define the read cache size and write cache size; the default size is 128 MB.
57. Active-to-active links can also improve performance: enable the row buffer. The default row buffer size is 128 KB.
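
Points 13 to 15 and 18 (let the database filter and sort) can be sketched in plain Python. The example below uses an in-memory SQLite database as a stand-in for the source system; the orders table, its columns, and the sample rows are made up for illustration and do not come from any real job.

```python
import sqlite3

# In-memory database standing in for the source system (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, region TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders (order_id, region, amount) VALUES (?, ?, ?)",
    [(i, "EU" if i % 10 == 0 else "US", float(i)) for i in range(1, 1001)],
)

# Approach A: extract every row, then filter and sort downstream
# (comparable to a constraint plus a sort inside the job).
all_rows = conn.execute("SELECT order_id, region, amount FROM orders").fetchall()
filtered_in_job = sorted(
    (row for row in all_rows if row[1] == "EU"),
    key=lambda row: row[2],
    reverse=True,
)

# Approach B: push the filter and the sort down to the database,
# so only the rows that are needed ever leave it.
filtered_in_db = conn.execute(
    "SELECT order_id, region, amount FROM orders "
    "WHERE region = ? ORDER BY amount DESC",
    ("EU",),
).fetchall()

print("rows extracted without pushdown:", len(all_rows))
print("rows extracted with pushdown   :", len(filtered_in_db))
print("same result:", filtered_in_job == filtered_in_db)
```

Both approaches return the same rows, but the second one moves only the matching records out of the database, which is the idea behind using SELECT ... WHERE and ORDER BY instead of constraints and Sort stages.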

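The trade-off in points 41 to 43 (Lookup for small reference data, Join for large volumes) can also be shown with a small Python sketch. This is only an analogy, not how DataStage implements the stages: lookup_join and merge_join are hypothetical helper names, and the merge version assumes both inputs are already sorted on the key and that reference keys are unique.

```python
def lookup_join(primary_rows, reference_rows):
    """Lookup style: load the whole reference set into memory, then probe it."""
    reference = dict(reference_rows)  # memory use grows with the reference volume
    return [
        (key, value, reference[key])
        for key, value in primary_rows
        if key in reference
    ]


def merge_join(primary_sorted, reference_sorted):
    """Join style: walk two key-sorted inputs, holding one reference row at a time.

    Assumes unique keys in the reference input (illustration only).
    """
    result = []
    ref_iter = iter(reference_sorted)
    ref = next(ref_iter, None)
    for key, value in primary_sorted:
        # Skip reference rows whose key is smaller than the current primary key.
        while ref is not None and ref[0] < key:
            ref = next(ref_iter, None)
        if ref is not None and ref[0] == key:
            result.append((key, value, ref[1]))
    return result


primary = [(3, "c"), (1, "a"), (2, "b"), (5, "e")]
reference = [(2, "two"), (1, "one"), (4, "four"), (3, "three")]

via_lookup = sorted(lookup_join(primary, reference))
via_join = merge_join(sorted(primary), sorted(reference))
print(via_lookup == via_join, via_join)
```

The lookup version needs memory proportional to the reference data, while the merge version needs the inputs sorted but keeps almost nothing in memory, which is why the choice depends on data volume.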

http://datastageinfoguide.blogspot.in/2012/11/datastage-performance-tuning-tips.html
