Kbase P117178: What performance gain using idxbuild multi-threads
Author: Progress Software Corporation - Progress
Access: Public
Published: 2/15/2010
Status: Unverified
GOAL:
What performance gain can be expected from a multi-threaded idxbuild?
GOAL:
What are the advantages of using idxbuild with the "-threadnum" option?
GOAL:
What are the index rebuild (idxbuild) enhancements that were introduced in Progress 9.1D07?
GOAL:
What are the index rebuild (idxbuild) enhancements that were introduced in OpenEdge 10?
GOAL:
What improvements have been made to the idxbuild utility in Progress 9.1D07 and OpenEdge 10?
GOAL:
How can I make an index rebuild run faster?
GOAL:
How to improve the performance of idxbuild?
GOAL:
How are sort groups used in idxbuild?
GOAL:
How are threads used in idxbuild?
FACT(s) (Environment):
All Supported Operating Systems
OpenEdge 10.x
Progress 9.1D
Progress 9.1E
FIX:
The performance of a threaded index rebuild is affected by the following factors: the database layout, the number of indexes to be rebuilt, the I/O devices defined in the .srt file, the CPU, and the OS.
Index rebuild is intensive in both CPU and I/O. The utility runs in several phases and operations, some of which are I/O bound and some CPU bound, so the performance improvement varies from one database to another. The threading feature allows the main process to assign work to multiple threads during the external merging of multiple index groups, known as phase 2. Without threading, index groups are merged in series; with threading, they are merged concurrently by different threads. After the main process creates the other threads, it also participates in the work: it builds the index trees (phase 3) immediately for groups that have finished external merging. In this way, phase 3 can be performed in parallel with phase 2.
For a machine with 12 CPUs, -threadnum 12 means that at most 12 threads will start and work in parallel in phase 2. If the number of index groups for the area is 12 or more, all 12 threads will be busy, which takes the greatest advantage of the CPUs. If the number of index groups is less than 12, say 5, then 5 threads will start and work in parallel. If there is only one index group, the rebuild cannot take advantage of multiple threads, because there is no other group to work on concurrently.
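The rule described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not idxbuild's actual code; the function name active_merge_threads is hypothetical.

```python
def active_merge_threads(threadnum: int, index_groups: int) -> int:
    """Estimate how many threads can work in parallel in phase 2 for one area.

    Assumption (from the description above): each thread works on one
    index group at a time, so the number of busy threads is capped by both
    -threadnum and the number of index groups in the area.
    """
    if threadnum < 1 or index_groups < 1:
        raise ValueError("threadnum and index_groups must both be >= 1")
    return min(threadnum, index_groups)


if __name__ == "__main__":
    # -threadnum 12 on a 12-CPU machine, with different group counts.
    for groups in (20, 12, 5, 1):
        busy = active_merge_threads(12, groups)
        print(f"{groups:2d} index groups -> {busy:2d} thread(s) busy")
```

With a single index group the result is 1, which matches the point that one group leaves nothing to run concurrently.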
External merging is I/O intensive: temp files are read and written randomly and heavily. If multiple temp files belonging to different index groups reside on the same I/O device, the threads compete for that device and it becomes the bottleneck of the process. Imagine an 8-lane highway with a one-lane toll station; such a layout gains little from threads. To solve the problem, the feature spreads the temp files of different index groups across different directories as much as possible. The new algorithm assigns the directories defined in the .srt file to index groups in order, instead of using the next directory only when the prior directories are completely consumed. It is recommended that the user define multiple I/O devices in the .srt file.
Result of a test with 8 CPUs, a 4.2GB database, 733 tables, 1333 indexes, and 5 I/O devices of 2GB each: index rebuild with threading (8 threads) showed a 20% improvement compared to the non-threaded model. For a larger database with a similar layout, the improvement tends to be bigger.
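To make the difference between the two directory strategies concrete, here is a small Python sketch. It assumes that "by order" means a simple round-robin over the directories listed in the .srt file, which is one plausible reading and not a confirmed description of the internal algorithm. The function names and directory paths are hypothetical.

```python
from collections import Counter


def assign_in_order(num_groups: int, directories: list[str]) -> dict[int, str]:
    """Assumed 'by order' strategy: group i starts in directory i mod N."""
    if not directories:
        raise ValueError("at least one directory is required")
    return {g: directories[g % len(directories)] for g in range(num_groups)}


def assign_fill_first(num_groups: int, directories: list[str]) -> dict[int, str]:
    """Contrast case: every group starts in the first directory and only
    moves on once that directory is full (spill-over is not modelled here)."""
    if not directories:
        raise ValueError("at least one directory is required")
    return {g: directories[0] for g in range(num_groups)}


if __name__ == "__main__":
    dirs = ["/disk1/", "/disk2/", "/disk3/", "/disk4/", "/disk5/"]  # hypothetical
    print("in order  :", Counter(assign_in_order(8, dirs).values()))
    print("fill first:", Counter(assign_fill_first(8, dirs).values()))
```

Under this assumption, 8 groups over 5 directories leave at most 2 groups per directory, whereas the fill-first strategy starts all 8 groups on the same device.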
FIX:
Starting with Progress 9.1D07, the idxbuild index rebuild utility contained new performance enhancements that include:
Changing the way CI (cursor index) structures are built and located.
Changing the format of a key entry inserted into a sort file to use 3 fewer bytes per entry.
Simplifying and reducing the time to build a key entry.
Removing unnecessary format conversions.
Modifying the algorithm used to scan a database so that free and index blocks are not recycled when a split scan is required.
Improving the algorithm used to parse the .srt file used for sorting groups (-SG).
A new algorithm for the external sort and merge phases that will:
Reduce sort and merge I/O between different indexes by creating more index groups
Reduce the cost per block I/O operation by having each index group have its own set of input/output buffers and temporary disk files.
Reduce the number of read/write operations for each block in a temporary file by adding more input buffers.
The -SG parameter was introduced to define the number of index sort groups. Prior to Progress 9.1D07, the number of index groups was hard coded at 8. Currently the minimum value allowed for -SG is 8 and the maximum is 64. If -SG is not specified, a default value of 48 is used since internal testing found that this value gave the best results in most cases.
The larger the -SG value, the fewer merge/sort operations are needed between different indexes. However, this is only true if the database has many indexes residing in one or a few areas. If the indexes are distributed across many areas, a lower -SG value should be used to avoid unnecessary memory allocation and temporary file creation. Larger values of -SG require more memory allocation and file handles. The memory required for each index group can be calculated using the formula: (-TB) * ((-TM) + 1). The value obtained represents the amount of memory in kilobytes (KB) needed for each index group.
Starting in 9.1D07, there is large file support for temporary files, so a .srt file does not need to define many directories to avoid a 2GB limitation. Defining fewer directories reduces the number of file handles needed. However, take special note that it is still important to define several directories for temporary index sort files in the .srt file. Idxbuild threads (introduced in OpenEdge 10) compete for I/O, and locating the temporary files in different directories helps minimize I/O bottlenecks. The sort and merge phases of idxbuild can experience higher competition for I/O than other stages, and the newer, faster sort can increase this competition. Performance improvements were made to reduce the I/O cost of the external sort and merge, but every effort should be made to spread out the I/O to minimize disk contention as much as possible.
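The per-group memory formula is easy to check with a short Python calculation. The values -TB 31 and -TM 32 are purely illustrative, and multiplying by -SG to get a total is an assumption (the text only gives the per-group figure).

```python
def sort_memory_kb_per_group(tb: int, tm: int) -> int:
    """Memory in KB for one index group: (-TB) * ((-TM) + 1)."""
    if tb < 1 or tm < 1:
        raise ValueError("-TB and -TM must be positive")
    return tb * (tm + 1)


def sort_memory_kb_total(tb: int, tm: int, sg: int) -> int:
    """Assumption: total memory scales linearly with the number of groups."""
    return sort_memory_kb_per_group(tb, tm) * sg


if __name__ == "__main__":
    tb, tm = 31, 32  # illustrative values only
    per_group = sort_memory_kb_per_group(tb, tm)
    print(f"per group: {per_group} KB")  # 31 * 33 = 1023 KB
    for sg in (8, 48, 64):
        total = sort_memory_kb_total(tb, tm, sg)
        print(f"-SG {sg:2d}: about {total} KB ({total / 1024:.1f} MB)")
```

Under that assumption, moving from -SG 8 to -SG 64 multiplies the sort memory by eight, which is why a lower value is suggested when the indexes are spread over many areas.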
Another change to note is the way in which these temporary sort files are created. Before 9.1D07, if 30 directories were specified in the .srt file, 30 files were always opened, even if not all of them were used.
From 9.1D07 onward, the directories are used in the order in which they are specified. If a value of -SG 48 is used, then 48 files will be opened in the first directory specified in the .srt file. When the size limit (as defined in the .srt file) is reached, a new file is opened in the next directory, and so on.
As an example, say that rebuilding a certain index requires 10GB of temporary file space and -SG 48 is used to define the number of sort index groups. Given a .srt file that defines 30 directories with a size limit of .5GB each, we could potentially open 960 files, though not necessarily all of them at the same time:
10GB / .5GB = 20
48 * 20 = 960
As long as the operating system does not complain about the number of file handles in use at any one time, this is not a problem.
If however, the first (or only) directory defined in the .srt file was sized at 10GB (or more), then only 48 files would be opened.
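The file-count arithmetic in this example can be reproduced with a small Python helper. It is a rough upper-bound estimate that follows the calculation in the text; the function name max_temp_files is hypothetical.

```python
import math


def max_temp_files(temp_space_gb: float, dir_limit_gb: float, sort_groups: int) -> int:
    """Upper bound on temp files opened over the whole run.

    Follows the text's calculation: the number of directories needed to
    hold the temp space, multiplied by the number of sort groups (-SG).
    """
    if temp_space_gb <= 0 or dir_limit_gb <= 0 or sort_groups < 1:
        raise ValueError("all arguments must be positive")
    directories_needed = math.ceil(temp_space_gb / dir_limit_gb)
    return sort_groups * directories_needed


if __name__ == "__main__":
    print(max_temp_files(10, 0.5, 48))  # 48 * 20 = 960
    print(max_temp_files(10, 10, 48))   # 48 * 1  = 48
```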
The disadvantage of using more than one set of temporary files is that the idxbuild process has to switch between different files during the I/O operations of the first phase of the idxbuild. While the impact of this is relatively small, it is something to be aware of.
In general, if there are no constraints on memory, disk space and the number of active file handles, the value used for -SG should be based on the average number of indexes in one area and the size of those indexes. The following are test case recommendations:
For a database of 4.5GB in size, with 733 tables and 1333 indexes defined in one area, use -SG 64
For a database of 2GB in size, with 20 areas and 315 indexes, where most of the index data (in size) is contained in 17 indexes residing in one area, use -SG of 8 - 12.
For a database of 10GB in size, with 3 tables in one area, where most of the data (in size) is contained in one table and one index, use -SG 8.
Starting with OpenEdge 10, the idxbuild utility added the option to perform a threaded index rebuild for Enterprise database licenses. This allows the main idxbuild process to assign work to multiple threads during the external merging of multiple index sort groups. This phase of the idxbuild, commonly referred to as phase 2, merges the index groups in series. Adding threading allows multiple groups to be merged concurrently. After creating the threads, the main process also participates in the work by building index trees for groups that have finished the external merging. This phase is commonly referred to as phase 3 and, with threading, can be performed in parallel with phase 2.
The external merging is very I/O intensive, as temporary files are accessed for reads and writes. If multiple temporary files of different index groups reside on the same I/O device, threads may encounter high contention for that device. To help alleviate this, directories defined in the .srt file are assigned to index groups in order, instead of using the next directory defined only when the prior directories are completely filled. Therefore, the general recommendation is to define multiple I/O devices in the .srt file.
By default, idxbuild will use threading (-thread 1) and will set the number of threads equal to the number of CPUs available on the system. However, if an area has only one index to rebuild, or only one index is specified for rebuild, idxbuild will run without threads.
Also note that the location of the temporary files is chosen by an internal priority order. If the -SS option is used, the location it specifies is used to find the .srt file. If -SS is not used, the current directory is searched for a .srt file and, if one is found, that file is used. If -T was specified, that directory is used for the temporary files, but only if -SS was not used and there is no .srt file in the current working directory. Finally, if none of the above applies, the current working directory is used as the location of the temporary sort files.
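The priority order described above can be written as a small decision function. This Python sketch only models the order of the checks; the function name, the argument names and the returned labels are hypothetical and do not correspond to anything in the idxbuild utility itself.

```python
from typing import Optional, Tuple


def resolve_sort_location(
    ss_path: Optional[str],     # value given with -SS, if any
    cwd_srt: Optional[str],     # .srt file found in the current directory, if any
    t_dir: Optional[str],       # value given with -T, if any
    cwd: str,                   # current working directory
) -> Tuple[str, str]:
    """Return (kind, path), where kind is 'srt-file' or 'directory'."""
    if ss_path is not None:
        return ("srt-file", ss_path)   # 1. -SS always wins
    if cwd_srt is not None:
        return ("srt-file", cwd_srt)   # 2. .srt file in the current directory
    if t_dir is not None:
        return ("directory", t_dir)    # 3. -T, only if 1 and 2 do not apply
    return ("directory", cwd)          # 4. fall back to the current directory


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    print(resolve_sort_location("/cfg/big.srt", "/work/db.srt", "/tmp", "/work"))
    print(resolve_sort_location(None, "/work/db.srt", "/tmp", "/work"))
    print(resolve_sort_location(None, None, "/tmp", "/work"))
    print(resolve_sort_location(None, None, None, "/work"))
```

One practical consequence of this order is that a stray .srt file in the current directory overrides -T.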