Friday, July 11, 2014

PL/SQL: Stop Making the Same Performance Mistakes

PL/SQL is great, but like any programming language it can be misused. This article highlights the common performance mistakes made when developing in PL/SQL, which turn what should be an elegant solution into a resource hog. This is very much an overview, but each section includes links to the relevant articles on this site that discuss the topic in greater depth, including example code. Think of it as a checklist of things to avoid that will help you get the best performance from your PL/SQL.

Stop using PL/SQL when you could use SQL

The first sentence in the first chapter of the PL/SQL documentation states the following.
"PL/SQL, the Oracle procedural extension of SQL, is a portable, high-performance transaction-processing language."
So PL/SQL is an extension to SQL, not a replacement for it. In the majority of cases, a pure SQL solution will perform better than one made up of a combination of SQL and PL/SQL. Remember, databases are designed to work with sets of data. As soon as you start to process data in a row-by-row (or slow-by-slow) manner, you are stopping the database from doing what it does best. With that in mind, a PL/SQL programmer should aim to be an expert in SQL who knows a bit of PL/SQL, rather than an expert in PL/SQL who knows a little bit of SQL.
SQL has evolved greatly over the last 20 years. The introduction of features like analytic functions and SQL/XML means you can perform very complex tasks directly from SQL. The following points describe some of the common situations where people use PL/SQL when SQL would be more appropriate.
  • Stop using UTL_FILE to read text files if you can use external tables. Using the UTL_FILE package to read data from flat files is very inefficient. Since Oracle 7 people have been using SQL*Loader to improve performance, but since Oracle 9i the recommended way to read data from flat files is to use external tables. Not only is it more efficient by default, but it is easy to read the data in parallel, and preprocessor commands allow tasks like unzipping files on the fly before reading them. In many cases, your PL/SQL load process can be replaced by a single INSERT ... SELECT statement with the data sourced from an external table.
  • Stop writing PL/SQL merges if you can use the MERGE statement. Merging, or upserting, large amounts of data using PL/SQL is a terrible waste of resources. Instead you should use the MERGE statement to perform the action in a single DML statement. Not only is it quicker, but it looks simpler and is easily made to run in parallel.
  • Stop coding multitable inserts manually. Why send multiple DML statements to the server when an action can be performed in a single multitable insert? Since Oracle 9i multitable inserts have provided a flexible way of reducing round-trips to the server.
  • Stop using bulk binds (FORALL) when you can use DML error logging (DBMS_ERRLOG) to trap failures in DML. By default, if a single row in a DML statement raises an exception, all the work done by that DML statement is rolled back. In the past this meant operations that were logically a single INSERT ... SELECT, UPDATE or DELETE statement affecting multiple rows had to be coded as a PL/SQL bulk operation using the FORALL ... SAVE EXCEPTIONS construct, for fear that a single exception would trash the whole process. Oracle 10g Release 2 introduced DML error logging, allowing us to go back to a single DML statement in place of the unnecessary bulk bind operation.
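As a rough illustration of the last point, the sketch below loads one table from another in a single statement and diverts failing rows to an error log table. The tables SOURCE_TAB and DEST_TAB, their columns, and the tag 'nightly_load' are hypothetical names chosen for this example.

```sql
-- Assumed for illustration: SOURCE_TAB and DEST_TAB both have the
-- columns ID, CODE and DESCRIPTION.

-- One-off setup: create the error logging table for DEST_TAB.
-- With no name supplied it defaults to ERR$_DEST_TAB.
BEGIN
  DBMS_ERRLOG.create_error_log(dml_table_name => 'DEST_TAB');
END;
/

-- A single set-based insert. Rows that fail are written to
-- ERR$_DEST_TAB rather than causing the whole statement to roll back.
INSERT INTO dest_tab (id, code, description)
SELECT id, code, description
FROM   source_tab
LOG ERRORS INTO err$_dest_tab ('nightly_load') REJECT LIMIT UNLIMITED;

-- Review the rows rejected by this run, identified by the tag.
SELECT ora_err_number$, ora_err_mesg$, id
FROM   err$_dest_tab
WHERE  ora_err_tag$ = 'nightly_load';
```

The tag is optional, but it makes it easier to separate the failures of one run from those of another when the same error log table is reused.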
The thing to remember about all these points is that they replace PL/SQL with DML. As well as being more efficient, these statements are very easy to make even faster on large operations by running them in parallel, provided the server has enough resources to cope. Making PL/SQL run in parallel is considerably more difficult (see parallel-enabled pipelined table functions and DBMS_PARALLEL_EXECUTE).
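To make the MERGE and multitable insert points concrete, here is a minimal sketch of each. All table and column names (SOURCE_TAB, DEST_TAB, LARGE_ORDERS, SMALL_ORDERS, ID, STATUS, AMOUNT) are assumptions made up for the example, not objects from any real schema.

```sql
-- Upsert in one statement: update the rows that already exist in
-- DEST_TAB and insert the ones that do not.
MERGE INTO dest_tab d
USING source_tab s
ON (d.id = s.id)
WHEN MATCHED THEN
  UPDATE SET d.status = s.status,
             d.amount = s.amount
WHEN NOT MATCHED THEN
  INSERT (id, status, amount)
  VALUES (s.id, s.status, s.amount);

-- Multitable insert: route each source row to a target table based on
-- a condition, reading SOURCE_TAB only once.
INSERT ALL
  WHEN amount >= 1000 THEN
    INTO large_orders (id, amount) VALUES (id, amount)
  WHEN amount < 1000 THEN
    INTO small_orders (id, amount) VALUES (id, amount)
SELECT id, amount
FROM   source_tab;
```

Each statement replaces what would otherwise be a cursor loop containing conditional INSERT and UPDATE calls.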

Stop avoiding bulk binds

Having just told you to avoid bulk binds in favor of single DML statements, I'm now going to tell you to stop avoiding bulk binds where they are appropriate. If you are in a situation where a single DML statement is not possible and you need to process many rows individually, you should use bulk binds as they can often provide an order of magnitude performance improvement over conventional row-by-row processing in PL/SQL.
Bulk binds have been available since Oracle 8i, but it was the inclusion of record processing in bulk bind operations in Oracle 9i Release 2 that made them significantly easier to work with.
The BULK COLLECT clause allows you to pull multiple rows back into a collection. The FORALL construct allows you to bind all the data in a collection into a DML statement. In both cases, the performance improvements are achieved by reducing the number of context switches between PL/SQL and SQL that are associated with row-by-row processing.
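The following is a minimal sketch of the two features working together, assuming hypothetical tables SOURCE_TAB and DEST_TAB with identical structure, including a DESCRIPTION column. The UPPER call is only a stand-in for row-level logic that genuinely cannot be expressed in SQL.

```sql
DECLARE
  CURSOR c_src IS
    SELECT * FROM source_tab;

  TYPE t_rows IS TABLE OF source_tab%ROWTYPE;
  l_rows t_rows;
BEGIN
  OPEN c_src;
  LOOP
    -- Fetch up to 1000 rows per call, which caps the memory used by
    -- the collection.
    FETCH c_src BULK COLLECT INTO l_rows LIMIT 1000;
    EXIT WHEN l_rows.COUNT = 0;

    -- Placeholder for processing that needs PL/SQL.
    FOR i IN 1 .. l_rows.COUNT LOOP
      l_rows(i).description := UPPER(l_rows(i).description);
    END LOOP;

    -- Bind the whole batch into a single INSERT execution.
    FORALL i IN 1 .. l_rows.COUNT
      INSERT INTO dest_tab VALUES l_rows(i);
  END LOOP;
  CLOSE c_src;

  COMMIT;
END;
/
```

The LIMIT value of 1000 is an arbitrary choice for the sketch. The exit test uses the collection count rather than c_src%NOTFOUND so that a final, partially filled batch is still processed.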

Stop using pass-by-value (NOCOPY)

As the Oracle database and PL/SQL have matured it has become increasingly common to work with large objects (LOBs), collections and complex object types, such as XMLTYPE. When these large and complicated types are passed as OUT and IN OUT parameters to procedures and functions, the default pass-by-value processing of these parameters can represent a significant performance overhead.
The NOCOPY hint allows you to switch from the default pass-by-value to pass-by-reference, eliminating this overhead. In many cases, this can represent a significant performance improvement with virtually no effort.
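A simple way to see the difference is to time two otherwise identical procedures, one with and one without the hint. The sketch below does this with a large collection. The type, procedure and variable names are invented for the example, and the timings will vary with the environment and compiler settings.

```sql
DECLARE
  TYPE t_tab IS TABLE OF VARCHAR2(100);
  l_tab   t_tab := t_tab();
  l_start NUMBER;

  -- Default behaviour: the IN OUT parameter is passed by value.
  PROCEDURE by_value (p_tab IN OUT t_tab) IS
  BEGIN
    NULL;
  END;

  -- NOCOPY asks the compiler to pass by reference. It is a hint, so
  -- the compiler is free to ignore it.
  PROCEDURE by_reference (p_tab IN OUT NOCOPY t_tab) IS
  BEGIN
    NULL;
  END;
BEGIN
  -- Build a collection large enough for the copy to be measurable.
  l_tab.EXTEND(1000000);
  FOR i IN 1 .. l_tab.COUNT LOOP
    l_tab(i) := 'Row ' || TO_CHAR(i);
  END LOOP;

  l_start := DBMS_UTILITY.get_time;
  by_value(l_tab);
  DBMS_OUTPUT.put_line('Default: ' || (DBMS_UTILITY.get_time - l_start) || ' hsecs');

  l_start := DBMS_UTILITY.get_time;
  by_reference(l_tab);
  DBMS_OUTPUT.put_line('NOCOPY : ' || (DBMS_UTILITY.get_time - l_start) || ' hsecs');
END;
/
```

One trade-off to keep in mind: with pass-by-reference, changes made to the parameter before an unhandled exception are not undone, so the caller may see a partially modified value.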

Stop using the wrong data types

When you use the wrong data types, Oracle is forced to do an implicit conversion during assignments and comparisons, which represents an unnecessary overhead. In some cases this can lead to unexpected and dramatic issues, like preventing the optimizer from using an index or resulting in incorrect date conversions.
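Both problems can be illustrated with a hypothetical ORDERS table that has an indexed VARCHAR2 column ORDER_REF and a DATE column ORDER_DATE. These names are assumptions for the example only.

```sql
-- Implicit conversion: comparing a character column to a number makes
-- Oracle convert the column, as if the predicate were
-- TO_NUMBER(order_ref) = 12345, so a regular index on ORDER_REF cannot
-- be used for this predicate.
SELECT * FROM orders WHERE order_ref = 12345;

-- Matching the column's data type leaves the index usable.
SELECT * FROM orders WHERE order_ref = '12345';

-- Implicit date conversion: the string is interpreted using the
-- session's NLS_DATE_FORMAT, so the same statement can fail or return
-- different rows in different sessions.
SELECT * FROM orders WHERE order_date = '11/07/2014';

-- Explicit conversion, or an ANSI date literal, removes the ambiguity.
SELECT * FROM orders WHERE order_date = TO_DATE('11/07/2014', 'DD/MM/YYYY');
SELECT * FROM orders WHERE order_date = DATE '2014-07-11';
```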
Oracle provides a variety of data types, many of which have dramatically different performance characteristics. Nowhere is this more evident than with the performance of numeric data types.
Make sure you pick the appropriate data type for the job you are doing!
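As a rough way to compare numeric types yourself, the sketch below times the same integer arithmetic using NUMBER and PLS_INTEGER. The variable names and loop count are arbitrary choices for the example, and the results depend on the database version and the PL/SQL optimization level.

```sql
DECLARE
  l_loops  CONSTANT PLS_INTEGER := 10000000;
  l_number NUMBER      := 0;
  l_pls    PLS_INTEGER := 0;
  l_start  NUMBER;
BEGIN
  -- NUMBER arithmetic.
  l_start := DBMS_UTILITY.get_time;
  FOR i IN 1 .. l_loops LOOP
    l_number := l_number + 1;
  END LOOP;
  DBMS_OUTPUT.put_line('NUMBER     : ' || (DBMS_UTILITY.get_time - l_start) || ' hsecs');

  -- PLS_INTEGER arithmetic on the same workload.
  l_start := DBMS_UTILITY.get_time;
  FOR i IN 1 .. l_loops LOOP
    l_pls := l_pls + 1;
  END LOOP;
  DBMS_OUTPUT.put_line('PLS_INTEGER: ' || (DBMS_UTILITY.get_time - l_start) || ' hsecs');
END;
/
```

Run it with server output enabled to see the two timings side by side. The same pattern can be extended to other types, such as SIMPLE_INTEGER or BINARY_DOUBLE, if they suit the values you are working with.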

Quick Points

