Monday 2 September 2013

Optimizing SQL Query

  • Optimizing the WHERE clause - there are many cases where index access path of a column of the WHERE clause is not used even if the index on that column has already been created. Avoid such cases to make best use of the indexes, which will ultimately improve the performance. Some of these cases are: COLUMN_NAME IS NOT NULL (ROWID for a null is not stored by an index), COLUMN_NAME NOT IN (value1, value2, value3, ...), COLUMN_NAME != expression, COLUMN_NAME LIKE'%pattern' (whereas COLUMN_NAME LIKE 'pattern%' uses the index access path), etc. Usage of expressions or functions on indexed columns will prevent the index access path to be used. So, use them wisely!
  • Using WHERE instead of HAVING - usage of WHERE clause may take advantage of the index defined on the column(s) used in the WHERE clause.
  • Using the leading index columns in WHERE clause - the WHERE clause may use the complex index access path in case we specify the leading index column(s) of a complex index otherwise the WHERE clause won't use the indexed access path.
  • Indexed Scan vs Full Table Scan - Indexed scan is faster only if we are selcting only a few rows of a table otherwise full table scan should be preferred. It's estimated that an indexed scan is slower than a full table scan if the SQL statement is selecting more than 15% of the rows of the table. So, in all such cases use the SQL hints to force full table scan and suppress the use of pre-defined indexes. Okay... any guesses why full table scan is faster when a large percentage of rows are accessed? Because an indexed scan causes multiple reads per row accessed whereas a full table scan can read all rows contained in a block in a single logical read operation.
  • Using ORDER BY for an indexed scan - the optimizer uses the indexed scan if the column specified in the ORDER BY clause has an index defined on it. It'll use indexed scan even if the WHERE doesn't contain that column (or even if the WHERE clause itself is missing). So, analyze if you really want an indexed scan or a full table scan and if the latter is preferred in a particular scenario then use 'FULL' SQL hint to force the full table scan.
  • Minimizing table passes - it normally results in a better performance for obvious reasons.
  • Joining tables in the proper order - the order in which tables are joined normally affects the number of rows processed by that JOIN operation and hence proper ordering of tables in a JOIN operation may result in the processing of fewer rows, which will in turn improve the performance. The key to decide the proper order is to have the most restrictive filtering condition in the early phases of a multiple table JOIN. For example, in case we are using a master table and a details table then it's better to connect to the master table first to connecting to the details table first may result in more number of rows getting joined.
  • Simple is usually faster - yeah... instead of writing a very complex SQL statement, if we break it into multiple simple SQL statements then the chances are quite high that the performance will improve. Make use of the EXPLAIN PLAN and TKPROF tools to analyze both the conditions and stick to the complex SQL only if you're very sure about its performance.
  • Using ROWID and ROWNUM wherever possible - these special columns can be used to improve the performance of many SQL queries. The ROWID search is the fastest for Oracle database and this luxury must be enjoyed wherever possible. ROWNUM comes really handy in the cases where we want to limit the number of rows returned.
  • Usage of explicit cursors is better - explicit cursors perform better as the implicit cursors result in an extra fetch operation. Implicit cursosrs are opened the Oracle Server for INSERT, UPDATE, DELETE, and SELECT statements whereas the explicit cursors are opened by the writers of the query by explicitly using DECLARE, OPEN, FETCH, and CLOSE statements.
  • Reducing network traffic - Arrays and PL/SQL blocks can be used effectively to reduce the network traffic especially in the scenarios where a huge amount of data requires processing. For example, a single INSERT statement can insert thousands of rows if arrays are used. This will obviously result into fewer DB passes and it'll in turn improve performance by reducing the network traffic. Similarly, if we can club multiple SQL statements in a single PL/SQL block then the entire block can be sent to Oracle Server involving a single network communication only, which will eventually improve performance by reducing the network traffic.
  • Using Oracle parallel query option - Since Oracle 8, even the queries based on indexed range scans can use this parallel query option if the index is partitioned. This feature can result in an improved performance in certain scenarios.

No comments:

Post a Comment