emperor: (Default)
Add MemoryShare This Entry
posted by [personal profile] emperor at 01:48pm on 07/02/2005
I've been running some long queries on this 'ere Oracle database. Enterprise mangler provides a simple interface to the Execute plan, which I've been using since actually using EXPLAIN PLAIN looked complex. One of the columns is "Time (seconds)". A query I ran this morning said ~21,000 seconds (6 hours and a bit) but took only about 400 seconds (about 6 minutes). WTF?
Mood:: 'confused' confused
There are 13 comments on this entry. (Reply.)
chess: (Default)
posted by [personal profile] chess at 06:16am on 07/02/2005
Was it perhaps running on more than one processor or something, and hence took 21,000 seconds of some kind of 'computer time'?
emperor: (Default)
posted by [personal profile] emperor at 06:29am on 07/02/2005
I don't have a 60-processor machine...
 
posted by [identity profile] mobbsy.livejournal.com at 06:16am on 07/02/2005
First obvious question is, are stats on table and indexes up to date?
The "Time" bit is new in 10g, and can only be a best guess. It's a bit worrying that it's so bogus though.

Could you run the following at an SQL*Plus prompt and post the results:
"EXPLAIN PLAN FOR query"
"@?/rdbms/admin/utlxpls"
emperor: (Default)
posted by [personal profile] emperor at 06:26am on 07/02/2005

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 3571711743

---------------------------------------------------------------------------------------------------------
| Id  | Operation                        | Name         | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                 |              |   113K|   883K|       |  1008K (19)| 03:21:42 |
|   1 |  SORT GROUP BY                   |              |   113K|   883K|   102M|  1008K (19)| 03:21:42 |
|   2 |   VIEW                           |              |  6580K|    50M|       |  1075K (24)| 03:35:03 |
|   3 |    SORT UNIQUE                   |              |  6580K|   652M|   734M|  1075K (17)| 03:35:03 |
|   4 |     CONCATENATION                |              |       |       |       |            |          |
|*  5 |      HASH JOIN                   |              |  3290K|   326M|    22M|   505K (18)| 01:41:12 |

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------------
|   6 |       TABLE ACCESS FULL          | LOCATION     |   749K|    14M|       |  2502   (7)| 00:00:31 |
|*  7 |       HASH JOIN                  |              |  3290K|   263M|   414M|   486K (18)| 01:37:19 |
|*  8 |        HASH JOIN                 |              |  6581K|   338M|       |   185K (23)| 00:37:11 |
|*  9 |         TABLE ACCESS FULL        | LOCATION     |  1164 | 27936 |       |  2859  (19)| 00:00:35 |
|* 10 |         TABLE ACCESS FULL        | MOVEMENT     |    38M|  1112M|       |   179K (21)| 00:35:53 |
|* 11 |        TABLE ACCESS FULL         | MOVEMENT     |    38M|  1112M|       |   179K (21)| 00:35:53 |
|  12 |      NESTED LOOPS                |              |   473 | 49192 |       |   487K (18)| 01:37:30 |
|* 13 |       HASH JOIN                  |              |   473 | 39732 |   414M|   486K (18)| 01:37:19 |
|* 14 |        HASH JOIN                 |              |  6581K|   338M|       |   185K (23)| 00:37:11 |
|* 15 |         TABLE ACCESS FULL        | LOCATION     |  1164 | 27936 |       |  2859  (19)| 00:00:35 |
|* 16 |         TABLE ACCESS FULL        | MOVEMENT     |    38M|  1112M|       |   179K (21)| 00:35:53 |

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------------
|* 17 |        TABLE ACCESS FULL         | MOVEMENT     |    38M|  1112M|       |   179K (21)| 00:35:53 |
|  18 |       TABLE ACCESS BY INDEX ROWID| LOCATION     |     1 |    20 |       |     2   (0)| 00:00:01 |
|* 19 |        INDEX UNIQUE SCAN         | LOCATION_KEY |     1 |       |       |     1   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   5 - access("A"."LKEY"="LOCATION"."KEY")
   7 - access("A"."AKEY"="B"."AKEY" AND "A"."MDAT"="B"."MDAT")
   8 - access("KEY"="B"."LKEY")

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------------
   9 - filter("PTYPE"='AH' AND  REGEXP_LIKE ("POSTC",'HS[1-2]','i') AND "DDATE" IS NULL)
  10 - filter("B"."MDIR"=2 AND "B"."CSTAT"='L')
  11 - filter("A"."MDIR"=1 AND "A"."CSTAT"='L')
  13 - access("A"."AKEY"="B"."AKEY" AND "A"."MDAT"="B"."MDAT")
       filter(LNNVL("KEY"="B"."LKEY"))
  14 - access("KEY"="A"."LKEY")
  15 - filter("PTYPE"='AH' AND  REGEXP_LIKE ("POSTC",'HS[1-2]','i') AND "DDATE" IS NULL)
  16 - filter("A"."MDIR"=1 AND "A"."CSTAT"='L')
  17 - filter("B"."MDIR"=2 AND "B"."CSTAT"='L')
  19 - access("A"."LKEY"="LOCATION"."KEY")

43 rows selected.

 
posted by [identity profile] mobbsy.livejournal.com at 07:38am on 07/02/2005
Well... that's an ugly plan.

Are the row counts vaguely accurate, or are they way out too? If they're also way out then it might be one of the annoying cases where EXPLAIN PLAN isn't giving you the execution plan that's being used. Beyond that, I'm afraid I don't really know.

No chance of using some indexes? If you're doing a lot of the same sort of thing, think about function based indexes "CREATE INDEX foo ON movement (regexp_substr(postc,'HS[1-2]'));" (or whatever the exact function is). Saves having to compute the function each time for every row on the table.

emperor: (Default)
posted by [personal profile] emperor at 06:28am on 07/02/2005
done, as a large comment of pain.
the @?.... rune was entirely opaque to me, however. Presumably this is some way of outputting the contents of a table where EXPLAIN PLAN sticks its stuff by default?
 
posted by [identity profile] mobbsy.livejournal.com at 06:32am on 07/02/2005
So I see, I'll look through it :-)

"@" is just "source in this file to SQL*Plus" - @foo looks in the current directory for foo.sql and runs it.

"?" is short for $ORACLE_HOME in SQL*Plus

The rest is just a path. If you look in $ORACLE_HOME/rdbms/admin you'll find a lot of SQL scripts, utlxpls.sql is Utility Explain Plan Serial
 
posted by [identity profile] mobbsy.livejournal.com at 06:37am on 07/02/2005
...and looking at the utlxpls script, rather dates me. In 8i it was a big SQL query formatting a "SELECT ... FROM plan_table" - in 9iR2 and 10g it just wraps the dbms_xplan package.
 
posted by [identity profile] ex-lark-asc.livejournal.com at 07:24am on 07/02/2005
rather dates me

Now I have this weird mental image of you being asked out by a database management system.

I really don't want to think about how databases shag.
emperor: (eye)
posted by [personal profile] emperor at 07:28am on 07/02/2005
IRTA "inner join"
 
posted by [identity profile] sheffers.livejournal.com at 06:18am on 07/02/2005
Aaaargh, what are all the acronyms? Please enlighten my poor little head.
emperor: (Default)
posted by [personal profile] emperor at 06:29am on 07/02/2005
ThisMustBeSomeNewMeaningOfTheWord"second"OfWhichIWasPreviouslyUnaware
 
posted by [identity profile] deliberateblank.livejournal.com at 09:16am on 08/02/2005
Since I can't see an actual answer here yet...

The time estimate given by the plan processor is really just a guess. It uses rough heuristics involving the types of subquery used (table scan/index scan/bookmark lookup/filter/etc) and some statistics based on the actual data (number of rows, very broad distribution of data values within columns), but crucially it doesn't actually *do* the query at that stage so it can't know up front how long it will really take. The cost estimate is used to rank alternative plans, and is (for you) just a guideline that may alongside further experimentation affect the way you write queries to influence the plan generator in the right direction.

This has several effects, the major one being that despite all the up front analysis the plan generator may not actually result in the plan with the smallest execution time: for example a table with indexes on two columns both mentioned in the WHERE clause can be performed as an INDEX SCAN on the first column followed by a filter on the second, or vice versa, or INDEX SCAN on both then a set intersection. Which is fastest depends on the actual data and values fed into the WHERE clause, the number of matching rows you expect for each column, which is why most SQLs allow you to force a particular plan (WITH INDEX (column_name_index)) for cases when the DBMS gets it wrong.

(Sometimes, some good plans aren't even eligible unless you force them in this manner.)

Consider a query which ultimately is going to need to TABLE SCAN a large table, with a WHERE clause, no ORDER BY constraint, but an output row count limit. If specific values fed into your WHERE clause just happen to hit the start of the table's internal order it will complete immediately. The end and you're going to be waiting for a while. What estimate should EXPLAIN give for execution time? You don't want EXPLAIN itself potentially taking 21000 seconds to complete...

October

SunMonTueWedThuFriSat
      1
 
2
 
3
 
4
 
5
 
6
 
7
 
8
 
9
 
10
 
11
 
12
 
13
 
14
 
15
 
16
 
17
 
18
 
19
 
20
 
21
 
22
 
23
 
24
 
25
26
 
27
 
28
 
29
 
30
 
31