2008年11月16日星期日

Single hash clustered table: 3

We have a big lookup table (30G) and there is a query which runs 12k per second against it.  5 LIOs is requred per execution. How much benifit we could get if convert it into hash clustered table? After the lots of test, the anwer is trivial.

Performance improvement could come from two aspects in comparison with index access:
1: non-concurrent environment:  Is hash access faster than index access?
2: concurrent environment: if there are many queries, say 12k,  per second, will the performance degrade much less than index access? If index access performance degrade much, but hash access only degrade slightly or even do not degrade, then hash access wins. But if index access doesn't degrade much, then hash access won't help from concurrent perspective.

In this article, I will discuss the first one.
 
It's well known that only one LIO is required using hash access if there is no chained block, and several LIOs is required using index access. And it's also well know LIO needs cpu. If the number of LIO drops 1/5 after a sql tuning, we are too quick to say the cpu will drop 1/5. There are two aspects to consider also:
A: CPU spent on LIO is not the total cpu used in a query. Sort also need cpu, but the cpu it used is not included in the LIOs. And more importantly, a query needs three phases: parse, execute and query. Normally for a query, the LIOs only appear in the query phase.
B: LIOs are not equally cost.  If you have 5 houses whose total value is $5M, after sell one house, does the value of the left 4 hours equal $4M? No, maybe the sold house cost $4M, and the rest 4 houses only cost $1M.

For point A.
The following tkprof result is generated from a simple query looped 10,000 times. The query used an index to fetch a row.  From the table, we could see the fetch phase only takes 1/3 of the total cpu time of the query.  Since we can't tune the cpu spent in the execute phase(for the parse time, maybe we could parse once and execute many times), no match how we tune the sql, the cpu can't  drop to less than 0.86 second for 10,000 executions. That means the dropping from 5LIOs to 1 LIOs couldn't save 80% cpu.

        select /*+ index(test1,idx_test1) */* from test1 where a =:v1;

Slide 13
call     count       cpu    elapsed       disk      query    current        rows             
------- ------  -------- ---------- ---------- ---------- ----------  ----------     
Parse    10000      0.28       0.24          0          0          0           0     
Execute  10000      0.86       0.84          0          0          0           0     
Fetch    10000      0.68       0.51          0      40000          0       10000     
------- ------  -------- ---------- ---------- ---------- ----------  ----------     
total    30000      1.82       1.61          0      40000          0       10000

For point B, I run a query 10k times using hash access, and then run a query 10k times using index access against the same table, and then get the time spent on LIO from the view v$sql_plan_statistics as beloew.
We could see for the hash access, the LIO need 17.5us. And for the LIOs from index access, the 3LIO from index segments used 26us, so on average each LIO on index segments is 26/3=9us. And 1LIO on data segment is 11.5us.

SQL> select x.HASH_VALUE,x.ELAPSED_TIME,x.EXECUTIONS from v$sql_plan_statistics x  where (hash_value=1683717485 or hash_value=63678345 ) and x.EXECUTIONS>100;
 
HASH_VALUE ELAPSED_TIME EXECUTIONS
---------- ------------ ----------
1683717485       175510      10000   ==> hash block access takes nearly 17.5 us
  63678345       375775      10000   ==> 1LIO on data segment take 11us (37.5-26=11.5us)
  63678345       260611      10000   ==> 3LIOs on index segment take 26us, one used 26/3=9 us.