Constraints and performance: you want as many constraints as possible

articles: 

Several times I have had to deal with people who do not want to define constraints. I have never understood why they don't, because my experience is that the more constraints you can define, the better Oracle will perform. If anyone knows where the idea that not defining constraints is a Good Thing comes from, I would be interested to know.

Following are two very simple examples of constraints allowing the optimizer to develop better plans.

First, foreign key constraints. These give the optimizer a lot of information about the data that may mean it can cut out whole tables from a query. Try this:

conn / as sysdba
drop user jon cascade;
grant dba to jon identified by jon;
conn jon/jon
alter session set optimizer_mode=all_rows;

create table t1(c1 varchar2(1) not null);
create table t2(c1 varchar2(1));
alter table t2 add constraint t2pk primary key (c1);
alter table t1 add constraint t1fk foreign key (c1) references t2;

insert into t2 values('a');
insert into t2 values('b');
begin for i in 1..10000 loop
insert into t1 values('a');
insert into t1 values('b');
end loop;
end;
/

exec dbms_stats.gather_table_stats('JON','T1');
exec dbms_stats.gather_table_stats('JON','T2');

set autotrace on
select * from t1 where c1 not in (select * from t2);
alter table t1 drop constraint t1fk;
select * from t1 where c1 not in (select * from t2);
set autotrace off

These are my results for the last two queries. With the foreign key constraint:

Execution Plan
----------------------------------------------------------
Plan hash value: 3332582666

-----------------------------------------------------------------------------------
| Id  | Operation                  | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT           |      |     1 |     2 |     0   (0)|          |
|*  1 |  FILTER                    |      |       |       |            |          |
|   2 |   TABLE ACCESS STORAGE FULL| T1   | 20000 | 40000 |    11   (0)| 00:00:01 |
-----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(NULL IS NOT NULL)

and now without it:
Execution Plan
----------------------------------------------------------
Plan hash value: 3341028080

-----------------------------------------------------------------------------------
| Id  | Operation                  | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT           |      |  5000 | 20000 |    12   (9)| 00:00:01 |
|   1 |  NESTED LOOPS ANTI         |      |  5000 | 20000 |    12   (9)| 00:00:01 |
|   2 |   TABLE ACCESS STORAGE FULL| T1   | 20000 | 40000 |    11   (0)| 00:00:01 |
|*  3 |   INDEX UNIQUE SCAN        | T2PK |     2 |     4 |     0   (0)| 00:00:01 |
-----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("C1"="T2"."C1")

So you can see that the constraint means that there is no need to make the join at all, because Oracle knows that there can be no orphan rows.

Second, not null constraints. Try this:

create table t3(c1 number not null);
begin
for i in 1..10000 loop
insert into t3 values(i);
end loop;
end;
/

create index t3i on t3(c1);

exec dbms_stats.gather_table_stats('JON','T3');

set autotrace on exp
select count(*) from t3;
alter table t3 modify c1 null;
select count(*) from t3;

My execution plan with the not null constraint is
Execution Plan
----------------------------------------------------------
Plan hash value: 1631330922

------------------------------------------------------------------------------
| Id  | Operation                     | Name | Rows  | Cost (%CPU)| Time     |
------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |      |     1 |     7   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE               |      |     1 |            |          |
|   2 |   INDEX STORAGE FAST FULL SCAN| T3I  | 10000 |     7   (0)| 00:00:01 |
------------------------------------------------------------------------------

and after dropping te not null constraint:
Execution Plan
----------------------------------------------------------
Plan hash value: 463314188

---------------------------------------------------------------------------
| Id  | Operation                  | Name | Rows  | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT           |      |     1 |     7   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE            |      |     1 |            |          |
|   2 |   TABLE ACCESS STORAGE FULL| T3   | 10000 |     7   (0)| 00:00:01 |
---------------------------------------------------------------------------

Obvious, isn't it? The constraint meant that Oracle did not have to touch the table at all, because every row must have an entry in the index.

So, to conclude, these very simple examples demonstrate the general case: always define as many constraints as you can, to give the optimizer the information it needs to develop efficient execution plans.

Comments

I think putting in foreign key constraints will slow down the DML's. In case of staging tables where data is inserted and deleted intensively the strategy of putting in constraints may not be useful. We had a situation like this recently where the performance of a screen used for uploading records into the staging table through webutil performed badly since the foreign keys were put with the actual base tables. In this case it's better to leave the task of validations to the batch processes rather than freezing up the screen for long. Please let me know your views.