
Sorted Hash Clusters RIP


Sorted Hash Clusters have been around for several years, but I’ve not yet seen them being used, or even investigated in detail. This is a bit of a shame, really, because they seem to be engineered to address a couple of interesting performance patterns.

The basic concept is that data items that look alike are stored together (clustered) by applying a hashing function to generate a block address; on top of that, if you query the data by “hashkey”, the results are returned in sorted order of a pre-defined “sortkey” without any need for sorting. (To make matters worse, the manuals describing what happens and how it works are wrong.)

Yesterday I had reason to take a closer look at them, and decided that perhaps the reason no one talks about them is that they simply aren’t safe. Here’s a trivial demonstration, which I’ve run on 10.2.0.5, 11.2.0.3, and 12.1.0.1:


execute dbms_random.seed(0)

create cluster sorted_hash_cluster (
	hash_value	number(6,0),
	sort_value	varchar2(2)	sort
)
size 300
hashkeys 100
;

create table sorted_hash_table (
	hash_value	number(6,0),
	sort_value	varchar2(2),
	v1		varchar2(10),
	padding		varchar2(30)
)
cluster sorted_hash_cluster (
	hash_value, sort_value
)
;


begin
	for i in 1..5000 loop
		insert into sorted_hash_table values(
			trunc(dbms_random.value(0,99)),
			dbms_random.string('U',2),
			lpad(i,10),
			rpad('x',30,'x')
		);
		commit;
	end loop;
end;
/

begin
	dbms_stats.gather_table_stats(
		ownname		 => user,
		tabname		 =>'sorted_hash_table'
	);
end;
/

select count(*) from sorted_hash_table where hash_value = 92;
select count(*) from sorted_hash_table where hash_value = 92 and sort_value is null;
select count(*) from sorted_hash_table where hash_value = 92 and sort_value is not null;

select * from sorted_hash_table where hash_value = 92 and sort_value >= 'YR';
select * from sorted_hash_table where hash_value = 92 and sort_value > 'YR';

I think the last two queries are exactly the type of query for which the feature was invented. Just check the results, which are a cut-n-paste from a session with echo set on:


SQL> select count(*) from sorted_hash_table where hash_value = 92;

  COUNT(*)
----------
        60

1 row selected.

SQL> select count(*) from sorted_hash_table where hash_value = 92 and sort_value is null;

  COUNT(*)
----------
        60

1 row selected.

SQL> select count(*) from sorted_hash_table where hash_value = 92 and sort_value is not null;

  COUNT(*)
----------
        60

1 row selected.

SQL> select * from sorted_hash_table where hash_value = 92 and sort_value >= 'YR';

HASH_VALUE SO V1         PADDING
---------- -- ---------- ------------------------------
        92 YR       4773 xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
        92 ZF        250 xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
        92 ZJ       2046 xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
        92 ZT         65 xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

4 rows selected.

SQL> 
SQL> select * from sorted_hash_table where hash_value = 92 and sort_value > 'YR';

no rows selected


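So: Null is not null (all 60 rows for hash_value 92 are counted both as having a null sort_value and as having a non-null one), and ‘ZF’ is not greater than ‘YR’, it’s only greater than or equal to ‘YR’!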
I’d be interested to see the test cases that the developer used for this feature that allowed it to ship at all.
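
One way to sanity-check results like these is to copy the rows into an ordinary heap table and repeat the predicates there. The following is a minimal sketch and not part of the original test: the table name heap_copy is made up for illustration, and it assumes that the scan driving the copy returns the stored column values correctly, which is itself worth verifying. The final two statements simply display whatever access path the optimizer chooses for the clustered table; no particular plan is claimed here.

```sql
-- Hypothetical cross-check, not part of the original test.
-- heap_copy is an illustrative name for an ordinary heap table.
create table heap_copy
as
select * from sorted_hash_table;

-- For any data set the null and not-null counts must add up to the total.
select
	count(*)					total_rows,
	count(case when sort_value is null then 1 end)	null_rows,
	count(sort_value)				not_null_rows
from
	heap_copy
where
	hash_value = 92
;

-- The same range predicate that returned no rows from the cluster.
select	*
from	heap_copy
where	hash_value = 92
and	sort_value > 'YR'
order by
	sort_value
;

-- Display the access path chosen for the clustered table.
explain plan for
select * from sorted_hash_table where hash_value = 92 and sort_value > 'YR';

select * from table(dbms_xplan.display);
```

If the heap copy gives different answers from the cluster for the same predicates, that would suggest the problem lies in the way the sorted hash cluster is accessed rather than in the data itself.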


