Quantcast
Channel: Oracle数据库技术文刊-汇聚全世界的甲骨文技术 » infrastructure – ORACLE数据库技术文刊-汇聚全世界的甲骨文技术 – ParnassusData诗檀软件旗下网站
Viewing all articles
Browse latest Browse all 42

Clustaghhh!

$
0
0

It doesn’t matter which bit of Oracle technology you want to use, eventually someone, somewhere, runs into the special case where something nasty happens. Here’s an edge case for people using (index) clusters – Oracle Bug  17866999 ora-1499 for cluster following rman convert

It comes from a conversation on Oracle-L where Jack van Zanen reported a problem of inconsistent results after migrating data between platforms using rman to converts some tablespaces containing index clusters. This is the starting post where he shows a query that is clearly getting the wrong answer (select where channel_number = 503 obviously shouldn’t return data with channel_number 501).

Unfortunately the “audit-trail” on this problem is incomplete because I accidentally started sending email to Jack instead of the list (reply, rather than reply all) – but given that the problem column was the cluster key of an index cluster the obvious guess was that something had gone wrong as the data for this key value had grown and chained into a second (or third) block. To confirm this I asked Jack to select the rowids reported for the cluster and use the dbms_rowid package to convert them into distinct (file_number, block_number) values:

select
	dbms_rowid.rowid_relative_fno(rowid)	file_no,
	dbms_rowid.rowid_block_number(rowid)	block_no,
	count(*)				rows_in_block
from
	&m_table	t1
group by
	dbms_rowid.rowid_relative_fno(rowid),
	dbms_rowid.rowid_block_number(rowid)
order by
	dbms_rowid.rowid_relative_fno(rowid),
	dbms_rowid.rowid_block_number(rowid)
;

With a few file and block numbers in hand, we then needed to see the results from dumping the blocks. Jack sent me the trace files, and I found that my guess was correct – and also discovered that I had been on a private discussion, so I posted the results back to the list. As you can see from that posting, the cluster key (indicated by the initial K in the flag byte, and by its presence in “tab 0″ in the block) is 4 columns long, and I’ve got entries from two blocks where the cluster keys are pointing to each other – but the cluster keys don’t hold the same value.

It’s an extraordinary corruption – the data structures appear to be totally self-consistent and perfectly clean, and yet the data content is wrong.  (Usually I find that a corruption means the data becomes unusable).

Oracle eventually tracked this down and reproduced it; it’s a problem that can arise if you are using index clusters and use rman to do a cross-platform conversion – and it’s not going to be easy to fix it. I don’t suppose many people are likely to see this bug (it’s been around for years, apparently, but there aren’t many people who use clusters, and of those perhaps very few have migrated between platforms using rman) – but at present the only workaround is to export and reimport the data; which is not nice when you’re talking about terabytes of clusters.



Viewing all articles
Browse latest Browse all 42

Trending Articles