Automatic Segment Space Management (ASSM) helps to distribute your data across the data blocks. It’s no longer necessary to take care about the parameter PCTUSED, which was in former years responsible for the space within one Oracle block. Those days a block was filled until it reaches the high water mark PCTFREE (default 10%). At that point the block was released from the list of free blocks and as long as the low water mark PCTUSED (default 40%) hasn’t been reached again no data will be stored in that block. So only if a massive amount of data will be deleted from the table first the block will be filled again with new data.
With ASSM a bitmap takes care about the filling of a block. So if one row is going to be deleted from a block the next new row might be inserted again helping to keep the free space of a block low and so avoiding fragmentation. But is that true?
Example
The table we are using is called orders and no wonder it includes the information of orders being processed over a couple of years. Orders are added every day (we are an internet company) and over the last years the table was growing and growing. With the old “ANALYZE” command we can check the current block usage of the table:
ANALYZE TABLE orders COMPUTE STATISTICS; SELECT table_name, num_rows, blocks, empty_blocks, avg_space, avg_row_len FROM user_tables WHERE table_name = 'ORDERS'; TABLE_NAME NUM_ROWS BLOCKS EMPTY_BLOCKS AVG_SPACE AVG_ROW_LEN ----------- --------- --------- ------------ ---------- ----------- ORDERS 4000000 129309 611 861 231
This is the size and layout of the orders segment before any kind of reorganizations end of November 2014. Right before Christmas season we realized that we need to get rid of old data as the current size is about approx. 1 GB. Okay that’s not really huge but for this example it should be sufficient.
So we are going to archive the table and delete all order entered before January 1st 2014 and with the orderstatus “G” which meant those orders are closed.
DELETE FROM orders WHERE orderdate < to_date('01.01.2014','DD.MM.YYYY') AND orderstatus = 'G'; COMMIT;
Looking at the space usage of the table using ANALYZE TABLE again we can see that the total number of blocks hasn’t changed as we expected but the average space (AVG_SPACE) is about 4 KB now so half of each block is empty – on average. So there might be some blocks still filled up to the PCTFREE (like those blocks holding data from 2014) and some blocks are nearly empty (like those from 2010 where probably all orders are closed).
TABLE_NAME NUM_ROWS BLOCKS EMPTY_BLOCKS AVG_SPACE AVG_ROW_LEN ----------- ---------- -------- ------------ ---------- ----------- ORDERS 2166150 129309 611 4165 229
New orders are continuously entered into the system and at the end of January 2015 – sorry, we are a little ahead of current time – we’ll check about the space usage again (ANALYZE TABLE):
TABLE_NAME NUM_ROWS BLOCKS EMPTY_BLOCKS AVG_SPACE AVG_ROW_LEN ----------- ---------- -------- ------------ ---------- ----------- ORDERS 2336150 129309 611 3867 229
Approx. 200 000 orders have been entered since November 2014 but the size of the segment hasn’t increased (still 129 309 blocks). But the average space has been reduced by 10 percent.
This is nice to know and expected as we wanted to reduce the space usage of this segment. But does this have an impact on the performance of the application? – YES!
Let’s do three very simple queries:
Statement 1:
SELECT * FROM orders WHERE orderdate >= to_date('12.10.2014','DD.MM.YYYY') AND orderdate < to_date('13.10.2014','DD.MM.YYYY') call count cpu elapsed disk query current rows ------- ------ -------- ---------- ---------- ---------- ---------- ---------- Parse 1 0.00 0.00 0 1 0 0 Execute 1 0.00 0.00 0 0 0 0 Fetch 201 0.00 0.00 10 502 0 2996 ------- ------ -------- ---------- ---------- ---------- ---------- ---------- total 203 0.00 0.01 10 503 0 2996
Statement 2:
SELECT * FROM orders WHERE orderdate >= to_date('14.12.2014','DD.MM.YYYY') AND orderdate < to_date('15.12.2014','DD.MM.YYYY'); call count cpu elapsed disk query current rows ------- ------ -------- ---------- ---------- ---------- ---------- --------- Parse 1 0.00 0.00 0 1 0 0 Execute 1 0.00 0.00 0 0 0 Fetch 185 0.00 0.01 155 516 0 2760 ------- ------ -------- ---------- ---------- ---------- ---------- ---------- total 187 0.00 0.01 155 517 0 2760
Statement 3:
SELECT * FROM orders WHERE orderdate >= to_date('14.01.2015','DD.MM.YYYY') AND orderdate < to_date('15.01.2015','DD.MM.YYYY'); call count cpu elapsed disk query current rows ------- ------ -------- ---------- ---------- ---------- ---------- ---------- Parse 1 0.00 0.00 0 0 0 0 Execute 1 0.00 0.00 0 0 0 0 Fetch 202 0.00 0.00 0 563 0 3002 ------- ------ -------- ---------- ---------- ---------- ---------- ---------- total 204 0.00 0.00 0 563 0 3002
There is an index on the orderdate so we expect several blocks being read by the index.
Explanation
If you especially compare statement 1 with statement 3 you can see an increase of more than 10 percent in the number of blocks (from 503 to 563) while the number of rows is nearly the same.
The first query is running against orders, which were entered before our archiving and without the impact of the delete. The second and third query is running against orders, which were entered after the reorganization.
With the orders from January 2014 until November 2014 the blocks are packed to 90% of current data (PCTFREE 10%). But starting with December 2014 the blocks are reused so there is a great chance that old data (from 2013, 2012 or even older) is in the same block as the brand new “hot” data used in daily business. After a while or after all the “old” blocks are filled again to 90% the performance will increase again as then new empty blocks will be allocated for current data.
Conclusion
This example shows that there is no “one fit’s all” solution available even with new Oracle features. If you have tables like the one in the example there is a chance for data fragmentation due to reusage of blocks. In that case it might be feasible to reorganize the table right after the deletion of old data. I would recommend storing those kinds of tables in their own tablespaces. In case of reorganization you can simply move the table into a new tablespace and drop the old one. Yes, you temporarily need twice the space for the table but it’s the cleanest way.
This example is, of course, only valid for ordinary tables and not for partitioned tables – as long as they are partitioned based on the orderdate.
One question to the audience: Is there a way to get the information about avg_space, avg_row_length and chain_cnt without using ANALYZE TABLE ? Any idea and, of course, comments on the blog will be greatly appreciated.