With Oracle 12.1.0.2 the brand new database option named inMemory has been introduced. There are many blogs and articles out in the field so I don’t want to go in to much detail of the main concept. To me “inmemory” sounds a little obvious because whenever we are working with the Oracle database most of our data should be in Memory – right? But there is another naming for it: Column store. So it stores the data in columns instead of rows? Due to the Oracle documentation this isn’t true because “The columnar format exists only in memory (Concepts Guide)”. So the concept of inMemory is that an area exists in memory as part of the sga and the data in this area is no longer populated in row format like the buffer cache but in a column format. Especially Datawarehouses and Online Analytic Systems with benefit but because there is no difference in the storage of data the option can be simply enabled and used in OLTP systems as well. For the main concept I would recommend reading the Oracle Concepts Guide and the whitepaper “Oracle Database In-Memory”.
Real Life
Some of my customers tried inMemory more or less successful. Most of the time the reason for an unsuccessful test was that the data was not suitable for inMemory (e.g. no star schema, too complex queries, insufficient amount of data). But at least I had an appointment with a customer who is mainly specialized on Business Intelligence solutions. So a perfect fit for inMemory. We did some basic tests in their environment and without any changes to the application the queries run 10 to 20 times faster (and even more). I’m not allowed to use the customer queries over here but I’ve created a test environment in my own lab and here are some results.
Examples
The queries are based on a star flake schema with one fact table named “UMSAETZE” (German for “REVENUE”) with about 145 million rows and four dimension tables named “ARTIKEL”; “REGION”, “ZEITEN” and “KUNDEN” (German for “ITEM”, “REGION”, “DATES” and “CUSTOMERS”).
The first query ist simply this:
SELECT count(*) FROM umsatz;
The second query is more complex:
SELECT sum(u.umsatz) as gesamtumsatz, a.artikelgruppe, z.monat_des_jahres, z.jahreszahl, r.region, k.plz, k.ort FROM region r INNER JOIN umsaetze u ON (r.region_id = u.region_id) INNER JOIN kunden k ON (k.kunden_id = u.kunden_id) INNER JOIN artikel a ON (a.artikel_id = u.artikel_id) INNER JOIN zeiten z ON (zeiten_id = zeit_id) WHERE r.region = 'Nord' AND z.jahreszahl in (2014,2015) AND z.monat_des_jahres = 12 GROUP BY a.artikelgruppe, z.monat_des_jahres, z.jahreszahl, k.plz, k.ort, r.region FETCH FIRST 10 PERCENT ROWS ONLY;
To allow tables to be populated into the column store the parameter inmemory_size must be set. As the inmemory area is part of the sga this parameter must be set in conjunction with the sga_target or memory_target parameter. In my sample database the sga_target was about 10 GB while the inmemory_size was 5 GB. The inMemory area is not a cache like the buffer cache so if a table is too huge or the area is too small tables might be populated partly and an error message will be written to the alert log.
Populate the tables
Tables can be populated with the “ALTER TABLE … INMEMORY” command. Even though this is the default I would not recommend using it. Because with this command a table will only be populated when it is used the first time (e.g. with an DML or a SELECT command). Instead I would recommend to populate the most important fact table first and next all tables associated with it:
SQL> ALTER TABLE umsaetze INMEMORY PRIORITY CRITICAL; SQL> ALTER TABLE artikel INMEMORY PRIORITY HIGH; SQL> ALTER TABLE kunden INMEMORY PRIORITY HIGH; SQL> ALTER TABLE region INMEMORY PRIORITY HIGH; SQL> ALTER TABLE zeiten INMEMORY PRIORITY HIGH;
With this commands the table UMSAEZE will be populated immediately and whenever the instance is being started. After the table has been loaded completely into memory the rest of the tables will be populated as long as there is succifient space in the inmemory area.
The views v$im_segments and v$im_user_segments can be used to validate that the tables have been populated. I granted SELECT on the v$im_user_segments to my test user so there is no need to use a privileged user for the verification.
SELECT inst_id, segment_name, inmemory_size, bytes, bytes_not_populated, populate_status FROM v$im_user_segments; SEGMENT_NAME INMEMORY_SIZE BYTES BYTES_NOT_POPULATED POPULATE_ -------------------- ------------- ---------- ------------------- --------- REGION 1179648 393216 0 COMPLETED KUNDEN 24313856 83886080 0 COMPLETED UMSAETZE 2710831104 6978273280 0 COMPLETED ZEITEN 1179648 131072 0 COMPLETED
One strange behavior which I cannot explain yet: the table “ARTIKEL” is not populated at all. I tried with several options and even with “PRIORITY CRITICAL” the table is not listed. As this is a very small table (only 300 rows) it doesn’t affect the overall query result.
Now we are ready to start our first test:
1. Query 1 with the NO_INMEMORY hint
call count cpu elapsed disk query current rows ------- ------ -------- ---------- ---------- ---------- ---------- ---------- Parse 1 0.00 0.00 0 0 0 0 Execute 1 0.00 0.00 0 0 0 0 Fetch 2 11.32 39.68 308384 308455 0 1 ------- ------ -------- ---------- ---------- ---------- ---------- ---------- total 4 11.32 39.68 308384 308455 0 1 Misses in library cache during parse: 0 Optimizer mode: ALL_ROWS Parsing user id: 105 Number of plan statistics captured: 1 Rows (1st) Rows (avg) Rows (max) Row Source Operation ---------- ---------- ---------- --------------------------------------------------- 1 1 1 SORT AGGREGATE (cr=308455 pr=308384 pw=0 time=39683926 us) 144500000 144500000 144500000 INDEX FAST FULL SCAN PK_UMSAETZE (cr=308455 pr=308384 pw=0 time=60586018 us cost=81707 size=0 card=144500000)(object id 92071)
2. Query 2 without the hint
call count cpu elapsed disk query current rows ------- ------ -------- ---------- ---------- ---------- ---------- ---------- Parse 1 0.00 0.03 0 0 0 0 Execute 1 0.00 0.00 0 0 0 0 Fetch 2 0.95 1.14 0 12 0 1 ------- ------ -------- ---------- ---------- ---------- ---------- ---------- total 4 0.95 1.17 0 12 0 1 Misses in library cache during parse: 1 Optimizer mode: ALL_ROWS Parsing user id: 105 Number of plan statistics captured: 1 Rows (1st) Rows (avg) Rows (max) Row Source Operation ---------- ---------- ---------- --------------------------------------------------- 1 1 1 SORT AGGREGATE (cr=12 pr=0 pw=0 time=1144222 us) 144500000 144500000 144500000 TABLE ACCESS INMEMORY FULL UMSAETZE (cr=12 pr=0 pw=0 time=1130484 us cost=8708 size=0 card=144500000)
Result 1
The result is really impressive. Instead of approximately 40 seconds the query took only one second.
But what about the more complex query?
3. Query 2 with the NO_INMEMORY hint
all count cpu elapsed disk query current rows ------- ------ -------- ---------- ---------- ---------- ---------- ---------- Parse 1 0.04 0.04 0 0 0 0 Execute 1 0.00 0.00 0 0 0 0 Fetch 88 29.36 69.63 854299 849731 0 1304 ------- ------ -------- ---------- ---------- ---------- ---------- ---------- total 90 29.41 69.68 854299 849731 0 1304 Misses in library cache during parse: 1 Optimizer mode: ALL_ROWS Parsing user id: 105 Number of plan statistics captured: 1 Rows (1st) Rows (avg) Rows (max) Row Source Operation ---------- ---------- ---------- --------------------------------------------------- 1304 1304 1304 VIEW (cr=849731 pr=854299 pw=4619 time=69638236 us cost=268671 size=217524901 card=1587773) 13031 13031 13031 WINDOW BUFFER (cr=849731 pr=854299 pw=4619 time=69645196 us cost=268671 size=130197386 card=1587773) 13031 13031 13031 HASH GROUP BY (cr=849731 pr=854299 pw=4619 time=69609961 us cost=268671 size=130197386 card=1587773) 511356 511356 511356 HASH JOIN (cr=849731 pr=854299 pw=4619 time=56529021 us cost=238350 size=130197386 card=1587773) 374 374 374 TABLE ACCESS FULL ARTIKEL (cr=5 pr=0 pw=0 time=115 us cost=3 size=5984 card=374) 511356 511356 511356 HASH JOIN (cr=849726 pr=854299 pw=4619 time=56394671 us cost=238343 size=104793018 card=1587773) 1000000 1000000 1000000 TABLE ACCESS FULL KUNDEN (cr=9683 pr=9678 pw=0 time=54447 us cost=2745 size=21996920 card=999860) 511356 511356 511356 HASH JOIN (cr=840043 pr=839971 pw=0 time=43368955 us cost=229777 size=69862012 card=1587773) 1351 1351 1351 TABLE ACCESS FULL REGION (cr=41 pr=0 pw=0 time=556 us cost=13 size=13734 card=1526) 3128202 3128202 3128202 HASH JOIN (cr=840002 pr=839971 pw=0 time=85763697 us cost=229748 size=222288150 card=6351090) 32 32 32 TABLE ACCESS FULL ZEITEN (cr=14 pr=0 pw=0 time=154 us cost=6 size=616 card=56) 144500000 144500000 144500000 TABLE ACCESS FULL UMSAETZE (cr=839988 pr=839971 pw=0 time=41869421 us cost=229378 size=3468000000 card=144500000)
4. Query 2 without a hint
call count cpu elapsed disk query current rows ------- ------ -------- ---------- ---------- ---------- ---------- ---------- Parse 1 0.31 0.37 0 0 0 0 Execute 1 0.00 0.00 0 0 0 0 Fetch 88 1.83 2.74 4650 26 0 1304 ------- ------ -------- ---------- ---------- ---------- ---------- ---------- total 90 2.14 3.11 4650 26 0 1304 Misses in library cache during parse: 1 Optimizer mode: ALL_ROWS Parsing user id: 105 Number of plan statistics captured: 1 Rows (1st) Rows (avg) Rows (max) Row Source Operation ---------- ---------- ---------- --------------------------------------------------- 1304 1304 1304 VIEW (cr=26 pr=4650 pw=4619 time=2735009 us cost=47289 size=217524901 card=1587773) 13031 13031 13031 WINDOW BUFFER (cr=26 pr=4650 pw=4619 time=2736637 us cost=47289 size=130197386 card=1587773) 13031 13031 13031 HASH GROUP BY (cr=26 pr=4650 pw=4619 time=2728219 us cost=47289 size=130197386 card=1587773) 511356 511356 511356 HASH JOIN (cr=26 pr=4650 pw=4619 time=2316748 us cost=16968 size=130197386 card=1587773) 374 374 374 TABLE ACCESS INMEMORY FULL ARTIKEL (cr=5 pr=0 pw=0 time=28765 us cost=1 size=5984 card=374) 511356 511356 511356 HASH JOIN (cr=21 pr=4650 pw=4619 time=2211160 us cost=16964 size=104793018 card=1587773) 1000000 1000000 1000000 TABLE ACCESS INMEMORY FULL KUNDEN (cr=3 pr=0 pw=0 time=72471 us cost=225 size=21996920 card=999860) 511356 511356 511356 HASH JOIN (cr=18 pr=0 pw=0 time=951774 us cost=10917 size=69862012 card=1587773) 1351 1351 1351 TABLE ACCESS INMEMORY FULL REGION (cr=3 pr=0 pw=0 time=169990 us cost=4 size=13734 card=1526) 3128202 3128202 3128202 HASH JOIN (cr=15 pr=0 pw=0 time=645427 us cost=10897 size=222288150 card=6351090) 32 32 32 JOIN FILTER CREATE :BF0000 (cr=3 pr=0 pw=0 time=14294 us cost=4 size=616 card=56) 32 32 32 TABLE ACCESS INMEMORY FULL ZEITEN (cr=3 pr=0 pw=0 time=14084 us cost=4 size=616 card=56) 3128202 3128202 3128202 JOIN FILTER USE :BF0000 (cr=12 pr=0 pw=0 time=227502 us cost=10529 size=3468000000 card=144500000) 3128202 3128202 3128202 TABLE ACCESS INMEMORY FULL UMSAETZE (cr=12 pr=0 pw=0 time=188217 us cost=10529 size=3468000000 card=144500000)
Result 2
Again a fantastic performance gain: 70 seconds without inmemory and only 3 (!) seconds with inMemory.
As a first result inMemory looks very promising. The customer was impressed and we went to the next stage: the preproduction database. Unfortunately that was a RAC database – and not an Exadata.
In my next blog I will tell you about the challenges when using inMemory with RAC.