原创

Oracle model子句

示例表:

create table sales_fact as
select country_name country,country_subregion region,prod_name product,calendar_year year,calendar_week_number week,
sum(amount_sold) sale,
sum(amount_sold*(
case
  when mod(rownum,10)=0 then 1.4
    when mod(rownum,5)=0 then 0.6
      when mod(rownum,2)=0 then 0.9
        when mod(rownum,2)=1 then 1.2
          else 1 end)) receipts
            from sh.sales,sh.times,sh.customers,sh.countries,sh.products
            where sales.time_id=times.time_id and
            sales.prod_id=products.prod_id and
            sales.cust_id=customers.cust_id and
            customers.country_id=countries.country_id
            group by
            country_name,country_subregion,prod_name,calendar_year,calendar_week_number;

需求:每年每周的库存
库存公式:
inventory for (year,week) = inventory(year,prior week)-quantity sold in this week+quantity received in this week
转化为SQL中的规则:

rules automatic
 order(inventory [ year, week ] = nvl(inventory [ cv(year), cv(week) - 1 ], 0) - sale [ cv(year), cv(week) ] + receipts [ cv(year), cv(week) ])

cv函数:cv表示现值,可用来表示从规则左侧计算得来的规则右侧的列值。例如:cv(year)指的是规则左侧year列的值。
automatic order:允许数据库引擎自动识别规则之间的依赖关系。不能用于符号标记。
sequential order:指定了规则按照其在列表中的先后顺序进行求解。不能用于符号标记。
最终含model子句的SQL:

select product, country, year, week, inventory, sale, receipts
  from sales_fact
 where country in ('Australia')
   and product = 'Xtend Memory' model return updated rows partition
 by(product, country) dimension by(year, week)
 measures(0 inventory, sale, receipts) rules automatic
 order(inventory [ year, week ] = nvl(inventory [ cv(year), cv(week) - 1 ], 0) - sale [ cv(year), cv(week) ] + receipts [ cv(year), cv(week) ])
 order by product, country, year, week;

model return updated rows声明这个语句使用model子句。使用model子句的SQL语句中,有3组列:分区列、维度列、度量值列。分区列(partition by)类似于电子表格中的一张工作表。维度列(dimension by)类似于行标签(A,B,C...)和列标签(1,2,3...)。度量值列(measures)类似于含有公式的单元格。
从数学的观点来看,model子句实现了分区数组。维度列是指向数据元素的索引,每个数组元素,也称为单元格,就是一个度量值列。
规则实现公式的作用来得出度量值列,它是在一个分区的范围内操作的,因此在规则中并没有显示的提到分区。
model子句中使用的partition关键字,与大表分区过程中使用的对象分区方案是不同的。

cv函数能够引用一个单元格,还能够使用位置或符号标记引用某个单独的单元格或单元格组。也可以使用for循环来以类似数组的方式创建或修改多个单元格。
位置标记能够在结果集中插入一个新单元格或更新一个已有的单元格。

select product, country, year, week, inventory, sale, receipts
  from sales_fact
 where country in ('Australia')
   and product = 'Xtend Memory' model return updated rows partition
 by(product, country) dimension by(year, week)
 measures(0 inventory, sale, receipts) rules automatic
 order(inventory [ year, week ] = nvl(inventory [ cv(year), cv(week) - 1 ], 0) - sale [ cv(year), cv(week) ] + receipts [ cv(year), cv(week) ], sale [ 2002, 1 ] = 0, receipts [ 2002, 1 ] = 0)
 order by product, country, year, week;

数据行中并没有2002年第一周的数据,可以插入一个新的单元格(如果有数据,则是更新)来初始化2002年第一周的列值。如上面规则中的sale [ 2002, 1 ] = 0, receipts [ 2002, 1 ] = 0。在方括号中,值的位置与维度子句中所申明的列的顺序是对应的。
符号标记能够在规则左侧声明一定的范围值。符号标记只能更新已有单元格,不能加入新的单元格。
例如:将2000年和2001年第1、52、53周的sale列的值更新为实际的110%。因为没有week为53,所以没有相应的行返回。

select product, country, year, week, sale
  from sales_fact
 where country in ('Australia')
   and product = 'Xtend Memory' model return updated rows partition
 by(product, country) dimension by(year, week)
 measures(sale) rules(sale[year in (2000,2001),week in (1,52,53)] order by year,week =sale[cv(year), cv(week)]*1.10)
 order by product, country, year, week;

该SQL仅返回了4行数据,尽管还有其他周的数据行。return updated rows子句控制了这一行为。如果没有这个子句,更新了的和没有更新的数据行都会返回。return updated rows只取出被规则修改过的数据行。

select product, country, year, week, sale
  from sales_fact
 where country in ('Australia')
   and product = 'Xtend Memory' model partition
 by(product, country) dimension by(year, week)
 measures(sale) rules(sale[year in (2000,2001),week in (1,52,53)] order by year,week =sale[cv(year), cv(week)]*1.10)
 order by product, country, year, week;

for循环允许指定规则左侧的值列表。for循环只可以定义在规则左侧,用来将新的单元格加入到输出中。
for循环语法:
for dimension from <value1> to <value2> [increment | decrement] <value3>
例如:想要增加2002年第1~53周数据的单元格并将值设置为0.

select product, country, year, week, inventory, sale, receipts
  from sales_fact
 where country in ('Australia')
   and product = 'Xtend Memory' model return updated rows partition
 by(product, country) dimension by(year, week)
 measures(0 inventory, sale, receipts) rules automatic
 order(inventory [ year, week ] = nvl(inventory [ cv(year), cv(week) - 1 ], 0) - sale [ cv(year), cv(week) ] + receipts [ cv(year), cv(week) ], sale [ 2002, for week from 1 to 53 increment 1 ] = 0, receipts [ 2002, for week from 1 to 53 increment 1 ] = 0)
 order by product, country, year, week;

聚合
例如:计算年平均库存、年最大销售额

select product, country, year, week, inventory,avg_inventory,max_sale, sale, receipts
  from sales_fact
 where country in ('Australia')
   and product = 'Xtend Memory' model return updated rows partition
 by(product, country) dimension by(year, week)
 measures(0 inventory,0 avg_inventory,0 max_sale, sale, receipts) rules automatic
 order(inventory [ year, week ] = nvl(inventory [ cv(year), cv(week) - 1 ], 0) - sale [ cv(year), cv(week) ] + receipts [ cv(year), cv(week) ], 
 avg_inventory[year,any]=avg(inventory)[cv(year),week],
 max_sale[year,any]=max(sale)[cv(year),week]
 )
 order by product, country, year, week;

迭代
迭代语法:
[iterate (n) [until <condition>]] (<cell_assignment>=<expression>...)
使用iterate (n)来执行一个表达式n次,使用iterate until 实现在给定条件为真时进行迭代。
例如:展示5周sale列的值。iterate(5)指定对于每一行规则程序段进行5次循环。iteration_number为当前循环次数的变量,第一次迭代从0开始。注意空字符串需要转换:cast(' ' as varchar2(50)) sale_list

select year, week, sale, sale_list
  from sales_fact
 where country in ('Australia')
   and product = 'Xtend Memory' model return updated rows partition
 by(product, country) dimension by(year, week)
 measures(cast(' ' as varchar2(50)) sale_list, sale) rules iterate(5)(sale_list [ year, week ] order by year,week = sale[cv(year),cv(week)-iteration_number+2]||case when iteration_number=0 then ' ' else ',' end ||sale_list[cv(year), cv(week)])
 order by year, week;

如果规则访问一个不存在的行,将会返回一个空值。例如上面的第一行的sale_list列有两个连续的逗号。可以使用presentv函数检查单元格是否存在来解决重复逗号的问题。
presentv函数语法:
presentv (cell_reference,expr1,expr2)
如果cell_reference引用了一个存在的单元格,那么presentv函数返回expr1,否则返回expr2。

select year, week, sale, sale_list
  from sales_fact
 where country in ('Australia')
   and product = 'Xtend Memory' model return updated rows partition
 by(product, country) dimension by(year, week)
 measures(cast(' ' as varchar2(50)) sale_list, sale) rules iterate(5)(sale_list [ year, week ] order by year,week = presentv(sale[cv(year),cv(week)-iteration_number+2],sale[cv(year),cv(week)-iteration_number+2]||case when iteration_number=0 then ' ' else ',' end ||sale_list[cv(year), cv(week)],sale_list[cv(year), cv(week)]))
 order by year, week;

presentnnv函数和presentv函数类似,但它可以进一步区分所引用的是不存在的单元格还是存在值为空的单元格。
presentnnv函数语法:
presentnnv (cell_reference,expr1,expr2)
如果cell_reference引用了一个存在的单元格并且单元格不含空值,则返回expr1,否则返回expr2。

查找表(参考表):在SQL语句的初始部分定义,然后在规则部分引用。可以有多张查找表。

select product, year, week, sale, prod_list_price
  from sales_fact
 where country in ('Australia')
   and product = 'Xtend Memory' model return updated rows reference
 ref_prod on (select prod_name, max(prod_list_price) prod_list_price
          from sh.products
         group by prod_name) dimension by(prod_name)
 measures(prod_list_price) main main_section partition
 by(product, country) dimension by(year, week)
 measures(sale, receipts, 0 prod_list_price)
 rules(prod_list_price [ year,
             week ] order by year,
             week = ref_prod.prod_list_price [ cv(product) ])
 order by year, week;

其中

reference ref_prod on 
(select prod_name, max(prod_list_price) prod_list_price
          from sh.products group by prod_name) 
 dimension by(prod_name)
 measures(prod_list_price)

定义了一个查找表ref_prod。dimension by指定了维度列,measures指定度量值列。引用表的维度列必须唯一,并且针对维度列中的每一个值只会取出一行。

main main_section
 partition by(product, country)
 dimension by(year, week)
 measures(sale, receipts, 0 prod_list_price)
 rules(prod_list_price [ year, week ] order by year, week = ref_prod.prod_list_price [ cv(product) ])

main关键字开头声明了名为main_section的main model部分。product列的当前值通过cv(product)子句传递过来作为查找表中的查找键值。要访问查找表中特定的某一行,需要从规则左侧传递维度列的当前值。在这个例子中,使用cv(product)子句来进行。可将ref_prod想象成一张表,cv(product)作为该表的主键,prod_list_price作为查找表中取出的一列。

在使用model的SQL语句中,有两个原因使得值为空:已经存在的单元格值为空或引用不存在的单元格。
默认情况下,引用不存在的单元格会返回空值。
model keep nav return updated rows中,keep nav是默认值,即等同model return updated rows。引用不存在的单元格会返回空值。
model ignore nav return updated rows,引用不存在的单元格,数值列将会返回0,文本列会返回一个空字符串。

使用model子句的SQL语句都会访问非常大的表。可以将Oracle的查询重写特性结合物化视图来改进此类语句的性能(注意:包含model子句的物化视图不适用快速增量刷新)。

create materialized view mv_model_inventory 
enable query rewrite as
select product, country, year, week, inventory, sale, receipts
  from sales_fact
 model return updated rows partition
 by(product, country) dimension by(year, week)
 measures(0 inventory, sale, receipts) rules automatic
 order(inventory [ year, week ] = nvl(inventory [ cv(year), cv(week) - 1 ], 0) - sale [ cv(year), cv(week) ] + receipts [ cv(year), cv(week) ])
 order by product, country, year, week;

通过执行计划,可以看到查询重写功能会重写以下的SQL子句并修改为访问物化视图而不是基础表。

select * from
(select product, country, year, week, inventory, sale, receipts
  from sales_fact
model return updated rows partition
 by(product, country) dimension by(year, week)
 measures(0 inventory, sale, receipts) rules automatic
 order(inventory [ year, week ] = nvl(inventory [ cv(year), cv(week) - 1 ], 0) - sale [ cv(year), cv(week) ] + receipts [ cv(year), cv(week) ]))
  where country in ('Australia')
   and product = 'Xtend Memory' 
 order by product, country, year, week;
正文到此结束
该篇文章的评论功能已被站长关闭