
帅小伙's blog

A Clean Statistics Query Using Row-to-Column Conversion, an Outer Join, and a Cartesian Product

2008-03-28 15:29:24 | Category: oracle


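Today a new reporting request came in, asking for output in the following layout:

ssn  date  163 blog photo xyq xy2

The columns after the date are product names. The report counts how many times each ssn logged in to each product on each day from 2.21 to 3.20. If an ssn did not log in at all on a given day, a row for that ssn and date must still appear, with 0 shown for every product. Here is the result for one user: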

0...123455,20080221,1,0,0,0,0
0...123455,20080222,0,0,0,0,0
0...123455,20080223,0,0,0,0,0
0...123455,20080224,0,0,0,0,0
0...123455,20080225,0,0,0,0,0
0...123455,20080226,0,0,0,0,0
0...123455,20080227,0,0,0,0,0
0...123455,20080228,0,0,0,0,0
0...123455,20080229,0,0,0,0,0
0...123455,20080301,0,0,0,0,0
0...123455,20080302,0,0,0,0,0
0...123455,20080303,0,0,0,0,0
0...123455,20080304,0,0,0,0,0
0...123455,20080305,0,0,0,0,0
0...123455,20080306,0,0,0,0,0
0...123455,20080307,0,0,0,0,0
0...123455,20080308,0,0,0,0,0
0...123455,20080309,0,0,0,0,0
0...123455,20080310,0,0,0,0,0
0...123455,20080311,0,0,0,0,0
0...123455,20080312,0,0,0,0,0
0...123455,20080313,0,0,0,0,0
0...123455,20080314,0,0,0,0,0
0...123455,20080315,0,0,0,0,0
0...123455,20080316,0,0,0,0,0
0...123455,20080317,0,0,0,0,0
0...123455,20080318,0,0,0,0,0
0...123455,20080319,0,0,0,0,0
0...123455,20080320,0,0,0,0,0

The statistics are built from two base tables, whose structures are as follows:

SQL> desc login_record_new_reg
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 SSN                                                VARCHAR2(20)
 LOGIN_TIME                                         DATE
 LOGIN_IP                                           VARCHAR2(15)
 LOGIN_PDT                                          VARCHAR2(15)
 AUTH_TYPE                                          VARCHAR2(10)

SQL> desc ursdw.user_new_reg_2w
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 SSN                                                VARCHAR2(20)

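login_record_new_reg holds the login records of all ssns for the specified dates, and its login_pdt column is the product identifier. ursdw.user_new_reg_2w holds the ssns to be counted and has to be joined to the first table to produce the per-product figures. Given these table structures, getting the target data clearly requires a row-to-column conversion. The SQL is as follows: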

select  ssn,login_time,
 max(decode(login_pdt,'mail163',num,0)) mail163,
 max(decode(login_pdt,'blog',num,0)) blog,
 max(decode(login_pdt,'photo',num,0)) photo,
 max(decode(login_pdt,'xyq',num,0)) xyq,
 max(decode(login_pdt,'xy2',num,0))xy2
from
(
select  a.ssn,a.login_time,a.login_pdt,num
from
(
select  ssn,
 to_char(login_time,'yyyymmdd') login_time,
 login_pdt,
 count(1) num
 from system.login_record_new_reg
 where login_pdt in('mail163','blog','photo','xyq','xy2')
 group by ssn,to_char(login_time,'yyyymmdd'),login_pdt
)a,ursdw.user_new_reg_2w b
where a.ssn=b.ssn
)
group by ssn,login_time

The result for the test user shown above:

SSN                                      LOGIN_TIME          MAIL163       BLOG      PHOTO        XYQ        XY2
---------------------------------------- ---------------- ---------- ---------- ---------- ---------- ----------
0...123455                               20080221                  1          0          0          0          0
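
To make the row-to-column step easy to try without an Oracle instance, here is a minimal sketch that runs the same idea on an in-memory SQLite database through Python's sqlite3 module. It is not the query from this post: the table name login_record, the sample rows and the user u1 are made up for illustration, SQLite has no decode, so a CASE expression stands in for it, and the count and the pivot are folded into a single sum(case ...) pass instead of count(1) followed by max(decode(...)).

```python
import sqlite3

# Hypothetical sample data: table name, rows and the user "u1" are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("create table login_record (ssn text, login_time text, login_pdt text)")
conn.executemany(
    "insert into login_record values (?, ?, ?)",
    [
        ("u1", "2008-02-21 09:00:00", "mail163"),
        ("u1", "2008-02-21 10:30:00", "mail163"),
        ("u1", "2008-02-21 11:00:00", "blog"),
        ("u1", "2008-02-22 08:00:00", "xyq"),
    ],
)

# One output row per (ssn, day); each product becomes its own column.
# CASE replaces Oracle's decode, strftime replaces to_char.
pivot_sql = """
select ssn,
       strftime('%Y%m%d', login_time)                          as login_day,
       sum(case when login_pdt = 'mail163' then 1 else 0 end) as mail163,
       sum(case when login_pdt = 'blog'    then 1 else 0 end) as blog,
       sum(case when login_pdt = 'photo'   then 1 else 0 end) as photo,
       sum(case when login_pdt = 'xyq'     then 1 else 0 end) as xyq,
       sum(case when login_pdt = 'xy2'     then 1 else 0 end) as xy2
from login_record
where login_pdt in ('mail163', 'blog', 'photo', 'xyq', 'xy2')
group by ssn, strftime('%Y%m%d', login_time)
order by ssn, login_day
"""

pivot_rows = conn.execute(pivot_sql).fetchall()
for row in pivot_rows:
    print(row)
```

As in the Oracle result above, only days with at least one login produce a row, so the pivot alone cannot deliver the all-zero rows.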

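Because this user happened to log in only on 2.21, there is just one row. The next step is to make the remaining dates (2.22 to 3.20) appear as well.

I thought about this for quite a while without finding a good approach. Then a colleague suggested trying an outer join, and on reflection that really is a good fit: an outer join returns the matched rows and also keeps the rows of the preserved table that have no match, which is exactly what is needed here. It does require building a table to join against, one that contains every ssn being counted for every date, that is, one row per ssn per day, as shown below: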

ssn1 20080221

ssn1 20080222

......

ssn1 20080320

ssn2 20080221

ssn2 20080222

......

ssn2 20080320

......

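The table ursdw.user_new_reg_2w already lists every ssn to be counted, so how do we attach one row per day from 2.21 to 3.20 to each ssn? At first I planned to do it in PL/SQL, but then it struck me: what we want is exactly the Cartesian product of the user table and a date table (a table holding only the dates 2.21 to 3.20). So I ran the following steps:

(1)create table month(login_time varchar2(8));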

(2)insert into month values('20080221');

....

     insert into month values('20080320');

(3)create table month_2w as select ssn,login_time from ursdw.user_new_reg_2w ,ursdw.month;

(4) Query the generated join table

SQL> select * from (select * from ursdw.month_2w  order by ssn,login_time)  where rownum<30
0...123455                               20080221
0...123455                               20080222
0...123455                               20080223
0...123455                               20080224
0...123455                               20080225
0...123455                               20080226
0...123455                               20080227
0...123455                               20080228
0...123455                               20080229
0...123455                               20080301
0...123455                               20080302
0...123455                               20080303
0...123455                               20080304
0...123455                               20080305
0...123455                               20080306
0...123455                               20080307
0...123455                               20080308
0...123455                               20080309
0...123455                               20080310
0...123455                               20080311
0...123455                               20080312
0...123455                               20080313
0...123455                               20080314
0...123455                               20080315
0...123455                               20080316
0...123455                               20080317
0...123455                               20080318
0...123455                               20080319
0...123455                               20080320

29 rows selected.

Sure enough, this is exactly what we expected.

The last step is to outer join this table with the earlier result. The SQL is as follows:
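
The scaffold-building steps can also be sketched in a runnable form. The snippet below is an illustration on SQLite via Python's sqlite3, not the Oracle session from this post: the table names users and calendar_days and the users u1 and u2 are hypothetical, and the 29 dates are generated in Python instead of being typed as 29 insert statements. Only the resulting table name month_2w is taken from the post.

```python
import sqlite3
from datetime import date, timedelta

# Generate every day from 2008-02-21 to 2008-03-20 inclusive as yyyymmdd strings.
start, end = date(2008, 2, 21), date(2008, 3, 20)
days = [
    (start + timedelta(days=i)).strftime("%Y%m%d")
    for i in range((end - start).days + 1)
]

conn = sqlite3.connect(":memory:")
conn.execute("create table users (ssn text)")                 # stands in for user_new_reg_2w
conn.execute("create table calendar_days (login_time text)")  # stands in for the month table
conn.executemany("insert into users values (?)", [("u1",), ("u2",)])
conn.executemany("insert into calendar_days values (?)", [(d,) for d in days])

# Cartesian product: every user paired with every day.
conn.execute(
    "create table month_2w as "
    "select ssn, login_time from users cross join calendar_days"
)

scaffold = conn.execute(
    "select ssn, login_time from month_2w order by ssn, login_time"
).fetchall()
print(len(days), len(scaffold))
print(scaffold[0], scaffold[-1])
```

The row count of the scaffold is always the number of users times the number of days, which is a quick sanity check before running the final join.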

select  d.ssn,d.login_time,nvl(mail163,0),nvl(blog,0),nvl(photo,0),nvl(xyq,0),nvl(xy2,0) from
(
select  ssn,login_time,
 max(decode(login_pdt,'mail163',num,0)) mail163,
 max(decode(login_pdt,'blog',num,0)) blog,
 max(decode(login_pdt,'photo',num,0)) photo,
 max(decode(login_pdt,'xyq',num,0)) xyq,
 max(decode(login_pdt,'xy2',num,0))xy2
from
(
select  a.ssn,a.login_time,a.login_pdt,num
from
(
select  ssn,
 to_char(login_time,'yyyymmdd') login_time,
 login_pdt,
 count(1) num
 from system.login_record_new_reg
 where login_pdt in('mail163','blog','photo','xyq','xy2')
 group by ssn,to_char(login_time,'yyyymmdd'),login_pdt
)a,ursdw.user_new_reg_2w b
where a.ssn=b.ssn
)
group by ssn,login_time
)c,ursdw.month_2w d
where c.ssn(+)=d.ssn and c.login_time(+)=d.login_time
order by ssn,login_time
;

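Note that nvl is required here; otherwise the login counts for days without any login record come out as null instead of 0.

After exporting the result to a text file with the ociuldr tool, the contents are as follows: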
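
The effect of the null handling is easiest to see side by side. The following sketch again uses SQLite through Python's sqlite3 as a stand-in for Oracle, with assumptions stated up front: the tables scaffold and daily_counts, their rows and the users u1 and u2 are invented, ANSI left join replaces Oracle's (+) notation, and coalesce plays the role of nvl. The raw_mail163 column is the unwrapped value, kept only to show what the output would look like without the wrapper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical scaffold (one row per ssn per day) and pivoted counts.
    create table scaffold (ssn text, login_day text);
    create table daily_counts (ssn text, login_day text, mail163 integer, blog integer);

    insert into scaffold values ('u1', '20080221'), ('u1', '20080222'), ('u2', '20080221');
    insert into daily_counts values ('u1', '20080221', 1, 0);
    """
)

# The scaffold is the preserved side, so every (ssn, day) pair survives the join.
# Unmatched rows carry NULL in the count columns until coalesce turns them into 0.
rows = conn.execute(
    """
    select d.ssn,
           d.login_day,
           c.mail163              as raw_mail163,
           coalesce(c.mail163, 0) as mail163,
           coalesce(c.blog, 0)    as blog
    from scaffold d
    left join daily_counts c
           on c.ssn = d.ssn
          and c.login_day = d.login_day
    order by d.ssn, d.login_day
    """
).fetchall()

for row in rows:
    print(row)
```

Rows without a match show None in raw_mail163 and 0 in the wrapped columns, which is the difference the note above describes.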

./ociuldr_linux  -si sql=2w.txt file=2w.out

[oracle@localhost sh]$ more 2w.out
ssn,date,mail163,blog,photo,xyq,xy2
0...123455,20080221,1,0,0,0,0
0...123455,20080222,0,0,0,0,0
0...123455,20080223,0,0,0,0,0
0...123455,20080224,0,0,0,0,0
0...123455,20080225,0,0,0,0,0
0...123455,20080226,0,0,0,0,0
0...123455,20080227,0,0,0,0,0
0...123455,20080228,0,0,0,0,0
0...123455,20080229,0,0,0,0,0
0...123455,20080301,0,0,0,0,0
0...123455,20080302,0,0,0,0,0
0...123455,20080303,0,0,0,0,0
0...123455,20080304,0,0,0,0,0
0...123455,20080305,0,0,0,0,0
0...123455,20080306,0,0,0,0,0
0...123455,20080307,0,0,0,0,0
0...123455,20080308,0,0,0,0,0
0...123455,20080309,0,0,0,0,0
0...123455,20080310,0,0,0,0,0
0...123455,20080311,0,0,0,0,0
0...123455,20080312,0,0,0,0,0
0...123455,20080313,0,0,0,0,0
0...123455,20080314,0,0,0,0,0
0...123455,20080315,0,0,0,0,0
0...123455,20080316,0,0,0,0,0
0...123455,20080317,0,0,0,0,0
0...123455,20080318,0,0,0,0,0
0...123455,20080319,0,0,0,0,0
0...123455,20080320,0,0,0,0,0

The result fully meets the requirement, and with that this report is done.



 

 
