Calculating Consecutive Login Days
Published: 2019-9-24
Posted by: 葵宇科技
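I recently had a requirement: compute the maximum number of consecutive days on which each user logged in (Presto SQL is used here; Hive works too). First, the data. The login log table hive.traffic.access_user has only two columns, uid and day; the date helper table hive.ods.dim_date has a single column, day.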
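First, the idea.
Number each user's login days in date order (rownumber). Within a run of consecutive logins, the difference day - rownumber is the same for every row. The catch is that this subtraction breaks when a run crosses a month or year boundary, so we first convert each date into a sequential number: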
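To make the idea concrete, here is a small Python sketch. It is illustrative only: the sample dates are made up, and `day` is assumed to be stored as a yyyymmdd integer, as the filter `day>=20190801` in the queries suggests. It shows that `day - rownumber` stays constant within a run that sits inside one month, but changes at a month boundary even though the days are consecutive.

```python
# Hypothetical sample: one user's login days as yyyymmdd integers.
logins = [20190803, 20190804, 20190805, 20190830, 20190831, 20190901]

# 1-based row number in date order, like ROW_NUMBER() OVER (ORDER BY day).
naive_diff = [day - rownum for rownum, day in enumerate(sorted(logins), start=1)]

# The first three values match (one run of three days inside August).
# The last two differ although 20190831 and 20190901 are consecutive days.
print(naive_diff)
```

This is why the raw integer date cannot be used directly and a gap-free day number is needed.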
select day,ROW_NUMBER() OVER(ORDER BY day) daynum from hive.ods.dim_date
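As a rough illustration of what this query produces, the Python sketch below builds the same kind of day-to-sequence-number mapping. The helper `build_daynum` and the date range are hypothetical, and the sketch assumes the date table holds every calendar day with no gaps.

```python
from datetime import date, timedelta

def build_daynum(start: date, end: date) -> dict:
    """Map each calendar day (as a yyyymmdd int) to a 1-based sequence number."""
    mapping = {}
    current, n = start, 1
    while current <= end:
        mapping[int(current.strftime("%Y%m%d"))] = n
        current += timedelta(days=1)
        n += 1
    return mapping

daynum = build_daynum(date(2019, 8, 1), date(2019, 9, 30))

# Three consecutive days that cross the August/September boundary.
logins = [20190830, 20190831, 20190901]
diffs = [daynum[day] - rownum for rownum, day in enumerate(logins, start=1)]
print(diffs)  # all three values are equal, so the run is detected as one streak
```

With sequence numbers in place of raw dates, the difference no longer jumps at the month boundary.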
Next, group the login log by uid, order each user's rows by date, and compute the row number:
with a as (select uid,day from hive.traffic.access_user where day>=20190801 and uid<>'')
select uid,day,ROW_NUMBER() OVER(PARTITION BY uid ORDER BY uid,day) rownum from a group by day,uid
Next, compute the difference. Rows with the same difference belong to the same run of consecutive login days. The complete SQL is as follows:
with a as (select uid,day from hive.traffic.access_user where day>=20190801 and uid<>''),
b as (select uid,day,ROW_NUMBER() OVER(PARTITION BY uid ORDER BY uid,day) rownum from a group by day,uid ),
c as(select day,ROW_NUMBER() OVER(ORDER BY day) daynum from hive.ods.dim_date),
d as (select uid,b.day,daynum,rownum,daynum-rownum days from b join c on b.day=c.day)
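The steps above can be tried end to end on a small data set. The sketch below uses Python's built-in sqlite3 module as a stand-in for Presto (window functions need SQLite 3.25 or newer), with shortened table names and made-up rows. Deduplication is moved into `a` to keep the query simple in SQLite. The last two steps, the CTE `e` and the outer `SELECT`, are an assumption about how the remaining aggregation might look, not the original author's query: count the rows per (uid, days) group, then take the largest count per user.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE access_user (uid TEXT, day INTEGER);
    CREATE TABLE dim_date (day INTEGER);
""")

# Hypothetical calendar: every day from 2019-08-29 to 2019-09-05.
calendar = [20190829, 20190830, 20190831, 20190901,
            20190902, 20190903, 20190904, 20190905]
conn.executemany("INSERT INTO dim_date VALUES (?)", [(d,) for d in calendar])

# Hypothetical logins, including a duplicate row and an empty uid.
logins = [
    ("u1", 20190830), ("u1", 20190831), ("u1", 20190831),
    ("u1", 20190901), ("u1", 20190904),
    ("u2", 20190829), ("u2", 20190902), ("u2", 20190903),
    ("", 20190830),
]
conn.executemany("INSERT INTO access_user VALUES (?, ?)", logins)

query = """
WITH a AS (  -- filter, and keep one row per user per day
    SELECT uid, day FROM access_user
    WHERE day >= 20190801 AND uid <> ''
    GROUP BY uid, day
),
b AS (SELECT uid, day,
             ROW_NUMBER() OVER (PARTITION BY uid ORDER BY day) AS rownum
      FROM a),
c AS (SELECT day, ROW_NUMBER() OVER (ORDER BY day) AS daynum FROM dim_date),
d AS (SELECT b.uid, b.day, daynum - rownum AS days
      FROM b JOIN c ON b.day = c.day),
e AS (  -- assumed step: length of each run of consecutive days
    SELECT uid, days, COUNT(*) AS streak FROM d GROUP BY uid, days
)
SELECT uid, MAX(streak) AS max_streak FROM e GROUP BY uid ORDER BY uid
"""
rows = conn.execute(query).fetchall()
print(rows)
```

In this sample, u1 logs in on three consecutive days across the month boundary and u2 on two, so the longest streaks are 3 and 2.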