用户在网站上访问的最后一页,从未返回

kadbb459  于 2021-07-29  发布在  Java
关注(0)|答案(1)|浏览(321)

我正在尝试查找用户访问的最后一个页面,并且在特定时间内没有返回到该网站:
例如:

Date          visitstarttime    ldapid     page_1 page_2 Page_3 Page_4 
2018-08-01    1590805941       1000123         1      0       0     0
2018-07-30    1590200345       1000123         0      1       0     0
2018-07-20    1580100098       1000100         0      1       0     0
2018-07-18    1570000987       1000100         0      0       1     0
2018-07-12    1550200678       1000007         0      1       0     0
2018-07-09    1530287323       1000007         0      0       0     1

因为我试图只找到用户谁访问了一个页面,从来没有在一个特定的时间内返回。我期望输出如下:

Date          visitstarttime    ldapid     page_1 page_2 Page_3 Page_4 
2018-08-01    1590805941       1000123         1      0       0     0
2018-07-20    1580100098       1000100         0      1       0     0
2018-07-12    1550200678       1000007         0      1       0     0

既然我们不能在gbq中同时使用aggregate和groupby函数,有没有办法解决这个问题?
查询:

SELECT 
MAX(date) as Max_date,
Max(visitStartTime) as Max_Time,
ldapid

FROM
(

SELECT
CAST(CONCAT(SUBSTR(date,0,4), '-', SUBSTR(date,5,2),'-',SUBSTR(date,7,2)) AS date ) AS date,
        visitStartTime,
    --  fullVisitorId,
   -- hit.page.pagePath AS pagepath,
    (
            SELECT
            x.value
            FROM
            UNNEST(hit.customDimensions) x
            WHERE
            x.index = 9) 
        AS ldapid,
    (
    SELECT
    MAX(
    IF
    (page.pagePath LIKE '%/applicant-center/#interview/recommendations-and-references%', 
    1, 
    0))
    FROM 
    UNNEST(hits))
    AS Recommendation_References,
    (
    SELECT
    MAX(
    IF
    (page.pagePath LIKE '%/applicant-center/#interview/interview-sign-up%',
    1,
    0))
    FROM 
    UNNEST(hits))
    AS Interview_Sign_Up,
    (
    SELECT
    MAX(
    IF
    (page.pagePath LIKE '%/applicant-center/#interview/transcript-setup%',
    1,
    0))
    FROM 
    UNNEST(hits))
    AS Transcript_Setup,
    (
    SELECT
    MAX(
    IF
    (page.pagePath LIKE '%/applicant-center/#interview/transcript-upload%',
    1,
    0))
    FROM 
    UNNEST(hits))
    AS Transcript_Upload,
    (
    SELECT
    MAX(
    IF
    (page.pagePath LIKE '%/applicant-center/#interview/pre-interview-questions%',
    1,
    0))
    FROM
    UNNEST(hits))
    AS Pre_Interview_Questions,
    (
    SELECT
    MAX(
    IF
    (page.pagePath LIKE '%/applicant-center/#interview/interview-prep%',
    1,
    0))
    FROM 
    UNNEST(hits))
    AS Interview_Prep,
    (
    SELECT
    MAX(
    IF
    (page.pagePath LIKE '%/applicant-center/#interview/interview-day-details%',
    1,
    0))
    FROM
    UNNEST(hits))
    AS Interview_day_details
    FROM
    `tfa-big-query.74006564.ga_sessions_*`,
    UNNEST(hits) AS hit
    WHERE
  _TABLE_SUFFIX BETWEEN '20190801' AND '20200529'

  GROUP BY
    date,
        visitStartTime,
        --fullVisitorId,
   -- pagepath,
    ldapid,
    Recommendation_References,
    Interview_Sign_Up,
    Transcript_Setup,
    Transcript_Upload,
    Pre_Interview_Questions,
    Interview_Prep,
    Interview_day_details
    ORDER BY visitStartTime DESC, date DESC

  )
  WHERE 
 ( Recommendation_References >= 1
  OR
  Interview_Sign_Up >= 1
  OR 
  Transcript_Setup >= 1
  OR
  Transcript_Upload>= 1
  OR
  Pre_Interview_Questions >= 1
  OR
  Interview_Prep >= 1
  OR
  Interview_day_details >= 1) and ldapid IS NOT NULL

  GROUP BY
  Max_date,
  Max_Time,
  ldapid

既然我们不能在gbq中同时使用aggregate和groupby函数,有没有办法解决这个问题?

ibps3vxo

ibps3vxo1#

我看不出你的问题和问题中的问题有什么关系。但根据你的问题和样本数据,使用 lead() :

select t.*
from (select t.*,
             lead(visitstarttime) over (partition by ldapid order by visitstarttime) as next_visitstarttime
      from t
     ) t
where next_visitstarttime is null or next_visitstarttime > visitstarttime + <whatever>

相关问题