Hive: Keeping the Elements Inside concat_ws Ordered

小哥 · 2022-10-22 · #Miscellany

Business background: within one inventory organization, the same material (item) can be used by several product models.

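Note: ratio_num = quantity of this item used by the current model / total quantity of this item used by all models, expressed as a percentage. cnt_model is the number of models that use this item.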

What I want: a single row per inventory organization + item, carrying the usage details of every model.

Note: the details must be sorted by usage in descending order (so it is obvious which models use the most =_=).

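How to do it: collapse the multiple rows with an aggregate. But how do we make sure that, after concat_ws, the entries inside the detail field are still in order?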

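First, rank the rows by ratio (ratio_num) to get sequence numbers 1, 2, 3, ...
Prefix each detail string with its sequence number
Sort the prefixed strings with sort_array
Then join them with concat_ws (many rows merged into one)
Strip the sequence numbers with a regular expression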
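The five steps can be sketched outside Hive as follows. This is a Python emulation only, not Hive code: the sample rows and the name `merge_in_order` are made up for illustration, and `enumerate` stands in for `dense_rank()` (so ties are not handled the same way). With just three rows, a plain sequence number is enough.

```python
import re

# Hypothetical (model, ratio_num) rows for one organization + item.
rows = [("M-B", 20.0), ("M-A", 70.0), ("M-C", 10.0)]

def merge_in_order(rows):
    # Step 1: rank by ratio, descending (1 = most used).
    ranked = sorted(rows, key=lambda r: r[1], reverse=True)
    # Step 2: prefix each detail string with its sequence number.
    tagged = [f"{i}:{model}：{ratio}%"
              for i, (model, ratio) in enumerate(ranked, start=1)]
    # Step 3: emulate collect_set + sort_array: a set has no order,
    # so the order is restored by sorting the prefixed strings.
    ordered = sorted(set(tagged))
    # Step 4: emulate concat_ws with the '；' separator.
    joined = "；".join(ordered)
    # Step 5: strip every "<digits>:" prefix (ASCII colon only).
    return re.sub(r"\d+:", "", joined)

result = merge_in_order(rows)
print(result)
```

The prefix uses an ASCII colon while the detail string uses a full-width colon, so the final regular expression removes only the sequence numbers.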

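Then something turned out to be wrong. With sequence numbers 1-9 everything is fine, but once they reach 10 or more the order breaks: sort_array compares the prefixed values as strings, i.e. lexicographically, so "10" sorts before "2" and the result is no longer the descending order we want.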
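The effect is easy to reproduce with any string sort. The snippet below is a small Python illustration (not Hive) of what happens to twelve sequence numbers once they are compared as strings:

```python
# Sequence numbers 1..12 rendered as strings, as they are after the prefix step.
seq_strings = [str(i) for i in range(1, 13)]

# Strings are compared character by character, so "10" < "2" because "1" < "2".
lexicographic = sorted(seq_strings)
print(lexicographic)
```

The ranks 10, 11 and 12 land between 1 and 2, which is exactly the misplacement seen in the concatenated field.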

Solution: zero-pad the sequence number to a fixed width of four digits, i.e. lpad(oid, 4, '0').
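Why padding works: when every key has the same width, string order and numeric order coincide. A minimal Python check follows, where the `lpad` helper is a hypothetical stand-in for Hive's function and covers only the non-truncating case used here:

```python
def lpad(value, length, pad):
    # Stand-in for Hive's lpad when the input is shorter than `length`
    # (Hive's lpad also truncates longer inputs; that is not modelled here).
    return str(value).rjust(length, pad)

padded = [lpad(i, 4, "0") for i in range(1, 13)]

# With fixed-width keys, the string sort keeps the numeric order.
sorted_padded = sorted(padded)
print(sorted_padded)
```

Four digits cover ranks up to 9999. A group with more ranks than that would need a wider pad.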

Finally, the full SQL:

SELECT
     t1.period_tag -- time period the row belongs to
    ,t1.ORGANIZATION_ID
    ,t1.ITEM_CODE
    ,t1.cnt_model -- number of models that use this item
    -- prefix with the zero-padded rank, sort, join with '；', then strip the "<digits>:" prefix
    ,regexp_replace(concat_ws('；',sort_array(collect_set(concat_ws(':',lpad(cast(t1.oid as string),4,'0'),t1.model_use_dtl)))),'\\d+:','') as model_use_dtl
FROM(
    SELECT
         t1.period_tag -- time period the row belongs to
        ,t1.ORGANIZATION_ID
        ,t1.ITEM_CODE
        ,t1.cnt_model -- number of models that use this item
        ,t1.ratio_num
        ,t1.model_use_dtl
        ,t1.item_model_ratio -- usage share of this model (max)
        ,t1.oid -- rank by usage, descending
    FROM(
        SELECT
             t1.period_tag -- time period the row belongs to
            ,t1.ORGANIZATION_ID
            ,t1.ITEM_CODE
            ,t1.cnt_model -- number of models that use this item
            ,round(t1.item_model_ratio*100,2) ratio_num
            ,concat(t1.PRODUCT_MODEL,'：',concat(round(t1.item_model_ratio*100,2),'%')) model_use_dtl
            ,max(t1.item_model_ratio) item_model_ratio -- usage share of this model (max)
            ,dense_rank() over(partition by t1.period_tag,t1.organization_id,t1.ITEM_CODE order by cast(round(t1.item_model_ratio*100,2) as double) desc) oid
        FROM        DMA${db_para}.DMA_GROUP_ORG_ITEM_BELONG_TO_01_test t1
        where       organization_id = 1535 and item_code = '004.044.0053208'
        and         period_tag = '近3个月' -- data value meaning "last 3 months"
        GROUP BY
             t1.period_tag -- time period the row belongs to
            ,t1.ORGANIZATION_ID
            ,t1.ITEM_CODE
            ,t1.cnt_model -- number of models that use this item
            ,round(t1.item_model_ratio*100,2)
            ,concat(t1.PRODUCT_MODEL,'：',concat(round(t1.item_model_ratio*100,2),'%'))
    )t1
)t1
GROUP BY
     t1.period_tag -- time period the row belongs to
    ,t1.ORGANIZATION_ID
    ,t1.ITEM_CODE
    ,t1.cnt_model -- number of models that use this item