1월 30일

카테고리 없음

1월 30일

mejii 2024. 1. 30. 23:58

*데이터 값이 없을 경우!

1)없는 값을 제외시키기

MySQL에서는 자동으로 없는 값을 0으로 처리해 계산한다.

하지만 0으로 처리하는 것이 아닌, 값 자체를 없는 것으로 처리하고 싶을 땐 아래와 같은 코드를 쓴다!

select restaurant_name,
avg(rating) average_of_rating,
avg(if(rating<>'Not given', rating, null)) average_of_rating2
from food_orders
group by 1

2)없는 값을 다른 값으로 바꾸기

만약, age 중 몇 개가 NULL인 경우,

select a.order_id,
       a.customer_id,
       a.restaurant_name,
       a.price,
       b.name,
       b.age,
       coalesce(b.age, 20),
       b.gender
from food_orders a left join customers b on a.customer_id=b.customer_id
where b.age is null

->즉 coalesce란 함수를 사용해 NULL을 20(위의 예에선)으로 바꾸어 준다.

coalesce(age, 대체값)

*SQL 로 Pivot Table 만들어보기

-Pivot table : 2개 이상의 기준으로 데이터를 집계할 때, 보기 쉽게 배열하여 나타낸 테이블.

위에선 집계기준과 구분 컬럼이 2개의 기준이 된다!

-MySQL에서 피벗 테이블 만들기 기본 구조:

select restaurant_name,
       max(if(hh='15', cnt_order, 0)) "15",
       max(if(hh='16', cnt_order, 0)) "16",
       max(if(hh='17', cnt_order, 0)) "17",
       max(if(hh='18', cnt_order, 0)) "18",
       max(if(hh='19', cnt_order, 0)) "19",
       max(if(hh='20', cnt_order, 0)) "20" //

max(if(hh='xx',cnt_order,0) "xx" 가 기본 형식이다. xx가 가로 컬럼에 위치하며, 기준이 된다!/ /
from
(
select a.restaurant_name,
substring(b.time, 1, 2) hh,
count(1) cnt_order
from food_orders a inner join payments b on a.order_id=b.order_id
where substring(b.time, 1, 2) between 15 and 20
group by 1, 2
) a
group by 1
order by 7 desc

*Window Function

-기본구조:

window_function(argument) over (partition by 그룹 기준 컬럼 order by 정렬 기준)

+partition by ~= group by 와 같다고 생각하자.

오늘은 아래와 같은 window function에 대해 배웠다.

1)RANK : 순위를 매겨줄 수 있다.

예시)

select cuisine_type,
       restaurant_name,
       order_count,
       rn "순위"
from
(
select cuisine_type,
       restaurant_name,
       rank() over (partition by cuisine_type order by order_count desc) rn,
       order_count
from
(
select cuisine_type, restaurant_name, count(1) order_count
from food_orders
group by 1, 2
) a
) b
where rn<=3
order by 1, 4

cuisine_type에 따라 순위가 매겨진 결과이다.

2)SUM

예시)

select cuisine_type,
       restaurant_name,
       order_count,
       sum(order_count) over (partition by cuisine_type) sum_cuisine_type,
       sum(order_count) over (partition by cuisine_type order by order_count, restaurant_name) cumulative_sum
from
(
select cuisine_type, restaurant_name, count(1) order_count
from food_orders
group by 1, 2
) a

cuisin type이 같으면 sum_cuisine_type은 같지만(order_count의 합이기에),

누적합인 cumulative_sum은 order_count가 차례로 더해지기에 다르다.

*날짜 포맷 데이터 활용하기

-기본 형식:

select date(date) date_type,
date
from payments

여기서 date() 가 날짜 포맷 함수이며 괄호 안의 date는 데이터이다.

위와 같이 xxxx-xx-xx 의 데이터를 날짜 포맷 데이터로 변환할 수 있다!

위와 같이 날짜 데이터로 바꾸어준 데이터를,

select date(date) date_type,
       date_format(date(date), '%Y') "년",
       date_format(date(date), '%m') "월",
       date_format(date(date), '%d') "일",
       date_format(date(date), '%w') "요일"
from payments

연도, 월, 일, 요일로 추출할 수 있다.

연도는 '%Y'(소문자도 ok),

월은 '%m,'

일은 '%d',

요일은' % w' 이다.