dimanche 3 juillet 2016

Hive: Finding max value in a group by

I have a hive table something like this:

create external table test(
  test_id string,
  test_name string,
  description string,
  clicks int,
  last_referred_click_date string
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY 't'
STORED AS TEXTFILE LOCATION  '{some_location}';

I need to find out total clicks for a test_id and the last click date(max date in that group of test_id)

I am doing something like this

insert overwrite table test partition(weekending='{input_date}')
  select s.test_id,s.test_name,s.description,max(click_date),
    sum(t.click) as clicks
   group by s.test_id,s.test_name,s.description order by clicks desc; 

Does max() function works for strings? My click_date is of teh format'yyyy-mm-dd' and is a string data type? If not, what can i do here ? UDF ?

Aucun commentaire:

Enregistrer un commentaire