Apache Pig - Join in Pig and creating a new column
I have two data sets.
Schema definition: name, city, state

a = {
ram, sunnyvale, ca
soju, austin, tx
rathos, bangalore, karnataka
mike, portland, or
}

b = {
ram, refund
soju, refund
}
When I join these two tables on name, I want the output to look as follows:
Schema definition: name, city, state, refundissued (yes/no)

ram, sunnyvale, ca, yes
soju, austin, tx, yes
rathos, bangalore, karnataka, no
mike, portland, or, no
I am not sure how to specify the new column I need, whose value depends on this logic.
a = load 'data1.txt' using PigStorage(',') as (name:chararray, city:chararray, state:chararray);
b = load 'data2.txt' using PigStorage(',') as (name:chararray, type:chararray);
c = join a by name left outer, b by name;
d = foreach c generate a::name as firstname, b::type as charge_type; -- how do I add a new column for refund issued (yes/no)?
store d into '1outdata.txt';
Use a bincond in the foreach to derive the new column:

a = load 'data1.txt' using PigStorage(',') as (name:chararray, city:chararray, state:chararray);
b = load 'data2.txt' using PigStorage(',') as (name:chararray, type:chararray);
c = join a by name left outer, b by name;
d = foreach c generate a::name as name, a::city as city, a::state as state, (b::type == 'refund' ? 'true' : 'false') as refundissued;
Note that refundissued can be 'true', 'false', or null because of how bincond works: when the condition itself evaluates to null, the result is null. If you want null (the left join finds no match, or the field value is null) translated to 'false', use:
e = foreach d generate name, city, state, (refundissued is null ? 'false' : refundissued) as refundissued;
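The two steps can also be folded into a single foreach that emits the yes/no values the question asked for. The following is a minimal sketch, not a tested script: it assumes the same input files and schemas as above, nests one bincond inside another so the null check runs first, and writes to an output path ('refund_flags_out') that is only a placeholder.

```pig
-- Sketch only: same input files and schemas as in the question.
a = load 'data1.txt' using PigStorage(',') as (name:chararray, city:chararray, state:chararray);
b = load 'data2.txt' using PigStorage(',') as (name:chararray, type:chararray);

-- Keep every row of a; rows with no match in b get nulls for b's fields.
c = join a by name left outer, b by name;

-- Test for null first, so unmatched rows yield 'no' instead of null.
d = foreach c generate
        a::name  as name,
        a::city  as city,
        a::state as state,
        (b::type is null ? 'no' : (b::type == 'refund' ? 'yes' : 'no')) as refundissued;

-- 'refund_flags_out' is a hypothetical output directory.
store d into 'refund_flags_out' using PigStorage(',');
```

Putting the null test in the outer bincond means the inner comparison only ever sees non-null values, so the result is always 'yes' or 'no'.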