The "." is a dereference operator. It is used to look inside complex data types such as tuples and bags. See http://pig.apache.org/docs/r0.12.0/basic.html#deref
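As a small illustration of the dereference operator, here is a sketch in Pig Latin. The relation `events`, the input path 'events', and the subfield `Offset` are made up for the example; only the `Created.UnixUtcTime` shape mirrors the script quoted further down.

```pig
-- Hypothetical input: every record carries a tuple field named Created.
-- The path 'events' and the subfield name Offset are assumptions for illustration.
events = LOAD 'events' AS (StoreId:int, Created:tuple(UnixUtcTime:long, Offset:int));

-- "." looks inside the Created tuple and projects one of its subfields.
times = FOREACH events GENERATE StoreId, Created.UnixUtcTime AS createdTime;

DUMP times;
```

The same operator is at work in `group.Dob` (a field of the group tuple) and in `sto.UnixUtcTime` (a field projected out of a bag) in the quoted script.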
The "::" is a disambiguation operator. When you perform a join, the joined relations may contain fields with the same name. To tell Pig which relation to take the field from, you need to use the disambiguation operator. This is not necessary if the field exists in only one of the relations. See http://pig.apache.org/docs/r0.12.0/basic.html#disambiguate

Hope this helps.

- Pradeep

On Tue, Oct 14, 2014 at 3:30 AM, Jakub Stransky <[email protected]> wrote:

> Hello experienced users,
>
> I am relatively new to pig and I came across one thing I do not fully
> understand. I have the following script:
>
> dirtydata = load '/data/120422' using AvroStorage();
>
> sodtr = filter dirtydata by TransactionBlockNumber == 1;
> sto = foreach sodtr generate Dob.Value as Dob,StoreId,
> Created.UnixUtcTime;
> g = GROUP sto BY (Dob,StoreId);
> sodtime = FOREACH g GENERATE group.Dob AS Dob, group.StoreId as StoreId,
> MAX(sto.UnixUtcTime) AS latestStartOfDayTime;
>
> joined = join dirtydata by (Dob.Value, StoreId) LEFT OUTER, sodtime by
> (Dob, StoreId);
>
> cleandata = filter joined by dirtydata.Created.UnixUtcTime >=
> sodtime.latestStartOfDayTime;
> dump cleandata
>
> I am getting the following error:
>
>
> ERROR 0: Exception while executing (Name: joined: Local
> Rearrange[tuple]{tuple}(false) - scope-166 Operator Key: scope-166):
> org.apache.pig.backend.executionengine.ExecException: ERROR 0: Exception
> while executing [POProject (Name: Project[long][0] - scope-152 Operator
> Key: scope-152) children: null at []]:
> org.apache.pig.backend.executionengine.ExecException: *ERROR 0: Scalar has
> more than one row in the output.* 1st :
>
> (1,(20120422),64619,2164,{(((20120422),64619,2164,(1335120734,-300),2,),{},(false,840),{},{(00200079-0000-0000-0000-000000000000,((1,LUNCH),(2097271,(2097271,WL
>
> 119),false),{(,(1335120734,-300),CheckPrint)},{},((0),PerGroup),20121,(3,Coffee
> Bar),),((34.57),(36.02)),{},{},{},{},{},{},{})},{})},(1412864847,-300)),
> 2nd
>
> :(1,(20120422),64619,1,{(((20120422),64619,1,(1335088853,-300),3,),{},(false,840),{},{},{({(ClockedIn,(1335088800,-300),(-62135596800,0),(1),{(0,(11),false)},0,(4,Baker),{},false)},(511,Roger
> Baeza-Vasquez))})},(1412864846,-300))
> 2014-10-14 05:28:25,165 [main] ERROR
> org.apache.pig.tools.pigstats.PigStatsUtil - 1 map reduce job(s) failed!
>
> When I change the following relation:
> cleandata = filter joined by dirtydata*::*Created.UnixUtcTime >=
> sodtime.latestStartOfDayTime;
>
> Then everything works fine. It seems like a mystery to me, because I would
> expect to need the same for the sodtime relation. I am missing something
> here. Could someone please shed some light on it?
>
> Thanks a lot
> Jakub
>
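To tie the two operators back to the quoted script, here is a minimal sketch of a join followed by a filter. The relations `a` and `b`, their input paths, and all field names are hypothetical and stand in for `dirtydata` and `sodtime`. As far as I understand Pig's scalar projection, writing `relation.field` with a relation alias on the left makes Pig treat the whole relation as a single-row scalar, which would explain the "Scalar has more than one row" error; `sodtime.latestStartOfDayTime` is probably read the same way and only succeeds while `sodtime` has a single row.

```pig
-- Hypothetical relations that share the field name id.
a = LOAD 'a' AS (id:int, created:tuple(ts:long, tz:int));
b = LOAD 'b' AS (id:int, latest:long);

j = JOIN a BY id LEFT OUTER, b BY id;

-- After the join, fields are addressed as relation::field.
-- a::created selects the field that came from a; ".ts" then looks inside that tuple.
-- latest exists only in b, so b::latest and plain latest should both resolve.
clean = FILTER j BY a::created.ts >= b::latest;

DUMP clean;
```

Applied to the quoted script, this suggests using "::" on both sides of the comparison, that is `dirtydata::Created.UnixUtcTime >= sodtime::latestStartOfDayTime`, rather than relying on the scalar form for `sodtime`.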
