You can do this with Julia's parallel-computing tools, without shared memory, by using processes instead of threads. It's not particularly nice, but here is how it goes (tested in Julia 0.3.9):
First, add a second process (before the @everywhere calls):

    pidB = addprocs(1)[1]

A mockup of your blocking kernel function:

    @everywhere function fake_kernal_fun()
        sleep(5*rand()+5)
        rand(1:100)
    end

    @everywhere function B_blocking_call_to_kernal(kernal_fun_ret::RemoteRef)
        @assert kernal_fun_ret.where==myid()
        while true
            ret = fake_kernal_fun()
            put!(kernal_fun_ret, ret)
        end
    end

A function to do the reading. We take! the value from the RemoteRef if it is ready; if it isn't, we keep using the old value.

    function read_kernal_fun(kernal_fun_ret::RemoteRef)
        # Initialise it: this waits until we have at least the first result
        latest_value = take!(kernal_fun_ret)
        for ii in 1:20
            latest_value = isready(kernal_fun_ret) ? take!(kernal_fun_ret) : latest_value
            println(ii,'-', latest_value)
            sleep(3)
        end
    end

Putting them together:

    shared_kernal_fun_ret = RemoteRef(pidB)
    # Never use the RemoteRef returned by this remotecall; it will never be valid
    # (the called function never returns)
    remotecall(pidB, B_blocking_call_to_kernal, shared_kernal_fun_ret)
    read_kernal_fun(shared_kernal_fun_ret)

That works fine for me. Output:

    1-47
    2-47
    3-47
    4-46
    5-46
    6-19
    7-19
    8-19
    9-31
    10-31
    11-31
    12-37
    13-37
    14-37
    15-13
    16-13
    17-63
    18-63
    19-48
    20-48

I think it is fine, and it doesn't suffer from race conditions in a problematic way.

The other option I can think of is to write that section in C and call it with ccall. I have no experience with that, though.
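For anyone reading this on a newer Julia: RemoteRef was later split into Future and RemoteChannel, and remotecall now takes the function as its first argument. Below is a rough sketch of the same pattern, assuming Julia 1.x and the Distributed standard library. The shortened sleep times, the `n` argument and the returned `seen` vector are my own additions to keep the demo quick; they are not part of the code I tested above.

```julia
using Distributed

# Add the worker before the @everywhere definitions so that it receives them.
pidB = addprocs(1)[1]
@everywhere using Distributed

# Stand-in for the blocking call (sleeps shortened for the demo; an assumption).
@everywhere function fake_kernal_fun()
    sleep(0.2 * rand() + 0.2)
    return rand(1:100)
end

# Runs on the worker: keeps publishing results into the channel.
# With a buffer of size 1, put! blocks until the previous value has been taken.
@everywhere function B_blocking_call_to_kernal(results::RemoteChannel)
    while true
        put!(results, fake_kernal_fun())
    end
end

# Runs on the master: takes a new value if one is ready, otherwise reuses the old one.
function read_kernal_fun(results::RemoteChannel, n::Integer)
    latest_value = take!(results)   # waits for the first result
    seen = Int[]
    for ii in 1:n
        if isready(results)
            latest_value = take!(results)
        end
        println(ii, '-', latest_value)
        push!(seen, latest_value)
        sleep(0.1)
    end
    return seen
end

# The channel lives on the worker, mirroring RemoteRef(pidB) in the original.
results = RemoteChannel(() -> Channel{Int}(1), pidB)

# Ignore the Future returned here; the remote function never returns.
remotecall(B_blocking_call_to_kernal, pidB, results)

seen = read_kernal_fun(results, 10)

# Shut the worker down once we are done polling.
rmprocs(pidB)
```

The one-slot channel behaves like the old RemoteRef: the worker can be at most one result ahead, so a value you read may be slightly stale but is never half-written.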