A recent post [ https://medium.com/adyen/building-our-data-science-platform-with-spark-and-jupyter-1894c33e6dd0 ] describing the adoption of Jupyter notebooks by Adyen mentioned some logging issues:
*We added instrumentation across all levels of data analysis workflows — from looking when users looked in and which notebooks were opened to linking the code entered in notebooks with actual files created and accessed on HDFS. Most of the work was related to creating a specific fork of Jupyter protocol client library and making custom Java agent for drivers and executor jobs. This allowed us to create custom events we track for auditing.*

<https://cdn-images-1.medium.com/max/2000/1*B1UUpG43vZQxZdIXPGaoTg.png>

I'm not sure if their repos <https://github.com/Adyen> have code examples though?

On Wednesday, 19 December 2018 20:41:27 UTC, Gary Page-Wood wrote:
>
> Hey folks,
>
> I've been researching options for enabling user activity monitoring in
> Jupyter notebooks, and I've come across various related topics here in the
> group and in GitHub over the last year or two [1][2][3][4]. My use case is
> deploying JupyterHub in an environment where compliance requirements compel
> us to record all user activity on their notebook server.
>
> gclen's implementation in [4] is the simpler of the approaches listed back
> in last year's discussion [2], being very specifically logging-based with
> the python logger config file for the kernel messages the only config
> option. There was also mention of what sounded like a more general 'message
> middleware' kind of approach where logging would just be something you
> could add to a configurable pipeline of pre/post message processors that
> could enable much more powerful and far-reaching customisations.
>
> My question is; before I dive in too far into reviving the simple logging
> approach for the kernel message handler is there any opposition to taking
> this route now? Is kernel message-handling middleware a thing that might be
> on the horizon that would clash with this approach, or could we possibly go
> ahead and just move auditing/logging concerns around later should that
> become a reality?
>
> Cheers
> Gary
>
>
> [1] https://groups.google.com/d/msg/jupyter/bZlWn_Tas1c/WN5w4T6GCwAJ
> [2] https://groups.google.com/d/msg/jupyter/sLKCCBwlKEc/CqrvYCvfBwAJ
> [3] https://github.com/jupyter/notebook/issues/4136
> [4] https://github.com/jupyter/notebook/issues/2251
>
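
For what it's worth, here is a rough sketch of how the 'message middleware' idea in the quoted message could relate to the simple logging approach: logging becomes just one processor in a configurable pipeline. Everything named here (`MessagePipeline`, `audit_processor`, the `kernel_audit` logger) is hypothetical and is not part of the notebook server or of gclen's implementation in [4]. The only thing assumed about Jupyter is the general message shape from the messaging protocol (a `header` with `msg_type` and `username`, and a `content` dict that carries `code` for execute requests).

```python
import logging
from typing import Any, Callable, Dict, Iterable, Optional

Message = Dict[str, Any]
# A processor receives a message and returns it (possibly modified),
# or returns None to drop it from the pipeline.
Processor = Callable[[Message], Optional[Message]]

# Hypothetical logger name; handlers and formatting would come from
# whatever Python logging configuration the deployment supplies.
audit_log = logging.getLogger("kernel_audit")


def audit_processor(msg: Message) -> Message:
    """Record who sent which kind of kernel message, then pass it on unchanged."""
    header = msg.get("header", {})
    content = msg.get("content", {})
    audit_log.info(
        "user=%s msg_type=%s code=%r",
        header.get("username"),
        header.get("msg_type"),
        content.get("code"),
    )
    return msg


class MessagePipeline:
    """Hypothetical ordered chain of pre/post message processors."""

    def __init__(self, processors: Optional[Iterable[Processor]] = None) -> None:
        self.processors = list(processors or [])

    def process(self, msg: Message) -> Optional[Message]:
        current: Optional[Message] = msg
        for processor in self.processors:
            current = processor(current)
            if current is None:
                # A processor dropped the message; stop here.
                return None
        return current


# Example wiring: auditing is the only processor configured.
logging.basicConfig(level=logging.INFO)
pipeline = MessagePipeline([audit_processor])
pipeline.process(
    {
        "header": {"msg_type": "execute_request", "username": "someuser"},
        "content": {"code": "print('hello')"},
    }
)
```

If something like this ever landed, the logging-only approach would not really clash with it: the audit function written today could be registered as a processor later, which is the "move auditing/logging concerns around later" option Gary asks about.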
