Martin, thanks for bringing up this issue. Just thinking aloud:
1) It is possible to disable Copilot for certain file types, and Copilot claims it will then not access files of those types: https://stackoverflow.com/a/77908836/27989141 Do you think it is sufficient to disable Copilot just for .bean files?
2) Suppose someone keeps financial data in Google Sheets (which I generally don't, but suppose). Is there any reason to be more concerned about Copilot accessing my financial data than about Google accessing it?

On Friday, November 15, 2024 at 3:35:46 PM UTC+1 bl...@furius.ca wrote:
> Dear Beancount users,
>
> This is a PSA -- TL;DR: don't enable Github Copilot completion on your
> ledgers.
>
> If you installed Github Copilot in your personal code editor/computer, be
> aware that it uploads "snippets" of your input files to it and possibly to
> third-party APIs (e.g., OpenAI). I think people are just beginning to
> become aware of the implications of this due to their employers crafting
> policies around what LLMs they can use and what-not, but it's still early
> days and it's easy to accidentally screw up, so here are some thoughts
> about this.
>
> I think it's really easy to install Github Copilot to get code completions
> in say, Emacs, and then to open up your ledger and it's in Copilot
> minor-mode everywhere (for example if you enabled it via
> `(add-hook 'prog-mode-hook 'copilot-mode)` or similar, to be turned on
> everywhere ("it's amazing, right?")), which means you get completions on
> its contents.
> AFAICT it's impossible to know how much context is sent up to the models
> for queries. GH claims general "context" is sent:
>
> https://github.com/features/copilot/#faq
> [image: image.png]
>
> In other places I've seen it's mentioned that "a few lines of context
> before and after the code you're editing". AFAIK there's no way to know how
> large this context is, and I've seen mentions of the selection somewhere.
> For example, if you select your entire ledger file, does it upload the
> whole thing as context for your completion prompt?
>
> Github's retention policy mentions prompts aren't retained, but what about
> context?
> I see "Prompts and Suggestions" in the FAQ:
> [image: image.png]
>
> And some of your transaction data may end up getting used to train new
> models?
> [image: image.png]
>
> Please correct me if I'm wrong:
> - I don't believe there is a local log (on your computer) of what was
> actually sent.
> (If you just accidentally once opened up your ledger with the entire
> history of your financial life, it's not impossible that the whole thing
> was uploaded to Copilot.)
> - I don't believe Github lets you view the content you've uploaded and
> sent from their site either.
> - I don't believe Github lets you delete the content as a matter of normal
> usage (like Google Dashboard does, e.g.,
> https://myaccount.google.com/dashboard)
>
> There's some mention in the FAQ:
> [image: image.png]
>
> This takes you to this page:
> [image: image.png]
> Okay, so maybe. This looks good in theory, but what if your data has also
> been sent to a third-party service?
> AFAIK Copilot uses OpenAI's Codex model. Do they have a setup to host and
> run it themselves?
> Or is all the data sent to a service run by OpenAI?
>
> I think it's appropriate to be really cautious about this.