Martin,

thanks for bringing up this issue. Just thinking aloud:

1) It is possible to disable Copilot for certain file types, and it appears 
that Copilot then claims not to access files of those types: 
https://stackoverflow.com/a/77908836/27989141

Do you think it is sufficient to disable Copilot just for .bean files?
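
For the Emacs hook quoted below, one way to keep Copilot out of ledger 
buffers might be to wrap the hook in a mode check. This is only a sketch: it 
assumes the ledger opens in `beancount-mode` and that `copilot-mode` comes 
from whatever Copilot package is installed; `my/copilot-mode-unless-ledger` 
is a made-up name. It only stops this particular hook from turning the minor 
mode on and says nothing about what Copilot uploads in other buffers.

```elisp
;; Hypothetical helper (the name is illustrative): enable Copilot in
;; programming buffers, but skip buffers whose major mode is, or derives
;; from, `beancount-mode'.
;; Assumes `copilot-mode' is provided by an installed Copilot package.
(defun my/copilot-mode-unless-ledger ()
  "Turn on `copilot-mode' unless the current buffer is a Beancount ledger."
  (unless (derived-mode-p 'beancount-mode)
    (copilot-mode 1)))

;; Use this in place of (add-hook 'prog-mode-hook 'copilot-mode).
(add-hook 'prog-mode-hook #'my/copilot-mode-unless-ledger)
```

Ledgers kept in files that open in some other major mode would need their 
own check, e.g. on the file extension.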

2) Suppose someone keeps financial data in Google Sheets (which I generally 
don't, but suppose). Is there any reason to be more concerned about Copilot 
accessing my financial data than about Google doing so?

On Friday, November 15, 2024 at 3:35:46 PM UTC+1 bl...@furius.ca wrote:

> Dear Beancount users,
>
> This is a PSA -- TL;DR: don't enable Github Copilot completion on your 
> ledgers.
>
> If you installed Github Copilot in your personal code editor/computer, be 
> aware that it uploads "snippets" of your input files to the service and 
> possibly to third-party APIs (e.g., OpenAI). I think people are just beginning to 
> become aware of the implications of this due to their employers crafting 
> policies around what LLMs they can use and what-not, but it's still early 
> days and it's easy to accidentally screw up, so here are some thoughts 
> about this.
>
> I think it's really easy to install Github Copilot to get code completions 
> in, say, Emacs, and then to open up your ledger and find Copilot minor-mode 
> active there too (for example, if you enabled it everywhere via `(add-hook 
> 'prog-mode-hook 'copilot-mode)` or similar; "it's amazing, right?"), which 
> means you get completions on its contents. 
> AFAICT it's impossible to know how much context is sent up to the models 
> for queries. GH claims general "context" is sent:
>
> https://github.com/features/copilot/#faq
> [image: image.png]
>
> In other places I've seen it mentioned that "a few lines of context 
> before and after the code you're editing" are sent. AFAIK there's no way to 
> know how large this context is, and I've seen the selection mentioned 
> somewhere too. 
> For example, if you select your entire ledger file, does it upload the 
> whole thing as context for your completion prompt?
>
> Github's retention policy mentions prompts aren't retained, but what about 
> context?  
> I see "Prompts and Suggestions" in the FAQ:
> [image: image.png]
>
> And some of your transaction data may end up getting used to train new 
> models?
> [image: image.png]
>
> Please correct me if I'm wrong:
> - I don't believe there is a local log (on your computer) of what was 
> actually sent. 
> (If you just accidentally once opened up your ledger with the entire 
> history of your financial life, it's not impossible that the whole thing 
> was uploaded to Copilot.)
> - I don't believe Github lets you view the content you've uploaded and 
> sent from their site either.
> - I don't believe Github lets you delete the content as a matter of normal 
> usage (like Google Dashboard does, e.g., 
> https://myaccount.google.com/dashboard)
>
> There's some mention in the FAQ:
> [image: image.png]
>
> This takes you to this page:
> [image: image.png]
> Okay, so maybe. This looks good in theory, but what if your data has also 
> been sent to a third-party service?
> AFAIK Copilot uses OpenAI's Codex model.  Do they have a setup to host and 
> run it themselves?
> Or is all the data sent to a service run by OpenAI?
>
> I think it's appropriate to be really cautious about this.
>
>

