On Fri, Nov 15, 2024 at 12:21 PM Chary Chary <chary...@gmail.com> wrote:
> Martin,
>
> thanks for bringing up this issue. Just thinking aloud:
>
> 1) It is possible to disable certain file types for copilot and it
> appears that copilot claims it will not be accessing these file types then
> https://stackoverflow.com/a/77908836/27989141
>
> Do you think it is sufficient to disable copilot just for .bean files?
>

I don't know. The answer would be found in the source code of the Copilot
support for your editor.

>
> 2) Suppose someone keeps financial data in Google Sheets (which I
> generally don't, but suppose). Is there any reason to get more concerned
> about copilot accessing my financial data than google?
>

They're different entities. Github is Github, Google is Google. Github is
also Microsoft. Github also ferries your data to models, which surely means
some well-provisioned model APIs, which I suspect are located at OpenAI.

Google is one entity with internal protocols for safety and privacy
mechanisms, including access restriction mechanisms preventing even
employees from accessing data (except in very narrowly defined cases based
only on business need). FWIW I used to work there and as a result of what
I've seen during my time I trust my data there as much as files on my own
personal computers.

But a lot of people like to hate on large companies these days... that's up
to you... I'm the wrong person to ask, I'm a fanboy.

> On Friday, November 15, 2024 at 3:35:46 PM UTC+1 bl...@furius.ca wrote:
>
>> Dear Beancount users,
>>
>> This is a PSA -- TL;DR: don't enable Github Copilot completion on your
>> ledgers.
>>
>> If you installed Github Copilot in your personal code editor/computer, be
>> aware that it uploads "snippets" of your input files to it and possibly to
>> third-party APIs (e.g., OpenAI).
>> I think people are just beginning to
>> become aware of the implications of this due to their employers crafting
>> policies around what LLMs they can use and what-not, but it's still early
>> days and it's easy to accidentally screw up, so here are some thoughts
>> about this.
>>
>> I think it's really easy to install Github Copilot to get code
>> completions in, say, Emacs, and then to open up your ledger and find it's
>> in Copilot minor-mode everywhere (for example if you enabled it via
>> `(add-hook 'prog-mode-hook 'copilot-mode)` or similar, to be turned on
>> everywhere ("it's amazing, right?")), which means you get completions on
>> its contents.
>> AFAICT it's impossible to know how much context is sent up to the models
>> for queries. GH claims general "context" is sent:
>>
>> https://github.com/features/copilot/#faq
>> [image: image.png]
>>
>> In other places I've seen it mentioned that "a few lines of context
>> before and after the code you're editing" are sent. AFAIK there's no way
>> to know how large this context is, and I've seen mentions of the
>> selection somewhere.
>> For example, if you select your entire ledger file, does it upload the
>> whole thing as context for your completion prompt?
>>
>> Github's retention policy mentions prompts aren't retained, but what
>> about context?
>> I see "Prompts and Suggestions" in the FAQ:
>> [image: image.png]
>>
>> And some of your transaction data may end up getting used to train new
>> models?
>> [image: image.png]
>>
>> Please correct me if I'm wrong:
>> - I don't believe there is a local log (on your computer) of what was
>> actually sent.
>> (If you just accidentally once opened up your ledger with the entire
>> history of your financial life, it's not impossible that the whole thing
>> was uploaded to Copilot.)
>> - I don't believe Github lets you view the content you've uploaded and
>> sent from their site either.
>> - I don't believe Github lets you delete the content as a matter of
>> normal usage (like Google Dashboard does, e.g.,
>> https://myaccount.google.com/dashboard)
>>
>> There's some mention in the FAQ:
>> [image: image.png]
>>
>> This takes you to this page:
>> [image: image.png]
>> Okay, so maybe. This looks good in theory, but what if your data has also
>> been sent to a third-party service?
>> AFAIK Copilot uses OpenAI's Codex model. Do they have a setup to host
>> and run it themselves?
>> Or is all the data sent to a service run by OpenAI?
>>
>> I think it's appropriate to be really cautious about this.
>>
>> --
> You received this message because you are subscribed to the Google Groups
> "Beancount" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to beancount+unsubscr...@googlegroups.com.
> To view this discussion visit
> https://groups.google.com/d/msgid/beancount/fe428c54-9a6b-4b28-be75-84b1a5daa805n%40googlegroups.com
> .