On Sat, Apr 1, 2023 at 10:18 PM rupert THURNER <[email protected]> wrote:

> On Sat, Apr 1, 2023 at 11:36 PM Erik Moeller <[email protected]> wrote:
> >
> > ... I am confident (based on, e.g., the recent
> > results with Alpaca: https://crfm.stanford.edu/2023/03/13/alpaca.html)
> > that the performance of smaller models will continue to increase as we
> > find better ways to train, steer, align, modularize and extend them.
>
> to host open models like above would be really
> cool for multiple reasons, the most important one to bring
> back the openess into the training....
Wow! While Alpaca is English-only and its data is released under CC BY-NC, it does seem to be very easily replicated, with a wide context window, and could probably be made widely multilingual, beyond the performance of GPT-3.5, for less than it would cost merely to host BLOOM for a few months. This shocked me, and of course I take back what I said about requiring several million dollars.

https://www.databricks.com/blog/2023/03/24/hello-dolly-democratizing-magic-chatgpt-open-models.html
https://huggingface.co/databricks/dolly-v1-6b
https://github.com/tatsu-lab/stanford_alpaca

What kind of hardware should WMCS buy to support such a project? (A rough sketch of what such a fine-tuning run involves is in the first P.S. below.)

On Sat, Apr 1, 2023 at 2:36 PM Erik Moeller <[email protected]> wrote:
>
> ... I'm not sure if the "hallucination" problem is tractable when
> all you have is an LLM

I disagree, which is why I have been pushing RARR and ROME. RARR applies the same principles as WP:V to eliminate hallucination: it requires confirmation from verifiable sources, which can be limited to, e.g., those approved at WP:RSP, and cited in a way that readers can independently verify (the second P.S. below sketches the loop). I've been posting links to the RARR paper, which doesn't go very deep on some of those points, but here's an hour-long presentation by one of the authors which is a lot meatier on such topics:

https://www.youtube.com/watch?v=d45Ms8LmF5k

And here's a Twitter thread which is more accessible to those less familiar with the literature:

https://twitter.com/kelvin_guu/status/1582714222080688133

Once an attribution and verification system like RARR has identified inaccuracies and hallucinations, the ROME/MEMIT method of editing the models directly can eliminate them completely, and in a way that also eliminates similar generalized mistakes (the third P.S. below shows the update rule itself); please see:

"Rank-One Editing of Encoder-Decoder Models"
https://arxiv.org/abs/2211.13317

I can't believe that the large AI labs aren't working harder on these approaches than they have been letting on. Either they aren't, or they are doing so in an uncharacteristically secretive fashion, which would suggest they want to exploit such advances as proprietary trade secrets. In either case, it's vital that fully open organizations like the Foundation get involved quickly.

There is reason to believe the latter, because Google Bard uses a much less rigorous form of attribution and verification (probably based on Sparrow: https://arxiv.org/abs/2209.14375), and it actually makes its hallucinations worse, e.g. in https://i.redd.it/f30u9n0gn9pa1.png -- towards the end of the RARR video, Dr. Lao indicates that they encountered similar issues but were able to eliminate almost all of them.

-LW
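P.S. For anyone who wants to kick the tires, here is a minimal sketch of what an Alpaca/Dolly-style fine-tuning run looks like, assuming the Hugging Face transformers and datasets libraries. The base model, prompt template, sequence length, and batch settings are illustrative assumptions rather than a recipe, though Dolly v1 really did start from GPT-J-6B on the 52k Alpaca demonstrations:

# Alpaca/Dolly-style instruction tuning: plain causal-LM fine-tuning
# on instruction/response demonstrations.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling,
                          Trainer, TrainingArguments)

BASE = "EleutherAI/gpt-j-6B"   # the base model Dolly v1 started from

tokenizer = AutoTokenizer.from_pretrained(BASE)
tokenizer.pad_token = tokenizer.eos_token  # GPT-J has no pad token
model = AutoModelForCausalLM.from_pretrained(BASE)

# The 52k Stanford Alpaca demonstrations (instruction/input/output).
data = load_dataset("tatsu-lab/alpaca", split="train")

def to_features(rec):
    # Fold each demonstration into one training string; the optional
    # "input" field is ignored here for brevity.
    text = (f"### Instruction:\n{rec['instruction']}\n\n"
            f"### Response:\n{rec['output']}{tokenizer.eos_token}")
    return tokenizer(text, truncation=True, max_length=512)

train = data.map(to_features, remove_columns=data.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="open-instruct-6b",    # hypothetical output name
        per_device_train_batch_size=1,
        gradient_accumulation_steps=32,
        num_train_epochs=1,
        bf16=True,                        # assumes A100-class hardware
    ),
    train_dataset=train,
    # Pads batches and copies input_ids to labels for causal-LM loss.
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()

On hardware: the Alpaca and Dolly write-ups both describe a single machine with a handful of A100s and wall-clock times measured in hours or less, which is roughly the scale of machine my WMCS question above is really about.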
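P.P.S. Here is the shape of the RARR research-and-revise loop I keep referring to, as a sketch rather than the authors' code. The llm and search callables are assumptions standing in for any prompt-to-completion model and any retriever; the retriever is where a WP:RSP-style source whitelist would plug in:

# A RARR-style loop: generate verification questions about a draft
# passage, retrieve evidence for each, minimally revise whatever the
# evidence contradicts, and keep an attribution report for citations.
from typing import Callable, List, Tuple

def research_and_revise(
    passage: str,
    llm: Callable[[str], str],           # prompt -> completion
    search: Callable[[str], List[str]],  # query -> evidence passages
) -> Tuple[str, List[Tuple[str, List[str]]]]:
    # Research stage: what would a fact-checker ask about this passage?
    questions = [q for q in llm(
        "List, one per line, the factual questions needed to verify:\n"
        + passage).splitlines() if q.strip()]
    report = []
    for q in questions:
        evidence = search(q)  # restrict to approved sources here
        verdict = llm(
            f"Evidence: {evidence}\nPassage: {passage}\n"
            f"Does the evidence agree with the passage on the question "
            f"'{q}'? Reply AGREE or DISAGREE.")
        if verdict.strip().startswith("DISAGREE"):
            # Revision stage: minimal edit preserving style and intent.
            passage = llm(
                f"Evidence: {evidence}\nPassage: {passage}\n"
                f"Minimally edit the passage so it agrees with the "
                f"evidence; change nothing else.")
        report.append((q, evidence))  # becomes the citation list
    return passage, report

The point of returning the report alongside the revised passage is the WP:V property: every surviving claim arrives with the evidence that vetted it, so a reader can check the citations without trusting the model.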
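P.P.P.S. And because the ROME/MEMIT update itself is startlingly small, here it is as a sketch. This follows the rank-one rule from the papers, but choosing which MLP layer to edit and computing the key/value pair (k*, v*) for a given fact are elided; the papers do that with causal tracing and a small optimization:

# A ROME/MEMIT-style rank-one edit of one MLP weight matrix.
import torch

def rank_one_edit(W: torch.Tensor,      # (d_out, d_in) MLP projection weight
                  C: torch.Tensor,      # (d_in, d_in) key covariance E[k k^T]
                  k_star: torch.Tensor, # (d_in,) key vector for the subject
                  v_star: torch.Tensor  # (d_out,) value encoding the new fact
                  ) -> torch.Tensor:
    # Minimal-norm update (weighted by C) that maps k* -> v*:
    #   W' = W + (v* - W k*) (C^-1 k*)^T / (k*^T C^-1 k*)
    Cinv_k = torch.linalg.solve(C, k_star)
    residual = v_star - W @ k_star
    return W + torch.outer(residual, Cinv_k) / (k_star @ Cinv_k)

# Sanity check with random data: after the edit, W' @ k_star == v_star.
d_in, d_out = 16, 8
K = torch.randn(1000, d_in)              # sampled keys
C = K.T @ K / K.shape[0]                 # empirical covariance
W = torch.randn(d_out, d_in)
k_star, v_star = torch.randn(d_in), torch.randn(d_out)
W_new = rank_one_edit(W, C, k_star, v_star)
assert torch.allclose(W_new @ k_star, v_star, atol=1e-3)

Because the update is weighted by the key covariance C, other facts routed through the same layer are left mostly undisturbed, which is what makes the edit surgical enough to remove the specific hallucinations a RARR pass flags.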
