On Sun, Jun 22, 2025 at 07:34:58PM -0400, Martin Blais wrote:
> Today's models are pretty amazing actually.
> You can say things like "output the text under the white cat" and that
> would likely work.
> I'm blown away every moment of the day these days using these.

Yeah, but especially for vision models it seems to me that the quality
gap between self-hostable open-weight models and remote proprietary ones
is still pretty big. Last time I tried vision models locally for OCR in
the context of personal finance, the results weren't great (= not usable
yet), but that was ~1 year ago and things move fast. If someone on this
list has concrete experience with self-hostable vision models that
work well for this, I'd love to hear the specifics (which model,
system prompt, etc.).

Cheers
-- 
Stefano Zacchiroli . [email protected] . https://upsilon.cc/zack  _. ^ ._
Full professor of Computer Science              o     o   o     \/|V|\/
Télécom Paris, Polytechnic Institute of Paris     o     o o    </>   <\>
Co-founder & CSO Software Heritage            o o o     o       /\|^|/\
Mastodon: https://mastodon.xyz/@zacchiro                        '" V "'
