Karl Semich wrote:
> so i was working on 'zinc' -- Zinc Is Not Cline 
> https://github.com/karl3wm/zinc
> (very very slowly of course) and things got intense and i stopped :s
> 
> basically it needs boost and you make a build dir and build it with cmake
> and that makes "chat" and "complete" binaries that can do chatting or
> completion with llama 3.1 405b on sambanova, hardcoded right now
> 
> i'm not sure how to uhhh add preservation of configuration data to it
> uhhh actually uhhh
> anyway i looked up the llama prompt formats and made prompts (scripts)
> to use llama to help build it, it's comparable to uhh cline? claude?
> with short input, kind of somewhat
> 
> the scripts are in scripts/ so right now you can do things like
> `echo Update this file to have lots of comments about walruses. |
> python3 scripts/llama_file_flat.py src/configuration.cpp |
> build/complete | tee src/update_of_configuration.cpp.txt` and llama
> will act like cline and read the file and output a new file. but it's
> all manual right now. it's possibly a [sad] point to stop in that
> respect
> 
> but yeah it shows how easy it would be to make something like cline uhh

Did I mention that after I posted here about Cline and Roo Cline, Roo Cline was 
renamed to Roo Code within a day?
It's now outpaced Cline.

Meanwhile I've been poking very, very slowly at ZinC -- 
https://github.com/karl3wm/zinc !
It's ... funny ... and intense ... to work on. Kind of a new space. Disorienting.

Today I didn't work on it. I specifically avoided it, because I hadn't been 
sleeping and eating much the prior two days. I spent a lot of today sleeping. I 
also moved forward on rescuing my stolen car, which was slated to be crushed at 
a scrapyard with all of my worldly possessions inside (long story short). The 
crushing is now on hold, but mostly that wasn't my doing.

So because I haven't been working on ZinC I thought I'd talk about it instead. 
Sadly this will unsettle the ZinC energy, but it's somewhat fun!

ZinC is still at https://github.com/karl3wm/zinc . Right now the main binary is 
'chatabout', but of course these binaries and scripts are just tests and 
attempts to bootstrap a little recursion into the development. `chatabout` 
(which likely has a bug right now, who knows, but one could try `mkdir build; 
cd build; cmake ..; make; ./chatabout`) detects a couple of formats of shell 
expansion ("$(cmd)" and "$(<file)") and converts them to text-based citation 
formats with all the content embedded, so you can reference files and command 
outputs while talking with the language model. Of course I just made this 
quickly to meet small goals. (... and the python and bash scripts went farther 
and let the language model write files ...). I won't be running an example here 
because I'm ZinC-fasting today. Abstinence. Oh, the current bug with chatabout 
is that it crashes if you type in a URL that doesn't return a 200 status code, 
because it doesn't catch the HTTP error that is thrown.
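To give a feel for what that means, here's a standalone sketch of the idea. This 
is not ZinC's actual code or its actual citation format -- the helper names and 
the bracketed [file: ...] / [output of: ...] markers below are made up for 
illustration -- but it's the general shape of detecting "$(cmd)" and "$(<file)" 
in the user's text and embedding the referenced content:

// Standalone sketch only -- not ZinC's code; the citation markers are invented.
#include <array>
#include <cstdio>    // POSIX popen/pclose
#include <fstream>
#include <iostream>
#include <regex>
#include <sstream>
#include <string>

// run a shell command and capture its stdout
static std::string run_command(std::string const& cmd) {
    std::array<char, 4096> buf;
    std::string out;
    FILE* pipe = popen(cmd.c_str(), "r");
    if (!pipe) return "<failed to run: " + cmd + ">";
    while (size_t n = fread(buf.data(), 1, buf.size(), pipe))
        out.append(buf.data(), n);
    pclose(pipe);
    return out;
}

// slurp a whole file
static std::string read_file(std::string const& path) {
    std::ifstream f(path);
    std::ostringstream ss;
    ss << f.rdbuf();
    return ss.str();
}

// one pass over the text: "$(<path)" embeds a file, "$(cmd)" embeds command output
static std::string expand_citations(std::string const& text) {
    std::regex expr(R"(\$\((<[^)]+|[^<)][^)]*)\))");
    std::string out;
    auto last = text.cbegin();
    for (std::sregex_iterator it(text.begin(), text.end(), expr), end; it != end; ++it) {
        out.append(last, (*it)[0].first);
        std::string inner = (*it)[1].str();
        if (inner[0] == '<')
            out += "\n[file: " + inner.substr(1) + "]\n"
                 + read_file(inner.substr(1)) + "[end of file]\n";
        else
            out += "\n[output of: " + inner + "]\n"
                 + run_command(inner) + "[end of output]\n";
        last = (*it)[0].second;
    }
    out.append(last, text.cend());
    return out;
}

int main() {
    std::string line;
    while (std::getline(std::cin, line))
        std::cout << expand_citations(line) << "\n";
}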

Let's talk about some of my silly development struggles. I'm silly partly 
because it's just a "fun project" so I make funny decisions.

One of them was the API provider. Over the past weekend the SambaNova endpoint 
started reporting errors that my account was disabled. This was a bit of a 
weird one. I contacted support and got a reply on Monday that they had changed 
their system over the weekend to turn free accounts into paid accounts with an 
introductory token gift. My account didn't match what I expected, with a 
different balance than I was told (.. and a different usage history than I 
thought I should have ..) ... it worked again but it was funny ... Right now 
I've been using Targon instead of SambaNova. Targon's one of those 
pseudo-blockchain businesses that call themselves blockchain but have a ton of 
off-chain infrastructure. So [of course?] it's quite enticing in the 
[boss/ai/ml] mindset.

Targon presently has a timing-related bug where requests terminate with an 
error mid-stream before completing. At least, my streaming requests do. To 
resolve this more robustly I ended up changing chatabout to craft the prompt by 
hand using the OpenAI-style completion endpoint rather than the OpenAI-style 
chat endpoint. Then, when a request was interrupted mid-stream, I could append 
the tokens received so far to the prompt and continue it automatically. This 
made it behave nice and robustly! There are still some small errors on my end, 
but because of the retry behavior they turn into performance issues rather than 
crashes.
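For anyone who hasn't poked at these APIs: the two OpenAI-style endpoints take 
different request bodies. /v1/chat/completions takes structured messages and the 
server applies its own chat template, so a half-finished assistant message can't 
simply be handed back to resume; /v1/completions takes one raw prompt string, 
which is what makes the append-and-retry trick possible. A rough illustration of 
the two shapes (not ZinC's actual request code; I'm just reaching for boost::json 
since the project already requires Boost, and the model id is only an example):

// Illustration of the two OpenAI-style request bodies -- not ZinC's code.
#include <boost/json.hpp>
#include <iostream>

int main() {
    namespace json = boost::json;

    // Chat endpoint (/v1/chat/completions): the server renders the prompt
    // template from structured messages, so a dropped assistant stream can't
    // just be continued where it left off.
    json::object user_msg;
    user_msg["role"] = "user";
    user_msg["content"] = "Tell me about walruses.";
    json::object chat_body;
    chat_body["model"] = "deepseek-v3";   // example model id only
    chat_body["stream"] = true;
    chat_body["messages"] = json::array{user_msg};

    // Completion endpoint (/v1/completions): the client owns the prompt
    // string, so after an interruption the partial output is appended to the
    // prompt and the request is simply retried.
    std::string prompt = "...chat template rendered client-side...";
    std::string partial = "Walruses are large, ";  // tokens received before the error
    json::object completion_body;
    completion_body["model"] = "deepseek-v3";      // example model id only
    completion_body["stream"] = true;
    completion_body["prompt"] = prompt + partial;

    std::cout << json::serialize(chat_body) << "\n"
              << json::serialize(completion_body) << "\n";
}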

Let's quote that code!

The chatabout source is at 
https://github.com/karl3wm/zinc/blob/main/cli/chatabout.cpp . Remember it's 
just a quick script; my hope was to integrate all its parts into reusable 
components. I seem to be slow at doing that ...
Hmm, this doesn't look like the latest version ... I thought I had dropped the 
completion tokens ...

Anyway, here's the region where I added the retry loop, roughly lines 176 to 
219:
>        Log::log(zinc::span<StringViewPair>({
>            {"role", "user"},
>            {"content", msg},
>        }));
This is just local logging to a file; I haven't added blockchain logging at 
this time. I was thinking that if I observed myself doing that, I could make it 
optional.

>
>        messages.emplace_back(HodgePodge::Message{.role="user", 
> .content=move(msg)});
Here you can see the introduction of a HodgePodge class which I made quickly to 
bandaid the Targon issue. HodgePodge has a more fleshed-out Message structure 
than my OpenAI class, so as to handle the message parts that the large language 
model prompt generation templates use. The first line of the file now has 
"#include <zinc/hodgepodge.hpp>" inserted.

>        // it might be nice to terminate the request if more data is found on 
> stdin, append the data, and retry
>        // or otherwise provide for the user pasting some data then commenting 
> on it or hitting enter a second time or whatnot
Doesn't look like I'm likely to prioritize that; unsure. Maybe!

>        prompt = HodgePodge::prompt_deepseek3(messages, "assistant" != 
> messages.back().role);
Here's where it generates the prompt. HodgePodge::prompt_deepseek3 renders the 
prompt format that the official DeepSeek V3 tokenizer's chat template produces. 
It converts a sequence of message objects into a prompt string for the model to 
append to.
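For reference (from memory, so check the official tokenizer_config.json chat 
template before trusting the exact tokens), the DeepSeek V3 chat format is 
roughly: a begin-of-sentence token, any system text verbatim, then 
<｜User｜> / <｜Assistant｜> markers, with <｜end▁of▁sentence｜> closing each 
completed assistant turn, and a trailing <｜Assistant｜> when you want the model 
to generate. A toy version of such a renderer (not the actual 
HodgePodge::prompt_deepseek3; the Message struct here is just a minimal 
stand-in):

// Toy rendering of the DeepSeek V3 chat format, from memory -- verify the
// exact special tokens against the official tokenizer_config.json.
#include <string>
#include <vector>

struct Message { std::string role, content; };

std::string render_deepseek3(std::vector<Message> const& messages,
                             bool add_generation_prompt)
{
    std::string prompt = "<｜begin▁of▁sentence｜>";
    for (auto const& m : messages) {
        if (m.role == "system")
            prompt += m.content;  // system text goes in bare, right after BOS
        else if (m.role == "user")
            prompt += "<｜User｜>" + m.content;
        else if (m.role == "assistant")
            prompt += "<｜Assistant｜>" + m.content + "<｜end▁of▁sentence｜>";
    }
    if (add_generation_prompt)
        prompt += "<｜Assistant｜>";  // leave the turn open for the model
    return prompt;
}

The second argument mirrors the `"assistant" != messages.back().role` 
expression in the call above: only open a fresh assistant turn when the last 
message isn't already an assistant one.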

>        msg.clear();
It stores the assistant message in a std::string msg, appending to it as it 
streams in.

>
>        cerr << endl << "assistant: " << flush;
I use std::cerr for the chat label so that if the output is piped or redirected 
it doesn't include this decoration.

>        do {
>            try {
>                //for (auto&& part : client.chat(messages)) {
This was old code, from when I just used the chat interface on the server. Some 
LLMs let you prompt the assistant in an unbroken manner through the chat 
interface, but right now I'm using DeepSeek V3 and their default prompt 
template doesn't allow this; it appends end-of-sentence tokens and such to the 
provided input, preventing the model from resuming where it left off.

>                std::string finish_reason;
>                for (auto&& part : client.complete(prompt + msg)) {
Here I'm calling `complete` instead of `chat`, passing the raw prompt with any 
existing assistant message content appended to it. As `msg` grows, the prompt 
grows too, as if the request were still inside the generation loop on the 
server.

>                    msg += part;
>                    cout << part << flush;
>                    try {
>                        finish_reason = part.data["finish_reason"].string();
>                    } catch (std::out_of_range const&) {}
>                }
>                retry_assistant = finish_reason == "stop" ? false : true;
Here's the new normal c<i have experienced internal interruption. sending email 
as-is
>            } catch (std::runtime_error const& e) {
>                retry_assistant = true;
>                cerr << "<..." << e.what() << "...reconnecting...>" << flush;
>            /*} catch (std::system_error const& e) {
>                if (e.code() == std::errc::resource_unavailable_try_again) {
>                    retry_assistant = true;
>                    cerr << "<...reconnecting...>" << flush;
>                } else {
>                    throw;
>                }*/
>            }
>
>            Log::log(zinc::span<StringViewPair>({
>                {"role", "assistant"},
>                {"content", msg},
>            }));
>        } while (retry_assistant);
>
>        messages.emplace_back(HodgePodge::Message{.role="assistant", 
> .content=move(msg)});
>        msg.clear();
