after making this post a while ago, i tried out these three techniques for providing tool-result data to the LLM:
- append to assistant message
- send as user response
- send model-specific tool-response type
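to make the three options concrete, here's a sketch of each as a chat-completions-style message list. the message shapes and content are illustrative, not any specific provider's API:

```python
# Three ways to hand a tool result back to the model, sketched as
# chat-completions-style message lists. Content is illustrative.

tool_call = 'Tool: web_search("how to make milk rice")'
tool_result = "just put milk and rice in a bowl and mix em"

# 1) append to assistant message: call and result live in one turn
assistant_append = [
    {"role": "assistant",
     "content": f"{tool_call}\nResult: {tool_result}"},
]

# 2) send as user response: the result comes back as the next user turn
user_response = [
    {"role": "assistant", "content": tool_call},
    {"role": "user", "content": f"Result: {tool_result}"},
]

# 3) model-specific tool-response type: a dedicated role the model was
#    finetuned on (exact shape varies per provider)
tool_role = [
    {"role": "assistant", "content": tool_call},
    {"role": "tool", "content": tool_result},
]
```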
Findings
turns out - the assistant-message appending works great for larger LLMs, but not so well for smol ones.
meanwhile the user-side method works better than expected!
i didn't spend much time with the model-specific tool-response role, since i want my tooling to remain model-agnostic.
i will probably switch gopilot over to the user-side method now, leaving behind the assistant-only approach.
Tool call formatting improvements
Turns out - my initial tool-call formatting was SUPER token-inefficient - who knew…
So I went from this format:
okay lemme look that up online
<tool>{"tool_name": "web_search", "args": {"query": "how to make milk rice"}}</tool>
<result>just put milk and rice in a bowl and mix em</result>
to this MUCH simpler format:
okay lemme look that up online
Tool: web_search(“how to make milk rice”)
Result: just put milk and rice in a bowl and mix em
which is like - just… WAY better!!!
- tokens reduced from 43 down to 24 (cost savings)
- way easier to read
- relies on the model's code-writing ability
- allows for explicit named arguments, like keys in json:
Tool: web_search(query="my query here")
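if you want to generate this format from structured tool-call data, a minimal helper could look like this (the function names are my own, not from gopilot; `json.dumps` just handles quoting/escaping of values):

```python
import json

def format_tool_call(tool_name, args):
    """Render a tool call in the compact function-call style.

    One parameter -> bare positional value:  web_search("query here")
    Several      -> named assignment:        search(q="x", limit=5)
    """
    if len(args) == 1:
        value = next(iter(args.values()))
        return f"Tool: {tool_name}({json.dumps(value)})"
    rendered = ", ".join(f"{k}={json.dumps(v)}" for k, v in args.items())
    return f"Tool: {tool_name}({rendered})"

def format_tool_result(result):
    return f"Result: {result}"

print(format_tool_call("web_search", {"query": "how to make milk rice"}))
# Tool: web_search("how to make milk rice")
```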
i hope this is useful to someone out there.
if so, maybe share where you are applying it and tell us about your experience! <3
So OpenAI-compatible APIs have tool calls built in, with a standard format that is often part of the model's finetuning.
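for reference, that built-in exchange looks roughly like this: the model emits a structured `tool_calls` field, and the result goes back as a dedicated `tool`-role message keyed by `tool_call_id` (the id value here is illustrative):

```python
import json

# Assistant turn: the model requests a function call instead of text.
assistant_turn = {
    "role": "assistant",
    "content": None,
    "tool_calls": [{
        "id": "call_123",  # illustrative id
        "type": "function",
        "function": {
            "name": "web_search",
            "arguments": json.dumps({"query": "how to make milk rice"}),
        },
    }],
}

# Tool turn: your code sends the result back, linked by tool_call_id.
tool_turn = {
    "role": "tool",
    "tool_call_id": "call_123",  # must match the call's id
    "content": "just put milk and rice in a bowl and mix em",
}
```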
Kimi K2 has much-improved tool-use capability, and its open nature means a lot of other open-weight models will be making distillations to get this capability into their models. Theo has a great video on this.