Very much a work in progress
To do: run `playwright install`
Implementation needs to address deployment on:
- llama.cpp: CPU and GPU
- vLLM: GPU
- transformers (Hugging Face): GPU; baseline for the ensemble attention implementation?
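A minimal backend-selection sketch for the LLM Reasoner across those three targets; the `load_reasoner` function and the backend strings are placeholders, not part of the current code, and only the standard llama-cpp-python, vLLM, and transformers entry points are assumed.

```python
# Sketch only: pick a backend for the LLM Reasoner. Names here are hypothetical.
def load_reasoner(backend: str, model: str):
    if backend == "llama.cpp":
        from llama_cpp import Llama
        # n_gpu_layers=0 keeps inference on the CPU; -1 offloads all layers to the GPU.
        return Llama(model_path=model, n_gpu_layers=-1)
    if backend == "vllm":
        from vllm import LLM
        return LLM(model=model)  # GPU only
    if backend == "transformers":
        from transformers import pipeline
        # Baseline backend; also the likely place to experiment with ensemble attention.
        return pipeline("text-generation", model=model, device_map="auto")
    raise ValueError(f"unknown backend: {backend}")
```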
Initially all ensemble calls are synchronous and initiated by the LLM Reasoner
Each ensemble call and each ensemble result is delimited by explicit start/end tags:
- ensemble call: <ensemble:member_request> ... </ensemble:member_request>
- ensemble result: <ensemble:member_response> ... </ensemble:member_response>
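A minimal sketch of the synchronous round trip: extract member requests from the Reasoner's output and wrap each member's result in the matching response tag. The `member="..."` attribute and the `dispatch` callable are assumptions for illustration, not settled above.

```python
import re

# Assumed tag shape (the member="..." attribute is an assumption):
#   <ensemble:member_request member="web_search">query text</ensemble:member_request>
REQUEST_RE = re.compile(
    r'<ensemble:member_request\s+member="(?P<member>\w+)">(?P<body>.*?)</ensemble:member_request>',
    re.DOTALL,
)

def handle_requests(llm_output: str, dispatch) -> list[str]:
    """Synchronous round trip: pull every member request out of the Reasoner's
    output, run it, and wrap the result in the matching response tag."""
    responses = []
    for match in REQUEST_RE.finditer(llm_output):
        member, body = match["member"], match["body"].strip()
        result = dispatch(member, body)
        responses.append(
            f'<ensemble:member_response member="{member}">{result}</ensemble:member_response>'
        )
    return responses
```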
member is one of:
- web_search
- wikidata_search
- kgraph_search
- kgraph_traverse
- logic_query
- code_executor
- llm
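A dispatch-table sketch keyed by those member names; every handler below is a stub with an assumed signature, not an existing implementation.

```python
# Stub handlers keyed by member name; real implementations would call the web
# search client, Wikidata API, knowledge-graph store, logic engine, and so on.
MEMBERS = {
    "web_search":      lambda body: f"(web results for {body!r})",
    "wikidata_search": lambda body: f"(wikidata entities for {body!r})",
    "kgraph_search":   lambda body: f"(kgraph nodes matching {body!r})",
    "kgraph_traverse": lambda body: f"(neighbours of {body!r})",
    "logic_query":     lambda body: f"(bindings for {body!r})",
    "code_executor":   lambda body: f"(stdout of {body!r})",
    "llm":             lambda body: f"(delegated completion for {body!r})",
}

def dispatch(member: str, body: str) -> str:
    return MEMBERS[member](body)
```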
logic_query terms for testing:
- friend(?Friend)
- search_friends('search term', ?Friend)
- get_friend('friend_uri', ?Friend)
- traverse('uri', ?Node)
- traverse_incoming('uri', ?Node)
- traverse_outgoing('uri', ?Node)
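For example, one of those test terms sent through the tags above could look like this; the `member="..."` attribute and the binding shown in the response are made up for illustration.

```
<ensemble:member_request member="logic_query">search_friends('search term', ?Friend)</ensemble:member_request>
<ensemble:member_response member="logic_query">?Friend = http://example.org/friend/123</ensemble:member_response>
```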
Include a request id so each response can be matched to its request
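One way the request id could be carried, purely as a sketch, is an `id` attribute on both tags (attribute name and values are hypothetical):

```
<ensemble:member_request member="web_search" id="req-001">example query</ensemble:member_request>
<ensemble:member_response member="web_search" id="req-001">...</ensemble:member_response>
```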
Potentially handle async cases with:
- initial ensemble call
- acknowledgement
- result
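A minimal asyncio sketch of that three-step flow, reusing the request-id idea above; all names here are placeholders and the member call is stubbed out.

```python
import asyncio
import uuid

async def run_member(member: str, body: str) -> str:
    """Placeholder for the real member call (search, logic engine, executor, ...)."""
    await asyncio.sleep(0.1)
    return f"(result of {member} for {body!r})"

async def submit(member: str, body: str, pending: dict[str, asyncio.Task]) -> str:
    """Steps 1 and 2: start the ensemble call and return an acknowledgement immediately."""
    request_id = str(uuid.uuid4())
    pending[request_id] = asyncio.create_task(run_member(member, body))
    return request_id  # the acknowledgement carries the request id

async def main():
    pending: dict[str, asyncio.Task] = {}
    request_id = await submit("web_search", "example query", pending)
    print("acknowledged:", request_id)
    # Step 3: the result is collected later, when the Reasoner asks for it.
    print("result:", await pending[request_id])

asyncio.run(main())
```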