feat(product-assistant): product memory #27270

Open · wants to merge 58 commits into base: master

Changes from all commits (58 commits):
e4e5597
feat: perplexity prompt
skoob13 Dec 16, 2024
900558a
chore: migrations
skoob13 Dec 17, 2024
4afa58b
feat: initial memory scraping
skoob13 Dec 18, 2024
995e1f0
feat: block onboarding when someone has started it
skoob13 Dec 18, 2024
8c4fa2e
feat: assistant onboarding
skoob13 Dec 18, 2024
1af9f54
chore: migration
skoob13 Dec 20, 2024
66d3ec2
feat: flag for resumed conversation
skoob13 Dec 19, 2024
7cacd2f
fix: mypy
skoob13 Dec 20, 2024
d47b848
feat: stricter prompt for perplexity
skoob13 Dec 20, 2024
7143371
feat: simple dynamic forms
skoob13 Jan 2, 2025
92e3d44
feat: style form options
skoob13 Jan 2, 2025
4eedbd6
feat: formatting core memory
skoob13 Jan 2, 2025
7c31136
feat: core memory prompts
skoob13 Jan 2, 2025
db9d55d
feat: compress memory
skoob13 Jan 3, 2025
9dcf2bc
fix: update mobile prompts
skoob13 Jan 3, 2025
40ffbf1
fix: disable streaming for compressor
skoob13 Jan 3, 2025
4a5211b
feat: memory collector nodes
skoob13 Jan 3, 2025
51644c2
feat: append/replace memory
skoob13 Jan 3, 2025
12bd3f9
fix: mypy
skoob13 Jan 3, 2025
f8c3cd8
feat: memory router
skoob13 Jan 6, 2025
5a6b4e7
Update UI snapshots for `chromium` (1)
github-actions[bot] Jan 6, 2025
7b0cd1f
feat: re-use prompts
skoob13 Jan 6, 2025
e1e24fa
fix: excessive detection
skoob13 Jan 6, 2025
4df2af3
test: eval tests
skoob13 Jan 6, 2025
b124610
fix: broader check for the planner
skoob13 Jan 6, 2025
fc8a9eb
fix: filter out summarizer messages
skoob13 Jan 6, 2025
b3d8c04
fix: assistant tests
skoob13 Jan 6, 2025
5a7d502
test: core memory model
skoob13 Jan 6, 2025
eeec86c
Merge branch 'feat/core-agent-memory' of github.com:PostHog/posthog i…
skoob13 Jan 6, 2025
86dd54e
test: onboarding node
skoob13 Jan 7, 2025
ec1b6d8
fix: allow optional state
skoob13 Jan 7, 2025
dd0bf31
test: memory initializer
skoob13 Jan 7, 2025
3145b81
test: memory initializer interrupt
skoob13 Jan 7, 2025
d942f68
test: memory collector node
skoob13 Jan 7, 2025
b59724e
test: memory collector tools
skoob13 Jan 7, 2025
cf72449
test: abstract nodes
skoob13 Jan 7, 2025
6d9c76a
fix: use product description from project for initial memory
skoob13 Jan 7, 2025
27a45db
fix: rename tests
skoob13 Jan 7, 2025
d207440
Merge branch 'fix/taxonomy-planner-failover' into feat/core-agent-memory
skoob13 Jan 7, 2025
2859068
fix: set skipped status after rejecting scraped memory
skoob13 Jan 7, 2025
3e06455
test: assistant tests
skoob13 Jan 7, 2025
4a7ed05
chore: document code and move out messages to prompts
skoob13 Jan 7, 2025
e6eb4f7
fix: recompile dev requirements
skoob13 Jan 7, 2025
855561a
fix: mypy
skoob13 Jan 7, 2025
8036ebe
fix: hide form actions after submitting a response
skoob13 Jan 7, 2025
405e42c
fix: missing messages from memory initializer
skoob13 Jan 7, 2025
a089774
Merge branch 'master' into feat/core-agent-memory
skoob13 Jan 7, 2025
d13df1f
Merge branch 'master' into feat/core-agent-memory
Twixes Jan 8, 2025
c731539
Prevent memory being left in pending state
Twixes Jan 9, 2025
5461600
Tweak wording of the initialization flow
Twixes Jan 9, 2025
aa9a645
Tweak initialization prompts
Twixes Jan 9, 2025
6d9967b
Merge branch 'master' into feat/core-agent-memory
Twixes Jan 9, 2025
3732092
Unify `isFinalGroup` naming
Twixes Jan 9, 2025
4c4f723
Fix missing __init__.py in memory tests
Twixes Jan 9, 2025
3c68f38
Update UI snapshots for `chromium` (1)
github-actions[bot] Jan 9, 2025
6f307db
Add one more __init__.py
Twixes Jan 9, 2025
f9a9701
Merge branch 'master' into feat/core-agent-memory
Twixes Jan 9, 2025
5c7af30
Update UI snapshots for `chromium` (1)
github-actions[bot] Jan 9, 2025
45 changes: 34 additions & 11 deletions ee/hogai/assistant.py
@@ -1,6 +1,6 @@
import json
from collections.abc import Generator, Iterator
from typing import Any, Optional
from typing import Any, Optional, cast
from uuid import uuid4

from langchain_core.messages import AIMessageChunk
@@ -12,6 +12,7 @@
from ee import settings
from ee.hogai.funnels.nodes import FunnelGeneratorNode
from ee.hogai.graph import AssistantGraph
from ee.hogai.memory.nodes import MemoryInitializerNode
from ee.hogai.retention.nodes import RetentionGeneratorNode
from ee.hogai.schema_generator.nodes import SchemaGeneratorNode
from ee.hogai.trends.nodes import TrendsGeneratorNode
@@ -57,6 +58,17 @@
AssistantNodeName.RETENTION_GENERATOR: RetentionGeneratorNode,
}

STREAMING_NODES: set[AssistantNodeName] = {
AssistantNodeName.MEMORY_ONBOARDING,
AssistantNodeName.MEMORY_INITIALIZER,
AssistantNodeName.SUMMARIZER,
}
"""Nodes that can stream messages to the client."""


VERBOSE_NODES = STREAMING_NODES | {AssistantNodeName.MEMORY_INITIALIZER_INTERRUPT}
"""Nodes that can send messages to the client."""


class Assistant:
_team: Team
@@ -117,8 +129,11 @@ def _stream(self) -> Generator[str, None, None]:
# Check if the assistant has requested help.
state = self._graph.get_state(config)
if state.next:
interrupt_value = state.tasks[0].interrupts[0].value
yield self._serialize_message(
AssistantMessage(content=state.tasks[0].interrupts[0].value, id=str(uuid4()))
AssistantMessage(content=interrupt_value, id=str(uuid4()))
if isinstance(interrupt_value, str)
else interrupt_value
)
else:
self._report_conversation_state(last_viz_message)
@@ -227,26 +242,34 @@ def _process_value_update(self, update: GraphValueUpdateTuple) -> BaseModel | No
return node_val.messages[0]
elif node_val.intermediate_steps:
return AssistantGenerationStatusEvent(type=AssistantGenerationStatusType.GENERATION_ERROR)
elif node_val := state_update.get(AssistantNodeName.SUMMARIZER):
if isinstance(node_val, PartialAssistantState) and node_val.messages:
self._chunks = AIMessageChunk(content="")
return node_val.messages[0]

for node_name in VERBOSE_NODES:
if node_val := state_update.get(node_name):
if isinstance(node_val, PartialAssistantState) and node_val.messages:
self._chunks = AIMessageChunk(content="")
return node_val.messages[0]

return None

def _process_message_update(self, update: GraphMessageUpdateTuple) -> BaseModel | None:
langchain_message, langgraph_state = update[1]
if isinstance(langchain_message, AIMessageChunk):
if langgraph_state["langgraph_node"] in VISUALIZATION_NODES.keys():
node_name = langgraph_state["langgraph_node"]
if node_name in VISUALIZATION_NODES.keys():
self._chunks += langchain_message # type: ignore
parsed_message = VISUALIZATION_NODES[langgraph_state["langgraph_node"]].parse_output(
self._chunks.tool_calls[0]["args"]
)
parsed_message = VISUALIZATION_NODES[node_name].parse_output(self._chunks.tool_calls[0]["args"])
if parsed_message:
initiator_id = self._state.start_id if self._state is not None else None
return VisualizationMessage(answer=parsed_message.query, initiator=initiator_id)
elif langgraph_state["langgraph_node"] == AssistantNodeName.SUMMARIZER:
elif node_name in STREAMING_NODES:
self._chunks += langchain_message # type: ignore
if node_name == AssistantNodeName.MEMORY_INITIALIZER:
if not MemoryInitializerNode.should_process_message_chunk(langchain_message):
return None
else:
return AssistantMessage(
content=MemoryInitializerNode.format_message(cast(str, self._chunks.content))
)
return AssistantMessage(content=self._chunks.content)
return None

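For orientation, here is a minimal, self-contained sketch of the streaming gate this diff introduces: nodes in `STREAMING_NODES` have their token chunks forwarded to the client as they arrive, while the remaining `VERBOSE_NODES` only surface a final message. The `NodeName` enum and `Update` type below are simplified stand-ins, not the real `AssistantNodeName` or LangGraph update tuples.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class NodeName(str, Enum):
    MEMORY_ONBOARDING = "memory_onboarding"
    MEMORY_INITIALIZER = "memory_initializer"
    MEMORY_INITIALIZER_INTERRUPT = "memory_initializer_interrupt"
    SUMMARIZER = "summarizer"
    ROUTER = "router"


STREAMING_NODES = {NodeName.MEMORY_ONBOARDING, NodeName.MEMORY_INITIALIZER, NodeName.SUMMARIZER}
VERBOSE_NODES = STREAMING_NODES | {NodeName.MEMORY_INITIALIZER_INTERRUPT}


@dataclass
class Update:
    node: NodeName
    chunk: Optional[str] = None    # token-level update from a streaming LLM call
    message: Optional[str] = None  # full message produced when a node finishes


def handle(update: Update) -> Optional[str]:
    """Return text to emit to the client, or None to stay silent."""
    if update.chunk is not None:
        # Token chunks are forwarded only for nodes explicitly marked as streaming.
        return update.chunk if update.node in STREAMING_NODES else None
    if update.message is not None:
        # Final messages are emitted for any verbose node (streaming nodes included).
        return update.message if update.node in VERBOSE_NODES else None
    return None


assert handle(Update(NodeName.SUMMARIZER, chunk="Hel")) == "Hel"
assert handle(Update(NodeName.ROUTER, chunk="internal")) is None
assert handle(Update(NodeName.MEMORY_INITIALIZER_INTERRUPT, message="Does this look right?")) == "Does this look right?"
```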
27 changes: 27 additions & 0 deletions ee/hogai/eval/conftest.py
@@ -8,6 +8,7 @@
from langchain_core.runnables import RunnableConfig

from ee.models import Conversation
from ee.models.assistant import CoreMemory
from posthog.demo.matrix.manager import MatrixManager
from posthog.models import Organization, Project, Team, User
from posthog.tasks.demo_create_data import HedgeboxMatrix
@@ -78,6 +79,32 @@ def user(team, django_db_blocker) -> Generator[User, None, None]:
user.delete()


@pytest.fixture(scope="package")
def core_memory(team) -> Generator[CoreMemory, None, None]:
initial_memory = """Hedgebox is a cloud storage service enabling users to store, share, and access files across devices.

The company operates in the cloud storage and collaboration market for individuals and businesses.

Their audience includes professionals and organizations seeking file management and collaboration solutions.

Hedgebox’s freemium model provides free accounts with limited storage and paid subscription plans for additional features.

Core features include file storage, synchronization, sharing, and collaboration tools for seamless file access and sharing.

It integrates with third-party applications to enhance functionality and streamline workflows.

Hedgebox sponsors the YouTube channel Marius Tech Tips."""

core_memory = CoreMemory.objects.create(
team=team,
text=initial_memory,
initial_text=initial_memory,
scraping_status=CoreMemory.ScrapingStatus.COMPLETED,
)
yield core_memory
core_memory.delete()


@pytest.mark.django_db(transaction=True)
@pytest.fixture
def runnable_config(team, user) -> Generator[RunnableConfig, None, None]:
178 changes: 178 additions & 0 deletions ee/hogai/eval/tests/test_eval_memory.py
@@ -0,0 +1,178 @@
import json
from collections.abc import Callable
from typing import Optional

import pytest
from deepeval import assert_test
from deepeval.metrics import GEval, ToolCorrectnessMetric
from deepeval.test_case import LLMTestCase, LLMTestCaseParams
from langchain_core.messages import AIMessage
from langchain_core.runnables.config import RunnableConfig
from langgraph.graph.state import CompiledStateGraph

from ee.hogai.assistant import AssistantGraph
from ee.hogai.utils.types import AssistantNodeName, AssistantState
from posthog.schema import HumanMessage


@pytest.fixture
def retrieval_metrics():
retrieval_correctness_metric = GEval(
name="Correctness",
criteria="Determine whether the actual output is factually correct based on the expected output.",
evaluation_steps=[
"Check whether the facts in 'actual output' contradicts any facts in 'expected output'",
"You should also heavily penalize omission of detail",
"Vague language, or contradicting OPINIONS, are OK",
"The actual fact must only contain information about the user's company or product",
"Context must not contain similar information to the actual fact",
],
evaluation_params=[
LLMTestCaseParams.INPUT,
LLMTestCaseParams.CONTEXT,
LLMTestCaseParams.EXPECTED_OUTPUT,
LLMTestCaseParams.ACTUAL_OUTPUT,
],
threshold=0.7,
)

return [ToolCorrectnessMetric(), retrieval_correctness_metric]


@pytest.fixture
def replace_metrics():
retrieval_correctness_metric = GEval(
name="Correctness",
criteria="Determine whether the actual output tuple is factually correct based on the expected output tuple. The first element is the original fact from the context to replace with, while the second element is the new fact to replace it with.",
evaluation_steps=[
"Check whether the facts in 'actual output' contradicts any facts in 'expected output'",
"You should also heavily penalize omission of detail",
"Vague language, or contradicting OPINIONS, are OK",
"The actual fact must only contain information about the user's company or product",
"Context must contain the first element of the tuples",
"For deletion, the second element should be an empty string in both the actual and expected output",
],
evaluation_params=[
LLMTestCaseParams.INPUT,
LLMTestCaseParams.CONTEXT,
LLMTestCaseParams.EXPECTED_OUTPUT,
LLMTestCaseParams.ACTUAL_OUTPUT,
],
threshold=0.7,
)

return [ToolCorrectnessMetric(), retrieval_correctness_metric]


@pytest.fixture
def call_node(team, runnable_config: RunnableConfig) -> Callable[[str], Optional[AIMessage]]:
graph: CompiledStateGraph = (
AssistantGraph(team).add_memory_collector(AssistantNodeName.END, AssistantNodeName.END).compile()
)

def callable(query: str) -> Optional[AIMessage]:
state = graph.invoke(
AssistantState(messages=[HumanMessage(content=query)]),
runnable_config,
)
validated_state = AssistantState.model_validate(state)
if not validated_state.memory_collection_messages:
return None
return validated_state.memory_collection_messages[-1]

return callable


def test_saves_relevant_fact(call_node, retrieval_metrics, core_memory):
query = "calculate ARR: use the paid_bill event and the amount property."
actual_output = call_node(query)
tool = actual_output.tool_calls[0]

test_case = LLMTestCase(
input=query,
expected_output="The product uses the event paid_bill and the property amount to calculate Annual Recurring Revenue (ARR).",
expected_tools=["core_memory_append"],
context=[core_memory.formatted_text],
actual_output=tool["args"]["memory_content"],
tools_called=[tool["name"]],
)
assert_test(test_case, retrieval_metrics)


def test_saves_company_related_information(call_node, retrieval_metrics, core_memory):
query = "Our secondary target audience is technical founders or highly-technical product managers."
actual_output = call_node(query)
tool = actual_output.tool_calls[0]

test_case = LLMTestCase(
input=query,
expected_output="The company's secondary target audience is technical founders or highly-technical product managers.",
expected_tools=["core_memory_append"],
context=[core_memory.formatted_text],
actual_output=tool["args"]["memory_content"],
tools_called=[tool["name"]],
)
assert_test(test_case, retrieval_metrics)


def test_omits_irrelevant_personal_information(call_node):
query = "My name is John Doherty."
actual_output = call_node(query)
assert actual_output is None


def test_omits_irrelevant_excessive_info_from_insights(call_node):
query = "Build a pageview trend for users with name John."
actual_output = call_node(query)
assert actual_output is None


def test_fact_replacement(call_node, core_memory, replace_metrics):
query = "Hedgebox doesn't sponsor the YouTube channel Marius Tech Tips anymore."
actual_output = call_node(query)
tool = actual_output.tool_calls[0]

test_case = LLMTestCase(
input=query,
expected_output=json.dumps(
[
"Hedgebox sponsors the YouTube channel Marius Tech Tips.",
"Hedgebox no longer sponsors the YouTube channel Marius Tech Tips.",
]
),
expected_tools=["core_memory_replace"],
context=[core_memory.formatted_text],
actual_output=json.dumps([tool["args"]["original_fragment"], tool["args"]["new_fragment"]]),
tools_called=[tool["name"]],
)
assert_test(test_case, replace_metrics)


def test_fact_removal(call_node, core_memory, replace_metrics):
query = "Delete info that Hedgebox sponsored the YouTube channel Marius Tech Tips."
actual_output = call_node(query)
tool = actual_output.tool_calls[0]

test_case = LLMTestCase(
input=query,
expected_output=json.dumps(["Hedgebox sponsors the YouTube channel Marius Tech Tips.", ""]),
expected_tools=["core_memory_replace"],
context=[core_memory.formatted_text],
actual_output=json.dumps([tool["args"]["original_fragment"], tool["args"]["new_fragment"]]),
tools_called=[tool["name"]],
)
assert_test(test_case, replace_metrics)


def test_parallel_calls(call_node):
query = "Delete info that Hedgebox sponsored the YouTube channel Marius Tech Tips, and we don't have file sharing."
actual_output = call_node(query)

tool = actual_output.tool_calls
test_case = LLMTestCase(
input=query,
expected_tools=["core_memory_replace", "core_memory_append"],
actual_output=actual_output.content,
tools_called=[tool[0]["name"], tool[1]["name"]],
)
assert_test(test_case, [ToolCorrectnessMetric()])
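For reference, a hedged sketch of the tool-call shapes these eval tests expect from the memory collector. The messages below are hand-built stand-ins; in the tests themselves they come out of the compiled LLM graph.

```python
from langchain_core.messages import AIMessage

# Appending a new fact: one tool call named core_memory_append with memory_content.
append_call = AIMessage(
    content="",
    tool_calls=[
        {
            "name": "core_memory_append",
            "args": {"memory_content": "The product uses the paid_bill event and the amount property to calculate ARR."},
            "id": "call_1",
        }
    ],
)

# Replacing (or deleting) an existing fact: core_memory_replace carries the original
# fragment and its replacement; an empty new_fragment signals deletion.
replace_call = AIMessage(
    content="",
    tool_calls=[
        {
            "name": "core_memory_replace",
            "args": {
                "original_fragment": "Hedgebox sponsors the YouTube channel Marius Tech Tips.",
                "new_fragment": "",
            },
            "id": "call_2",
        }
    ],
)

# The tests read tool_calls[0]["name"] and tool_calls[0]["args"] and score them
# against expected_tools / expected_output using the deepeval metrics above.
assert append_call.tool_calls[0]["name"] == "core_memory_append"
assert replace_call.tool_calls[0]["args"]["new_fragment"] == ""
```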
2 changes: 1 addition & 1 deletion ee/hogai/eval/tests/test_eval_router.py
@@ -13,7 +13,7 @@
def call_node(team, runnable_config) -> Callable[[str | list], str]:
graph: CompiledStateGraph = (
AssistantGraph(team)
.add_start()
.add_edge(AssistantNodeName.START, AssistantNodeName.ROUTER)
.add_router(path_map={"trends": AssistantNodeName.END, "funnel": AssistantNodeName.END})
.compile()
)
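A small sketch of the eval-only wiring used in the change above, presumably to bypass the new memory onboarding nodes that `add_start()` would now pull in so that only routing behaviour is evaluated (the reasoning is inferred, not stated in the PR):

```python
from ee.hogai.assistant import AssistantGraph
from ee.hogai.utils.types import AssistantNodeName


def build_router_eval_graph(team):
    # Wire START directly to the router instead of going through the full start flow.
    return (
        AssistantGraph(team)
        .add_edge(AssistantNodeName.START, AssistantNodeName.ROUTER)
        .add_router(path_map={"trends": AssistantNodeName.END, "funnel": AssistantNodeName.END})
        .compile()
    )
```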
11 changes: 5 additions & 6 deletions ee/hogai/funnels/prompts.py
@@ -2,16 +2,15 @@
<agent_info>
You are an expert product analyst agent specializing in data visualization and funnel analysis. Your primary task is to understand a user's data taxonomy and create a plan for building a visualization that answers the user's question. This plan should focus on funnel insights, including a sequence of events, property filters, and values of property filters.

{{#product_description}}
The product being analyzed is described as follows:
<product_description>
{{.}}
</product_description>
{{/product_description}}
{{core_memory_instructions}}

{{react_format}}
</agent_info>

<core_memory>
{{core_memory}}
</core_memory>

{{react_human_in_the_loop}}

Below you will find information on how to correctly discover the taxonomy of the user's data.
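The funnels prompt now injects shared core-memory content instead of a per-prompt product description. Below is a minimal sketch of how the `{{core_memory_instructions}}` and `{{core_memory}}` placeholders could be rendered, assuming LangChain's mustache template support; the template text and variable values are illustrative, not the real prompts.

```python
from langchain_core.prompts import ChatPromptTemplate

# Illustrative template only; the real prompt lives in ee/hogai/funnels/prompts.py.
FUNNEL_SYSTEM_TEMPLATE = (
    "<agent_info>\n"
    "You are an expert product analyst agent specializing in funnel analysis.\n"
    "{{core_memory_instructions}}\n"
    "</agent_info>\n"
    "\n"
    "<core_memory>\n"
    "{{core_memory}}\n"
    "</core_memory>"
)

prompt = ChatPromptTemplate.from_messages(
    [("system", FUNNEL_SYSTEM_TEMPLATE)],
    template_format="mustache",
)

messages = prompt.format_messages(
    core_memory_instructions="You have access to the core memory about the user's product below.",
    core_memory="Hedgebox is a cloud storage service for storing, sharing, and accessing files across devices.",
)
print(messages[0].content)
```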