Example v<=1.0.3

Example for chatformers<=1.0.3

from chatformers.chatbot import chat

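# Alternative provider/embedding settings (uncomment one pair below to use it
# instead of the active Groq/Jina settings further down):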
# llm_provider_settings = {
#     "provider": 'ollama',
#     "base_url": 'http://localhost:11434',
#     "model": "openhermes",
#     "options": {},
#     "api_key": None
# }
# embedding_model_settings = {
#     "provider": 'ollama',
#     "base_url": 'http://localhost:11434',
#     "model": "nomic-embed-text",
#     "api_key": None
# }
# llm_provider_settings = {
#     "provider": 'openai',
#     "base_url": "https://api.openai.com/v1",
#     "model": "gpt-4o-mini",
#     "options": {},
#     "api_key": ""
# }
# embedding_model_settings = {
#     "provider": 'openai',
#     "base_url": "https://api.openai.com/v1",
#     "model": "text-embedding-ada-002",
#     "api_key": ""
# }

llm_provider_settings = {
    "provider": 'groq',
    "base_url": 'https://api.groq.com/openai/v1',
    "model": "gemma2-9b-it",
    "api_key": "",
}

embedding_model_settings = {
    "provider": 'jina',
    "base_url": "https://api.jina.ai/v1/embeddings",
    "model": "jina-embeddings-v2-base-en",
    "api_key": ""
}

chroma_settings = {
    "host": None,
    "port": None,
    "settings": None
}

memory_settings = {
    "try_queries": True,
    "results_per_query": 3,
}
collection_name = "conversation"
unique_session_id = "012"
unique_message_id = "A01"
system_message = "You are a helpful assistant."
buffer_window_chats = [
    {'role': 'user', 'content': 'what is 7*5?'},
    {'role': 'assistant', 'content': '35'},
    {'role': 'user', 'content': 'now add 4 on that.'},
]
query = "Now add, 100 on that."
response = chat(query=query, system_message=system_message,
                llm_provider_settings=llm_provider_settings,
                chroma_settings=chroma_settings,
                embedding_model_settings=embedding_model_settings,
                memory_settings=memory_settings,
                memory=True,
                summarize_memory=False,
                collection_name=collection_name,
                unique_session_id=unique_session_id,
                unique_message_id=unique_message_id,
                buffer_window_chats=buffer_window_chats)
print("Assistant: ", response)

You can see that we are having an initial conversation with the assistant, where it is not yet aware of any context (check the next output too)-

Below, you can see that in the next run the assistant remembers the context, fetched back as memories-
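A follow-up run in the same session might look like the sketch below. It is illustrative only: the query and the new unique_message_id are made-up values, passing an empty buffer_window_chats is an assumption, and the exact reply depends on the model.

query = "What was the final number we calculated?"
response = chat(query=query, system_message=system_message,
                llm_provider_settings=llm_provider_settings,
                chroma_settings=chroma_settings,
                embedding_model_settings=embedding_model_settings,
                memory_settings=memory_settings,
                memory=True,
                summarize_memory=False,
                collection_name=collection_name,
                unique_session_id=unique_session_id,  # same session as before
                unique_message_id="A02",  # a new unique id for this message
                buffer_window_chats=[])  # empty: rely on memories fetched from ChromaDB
print("Assistant: ", response)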

Understanding the Settings Parameters-

  • llm_provider_settings-

    • provider: can be openai or ollama for now; any OpenAI-compatible provider also works (the example above uses groq; see FAQs).

    • base_url: the base URL of the provider.

    • model: the name of the model.

    • options: optional; by default the provider's default settings are used.

    • api_key: your API key from the provider.

  • embedding_model_settings-

    • provider: can be openai or ollama for now; OpenAI-compatible embedding providers also work (the example above uses jina).

    • base_url: the base URL of the provider.

    • model: the name of the embedding model.

    • options: optional; by default the provider's default settings are used.

    • api_key: your API key from the provider.

  • chroma_settings-

    • host: the host URL of your ChromaDB server.

    • port: the port of your ChromaDB server.

    • settings: ChromaDB settings, including authentication; read the ChromaDB documentation (see also the remote-connection sketch after this list).

  • memory_settings-

    • try_queries: if enabled, the assistant refines your query into several variants when searching for similar embeddings in ChromaDB.

      Example: Your input is 'My name is Dipesh'

      The assistant might try these queries in the vector DB: 'Who is Dipesh?', 'Is there any conversation with Dipesh?', 'My name is Dipesh', etc.

    • results_per_query: the number of relevant chats to fetch from the vector DB for each query.

  • collection_name = "conversation": (str) This is the collection name you want to create in the vector DB.

  • unique_session_id = "012": (str) You need to manage this yourself; use a distinct ID per conversation session, for example one ID when user A chats in session 1 and another when user A chats in session 2.

  • unique_message_id = "A01": (str) You need to manage this yourself; it can be any unique message ID. A UUID string works well (see the snippet after this list).

  • system_message = "You are a helpful assistant.": (str) Any system message or prompt.

  • summarize_memory=False: (bool) flag to summarize memories before they are used; improves quality but adds an extra LLM call.

  • buffer_window_chats: if you want to manage a sliding-window chat history, pass the last n messages (the most recent turns of the conversation) in OpenAI's message format.

    Example:
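    An illustrative buffer (made-up contents; same format as buffer_window_chats in the code above):

    buffer_window_chats = [
        {'role': 'user', 'content': 'My name is Dipesh.'},
        {'role': 'assistant', 'content': 'Nice to meet you, Dipesh!'},
        {'role': 'user', 'content': 'Tell me more about him?'},
    ]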

  • query = "Tell me more about him?": (str) the current/latest human message.
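For reference, a minimal sketch of connecting to a remotely hosted ChromaDB. The host, port, and Settings values are illustrative placeholders; check the ChromaDB documentation for the authentication fields your deployment needs.

from chromadb.config import Settings

chroma_settings = {
    "host": "localhost",  # placeholder: your ChromaDB server host
    "port": 8000,         # placeholder: ChromaDB's default server port
    "settings": Settings(anonymized_telemetry=False)  # add auth settings here if your server requires them
}

And since unique_message_id only needs to be unique, a UUID string works well:

import uuid

unique_message_id = str(uuid.uuid4())  # e.g. 'a1b2c3d4-...'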

FAQs-

  1. Can I customize LLM endpoints, or use Groq and other models?

    • Yes, any OpenAI-compatible endpoints and models can be used.

  2. Can I use a custom-hosted ChromaDB?

    • Yes, you can specify custom endpoints for ChromaDB. If none are provided, a Chroma directory will be created in your project's root folder.

  3. I don't want to manage history; I just want to chat.

    • Yes, set memory=False to disable history management and chat directly (see the sketch after these FAQs).

  4. Need help or have suggestions?
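For FAQ 3 above, a minimal sketch of a memory-free call. It assumes the other arguments stay as in the example at the top of this page; whether they are all still required when memory=False is an assumption.

response = chat(query="What is the capital of France?",
                system_message=system_message,
                llm_provider_settings=llm_provider_settings,
                chroma_settings=chroma_settings,
                embedding_model_settings=embedding_model_settings,
                memory_settings=memory_settings,
                memory=False,  # disable history: nothing is stored in or fetched from ChromaDB
                summarize_memory=False,
                collection_name=collection_name,
                unique_session_id=unique_session_id,
                unique_message_id="B01",  # illustrative id
                buffer_window_chats=[])
print("Assistant: ", response)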
