---
url: https://lettuceai.app/changelog
title: "Changelog — LettuceAI"
description: "Track updates, improvements, and fixes across LettuceAI releases."
---


# What's new

Track updates, improvements, and fixes across LettuceAI releases.


Jump to a release (14 versions):

-   [1.6.0 / 1.3.0 · May 4, 2026](#v-1.6.0-1.3.0)
-   [1.5.1 / 1.2.1 · Apr 13, 2026](#v-1.5.1-1.2.1)
-   [1.5.0 / 1.2.0 · Apr 13, 2026](#v-1.5.0-1.2.0)
-   [1.4.1 / 1.1.1 · Apr 6, 2026](#v-1.4.1-1.1.1)
-   [1.4.0 / 1.1.0 · Apr 6, 2026](#v-1.4.0-1.1.0)
-   [1.3.3 / 1.0.3 · Mar 29, 2026](#v-1.3.3-1.0.3)
-   [1.3.2 / 1.0.2 · Mar 27, 2026](#v-1.3.2-1.0.2)
-   [1.3.1 / 1.0.1 · Mar 23, 2026](#v-1.3.1-1.0.1)
-   [1.3.0 / release · Mar 22, 2026](#v-1.3.0-desktop)
-   [1.2.0 / Beta 4 · Feb 15, 2026](#v-1.2.0-beta4)
-   [1.1.0 / Beta 3 · Jan 31, 2026](#v-1.1.0-beta3)
-   [Release / Beta 2 · Jan 4, 2026](#v-android-release-beta2)
-   [v1.0-beta.6.2 · Dec 24, 2025](#v-beta-6.2)
-   [v1.0-beta.6 · Dec 21, 2025](#v-beta-6)


Android · Desktop · 1.6.0 / 1.3.0 · May 4, 2026

## Companion Mode, New Voice Features & Embedding v4

This release introduces Companion Mode with live relationship state, companion memory surfaces, and soul authoring. It also adds Kokoro TTS, local speech recognition, and the new `lettuce-emb-v4` memory model, alongside broader lorebook, runtime, and storage improvements across desktop and Android.

-   Added Companion Mode as a new interaction model with a dedicated relationship-oriented prompt path, authored companion soul configuration, live emotional and relationship state, and companion-specific memory and inspection pages
-   Added `lettuce-emb-v4` as the new embedding model for the memory layer, with a large roleplay-retrieval quality jump, 768d native embeddings, Matryoshka dimensions, and ONNX exports
-   Companion soul authoring is much deeper, with a full editor, presets, in-chat editing surfaces, and an AI-assisted Companion Soul Writer that can draft or refine the soul from character context
-   Added major new voice features through Kokoro TTS and whisper.cpp-based local speech recognition, making local speech workflows practical across platforms
-   Dynamic memory, storage, and session persistence received another major hardening pass, especially around normalized memory embedding storage, hot-path updates, and better consistency for long-running chats
-   Local runtime work continued across `llama.cpp`, offload planning, provider/model routing, and memory-related background processing, improving resilience for local and hybrid setups

What's New

-   Added Companion Mode as a separate interaction mode alongside roleplay, aimed at persistent relationship-driven chats instead of scene-first roleplay
-   Added authored companion configuration with soul fields such as essence, voice, relational style, vulnerabilities, habits, boundaries, baseline affect, and regulation style
-   Added companion relationship pages that expose live closeness, trust, affection, tension, active emotional signals, and relationship-oriented memory history
-   Added companion memory pages for browsing, editing, pinning, cooling, and pruning companion-relevant memories
-   Added in-chat companion soul editing so a companion's authored personality and relational baseline can be refined without leaving the chat context
-   Added AI-assisted companion soul generation through the Companion Soul Writer workflow
-   Added companion download/setup pages and missing-model guidance so the companion stack can be installed more intentionally instead of silently failing
-   Added `lettuce-emb-v4` as the new memory embedder, improving retrieval quality for long-running chats and roleplay memory lookups
-   Added Kokoro-related speech flows and improved voice selection/management surfaces
-   Added whisper.cpp ASR integration work so local transcription flows can participate in the app’s voice pipeline
-   Expanded lorebook workflows with more creation, generation, preview, and management tooling across character and library flows

Companion Mode (Beta)

-   Added Companion Mode as a distinct chat mode for persistent relationship-driven conversations rather than scene-first roleplay
-   Companion chats now maintain live per-session emotional and relationship state rather than only relying on static character prompts
-   The companion runtime updates state from user turns before the assistant reply is generated, so the same turn’s response can reflect evolving closeness, trust, affection, tension, and emotional regulation
-   Companion prompting now has its own template path and injects companion-state context into the prompt when needed
-   Companion setup is gated by required local models, including embedding, emotion classification, NER, and routing, which makes failures easier to understand and recover from
-   Companion-specific inspection pages make the system more legible: users can now see relationship metrics, emotional vectors, and companion-oriented memory records instead of treating the mode as a black box
-   Companion turn effects and post-turn memory plumbing were added so background memory processing can be tied back to specific turns more clearly
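
As a rough illustration of the live relational state described in the list above, per-session metrics can be nudged by a small, bounded amount from each user turn before the reply is generated. This is only a sketch: the struct, field names, and weights are assumptions, not LettuceAI's actual data model.

```rust
/// Hypothetical per-session relationship state updated before each
/// assistant reply. Field names and weights are illustrative only.
#[derive(Debug, Clone, Copy)]
struct RelationshipState {
    closeness: f32, // 0.0..=1.0
    trust: f32,
    affection: f32,
    tension: f32,
}

impl RelationshipState {
    fn apply_turn_signal(&mut self, warmth: f32, conflict: f32) {
        // Nudge each metric by a small, bounded amount so a single turn
        // never swings the relationship wildly.
        self.closeness = clamp01(self.closeness + 0.05 * warmth);
        self.trust = clamp01(self.trust + 0.03 * warmth - 0.04 * conflict);
        self.affection = clamp01(self.affection + 0.04 * warmth);
        self.tension = clamp01(self.tension + 0.08 * conflict - 0.02 * warmth);
    }
}

fn clamp01(v: f32) -> f32 {
    v.clamp(0.0, 1.0)
}

fn main() {
    let mut state = RelationshipState { closeness: 0.4, trust: 0.5, affection: 0.4, tension: 0.1 };
    // Pretend an emotion classifier scored the latest user turn.
    state.apply_turn_signal(0.8, 0.1);
    println!("{state:?}"); // fed into the companion prompt template as context
}
```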

Voice, TTS & ASR

-   Added Kokoro TTS as a new speech-generation capability, including packaging, runtime handling, voice integration, and Android-specific eSpeak bundle work needed for reliable phoneme/voice support
-   Added whisper.cpp-based speech recognition and wired it into the broader runtime, giving the app a real local speech-input path
-   Voice management and selection behavior continued to improve across character creation and settings flows

Embedding Model v4

-   Added `lettuce-emb-v4` as the new embedding model for the memory layer, replacing the old weak roleplay-retrieval behavior with a roleplay-first embedder built for long-lived chats
-   The new model delivers a major retrieval-quality jump, with the v4 announcement reporting `0.924` recall@1 on its roleplay-memory benchmark versus `0.020` for v3
-   `lettuce-emb-v4` now uses native `768d` embeddings instead of a `512d` projected bottleneck, which removes a major quality constraint from the previous setup
-   The model supports Matryoshka slicing across `64 / 128 / 256 / 512 / 768` dimensions, so different devices can use different memory tiers without needing separate model families
-   In user terms, this is one of the biggest memory upgrades in the release: long chats, callbacks, recalled details, and roleplay continuity should all benefit from much better retrieval quality
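
Matryoshka embeddings like the dimensions listed above can be truncated to a leading prefix and re-normalized before cosine comparison, so smaller devices pay less memory per vector. A std-only sketch of that idea (not the model's actual API):

```rust
/// Truncate a Matryoshka-style embedding to `dim` leading components and
/// re-normalize to unit length so cosine similarity stays meaningful.
fn slice_embedding(full: &[f32], dim: usize) -> Vec<f32> {
    let mut v: Vec<f32> = full[..dim.min(full.len())].to_vec();
    let norm = v.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm > 0.0 {
        for x in &mut v {
            *x /= norm;
        }
    }
    v
}

fn cosine(a: &[f32], b: &[f32]) -> f32 {
    // Vectors are already unit-length, so the dot product is the cosine.
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

fn main() {
    let full_a = vec![0.1_f32; 768];
    let full_b = vec![0.2_f32; 768];
    // A low-memory device might keep only the first 256 dimensions.
    let a = slice_embedding(&full_a, 256);
    let b = slice_embedding(&full_b, 256);
    println!("cosine@256d = {:.3}", cosine(&a, &b));
}
```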

Memory & Runtime

-   Dynamic memory storage moved further toward normalized embedded memory records instead of relying on looser legacy summary-only flows
-   Session and memory hot paths were optimized with narrower DB updates and better high-message-count consistency, reducing the cost and fragility of frequent session writes
-   Post-turn background memory scheduling became more explicit, which helps long-running chats avoid overlapping or wasteful memory work
-   Companion-mode memory now layers companion interpretation and UI on top of the shared dynamic-memory engine, which improves visibility without forking the whole memory stack
-   Local runtime work continued in model routing, provider compatibility, and `llama.cpp` behavior, helping the app tolerate more real-world local setups
-   The `llama-cpp-rs` dependency was updated again late in this release cycle, keeping the vendored `llama.cpp` side current without requiring app-level API changes

Lorebooks, Creation & Content Workflows

-   Lorebook tooling expanded heavily across generation, preview, import, and management flows
-   Character creation and editing now better support companion-first authoring, including mode selection, companion prompts, and companion soul configuration
-   Companion and roleplay setup flows are now more clearly separated so users are not pushed through scene-first authoring when building a relationship-oriented character
-   Prompt-building and request-construction work continued across the chat stack, improving how character, lorebook, memory, and companion state are assembled before inference

Fixes & Stability

-   Improved session, settings, and memory persistence behavior so malformed or partial state is less likely to break the app or silently corrupt key flows
-   Hardened storage migrations and runtime state handling around newer memory and companion data structures
-   Reduced several sources of desktop runtime instability in local-model paths, especially around memory, prompting, and backend integration
-   Continued fixing packaging and setup regressions across Android and desktop speech/model flows
-   Improved internal consistency for chat branching, session copying, and related memory state carryover

Platform Notes

-   Desktop benefits most from the current companion inspection tooling because the relationship, memory, and soul pages are all deeper and easier to navigate in the desktop shell
-   Local-model runtime, offload, and memory-path hardening is especially important for desktop users running `llama.cpp`, ONNX, and mixed local/provider setups

Notable Technical Themes

-   The biggest new product feature in this release is Companion Mode: a distinct companion architecture with authored soul configuration, live relational state, and companion-specific inspection tooling
-   The voice stack now includes genuinely new product surface area, with Kokoro TTS and local speech recognition landing alongside the Android speech-packaging work needed to support them
-   The memory layer also got a major product-level upgrade through `lettuce-emb-v4`, which turns embedding quality itself into a visible feature improvement rather than only an internal model swap
-   Memory and storage work in this range focused on making long-running sessions more reliable and structurally sound instead of only adding more visible features
-   Much of the local AI work is release-hardening: better routing, better background processing, better packaging, and fewer fragile assumptions about runtime state

[View full release on GitHub →](https://github.com/LettuceAI/app/releases)

Android · Desktop · 1.5.1 / 1.2.1 · April 13, 2026

## Dynamic Memory Expansion, Local Tooling Resilience & Logging

This release heavily expands Dynamic Memory for local models, makes local tool calling and settings recovery more tolerant, improves diagnostics, and fixes several character and group chat setup regressions on desktop.

-   Dynamic Memory was heavily expanded for local models with a separate local memory-manager template, experimental recursive memory loops, configurable loop caps, richer lifecycle logging, and improved revert behavior
-   Dynamic Memory debugging is much stronger: raw cycle payloads can now be captured and inspected, and malformed local tool arguments are normalized before validation
-   `llama.cpp` local tool calling became more tolerant again by dropping the hard dependency on native parser metadata and falling back more gracefully
-   Settings and model configuration loading became more resilient, reducing cases where malformed model data could break provider visibility or onboarding flows
-   Rust panics now produce dedicated panic report files instead of only blending into the main app log
-   Character and group-chat setup regressions were fixed, including broken persistence for character group-chat prompt selections and the non-scrollable group setup page on smaller or scaled desktop displays

User-Facing Features

-   Dynamic Memory now includes an experimental `Recursive Memory Loops` mode for stepwise tool execution until the model signals completion
-   The recursive loop hard cap is configurable instead of being fixed internally
-   Dynamic Memory activity logs have improved revert UX and can reconstruct state more accurately after reverts
-   A separate protected prompt template now exists for local model Dynamic Memory manager behavior, and local providers are routed to it automatically
-   Developer-mode memory logs can now expose raw Dynamic Memory step payloads for debugging malformed local model outputs

Fixes & Stability

-   Character settings now correctly persist group chat conversation and roleplay prompt-template selections
-   Group chat creation and setup pages now scroll correctly on smaller or high-scale desktop displays because the nested flex layout no longer traps the viewport
-   Group chat header top padding was corrected to avoid a double-offset layout issue
-   Settings persistence and frontend settings parsing now salvage valid provider and model state more defensively when individual rows are malformed
-   `llama.cpp` local tool parsing no longer fails early just because a template lacks native parser metadata
-   Local malformed tool argument formats such as parameter-tag wrappers are normalized before they reach Dynamic Memory validation
-   Rust panic handling now writes separate panic logs with backtraces for easier post-crash diagnosis

Dynamic Memory & Local AI

-   Local Dynamic Memory can now run in recursive tool loops rather than a single pass, which helps weaker local models that prefer iterative tool usage
-   Recursive loop execution now emits clearer lifecycle logs covering configuration, per-iteration progress, and stop reasons
-   Raw Dynamic Memory tool-call payloads and per-step responses can be retained for developer inspection
-   Revert now restores memory summaries and related derived state instead of only removing memory entries
-   Revert UI behavior in memory activity logs was refined to make cycle rollback clearer and safer
-   Local memory-manager prompt infrastructure was split so local models can use their own protected template without changing non-local providers
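
A simplified sketch of the bounded tool-execution loop described in the list above: the local model is asked repeatedly for a memory action until it signals completion or a configurable hard cap is hit. The enum and function names here are illustrative, not the app's real interfaces.

```rust
/// Illustrative shape of a capped recursive tool loop for a local
/// memory manager. Names and the step shape are hypothetical.
enum MemoryStep {
    Tool { name: String, args: String },
    Done,
}

fn run_memory_loop(max_iterations: usize) -> usize {
    let mut executed = 0;
    for iteration in 0..max_iterations {
        // In the real flow this would prompt the local model with the
        // memory-manager template plus the results of previous tool calls.
        let step = next_step_from_model(iteration);
        match step {
            MemoryStep::Done => {
                println!("model signalled completion at iteration {iteration}");
                break;
            }
            MemoryStep::Tool { name, args } => {
                println!("iteration {iteration}: executing {name}({args})");
                executed += 1;
            }
        }
    }
    executed
}

// Stand-in for the model call; finishes after two tool steps.
fn next_step_from_model(iteration: usize) -> MemoryStep {
    if iteration < 2 {
        MemoryStep::Tool { name: "add_memory".into(), args: "{\"text\":\"...\"}".into() }
    } else {
        MemoryStep::Done
    }
}

fn main() {
    // A configurable hard cap keeps weaker local models from looping forever.
    let steps = run_memory_loop(6);
    println!("executed {steps} tool steps");
}
```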

Diagnostics & Logging

-   Dedicated panic report files are now generated for Rust panics with timestamp, thread, payload, location, and backtrace data
-   Dynamic Memory logging now includes raw tool-call capture and recursive-loop execution tracing
-   Settings read and write logging became much more explicit, including provider and model counts plus transaction rewrite details, which helped diagnose provider configuration failures
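
Dedicated panic reports of the kind mentioned above can be produced with a standard-library panic hook. A minimal sketch, assuming a one-file-per-panic layout (the path and format are assumptions, not LettuceAI's actual report structure):

```rust
use std::fs;
use std::panic;
use std::thread;
use std::time::{SystemTime, UNIX_EPOCH};

/// Install a hook that writes a separate panic report file with timestamp,
/// thread, payload, and location data instead of only logging to the app log.
fn install_panic_reporter() {
    panic::set_hook(Box::new(|info| {
        let ts = SystemTime::now()
            .duration_since(UNIX_EPOCH)
            .map(|d| d.as_secs())
            .unwrap_or(0);
        let thread_name = thread::current().name().unwrap_or("unnamed").to_string();
        let location = info
            .location()
            .map(|l| format!("{}:{}", l.file(), l.line()))
            .unwrap_or_else(|| "unknown".into());
        let payload = info
            .payload()
            .downcast_ref::<&str>()
            .map(|s| s.to_string())
            .or_else(|| info.payload().downcast_ref::<String>().cloned())
            .unwrap_or_else(|| "non-string panic payload".into());
        let report = format!(
            "timestamp: {ts}\nthread: {thread_name}\nlocation: {location}\npayload: {payload}\n"
        );
        // One file per panic; a real implementation would also capture a backtrace.
        let _ = fs::write(format!("panic-{ts}.log"), report);
    }));
}

fn main() {
    install_panic_reporter();
    // panic!("demo"); // uncommenting would produce panic-<timestamp>.log
    println!("panic reporter installed");
}
```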

Desktop-Specific UX Fixes

-   The group chat creation flow now uses a proper nested scroll container instead of behaving like a second full-screen document inside the app shell
-   Character prompt override selections for group chat no longer appear to save and then silently reset when the edit page is reopened

Notable Technical Themes

-   Dynamic Memory shifted from a mostly single-pass workflow toward a more instrumented, iterative, and locally debuggable execution model
-   Local-model tool compatibility work focused on being more permissive with malformed or partially supported outputs instead of failing fast
-   Several fixes in this release were release-hardening patches driven by real-world failures rather than net-new product surface

[View full release on GitHub →](https://github.com/LettuceAI/app/releases)

Android · Desktop · 1.5.0 / 1.2.0 · April 13, 2026

## Prompt Template Upgrades, Guided Onboarding & Sync Reliability

This update expands prompt-template control, improves Dynamic Memory and local runtime reliability, adds a guided onboarding system, and hardens sync, backup, and state consistency across the app.

-   Prompt templates are now typed and validated through a new backend-driven parameter engine
-   Group chats gained editable prompt templates plus character-specific conversation and roleplay overrides
-   Dynamic Memory is more resilient when structured output fails, with safer cancellation and stronger validation behavior for reasoning-capable models
-   First-run onboarding has been replaced with a proper guided tour system across setup and early chat flows
-   Sync, backup, and reload behavior were hardened to preserve newer schema fields, media references, and memory progress more reliably
-   Turkish and Simplified Chinese are now supported, alongside broader localization coverage across the app

Prompts & Group Chat

-   Added typed prompt templates with backend-driven validation
-   Added a new backend parameter engine for prompt templates
-   Added clearer required-variable and allowed-image-slot handling for templates
-   Added editable group chat prompt templates
-   Added character-specific prompt overrides for group chats
-   Added support for separate group conversation and roleplay prompt overrides per character
-   Improved prompt and template compatibility across the app
-   Fixed protected prompt template type handling
-   Improved the prompt template editor empty state
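
Required-variable validation like the items above can be sketched as a simple scan for unsatisfied placeholders before a template is rendered. The `{{name}}` syntax matches the placeholders used elsewhere in these notes; the function itself is illustrative, not the app's actual parameter engine.

```rust
use std::collections::HashSet;

/// Return the required variables that a template references but the caller
/// did not supply. Placeholder syntax `{{name}}` is assumed for illustration.
fn missing_variables(template: &str, provided: &HashSet<&str>) -> Vec<String> {
    let mut missing = Vec::new();
    let mut rest = template;
    while let Some(start) = rest.find("{{") {
        let after = &rest[start + 2..];
        if let Some(end) = after.find("}}") {
            let name = after[..end].trim();
            if !name.is_empty()
                && !provided.contains(name)
                && !missing.iter().any(|m: &String| m.as_str() == name)
            {
                missing.push(name.to_string());
            }
            rest = &after[end + 2..];
        } else {
            break;
        }
    }
    missing
}

fn main() {
    let template = "You are {{char}}. The user is {{user}}. Scene: {{scene}}";
    let provided: HashSet<&str> = ["char", "user"].into_iter().collect();
    // Validation fails fast instead of silently rendering an incomplete prompt.
    println!("missing: {:?}", missing_variables(template, &provided)); // ["scene"]
}
```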

Dynamic Memory

-   Added a configurable structured fallback format setting
-   Improved Dynamic Memory fallback behavior when ideal structured output fails
-   Added better cancellation handling for active Dynamic Memory requests
-   Fixed stale Dynamic Memory runs continuing after cancel
-   Stripped reasoning and thinking tags before summary validation
-   Improved Dynamic Memory validation reliability for reasoning-capable models
-   Preserved Dynamic Memory state more reliably on session saves
-   Improved group memory update safety by reloading latest state before applying updates
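
Stripping reasoning output before validating a memory summary, as described in the list above, can be as simple as removing `<think>…</think>` spans. A std-only sketch; the single tag name is an assumption, while real handling covers several variants:

```rust
/// Remove `<think>...</think>` blocks so reasoning text from
/// reasoning-capable models does not leak into summary validation.
fn strip_thinking(input: &str) -> String {
    let (open, close) = ("<think>", "</think>");
    let mut out = String::with_capacity(input.len());
    let mut rest = input;
    while let Some(start) = rest.find(open) {
        out.push_str(&rest[..start]);
        match rest[start..].find(close) {
            Some(rel_end) => rest = &rest[start + rel_end + close.len()..],
            None => return out, // unterminated block: drop the tail
        }
    }
    out.push_str(rest);
    out
}

fn main() {
    let raw = "<think>plan the summary step by step</think>Alice moved to the northern camp.";
    assert_eq!(strip_thinking(raw), "Alice moved to the northern camp.");
    println!("{}", strip_thinking(raw));
}
```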

Local AI, Providers & Runtime Stability

-   Improved llama.cpp tool-call diagnostics
-   Added XML fallback parsing for malformed local tool-call outputs
-   Added raw-output recovery for local tool calls
-   Fixed non-streamed llama.cpp tool calls using the wrong parsing path
-   Improved local tool-call reliability for weaker or imperfect outputs
-   Improved Ollama whitespace handling in streamed reasoning output
-   Improved Ollama whitespace handling in native streaming deltas
-   Added proper abort support for non-streaming Ollama requests
-   Improved Gemini chat handling and overall stability
-   Fixed Gemini thinking and reasoning controls to better match model families
-   Improved cross-provider chat-state stability and fallback handling

Onboarding, Sync & Data Integrity

-   Replaced the old first-run tooltip with a proper guided tour system
-   Added guided onboarding for first run, chat detail, and post-first-message flows
-   Added a long-press hint step to the post-first-message tour
-   Added a way to reset guided tours for retesting or reuse
-   Improved the local GGUF model setup flow
-   Prevented onboarding from continuing with empty model drafts
-   Skipped guided tours after backup restore so restored installs are not treated like fresh installs
-   Improved sync and backup compatibility with the current storage schema
-   Fixed newer fields not being preserved correctly across sync, export, and import
-   Improved preservation of character design metadata in backup and sync
-   Improved preservation of group chat prompt override references in backup and sync
-   Improved preservation of session background paths and memory progress state
-   Improved preservation of group-session memory progress state
-   Expanded sync asset collection for additional referenced media
-   Improved handling of session backgrounds, design reference images, and lorebook avatar references during sync
-   Fixed prompt template entry export handling in backup logic

Reliability, Localization & Polish

-   Fixed regenerated chat variants getting out of sync after refresh
-   Improved session-state consistency after reloads
-   Improved memory-state consistency after saves and updates
-   Reduced cases where stale state could overwrite newer chat or memory data
-   Improved overall reliability across chat, group chat, prompt resolution, and memory flows
-   Added support for query-based API key requirements
-   Improved provider configuration behavior in onboarding and settings
-   Better handled provider capability differences in settings flows
-   Added Turkish language support
-   Added Simplified Chinese language support
-   Added locale icons for Turkish and Simplified Chinese
-   Added many missing translation keys across existing locales
-   Improved localization coverage across onboarding, prompts, settings, and other UI flows
-   Improved onboarding clarity and first-use guidance
-   Improved prompt editing UX
-   Improved debugging and recovery behavior around model and tool failures
-   Added general polish across chat, memory, onboarding, and provider flows

Changes

-   Removed device TTS integration
-   Reduced unnecessary dependency and build surface in some runtime paths

[View full release on GitHub →](https://github.com/LettuceAI/app/releases)

Android · Desktop · 1.4.1 / 1.1.1 · April 6, 2026

## macOS Titlebar Fixes, Onboarding Cleanup & Expanded Logs

A small follow-up update focused on desktop polish, local model onboarding consistency, and better logging visibility.

Fixes

-   The macOS title bar is visible again
-   Local model onboarding was reworked to match the built-in `llama.cpp` flow instead of creating a fake onboarding provider
-   Exported logs now include SQLite activity and pool status
-   Removed the local LLM button from mobile onboarding, a mobile-only cleanup with no desktop-facing change

[View full release on GitHub →](https://github.com/LettuceAI/app/releases)

Android · Desktop · 1.4.0 / 1.1.0 · April 6, 2026

## Local AI Expansion, Dynamic Memory Upgrades & Desktop UX Overhaul

This release is a major update focused on local AI, Dynamic Memory, desktop UX, and reliability.

-   Native **Ollama** integration joins a much more capable built-in `llama.cpp` runtime with smarter GPU offload, CPU-safe fallbacks, and better local tool calling
-   Dynamic Memory now has clearer progress, stronger safeguards, revert support, and more reliable recovery from stale runs
-   Local inference adds runtime fallback reports and improved CPU fallback context and batch clamping
-   Prompt caching, chat settings, and session controls were expanded with cache-aware routing, better live-session sync, and a redesigned settings flow
-   Desktop gets custom frameless window chrome, wider TopNav adoption, stronger theme-token coverage, and a redesigned Logs page
-   Models, onboarding, networking, and platform support all received another broad pass for setup quality and runtime reliability

Local AI & Runtime

-   Added native Ollama integration and support for native Ollama tool call payloads
-   Updated `llama.cpp` and related bindings, with better local stability and runtime behavior
-   Reused loaded local models across concurrent requests to reduce unnecessary reloads
-   Added runtime fallback reports for local inference
-   Improved CPU fallback context and batch clamping, plus CPU-safe auto-context behavior on CPU runtimes
-   Fixed several CPU-only safety issues for local inference
-   Added smart GPU layer offload and strict mode overrides
-   Added GPU layer split visibility, sampler order presets, and additional reasoning settings in the model editor
-   Improved model load progress reporting and embedded template rendering via `oaicompat`
-   Improved fallback behavior across local chat templating and rendering paths
-   Hardened and stabilized local tool calling, including structured-output failure handling
-   Shared local backend usage between runtime and context-info paths
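
The CPU-safe clamping mentioned above can be thought of as capping requested context and batch sizes against conservative limits when no GPU offload is available. A simplified sketch with invented caps, not llama.cpp's actual defaults:

```rust
/// Illustrative clamp of context and batch sizes for CPU-only runs.
#[derive(Debug)]
struct RuntimeParams {
    n_ctx: u32,
    n_batch: u32,
}

fn clamp_for_cpu(requested: RuntimeParams, gpu_layers: u32) -> RuntimeParams {
    if gpu_layers > 0 {
        // With GPU offload we keep the requested values.
        return requested;
    }
    // On pure CPU, large contexts and batches blow up memory and latency,
    // so fall back to safer caps instead of failing at load time.
    RuntimeParams {
        n_ctx: requested.n_ctx.min(8192),
        n_batch: requested.n_batch.min(256),
    }
}

fn main() {
    let requested = RuntimeParams { n_ctx: 32_768, n_batch: 2048 };
    let effective = clamp_for_cpu(requested, 0);
    println!("{effective:?}"); // RuntimeParams { n_ctx: 8192, n_batch: 256 }
}
```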

Dynamic Memory & Prompting

-   Added a progress bar for Dynamic Memory cycles
-   Allowed cancelling stale non-idle memory runs
-   Reconciled stale processing state on load
-   Improved Dynamic Memory UI state handling
-   Switched memory fallback protocol from JSON to XML
-   Hardened local tool fallback and repair logging
-   Added a llama sampler overwrite toggle for Dynamic Memory and increased its overwrite temperature
-   Hardened deletion safeguards and preset behavior
-   Added deleted memory text to tool logs
-   Added revert support for memory activity cycles
-   Preserved Dynamic Memory settings during backup restore
-   Added prompt cache TTL and sticky routing
-   Completed prompt caching support
-   Improved cache-aware usage and pricing tracking
-   Added a local RP default prompt template
-   Improved prompt and template compatibility with embedded GGUF templates
-   Supported USC system prompt imports
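
Cache TTL and sticky routing, as listed above, amount to remembering which provider still holds a session's cached prompt prefix and expiring that preference after a while. A rough sketch under those assumptions; the struct, field names, and TTL value are all illustrative:

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Illustrative prompt-cache bookkeeping: remember which provider holds a
/// cached prefix for a session and expire that stickiness after a TTL.
struct PromptCacheRouter {
    ttl: Duration,
    sticky: HashMap<String, (String, Instant)>, // session id -> (provider, last use)
}

impl PromptCacheRouter {
    fn new(ttl: Duration) -> Self {
        Self { ttl, sticky: HashMap::new() }
    }

    /// Prefer the provider that still holds this session's cached prompt prefix.
    fn route(&mut self, session_id: &str, default_provider: &str) -> String {
        let now = Instant::now();
        let still_valid = match self.sticky.get(session_id) {
            Some((provider, last)) if now.duration_since(*last) < self.ttl => Some(provider.clone()),
            _ => None,
        };
        if let Some(provider) = still_valid {
            return provider;
        }
        self.sticky
            .insert(session_id.to_string(), (default_provider.to_string(), now));
        default_provider.to_string()
    }
}

fn main() {
    let mut router = PromptCacheRouter::new(Duration::from_secs(300));
    println!("{}", router.route("chat-1", "provider-a")); // provider-a (new entry)
    println!("{}", router.route("chat-1", "provider-b")); // provider-a (sticky, within TTL)
}
```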

Chat, Scenes & Roleplay Tools

-   Added a chat settings drawer
-   Redesigned session advanced settings
-   Simplified session advanced settings in some flows
-   Restored footer focus correctly after closing the drawer
-   Synced edited messages back into the live session cache after failures and cancels
-   Added support for combined `_**bold italic**_` markdown emphasis
-   Ignored inline image tags when scene generation is unavailable
-   Added support for using chat background as a scene reference
-   Added a session-specific chat background picker
-   Added lorebook keyword detection modes and migration support for the new detection mode
-   Fixed lorebook query column alignment and schema issues
-   Improved unicode thinking parsing related to lorebooks

Desktop UX, Models & Visibility

-   Added a runability score for `llama.cpp` models and redesigned the model editor for better horizontal space usage
-   Kept the model editor on the same page after saving
-   Added a local LLM setup flow to onboarding
-   Improved recommended model installation flow
-   Fixed linking `mmproj` downloads before model creation and auto-creation of recommended installs from queue metadata
-   Unified model selector bottom menus
-   Redesigned the Logs page
-   Added full DB operation logging
-   Added `Ctrl+Shift+L` shortcut for logs
-   Hardened logs against errors
-   Improved mobile overflow and desktop layout behavior in logs
-   Added copy line(s) to the logs context menu
-   Normalized chat debug events for better parser compatibility
-   Added better message, load, and runtime visibility across the app
-   Added custom frameless titlebar and window decorations
-   Added window controls and drag regions to more pages
-   Migrated discovery pages to TopNav
-   Added window controls to chat sub-pages
-   Eliminated empty-state flashes during navigation
-   Connected more chat, sheet, and group chat surfaces to theme color tokens
-   Added About page GitHub icon support
-   Introduced centralized toggle-switch UI components during this cycle
-   Added session background selection from chat UI

Platform, Networking & Reliability

-   Added LAN OpenAI gateway and Lettuce Host provider
-   Renamed LAN Host API to API Server
-   Deferred timed-out OpenRouter pricing refreshes instead of failing inline
-   Added iOS ONNX Runtime installer workflow
-   Improved Windows DXGI adapter and video-memory checks
-   Improved Windows Vulkan VRAM estimation clamps
-   Repaired macOS ONNX dylib acceptance
-   Fixed valid macOS title bar style enum usage for Tauri
-   Updated dependencies and runtime libraries
-   Expanded supported thinking tag variants
-   Normalized thinking tags across API and local responses
-   Improved stacked toast behavior
-   Routed persona flows through the library
-   Removed the legacy Personas page

[View full release on GitHub →](https://github.com/LettuceAI/app/releases)

Android · Desktop · 1.3.3 / 1.0.3 · March 29, 2026

## Sync Reliability, Provider Streaming Controls & Timeout Consistency

This update fixes a sync regression affecting some devices, adds per-provider streaming controls, and standardizes API request timeouts across the app.

Fixes

-   Fixed a sync issue where some devices failed to apply data from other devices due to replaying stale local sync payloads from older app versions
-   After updating, the app rebuilds its local sync state once and continues syncing using the current data format

Changes

-   Providers now have independent streaming toggles
-   Streaming can be enabled or disabled per officially supported provider
-   Features that require non-streaming, such as dynamic memory flows, continue to enforce it where needed

Improvements

-   Normalized API request timeouts to **30 minutes** across the app
-   This removes inconsistent timeout behavior between chat, memory, creation, group chat, and transport flows
-   Improves reliability for long-running requests
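
One way to standardize a timeout like this is a single constant applied wherever the HTTP client is built. A minimal sketch using `reqwest`; the constant name and the single-client setup are assumptions for illustration:

```rust
use std::time::Duration;

/// Single source of truth for the API request timeout used by every flow
/// (chat, memory, creation, group chat, transport). Name is illustrative.
const API_REQUEST_TIMEOUT: Duration = Duration::from_secs(30 * 60);

fn build_client() -> Result<reqwest::Client, reqwest::Error> {
    reqwest::Client::builder()
        // Applies to the whole request, covering slow long-running generations.
        .timeout(API_REQUEST_TIMEOUT)
        .build()
}

fn main() {
    let client = build_client().expect("failed to build HTTP client");
    println!("client configured with {:?} timeout", API_REQUEST_TIMEOUT);
    let _ = client;
}
```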

[View full release on GitHub →](https://github.com/LettuceAI/app/releases)

Android · Desktop · 1.3.2 / 1.0.2 · March 27, 2026

## Security Fixes, Image Generation Providers & System Prompt Tools

This update fixes two medium-severity security issues and adds new image generation integrations, prompt tooling, and a set of quality-of-life improvements across Android and Desktop.

Security

-   Fixed a backup import path traversal issue that could allow arbitrary file writes
-   Fixed a local media path traversal issue that could allow unintended file reads, writes, or deletions
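
Path traversal fixes of this kind typically validate that an untrusted path cannot escape its allowed root before any read, write, or delete. A generic sketch of the pattern, not the actual patch:

```rust
use std::io;
use std::path::{Component, Path, PathBuf};

/// Resolve `untrusted` (e.g. a file name from a backup archive) against
/// `root` and reject anything that could escape the root directory.
fn resolve_inside(root: &Path, untrusted: &str) -> io::Result<PathBuf> {
    let candidate = Path::new(untrusted);
    // Reject absolute paths and parent-directory components outright.
    if candidate.is_absolute()
        || candidate.components().any(|c| matches!(c, Component::ParentDir))
    {
        return Err(io::Error::new(
            io::ErrorKind::InvalidInput,
            "path escapes the allowed directory",
        ));
    }
    Ok(root.join(candidate))
}

fn main() {
    let root = Path::new("/app/media");
    assert!(resolve_inside(root, "avatars/char1.png").is_ok());
    assert!(resolve_inside(root, "../../etc/passwd").is_err());
    println!("traversal attempts are rejected before any file I/O");
}
```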

New

-   Added AUTOMATIC1111 and Stability AI support for image generation
-   Added an update checker
-   Added Injection Rules for System Prompts
-   Added inline code text coloring
-   Added the ability to delete images
-   Added Scene Generation modes: `manual`, `ask first`, and `automatic`
-   Added an About App page in Settings
-   Added Debug Mode
-   Added a Reddit button

Improvements

-   Redesigned the Image Generation Settings page
-   Redesigned the System Prompts entry editor

Fixes

-   AI reference drafts now inherit model settings
-   Fixed an issue where the UI could get stuck if generation was canceled mid-process
-   Character import now accepts `.uec` files again

[View full release on GitHub →](https://github.com/LettuceAI/app/releases)

Android · Desktop · 1.3.1 / 1.0.1 · March 23, 2026

## Stability Fixes, Scene Writing Options & PNG Character Cards

This release focuses on bug fixes, small quality-of-life upgrades, and a few targeted additions for scene writing, reference text generation, and character card compatibility.

-   Added Scene Description Writer and Reference Text Writer LLM options
-   Added support for PNG-based Character Cards
-   Fixed message images, scene toggles, sync loading, and text color application issues
-   Improved unsaved-changes toast behavior and chat appearance preview consistency on mobile

Fixes

-   Fixed memories resetting when pressing Enter during editing
-   Bundled feedback sounds directly into the app binaries
-   Fixed the Sync page so it loads correctly again
-   Fixed Character Creation reset behavior after creation
-   Fixed regenerated images so they display correctly inside messages
-   Fixed scene generation so it respects the disable toggle
-   Fixed text color application on the Colors settings page

Improvements

-   Unsaved changes toasts now stay dismissed until the next leave attempt
-   The Chat Appearance preview now uses the same overlay as scene editing on mobile

New

-   Added a Scene Description Writer LLM option for better scene writing
-   Added a Reference Text Writer LLM option
-   Added support for PNG-based Character Cards

[View full release on GitHub →](https://github.com/LettuceAI/app/releases)

Android · Desktop · 1.3.0 / release · March 22, 2026

## Images 2.0, Built-In Local AI, Sync 2.0 & Full Chat Customization

This release overhauls LettuceAI's image system, expands the local model ecosystem with built-in llama.cpp and a hardware-aware HuggingFace browser, rewrites sync, and ships a broad set of chat, UI, performance, and architecture upgrades across Android and Desktop.

-   Images 2.0 redesign with scene-based generation, reusable library assets, and avatar editing
-   Built-in llama.cpp runtime with GPU support, image generation, and native tool calling
-   New HuggingFace GGUF browser with hardware-aware compatibility estimates
-   Large chat upgrade covering group chats, templates, memory, and streaming reliability
-   Full sync rewrite, deeper UI customization, and a major internal architecture refactor

Images 2.0

-   Added full image generation support for chat and scenes
-   Introduced Image Language so any LLM can trigger image generation by writing scene prompts after responses
-   Unified all avatars, backgrounds, and generated images inside one reusable library
-   Added avatar generation and avatar editing with image models
-   Added reusable reference images and text for Characters and Personas during scene generation

Local AI & Model Ecosystem

-   Added a built-in llama.cpp runtime with support for NVIDIA, AMD, Intel GPUs, and Apple Silicon
-   Expanded runtime customization for local inference workflows
-   Added image generation and tool calling support to the local runtime
-   Added a HuggingFace model browser for searching and exploring GGUF models
-   Introduced hardware-aware compatibility checks with estimates for context length, quantization, and KV cache usage
-   Added a detailed scoring breakdown for model recommendations
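
Hardware-aware estimates like the ones above boil down to arithmetic over model metadata. A rough sketch of a KV cache size estimate, with the formula simplified and the example numbers invented; real estimates also account for quantized KV caches and runtime overhead:

```rust
/// Rough KV cache size estimate for a transformer model, used to judge
/// whether a given context length fits in available memory.
fn kv_cache_bytes(n_layers: u64, n_kv_heads: u64, head_dim: u64, n_ctx: u64, bytes_per_elem: u64) -> u64 {
    // Keys and values are both stored, hence the factor of 2.
    2 * n_layers * n_kv_heads * head_dim * n_ctx * bytes_per_elem
}

fn main() {
    // Example numbers roughly in the shape of an 8B-class model with GQA.
    let bytes = kv_cache_bytes(32, 8, 128, 8192, 2); // fp16 KV cache
    println!("estimated KV cache: {:.2} GiB", bytes as f64 / (1024.0 * 1024.0 * 1024.0));
    // Compare against free VRAM/RAM to produce a compatibility score.
}
```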

Chat & Roleplay System

-   Increased the default max output token limit to 2048
-   Added proper memory rewind when branching chats and scene editing per session
-   Improved streaming stability, abort handling, and multimodal attachment reliability
-   Reworked group chats with configurable speaker selection modes for LLM, heuristic, and round-robin turn management
-   Added per-character mute, lorebooks, pinned messages, and typing haptics in group chats
-   Added reusable chat templates for preconfigured single-chat setups
-   Improved Dynamic Memory with missing-tag repair, cancelable memory cycles, and a no-tool-calling mode for unsupported models
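
The speaker selection modes mentioned in the list above can be sketched as a small strategy switch; this is only an illustration of the general shape, with the LLM mode stubbed out and all names hypothetical:

```rust
/// Illustrative next-speaker selection for group chats.
enum SpeakerMode {
    RoundRobin,
    Heuristic,
    Llm,
}

fn next_speaker<'a>(
    mode: &SpeakerMode,
    characters: &'a [&'a str],
    last_index: usize,
    last_user_message: &str,
) -> &'a str {
    match mode {
        // Simple rotation through the character list.
        SpeakerMode::RoundRobin => characters[(last_index + 1) % characters.len()],
        // Prefer a character who was @mentioned or named in the last message.
        SpeakerMode::Heuristic => characters
            .iter()
            .find(|c| last_user_message.to_lowercase().contains(&c.to_lowercase()))
            .copied()
            .unwrap_or(characters[(last_index + 1) % characters.len()]),
        // In the real app this mode would ask the model to pick the next speaker.
        SpeakerMode::Llm => characters[(last_index + 1) % characters.len()],
    }
}

fn main() {
    let cast = ["Mira", "Tobias", "Wren"];
    println!("{}", next_speaker(&SpeakerMode::RoundRobin, &cast, 0, "")); // Tobias
    println!("{}", next_speaker(&SpeakerMode::Heuristic, &cast, 0, "What do you think, Wren?")); // Wren
}
```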

UI / UX & Customization

-   Added a full chat appearance system with controls for font size, text colors, card colors, background blur, and more
-   Added multiple appearance presets for faster setup
-   Redesigned chat history, persona editor, character editor, and model editor
-   Added grid view support in the model browser and persona nicknames
-   Added full multi-language support with auto-detection and a language selector

Sync, Storage & Data

-   Rebuilt sync to compare client state and transfer only missing or outdated data instead of sending everything
-   Reduced bandwidth use and improved sync reliability with the new diff-based flow
-   Added chat package import and export support
-   Added SillyTavern `.jsonl` import support
-   Unified export flows for lorebooks, system prompts, and model configurations
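
The diff-based sync flow above compares per-item manifests and transfers only what is missing or outdated. A simplified sketch of that comparison, using an invented manifest shape (item id plus a version counter):

```rust
use std::collections::HashMap;

/// Minimal manifest diff: each side lists item ids with a version counter,
/// and only missing or outdated ids are requested from the other device.
fn items_to_fetch(local: &HashMap<String, u64>, remote: &HashMap<String, u64>) -> Vec<String> {
    remote
        .iter()
        .filter(|(id, remote_version)| match local.get(id.as_str()) {
            Some(local_version) => local_version < *remote_version,
            None => true, // item does not exist locally yet
        })
        .map(|(id, _)| id.clone())
        .collect()
}

fn main() {
    let local: HashMap<String, u64> =
        [("char-1".to_string(), 3), ("session-9".to_string(), 7)].into_iter().collect();
    let remote: HashMap<String, u64> = [
        ("char-1".to_string(), 5),     // updated elsewhere -> fetch
        ("session-9".to_string(), 7),  // identical -> skip
        ("lorebook-2".to_string(), 1), // missing locally -> fetch
    ]
    .into_iter()
    .collect();
    let mut to_fetch = items_to_fetch(&local, &remote);
    to_fetch.sort();
    println!("{to_fetch:?}"); // ["char-1", "lorebook-2"]
}
```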

Platform & Performance

-   Experimental iOS and macOS support is now available, though some features remain unstable
-   Optimized Android ONNX Runtime packaging and added a crash logger for fallback logging coverage
-   Refactored image delivery to use the Tauri Asset Protocol instead of IPC, reducing memory use and lag
-   Reduced UI jank in image-heavy flows and the HuggingFace browser
-   Improved lazy loading, rendering performance, and GPU fallback stability

Internal Architecture

-   Modularized the chat system into execution, memory, scene generation, and reply-helper layers
-   Added typed internal persistence and removed legacy command hops
-   Reorganized the app around feature-based module grouping with cleaner bootstrap boundaries

Fixes & Stability

-   Fixed provider credential routing issues
-   Fixed chat resend and duplicate-message behavior
-   Fixed scene and lorebook import bugs
-   Improved accessibility contrast and layout behavior
-   Improved mobile keyboard handling
-   Improved crash logging reliability

[View full release on GitHub →](https://github.com/LettuceAI/app/releases)

Android · Desktop · 1.2.0 / Beta 4 · February 15, 2026

## Desktop UX Overhaul, Prompt Runtime Controls, Dynamic Memory Upgrades & ONNX Reliability

This release brings major investment in desktop UX and the character creation flow, a significantly expanded prompt system, broad chat stability and performance hardening, and continued ONNX runtime reliability work across desktop, Android, and Windows.

-   Major investment in desktop UX, especially the Create Character flow
-   Significant expansion of the prompt system with a Prompt Structure Viewer and runtime prompt injection controls
-   Broad chat stability and performance hardening with large dynamic-memory improvements
-   Continued embedding and ONNX runtime reliability work across desktop, Android, and Windows
-   Expanded provider and model ecosystem, including NVIDIA NIM, plus safer import behavior

Desktop UI and Character Creation Redesign

-   Added responsive desktop layouts across character creation steps
-   Reworked character creation step order to improve setup flow
-   Redesigned character extras inputs for clearer, faster editing
-   Improved character create/edit with fallback-model selector support
-   Fixed create-step ordering and navigation consistency issues
-   Added metadata handling improvements for imported cards and avatar URL behavior
-   Added lorebook import support in character creation workflows

Prompt System Upgrades

-   Added Prompt Structure Viewer in the system prompt editor to preview message composition
-   Added conditional prompt injection mode
-   Added interval prompt injection mode
-   Added runtime option to condense prompts into a single system message
-   Fixed prompt import behavior to correctly respect `prompt_order`
-   Fixed drag-and-drop reorder bugs in the prompt entry editor
-   Improved prompt import UX and editor predictability

Chat, Group Chat, and UI

-   Added shared ChatLayout for persistent background behavior across chat sub-routes
-   Added shared GroupChatLayout with lifted data loading
-   Added branch-to-group-chat action from message actions
-   Added lorebook usage visibility per message
-   Added safe-area padding fixes for chat footer and bottom menu
-   Removed unwanted dark overlay above background images
-   Fixed chat search back-button sizing and related UI polish
-   Improved session back-stack handling across settings/history navigation
-   Fixed persona selection conflicts during scroll interactions
-   Removed duplicate dismiss controls in chat memories error state

Chat Stability and Performance

-   Fixed dynamic-memory listener leak during async chat setup
-   Bounded attachment cache growth in session hooks
-   Ignored stale attachment loads after chat state transitions
-   Fixed cleanup of jump-to-message RAF and timeout resources
-   Improved message memo checks with derived display props
-   Reduced attachment diff cost in chat memoization
-   Added fallback model retry logic with usage attribution
-   Disabled fallback attempts when no fallback model is configured
-   Added swap-places mode with role-aware generation
-   Reverted one streaming animation performance change after validation feedback

Dynamic Memory System

-   Added cursor-delta summarization of new messages
-   Added self-healing cursor behavior after deletes/rewinds
-   Added deduplication by cosine similarity at memory creation
-   Added adaptive decay rate based on access count
-   Added category tagging for memories
-   Added hybrid retrieval using similarity, recency, and access frequency
-   Added configurable retrieval selection limit
-   Added smart and cosine retrieval strategies
-   Added memory panel category filter chips
-   Added memory activity log redesign with timeline and collapsible UX
-   Auto-refresh of memory views after dynamic-memory completion
-   Enforced gating behavior for dynamic-memory manual mode
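
Hybrid retrieval like the items above can be expressed as a weighted score over similarity, recency, and access frequency. A simplified sketch; the weights and decay curve are invented for the example, not the app's tuned values:

```rust
/// Illustrative hybrid ranking of candidate memories by cosine similarity,
/// recency, and access frequency.
struct Memory {
    text: &'static str,
    similarity: f32,   // cosine similarity to the current query, 0..=1
    age_hours: f32,    // time since the memory was created or last touched
    access_count: u32, // how often it has been retrieved before
}

fn hybrid_score(m: &Memory) -> f32 {
    let recency = (-m.age_hours / 72.0).exp(); // decays over roughly three days
    let frequency = (1.0 + m.access_count as f32).ln();
    0.6 * m.similarity + 0.25 * recency + 0.15 * frequency
}

fn main() {
    let mut candidates = vec![
        Memory { text: "user prefers slow-burn plots", similarity: 0.82, age_hours: 300.0, access_count: 9 },
        Memory { text: "character broke her lantern", similarity: 0.88, age_hours: 2.0, access_count: 0 },
        Memory { text: "village festival is next week", similarity: 0.55, age_hours: 12.0, access_count: 1 },
    ];
    // Highest score first; take the top-k within the configured selection limit.
    candidates.sort_by(|a, b| hybrid_score(b).partial_cmp(&hybrid_score(a)).unwrap());
    for m in candidates.iter().take(2) {
        println!("{:.3}  {}", hybrid_score(m), m.text);
    }
}
```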

Embeddings, ONNX Runtime, and Android

-   Fixed ONNX runtime bundling and dylib path handling
-   Fixed dev rebuild-loop behavior tied to ONNX runtime integration
-   Pinned and standardized dylib preloading and path behavior
-   Ensured Android ONNX resource directory and packaging consistency
-   Improved desktop guards around ONNX runtime initialization
-   Improved handling of ORT init result variants and booleans
-   Added pre-step for embedding download and runtime ORT fetch
-   Extracted Windows DLL dependencies for ONNX runtime packaging
-   Locked ORT version to 2.0.0-rc.10
-   Added embedding model v3 support and multi-version management
-   Added experimental keep-loaded embedding runtime with cache reset on version switch
-   Fixed Android post-regenerate WebView freeze and tracing consistency
-   Made Android ONNX runtime init deterministic

Providers, Models, Endpoints, and Security

-   Added NVIDIA NIM provider
-   Added custom-provider tool-choice mode configurability
-   Added OpenRouter free-model toggle in model selector
-   Improved model selector search and suggestions
-   Added custom endpoint config persistence and auth/model-fetch mapping controls
-   Hid llama.cpp provider on mobile onboarding/settings where unsupported
-   Added security toggle to disable remote avatar downloads on card import
-   Disabled Chutes API key validation where it blocked onboarding flows

Lorebooks, Usage, Sync, and Tooling

-   Added world-info import/export and creation import action
-   Added character card metadata support and lorebook import path improvements
-   Added new pure mode content filtering system
-   Added app-time tracking backend support and analytics view
-   Enforced host-authoritative manifest diff in sync logic
-   Improved DB reset error surfacing and reset-in-place behavior
-   Migrated workflows to Blacksmith
-   Switched workflows to Bun and refreshed README/tooling docs
-   Added libclang dependency for Windows CI builds
-   Removed duplicate Cargo libraries and cleaned project config
-   Added `.gitignore` updates and docs-folder ignore adjustments
-   Fixed Tailwind warning noise in UI build paths

[View full release on GitHub →](https://github.com/LettuceAI/app/releases/tag/1.2.0)

Android · Desktop · 1.1.0 / Beta 3 · January 31, 2026

## Discovery, Group Chats, Smart Creator, Prompt Editor & Local Inference

This update brings Discovery, multi-character chats, a redesigned Smart Creator, and deeper local inference controls across Android and Desktop. It also includes a broad set of UI, stability, and workflow refinements shipped through January 31, 2026.

Discovery

-   A brand-new Discovery system powered by Character Tavern. Browse trending, popular, and newest cards or search directly, preview details before importing, and keep Pure Mode enabled to automatically filter NSFW results (with blurred avatars until you add a character).

Group Chats

-   A brand-new chat mode that lets multiple characters share one conversation. The app selects the next speaker automatically (or you can @mention to force a character), and roleplay groups can start with custom scenes. Long sessions are more stable with improved abort handling and streaming fixes.

Smart Creator

-   Smart Creator now supports Characters, Personas, and Lorebooks with a new goal selector and preview modes
-   Streaming responses and inline previews during creation
-   Smart Tool Selection toggle added, with manual tool presets and per-tool control in Advanced Settings
-   Image generation support with model selection in Advanced Settings
-   Smart Creator previews for Personas and Lorebooks

Help Me Reply

-   Help Me Reply now supports streaming, conversation/roleplay styles, and max token settings
-   Help Me Reply settings now allow per-feature model selection

Prompting System

-   Prompt Editor redesigned to be entry-based, with auto-scroll and mobile renaming
-   Per-entry roles and injection controls (including in-chat entries) for modular templates
-   System Prompt presets can be imported and exported
-   System Prompts UI redesigned and model-level prompts removed
-   Added `{{user}}` placeholder support and updated scene directions

Import & Export

-   Unified Entity Card (UEC) import/export support
-   Chara Card v1, v2, and v3 import support
-   Export characters as UEC, Chara Card v2 and v3
-   Personas can now be exported from the Library

Local Inference

-   Built-in llama.cpp runtime for desktop builds
-   Ollama now uses native endpoints
-   Automatic context length recommendations to prevent hardware crashes
-   Toggle to merge same-role messages for Ollama/llama.cpp compatibility
-   Advanced settings for local inference and support for `<think>` tags
-   CUDA support attempted for llama.cpp (currently disabled)

UI, UX & Stability

-   Redesigned Advanced Settings and Dynamic Memory pages
-   Creation menu refreshed and full-screen scene editor added
-   Improved persona selector and chat settings model selector menu
-   Long-press reordering for Lorebook and System Prompt entries on mobile
-   Redesigned toasts with unsaved changes protection (sticky + mobile bottom)
-   Fixed avatar display inconsistencies and Usage page text overflow
-   Bottom navigation simplified with larger icons and hidden labels
-   Dynamic Memory now works correctly after 120+ messages, with fixed counters
-   Fixed Mistral reasoning parameter handling and custom endpoint base URL display
-   Cost calculation fixes with a new recalc option in Advanced Settings
-   ONNX Runtime downgraded for broader device compatibility
-   Logging improved with a diagnostics section and global error integration
-   Embedding model load now has additional fail-safes

[View full release on GitHub →](https://github.com/LettuceAI/app/releases)

Android · Desktop · Release / Beta 2 · January 4, 2026

## Text-to-Speech, AI Character Creator, Reply Helper, Sync, Accessibility Upgrades & Voice Playback

This release brings LettuceAI to Android along with the second desktop beta update. It introduces Text-to-Speech voices, reply generation assistance, encrypted device-to-device sync, enhanced accessibility features, and per-character voice playback controls. These updates focus on expressiveness, comfort, and smoother roleplay workflows.

AI Character Creator

-   Conversational guided character creation
-   Automatic field filling (name, traits, description, etc.)
-   Optional starting scenes to define tone
-   Attach avatars & reference material
-   You can stop at any time; everything remains editable in the manual editor
-   The Creator uses your default app model

Text-to-Speech Voices

-   **Device TTS** – uses your system's built-in voice engine
-   **ElevenLabs** – natural voice synthesis with custom voice support
-   **Gemini TTS** – neural speech generation with custom voice support
-   You can also create custom voices with style descriptions and reuse them across characters
-   Generated audio is cached locally to reduce repeated regenerations

Reply Helper

-   **Use my text as base** — improve or complete your draft
-   **Write something new** — generate a fresh reply
-   **Regenerate** — try multiple suggestions
-   Reply Helper uses your default app model

Encrypted Device Sync

-   Peer-to-peer encrypted transfer
-   No servers or permanent connections
-   You start sync manually when needed
-   One device hosts a session and the other joins with a code. Once connected, your data is synced directly between devices

Accessibility Improvements

-   Per-event volume controls
-   Optional haptic feedback with selectable intensity
-   Lightweight and non-intrusive

Per-Message Voice Playback

-   Assign a default voice per character
-   Optional autoplay
-   Manual playback button per message

Scene Directions

-   Scenes now support private "direction" notes that are hidden from the chat UI and used only to guide model behaviour during the opening context of a scene

General Improvements

-   Improved character editing workflow
-   Better consistency across Android & Desktop
-   Internal cleanup & UI polish

Bug Fixes & Behaviour Improvements

-   Reasoning now works correctly with the Google Gemini endpoint
-   Fixed an issue where Dynamic Memory processing could cancel when switching pages
-   Fixed an issue where characters could be duplicated unexpectedly
-   Added a retry button to the embedding download screen
-   Fixed Backup settings failing to load existing backups
-   Redesigned the Edit Model page into a single-page layout
-   Disabled reasoning controls for the Mistral endpoint
-   Optimised entry animations in Settings
-   Optimised Markdown rendering performance
-   Added support for `(...)` and `[...]` as italic formatting shortcuts
-   Added Scene Directions to help guide starting scene behaviour

[View full release on GitHub →](https://github.com/LettuceAI/app/releases)

Desktop · v1.0-beta.6.2 · December 24, 2025

## Backup Fixes, Provider Expansion & Extended Timeout

Beta 6.2 is a stability and compatibility update focused on fixing critical backup issues, expanding provider support with Ollama and LM Studio, and improving reasoning model compatibility.

Bug Fixes

-   **Fixed backup issues** where data wasn't fully saved
-   **Fixed characters losing context** after restore
-   **Fixed OpenRouter & MistralAI reasoning** to work correctly with reasoning-capable models
-   **Fixed backups with images** not loading properly

New Features

-   **Added Ollama & LM Studio endpoint support** for locally hosted models
-   **Added custom OpenAI / Anthropic-compatible endpoints** for flexible API integration
-   **Increased request timeout** from 2 minutes to 15 minutes for better handling of slow models and reasoning tasks

[View full release on GitHub →](https://github.com/LettuceAI/app/releases)

Desktop · v1.0-beta.6 · December 21, 2025

## Dynamic Memory v2, Lorebooks, In-Chat Image Generation & Major Performance Improvements

Beta 6 is a major systems and UX update focused on memory accuracy, world consistency, creative flexibility, and performance. It's designed to make long conversations faster, more coherent, and easier to control, while expanding what's possible inside a single chat.

Dynamic Memory v2

-   Dynamic Memory has been significantly upgraded with faster, more responsive memory handling, higher recall accuracy, improved behavior in long-running chats, and better stability across multiple memory cycles
-   Dynamic Memory v2 is designed to scale cleanly as conversations grow

New Embedding Model

-   A new embedding model now powers memory retrieval in Beta 6. It is approximately 50% smaller than the previous model, runs faster during inference, and supports up to 4096 tokens (previously 512)
-   Existing memories remain compatible. No migration required

Context Enrichment (Experimental)

-   An experimental Context Enrichment feature has been introduced. It enhances memory queries using the new embedding model, improves recall accuracy in follow-up messages, and reduces ambiguity during semantic search
-   This feature is currently experimental and may evolve in future releases

Lorebooks

-   Lorebooks introduce a structured way to inject world, character, and knowledge information into chats. Define locations, factions, rules, history, and concepts. Lore entries are automatically injected when relevant and treated as established canon
-   Lorebooks improve consistency across scenes and long roleplay sessions while staying separate from character memory
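
Automatic lore injection is typically keyword-triggered: entries whose keys match recent conversation text get added to the prompt. A simplified sketch of that matching, with an illustrative struct and matching rule; real matching also involves detection modes, priorities, and token budgets:

```rust
/// Minimal keyword-triggered lorebook lookup: an entry is injected when any
/// of its keys appears in the recent conversation text.
struct LoreEntry {
    keys: Vec<&'static str>,
    content: &'static str,
}

fn active_entries<'a>(entries: &'a [LoreEntry], recent_text: &str) -> Vec<&'a LoreEntry> {
    let haystack = recent_text.to_lowercase();
    entries
        .iter()
        .filter(|e| e.keys.iter().any(|k| haystack.contains(&k.to_lowercase())))
        .collect()
}

fn main() {
    let lorebook = [
        LoreEntry {
            keys: vec!["ironhold", "the citadel"],
            content: "Ironhold is a mountain fortress ruled by a council of smiths.",
        },
        LoreEntry {
            keys: vec!["ashen order"],
            content: "The Ashen Order is a secretive monastic faction.",
        },
    ];
    let recent = "Let's travel to Ironhold before the storm hits.";
    for entry in active_entries(&lorebook, recent) {
        // Injected into the prompt as established canon, separate from character memory.
        println!("inject: {}", entry.content);
    }
}
```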

In-Chat Image Generation

-   Images can now be generated directly inside conversations. This is supported for models that expose image generation capabilities, enabling visual storytelling and richer creative workflows directly within the chat flow

Model & API Improvements

-   Added support for the **Chutes API endpoint**
-   Introduced an **OpenAI-compatible API endpoint** with extensive customization including custom user/assistant role names and flexible chat completion behavior
-   Added **Reasoning support** for models that expose reasoning tokens

Chat & Workflow Improvements

-   **Rewind to Here:** Resume conversations from any previous user message. Explore alternate paths without losing history
-   **Redesigned Chat Settings:** A new Chat Settings panel designed based on user feedback and suggestions

UI & Layout Improvements

-   Redesigned Character Cards for better clarity and hierarchy
-   Chat Header memory button now shows memory status and usage
-   Improved consistency across chat, settings, and character screens
-   Refined spacing, typography, and interaction feedback
-   Reduced visual noise in frequently used views
-   Redesigned chat history layout for readability

Desktop Builds

-   LettuceAI continues to be available as beta desktop builds alongside the mobile app
-   **Windows:** .msi installer, .exe portable build
-   **Linux:** .AppImage, .deb, .rpm
-   Desktop builds are still considered beta while platform-specific issues are being refined. Functionality generally matches the mobile app unless otherwise noted

Performance Improvements

-   Long chats now load up to **~8x faster**
-   Character list on the homepage loads faster and scrolls more smoothly
-   Improved internal state handling and caching logic
-   Backup system robustness significantly improved

Bug Fixes

-   Fixed an issue where Dynamic Memory could get stuck after cycle 2
-   Fixed an app freeze caused by corrupted or invalid backup files
-   Fixed an incorrect Google API endpoint URL

Thank You

-   Beta 6 is a foundational release that strengthens LettuceAI's core systems while expanding both creative and technical flexibility. Your feedback continues to shape LettuceAI into a deeply customizable, privacy-first AI companion built for long-term conversations and roleplay.

[View full release on GitHub →](https://github.com/LettuceAI/mobile-app/compare/1.0-beta.5...1.0-beta.6)
