feat(report): add multi-backend summarizer and LLM task grouping #633
leecoder wants to merge 5 commits into
10 issues found across 8 files
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="crates/tokscale-core/src/content_extractor.rs">
<violation number="1" location="crates/tokscale-core/src/content_extractor.rs:262">
P2: The truncation guard uses byte length instead of character count, which incorrectly appends ellipses for non-ASCII input.</violation>
</file>
<file name="crates/tokscale-cli/src/commands/report.rs">
<violation number="1" location="crates/tokscale-cli/src/commands/report.rs:208">
P2: Unknown `--summarizer` values return `Ok(())` instead of an error, causing silent misconfiguration and skipped summarization.</violation>
<violation number="2" location="crates/tokscale-cli/src/commands/report.rs:250">
P1: DB errors are swallowed when writing summaries, which can silently lose summarization results while reporting success.</violation>
<violation number="3" location="crates/tokscale-cli/src/commands/report.rs:327">
P2: Task grouping is silently skipped for unsupported backends (including the default `apple-fm`), so the feature can appear to run successfully while never producing task groups.</violation>
<violation number="4" location="crates/tokscale-cli/src/commands/report.rs:347">
P2: DB errors are swallowed when saving `task_group`, so grouping can silently fail while the command reports success.</violation>
<violation number="5" location="crates/tokscale-cli/src/commands/report.rs:727">
P2: End-of-day filtering truncates the last 999ms, so some sessions at the end of the `--until` day are incorrectly excluded.</violation>
</file>
<file name="crates/tokscale-core/src/wiki.rs">
<violation number="1" location="crates/tokscale-core/src/wiki.rs:116">
P2: Fallback config path uses an unexpanded `~`, which can write the wiki DB to an unintended location.</violation>
<violation number="2" location="crates/tokscale-core/src/wiki.rs:375">
P2: The `until` filter is inclusive in `query_entries` but exclusive in other range methods, causing inconsistent date-scoped behavior.</violation>
</file>
<file name="scripts/wiki-summarizer.py">
<violation number="1" location="scripts/wiki-summarizer.py:129">
P2: Validate `task_category`/`complexity` against allowed values before storing. Right now any model output string is accepted, even when it violates the declared schema.</violation>
<violation number="2" location="scripts/wiki-summarizer.py:136">
P2: The recovery path can crash because `session['session_id']` is dereferenced inside the exception handler. A malformed session then raises a second `KeyError` and aborts summarization.</violation>
</file>
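The `--summarizer` findings above (unknown values returning `Ok(())`, and grouping being skipped silently for `apple-fm`) can be illustrated with a parse step that fails loudly. This is a minimal sketch, not the PR's code: the `Summarizer` enum, the `UnknownSummarizer` error type, and `supports_task_grouping` are hypothetical names; only the backend strings come from the PR.

```rust
use std::fmt;
use std::str::FromStr;

/// Backends named in this PR. The enum itself is a hypothetical illustration.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Summarizer {
    AppleFm,
    Claude,
    Codex,
    Gemini,
    Kiro,
}

/// Hypothetical error for an unrecognized `--summarizer` value.
#[derive(Debug, PartialEq, Eq)]
pub struct UnknownSummarizer(pub String);

impl fmt::Display for UnknownSummarizer {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(
            f,
            "unknown summarizer '{}' (expected apple-fm, claude, codex, gemini, or kiro)",
            self.0
        )
    }
}

impl std::error::Error for UnknownSummarizer {}

impl FromStr for Summarizer {
    type Err = UnknownSummarizer;

    /// Returns an error for anything outside the known set instead of
    /// silently succeeding and skipping summarization.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "apple-fm" => Ok(Self::AppleFm),
            "claude" => Ok(Self::Claude),
            "codex" => Ok(Self::Codex),
            "gemini" => Ok(Self::Gemini),
            "kiro" => Ok(Self::Kiro),
            other => Err(UnknownSummarizer(other.to_string())),
        }
    }
}

impl Summarizer {
    /// Per the review, task grouping is not supported on `apple-fm`.
    pub fn supports_task_grouping(self) -> bool {
        !matches!(self, Self::AppleFm)
    }
}
```

A caller could then write `let backend: Summarizer = value.parse()?;` and print an explicit skip message when `supports_task_grouping()` is false, so neither case passes silently.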
6 issues found across 5 files (changes from recent commits).
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="crates/tokscale-cli/src/commands/optimize.rs">
<violation number="1" location="crates/tokscale-cli/src/commands/optimize.rs:195">
P2: `partial_cmp(...).unwrap()` on `f64` can panic if `total_cost` is NaN. Use the safer pattern already established in the codebase.</violation>
<violation number="2" location="crates/tokscale-cli/src/commands/optimize.rs:455">
P2: Truncating `String` with `[..27]` can panic on non-ASCII model names due to invalid UTF-8 boundary slicing.</violation>
</file>
<file name="crates/tokscale-cli/src/main.rs">
<violation number="1" location="crates/tokscale-cli/src/main.rs:763">
P1: `--optimize` currently runs during `--json` report output, appending human-formatted text and corrupting JSON output for automation.</violation>
</file>
<file name="README.md">
<violation number="1" location="README.md:683">
P2: The new optimize example claims it defaults to today, but the command actually uses all sessions unless a date flag is provided.</violation>
</file>
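For the `partial_cmp(...).unwrap()` finding, the "pattern already established in the codebase" is not shown in this thread, so the sketch below uses the standard library's `f64::total_cmp` as an assumed alternative. `ModelCost` is a hypothetical stand-in for whatever row type `optimize.rs` sorts.

```rust
/// Hypothetical stand-in for the per-model row sorted in `optimize.rs`.
#[derive(Debug, Clone, PartialEq)]
pub struct ModelCost {
    pub model: String,
    pub total_cost: f64,
}

/// Sorts rows by descending cost. `total_cmp` defines a total order over
/// all `f64` values, so a NaN cost cannot cause a panic here.
pub fn sort_by_cost_desc(rows: &mut [ModelCost]) {
    rows.sort_by(|a, b| b.total_cost.total_cmp(&a.total_cost));
}
```

Where a NaN ends up depends on its sign bit, so code that cares about its position should filter or replace NaN values before sorting.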
P1: --optimize currently runs during --json report output, appending human-formatted text and corrupting JSON output for automation.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At crates/tokscale-cli/src/main.rs, line 763:
<comment>`--optimize` currently runs during `--json` report output, appending human-formatted text and corrupting JSON output for automation.</comment>
<file context>
@@ -745,6 +759,38 @@ fn main() -> Result<()> {
week,
month,
+ });
+ if optimize && result.is_ok() {
+ let _ = commands::optimize::run_optimize(commands::optimize::OptimizeOptions {
+ json: false,
</file context>
Suggested change:
- if optimize && result.is_ok() {
+ if optimize && !json && result.is_ok() {
P2: Truncating String with [..27] can panic on non-ASCII model names due to invalid UTF-8 boundary slicing.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At crates/tokscale-cli/src/commands/optimize.rs, line 455:
<comment>Truncating `String` with `[..27]` can panic on non-ASCII model names due to invalid UTF-8 boundary slicing.</comment>
<file context>
@@ -0,0 +1,554 @@
+ println!(" {}", "─".repeat(68));
+ for m in report.model_insights.iter().take(8) {
+ let model_display: String = if m.model.len() > 28 {
+ format!("{}…", &m.model[..27])
+ } else {
+ m.model.clone()
</file context>
Suggested change:
- format!("{}…", &m.model[..27])
+ format!("{}…", m.model.chars().take(27).collect::<String>())
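The suggested fix removes the panic, but the guard `m.model.len() > 28` still compares bytes, which is the same mismatch flagged for `content_extractor.rs`. A minimal sketch that counts characters in both the guard and the cut follows; `truncate_display` is a hypothetical helper name, not a function from the PR.

```rust
/// Truncates `s` to at most `max_chars` characters, replacing the tail
/// with '…' when the input is longer. Counts Unicode scalar values, so it
/// never slices inside a multi-byte UTF-8 sequence.
pub fn truncate_display(s: &str, max_chars: usize) -> String {
    if max_chars == 0 {
        return String::new();
    }
    if s.chars().count() <= max_chars {
        return s.to_string();
    }
    // Keep one slot free for the ellipsis.
    let kept: String = s.chars().take(max_chars - 1).collect();
    format!("{}…", kept)
}
```

With `max_chars = 28` this matches the original intent of 27 kept characters plus the ellipsis. It counts scalar values rather than grapheme clusters or terminal column width, so wide or combining characters may still misalign a table.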
P2: The new optimize example claims it defaults to today, but the command actually uses all sessions unless a date flag is provided.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At README.md, line 683:
<comment>The new optimize example claims it defaults to today, but the command actually uses all sessions unless a date flag is provided.</comment>
<file context>
@@ -674,6 +675,65 @@ tokscale report --workspace my-project --client opencode
+Tokscale can analyze your usage patterns and generate actionable recommendations to reduce costs and improve productivity.
+
+```bash
+# Standalone optimization analysis (defaults to today)
+tokscale optimize
+
</file context>
Suggested change:
- # Standalone optimization analysis (defaults to today)
+ # Standalone optimization analysis (uses all cached sessions by default)
Force-pushed from 2cf54a9 to 36c4ce5
- Group sessions by model and task title in summary tables
- Show daily breakdown for --week/--month, session list for --today
- Integrate Apple FM summarizer for session classification
- Add wiki DB for caching session summaries

- Support claude, codex, gemini, kiro as summarizer backends (in addition to apple-fm)
- Add 2nd LLM pass to cluster sessions into high-level task groups
- Add --summarizer flag to select backend, --rebuild to reset cached summaries
- Scope summarization to date range (--week, --since/--until)
- Add task_group column to wiki DB with migration
- Update README with Task-Attributed Report documentation

…agate DB errors
- Return error instead of Ok(()) for unknown --summarizer values
- Propagate DB errors from update_summary and update_task_group
- Clarify skip message when task grouping backend is unsupported (apple-fm)

- Fix end-of-day filtering: use next_day_00:00 - 1ms instead of 23:59:59
- Fix wiki.rs fallback path: use dirs::home_dir() instead of literal '~/.config'
- Fix until filter inconsistency: query_entries now uses '<' (exclusive) matching other range methods
- Validate task_category/complexity against allowed values in wiki-summarizer.py
- Fix recovery path crash: use session.get('session_id') instead of session['session_id'] in exception handler
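
The end-of-day and `until` fixes listed above both concern where the upper bound of a date range sits. One way to keep every range method consistent is a half-open interval `[start, end)` whose end is the next day's 00:00, compared with `<`. The sketch below assumes timestamps in epoch milliseconds and fixed 24-hour UTC days; `day_range_ms` and `in_range` are hypothetical names, not functions from the PR.

```rust
/// Milliseconds in one fixed-length (UTC) day. This sketch ignores DST
/// and local time zones.
pub const MS_PER_DAY: i64 = 86_400_000;

/// Returns the half-open range `[start, end)` in epoch milliseconds that
/// covers every day from `since_day` through `until_day` inclusive.
/// Days are counted from the Unix epoch (an assumption of this sketch).
pub fn day_range_ms(since_day: i64, until_day: i64) -> (i64, i64) {
    (since_day * MS_PER_DAY, (until_day + 1) * MS_PER_DAY)
}

/// True when `ts_ms` falls inside the half-open range.
pub fn in_range(ts_ms: i64, range: (i64, i64)) -> bool {
    range.0 <= ts_ms && ts_ms < range.1
}
```

Because the end bound is exclusive, a session at 23:59:59.500 on the `--until` day is kept, while one at exactly 00:00:00.000 the next day is not.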

- content_extractor: use chars().count() instead of byte len() for truncation guard
- wiki.rs: rename from_str to parse to avoid clippy::should_implement_trait
- report.rs: use div_ceil() instead of manual ceiling division
- usage/copilot.rs: use strip_prefix() instead of manual prefix stripping
- usage/minimax.rs: remove redundant closure
- usage/zai.rs: fix reference-to-reference pattern
- usage/mod.rs: extract type alias for complex type, use .ok() instead of manual match
Force-pushed from 36c4ce5 to d6c5432
1 issue found across 5 files (changes from recent commits).
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="crates/tokscale-cli/src/main.rs">
<violation number="1" location="crates/tokscale-cli/src/main.rs:763">
P1: `--optimize` currently runs during `--json` report output, appending human-formatted text and corrupting JSON output for automation.</violation>
</file>
<file name="crates/tokscale-cli/src/commands/optimize.rs">
<violation number="1" location="crates/tokscale-cli/src/commands/optimize.rs:455">
P2: Truncating `String` with `[..27]` can panic on non-ASCII model names due to invalid UTF-8 boundary slicing.</violation>
</file>
<file name="README.md">
<violation number="1" location="README.md:683">
P2: The new optimize example claims it defaults to today, but the command actually uses all sessions unless a date flag is provided.</violation>
</file>
<file name=".opencode/skill/deploy.md">
<violation number="1" location=".opencode/skill/deploy.md:49">
P2: `docker compose exec` allocates a TTY by default; in non-interactive environments like AWS SSM this can fail with "the input device is not a TTY". Use `-T` to disable pseudo-TTY allocation.</violation>
</file>
P2: docker compose exec allocates a TTY by default; in non-interactive environments like AWS SSM this can fail with "the input device is not a TTY". Use -T to disable pseudo-TTY allocation.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At .opencode/skill/deploy.md, line 49:
<comment>`docker compose exec` allocates a TTY by default; in non-interactive environments like AWS SSM this can fail with "the input device is not a TTY". Use `-T` to disable pseudo-TTY allocation.</comment>
<file context>
@@ -0,0 +1,61 @@
+ --instance-ids "i-078fe82953c3047b5" \
+ --document-name "AWS-RunShellScript" \
+ --region ap-northeast-2 \
+ --parameters 'commands=["export HOME=/home/ubuntu && cd /home/ubuntu/tokscale/self-host && docker compose exec app npx drizzle-kit push --force"]' \
+ --timeout-seconds 120 \
+ --output json
</file context>
Force-pushed from 7657640 to f00d277
Summary
Add multi-backend LLM summarizer support and automatic task grouping to the `tokscale report` command.

Changes

Multi-backend summarizer
- `claude`, `codex`, `gemini`, `kiro` as summarizer backends in addition to the default `apple-fm`
- `--summarizer` flag to select backend (default: `apple-fm`)

LLM task grouping (2nd pass)

Date-scoped operations
- `--week`, `--since`/`--until` date filters
- `--rebuild` flag resets cached summaries within the date range and re-summarizes from scratch

Schema changes
- `task_group` column to `wiki_entries` table with auto-migration
- `reset_summaries_in_range` and `get_unsummarized_session_ids_in_range` DB methods

Documentation

Usage

Summary by cubic
Adds a task-attributed usage report with multi-backend summarization and optional LLM task grouping. Introduces the `tokscale report` command with model/task breakdowns, daily/monthly views, and a local wiki DB cache.

New Features
- `tokscale report` shows model and task-group breakdowns; daily view for `--week`/`--month`, session list for today. Results cached in a local wiki DB; `--rebuild` re-summarizes in range. Auto-migration adds `task_group`.
- `--summarizer` backends: `apple-fm` (default), `claude`, `codex`, `gemini`, `kiro`; batch mode for CLI backends. A second LLM pass clusters titled sessions into 3–8 task groups. Requires a CLI backend; `apple-fm` skips grouping with a clear message.
- `--today`/`--week`/`--month`, `--since`/`--until` (exclusive end), `--workspace`, `--client`, `--no-summarize`, `--json`, `--rebuild`.

Bug Fixes
- `until` is exclusive using next-day 00:00 minus 1ms; DB queries now use `<` to match.
- Unknown `--summarizer` values now error; DB errors from summary/task-group updates propagate to the CLI.
- `wiki-summarizer.py`: validate `task_category`/`complexity`, fix missing-ID crash; correct wiki DB fallback path resolution.

Written for commit f00d277. Summary will update on new commits.