mGrowTech

Building a Stable Fable 5 Traces Workflow in Colab: Parsing Tool Calls, Auditing Data, and Training Baselines

by Josh
June 28, 2026
in AI, Analytics and Automation


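# Assumed imports for this excerpt (the earlier tutorial cells are not shown here).
import json
import re

import pandas as pd
from IPython.display import display
from rich import print as rprint
from rich.panel import Panel

# Also assumed from earlier cells: df, OUT_DIR, SEED, DATASET_ID, FLAT_JSONL_FILENAME,
# file_summary, plot_paths, and the helpers stratified_train_test_indices,
# PureMultinomialNB, evaluate_predictions, confusion_matrix_df, preview_text,
# safe_json_dumps, and clean_for_json.

# Baseline 1: classify output_type from the context text (truncated to 12,000 characters).
# The baseline runs only when there are at least 2 classes and at least 30 labeled rows.
rprint(Panel.fit("[bold]Baseline 1: Predict output_type from context using pure Python Naive Bayes[/bold]"))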
model_artifacts = {}
classifier_df = df.dropna(subset=["output_type"]).copy()
classifier_df = classifier_df[
   classifier_df["output_type"].astype(str).str.len() > 0
].copy()
if classifier_df["output_type"].nunique() >= 2 and len(classifier_df) >= 30:
   X_text = (
       classifier_df["context"]
       .fillna("")
       .astype(str)
       .map(lambda text: text[:12000])
       .tolist()
   )
   y = classifier_df["output_type"].astype(str).tolist()
   train_indices, test_indices = stratified_train_test_indices(y, test_size=0.2, seed=SEED)
   X_train = [X_text[i] for i in train_indices]
   y_train = [y[i] for i in train_indices]
   X_test = [X_text[i] for i in test_indices]
   y_test = [y[i] for i in test_indices]
   output_type_classifier = PureMultinomialNB(
       max_features=20000,
       min_df=2,
       alpha=1.0,
   )
   output_type_classifier.fit(X_train, y_train)
   predictions = output_type_classifier.predict(X_test)
   output_type_metrics, output_report_df = evaluate_predictions(y_test, predictions)
   output_matrix_df = confusion_matrix_df(y_test, predictions)
   output_type_metrics["train_rows"] = len(X_train)
   output_type_metrics["test_rows"] = len(X_test)
   output_type_metrics["vocab_size"] = len(output_type_classifier.vocab)
   rprint("[bold]Output type classifier report:[/bold]")
   display(output_report_df)
   display(output_matrix_df)
   output_report_df.to_csv(OUT_DIR / "output_type_classifier_report.csv", index=False)
   output_matrix_df.to_csv(OUT_DIR / "output_type_confusion_matrix.csv")
   top_token_records = []
   for label in output_type_classifier.labels:
       for token, margin in output_type_classifier.top_tokens_for_class(label, n=25):
           top_token_records.append(
               {
                   "label": label,
                   "token": token,
                   "score_margin": margin,
               }
           )
   pd.DataFrame(top_token_records).to_csv(
       OUT_DIR / "output_type_top_tokens.csv",
       index=False,
   )
   with open(
       OUT_DIR / "output_type_classifier_metrics.json",
       "w",
       encoding="utf-8",
   ) as file:
       json.dump(output_type_metrics, file, ensure_ascii=False, indent=2)
   model_artifacts["output_type_classifier_metrics"] = str(
       OUT_DIR / "output_type_classifier_metrics.json"
   )
   model_artifacts["output_type_classifier_report"] = str(
       OUT_DIR / "output_type_classifier_report.csv"
   )
   model_artifacts["output_type_confusion_matrix"] = str(
       OUT_DIR / "output_type_confusion_matrix.csv"
   )
   model_artifacts["output_type_top_tokens"] = str(
       OUT_DIR / "output_type_top_tokens.csv"
   )
else:
   rprint(
       "[yellow]Skipping output_type classifier because there are too few "
       "classes or rows.[/yellow]"
   )
   output_type_metrics = {}
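
# Baseline 2: classify tool_name on tool_use rows only. Tools outside the 12 most
# frequent are grouped under "__OTHER__" so rare labels do not fragment the split.
rprint(Panel.fit("[bold]Baseline 2: Predict tool_name from context using pure Python Naive Bayes[/bold]"))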
tool_classifier_df = df[
   df["output_type"].eq("tool_use")
   & df["tool_name"].fillna("").astype(str).str.len().gt(0)
].copy()
if len(tool_classifier_df) >= 50 and tool_classifier_df["tool_name"].nunique() >= 2:
   top_tools = tool_classifier_df["tool_name"].value_counts().head(12).index.tolist()
   tool_classifier_df["tool_label"] = tool_classifier_df["tool_name"].where(
       tool_classifier_df["tool_name"].isin(top_tools),
       "__OTHER__",
   )
   y_tool = tool_classifier_df["tool_label"].astype(str).tolist()
   X_tool_text = (
       tool_classifier_df["context"]
       .fillna("")
       .astype(str)
       .map(lambda text: text[:12000])
       .tolist()
   )
   if len(set(y_tool)) >= 2:
       train_indices, test_indices = stratified_train_test_indices(y_tool, test_size=0.2, seed=SEED)
       X_train = [X_tool_text[i] for i in train_indices]
       y_train = [y_tool[i] for i in train_indices]
       X_test = [X_tool_text[i] for i in test_indices]
       y_test = [y_tool[i] for i in test_indices]
       tool_classifier = PureMultinomialNB(
           max_features=20000,
           min_df=2,
           alpha=1.0,
       )
       tool_classifier.fit(X_train, y_train)
       tool_predictions = tool_classifier.predict(X_test)
       tool_metrics, tool_report_df = evaluate_predictions(y_test, tool_predictions)
       tool_matrix_df = confusion_matrix_df(y_test, tool_predictions)
       tool_metrics["train_rows"] = len(X_train)
       tool_metrics["test_rows"] = len(X_test)
       tool_metrics["vocab_size"] = len(tool_classifier.vocab)
       rprint("[bold]Tool classifier report:[/bold]")
       display(tool_report_df)
       display(tool_matrix_df)
       tool_report_df.to_csv(OUT_DIR / "tool_name_classifier_report.csv", index=False)
       tool_matrix_df.to_csv(OUT_DIR / "tool_name_confusion_matrix.csv")
       top_tool_token_records = []
       for label in tool_classifier.labels:
           for token, margin in tool_classifier.top_tokens_for_class(label, n=25):
               top_tool_token_records.append(
                   {
                       "label": label,
                       "token": token,
                       "score_margin": margin,
                   }
               )
       pd.DataFrame(top_tool_token_records).to_csv(
           OUT_DIR / "tool_name_top_tokens.csv",
           index=False,
       )
       with open(
           OUT_DIR / "tool_name_classifier_metrics.json",
           "w",
           encoding="utf-8",
       ) as file:
           json.dump(tool_metrics, file, ensure_ascii=False, indent=2)
       model_artifacts["tool_name_classifier_metrics"] = str(
           OUT_DIR / "tool_name_classifier_metrics.json"
       )
       model_artifacts["tool_name_classifier_report"] = str(
           OUT_DIR / "tool_name_classifier_report.csv"
       )
       model_artifacts["tool_name_confusion_matrix"] = str(
           OUT_DIR / "tool_name_confusion_matrix.csv"
       )
       model_artifacts["tool_name_top_tokens"] = str(
           OUT_DIR / "tool_name_top_tokens.csv"
       )
   else:
       rprint("[yellow]Skipping tool classifier because labels collapsed to one class.[/yellow]")
       tool_metrics = {}
else:
   rprint(
       "[yellow]Skipping tool classifier because there are too few tool-use "
       "rows or tool classes.[/yellow]"
   )
   tool_metrics = {}
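
# Keyword search helper: case-insensitive literal match across the text columns.
# re.escape keeps characters such as "." or "(" in the keyword from acting as regex syntax.
rprint(Panel.fit("[bold]Building simple keyword search helper[/bold]"))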
def search_rows(keyword, limit=5, search_cols=("context", "cot", "completion", "text_payload")):
   keyword = str(keyword).lower()
   mask = pd.Series(False, index=df.index)
   for column in search_cols:
       mask = mask | (
           df[column]
           .fillna("")
           .astype(str)
           .str.lower()
           .str.contains(re.escape(keyword), regex=True)
       )
   hits = df[mask].head(limit)
   results = []
   for _, row in hits.iterrows():
       results.append(
           {
               "uid": row.get("uid"),
               "session": row.get("session"),
               "output_type": row.get("output_type"),
               "tool_name": row.get("tool_name"),
               "context_preview": preview_text(row.get("context"), 400),
               "payload_preview": preview_text(row.get("text_payload"), 400),
           }
       )
   return results
example_queries = [
   "Bash",
   "Write",
   "browser",
   "test",
   "README",
]
search_demo = {
   query: search_rows(query, limit=2)
   for query in example_queries
}
with open(
   OUT_DIR / "keyword_search_demo.json",
   "w",
   encoding="utf-8",
) as file:
   json.dump(search_demo, file, ensure_ascii=False, indent=2)
rprint("[bold]Example keyword search results:[/bold]")
rprint(safe_json_dumps(search_demo, max_chars=5000))

# Collect run metadata, distributions, length statistics, and artifact paths in one summary.
summary = {
   "dataset_id": DATASET_ID,
   "flat_jsonl_filename": FLAT_JSONL_FILENAME,
   "output_directory": str(OUT_DIR),
   "repo_file_summary": file_summary,
   "rows": int(len(df)),
   "columns": list(df.columns),
   "output_type_distribution": (
       df["output_type"]
       .fillna("missing")
       .value_counts()
       .to_dict()
   ),
   "top_tools": (
       df.loc[df["output_type"].eq("tool_use"), "tool_name"]
       .replace("", "unknown")
       .value_counts()
       .head(20)
       .to_dict()
   ),
   "top_source_roots": (
       df["source_root"]
       .fillna("unknown")
       .value_counts()
       .head(20)
       .to_dict()
   ),
   "length_summary": {
       column: {
           "mean": float(df[column].mean()),
           "median": float(df[column].median()),
           "p90": float(df[column].quantile(0.90)),
           "p95": float(df[column].quantile(0.95)),
           "max": int(df[column].max()),
       }
       for column in [
           "context_chars",
           "cot_chars",
           "completion_chars",
           "text_payload_chars",
       ]
   },
   "possible_secret_rows": int(df["possible_secret_anywhere"].sum()),
   "plots": plot_paths,
   "model_artifacts": model_artifacts,
   "safe_exports": {
       "train": str(OUT_DIR / "fable5_no_cot_chat_train.jsonl"),
       "validation": str(OUT_DIR / "fable5_no_cot_chat_validation.jsonl"),
       "test": str(OUT_DIR / "fable5_no_cot_chat_test.jsonl"),
   },
   "analysis_files": {
       "csv": str(OUT_DIR / "fable5_analysis_index.csv"),
       "pickle": str(OUT_DIR / "fable5_analysis_index.pkl"),
       "keyword_search_demo": str(OUT_DIR / "keyword_search_demo.json"),
   },
}
with open(
   OUT_DIR / "analysis_summary.json",
   "w",
   encoding="utf-8",
) as file:
   json.dump(clean_for_json(summary), file, ensure_ascii=False, indent=2, default=str)
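
# Build the Markdown report. FENCE is three backticks assembled with chr(96),
# so a literal code fence never appears inside this source.
FENCE = chr(96) * 3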
report_md = (
   "# Fable 5 Traces Advanced Tutorial Report\n\n"
   "## Dataset\n\n"
   f"- Dataset: `{DATASET_ID}`\n"
   f"- Flat JSONL: `{FLAT_JSONL_FILENAME}`\n"
   f"- Rows loaded: `{len(df):,}`\n"
   f"- Unique source sessions: `{df['session'].nunique(dropna=True):,}`\n"
   f"- Unique models: `{df['model'].nunique(dropna=True):,}`\n\n"
   "## Important safety note\n\n"
   "This tutorial treats the dataset as agent telemetry. It previews and analyzes commands, "
   "tool calls, file edits, and transcript text, but it never executes commands found inside "
   "the traces.\n\n"
   f"Potential secret-like patterns detected: `{int(df['possible_secret_anywhere'].sum()):,}` rows.\n"
   "Exports redact common API-key/token-like patterns.\n\n"
   "## Output type distribution\n\n"
   f"{FENCE}json\n"
   f"{json.dumps(clean_for_json(summary['output_type_distribution']), indent=2, ensure_ascii=False)}\n"
   f"{FENCE}\n\n"
   "## Top tools\n\n"
   f"{FENCE}json\n"
   f"{json.dumps(clean_for_json(summary['top_tools']), indent=2, ensure_ascii=False)}\n"
   f"{FENCE}\n\n"
   "## Saved files\n\n"
   "- `analysis_summary.json`\n"
   "- `fable5_analysis_index.csv`\n"
   "- `fable5_analysis_index.pkl`\n"
   "- `fable5_no_cot_chat_train.jsonl`\n"
   "- `fable5_no_cot_chat_validation.jsonl`\n"
   "- `fable5_no_cot_chat_test.jsonl`\n"
   "- plot PNG files\n"
   "- baseline classifier metrics, when enough rows/classes are available\n\n"
   "## Recommended next steps\n\n"
   "1. Inspect `fable5_no_cot_chat_train.jsonl` before any fine-tuning.\n"
   "2. Keep the dataset license in mind before model training or redistribution.\n"
   "3. Avoid training directly on raw terminal outputs without additional privacy and safety filtering.\n"
   "4. Start with the no-CoT chat export unless your research explicitly requires reasoning-trace supervision.\n"
)
with open(
   OUT_DIR / "REPORT.md",
   "w",
   encoding="utf-8",
) as file:
   file.write(report_md)

# Print the completion panel, then list the saved artifacts as a table.
rprint(
   Panel.fit(
       f"[bold green]Tutorial complete.[/bold green]\n\n"
       f"Artifacts saved in:\n{OUT_DIR}\n\n"
       f"Key files:\n"
       f"- {OUT_DIR / 'REPORT.md'}\n"
       f"- {OUT_DIR / 'analysis_summary.json'}\n"
       f"- {OUT_DIR / 'fable5_no_cot_chat_train.jsonl'}\n"
       f"- {OUT_DIR / 'fable5_analysis_index.csv'}",
       title="Done",
   )
)
display(
   pd.DataFrame(
       {
           "artifact": [
               "Report",
               "Summary JSON",
               "No-CoT train export",
               "No-CoT validation export",
               "No-CoT test export",
               "Analysis CSV",
               "Analysis pickle",
               "Keyword search demo",
           ],
           "path": [
               str(OUT_DIR / "REPORT.md"),
               str(OUT_DIR / "analysis_summary.json"),
               str(OUT_DIR / "fable5_no_cot_chat_train.jsonl"),
               str(OUT_DIR / "fable5_no_cot_chat_validation.jsonl"),
               str(OUT_DIR / "fable5_no_cot_chat_test.jsonl"),
               str(OUT_DIR / "fable5_analysis_index.csv"),
               str(OUT_DIR / "fable5_analysis_index.pkl"),
               str(OUT_DIR / "keyword_search_demo.json"),
           ],
       }
   )
)
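
Both baselines above call `stratified_train_test_indices(y, test_size=0.2, seed=SEED)`, which is defined outside this excerpt. The following is only a minimal sketch of what such a helper might look like, assuming it returns two lists of row indices and keeps each label's share roughly equal across train and test. The body is an illustration, not the tutorial's actual implementation.

```python
import random
from collections import defaultdict


def stratified_train_test_indices(labels, test_size=0.2, seed=42):
    """Hypothetical sketch: split row indices label by label."""
    rng = random.Random(seed)

    # Group row positions by their label.
    by_label = defaultdict(list)
    for index, label in enumerate(labels):
        by_label[label].append(index)

    train_indices, test_indices = [], []
    for label in sorted(by_label):
        indices = by_label[label]
        rng.shuffle(indices)
        # Take roughly test_size of each class, but always leave at least
        # one row of the class in the training set.
        n_test = int(round(len(indices) * test_size))
        n_test = min(n_test, len(indices) - 1)
        test_indices.extend(indices[:n_test])
        train_indices.extend(indices[n_test:])

    # Shuffle so rows of one class are not contiguous.
    rng.shuffle(train_indices)
    rng.shuffle(test_indices)
    return train_indices, test_indices


labels = ["text"] * 10 + ["tool_use"] * 20
train_idx, test_idx = stratified_train_test_indices(labels, test_size=0.2, seed=7)
print(len(train_idx), len(test_idx))
```

Splitting per class matters here because `output_type` and `tool_name` are likely imbalanced; a plain random split could leave a rare class entirely out of the test set.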

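
The classifier class `PureMultinomialNB` is also defined outside this excerpt. From the calls above, it accepts `max_features`, `min_df`, and `alpha`, and exposes `fit`, `predict`, `vocab`, `labels`, and `top_tokens_for_class`. As a rough illustration of the technique, here is a small multinomial Naive Bayes with Laplace smoothing that mirrors that interface. The class name `TinyMultinomialNB`, the tokenizer, and the margin definition are assumptions made for this sketch.

```python
import math
import re
from collections import Counter

TOKEN_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]+")  # assumed tokenizer


class TinyMultinomialNB:
    """Illustrative sketch only; not the tutorial's PureMultinomialNB."""

    def __init__(self, max_features=20000, min_df=1, alpha=1.0):
        self.max_features = max_features
        self.min_df = min_df
        self.alpha = alpha

    def _tokens(self, text):
        return [token.lower() for token in TOKEN_RE.findall(str(text))]

    def fit(self, texts, labels):
        docs = [self._tokens(text) for text in texts]
        # Document frequency decides which tokens enter the vocabulary.
        doc_freq = Counter(token for doc in docs for token in set(doc))
        kept = [token for token, freq in doc_freq.most_common() if freq >= self.min_df]
        self.vocab = {token: i for i, token in enumerate(kept[: self.max_features])}
        self.labels = sorted(set(labels))

        token_counts = {label: Counter() for label in self.labels}
        for doc, label in zip(docs, labels):
            token_counts[label].update(token for token in doc if token in self.vocab)

        class_docs = Counter(labels)
        self.log_prior = {
            label: math.log(class_docs[label] / len(labels)) for label in self.labels
        }
        vocab_size = len(self.vocab)
        self.log_likelihood = {}
        for label in self.labels:
            # Laplace (add-alpha) smoothing avoids log(0) for unseen tokens.
            denominator = sum(token_counts[label].values()) + self.alpha * vocab_size
            self.log_likelihood[label] = {
                token: math.log((token_counts[label][token] + self.alpha) / denominator)
                for token in self.vocab
            }
        return self

    def predict(self, texts):
        predictions = []
        for text in texts:
            tokens = [token for token in self._tokens(text) if token in self.vocab]
            scores = {
                label: self.log_prior[label]
                + sum(self.log_likelihood[label][token] for token in tokens)
                for label in self.labels
            }
            predictions.append(max(self.labels, key=lambda label: scores[label]))
        return predictions

    def top_tokens_for_class(self, label, n=25):
        # Margin: log-likelihood under `label` minus the best competing class.
        others = [other for other in self.labels if other != label]
        margins = []
        for token in self.vocab:
            best_other = max((self.log_likelihood[o][token] for o in others), default=0.0)
            margins.append((token, self.log_likelihood[label][token] - best_other))
        return sorted(margins, key=lambda pair: pair[1], reverse=True)[:n]


train_texts = [
    "run bash command ls",
    "bash run tests now",
    "write the final answer text",
    "answer the user with text",
]
train_labels = ["tool_use", "tool_use", "text", "text"]
model = TinyMultinomialNB(min_df=1).fit(train_texts, train_labels)
print(model.predict(["please run bash", "write the answer"]))
```

A positive margin from `top_tokens_for_class` means the token is more likely under that class than under any other, which is what makes the exported top-token CSV files useful for a quick sanity check of what the baseline has learned.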

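
The baselines pass their predictions to `evaluate_predictions(y_test, predictions)`, which returns a metrics dictionary and a per-class report. That helper is not shown in this excerpt, so the sketch below is an assumption about its general shape: it computes accuracy, per-class precision, recall, and F1, and a macro-averaged F1. It returns the report as a list of dictionaries rather than a DataFrame to stay dependency-free.

```python
from collections import Counter


def evaluate_predictions(y_true, y_pred):
    """Hypothetical sketch: accuracy, macro F1, and a per-class report."""
    labels = sorted(set(y_true) | set(y_pred))
    true_pos, false_pos, false_neg = Counter(), Counter(), Counter()
    for truth, guess in zip(y_true, y_pred):
        if truth == guess:
            true_pos[truth] += 1
        else:
            false_pos[guess] += 1
            false_neg[truth] += 1

    report_rows = []
    for label in labels:
        predicted = true_pos[label] + false_pos[label]
        actual = true_pos[label] + false_neg[label]
        precision = true_pos[label] / predicted if predicted else 0.0
        recall = true_pos[label] / actual if actual else 0.0
        f1 = (
            2 * precision * recall / (precision + recall)
            if precision + recall
            else 0.0
        )
        report_rows.append(
            {
                "label": label,
                "precision": precision,
                "recall": recall,
                "f1": f1,
                "support": actual,
            }
        )

    metrics = {
        "accuracy": sum(true_pos.values()) / len(y_true) if y_true else 0.0,
        "macro_f1": (
            sum(row["f1"] for row in report_rows) / len(report_rows)
            if report_rows
            else 0.0
        ),
    }
    return metrics, report_rows


y_true = ["text", "text", "tool_use", "tool_use"]
y_pred = ["text", "tool_use", "tool_use", "tool_use"]
metrics, report_rows = evaluate_predictions(y_true, y_pred)
print(metrics)
```

Macro F1 weights every class equally, so it is a more honest number than accuracy when one output type dominates the traces.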

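
Before writing `analysis_summary.json`, the code above passes the summary through `clean_for_json`, another helper defined outside this excerpt. A plausible reason is that summaries built from pandas objects can contain values `json.dump` handles poorly, such as NaN floats, paths, or NumPy scalars. The following is a minimal, standard-library-only sketch of such a cleaner; its exact behavior is an assumption, not the tutorial's code.

```python
import json
import math
from pathlib import Path


def clean_for_json(value):
    """Hypothetical sketch: recursively convert a value into JSON-safe types."""
    if isinstance(value, dict):
        return {str(key): clean_for_json(item) for key, item in value.items()}
    if isinstance(value, (list, tuple, set)):
        return [clean_for_json(item) for item in value]
    if isinstance(value, Path):
        return str(value)
    # NumPy scalars expose .item(), which returns the matching Python scalar.
    if not isinstance(value, (str, bytes)) and callable(getattr(value, "item", None)):
        try:
            value = value.item()
        except (TypeError, ValueError):
            return str(value)
    # NaN and infinity are not valid JSON, so map them to null.
    if isinstance(value, float) and not math.isfinite(value):
        return None
    if value is None or isinstance(value, (str, int, float, bool)):
        return value
    # Fall back to a string for anything else (dates, custom objects, ...).
    return str(value)


payload = {
    "rows": 3,
    "mean": float("nan"),
    "report": Path("out") / "REPORT.md",
    "columns": ("context", "cot"),
}
cleaned = clean_for_json(payload)
print(json.dumps(cleaned, indent=2))
```

Mapping non-finite floats to `None` keeps the output strictly valid JSON, since Python's `json` module would otherwise emit the non-standard token `NaN`.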