SRE-Zero Full Eval Plan
+-----------------------------------------------------------------------------+
| Kind          | Baseline | Model             | Episodes | Output            |
|---------------+----------+-------------------+----------+-------------------|
| deterministic | scripted | deterministic/sc… |        5 | scripted_episode… |
+-----------------------------------------------------------------------------+

[22:35:37] START run=1/1 baseline=scripted model=deterministic/scripted episodes=5    run_all_eval.py:225
           END run=1/1 baseline=scripted model=deterministic/scripted score=93.198 success=1.000 errors=0    run_all_eval.py:278
           output=D:\SRE-Zero\notes\runs\managed\blog-qwen-easy-agent-styles-2026-06-13\outputs\scripted_episodes5.json

full sweep … 1/1 scripted | deterministic/scripted | load_balancer_tls_cert_expired 11/1… …

SRE-Zero Baseline Marks
+-----------------------------------------------------------------------------+
| Basel… | Model  | Marks | Succe… | Reward | Evide… | Inval… | Steps | Erro… |
|--------+--------+-------+--------+--------+--------+--------+-------+-------|
| scrip… | deter… |  93.2 |   1.00 |  0.941 |   1.00 |   0.00 |  4.73 |     0 |
+-----------------------------------------------------------------------------+

Wrote records and marks to
D:\SRE-Zero\notes\runs\managed\blog-qwen-easy-agent-styles-2026-06-13\target_summaries\scripted_deterministic_scripted.summary.json

SRE-Zero Marks by Difficulty
+-----------------------------------------------------------------------------+
|         |         |         |       |         |         | Root    | Correct |
| Diffic… | Baseli… | Model   | Marks | Success | Eviden… | Cause   | Fix     |
|---------+---------+---------+-------+---------+---------+---------+---------|
| easy    | script… | determ… |  93.2 |    1.00 |    1.00 |    1.00 |    1.00 |
+-----------------------------------------------------------------------------+

Wrote run log to
D:\SRE-Zero\notes\runs\managed\blog-qwen-easy-agent-styles-2026-06-13\logs\scripted_deterministic_scripted.run.log