Research
-
Mitigating Claude Code Reward Hacking Issues
September 2025 Research 4 min read
A practical guide to curbing Claude Code's reward-hacking tendencies, such as falling back on mock data and silently swallowing errors, with a comprehensive CLAUDE.md configuration.
-
Benchmarking LLMs Through Knowledge Graph Analysis
May 2025 Research 5 min read
A novel approach to evaluating LLM comprehension through knowledge-graph connectedness metrics, with visualizations and a comparative analysis of leading models.