Jon Bisila
R&D Cybersecurity, Senior · Sandia National Laboratories
Researching AI evaluation and building tools that help security teams work with LLMs.
MS in Computer Science, Georgia Tech (May 2026).
About
I work in cybersecurity research at Sandia National Laboratories, focusing on AI/ML evaluation. My work centers on understanding how models actually perform: building benchmarks, designing evaluation frameworks, and identifying gaps between benchmark results and real-world behavior.
Recently, I've been applying this evaluation mindset to security-relevant problems: assessing how LLMs can assist with reverse engineering, building tools for threat intelligence analysis, and contributing to red-team assessments of ML systems. I'm learning a lot from the security experts I work with while helping them understand how AI can support their work.
I'm currently pursuing an MS in Computer Science at Georgia Tech while working full-time, with coursework in Machine Learning and NLP.
Projects
Automated Threat Intelligence Platform
2025–Present · Principal Investigator
Building an AI-powered platform that uses frontier models to parse threat intelligence reports and generate structured, actionable outputs for security analysts. Collaborating with a UI/UX specialist on workflow integration.
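A minimal sketch of the kind of extraction step involved, assuming the Anthropic Python SDK; the schema fields and model name are illustrative, not the platform's actual design:

```python
# Sketch: extract structured indicators from a raw threat report.
# Assumes the Anthropic Python SDK (pip install anthropic) and an
# ANTHROPIC_API_KEY in the environment; schema fields are illustrative.
import json
import anthropic

SCHEMA_HINT = """Return only JSON with keys:
  "threat_actor": string or null,
  "iocs": list of {"type": "ip"|"domain"|"hash", "value": string},
  "ttps": list of MITRE ATT&CK technique IDs (e.g., "T1566"),
  "summary": one-sentence plain-language summary
"""

def extract_indicators(report_text: str) -> dict:
    client = anthropic.Anthropic()
    message = client.messages.create(
        model="claude-sonnet-4-5",  # illustrative model choice
        max_tokens=1024,
        messages=[{
            "role": "user",
            "content": f"Extract indicators from this report.\n{SCHEMA_HINT}\n\nReport:\n{report_text}",
        }],
    )
    raw = message.content[0].text
    data = json.loads(raw)  # fail loudly on malformed model output
    # Minimal validation before handing results to an analyst workflow.
    assert isinstance(data.get("iocs"), list)
    return data
```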
LLM-Assisted Reverse Engineering Framework
2025–Present · Lead Developer
Developing an evaluation framework for LLM-assisted reverse engineering, using Claude Code with custom skill definitions and PyGhidra MCP server integration. Exploring how well current models can support binary analysis workflows.
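A sketch of the scoring loop such a framework might use, assuming a benchmark of decompiled functions with analyst-written ground truth; `ask_model` stands in for whatever wraps the model under test (e.g., a Claude Code session), and the keyword-matching rule is a deliberately simple, illustrative proxy:

```python
# Sketch: score an LLM's function-understanding answers against ground truth.
from dataclasses import dataclass
from typing import Callable

@dataclass
class RETask:
    decompiled_c: str    # pseudo-C emitted by Ghidra/PyGhidra
    keywords: list[str]  # terms a correct answer should mention

def score(tasks: list[RETask], ask_model: Callable[[str], str]) -> float:
    hits = 0
    for task in tasks:
        answer = ask_model(
            f"In one sentence, what does this function do?\n\n{task.decompiled_c}"
        ).lower()
        # Credit the answer only if it mentions every expected keyword.
        if all(kw.lower() in answer for kw in task.keywords):
            hits += 1
    return hits / len(tasks)
```

Keyword matching is a crude stand-in; a real harness would likely use stricter rubrics or an LLM-as-judge step.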
C-to-Rust Code Translation Benchmark (RECAST)
2024–2025 · Principal Investigator
Led a team developing a security-focused benchmark for evaluating LLM capabilities in C-to-Rust translation. Designed an evaluation methodology targeting memory-safety vulnerabilities (CWE-based test cases) to assess whether models can eliminate security weaknesses while preserving functional correctness. Handed off project leadership for continued development.
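A sketch of what a CWE-based test case could look like, under the assumption that each case pairs a C snippet carrying a known weakness with Rust tests the translation must pass; the names and the compile step are illustrative, not RECAST's actual harness:

```python
# Sketch: evaluate one C-to-Rust translation case.
import subprocess
import tempfile
from dataclasses import dataclass
from pathlib import Path

@dataclass
class TranslationCase:
    cwe_id: str      # e.g. "CWE-787" (out-of-bounds write)
    c_source: str    # original C containing the weakness
    rust_tests: str  # #[test] functions exercising functional correctness

def evaluate(case: TranslationCase, rust_translation: str) -> bool:
    """Pass = the Rust builds and its tests run green.

    Memory-safety CWEs like out-of-bounds writes are largely ruled out
    by construction if the translation compiles without `unsafe`.
    """
    if "unsafe" in rust_translation:
        return False  # crude proxy for preserving memory safety
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "case.rs"
        test_bin = Path(tmp) / "case_test"
        src.write_text(rust_translation + "\n" + case.rust_tests)
        build = subprocess.run(
            ["rustc", "--test", str(src), "-o", str(test_bin)],
            capture_output=True,
        )
        if build.returncode != 0:
            return False
        run = subprocess.run([str(test_bin)], capture_output=True)
        return run.returncode == 0
```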
ML Systems Red Team
2023–2024 · Primary Contributor
Contributed to red-team assessments of production ML systems, evaluating vulnerabilities across adversarial-ML and model-performance attack surfaces. Focused on defining evaluation criteria and documenting exploitable failure modes.
Multilingual Entity Extraction Pipeline
2022–2023 · Lead Research Engineer
Built a production pipeline using specialized LLMs for domain-agnostic entity extraction and translation across multiple languages, automating previously manual analysis at web scale. Led experimental validation demonstrating that the custom LLM approach outperformed off-the-shelf solutions for the project's requirements.
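A sketch of the two-stage pipeline shape, assuming separate specialized models for translation and extraction; `translate` and `extract_entities` are hypothetical wrappers standing in for those models:

```python
# Sketch: translate-then-extract, so one extraction model covers all languages.
from typing import Callable

def process_document(
    text: str,
    translate: Callable[[str], str],                # any language -> English
    extract_entities: Callable[[str], list[dict]],  # English -> entity records
) -> list[dict]:
    english = translate(text)
    entities = extract_entities(english)
    # Tag each record with provenance so web-scale output stays auditable.
    for entity in entities:
        entity["source_excerpt"] = english[:200]
    return entities
```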
Select Publications
Seeking Enlightenment: Incorporating Evidence-Based Practice Techniques in a Research Software Engineering Team
Reed Milewicz, Jonathan Bisila, Miranda Mundt, Josh Teves
arXiv:2403.16827 (2024)
Anomaly Detection in Video Using Compression
Michael R. Smith, Renee Gooding, Jonathan Bisila, Christina Ting
IEEE Conference on Multimedia Information Processing and Retrieval (MIPR) (2024)
For the Public Good: Connecting, Retaining, and Recognizing Current and Future RSEs at National Labs
Miranda R. Mundt, Keith Beattie, Jonathan Bisila, et al.
Computing in Science & Engineering, IEEE (2023)
DevOps Pragmatic Practices and Potential Perils in Scientific Software Development
Reed Milewicz, Jonathan Bisila, Miranda Mundt, et al.
International Congress on Information and Communication Technology (ICICT), Springer (2023)
Topic Modeling with NLP for Identification of Nuclear Proliferation-Relevant Publications
Jonathan Bisila, Daniel Dunlavy, Zoe Nellie Gastelum, Craig D. Ulmer
Institute of Nuclear Materials Management (INMM) Annual Meeting (2020)
Contact
Feel free to reach out if you'd like to discuss AI/ML research, evaluation frameworks, or potential opportunities.