Jon Bisila

R&D Cybersecurity, Senior · Sandia National Laboratories

Researching AI evaluation and building tools that help security teams work with LLMs.

MS in Computer Science, Georgia Tech (expected May 2026).

About

I work in cybersecurity research at Sandia National Laboratories, focusing on AI/ML evaluation. My work centers on understanding how models actually perform: building benchmarks, designing evaluation frameworks, and identifying gaps between benchmark results and real-world behavior.

Recently, I've been applying this evaluation mindset to security-relevant problems: assessing how LLMs can assist with reverse engineering, building tools for threat intelligence analysis, and contributing to red-team assessments of ML systems. I'm learning a lot from the security experts I work with while helping them understand how AI can support their work.

I'm currently pursuing an MS in Computer Science at Georgia Tech while working full-time, with coursework in Machine Learning and NLP.

Projects

Automated Threat Intelligence Platform

2025–Present

Principal Investigator

Building an AI-powered platform that uses frontier models to parse threat intelligence reports and generate structured, actionable outputs for security analysts. Collaborating with a UI/UX specialist on workflow integration. The core extraction pattern is sketched below.

Python · LLMs · RAG
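The pattern is schema-constrained extraction: prompt the model against a fixed schema, then validate what comes back before an analyst sees it. A minimal sketch, assuming a generic text-in/text-out client (call_llm) and an illustrative schema; this is not the production code:

```python
# Illustrative schema-constrained extraction; call_llm is a placeholder
# for any text-in/text-out model client.
import json
from pydantic import BaseModel

class Indicator(BaseModel):
    kind: str     # e.g. "ip", "domain", "hash"
    value: str
    context: str  # sentence where the indicator appeared

class ReportSummary(BaseModel):
    threat_actor: str | None
    indicators: list[Indicator]
    recommended_actions: list[str]

PROMPT = (
    "Extract threat intelligence from the report below. "
    "Respond with JSON matching this schema:\n"
    f"{json.dumps(ReportSummary.model_json_schema())}\n\nReport:\n"
)

def parse_report(report_text: str, call_llm) -> ReportSummary:
    raw = call_llm(PROMPT + report_text)
    # Validation rejects malformed model output before it reaches analysts.
    return ReportSummary.model_validate_json(raw)
```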

LLM-Assisted Reverse Engineering Framework

2025–Present

Lead Developer

Developing an evaluation framework for LLM-assisted reverse engineering, using Claude Code with custom skill definitions and PyGhidra MCP server integration. Exploring how well current models can support binary analysis workflows; the general shape of the scoring harness is sketched below.

Python · Claude Code · PyGhidra · MCP
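The evaluation question is roughly: given a decompiled function with a known ground-truth description, how close does the model's summary get? A minimal sketch of that harness shape, with a crude lexical score and placeholder names (RETask, summarize); the actual framework drives Claude Code against the PyGhidra MCP server rather than a bare callable:

```python
# Harness shape only; names are illustrative, not the real framework.
from dataclasses import dataclass

@dataclass
class RETask:
    decompiled_c: str        # decompiler output for one function
    reference_summary: str   # analyst-written ground truth

def token_overlap(pred: str, ref: str) -> float:
    # Crude lexical score; a real harness would use stronger judges.
    p, r = set(pred.lower().split()), set(ref.lower().split())
    return len(p & r) / max(len(r), 1)

def evaluate(tasks: list[RETask], summarize) -> float:
    # summarize is any callable wrapping the model under test.
    scores = [token_overlap(summarize(t.decompiled_c), t.reference_summary)
              for t in tasks]
    return sum(scores) / max(len(scores), 1)
```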

C-to-Rust Code Translation Benchmark (RECAST)

2024–2025

Principal Investigator

Led a team developing a security-focused benchmark for evaluating LLM capabilities in C-to-Rust translation. Designed an evaluation methodology targeting memory safety vulnerabilities (CWE-based test cases) to assess whether models can eliminate security weaknesses while preserving functional correctness; an illustrative per-case check is sketched below. Handed off project leadership for continued development.

Python · LLMs · Rust · C · vLLM · HuggingFace
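Each benchmark case pairs a C program containing a known weakness with tests that exercise it; a translation passes only if the Rust compiles and the weakness is gone. A hedged sketch of what a single-case check might look like (the case layout here is hypothetical, not RECAST's actual harness):

```python
# Hypothetical single-case check: does the translated Rust compile and
# pass the CWE-targeted tests for this case?
import subprocess
from pathlib import Path

def passes_case(case_dir: Path) -> bool:
    # case_dir is assumed to hold a Cargo project containing the model's
    # translation plus tests that exercise the original C weakness
    # (e.g. a CWE-787 out-of-bounds write).
    result = subprocess.run(
        ["cargo", "test", "--quiet"],
        cwd=case_dir, capture_output=True, text=True,
    )
    # In safe Rust the weakness should surface as a compile error or a
    # panic the tests assert on, so the exit code is the signal.
    return result.returncode == 0
```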

ML Systems Red Team

2023–2024

Primary Contributor

Contributed to red-team assessments of production ML systems, evaluating vulnerabilities across adversarial-ML and model-performance attack surfaces. Focused on defining evaluation criteria and documenting failure modes that could be exploited; one representative probe is sketched below.

Python · Adversarial ML
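For flavor, one canonical probe on the adversarial-ML side is the fast gradient sign method (FGSM): nudge an input in the direction that most increases the model's loss and see whether the prediction flips. A minimal PyTorch sketch, illustrative only and not the assessment tooling:

```python
# FGSM: perturb x along the sign of the loss gradient to induce
# misclassification; eps bounds the perturbation size.
import torch
import torch.nn.functional as F

def fgsm(model, x, y, eps=0.03):
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    # Step in the gradient-sign direction, keeping pixels in [0, 1].
    return (x + eps * x.grad.sign()).clamp(0, 1).detach()
```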

Multilingual Entity Extraction Pipeline

2022–2023

Lead Research Engineer

Built a production pipeline using specialized LLMs for domain-agnostic entity extraction and translation across multiple languages, automating previously manual analysis at web scale. Led experimental validation demonstrating that the custom-model approach outperformed off-the-shelf solutions for the project's requirements; the pipeline's general shape is sketched below.

Python · HuggingFace · AWS · Docker · vLLM
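The extract-then-translate shape can be shown with the HuggingFace pipeline API. A minimal sketch: the public model names below are stand-ins for the specialized models the production system actually served, and batching, serving, and language detection are omitted:

```python
# Extract entities from non-English text, then translate each surface
# form to English. Model names are public stand-ins, not the production models.
from transformers import pipeline

ner = pipeline("ner", model="Davlan/xlm-roberta-base-ner-hrl",
               aggregation_strategy="simple")  # multilingual NER
translate = pipeline("translation", model="facebook/nllb-200-distilled-600M",
                     src_lang="fra_Latn", tgt_lang="eng_Latn")  # French -> English

def extract_entities(doc: str) -> list[dict]:
    entities = ner(doc)  # dicts with "word", "entity_group", "score", ...
    for ent in entities:
        ent["english"] = translate(ent["word"])[0]["translation_text"]
    return entities
```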

Select Publications

Seeking Enlightenment: Incorporating Evidence-Based Practice Techniques in a Research Software Engineering Team

Reed Milewicz, Jonathan Bisila, Miranda Mundt, Josh Teves

arXiv:2403.16827 (2024)

Anomaly Detection in Video Using Compression

Michael R. Smith, Renee Gooding, Jonathan Bisila, Christina Ting

IEEE Conference on Multimedia Information Processing and Retrieval (MIPR) (2024)

For the Public Good: Connecting, Retaining, and Recognizing Current and Future RSEs at National Labs

Miranda R. Mundt, Keith Beattie, Jonathan Bisila, et al.

Computing in Science & Engineering, IEEE (2023)

DevOps Pragmatic Practices and Potential Perils in Scientific Software Development

Reed Milewicz, Jonathan Bisila, Miranda Mundt, et al.

International Congress on Information and Communication Technology (ICICT), Springer (2023)

Topic Modeling with NLP for Identification of Nuclear Proliferation-Relevant Publications

Jonathan Bisila, Daniel Dunlavy, Zoe Nellie Gastelum, Craig D. Ulmer

Institute of Nuclear Materials Management (INMM) Annual Meeting (2020)

Contact

Feel free to reach out if you'd like to discuss AI/ML research, evaluation frameworks, or potential opportunities.