[RFC] JSIR: A High-Level IR for JavaScript - MLIR - LLVM Discussion Forums
This RFC introduces JSIR, a high-level IR for JavaScript:
- JSIR preserves all information from the AST and supports high-fidelity round-trip between source ↔ AST ↔ JSIR;
- JSIR uses MLIR regions to represent control flow structures;
- JSIR supports dataflow analysis.
JSIR is developed and deployed in production at Google for code analysis and transform use cases.
JSIR is open source on GitHub: google/jsir (Next-generation JavaScript analysis tooling).
Industry trend of building high-level language-specific IRs
The compiler industry is moving towards building high-level language-specific IRs. For example, the Rust and Swift compilers perform certain analyses on their high-level IRs before lowering down to LLVM. There are also a number of ongoing projects in this direction, such as Clang IR, Mojo, and Carbon.
The need for a high-level JavaScript IR
Why do we need a high-level IR for JavaScript specifically? While much of JavaScript tooling relies on ASTs (like ESTree), complex analyses require a control flow graph (CFG) and dataflow analysis capabilities, which JSIR provides by using the MLIR framework.
[2024 LLVM DevMtg] JSIR - Adversarial JavaScript Analysis with MLIR PDF
What is JSIR?
JSIR is designed to bridge the gap between abstract syntax trees (ASTs) and low-level IRs. Unlike typical IRs that lose source-level information during lowering, JSIR is "reversible," meaning it supports a lossless round-trip between:
Source ↔ AST ↔ JSIR
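To make the AST ↔ IR leg of that round-trip concrete, here is a minimal Python sketch. It is not JSIR's implementation or API: the `Op` class, `to_ir`, and `from_ir` are hypothetical names, and the input is a hand-written dict loosely modeled on ESTree/Babel node shapes for `if (x) { y = 1; }`. The only point it illustrates is that a lowering is reversible when every field of the AST, including "empty" ones, is carried into the IR.

```python
from dataclasses import dataclass, field


@dataclass
class Op:
    """Toy operation: a name, leaf attributes, and named nested regions."""
    name: str
    attrs: dict = field(default_factory=dict)
    regions: dict = field(default_factory=dict)  # field name -> list of Op
    single: set = field(default_factory=set)     # regions that held one node, not a list


def to_ir(node):
    """Lower an AST dict into a toy Op without dropping any field."""
    op = Op(node["type"])
    for key, value in node.items():
        if key == "type":
            continue
        if isinstance(value, dict):        # one child node -> one-op region
            op.regions[key] = [to_ir(value)]
            op.single.add(key)
        elif isinstance(value, list):      # list of child nodes -> multi-op region
            op.regions[key] = [to_ir(child) for child in value]
        else:                              # leaf data: names, operators, literals, None
            op.attrs[key] = value
    return op


def from_ir(op):
    """Lift a toy Op back into the AST dict it came from."""
    node = {"type": op.name, **op.attrs}
    for key, ops in op.regions.items():
        children = [from_ir(child) for child in ops]
        node[key] = children[0] if key in op.single else children
    return node


# Hand-written AST for: if (x) { y = 1; }
ast = {
    "type": "IfStatement",
    "test": {"type": "Identifier", "name": "x"},
    "consequent": {
        "type": "BlockStatement",
        "body": [{
            "type": "ExpressionStatement",
            "expression": {
                "type": "AssignmentExpression",
                "operator": "=",
                "left": {"type": "Identifier", "name": "y"},
                "right": {"type": "NumericLiteral", "value": 1},
            },
        }],
    },
    "alternate": None,
}

ir = to_ir(ast)
assert from_ir(ir) == ast  # the toy lowering loses nothing
```

Note that the sketch has to remember whether a child was a single node or a one-element list (`single`), and it keeps `alternate: None` as an attribute. Details of that kind are what separate a lossless lowering from a lossy one.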
Key Features
High-Fidelity Round-tripping: It preserves enough information to lift the IR back into valid JavaScript source code, achieving a 99.9%+ success rate in internal Google evaluations.
MLIR-Powered Analysis: It uses MLIR regions to represent JavaScript control flow structures (like if, while, and logical expressions) as nested blocks rather than flat graphs.
Enhanced Dataflow API: It provides a simplified wrapper over MLIR’s dataflow analysis, making it easier for developers to define lattices and transfer functions without manually managing worklists.
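For readers unfamiliar with the terms in the dataflow item, the following Python sketch shows a lattice, a transfer function, and the hand-written worklist loop that such a wrapper is described as hiding. It does not use JSIR's or MLIR's actual API. Everything here is an assumption made for illustration: the names `join`, `transfer`, and `analyze`, the flat control flow graph encoding, and the choice of constant propagation as the analysis.

```python
BOTTOM, TOP = "bottom", "top"  # hypothetical lattice extremes: "no info" / "not a constant"


def join(a, b):
    """Lattice join for one variable: equal constants stay, conflicts go to TOP."""
    if a == BOTTOM:
        return b
    if b == BOTTOM:
        return a
    return a if a == b else TOP


def join_states(s, t):
    """Pointwise join of two variable -> value maps."""
    return {v: join(s.get(v, BOTTOM), t.get(v, BOTTOM)) for v in s.keys() | t.keys()}


def transfer(block, state):
    """Transfer function: apply each (target, source) assignment in order.

    A string source is a variable read; anything else is a constant.
    """
    out = dict(state)
    for target, source in block:
        out[target] = out.get(source, BOTTOM) if isinstance(source, str) else source
    return out


def analyze(blocks, successors, entry):
    """Manual worklist iteration to a fixed point; returns each block's exit state."""
    in_state = {name: {} for name in blocks}
    out_state = {name: {} for name in blocks}
    visited = set()
    worklist = [entry]
    while worklist:
        name = worklist.pop()
        new_out = transfer(blocks[name], in_state[name])
        if name in visited and new_out == out_state[name]:
            continue  # nothing changed, so successors need no update
        visited.add(name)
        out_state[name] = new_out
        for succ in successors.get(name, []):
            in_state[succ] = join_states(in_state[succ], new_out)
            worklist.append(succ)
    return out_state


# Flat CFG for:  x = 1; if (c) { y = 2; } else { y = 3; }  z = x;
blocks = {
    "entry": [("x", 1)],
    "then": [("y", 2)],
    "else": [("y", 3)],
    "exit": [("z", "x")],
}
successors = {"entry": ["then", "else"], "then": ["exit"], "else": ["exit"]}

result = analyze(blocks, successors, "entry")
assert result["exit"] == {"x": 1, "y": TOP, "z": 1}
```

At the merge point, `x` is still the constant 1 on both paths, while `y` is 2 on one path and 3 on the other, so it joins to `TOP`. An analysis author mostly cares about `join` and `transfer`; the `analyze` loop is the bookkeeping that a higher-level API can take over.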
Primary Use Cases
Google currently uses JSIR in production for:
Decompilation: Lifting Hermes bytecode back into readable JavaScript.
Deobfuscation: Transforming obfuscated code into a clearer format, sometimes in combination with LLMs like Gemini.
Code Transformation: Performing complex refactoring or optimizations that require both dataflow insights and the ability to output source code.
Current Status & Future
Open Source: The project is hosted on GitHub.
Community Integration: The authors are exploring upstreaming JSIR to the LLVM/MLIR project, though they note practical challenges regarding dependencies like QuickJS (for constant folding) and Babel/SWC (for parsing).