Thursday, May 07, 2026

IR for JavaScript - MLIR

 [RFC] JSIR: A High-Level IR for JavaScript - MLIR - LLVM Discussion Forums

This RFC introduces JSIR, a high-level IR for JavaScript:

  • JSIR preserves all information from the AST and supports high-fidelity round-trip between source ↔ AST ↔ JSIR;
  • JSIR uses MLIR regions to represent control flow structures;
  • JSIR supports dataflow analysis.

JSIR is developed at Google and deployed there in production for code analysis and transformation use cases.

JSIR is open source on GitHub: google/jsir ("Next-generation JavaScript analysis tooling").

Industry trend of building high-level language-specific IRs

The compiler industry is moving towards building high-level language-specific IRs. For example, the Rust and Swift compilers perform certain analyses on their high-level IRs before lowering to LLVM IR. There are also a number of ongoing projects in this direction, such as Clang IR, Mojo, and Carbon.

The need for a high-level JavaScript IR

Why do we need a high-level IR for JavaScript specifically? While much of JavaScript tooling relies on ASTs (like ESTree), complex analyses require a control flow graph (CFG) and dataflow analysis capabilities, which JSIR provides by using the MLIR framework.


Related talk: [2024 LLVM DevMtg] JSIR - Adversarial JavaScript Analysis with MLIR (PDF)

This RFC (Request for Comments) introduces JSIR, a high-level Intermediate Representation (IR) for JavaScript developed by Google and built on the MLIR framework.

What is JSIR?

JSIR is designed to bridge the gap between abstract syntax trees (ASTs) and low-level IRs. Unlike typical IRs that lose source-level information during lowering, JSIR is "reversible," meaning it supports a lossless round-trip between:

Source ↔ AST ↔ JSIR
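To make "reversible" concrete, here is a minimal Python sketch of a lossless AST ↔ IR conversion. It is a toy, not JSIR's implementation: the `AstNode` and `Op` classes, the `toy.` op-name prefix, and the `raw` field on literals are all assumptions made for illustration (node names merely follow Babel-style naming). The point is that every AST field is carried into the IR as an attribute, so converting back yields an identical tree and the printer can reproduce details such as the original spelling of a literal.

```python
from dataclasses import dataclass, field


@dataclass
class AstNode:
    """Toy AST node: a type name, scalar fields, and child nodes."""
    type: str
    fields: dict = field(default_factory=dict)
    children: list = field(default_factory=list)


@dataclass
class Op:
    """Toy IR operation: a name, attributes, and operand ops (hypothetical)."""
    name: str
    attrs: dict = field(default_factory=dict)
    operands: list = field(default_factory=list)


def ast_to_ir(node: AstNode) -> Op:
    # Copy every field into the op's attributes so nothing is dropped.
    return Op("toy." + node.type, dict(node.fields),
              [ast_to_ir(child) for child in node.children])


def ir_to_ast(op: Op) -> AstNode:
    # Inverse of ast_to_ir: strip the prefix and restore fields and children.
    return AstNode(op.name[len("toy."):], dict(op.attrs),
                   [ir_to_ast(operand) for operand in op.operands])


def print_source(node: AstNode) -> str:
    if node.type == "NumericLiteral":
        return node.fields["raw"]  # original spelling, e.g. 0x10 rather than 16
    if node.type == "Identifier":
        return node.fields["name"]
    if node.type == "BinaryExpression":
        left, right = node.children
        return f"{print_source(left)} {node.fields['operator']} {print_source(right)}"
    raise ValueError(f"unsupported node type: {node.type}")


ast = AstNode("BinaryExpression", {"operator": "+"}, [
    AstNode("Identifier", {"name": "x"}),
    AstNode("NumericLiteral", {"value": 16, "raw": "0x10"}),
])
ir = ast_to_ir(ast)
round_tripped = ir_to_ast(ir)

assert round_tripped == ast          # AST -> IR -> AST is the identity here
print(print_source(round_tripped))   # x + 0x10
```

An IR that stored only the numeric value 16 could still print valid JavaScript, but it could not reproduce the author's `0x10`; keeping the extra field is what makes the round trip high-fidelity in this sketch.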

Key Features

  • High-Fidelity Round-tripping: It preserves enough information to lift the IR back into valid JavaScript source code, achieving a 99.9%+ success rate in internal Google evaluations.

  • MLIR-Powered Analysis: It uses MLIR regions to represent JavaScript control flow structures (like if, while, and logical expressions) as nested blocks rather than flat graphs.

  • Enhanced Dataflow API: It provides a simplified wrapper over MLIR’s dataflow analysis, making it easier for developers to define lattices and transfer functions without manually managing worklists.
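The region and dataflow points above can be illustrated with a small Python sketch of constant propagation over nested control flow. This is not JSIR's API (JSIR is built on MLIR, whose dataflow framework is C++); the `Assign`, `If`, and `While` classes and every function name here are hypothetical, and branch and loop conditions are omitted for brevity. The sketch shows the two pieces an analysis author supplies, a lattice `join` and a per-operation transfer function, and how nested regions let the analysis recurse over structure instead of walking a flat graph.

```python
from dataclasses import dataclass

TOP = object()  # lattice top: "not a known constant"; a missing key is bottom


def join_values(a, b):
    # Two equal constants stay constant; anything else goes to TOP.
    return a if a == b else TOP


def join_states(s1, s2):
    # Pointwise join of two variable -> value maps (bottom joined with v is v).
    out = dict(s1)
    for var, val in s2.items():
        out[var] = join_values(out[var], val) if var in out else val
    return out


@dataclass
class Assign:
    target: str
    expr: object  # int constant, variable name (str), or ("+", lhs, rhs)


@dataclass
class If:
    then_region: list  # nested list of ops, standing in for a region
    else_region: list


@dataclass
class While:
    body_region: list


def evaluate(expr, state):
    if isinstance(expr, int):
        return expr
    if isinstance(expr, str):
        return state.get(expr, TOP)  # unknown variables are treated conservatively
    _, lhs, rhs = expr  # only "+" is modelled in this sketch
    left, right = evaluate(lhs, state), evaluate(rhs, state)
    return TOP if left is TOP or right is TOP else left + right


def transfer(op, state):
    if isinstance(op, Assign):
        new_state = dict(state)
        new_state[op.target] = evaluate(op.expr, state)
        return new_state
    if isinstance(op, If):
        # Analyze both regions from the same input, then join the results.
        return join_states(run_region(op.then_region, state),
                           run_region(op.else_region, state))
    if isinstance(op, While):
        # The body may run zero or more times: iterate to a fixpoint.
        current = state
        while True:
            updated = join_states(current, run_region(op.body_region, current))
            if updated == current:
                return current
            current = updated
    raise TypeError(f"unsupported op: {op!r}")


def run_region(region, state):
    for op in region:
        state = transfer(op, state)
    return state


program = [
    Assign("x", 1),
    If([Assign("y", 2), Assign("z", 3)],
       [Assign("y", 2), Assign("z", 4)]),
    Assign("w", ("+", "x", "y")),
    While([Assign("x", ("+", "x", 1))]),
]
result = run_region(program, {})
print({k: ("TOP" if v is TOP else v) for k, v in result.items()})
```

In the result, `y` and `w` remain constants (2 and 3), `z` becomes TOP because the two branches disagree, and `x` becomes TOP because the loop body changes it. The fixpoint loop terminates because each variable can only move upward through a three-level lattice (no information, constant, TOP).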

Primary Use Cases

Google currently uses JSIR in production for:

  • Decompilation: Lifting Hermes bytecode back into readable JavaScript.

  • Deobfuscation: Transforming obfuscated code into a clearer format, sometimes in combination with LLMs like Gemini.

  • Code Transformation: Performing complex refactoring or optimizations that require both dataflow insights and the ability to output source code.
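As a rough illustration of the deobfuscation use case, the following Python sketch folds string literals that an obfuscator has split into pieces. It is a toy under stated assumptions: the `Name` and `Add` classes and the `fold` function are hypothetical, only string-plus-string concatenation is folded (so JavaScript's type coercion rules never come into play), and it says nothing about how JSIR itself implements such transforms.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Name:
    """A variable reference whose value is unknown at analysis time."""
    id: str


@dataclass(frozen=True)
class Add:
    """A binary "+" expression; string literals are plain Python str leaves."""
    left: object
    right: object


def fold(expr):
    """Fold "+" bottom-up wherever both operands are string literals."""
    if isinstance(expr, Add):
        left, right = fold(expr.left), fold(expr.right)
        if isinstance(left, str) and isinstance(right, str):
            return left + right  # both sides known: replace with one literal
        return Add(left, right)  # keep the operation, with folded children
    return expr                  # literals and names are left untouched


# Models the expression: ("doc" + "ument") + (suffix + ("co" + "okie"))
obfuscated = Add(Add("doc", "ument"), Add(Name("suffix"), Add("co", "okie")))
folded = fold(obfuscated)

assert folded == Add("document", Add(Name("suffix"), "cookie"))
```

The subexpression involving `suffix` is preserved because its value is not known; a dataflow analysis that proved `suffix` constant would let a transform like this fold further.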

Current Status & Future

  • Open Source: The project is hosted on GitHub.

  • Community Integration: The authors are exploring upstreaming JSIR to the LLVM/MLIR project, though they note practical challenges regarding dependencies like QuickJS (for constant folding) and Babel/SWC (for parsing).