Thursday, April 10, 2025

AI HW: NVIDIA DGX Spark vs AWS


NVIDIA DGX Spark US - A Grace Blackwell AI supercomputer on your desk

It is rated at 1,000 "TOPS" (tera operations per second) of AI performance, which is very high for a desktop machine, for $3,999.


This is the ASUS Ascent GX10 a NVIDIA GB10 Mini PC with 128GB of Memory and 200GbE

Apparently there are other similar systems with the same GPU, like this one from ASUS, for $1,000 less.
On the NVIDIA GB10 motherboard, we can see the NVIDIA GB10 chip with 10 Cortex-X925 and 10 Cortex-A725 Arm cores for 20 cores total.

For comparison, here is pricing for a comparable cloud-based instance, the AWS inf2.8xlarge (its specifications are listed further below).

The math gets "interesting": when reserved for 3 years, the instance price drops to $0.79 per hour.
But the grand total for 3 years is still just about $20,000 ($0.79 × 24 × 365 × 3 ≈ $20,761)!
So, using the rounded $20,000, the AWS price is about 5x the NVIDIA HW price, or almost 6.7x the ASUS HW price!
The cost of electricity is likely not even near the price of the HW.
So for those with a significant, sustained AI load, owning HW can be very cost-effective.
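
The comparison above can be reproduced with a few lines of TypeScript. This is only a sketch under the post's own assumptions: the instance runs around the clock for the full 3 years, and the ASUS price of $2,999 is inferred from "$1,000 less" rather than quoted from a price list. The function names are illustrative.

```typescript
// Prices come from the post; the $2,999 ASUS figure is inferred from "$1,000 less".
const HOURS_PER_YEAR = 24 * 365; // 8,760 hours, ignoring leap days

/** Total cost of renting an instance around the clock for a number of years. */
function rentalTotal(hourlyUsd: number, years: number): number {
  return hourlyUsd * HOURS_PER_YEAR * years;
}

/** How many times more expensive renting is than a one-time purchase. */
function costRatio(rentalUsd: number, purchaseUsd: number): number {
  return rentalUsd / purchaseUsd;
}

const awsThreeYears = rentalTotal(0.79, 3); // inf2.8xlarge, 3-year reserved rate
const dgxSparkUsd = 3999;
const asusGx10Usd = 2999; // assumption, see above

console.log(`AWS, 3 years: $${awsThreeYears.toFixed(0)}`);
console.log(`vs DGX Spark: ${costRatio(awsThreeYears, dgxSparkUsd).toFixed(1)}x`);
console.log(`vs ASUS GX10: ${costRatio(awsThreeYears, asusGx10Usd).toFixed(1)}x`);
```

With the unrounded total the ratios come out slightly higher than the rounded figures in the text, about 5.2x and 6.9x. Electricity, resale value and utilization below 100% are not modeled here.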

AWS Inferentia2 instance inf2.8xlarge:

  • Inferentia2 chips: 1
  • Accelerator memory: 32 GB
  • vCPU: 32
  • Memory: 128 GiB
  • Local storage: EBS only
  • Inter-chip interconnect: N/A
  • Network bandwidth: up to 25 Gbps
  • EBS bandwidth: 10 Gbps
  • On-demand price: $1.97 per hour
  • 1-year reserved instance: $1.81 per hour
  • 3-year reserved instance: $0.79 per hour

AI with TypeScript: Deno, Llama, Jupyter

The Dino 🦕, the Llama 🦙, and the Whale 🐋

  • An environment for our language model – while you can connect up to various LLM hosting environments via APIs, we are going to leverage the Ollama framework for running language models on your local machine.
  • A large language model – we will use a resized version of DeepSeek R1 that can run locally.
  • A notebook – Jupyter Notebook for interactive code and text.
  • Deno – a runtime that includes a built-in Jupyter kernel. We assume a recent version is installed.
  • An IDE – we’ll use VS Code with built-in Jupyter Notebook support and the Deno extension.
  • An AI library/framework – LangChain.js to simplify interactions with the LLM.
  • A schema validator – we’ll use zod to structure and validate LLM output.
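
To make the "structured output" idea concrete, here is a minimal, dependency-free sketch. It is not the article's code: the `Answer` shape is hypothetical, a hand-written type guard stands in for the zod schema, and the model tag `deepseek-r1:8b` is an assumption about which resized DeepSeek R1 variant is pulled. The request goes to Ollama's `/api/generate` endpoint on its default local port; how well a reasoning model respects JSON mode may vary.

```typescript
// Hypothetical shape we ask the model to return; not taken from the article.
interface Answer {
  answer: string;
  confidence: number; // expected range 0..1
}

/** Hand-written type guard; in the described setup a zod schema would do this job. */
function isAnswer(value: unknown): value is Answer {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return typeof v.answer === "string" &&
    typeof v.confidence === "number" &&
    v.confidence >= 0 && v.confidence <= 1;
}

/** Parse raw model text; return a validated Answer, or null if it does not fit. */
function parseAnswer(raw: string): Answer | null {
  try {
    const parsed: unknown = JSON.parse(raw);
    return isAnswer(parsed) ? parsed : null;
  } catch {
    return null; // the text was not valid JSON at all
  }
}

/** Ask a local Ollama server for a JSON reply. Requires Ollama to be running. */
async function askLocalModel(
  prompt: string,
  model = "deepseek-r1:8b", // assumed model tag
): Promise<Answer | null> {
  const res = await fetch("http://localhost:11434/api/generate", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ model, prompt, stream: false, format: "json" }),
  });
  if (!res.ok) throw new Error(`Ollama returned HTTP ${res.status}`);
  const body = await res.json() as { response: string };
  return parseAnswer(body.response);
}

// Validation works without a running server:
console.log(parseAnswer('{"answer":"Deno","confidence":0.9}'));
console.log(parseAnswer("not json"));
```

In a notebook cell one would call `await askLocalModel("...")` and handle the `null` case, for example by retrying with a stricter prompt. LangChain.js and zod wrap the same two steps, the model call and the schema check, behind a higher-level API.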

Build a custom RAG AI agent in TypeScript and Jupyter

  • Retrieve and prepare several blog posts to be used by our AI agent.
  • Create an AI agent which has several tools:
    • A tool to query the blog posts in the database.
    • A tool to grade whether the documents are relevant to the query.
    • The ability to rewrite and improve the query if required.
  • Finally, we generate a response to the query based on our collection of information.
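
The control flow of such an agent can be sketched without any libraries. This is a toy illustration, not the article's implementation: keyword overlap replaces both the vector database and the LLM-based grader, the rewrite step only drops filler words, and all names (`retrieve`, `isRelevant`, `rewrite`, `respond`) and sample posts are hypothetical.

```typescript
// Toy stand-ins: keyword overlap replaces the vector store and the LLM grader.
interface Doc {
  title: string;
  text: string;
}

const tokens = (s: string): string[] => s.toLowerCase().match(/[a-z0-9]+/g) ?? [];

/** Share of query tokens that also occur in the document (0..1). */
function overlap(query: string, doc: Doc): number {
  const q = tokens(query);
  if (q.length === 0) return 0;
  const d = new Set(tokens(doc.text));
  return q.filter((t) => d.has(t)).length / q.length;
}

/** Tool 1: query the stored posts, best match first. */
function retrieve(query: string, docs: Doc[], k = 2): Doc[] {
  return [...docs]
    .sort((a, b) => overlap(query, b) - overlap(query, a))
    .slice(0, k);
}

/** Tool 2: grade whether a document is relevant to the query. */
function isRelevant(query: string, doc: Doc, threshold = 0.6): boolean {
  return overlap(query, doc) >= threshold;
}

/** Tool 3: rewrite the query; this sketch only drops filler words. */
function rewrite(query: string): string {
  const filler = new Set(["how", "do", "i", "can", "the", "a", "please", "what", "is"]);
  return tokens(query).filter((t) => !filler.has(t)).join(" ");
}

/** Agent loop: retrieve, grade, retry once with a rewritten query, then respond. */
function respond(query: string, docs: Doc[]): string {
  let q = query;
  for (let attempt = 0; attempt < 2; attempt++) {
    const relevant = retrieve(q, docs).filter((d) => isRelevant(q, d));
    if (relevant.length > 0) {
      // A real agent would pass these documents to the LLM as context.
      return `Based on: ${relevant.map((d) => d.title).join(", ")}`;
    }
    q = rewrite(q);
  }
  return "No relevant posts found.";
}

const posts: Doc[] = [
  { title: "Deno kernel", text: "Run TypeScript notebooks with the Deno Jupyter kernel" },
  { title: "Ollama setup", text: "Install Ollama and pull a local model" },
];

// The first pass fails the relevance grade; the rewritten query succeeds.
console.log(respond("How do I please run the Jupyter kernel", posts));
// prints "Based on: Deno kernel"
```

In a fuller version, `retrieve` would query embeddings in a database, while `isRelevant`, `rewrite` and the final response would each be an LLM call; the loop structure stays the same.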