Sunday, April 13, 2025

AI power usage estimate error: 120,000 times

Excellent podcast interview with a very prominent person from the computing world,
co-creator of the RISC processor design, including the modern and very popular "open hardware" RISC-V.

But the most interesting part of the conversation was about a mistake in a scientific paper
where the energy used for AI training was overestimated by 120,000 times!

That estimate was based on public data from Google that was misinterpreted, without any insight into how Google actually does AI training. David Patterson knows this well, since he now works at Google
and is deeply involved in energy optimization.

And now, with this wrong info published and cited many times, people are making all kinds of assumptions and even plans! Yes, data centers use a lot of energy, but nowhere near the levels some people claim.

Turing Award Special: A Conversation with David Patterson - Software Engineering Daily


Good News About the Carbon Footprint of Machine Learning Training

"Unfortunately, some ... papers misinterpreted the NAS estimate as the training cost for the model it discovered, yet emissions for this particular NAS are ~1300x larger than for training the model. These papers estimated that training the Evolved Transformer model takes two million GPU hours, costs millions of dollars, and that its carbon emissions are equivalent to five times the lifetime emissions of a car. In reality, training the Evolved Transformer model on the task examined by the UMass researchers and following the 4M best practices takes 120 TPUv2 hours, costs $40, and emits only 2.4 kg (0.00004 car lifetimes), 120,000x less. This gap is nearly as large as if one overestimated the CO2e to manufacture a car by 100x and then used that number as the CO2e for driving a car."

David Patterson (computer scientist) - Wikipedia

JavaScript V8 engine internals

very "technical" and interesting

Land ahoy: leaving the Sea of Nodes · V8  
"V8’s end-tier optimizing compiler, Turbofan, is famously one of the few large-scale production compilers to use Sea of Nodes (SoN). However, since almost 3 years ago, we’ve started to get rid of Sea of Nodes and fall back to a more traditional Control-Flow Graph (CFG) Intermediate Representation (IR), which we named Turboshaft. By now, the whole JavaScript backend of Turbofan uses Turboshaft instead, and WebAssembly uses Turboshaft throughout its whole pipeline."



Google created V8 for its Chrome browser, and both were first released in 2008.[4] The lead developer of V8 was Lars Bak, and it was named after the powerful car engine.[5] For several years, Chrome was faster than other browsers at executing JavaScript


In 1994, he joined LongView Technologies LLC, where he designed and implemented high performance virtual machines for both Smalltalk and Java. After Sun Microsystems acquired LongView in 1997, Bak became engineering manager and technical lead in the HotSpot team at Sun's Java Software Division where he developed a high-performance Java virtual machine

With a team of 12 engineers, Bak coordinated the development of the V8 JavaScript interpreter for Chrome

Bak co-developed the Dart programming language presented at the 2011 Goto conference in Aarhus, Denmark



Saturday, April 12, 2025

EV: Volvo EX90 (vs Kia EV9, Rivian, Tesla Model X)

apparently a very good luxury 7-seat EV SUV; $80K

likely will be much more with tariffs...

Volvo EX90 - Rolls Royce Luxury, Model X Price! - YouTube by AutoFocus / MKBHD


Volvo EX90 | Fully electric 7-seater SUV | Volvo Cars US


2025 Volvo EX90 Is the Electric SUV You Didn't Know You Needed @CarAndDriver
The ultraquiet EV offers more room than an XC90 and delivers up to 310 miles of range.


$56K vs $81K... big difference in price, similar size


data: JSONL format: JSON Lines

 Easily Open JSONL Files - Guide to JSON Lines Format | Row Zero

JSONL is a highly efficient file format for processing large datasets where each line represents a valid JSON object. Row Zero is a spreadsheet built for big data that easily opens JSONL files to view and analyze JSONL data.

What is the Difference Between .jsonl and .json files?

        {"name": "Alice", "age": 30, "city": "New York"} {"name": "Bob", "age": 25, "city": "Los Angeles"} {"name": "Charlie", "age": 35, "city": "Chicago"}

JSON Lines.org
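
A minimal Python sketch of writing and then reading a JSONL file with just the standard library (the file name "people.jsonl" and the records are made up for illustration):

import json

records = [
    {"name": "Alice", "age": 30, "city": "New York"},
    {"name": "Bob", "age": 25, "city": "Los Angeles"},
]

# write: one JSON object per line, no enclosing array, no commas between records
with open("people.jsonl", "w", encoding="utf-8") as f:
    for record in records:
        f.write(json.dumps(record) + "\n")

# read: process line by line, so huge files never have to fit in memory
with open("people.jsonl", encoding="utf-8") as f:
    for line in f:
        obj = json.loads(line)
        print(obj["name"], obj["city"])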



Thursday, April 10, 2025

AI HW: NVIDIA DGX Spark vs AWS


NVIDIA DGX Spark US - A Grace Blackwell AI supercomputer on your desk

it delivers 1,000 "TOPS" (tera operations per second) of AI performance, a very large number

for $3999


This is the ASUS Ascent GX10 a NVIDIA GB10 Mini PC with 128GB of Memory and 200GbE

apparently there are other similar systems with the same GPU, like this one from ASUS,
for $1,000 less.
On the NVIDIA GB10 motherboard, we can see the NVIDIA GB10 chip with 10 Cortex-X925 and 10 Cortex-A725 Arm cores for 20 cores total.

here are comparable cloud-based instances with pricing

The math gets "interesting"... when reserved for 3 years, that instance costs $0.79 per hour.
But the grand total for 3 years is still about $20,000!
So the AWS price is roughly 5x the NVIDIA HW price, or almost 7x the ASUS HW price!
The cost of electricity is likely nowhere near the price of the HW.
So for those with a significant AI load, owning the HW can be very profitable.
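
A quick Python sketch of that back-of-the-envelope math (the $0.79/hour figure comes from the pricing table below, the hardware prices from above; the variable names are just for illustration):

hourly_rate = 0.79                      # 3-year reserved inf2.8xlarge, $/hour (see table below)
hours = 24 * 365 * 3                    # 26,280 hours in 3 years
aws_total = hourly_rate * hours         # ~$20,760, i.e. "about $20,000"

dgx_spark = 3999                        # NVIDIA DGX Spark
asus_gx10 = dgx_spark - 1000            # ASUS Ascent GX10, roughly $1,000 less

print(f"AWS 3-year total: ${aws_total:,.0f}")
print(f"vs DGX Spark: {aws_total / dgx_spark:.1f}x")   # ~5.2x
print(f"vs ASUS GX10: {aws_total / asus_gx10:.1f}x")   # ~6.9x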

AWS EC2 inf2.8xlarge (Inferentia2) specs and pricing:

  • Inferentia2 chips: 1
  • Accelerator memory: 32 GB
  • vCPU: 32
  • Memory: 128 GiB
  • Local storage: EBS only
  • Inter-chip interconnect: N/A
  • Network bandwidth: up to 25 Gbps
  • EBS bandwidth: 10 Gbps
  • On-Demand price: $1.97/hour
  • 1-year Reserved Instance: $1.81/hour
  • 3-year Reserved Instance: $0.79/hour

AI with TypeScript: Deno, Llama, Jupyter

The Dino 🦕, the Llama 🦙, and the Whale 🐋

  • An environment for our language model – while you can connect up to various LLM hosting environments via APIs, we are going to leverage the Ollama framework for running language models on your local machine.
  • A large language model – we will use a resized version of DeepSeek R1 that can run locally.
  • A notebook – Jupyter Notebook for interactive code and text.
  • Deno – a runtime that includes a built-in Jupyter kernel. We assume a recent version is installed.
  • An IDE – we’ll use VSCode with built-in Jupyter Notebook support and the Deno extension (extension link).
  • An AI library/framework – LangChain.js to simplify interactions with the LLM.
  • A schema validator – we’ll structure LLM output. We will use zod for this.

Build a custom RAG AI agent in TypeScript and Jupyter

  • Retrieve and prepare several blog posts to be used by our AI agent.
  • Create an AI agent which has several tools:
    • A tool to query the blog posts in the database.
    • A tool to grade if the documents are relevant to the query.
    • The ability to rewrite and improve the query if required.
  • Finally we generate a response to the query based on our collection of information.
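
The article itself implements this in TypeScript with LangChain.js; the Python sketch below only illustrates the control flow, and every helper in it (retrieve_documents, grade_relevance, rewrite_query, generate_answer) is a hypothetical stub standing in for the real tools:

from typing import List

def retrieve_documents(query: str) -> List[str]:
    return []            # hypothetical: would query the blog posts in the vector database

def grade_relevance(query: str, doc: str) -> bool:
    return True          # hypothetical: would ask the LLM to grade relevance

def rewrite_query(query: str) -> str:
    return query         # hypothetical: would ask the LLM to improve the query

def generate_answer(query: str, docs: List[str]) -> str:
    return "..."         # hypothetical: would ask the LLM to answer from the retrieved docs

def answer(query: str, max_rewrites: int = 2) -> str:
    # retrieve -> grade -> (rewrite and retry) -> generate
    for _ in range(max_rewrites + 1):
        docs = [d for d in retrieve_documents(query) if grade_relevance(query, d)]
        if docs:
            return generate_answer(query, docs)
        query = rewrite_query(query)     # nothing relevant found: improve the query and retry
    return generate_answer(query, [])    # fall back to answering without retrieved context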

Sunday, April 06, 2025

AI model: Llama 4, from Meta

 Llama 4 Unleashed! Testing Meta's Most Advanced Multimodal AI - YouTube by "Dr. Know-it-all"

The Industry Reacts to Llama 4 - "Nearly INFINITE" - YouTube

Zuck's new Llama is a beast - YouTube by "Fireship"

The Llama 4 Herd - Open Source Won? - YouTube

Llama 4 Dropped: 10 MILLION TOKEN CONTEXT?! - YouTube


Meta releases Llama 4, a new crop of flagship AI models | TechCrunch

"There are four new models in total: Llama 4 Scout, Llama 4 Maverick, and Llama 4 Behemoth. All were trained on “large amounts of unlabeled text, image, and video data” to give them “broad visual understanding,” Meta says.
...
Scout and Maverick are openly available on Llama.com and from Meta’s partners, including the AI dev platform Hugging Face, while Behemoth is still in training. Meta says that Meta AI, its AI-powered assistant across apps including WhatsApp, Messenger, and Instagram, has been updated to use Llama 4 in 40 countries. Multimodal features are limited to the U.S. in English for now."

"Top performance at lowest cost"  ELO, like chess ratings :)


API for ordering coffee: terminal.shop: developers only

developers-only shop for coffee!

need to use API directly, no UI!

Stripe handles payments

wip: terminal (initial commit) //www.terminal.shop


https://api.dev.terminal.shop/product

https://api.dev.terminal.shop/product/prd_01JNH7GKWYRHX45GPRZS3M7A4X


(client SDKs: Python, JS, Go, Java, Kotlin)

import os
from terminal_shop import Terminal

# authenticate with a bearer token taken from an environment variable
client = Terminal(bearer_token=os.environ.get("TERMINAL_BEARER_TOKEN"))

# list the available coffee products and print them
product = client.product.list()
print(product.data)

by ThePrimeagen, as discussed in #461 – ThePrimeagen: Programming, AI, ADHD, Productivity, Addiction, and God | Lex Fridman Podcast


Saturday, April 05, 2025

AI tool: Claude Code (not free)

apparently available to "all" who are not on a free account.

"in a research preview and requires API credits, with a $5 free credit available on their site."?


Claude Code is an agentic tool that lets developers delegate sizable coding tasks to Claude directly from their terminal. Join the research preview to try Claude Code.

Claude Code overview - Anthropic

Currently, Claude Code does not run directly in Windows, and instead requires WSL.

anthropics/claude-code: Claude Code is an agentic coding tool that lives in your terminal, understands your codebase, and helps you code faster by executing routine tasks, explaining complex code, and handling git workflows - all through natural language commands. @GitHub

npm install -g @anthropic-ai/claude-code



.NET Aspire 9

some useful tools and links

.NET Rocks! .NET Aspire 9.1 with Rob Richardson (podcast)

PowerToys

Thursday, April 03, 2025

Wednesday, April 02, 2025

AI IQ? Gemini 2.5 Pro

Gemini 2.5 Pro is a coding GENIUS - YouTube

IQ 130?

In my tests it was NOT so bright across the board; more like a specialized "genius" that frequently gets "stuck" and cannot move on... promising and frustrating...




chess code: Sunfish_rs: Rust vs Python

"ELO" is a measure of "chess mastery" 

On that score, this very compact Python chess program is already better than most human players.

Porting it to Rust made it even stronger. And longer, in lines of code :)

thomasahle/sunfish: Sunfish: a Python Chess Engine in 111 lines of code @GitHub

Sunfish is a simple, but strong chess engine, written in Python. With its simple UCI interface, and removing comments and whitespace, it takes up just 131 lines of code! (build/clean.sh sunfish.py | wc -l). Yet it plays at ratings above 2000 at Lichess.

Finally you can play sunfish now on Lichess or play against Recursing's Rust port, also on Lichess, which is about 100 ELO stronger.

Recursing/sunfish_rs: Rust rewrite of the sunfish simple chess engine @GitHub




Tuesday, April 01, 2025

OpenAI <= $40B <= SoftBank

not an April Fools' joke?

SoftBank is known for risky investments... 

OpenAI Hits $300B Valuation After $40B SoftBank Investment

"The funding is structured in stages with major conditions attached. SoftBank and its syndicate are injecting an initial $10 billion now, and another $30 billion is slated for late 2025 — but only if OpenAI completes a governance overhaul. Specifically, OpenAI must transition from its current non-profit-controlled, capped-profit model to a conventional for-profit structure by the end of 2025. If it fails to meet this deadline, SoftBank can scale back the remaining investment by up to 25%."


Sam Altman Says OpenAI Will Release an ‘Open Weight’ AI Model This Summer | WIRED


OpenAI adopts rival Anthropic's standard for connecting AI models to data | TechCrunch

AI Code tools: Augment Code vs Cursor AI

AI Coding Assistant Showdown: Augment Code vs Cursor AI (Which is Better?) - YouTube


Augment Code – Developer AI for real work

"The first AI coding assistant built for professional software engineers and large codebases."



Cursor - The AI Code Editor

"Built to make you extraordinarily productive,Cursor is the best way to code with AI."





Monday, March 31, 2025

X.AI += X - $45B

strange world of X's

📊 Market Pulse: xAI Acquires X in $45 Billion All-Stock Deal

Elon Musk’s artificial intelligence venture xAI is acquiring X (formerly Twitter) in an all-stock merger valued at $45 billion. The transaction prices xAI at approximately $80 billion and X at $33 billion (including about $12 billion of debt). By folding the social media platform into his AI company, Musk aims to unite two of his major ventures under one roof. The combined entity (worth roughly $113 billion) will span social media and AI, positioning Musk’s tech empire for ambitious growth.

Sunday, March 30, 2025

data: RecordIO-Protobuf vs Parquet

when processing large amounts of data, for example for ML/AI,
it is important to store data in an efficient and compact format.

JSON is not efficient, and CSV is also quite inefficient. 

  • Parquet is a columnar format, 
  • RecordIO-protobuf is used for binary record-level serialization.


Parquet is great for analytics data due to its small file size, and it allows you to scan only the columns of interest.
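
A small Python sketch of the Parquet side, using pyarrow (the file name and column names are invented for illustration); the key point is reading back only the columns you need:

import pyarrow as pa
import pyarrow.parquet as pq

# write a tiny table to a columnar Parquet file
table = pa.table({
    "user_id": [1, 2, 3],
    "city": ["New York", "Los Angeles", "Chicago"],
    "clicks": [10, 25, 7],
})
pq.write_table(table, "events.parquet")

# read back only the columns of interest; the "city" column is never touched
subset = pq.read_table("events.parquet", columns=["user_id", "clicks"])
print(subset.to_pydict())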

RecordIO format is typically used for training machine learning models so that the data that the model needs is presented only when needed.


Leveraging RecordIO for Efficient Training and Cost Reduction in Amazon SageMaker | LinkedIn

RecordIO: It’s a streaming data format that organizes a file as a series of length-prefixed binary records. The “length-prefix” means that every chunk of data is preceded by 4 or 8 bytes that tell you how long the next record is.

Protobuf is a serialization format (like JSON or CSV but binary).
Protobuf encodes structured data (like Python objects) into an efficient, compact binary format.

RecordIO-wrapped Protobuf means you have a file where each Protobuf-encoded message is wrapped with a size prefix (via RecordIO). Each record starts with a 4-byte integer indicating its size, followed by the actual Protobuf message.
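
A minimal Python sketch of that length-prefix idea; this shows the general pattern, not the exact byte layout of Amazon's or MXNet's RecordIO, and in a real pipeline each payload would be a Protobuf message.SerializeToString() instead of plain bytes:

import struct

def write_records(path, payloads):
    # each record: a 4-byte length prefix, then the payload bytes
    with open(path, "wb") as f:
        for payload in payloads:
            f.write(struct.pack(">I", len(payload)))
            f.write(payload)

def read_records(path):
    # stream records back: read 4 bytes for the size, then exactly that many bytes
    with open(path, "rb") as f:
        while True:
            header = f.read(4)
            if not header:
                break
            (size,) = struct.unpack(">I", header)
            yield f.read(size)

write_records("train.rec", [b"first serialized message", b"second serialized message"])
for record in read_records("train.rec"):
    print(record)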


Apache Mesos - RecordIO Data Format

xeno14/recordio: multiple Protocol Buffers in a single binary file: forked from google/or-tools @GitHub

eclesh/recordio: recordio implements a file format for a sequence of records in Go, @GitHub

Using Protobuf with TypeScript for data serialization - LogRocket Blog