Tuesday, January 21, 2025

AI from China: DeepSeek; Open Source

DeepSeek claims its 'reasoning' model beats OpenAI's o1 on certain benchmarks | TechCrunch

Chinese AI lab DeepSeek has released an open version of DeepSeek-R1, its so-called reasoning model, that it claims performs as well as OpenAI’s o1 on certain AI benchmarks.

R1 is available from the AI dev platform Hugging Face under an MIT license, meaning it can be used commercially without restrictions. According to DeepSeek, R1 beats o1 on the benchmarks AIME, MATH-500, and SWE-bench Verified.





At the World Artificial Intelligence Conference in Shanghai, Baidu’s CEO, Robin Li Yanhong, asked a surprising question: Does China have too many AI startups? As he put it: “In 2023, intense competition among over 100 LLMs has emerged in China, resulting in a significant waste of resources, particularly computing power. … How about real-world applications? Who has benefited from them?”


run TypeScript: tsx vs ts-node

TypeScript is complex, and the JavaScript ecosystem is complex.
Most things can be done in many different ways, and things break and stop working all the time.
When ts-node has issues, tsx often works. That is helpful.
But is that enough of a reason to "switch"?

Frequently Asked Questions | tsx


GitHub - privatenumber/ts-runtime-comparison: Comparison of Node.js TypeScript runtimes

TSX vs. TS-Node and Nodemon. Which NodeJS runner is fastest for… | by Lincoln W Daniel | ModernNerd Code | Medium

"I wanted tsx to be faster since it's so much simpler, but it unfortunately is not... "




  • tsx is zero-config because it has smart detections built in. As a runtime, it detects what's imported, making many options in tsconfig.json redundant; tsconfig.json was designed for compiling matching files regardless of whether they're imported.
  • It seamlessly adapts between CommonJS and ESM package types by detecting how modules are loaded (require() or import) to determine how to compile them. It even adds support for require()ing ESM modules from CommonJS so you don't have to worry about your dependencies as the ecosystem migrates to ESM.
  • At the core, tsx is powered by esbuild for blazing fast TypeScript compilation, whereas ts-node (by default) uses the TypeScript compiler.

  • ts-node incorporates type checking, tsx does not
  • tsx handles package types automatically, ts-node does not


node --import tsx adds support for both Module and CommonJS contexts. To only import one, you can use node --import tsx/esm or node --require tsx/cjs.

node -r ts-node/register only supports a CommonJS context, node --loader ts-node/esm must be used for projects that are type Module.


"On average tsx was faster (about twice as fast on medium-sized projects) than ts-node." tsx also includes a watch option which automatically reruns when the codebase is changed, which can be useful in certain circumstances.


Overall, trading type checking for a faster and more flexible runtime feels like the better choice ... for running tests and small dev scripts.



Monday, January 20, 2025

AI Agents: Google whitepaper

 Agents | Kaggle

Authors: Julia Wiesinger, Patrick Marlow and Vladimir Vuskovic

"Humans are fantastic at messy pattern recognition tasks. However, they often rely on tools - like books, Google Search, or a calculator - to supplement their prior knowledge before arriving at a conclusion. Just like humans, Generative AI models can be trained to use tools to access real-time information or suggest a real-world action. For example, a model can leverage a database retrieval tool to access specific information, like a customer's purchase history, so it can generate tailored shopping recommendations. Alternatively, based on a user's query, a model can make various API calls to send an email response to a colleague or complete a financial transaction on your behalf. To do so, the model must not only have access to a set of external tools, it needs the ability to plan and execute any task in a self-directed fashion. This combination of reasoning, logic, and access to external information that are all connected to a Generative AI model invokes the concept of an agent, or a program that extends beyond the standalone capabilities of a Generative AI model. This whitepaper dives into all these and associated aspects in more detail."


Feed | LinkedIn

Google recently published a whitepaper on AI Agents that everyone should read.
It covers everything you need to know about this new wave.


Here's what's included:
- Introduction to AI Agents
- The role of tools in Agents
- Enhancing model performance with targeted learning
- Quick start to Agents with LangChain
- Production applications with Vertex AI Agents


Intro to AI agents - YouTube by Google Cloud Tech





AI robotics: Figure AI, NVIDIA, OpenAI, Microsoft

Nvidia Focuses on Robots Amid Stiffer AI Chip Competition

Facing rising competition in the AI chip space, Nvidia is reportedly turning to robotics.

It joined Microsoft and OpenAI in a February funding round that valued humanoid robotics company Figure AI at $2.6 billion.



Sunday, January 19, 2025

AI on Windows 98 Pentium II 128MB RAM

 AI language model runs on a Windows 98 system with Pentium II and 128MB of RAM — Open-source AI flagbearers demonstrate Llama 2 LLM in extreme conditions | Tom's Hardware

A Pentium II with 128MB of RAM could generate an impressive 35.9 tok/sec.

Andrej Karpathy's llama2.c, which can be summarized as "700 lines of pure C that can run inference on models with Llama 2 architecture." 

exo-explore/llama98.c: Inference Llama models in one file of pure C for Windows 98 running on 25-year-old hardware @GitHub

karpathy/llama2.c: Inference Llama 2 in one file of pure C @GitHub
by Andrej Karpathy




AI risks: warning from Godfather of AI

 Why The "Godfather of AI" Now Fears His Own Creation | Geoffrey Hinton - YouTube

Professor Geoffrey Hinton, a prominent figure in AI and 2024 Nobel Prize recipient,
discusses the urgent risks posed by rapid AI advancements
in today's episode of Theories of Everything with Curt Jaimungal.


Saturday, January 18, 2025

Reversible Computing for AI efficiency

 Reversible Computing Escapes the Lab - IEEE Spectrum

"Intuitively, information may seem like an ephemeral, abstract concept. But in 1961, Rolf Landauer at IBM discovered a surprising fact: Erasing a bit of information in a computer necessarily costs energy, which is lost as heat. It occurred to Landauer that if you were to do computation without erasing any information, or “reversibly,” you could, at least theoretically, compute without using any energy at all."

A traditional exclusive-OR (XOR) gate is not reversible—you cannot recover the inputs just by knowing the output. Adding an extra output, just a copy of one of the inputs, makes it reversible. Then, the two outputs can be used to “decompute” the XOR gate and recover the inputs, and with it, the energy used in computation.
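The reversible-XOR idea can be sketched in a few lines: keeping a copy of one input alongside the XOR output makes the gate invertible, so applying XOR again "decomputes" it.

```python
def xor_gate(a: int, b: int) -> int:
    # Irreversible: from output 1 alone you cannot tell (0,1) from (1,0).
    return a ^ b

def reversible_xor(a: int, b: int) -> tuple[int, int]:
    # Reversible variant: pass one input through unchanged -> (a, a XOR b).
    # This is the classic CNOT (controlled-NOT) gate.
    return a, a ^ b

def decompute(a: int, a_xor_b: int) -> tuple[int, int]:
    # XOR is its own inverse: a XOR (a XOR b) = b, so the inputs come back.
    return a, a ^ a_xor_b

for a in (0, 1):
    for b in (0, 1):
        assert decompute(*reversible_xor(a, b)) == (a, b)  # nothing erased
```

Because no information is destroyed at any step, Landauer's bound does not force any minimum energy dissipation for this gate.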


New Computer Breakthrough is Defying the Laws of Physics - YouTube


Rolf Landauer - Wikipedia

ideas: macro-engineering (new sea in desert)

interesting, strange idea

Qattara Depression Project - Wikipedia

"The Qattara Depression Project or Qattara Project is a macro-engineering project concept in Egypt. Rivalling the Aswan High Dam in scope, the intention is to develop the hydroelectric potential of the Qattara Depression by creating an artificial lake.[1]

The Qattara depression is a region that lies 60 m (200 ft) below sea level on average and is currently a vast, uninhabited desert. Water could be let into the area by connecting it to the Mediterranean Sea with tunnels and/or canals. The inflowing water would then evaporate quickly because of the desert climate. A controlled balance of inflow and evaporation would produce a continuous flow to generate hydroelectricity."

Friday, January 17, 2025

Microsoft 365 Copilot = Office

Microsoft's dumbest rebrand in its near 50 year history just got even dumber | Windows Central

Synonymous with Windows for the past several decades is its suite of productivity tools, known to the vast majority of the globe as "Microsoft Office." At least, it used to be.

A couple of years ago, Microsoft inexplicably rebranded it to "Microsoft 365," throwing away decades of brand recognition and removing the actual function of the product from its name.

"Microsoft 365" is still listed as "Microsoft 365 (Office)" in app stores such as Google Play on Android

Microsoft has decided to take it to the next level with "Microsoft 365 Copilot"



AI: Agentic RAG workflow + Multimodal RAG

How I finally got agentic RAG to work right (@vectorize.io)

by Chris Latimer | LinkedIn


Chain of Thought with agentic RAG systems

This problem required the LLM to perform reasoning such as:
  • Deciding what steps to take next to get closer to solving the user’s problem.
  • Deciding when it had found a solution to the problem.
  • Deciding when to give up and bring a human into the loop.
The system used OpenAI GPT-4o models, with structured responses (strict=true, JSON).
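That decision loop can be sketched with a stub standing in for the model call. Note the `fake_model` function, its state fields, and its JSON schema are illustrative assumptions, not the author's code; a real system would ask GPT-4o for JSON matching a strict schema.

```python
import json

# Hypothetical stand-in for a structured-output LLM call.
def fake_model(state: dict) -> str:
    if state["evidence"] >= 2:
        return json.dumps({"action": "answer", "reason": "enough evidence"})
    if state["steps"] >= 5:
        return json.dumps({"action": "escalate", "reason": "giving up"})
    return json.dumps({"action": "retrieve", "reason": "need more context"})

def agent_loop() -> str:
    state = {"evidence": 0, "steps": 0}
    while True:
        # Strict JSON output keeps this parse step reliable.
        decision = json.loads(fake_model(state))
        state["steps"] += 1
        if decision["action"] == "retrieve":
            state["evidence"] += 1      # pretend a retrieval succeeded
        elif decision["action"] == "answer":
            return "answered"
        else:
            return "human_in_the_loop"  # give up, escalate to a human

print(agent_loop())  # the stub retrieves twice, then answers
```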





Multimodal RAG Patterns Every AI Developer Should Know - Vectorize




Thursday, January 16, 2025

SpaceX Starship Flight 7 & Blue Origin New Glenn first flight

two huge rockets, fly on the same day!


Blastoff! SpaceX launches Starship on Flight 7 — catches booster, loses ship - YouTube

SpaceX catches Super Heavy booster on Starship Flight 7 test but loses upper stage (video, photos) | Space


'OXYGEN LEAK!'' Elon Musk Revealed WHY Starship Flight 7 Exploded... - YouTube



Jeff Bezos’ Blue Origin launches massive New Glenn rocket on first test flight - YouTube

Jeff Bezos’ Blue Origin launches massive New Glenn rocket | AP News
Jeff Bezos’ New Glenn rocket reaches orbit on first test flight




AWS GenAI Apps

Best practices to build generative AI applications on AWS | AWS Machine Learning Blog

Amazon Bedrock is a fully managed service that offers a choice of high-performing foundation models (FMs) from leading AI companies like AI21 Labs, Anthropic, Cohere, Meta, Mistral AI, Stability AI, and Amazon via a single API.

Amazon SageMaker is a fully managed service that makes it straightforward to build, train, and deploy ML models.

Amazon SageMaker JumpStart offers an ML hub where you can explore, train, and deploy a wide selection of public FMs.

Common generative AI approaches

Prompt engineering:
Zero-shot prompting, Few-shot prompting, Chain-of-thought prompting
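The three prompting styles differ only in how the prompt string is assembled; a toy illustration (the arithmetic task and examples are assumptions for demonstration):

```python
question = "A bat and a ball cost $1.10; the bat costs $1 more than the ball. What does the ball cost?"

# Zero-shot: just the task, no examples.
zero_shot = f"Q: {question}\nA:"

# Few-shot: prepend worked examples so the model imitates the pattern.
few_shot = (
    "Q: 2 apples + 3 apples?\nA: 5 apples\n\n"
    "Q: 10 - 4?\nA: 6\n\n"
    f"Q: {question}\nA:"
)

# Chain-of-thought: ask for intermediate reasoning before the final answer.
chain_of_thought = f"Q: {question}\nA: Let's think step by step."
```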

Retrieval Augmented Generation (RAG) 
allows you to customize a model’s responses when you want the model to consider new knowledge or up-to-date information.
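A minimal RAG sketch: retrieve the most relevant snippet, then put it in the prompt. The toy corpus and keyword-overlap scoring are illustrative assumptions; a real pipeline would use embeddings and a vector store.

```python
docs = [
    "Amazon Bedrock offers foundation models via a single API.",
    "ElasticMQ is an SQS-compatible in-memory message queue.",
]

def retrieve(query: str) -> str:
    # Naive word-overlap scoring stands in for vector similarity.
    def score(doc: str) -> int:
        return len(set(query.lower().split()) & set(doc.lower().split()))
    return max(docs, key=score)

def build_prompt(query: str) -> str:
    context = retrieve(query)  # ground the model in retrieved knowledge
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

prompt = build_prompt("Amazon Bedrock foundation models")
```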

Agents
Frameworks like LangChain and certain FMs such as Claude models provide function-calling capabilities to interact with APIs and tools. However, Amazon Bedrock Agents, a new and fully managed AI capability from AWS, aims to make it more straightforward for developers to build applications using next-generation FMs.

Model customization
Fine-tuning
Continued pre-training
Retraining or training from scratch




Wednesday, January 15, 2025

AWS what-is AI: Cloud Computing Concepts Hub

Cloud Computing Concepts Hub | AWS

The Cloud Computing Concepts Hub is the centralized place where you can browse or search for informative articles about cloud computing. You'll find easy-to-understand info about broad topics such as "What is Machine Learning?" and "What is Data Science?" These articles are intended to help you up-level your understanding of frequently asked cloud computing topics.





Tuesday, January 14, 2025

EV: new Tesla Model Y

The front looks almost like an enlarged Cybercab, or a smaller Cybertruck.

 Tesla confirms starting production of new Model Y at Gigafactory Berlin | Electrek



Tesla Model Y 'Juniper' Is Going Into Production Today In Europe: Report @InsideEV

Princeton Course: Java & Computer Science

 Introduction to Programming in Java · Computer Science

textbooks for a first course in computer science
for the next generation
of scientists and engineers

Introduction to Programming in Java.

textbook Introduction to Programming in Java [ Amazon · Pearson · InformIT ] is an interdisciplinary approach to the traditional CS1 curriculum with Java. We teach the classic elements of programming, using an “objects-in-the-middle” approach that emphasizes data abstraction. We motivate each concept by examining its impact on specific applications, taken from fields ranging from materials science to genomics to astrophysics to internet commerce. The book is organized around four stages of learning to program.

Computer Science.

textbook Computer Science [ Amazon · Pearson · InformIT ] contains Introduction to Programming in Java as its first four chapters. The second half of the book explores core ideas of Turing, von Neumann, Shannon, and others that ignited the digital age.

  • Chapter 5: Theory of Computing surveys the fundamental concepts of universality, computability, and intractability, which raise questions about the role of computation in understanding the natural world.
  • Chapter 6: A Computing Machine describes a simple imaginary machine that has many of the characteristics of real processors at the heart of the computational devices that surround us.
  • Chapter 7: Building a Computer considers the design of a processor, including Boolean logic, combinational circuits, and sequential circuits.

Reading a book and surfing the web are two different activities: This booksite is intended for your use while online (for example, while programming and while browsing the web); the textbook is for your use when initially learning new material and when reinforcing your understanding of that material (for example, when reviewing for an exam).

Computer Science: An Interdisciplinary Approach: 9780134076423: Computer Science Books @ Amazon.com

Introduction to Programming in Java: An Interdisciplinary Approach: Sedgewick, Robert, Wayne, Kevin: 9780672337840: Amazon.com: Books



Sunday, January 12, 2025

elasticmq: SQS compatible message queue


softwaremill/elasticmq: In-memory message queue with an Amazon SQS-compatible interface. Runs stand-alone or embedded. @GitHub
  • in-memory message queue system
  • runs stand-alone, via Docker or embedded
  • Amazon SQS-compatible interface
  • fully asynchronous implementation, no blocking calls
  • optional UI, queue persistence
  • created and maintained by SoftwareMill
ElasticMQ is a message queue system, offering an actor-based Scala interface and an SQS-compatible REST (query) interface.

ElasticMQ follows the semantics of SQS. Messages are received by polling the queue. When a message is received, it is blocked for a specified amount of time (the visibility timeout). If the message isn't deleted during that time, it will be again available for delivery. Moreover, queues and messages can be configured to always deliver messages with a delay.
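The visibility-timeout semantics described above can be sketched without a broker. This is a toy in-memory model of the SQS behavior, not ElasticMQ's actual implementation:

```python
import itertools

class ToyQueue:
    """Minimal SQS-style semantics: receive hides a message for
    `visibility_timeout` ticks; if it isn't deleted in time, it reappears."""

    def __init__(self, visibility_timeout: int):
        self.visibility_timeout = visibility_timeout
        self.messages = {}          # id -> (body, tick when visible again)
        self.clock = 0
        self._ids = itertools.count()

    def send(self, body: str) -> int:
        mid = next(self._ids)
        self.messages[mid] = (body, 0)
        return mid

    def receive(self):
        for mid, (body, visible_at) in self.messages.items():
            if visible_at <= self.clock:
                # Hide the message until the visibility timeout expires.
                self.messages[mid] = (body, self.clock + self.visibility_timeout)
                return mid, body
        return None

    def delete(self, mid: int):
        self.messages.pop(mid, None)

q = ToyQueue(visibility_timeout=3)
q.send("hello")
first = q.receive()             # message is now invisible
assert q.receive() is None      # nothing to deliver while hidden
q.clock = 3                     # timeout expires without a delete...
assert q.receive() is not None  # ...so the message is delivered again
```

Calling `delete` inside the visibility window is what prevents redelivery, which is exactly the contract consumers rely on with SQS and ElasticMQ.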







Friday, January 10, 2025

AWS AI translation

 Build a multilingual automatic translation pipeline with Amazon Translate Active Custom Translation | AWS Machine Learning Blog

We show how to use the AWS Management Console and the Amazon Translate public API to deliver automatic machine batch translation, and analyze the translations between two language pairs: English and Chinese, and English and Spanish. We also recommend best practices when using Amazon Translate in this automatic translation pipeline to ensure translation quality and efficiency.





Wednesday, January 08, 2025

NVIDIA CES 2025 news

CES 2025: AI Advancing at ‘Incredible Pace,’ NVIDIA CEO Says | NVIDIA Blog

Key Announcements



NVIDIA Cosmos: A World Foundation Model Platform for Physical AI - YouTube

Cosmos WFMs are now available under NVIDIA’s open model license on Hugging Face and the NVIDIA NGC catalog. Cosmos models will soon be available as fully optimized NVIDIA NIM microservices.

Developers can access NVIDIA NeMo Curator for accelerated video processing and customize their own world models with NVIDIA NeMo. NVIDIA DGX Cloud offers a fast and easy way to deploy these models, with enterprise support available through the NVIDIA AI Enterprise software platform.

NVIDIA also announced new NVIDIA Llama Nemotron large language models and NVIDIA Cosmos Nemotron vision language models that developers can use for enterprise AI use cases in healthcare, financial services, manufacturing and more.






C4: visualizing software architecture

 Home | C4 model

The C4 model is:

  1. A set of hierarchical abstractions (software systems, containers, components, and code).
  2. A set of hierarchical diagrams (system context, containers, components, and code).
  3. Notation independent.
  4. Tooling independent.

An overview of the C4 model for visualising software architecture


Visualising software architecture with the C4 model - Simon Brown, Agile on the Beach 2019 - YouTube

vs UML

Has UML Died Without Anyone Noticing? | Ernesto Garbarino


cs.dartmouth.edu/~cs50/Reading/97_Things_Every_Programmer_Should_Know.pdf





Tuesday, January 07, 2025

NVIDIA GB10: Smallest AI Supercomputer; NeMo AI platform

 NVIDIA Puts Grace Blackwell on Every Desk and at Every AI Developer’s Fingertips | NVIDIA Newsroom

NVIDIA Project DIGITS With New GB10 Superchip Debuts as World’s Smallest AI Supercomputer Capable of Running 200B-Parameter Models

NVIDIA unveiled NVIDIA® Project DIGITS, a personal AI supercomputer that provides AI researchers, data scientists and students worldwide with access to the power of the NVIDIA Grace Blackwell platform.

Project DIGITS will be available in May from NVIDIA and top partners, starting at $3,000.

Developers can fine-tune models with the NVIDIA NeMo™ framework, accelerate data science with NVIDIA RAPIDS™ libraries and run common frameworks such as PyTorch, Python and Jupyter notebooks....



NVIDIA NeMo™ is an end-to-end platform for developing custom generative AI—including large language models (LLMs), vision language models (VLMs), video models, and speech AI—anywhere.

Deliver enterprise-ready models with precise data curation, cutting-edge customization, retrieval-augmented generation (RAG), and accelerated performance with NeMo, part of NVIDIA AI Foundry—a platform and service for building custom generative AI models with enterprise data and domain-specific knowledge.


...groundbreaking RTX 50 series GPUs powered by the Blackwell architecture.
...revolutionary advancements in AI, accelerated computing,
and industrial digitalization transforming every industry.







DIGITS: Deep learning GPU Intelligence Training System


GB10 “AI superchip": 1 petaflop (1,000 TOPS) of performance for $3,000;
about the size of a Mac Mini (or a mini PC)







NVIDIA Cosmos™ is a platform of state-of-the-art generative world foundation models (WFM), advanced tokenizers, guardrails, and an accelerated data processing and curation pipeline built to accelerate the development of physical AI systems such as autonomous vehicles (AVs) and robots.

Generative AI @ AWS

 What is Generative AI? - Gen AI Explained - AWS

Generative AI - Digital and Classroom Training | AWS

Diffusion models

...create new data by iteratively making controlled random changes to an initial data sample.
They start with the original data and add subtle changes (noise),
progressively making it less similar to the original.
This noise is carefully controlled to ensure the generated data remains coherent and realistic.

After adding noise over several iterations, the diffusion model reverses the process.
Reverse denoising gradually removes the noise
to produce a new data sample that resembles the original.
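The forward (noising) half of that process can be sketched in one dimension: each step mixes the sample with Gaussian noise so it drifts away from the original. The fixed noise schedule here is an illustrative assumption.

```python
import math
import random

random.seed(0)

def forward_diffuse(x0: float, steps: int, beta: float = 0.1) -> list[float]:
    # Each step: x_t = sqrt(1 - beta) * x_{t-1} + sqrt(beta) * noise.
    # Repeated mixing pushes x toward pure noise; the trained reverse
    # (denoising) model then learns to undo these steps one at a time.
    xs = [x0]
    for _ in range(steps):
        noise = random.gauss(0.0, 1.0)
        xs.append(math.sqrt(1 - beta) * xs[-1] + math.sqrt(beta) * noise)
    return xs

trajectory = forward_diffuse(5.0, steps=50)
# The original signal's contribution decays like (1 - beta)^(t/2),
# so by t=50 the sample is dominated by noise.
```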
Generative adversarial networks (GANs)

...work by training two neural networks in a competitive manner.
The first network, known as the generator, generates fake data samples by adding random noise.
The second network, called the discriminator, tries to distinguish between real data and the fake data produced by the generator.

During training, the generator continually improves its ability to create realistic data while the discriminator becomes better at telling real from fake.
This adversarial process continues until the generator produces data that is so convincing that the discriminator can't differentiate it from real data.

GANs are widely used in generating realistic images, style transfer, and data augmentation tasks.

Variational autoencoders (VAEs)

...learn a compact representation of data called latent space.
The latent space is a mathematical representation of the data.
You can think of it as a unique code representing the data based on all its attributes. 
For example, if studying faces, the latent space contains numbers representing eye shape, nose shape, cheekbones, and ears.

VAEs use two neural networks—the encoder and the decoder. 

The encoder neural network maps the input data to a mean and variance for each dimension of the latent space. It generates a random sample from a Gaussian (normal) distribution. This sample is a point in the latent space and represents a compressed, simplified version of the input data.

The decoder neural network takes this sampled point from the latent space and reconstructs it back into data that resembles the original input. Mathematical functions are used to measure how well the reconstructed data matches the original data.
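The encode-sample-decode round trip can be sketched with toy linear maps. The one-dimensional "networks" here are illustrative stand-ins for the real encoder and decoder networks.

```python
import math
import random

random.seed(0)

def encode(x: float) -> tuple[float, float]:
    # Toy "encoder": maps the input to a mean and log-variance in latent space.
    return 0.5 * x, math.log(0.1)

def sample(mean: float, log_var: float) -> float:
    # Reparameterization trick: z = mean + sigma * epsilon, epsilon ~ N(0, 1),
    # giving a random point in latent space near the encoded mean.
    return mean + math.exp(0.5 * log_var) * random.gauss(0.0, 1.0)

def decode(z: float) -> float:
    # Toy "decoder": maps the latent point back toward data space.
    return 2.0 * z

x = 4.0
z = sample(*encode(x))
reconstruction = decode(z)
error = (x - reconstruction) ** 2  # the reconstruction-loss term of a VAE
```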


Transformer-based models

...builds upon the encoder and decoder concepts of VAEs. Transformer-based models add more layers to the encoder to improve performance on text-based tasks like comprehension, translation, and creative writing.

Transformer-based models use a self-attention mechanism.
They weigh the importance of different parts of an input sequence when processing each element in the sequence.

Another key feature is that these AI models implement contextual embeddings. The encoding of a sequence element depends not only on the element itself but also on its context within the sequence.
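Self-attention in its simplest form: each position's output is a weighted mix of all positions, with weights from query-key dot products. A tiny pure-Python sketch; real transformers also apply learned Q/K/V projections, which are omitted here.

```python
import math

def softmax(xs: list[float]) -> list[float]:
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def self_attention(seq: list[list[float]]) -> list[list[float]]:
    # Using the embeddings directly as queries, keys, and values.
    d = len(seq[0])
    out = []
    for q in seq:
        # Scaled dot-product scores against every position in the sequence.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in seq]
        weights = softmax(scores)  # how much this position attends to each other one
        out.append([sum(w * v[i] for w, v in zip(weights, seq)) for i in range(d)])
    return out

# Each output vector depends on the whole sequence: a contextual embedding.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
contextual = self_attention(tokens)
```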






Amazon Bedrock is a fully managed service that offers a choice of high-performing foundation models (FMs) from leading AI companies, along with a broad set of features to build generative artificial intelligence (generative AI) applications. Amazon Bedrock Titan foundation models are a family of FMs pre-trained by AWS on large datasets. They are powerful, general-purpose models built to support a variety of use cases. Use them as is or customize them with your own data.

Monday, January 06, 2025

Kindle App Text-To-Speech @ Android, iOS

Text-to-speech for e-books is available not only on Kindle devices, but also in the Kindle apps!

How to Use the Assistive Reader in Kindle Apps - Amazon Customer Service

To turn on Assistive Reader:

  1. Open the Kindle app on your iOS, Android, or Fire Tablet device.
  2. Open the content where you'd like to use Assistive Reader. Once you turn on Assistive Reader, it remains available until you turn it off again.
  3. Tap the top center of the screen, and then select the reading settings menu, Aa.
  4. Select More, and then turn on Assistive Reader.
  5. To display the player controls, tap the screen. You can control the reading speed, or rewind by 30 seconds using the player controls at the bottom of the screen.

Amazon Kindle - Apps on Google Play



Civet language: compact, TypeScript superset

resembling Python and F# (terse, powerful, functional languages),
compiles to JavaScript

A Programming Language for the New Millennium
Code More with Less in a TypeScript Superset

Civet is a programming language that compiles to TypeScript or JavaScript, so you can use existing tooling (including VSCode type checking, hints, completion, etc.) while enabling concise and powerful syntax. 

It starts with 99% JS/TS compatibility, making it easy to transition existing code bases. Then it adds many features and syntactic sugar, with some highlights below and more comprehensive examples in the reference.



Two years old and well maintained, Civet offers an interesting approach. Think JavaScript but with Python style indentation, chained comparisons, built-in JSX, & more. 
This example alone shows off the potential for tighter, easier-to-write code.