Saturday, May 10, 2025

AI code: "Vibe Coding" Rubik's cube with Google Gemini 2.5 Pro

Andrej Karpathy on X: "There's a new kind of coding I call "vibe coding", where you fully give in to the vibes, embrace exponentials, and forget that the code even exists. It's possible because the LLMs (e.g. Cursor Composer w Sonnet) are getting too good. Also I just talk to Composer with SuperWhisper" / X

Impressive demo of "visual solving" of a Rubik's cube.
But what prompt was used? Where is the generated code?
Google just updated Gemini 2.5 Pro and it's insane... - YouTube by Matthew Berman

previous demo: Google Gemini 2.5 Pro is Insane... - YouTube

post Matthew Berman Post | LinkedIn

I tried the same with Gemini 2.5 Pro, multiple times, and it did not come close to the demo.
Yes, it is impressive, but far from perfect, unless the prompt includes a lot of specifics.
Chat | Google AI Studio (you get 1M tokens to try, very impressive model!)

As an experiment, I asked Gemini to "watch the video and generate one-shot prompt"
to reproduce the same result.
This took a while and used over 400K tokens(!!).
See the Gemini-generated response and prompt below.

Then I used this prompt, and it did produce a semi-functional, nice-looking web app.
The problem was that the JavaScript CDN libs didn't have correct URLs, and some of the functions it called didn't exist.
Gemini has access to the internet, but that was not sufficient to resolve it.
So with my direct help, and another 6 iterations, we did get the visual part working!
For the solving part, Gemini suggested "the typical way for demo like this, just reverse scramble." To cheat?!

Meaning, "scramble" remembers its random moves, and "solve" just replays them in reverse order, undoing each one (unscramble)! So that was offered as a "solution".
The attempt to use a cube-solving algorithm didn't go far.
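
To make the "reverse scramble" trick concrete, here is a minimal sketch of what it amounts to. This is not the code Gemini generated; the function names `invertMove` and `invertSequence` are made up for illustration, and it assumes moves are stored as strings in standard notation (R, R', R2, and so on).

```javascript
// Hypothetical helpers, not taken from the generated app.

// Invert a single move written in standard notation.
function invertMove(move) {
  if (move.endsWith("2")) return move;              // half turns undo themselves
  if (move.endsWith("'")) return move.slice(0, -1); // R' is undone by R
  return move + "'";                                // R is undone by R'
}

// Undo a whole scramble: walk it backwards, inverting each move.
function invertSequence(moves) {
  return moves.slice().reverse().map(invertMove);
}

console.log(invertSequence(["R", "U'", "F2"]).join(" ")); // prints "F2 U R'"
```

This only works if the app recorded the scramble, which is why it is a shortcut and not a solver: hand it a cube in an unknown state and it has nothing to replay.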

Then I prompted another 4 times, with specific instructions on which JS solving lib to use, and that worked!
A few more iterations improved UX.

In all, a good collaboration, and a good result. Just not even close to "one-shot".
I can see that with very specific instructions on which libs to use, and how, it may be possible to do it faster.

Below is the generated web app; you can "view source" to see the code, it is all there!
Enjoy.

A Rubik's cube generator AND solver. I've tried this with Claude 3.7 thinking, DeepSeek etc and never came close to this.

Post | LinkedIn

Gemini 2.0 Flash Tested - Is AI Better Than Humans? - YouTube


Courses

Vibe Coding with ChatGPT and Python - Make Apps In Minutes | Udemy

Also available by subscription, which some local libraries provide.

Course: Vibe Coding with ChatGPT and Python - Make Apps In Minutes | Udemy Business

Search results | Udemy Business "Vibe Coding"

Here are the main JS libs used by this solution: //js.cubing.net/cubing/

cubing/cubing.js: 🛠 A library for displaying and working with twisty puzzles. Also currently home to the code for Twizzle. @GitHub

ldez/cubejs: cube.js -- JavaScript library for modeling and solving the 3x3x3 Rubik's Cube @GitHub

Gemini-generated response and prompt, after "watching" the video, at a cost of over 400K tokens!

Okay, recreating the exact Rubik's Cube solver web app shown in the video using a single prompt is a very ambitious goal for any current AI model, even Gemini 2.5 Pro. While the video demonstrates impressive "zero-shot" capabilities, generating a fully functional, interactive 3D application with complex logic (like a Rubik's Cube solver algorithm) and UI controls in one go is pushing the limits.

The examples in the video likely involved some iteration or were carefully selected successful attempts. Complex coding tasks often require refinement and debugging.

However, we can craft a detailed prompt that gives the AI the best possible chance to generate something close in a single attempt. The key is to be extremely specific about all the required features and constraints.

Why a "Single Shot" is Difficult:

  1. Complexity: This involves 3D graphics (Three.js), user interaction (mouse controls, buttons), complex state management (cube state), algorithmic logic (scrambling, solving), and UI updates, all integrated seamlessly.
  2. Algorithm Implementation: Correctly implementing and integrating a robust Rubik's Cube solving algorithm (like Kociemba's two-phase algorithm) within the JavaScript is non-trivial.
  3. Integration: Ensuring the 3D rendering, UI controls, and the solving logic all work together correctly without bugs is challenging.
  4. Code Length & Structure: Generating potentially thousands of lines of well-structured HTML, CSS, and JavaScript within a single file accurately can be difficult for AI.

A Good Prompt Attempt for a Single Shot:

Here’s a detailed prompt designed to maximize the chances of getting a functional result close to the video example in one go. It incorporates many details observed in the video:

Create a complete, self-contained single HTML file (index.html) for an interactive 3D Rubik's Cube web application and solver. All necessary HTML, CSS, and JavaScript code must be embedded within this single file. Use the Three.js library for 3D rendering, loaded via CDN.

**Core Features:**

1.  **3D Cube Display:**
    *   Render a standard 3x3x3 Rubik's Cube in the center of the view.
    *   Use standard Rubik's Cube face colors (White, Yellow, Red, Orange, Blue, Green).
    *   Ensure individual cubelets are distinct.
2.  **User Interaction:**
    *   Allow the user to rotate the entire cube's view using mouse drag (orbit controls).
    *   Allow zooming in/out using the mouse wheel.
3.  **Cube Manipulation Controls (UI Panel):**
    *   Create a clear UI panel (e.g., using HTML/CSS positioned absolutely or relatively).
    *   Include buttons for standard face rotations using standard notation: F, F', B, B', U, U', D, D', R, R', L, L'.
    *   Include buttons for middle slice rotations: M, M', E, E', S, S'.
    *   Clicking a rotation button should animate the corresponding face/slice turn smoothly on the 3D cube model.
4.  **Scramble Functionality:**
    *   Include a "Scramble" button.
    *   When clicked, apply a sequence of 20-25 random, valid Rubik's Cube moves to the cube state.
    *   Animate the scrambling sequence visually on the 3D cube.
    *   Display a status message like "Scrambling..." during the process.
5.  **Solve Functionality:**
    *   Include a "Solve" button.
    *   When clicked, implement an efficient Rubik's Cube solving algorithm (like Kociemba's two-phase algorithm or a similar efficient method) to find the solution steps from the current scrambled state.
    *   Animate the solving sequence visually on the 3D cube, applying the solution moves step-by-step. The animation should be clear and reasonably paced.
    *   Display a status message like "Solving..." during the process and "Cube solved!" upon completion.
6.  **(Optional Bonus - If Feasible):**
    *   Add an input field labeled "Cube Size (N):" allowing the user to enter a number (e.g., between 2 and 7).
    *   Add a "Generate Cube" button that recreates the 3D cube display with the specified NxNxN dimensions. Ensure scrambling and solving logic adapts (or is disabled if too complex for larger sizes). *[Note: Solving larger cubes algorithmically is significantly more complex and might be beyond a single-shot generation].*

**Technical Requirements & Style:**

*   Use modern JavaScript (ES6+).
*   Embed all CSS within `<style>` tags and all JavaScript within `<script>` tags in the single HTML file.
*   Load Three.js library from a reliable CDN (e.g., cdnjs or jsdelivr).
*   Code should be well-structured, commented, and reasonably efficient.
*   The UI should be clean and functional.
*   Animations for rotations, scrambling, and solving should be smooth.

Ensure the final output is a single, runnable HTML file that accurately implements these features.

How to Use and Iterate (Likely Necessary):

  1. Provide the Prompt: Give this detailed prompt to the AI model (like Gemini 2.5 Pro via AI Studio or API).
  2. Test the Output: Copy the generated HTML code into a file (e.g., rubiks.html) and open it in a modern web browser (like Chrome or Firefox).
  3. Identify Issues: Test all functionalities:
    • Does the cube render correctly?
    • Does mouse rotation/zoom work?
    • Do the rotation buttons work and animate correctly?
    • Does the Scramble button work and animate?
    • Does the Solve button work? Does it find a correct solution? Does it animate the solution?
    • Are there any visual glitches or errors in the browser's developer console (press F12)?
  4. Provide Feedback: If (or more likely, when) something doesn't work, provide specific feedback to the AI. For example:
    • "The 'Solve' button doesn't do anything when clicked. Please fix the JavaScript event listener and ensure the solving algorithm is called and animates the result."
    • "The cube animation for the 'F' move is incorrect; it rotates the wrong way. Please correct the rotation logic for the front face."
    • "The UI panel overlaps the cube on smaller screens. Please adjust the CSS positioning."
  5. Repeat: Continue testing and providing feedback until the application works as desired.

Even with Gemini 2.5 Pro's advanced capabilities shown in the video, expect that generating this specific, complex application perfectly in one shot might not happen, and some level of iteration and refinement will likely be needed. Good luck!

Price: 437,989 tokens (out of 1M)
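
As an aside on the "Scramble Functionality" item in the prompt above (20-25 random, valid moves), here is a rough sketch of what such a generator could look like. It is not the code from the generated app; the names `generateScramble` and `randomInt` and the rule of never turning the same face twice in a row are my own assumptions about what "valid" should mean here.

```javascript
// Hypothetical helpers, not taken from the generated app.
const FACES = ["U", "D", "L", "R", "F", "B"];
const SUFFIXES = ["", "'", "2"]; // clockwise, counterclockwise, half turn

// Random integer in [min, max]; rng is injectable so runs can be reproduced.
function randomInt(min, max, rng = Math.random) {
  return min + Math.floor(rng() * (max - min + 1));
}

// Build a scramble of minLen..maxLen moves in standard notation.
function generateScramble(minLen = 20, maxLen = 25, rng = Math.random) {
  const length = randomInt(minLen, maxLen, rng);
  const moves = [];
  let lastFace = null;
  while (moves.length < length) {
    const face = FACES[randomInt(0, FACES.length - 1, rng)];
    // Assumption: skip a repeat of the previous face, since "R R'" cancels
    // and "R R" is just "R2", which would waste moves.
    if (face === lastFace) continue;
    moves.push(face + SUFFIXES[randomInt(0, SUFFIXES.length - 1, rng)]);
    lastFace = face;
  }
  return moves;
}

console.log(generateScramble().join(" "));
```

The returned array can then be fed one move at a time to whatever animates the cube, and kept around for the "Scramble Sequence" display.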




===

Interactive 3D Rubik's Cube

Drag to rotate cube view. Scroll to zoom.

Face Rotations
Slice Rotations (3x3x3 only)
Manual Moves Log
Scramble
Scramble Sequence:
Actions
Solution Steps:
Loading...
