LLM Coding Leaderboard: May 15th (11 Models Tested)
I've compiled all my recent tests of 11 different LLMs into one summary table. The results are based on 3 different Laravel projects that I published on my YouTube channel over the last few weeks.
11-minute video for Premium members. Codex App is really convenient for working on multiple projects at the same time. I've found quite a few "hidden tricks" that are not visible right away. Let me show you.
11-minute video for Premium members. I gave two tasks to 6 different LLMs: Kimi, MiMo, Deepseek, GLM, Qwen, and Minimax. Is there a clear winner?
Many people have been switching from Claude Code to Codex in April/May, so I've compiled some Codex CLI features that you may not know about.
20-minute video for Premium members. I gave three tasks to three models: fix a bug, plan a project, and implement part of the plan. I will compare the code quality, speed, and token usage.
13-minute video for Premium members. I generated a small demo of a multi-language Laravel + Filament project with both Opus and Codex. Let me show the differences in the code: they were shockingly consistent across the codebase.
23-minute comparison video for Premium members. I'm a big fan of Plan Mode in Claude Code, but the Superpowers plugin adds an extra layer on top. Is it worth it? I've tried it and compared it to Codex plan mode, too.
13-minute video for Premium members. A new experiment: switching the tech stack from Livewire to React.js and seeing how LLMs handle browser testing.
A 26-minute video for Premium members. I decided to test Codex GPT-5.4 as my main driver for building a real project with Laravel and Filament. In this video, I will show what I've learned.
A 12-minute video for Premium members. I asked Opus / Codex / Gemini to generate a startup landing page with HTML and Tailwind CSS. Let me show the results.