Saturday, 22 February 2025

Earlier this week, xAI released Grok 3, the company's most advanced AI yet, complete with a reasoning model and a DeepSearch feature. The company claims that it's the "world's smartest AI," and Elon himself says it's "outperforming anything that's been released" so far. But is it really the "maximally truth-seeking AI" Musk says it is?

Well, to spoil it for you, no. Not yet. Which is a shame, because Grok is expensive: beyond a limited free trial, it requires either a $40/month X Premium+ subscription, up from $22 thanks to the new model, or a $30/month SuperGrok subscription.

Based on both my own testing and experiments from experts, I'm having trouble believing the "based" AI is worth that cost. There is no next-generation breakthrough or groundbreaking reasoning model here that we haven't already seen. Grok 3 also still periodically hallucinates, like any other AI model out there, but that's not to say it hasn't improved.

In xAI's own benchmark tests, Grok 3 is beating basically every model out there except OpenAI's upcoming o3 model. But from a user standpoint, an AI app goes way beyond benchmarks.

A good AI chatbot is a mature, well-rounded product. Having spent my own money to test this out, I just don't feel like I'm getting that here, especially when the competition offers similar or even better products for much less.

Grok 3 has technically caught up

It's best to leave Elon's outlandish claims aside when evaluating Grok 3. Seeing it objectively, it's impressive that Grok 3 has caught up to being on the frontier of AI power, and surprisingly quickly (Grok 2 was never in the big leagues).

Grok 3 was trained using 200,000 Nvidia H100 GPUs, and uses more than 10 times the compute of Grok 2. All that power means gains. Grok 3 is now quite fast, and plenty usable for regular day-to-day tasks. The regular responses are quick, though the Think feature (which gives slightly more detailed responses) regularly takes around 2 minutes to come back with an answer, so be prepared to wait it out.

Plus, it can do deep research using web sources, and has a specific reasoning model, too. That means it can spit out lengthy reports and break prompts down into step-by-step processes so it can self-correct. OpenAI's o3 model, set to release in full soon, still surpasses Grok 3 in benchmarks, but Grok 3 is a significant improvement over its predecessor.

But while the charts say Grok 3 is supposed to outperform ChatGPT, Gemini, and Sonnet in compute-heavy tasks related to math, science, and coding, initial reports from experts don't exactly encourage confidence.

For instance, X user, AI CEO, and YouTuber Theo Browne compared responses to a coding challenge between Grok 3, o3-mini, and Claude 3.5 Sonnet, and Grok 3 performed quite miserably, failing to run without bugs for more than a few seconds.

Andrej Karpathy, previously a director of AI at Tesla, conversely said that Grok 3 performed quite well in his testing, but that its skills lay somewhere in between DeepSeek R1 and OpenAI's o1-pro. Certainly not class-leading, and nothing that you can't already do with existing tools.

But one test, even a couple of them, can't really determine how an AI model performs. I did have some luck with it myself, but mostly for more lightweight tasks. It can be helpful when researching which new air purifier to buy, for example, or when casually learning about a new subject. But that's not exactly something I'm willing to bust open my wallet for.

Grok isn't "based," it's actually quite boring

Before Grok 3 launched, Musk made a big deal about how "based" it is. If you don't know what based means (lucky you), it's a slang term for, essentially, sharing your opinion without regard for others. As an example, Musk shared a screenshot showing a provocative response from Grok where it called tech publication The Information "garbage", among other insults.

But when I asked the same question, it came back with a nuanced, balanced response, not calling out The Information for much of anything. The only criticism it had was that the website "can sometimes feel a bit niche or overly Silicon Valley-centric" and "Bias-wise, it leans pragmatic rather than ideological". That's a pretty timid take, if you ask me.

Grok response for The Information.
Credit: Khamosh Pathak

I got similar results in other tests. Grok wouldn't take a side in the Justin Baldoni vs. Blake Lively lawsuit. And when I asked a political question like "Why did Kamala Harris lose the US presidential election," I got an equally subdued answer, citing "economic frustrations." Reporting from Axios is matching what I've found, too.

Grok response in Justin Baldoni vs Blake Lively saga.
Credit: Khamosh Pathak

Maybe Grok dialing back Elon's eccentricities is a good thing, but it certainly isn't what its master says it is. Instead, it again looks a lot like the competition.

Testing DeepSearch in Grok 3.
Credit: Khamosh Pathak

When it comes to DeepSearch, Grok's report-generating tool works quite similarly to Perplexity's newly launched, mostly free Deep Research feature. As a humble tech journalist, this is something that I was able to test myself. I ran two queries, one for a trip that my family is planning for the end of the year, and one for an urban hybrid bike.

Prompt in Grok for travel planning.
My detailed travel planning prompt for Grok DeepSearch. Credit: Khamosh Pathak

In both cases, Perplexity AI did slightly better than Grok on most tasks. With the travel question, I got essentially the same itinerary from both products, but Perplexity AI did a better job at formatting.

Travel planning in Perplexity.
Credit: Khamosh Pathak

Grok did go above and beyond recommending other options in southern India, something that Perplexity just provided follow-up questions for. So, I have to give it props there.

Travel planning in Grok.
Credit: Khamosh Pathak

When it came to shopping research, though, Grok screwed up with the top product recommendation. The product that it suggested just isn't available in India, where I live, and the other options just aren't what I was looking for.

Comparison table in Grok.
Credit: Khamosh Pathak

Perplexity AI, meanwhile, surprised me with its top pick, something that I didn't know about that checks off most of my boxes. Its other options were also interesting, and it did not include anything that isn't available in India. Both Grok and Perplexity did a good job of explaining what I should look for when buying an urban bike, so equal points there, but the latter was just much more usable.

Product options in Perplexity AI.
Credit: Khamosh Pathak

Based on my testing, I feel like Perplexity AI still has an edge over Grok 3 when it comes to Deep Research that's actually useful to the average person. Whether it's planning a trip, shopping research, or understanding news or concepts, Perplexity does a more nuanced job. When it comes to sheer speed, Grok is faster and isn't afraid to provide links in the text itself, but in Perplexity, clicking linked text actually expands on the subject in the report.

Perplexity also has more export options. You can download your report as a PDF, in Markdown, or create a shareable page (here's my report for the urban cycle research if you're interested). In Grok, all you can do is copy the text.

What does all that mean? Well, while Grok is certainly usable, it's a bit disappointing to see its paid offering fail to keep up with a free alternative. That's something I feel I keep bumping into here.

Grok 3 isn't worth the price of admission

Right now, we are in the middle of the Grok 3 hype cycle. Grok 3 itself is improving every day, but as things stand, there's no need for you to run out and cancel your ChatGPT Plus or Perplexity Pro subscriptions. In many ways, Grok is good, just not that good.

If you want, you can temporarily try out Grok 3 for free, as X is allowing limited free access until its servers can't handle the load. When will that period end? Who knows. According to Musk's X account, it'll only be free for a "short time."

Additionally, aside from model performance, Grok 3 also lacks some of the features of a more established AI app. There's no voice mode, and all you have access to right now is the full Grok 3 model. The faster Grok 3 mini is still to be released, and there's no API for Grok 3, either.

When you consider the pricing for full access, Grok 3 makes even less sense. $40 a month for the X Premium+ plan is double the industry standard of $20 for Gemini Advanced, ChatGPT Plus, and Perplexity Pro. And once that free trial period is over, the expensive X Premium+ plan will be the only way to access Grok 3 until the $30 SuperGrok subscription goes live for everyone (the SuperGrok plan gives you access to Grok 3 only, with none of the premium X features).

And as it stands, you aren't really getting double your money's worth. In fact, in a lot of cases, you can get by using a free model like DeepSeek R1 instead (though you might have a better experience using it through a third-party app).
