Economics of the AI Age

I want to talk about what I see: economics in the AI age will be drastically different from that of the internet age, and it boils down to the cost structure of compute.

The current AI company business model reminds me of a business disaster: MoviePass. It launched a subscription platform that let users go to cinemas across the U.S. and watch unlimited movies during their subscription period. It was a promising idea, until it turned into a disaster.

Unit Economics 

The reason was simple: the power users, the people who loved and used the service the most, also cost the company the most. In other words, the more people loved the product, the faster the company bled money. That's a business built upside down.

Most AI companies are on the same path: they are reusing the subscription-funnel business model proven in the internet age, attracting as many users as possible first and figuring out cost efficiency later. This doesn't seem to work in the age of AI.

Cursor is a great example. It's the golden boy of vibe-coding culture, blessed with user love. There is only one problem: the users who rely on Cursor the most, the ones who use it all day, also cost the company the most to serve. And when Cursor tried to restrict usage (so it wouldn't die), it faced backlash from loyal users who felt betrayed.

This foretells a fundamental shift in how business models work in the AI age, and to fully grasp it, one must understand the oldest business concept: unit economics, i.e., do you make or lose money per unit of usage? In AI, that unit is an inference: each time the model generates an answer.
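To make that concrete, here is a minimal sketch of the upside-down unit economics described above. The prices and costs are illustrative assumptions, not real figures from Cursor, MoviePass, or anyone else:

```python
# Illustrative unit economics for a flat-rate AI subscription.
# All numbers are made-up assumptions for the sake of the example.

SUBSCRIPTION_PRICE = 20.00  # dollars per user per month (assumed)
COST_PER_INFERENCE = 0.01   # dollars of compute per model call (assumed)

def monthly_margin(inferences: int) -> float:
    """Profit (or loss) on one user under a flat subscription."""
    return SUBSCRIPTION_PRICE - COST_PER_INFERENCE * inferences

for label, usage in [("casual", 500), ("regular", 1800), ("power", 9000)]:
    print(f"{label:>7} user ({usage:>4} calls): ${monthly_margin(usage):+.2f}")

# The power user, the one who loves the product the most, is the one
# who turns the margin negative: MoviePass economics.
```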

Fundamental difference between the internet and AI

Unit economics didn't come up much during the internet age, because scaling software had low marginal cost. As Kevin Kelly puts it: "The Internet is a copy machine. At its most foundational level, it copies every action, every character, every thought we make while we ride upon it."

AI is different: its underlying technology, the deep neural network, is all about matrix multiplications. Its costs come from two parts: the cost of training a model, and the cost of inference. Training a model is expensive, but it's a one-off payment (for now) and the cost is predictable, so it's not a big issue. Inference cost is the issue, because in business terms it means cost rises proportionally with usage.
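One way to see why training cost is "not a big issue" while inference cost is: the one-off training cost amortizes toward zero as usage grows, while the marginal inference cost never does. A sketch with assumed figures:

```python
# Amortized cost per inference = training_cost / lifetime_inferences + marginal_cost.
# Both figures below are assumptions chosen for illustration.
TRAINING_COST = 1e8   # dollars, paid once
MARGINAL_COST = 0.01  # dollars of compute per inference

for lifetime in (1e8, 1e10, 1e12):
    amortized = TRAINING_COST / lifetime + MARGINAL_COST
    print(f"{lifetime:.0e} lifetime inferences -> ${amortized:.4f} per inference")

# Training washes out at scale; the per-inference floor remains. That is the
# inverse of internet software, where the marginal cost itself is near zero.
```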

Now — how to solve this? 

Quantization, computer architecture choice, and business model.

Quantization rocks. 

With quantization you can lower the cost of inference. The math is simple: a model's memory footprint is (number of parameters) × (bits per weight) / 8 bytes, so cutting FP32 weights down to 8 bits shrinks the footprint, and the memory traffic that dominates inference cost, by 4x (see the sketch after the list below). Would quantization lower precision? Yes. But it now seems clear that the dominant AI use case will be specialized AI and multi-agent frameworks. Which means:

  1. What matters is how well you coordinate and organize models around a goal, rather than maximizing the brilliance of any single model.
  2. Cost explodes if you plan to use full-precision models for multi-agent work rather than quantized ones.
  3. By fine-tuning on domain-specific data, a quantized model can perform on par with, or even better than, a much larger full-precision model.
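A minimal sketch of that math: the footprint formula, plus the simplest common quantization scheme (symmetric linear quantization to int8). The 7B parameter count is an assumption picked for illustration:

```python
import numpy as np

# 1) Footprint: memory = parameters * bits / 8. Assumed 7B-parameter model.
PARAMS = 7e9
for bits in (32, 16, 8, 4, 1):
    gb = PARAMS * bits / 8 / 1e9
    print(f"{bits:>2}-bit weights: {gb:6.2f} GB ({32 // bits}x smaller than FP32)")

# 2) The quantization step itself, symmetric linear quantization to int8:
w = np.random.randn(4, 4).astype(np.float32)      # full-precision weights
scale = np.abs(w).max() / 127.0                   # map the largest |weight| to 127
w_q = np.round(w / scale).astype(np.int8)         # stored as 8-bit integers
w_hat = w_q.astype(np.float32) * scale            # dequantized at compute time
print("max abs rounding error:", np.abs(w - w_hat).max())  # precision traded away
```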

Of all these choices, the best combination would be the binary neural network (BNN). In a BNN, weights and activations are constrained to two values only (commonly ±1, each stored as a single bit). BNNs seem to be the most scalable choice because (a minimal sketch follows the list):

  1. Lower inference cost: BNNs cut the memory footprint by up to 32x (one bit per weight instead of 32). Memory footprint is the amount of data a model needs to store and move around. This directly addresses the number-one cost of AI: memory movement, rather than computation itself, consumes most of the energy, and thus money, in inference.
  2. BNNs are built for the edge. Edge means offloading computing from the cloud to the local device; for example, the recent Apple M5 chip is designed for exactly that: local, on-device, offline inference.
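To see what "weights as single bits" looks like, here is a minimal NumPy sketch of one binarized layer, assuming the common sign-based scheme with a per-row scaling factor (in the spirit of the XNOR-Net line of work); the layer sizes are arbitrary:

```python
import numpy as np

def binarize(x: np.ndarray) -> np.ndarray:
    """Map real values to {-1, +1}; each entry needs only one bit of storage."""
    return np.where(x >= 0, 1.0, -1.0)

def bnn_layer(x: np.ndarray, w: np.ndarray) -> np.ndarray:
    """Forward pass of one binarized layer.

    With {-1, +1} values, the dot product reduces to XNOR + popcount in
    hardware, which is why inference gets so cheap. The per-row scale
    alpha (mean |w|) recovers some of the lost magnitude information.
    """
    alpha = np.abs(w).mean(axis=1, keepdims=True)  # real-valued scaling factor
    return binarize(x) @ binarize(w).T * alpha.T

# Toy usage: 4 input features -> 3 output units.
rng = np.random.default_rng(0)
x = rng.standard_normal((1, 4))
w = rng.standard_normal((3, 4))
print(bnn_layer(x, w))
```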

As mentioned, inference cost is what matters most; you can't scale AI as a service if the more people love your product, the more money you bleed. BNNs attack the heart of this problem.

Perhaps the only downside is that BNNs, though cheaper at inference, are more expensive to train, which means higher CapEx. But this isn't a real issue: in business, a cost you can predict is hardly a problem compared to one you can't.

Computer Architecture choice. 

Why is Sam Altman so obsessed with compute lately? Because OpenAI is bleeding money: users cost more than they pay, and compute is the make-or-break challenge to overcome. In other words, for OpenAI to be sustainable, it must drastically lower the cost of inference. And that means custom hardware-software co-design.

Google, on the other hand, co-designs Gemini with its custom TPU v5. This integration reportedly cuts the cost per inference by an order of magnitude.

Seemingly, Apple is preparing to do the same. Reportedly, Gemini's inference cost is much lower than ChatGPT's. Why? Because specialized chips can improve energy efficiency by 5 to 10 times, and energy is the biggest variable cost in inference, and ultimately in total cost.

As mentioned before, the bottleneck comes from memory, not compute. Co-designing custom architectures exploits the principle of locality: keeping data close to the processor, moving less of it, and shortening the distance it travels.
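How lopsided is it? These per-operation energy figures are rough, frequently cited numbers from Horowitz's ISSCC 2014 talk (45 nm process); exact values vary by chip, but the ratios are the point:

```python
# Rough energy-per-operation figures, commonly cited from Horowitz (ISSCC 2014).
# Exact numbers vary by process and chip; only the ratios matter here.
ENERGY_PJ = {
    "32-bit FP add":                0.9,
    "32-bit FP multiply":           3.7,
    "32-bit SRAM read (8KB cache)": 5.0,
    "32-bit DRAM read":             640.0,
}

base = ENERGY_PJ["32-bit FP add"]
for op, pj in ENERGY_PJ.items():
    print(f"{op:<30} {pj:>7.1f} pJ ({pj / base:>5.0f}x a float add)")

# A DRAM read costs roughly 700x a float add: moving data, not multiplying
# it, is where inference energy (and money) goes. Hence locality.
```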

Business Model

In the future, AI companies will buy compute from specialized compute providers — much like how businesses buy large quantities from factories. When you buy in bulk, you get a lower unit price. Factories prefer large, predictable orders because they stabilize cash flow and reduce uncertainty, and in return, they offer discounts.

AI will likely work the same way: companies will negotiate long-term compute contracts at wholesale rates. In other words, economies of scale will emerge not only in model training but in compute purchasing itself.

On the user side, charging by usage makes far more sense than the current subscription model. It works like a coffee shop: each cup has a price. Users pay per inference, per generation, per unit of value. This aligns cost with consumption. The business stays profitable as usage scales, rather than bleeding faster as users love the product more.
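Continuing the earlier sketch, here is the same set of users under per-inference pricing; the compute cost is the same assumption as before, and the markup is also assumed:

```python
# Flat subscription vs. usage-based pricing, per user per month.
# All prices and costs are illustrative assumptions.
COST_PER_INFERENCE = 0.01    # dollars of compute per call (assumed)
PRICE_PER_INFERENCE = 0.015  # dollars charged per call (assumed 50% markup)
FLAT_PRICE = 20.00           # dollars, flat subscription (assumed)

def flat_margin(calls: int) -> float:
    return FLAT_PRICE - COST_PER_INFERENCE * calls

def usage_margin(calls: int) -> float:
    return (PRICE_PER_INFERENCE - COST_PER_INFERENCE) * calls

for calls in (500, 1800, 9000):
    print(f"{calls:>5} calls: flat ${flat_margin(calls):+8.2f}, "
          f"usage-based ${usage_margin(calls):+8.2f}")

# Under usage pricing every marginal call is profitable, so heavy users
# help the business instead of sinking it: cost is aligned with consumption.
```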

In the end, the future of AI economics may look less like the internet and more like manufacturing — stable supply chains underneath, predictable costs, and pricing models grounded in outcomes. 

On the Flywheel Effect

I've been super interested in the flywheel effect recently. I want to talk about what I think the flywheel effect is, when it works, and when it doesn't.

The flywheel effect is a phenomenon where a business enters a positive reinforcement loop: the more users it has, the more its existing users attract new users, mainly through two effects.

The first kind comes from humans' natural connections. This is the sort of effect we saw with Facebook: once adoption reaches a certain threshold, users join, then their friends join, then the friends of friends, and so forth. This kind of flywheel is dangerous to compete against and the most significant advantage one could have.

The other kind of flywheel is rarer: it comes from the reputation, or the enormous trust, a business earns through its disruptive ability. It is no longer a gravitational pull; it is more like a loophole in the old system of the world. The classic example is Amazon.com. When its flywheel first began to spin, Amazon could:

1. Order stock from suppliers into its warehouses.

2. Leverage its reputation to pay suppliers 90 days after receiving the goods, instead of immediately.

3. Sell those goods within the 90 days.

In most cases they could sell that stock in time, since they had an enormous number of users and algorithmic recommendation systems that knew what users wanted and how to convince them to buy: essentially infinite cash!

Those are the two classic archetypes of the flywheel effect. Below are some general rules: not doctrine like a law of physics, more like observed phenomena.

1 Blitz-proof

"Once you had a fly wheel, competitors leverage their existing user base for a easy win"

Not so. Even with an astonishing amount of advantage, a competitor cannot easily beat your flywheel. For example, in 1999 Amazon tried to beat eBay in the online consumer-to-consumer market with Amazon Auctions, leveraging its large user base and recommendation systems. eBay, at the time, was already known for its self-reinforcing loop: more users -> better sales -> more users (both buyers and sellers).

Amazon had an enormous user base, much bigger than eBay's, similar if not greater purchasing power, and much better recommendation systems and data infrastructure. Yet Amazon Auctions was shut down in 2001 for lack of traction.

This shows a simple truth: once a self-reinforcing loop is established, even a force as big as Amazon, which at the time had far more active users than eBay, fails to overtake it. Perhaps the reason is that the point was never a certain number of users or a database; it is the loop, the ecosystem itself.

Amazon had the customers, but it did not have the ecosystem eBay had established: the sellers, and the environment those users were used to, a very different kind of environment.

In short, once a positive reinforcement loop has been established, you have a great wall of protection for your business.

Unless…

Unless you meet a stronger flywheel.


2 A stronger flywheel

Imagine your business has a flywheel effect. You don't have to do anything: users keep growing and growing, and this in turn attracts even more users. Disruptors from other industries find it extremely hard to disrupt you. What could go wrong?

This is what MySpace believed in 2008 (famous last words).

A quote from one of Facebook's co-founders: "We were able to beat MySpace because they messed up internally, with management problems and short-termism. MySpace had such a strong self-reinforcing loop that it was impossible for others to compete with them, until they blew it up themselves."

But is that the truth, or an oversimplification?

I'm sure that's one of the big reasons, but an overlooked aspect is that Facebook had a much stronger, more universal flywheel, and that set its potential. Without it, overtaking MySpace might have been impossible, even with MySpace's internal problems.

So what was it? The issue with MySpace's loop was a lack of retention.

On MySpace, more users led to more users, but not necessarily to better retention. Users joined MySpace to look up their friends, stayed a short while, and then drifted away, whereas Facebook users hopped back in every day, usually more frequently as time went on.

There was also a fatal flaw in MySpace's design. Lacking standardization, the more users it had, the messier and harder to navigate it got. No standardization protocol meant users could set up their homepages however they wanted. Buttons flew everywhere. Worst of all, there was terrible music, and some users' terrible taste did not help. Jumping from one person's page to another could feel like entering a whole different world.

This was MySpace's strength, and it became its weakness as it increased the entropy of the ecosystem: the only users who really stuck around were niche (musicians, artists, coders, etc.). In other words, its only real users were a small group; the rest were isolated islands with special-looking profiles, and the more it scaled, the more this lack of standardization hurt. Facebook, by contrast, focused on socializing and connection. Users weren't confused, because its interface was standardized. Facebook also invested earlier in algorithms, a better database, less tech debt, and fewer crashes.

All this leads to one thing: more users -> more users + better retention.

In other words, MySpace had a flywheel, but it was also losing users at an astonishing rate: the entropy that grows with user growth wasn't well managed. Facebook managed it extremely well through algorithmic improvement and a strong feedback loop. They nailed it: once you reach a threshold, the more users you have, the more users you attract, and the more willing they are to stay, because you're using the data they generate to create a better experience. In short, here's the second law of the flywheel: it can be taken down by a stronger flywheel serving the same kind of users.



Nature of Technology

1 Technology accelerates

Technology compounds, growing roughly twice as fast every year.

The process is boring: it compounds in the background until a breakthrough is made and the whole world notices.


Look at a chart of GDP growth over the last 200 years. Before the 1800s it is essentially a flat line; then it turns and grows exponentially. The main difference between the last 200 years and the previous 200,000 is that technological advancement took over from labour as the main driver of economic growth.


2 Genetic pool

The reason technology grows exponentially is that it works essentially like bacteria: the more cells you have, the more cells reproduce, and the faster the growth. One example is the birth of large language models, which traces back to AlexNet in 2012 proving neural networks superior to traditional algorithms; that in turn was only feasible because of advances in parallel computing, specifically CUDA and graphics processing units. In other words, a combination of different fields in hardware and software made large language models feasible, and in turn large language models accelerate the development of technology: more people can program, creativity is released with faster programming, and individuals become more efficient. This forms a positive reinforcement loop where one technology contributes to the feasibility of another, which then leads to the next one faster.


3 Bottleneck

Technological advancement often follows a theme of solving bottlenecks in society. In the 1800s, with the creation of the steam engine, there was an urgent need to move the output of the Industrial Revolution around the world as fast and reliably as possible, sparking the birth of railway systems across Europe and America. With the maturity of railways, we could build and transport more; the bottleneck became steel, leading to a boom in the steel industry. After steel and railways came oil.

The first bottleneck after finding oil was refining it. After that, the bottleneck became transporting the liquid efficiently. Standard Oil addressed these last two bottlenecks, establishing a monopoly like never before. Once the oil industry settled, a long tail formed out of mankind's organizational capability: lots of oil, lots of steel production, and thus a foundation for many innovations to build upon, like personal vehicles.

The birth of the internet followed the same path: first the internet itself, amazing but unable to scale until the bottleneck of standardization was solved; then the bottleneck became infrastructure, like building railways for the Industrial Revolution. With the AI revolution we see the same trend: LLMs sparked the demand for AI, and the bottleneck became energy and computing power. Once those are solved, the bottleneck shifts to the supply of data.

4 A toy until its atomic unit is established

Almost every mainstream thing started at the edges. The internet was seen as a toy; Instagram started as a literal toy; Amazon started out as a better bookshop; Twitter and Slack started out as internal tools inside other companies; so did AWS, which now brings in the majority of Amazon's operating income.

Every technology follows a modular path. The internet was a scattered medium for people and hackers to express themselves until the birth of DNS, which standardized domains. Early social media was also a toy and a mess: on MySpace everyone could build their own pages, and it soon descended into chaos; only when Facebook brought standardization to social media pages could social media actually scale. Likewise, tokenization and the attention mechanism standardized language processing, making it scalable by creating standard, repeatable modular parts.

Research studies

I've realised that research studies sometimes lie, and it's happening more and more often.

Here is how I try to identify bad research:

1 Studies with n lower than 70 are, in general, just way too small.

2 The method shows ONLY correlation to the subject WITHOUT pointing out the limitations of the data.

3 Using terms like "no evidence" and then drawing conclusions anyway. For example, I've learnt to ignore every "after the experiment, we found no evidence that xxx is harmful, thus xxx is not harmful." Here I follow the same rule as experimental physics: to disprove a theory, you bring solid evidence that it is false; absence of evidence is not that, and it works no other way.

4 A general rule: never trust influencers or organisations that benefit directly or indirectly from the research, and always look for the counter-argument. If there is no counter-argument, something must be wrong. In other words, don't trust someone who profits off your belief.

5 Remember when cigarettes were "good for you"?
"There's no evidence that smoking causes cancer," some expert said. "In fact, nicotine can improve cognition and potentially reduce the risk of dementia."
Déjà vu...

Mecha Cities

Cities so big they can be seen clearly from orbit will be home to 80% of the population. These cities are built to adapt to a constantly changing world: every road and building can be moved somewhere else, like Lego bricks. They are built on flat ground, with mountains and any other obstacles removed unless kept for decoration.

Buildings are built by robots, incredibly cheaply and quickly; even the poorest live in big, comfortable apartments with their own gardens. There are about 7 to 15 of these cities. They are the main geopolitical forces on the globe, competing with one another, yet they have reached an amazing equilibrium of power and cooperation. I call them MechaCities.


On planning

A plan to never fail is the plan of death.

A plan with no anti-plan is a plan for failure.

A plan with no mathematical model is not a good plan.

A plan with no feedback loop is guaranteed to be half-assed.


A day with no pain is a day with no gain.

A second of romance is a gallon of venom.

Time is nothing but an amplifier.



On Mark Zuckerberg's Meta

The transition from Facebook to Meta will be one of the essential business case studies of all time.

Criticism and difficulties are everywhere, and everyone believes Mark Zuckerberg made a dumb move.

But Mark Zuckerberg really knows what he is doing here. This is the essence of entrepreneurship: taking it to the next level.

There will always be difficulties, risk, and punishment.

If you start with nothing, you have less to lose, and you get less criticism and more encouragement.

In Mark's case, you start with resources: you have more to lose, you face more criticism, and your moves are studied more closely.

But Mark is a good example of an entrepreneur who really didn't settle (no one admits they settled, until they look back).

Do not underestimate your Will

1 Today I talked to a brave young woman who was diagnosed with a deadly disease, fought her way out, and is working hard to make her second life meaningful. I asked her a lot of questions, and the most important one was: "How big a role did psychological factors play during your fight?" Her answer: 80%.