
  • I am an LLM researcher at MIT, and hopefully this will help.

    As others have answered, LLMs have only learned the ability to autocomplete given some input, known as the prompt. Functionally, the model strictly predicts the probability of the next word+, called a token, with some randomness injected so the output isn’t exactly the same for a given prompt.
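
    To make that concrete, here’s a tiny sketch of what “predict the next word, with some randomness injected” means. The vocabulary and scores below are made up for illustration; they are not any real model’s output:

    ```python
    import numpy as np

    # Hypothetical scores a model might assign to the next token
    # after the prompt "The sky is" -- purely illustrative numbers.
    vocab = ["blue", "falling", "clear", "green"]
    logits = np.array([4.0, 2.5, 2.0, 0.5])

    temperature = 0.8                  # <1 = more predictable, >1 = more random
    probs = np.exp(logits / temperature)
    probs /= probs.sum()               # softmax -> probabilities summing to 1

    print(np.random.choice(vocab, p=probs))   # usually "blue", but not always
    ```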

    The probability of the next word comes from the model’s training data, combined with a very complex mathematical method that computes the influence of every previous word on every other previous word and on the newly predicted word. This method is called self-attention, but you can think of it as a computed relatedness factor.
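
    That relatedness factor is, at its heart, a table of dot products between tokens. Here’s a stripped-down sketch of single-head self-attention; a real model adds learned query/key/value projections, many heads, and causal masking, all omitted here:

    ```python
    import numpy as np

    def self_attention(x: np.ndarray) -> np.ndarray:
        """x: (seq_len, d) array, one row per token embedding."""
        d = x.shape[-1]
        # Compare every token against every other token:
        # scores is a (seq_len, seq_len) "relatedness" matrix.
        scores = x @ x.T / np.sqrt(d)
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)   # row-wise softmax
        # Each output row is a relatedness-weighted mix of all tokens.
        return weights @ x

    tokens = np.random.randn(5, 8)           # 5 tokens, 8-dim embeddings
    print(self_attention(tokens).shape)      # (5, 8)
    ```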

    Computing this relatedness factor is very expensive, and the cost grows quadratically with the number of words involved, so models are limited in how many previous words they can use. This limitation is called the Context Window. The recent breakthroughs in LLMs come from using very large context windows to learn the relationships between as many words as possible.
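
    A quick back-of-the-envelope on that quadratic growth, counting just the n-by-n score matrix in 4-byte floats (real implementations optimize this heavily):

    ```python
    # Memory for one (n x n) attention score matrix in float32.
    for n in (1_000, 10_000, 100_000):
        megabytes = n * n * 4 / 1e6
        print(f"{n:>7} tokens -> {megabytes:>10,.0f} MB per head per layer")
    # 10x the context => 100x the memory for this matrix.
    ```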

    This process of predicting the next word is repeated iteratively until a special stop token is generated, which tells the model to stop generating more words. So, quite literally, the model builds an entire response one word at a time, from left to right.
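
    Putting it together, generation is just the sampling step above run in a loop. predict_next_token below is a hypothetical stand-in for the entire model:

    ```python
    STOP = "<eos>"   # the special stop token

    def generate(prompt_tokens: list[str], predict_next_token) -> list[str]:
        tokens = list(prompt_tokens)
        while True:
            nxt = predict_next_token(tokens)   # conditions on everything so far
            if nxt == STOP:
                break                          # model signals it is done
            tokens.append(nxt)                 # becomes context for the next step
        return tokens
    ```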

    Because every future word is predicated on the previously stated words, whether those came from the prompt or were generated along the way, it becomes impossible for the model to apply even the most basic logical concepts unless all the required components are already present in the prompt or have somehow serendipitously been stated by the model in its own response.

    This is also why LLMs tend to work better when you ask them to work out all the steps of a problem instead of jumping to a conclusion, and why the best models tend to rely on extremely verbose answers to give you the simple piece of information you were looking for.
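
    For example (an illustrative prompt, not from any benchmark):

    ```python
    direct = "Q: Alice has 3 boxes of 12 eggs and uses 9. How many are left? A:"

    step_by_step = (
        "Q: Alice has 3 boxes of 12 eggs and uses 9. How many are left?\n"
        "A: Let's work it out step by step:"
    )
    # The second prompt pushes the intermediate results (3 * 12 = 36,
    # 36 - 9 = 27) into the generated text, where later tokens can
    # condition on them -- the "serendipity" above, made deliberate.
    ```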

    From this fundamental understanding, hopefully you can now reason about the LLM’s limitations in factual understanding as well. For instance, if a given fact was never mentioned in the training data, or an answer simply doesn’t exist, the model will make one up, inferring the next most likely words to create a plausible-sounding statement. Essentially, the model has gotten so good at faking language understanding that, even when it has no factual basis for an answer, it can easily trick an unwitting human into believing the answer is correct.

    ---

    +More specifically, these “words” are tokens, which usually correspond to some smaller part of a word. For instance, understand and able might be represented as two tokens that, when put together, become the word understandable.
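
    If you want to see real splits, the open-source tiktoken library (the tokenizer used by OpenAI’s models) will show them to you. Note that understand + able is illustrative; the exact pieces a given tokenizer produces may differ:

    ```python
    import tiktoken  # pip install tiktoken

    enc = tiktoken.get_encoding("cl100k_base")
    ids = enc.encode("understandable")
    print(ids)                             # the token ids
    print([enc.decode([i]) for i in ids])  # the text piece each id covers
    ```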




  • I wholeheartedly agree with the purpose of this community and what it advocates for, but I wanted to add some rebuttals to your points.

    1. You mention you have the privilege of not using your car. In car-centric parts of the world, which is anywhere that isn’t a big city, this is a privilege. My family lives in a town of 50,000 people in Germany and still needs to use their cars every day; there is only a bus system to get them around. Every working-age adult drives daily, including the one who lives in Köln and literally walks across the street to his office, because car transportation is far more time-efficient than transit.

    2. See the anecdote above. I live in Boston, which is supposedly extremely transit-friendly, and the T and commuter rail, while remarkably extensive, are abysmal. I rode them for two months until I finally gave up and got a car. I live in a house 0.5 miles from the commuter rail station, the cheapest around at $750k; I should be able to count on reliable transport to MIT/Kendall.

    3. Your issues with driving, being honked at and being annoyed by the lack of right of way, all seem to come from inexperience driving in the city. On roads with speed limits of 30 mph there is effectively no right of way; it’s just about whoever goes first. After some time, you learn what to expect from locals and adapt to their style. But I understand if driving in a big city makes people uncomfortable; there is a lot going on, and a lot to pay attention to.

    4. Schedule. My god is our transit schedule awful. The commuter rail runs only once an hour, and it’s regularly either 5 minutes early or 20 minutes late, so it’s completely unreliable. The Red Line is now slow as fuck, crawling at 10 mph in most areas. It’s faster to ride your bike between stations and get stopped by every traffic light than it is to ride the train. And now the Red Line only has service every 20-30 minutes!

    I loved visiting London. We even got a rental car to see things like Stonehenge and Brighton, but I never felt the need to use it much within the city. While I thoroughly enjoyed driving through parts of London, like tiny bridges with inches of clearance on either side of the vehicle or the massive roundabouts near Victoria station, I never needed to do that for local journeys.

    It takes 30 minutes for me to drive to work but 50-90 minutes to go by public transit, and I literally live in a massive transit corridor with service from my house directly to work. It’s absolutely absurd. Essentially, transit only broadly works in the US if you live in NYC. It’s too sparse in SF to be used widely, too sparse in DC, ten times worse in Chicago with its urban sprawl, and unreliable as fuck in Boston. Boston doesn’t even have a reasonable train to the goddamned airport (yes, I know about the Blue Line, but it’s still a 15-minute bus ride from the Blue Line station, and you can only transfer to the Blue Line from the Green Line).

    This is why people drive: because for the vast majority of us, even those in Europe, there is no better alternative. If transit were so much cheaper, then why doesn’t every village of 10,000 people in Europe have a tram? There are simply too many places people want to go, and only extreme density can make transit cost-effective.




    This is the only response required. I’m quickly becoming exhausted by reading everyone’s epiphany on “enshittification”, as if it were some natural eventuality. Yes, the money must eventually come, but not always at the expense of platform quality. If anything, the results we see from “enshittification” come from the fact that most businesses eventually fail due to poor leadership.

    Just to echo what you have already said, money today is simply more expensive than it used to be. We even see the impacts of macro monetary decisions on households.

    Buying a house or a car on loan is far more expensive than it would have been a year and a half ago. A $500,000 house in 2021 would cost about $2,000 a month at 2.75% interest with 20% down. Today that same payment is about $2,800, or 40% more, at 7.75% interest.
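
    For the curious, those payments come from the standard fixed-rate amortization formula. Here’s a quick sketch, assuming a 30-year term and counting principal and interest only (the totals above presumably fold in taxes and insurance):

    ```python
    def monthly_payment(principal: float, annual_rate: float, years: int = 30) -> float:
        r = annual_rate / 12              # monthly interest rate
        n = years * 12                    # number of payments
        return principal * r * (1 + r) ** n / ((1 + r) ** n - 1)

    loan = 500_000 * 0.80                 # $500k house, 20% down

    for rate in (0.0275, 0.0775):
        print(f"{rate:.2%}: ${monthly_payment(loan, rate):,.0f}/mo principal + interest")
    ```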

    Modern companies live on revolving debt, so if that debt suddenly gets 40% more expensive while the same amount of money is also less valuable (inflation), then they need to make up the difference somehow.

    Corporations are trying to find the balance between squeezing more revenue to pay their ever-increasing debt bills and not destroying the environment that attracted the users (their product) in the first place. Twitter and Reddit are just going about it horrifically because of poor business leadership and decision-making. Netflix’s approach appears to be sustainable, and there is no doubt that YouTube will be fine in the long run.

    This is not meant to excuse the decisions made by Twitter and Reddit. They’ve made their bed through their own horrible decisions, and now they’ve got to sleep in it.