• Lvxferre@lemmy.ml
    1 year ago

    I didn’t read this paper (I’ll update this comment once I read it), but this sort of argument relying on emergent properties pops up from time to time to claim that the bots are handling something beyond tokens. It’s weak for two reasons:

    1. It boils down to an appeal to ignorance: “we don’t know, it’s a black-box system, so let’s assume that some property (conceptualisation) is there, even if there are other ways to explain the phenomenon (output)”.
    2. Hallucinations themselves provide evidence against any sort of conceptualisation, especially when the bot contradicts itself.

    And note that the argument does not handle the lack of pragmatic purpose of the bot utterances. At all.

    Specifically regarding games, what I think is happening is that the bot is handling some logic based on the tokens themselves, in order to maximise the probability of a certain output (e.g. “you won”). That isn’t even remotely surprising, and it doesn’t require any sort of further abstraction to explain.


    EDIT, from the paper:

    If we think of a board as the “world,” then games provide us with an appealing experimental testbed to explore world representations of moderate complexity

    This setting allows us to investigate world representations in a highly controlled context

    Our next step is to look for world representations that might be used by the network

    Othello makes a natural testbed for studying emergent world representations

    To systematically determine if this world representation

    Are you noticing the pattern? The researchers take for granted, as an unspoken premise, that the model will have some sort of “world representation”, with no null hypothesis (H₀) like “no such thing”.

    And at the end of the day they proved that a chatbot can perform the same sort of logical operations that a “proper” game engine would.
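
    To make the null hypothesis point concrete, here is a minimal sketch (not from the paper) of one way a “no such thing” baseline could be checked: a permutation test that asks how often a probe’s predictions match randomly shuffled board labels as well as they match the real ones. The function names, the toy data, and the 0/1/2 encoding of squares are all assumptions made up for illustration.

```python
import random

def accuracy(predictions, labels):
    """Fraction of positions where the probe's prediction matches the label."""
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)

def permutation_p_value(predictions, labels, n_permutations=1000, seed=0):
    """Estimate how often shuffled labels score at least as well as the real ones."""
    rng = random.Random(seed)
    observed = accuracy(predictions, labels)
    shuffled = list(labels)
    at_least_as_good = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # break any real link between prediction and label
        if accuracy(predictions, shuffled) >= observed:
            at_least_as_good += 1
    # Add-one correction so the estimate is never exactly zero.
    return observed, (at_least_as_good + 1) / (n_permutations + 1)

# Hypothetical toy data: 0 = empty, 1 = black, 2 = white (not real probe output).
true_states = [0, 1, 2] * 20
probe_output = list(true_states)
probe_output[0] = 1  # one deliberate mistake
observed, p = permutation_p_value(probe_output, true_states)
print(observed, p)
```

    A small p-value only says the predictions track the labels better than chance; it says nothing by itself about whether that amounts to a “world representation” or just logic over tokens.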

      • Lvxferre@lemmy.ml
        1 year ago

        I think that image models are a completely different beast from language models, and I’m simply not informed enough about image models. So take what I’m going to say with a grain of salt.

        I think it’s possible that image models do some sort of abstraction that resembles how humans handle images, including modelling a third dimension not present in a 2D picture, or abstractions like foreground vs. background. Whether they do it or not, I don’t know.

        And unlike for language models, the image model hallucinations (e.g. people with six fingers) don’t seem to contradict the idea that the model still recognises individual objects.