ailsing sent along a link to the new michel gondry video this week.
while watching it i was thinking about video annotation and story representation (surprise, surprise!). there is not much variation from frame to frame: night or day, scarf or no scarf, sign to bridge, city to desert, car moving or car not moving.
if i count night -> day transitions i could create a rule that generates days passed. a no scarf -> scarf rule could infer that it's colder outside. a car moving -> not moving transition could mean stopping to rest.
but these are rules that always produce the same outcome - a more rigid interpretation than our minds might offer.
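a rough sketch of what those fixed rules might look like in python. everything here is made up for illustration: the `Frame` fields, the `transitions` and `interpret` helpers, and the four sample frames are assumptions, not annotations taken from the actual video.

```python
from dataclasses import dataclass


@dataclass
class Frame:
    # hypothetical per-frame annotations (assumed, not from real footage)
    night: bool
    scarf: bool
    car_moving: bool


def transitions(frames, attr, before, after):
    """count flips of `attr` from `before` to `after` between consecutive frames."""
    return sum(
        1
        for prev, cur in zip(frames, frames[1:])
        if getattr(prev, attr) == before and getattr(cur, attr) == after
    )


def interpret(frames):
    """apply the three fixed rules; the same frames always give the same reading."""
    return {
        # each night -> day transition counts as one day passed
        "days_passed": transitions(frames, "night", True, False),
        # any no scarf -> scarf transition is read as "it got colder"
        "got_colder": transitions(frames, "scarf", False, True) > 0,
        # each moving -> not moving transition is read as a stop to rest
        "rest_stops": transitions(frames, "car_moving", True, False),
    }


frames = [
    Frame(night=False, scarf=False, car_moving=True),
    Frame(night=True, scarf=False, car_moving=True),
    Frame(night=True, scarf=False, car_moving=False),
    Frame(night=False, scarf=True, car_moving=True),
]

print(interpret(frames))
# {'days_passed': 1, 'got_colder': True, 'rest_stops': 1}
```

the rigidity is easy to see here: a scarf only ever means cold and a stopped car only ever means rest, no matter what else is in the frame.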
how do we get from all these bits, these details of the frame, to "someone went on a road trip" or "an escape"?
if you look at only the first and last frames, what can they tell us about what happens in the interim?
~ b