Wednesday, July 31, 2019

Bizarre Wikipedia Edit War on Quentin Tarantino's "Once Upon a Time in Hollywood"


Wikipedia is many things to many people. Some view it as the go-to place to check basic facts; others view it as a template for a future universal source of all knowledge, an imperfect start of a great thing. There are people who genuinely try hard to make it so, and there are other people who fuel the so-called Wiki Wormhole with unexpected and downright weird things found in the far corners of the world and the web (AV Club even runs a column about that).

Finally, there are people who take advantage of Wikipedia being a massive compendium of seemingly sourced information to intentionally distort and manipulate certain facts in order to orchestrate confusion and conflict.
The usual victims of such actions are articles on politics, political theory, public figures of all sorts and most science-related articles.
The editorial boards and active users try to keep things straight for the benefit of the public. But sometimes the trolls persist, and this causes so-called "edit wars", in which the history of changes and the attempts to find out the reasons behind them become more prominent than the subject itself.

That's what happened this time.


Quentin Tarantino's new film "Once Upon a Time in Hollywood" went into wide release in late July. The film garnered positive buzz and made a splash at the box office. Like any significant film release, it got a page on Wikipedia. For a while it only had an official plot synopsis and the cast list.

But around the time the press screenings rolled out and the wide release date was coming up, it was expanded into a full-blown plot description, spoilers and all. This kind of thing happens all the time and usually it is not a big deal.

However, things weren't that simple this time.

The big thing about "Once Upon a Time in Hollywood" is that it is a piece of historical fiction that applies a bit of artistic license to its depiction of certain historical events. In this case, it is the 1969 murder of Sharon Tate and several of her friends by the cult leader Charles Manson and the members of his "family".

In Tarantino's version of events, the murder never happens, thanks to a chain of events that diverts the attention of the "family" members to the film's protagonists, Leonardo DiCaprio's Rick Dalton and Brad Pitt's Cliff Booth, who quickly dispatch them in a gleefully gruesome fashion. And everyone lives happily ever after, because the film is a fairy tale. It is a spin on the historical event that tries to imagine what could have been if things had gone a little differently.

However, the Wikipedia description of the concluding section of the plot, which appeared in late May after the film's Cannes premiere, was drastically different from what actually happened on the screen. It retained the story point of the murder plot falling apart, but it took things in a much more outlandish direction.

Here's the bit in question:


The fact that this plot description was fabricated and misleading was repeatedly pointed out by film critics. As Andrew Woods put it: "Don’t even read it if you don’t want to know what DOESN'T happen." (Great statement, by the way.)

Incorrect and downright made-up plot descriptions happen all the time; that's a part of Wikipedia's crowdsourced nature, especially for anticipated media releases. The difference from this situation is that usually the information gets corrected upon release and everybody moves on with their lives.

But this time, when users tried to correct the plot description, they hit a wall. Why? The thing is, Wikipedia guidelines have a line that says "it is not acceptable to delete information from an article because you think it spoils the plot".

And so each time users edited the bit, the fake one turned up again. It happened multiple times going into the release week. The film's "talk" page shows a fascinating story of what was going down behind the scenes.

The initial discussions were about casting controversies, how much of the film's plot the summary should reveal and whether the editors should respect Tarantino's wishes to avoid spoilers. 

Then things took a sideways turn and started to get weird. It all started when one of the editors claimed that he had seen the film at the Cannes premiere and stated that the plot description on Wikipedia was false. However, the other editors started to argue about whether Cannes viewers can be considered verifiable sources of information. As one of the editors put it: "You have once again have not explained how I, as an editor, am able to verify the plot summary".

Despite that, multiple users requested a fix for the section, and things kept rolling back to the made-up bit again and again. It was a loop. And it went on until the film's wide release in the UK and North America in late July, which put a definitive end to the debate and reinstated the real conclusion of the story in the plot description.

To be honest, this particular edit war is a great narrative in and of itself. It shows the struggles of sourcing and depicting information on the web, and it points out how shaky things are when the sources can't be verified and public involvement gets out of control.

But it overshadows the thing that started it all: the falsified conclusion of the story in the plot description section.

***
The made-up ending is an interesting thing from a conceptual standpoint.

In essence, this kind of retelling takes artistic license with a plot point that itself took artistic license with an actual story. And as such, it is a showcase of derivative creativity: another spin on an idea, a variation on a variation. It functions in the same context but drives the story in a different direction using the same moving parts of the plot.

If anything, this kind of spinning and slightly making things up is how people have done things since the beginning of time. We do it all the time as kids to impress others, just to look cool. The made-up ending is something like that.

Deriving from existing and experienced things is how cognition amps up the imagination. After a while things get complicated: the lines of sources and the bleedthrough of various elements mix more and more, to the point where the derivative piece becomes a thing of its own with its own distinct themes and identity.

The made-up Wikipedia ending of "Once Upon a Time in Hollywood" is a testament to the dedication of Tarantino's fanbase. Whether it was mean-spirited or not is beside the point. The fact is, it was misleading and it caused something of a fuss.

But it shows a level of investment in a work of art similar to pro wrestling "fantasy booking". And we all should do it from time to time, just without making that much noise about it.


Monday, July 29, 2019

GLTR aka Giant Language model Test Room

Natural language processing applications are having their long-awaited cultural moment. Unfortunately, the spike in interest was caused not by some aesthetic or conceptual breakthrough but by rampant malicious use of the technology.

The phenomenon known as "fake news" is the next step in the development of information warfare tools, and it is getting more and more sophisticated as time goes by. In fact, it has become so sophisticated that it is hard to distinguish a text written by an algorithm from a text written by a living, breathing human being.

That's the reason Giant Language model Test Room aka GLTR exists.

Monday, July 15, 2019

What is Shelton Benjamin thinking?: SmackDown LIVE, July 2, 2019



Here's a little segment from WWE SmackDown Live from July 2, 2019. It features Shelton Benjamin, one of the company's performers, being asked about the chances of the championship changing hands at the upcoming pay-per-view event "Extreme Rules".

Shelton doesn't give a traditional answer. Instead of providing a verbal reply, he does a little staring routine, rolling his eyes in different directions, intensely thinking while the camera zooms in on his face. He figures out something, makes a vicious smile, glances into the camera for a bit and walks off without saying a word. The whole thing takes about 30 seconds.

It is a short and sweet vignette designed to highlight some sort of plot development and character involvement. Except it is not. It is a non sequitur.

The thing is, WWE has a habit of doing throwaway things with no intention of ever really committing to something. Shelton Benjamin is a good example of this phenomenon. He's employed, but not really featured in a meaningful way. The last time he was spotlighted was prior to WrestleMania, when he fought championship contender Seth Rollins in a losing effort. It came out of nowhere and it went nowhere. This vignette is the same. It turned up for the sole reason of padding an episode, added nothing to the narrative and was promptly forgotten afterwards.

But it is so different from WWE's usual throwaway stuff that it sticks in the memory.

As it is, it is a rather curious performance piece. There may be a statement about non-verbal communication hidden deep inside. After all, a skilled actor can express a lot with his eyes and their movement. But since the vignette is devoid of context and lacks any substance, it depicts a literal dead-end void. It doesn't make sense and yet it persists. The thing is so pointless, it fascinates.

Saturday, July 6, 2019

Allen Institute's Grover and the funeral pyre of text generation

If you have been following technology news for some time, especially news related to artificial intelligence and machine learning, you can spot a strange tendency that underlies every new project or breakthrough in the field.

It is not about doing something that goes beyond human comprehension, and it is not about making something inspiring or astounding or simply distinct. No, it is mostly about doing the opposite: the generic, blend-in, nothing special, middle of the road, template-based kind of stuff that can pass off as human-made because it is so unassuming, or just like something an average human would do. It is lazy. And we haven't even started talking about the whole DeepFakeNews shebang.

This peculiar tendency leaves a strange aftertaste, one that can be described with the word "meh". And it seems like that is really the whole point of it all, because otherwise it must be a prolonged episode of mass delusion.

Case in point - Grover by Allen Institute.


The Allen Institute is one of the big guns of artificial intelligence and natural language processing. Their AllenNLP is one of the most efficient tools for developing NLP models of all kinds. It is a Swiss Army knife of NLP that brings the technology to every corner of business or scientific operation.

So, a couple of days ago the Allen Institute presented Grover. It is a natural language generation tool able to create texts that are very much like the ones written by humans. Or, to be exact, human copywriters. Just like regular SEO-enlightened copywriters, Grover creates articles that seem legit. Like, you can actually read them and think you've just wasted five minutes of your life kind of legit. And that's about it. Yay!

Grover is a showcase of how far NLP technology has gone over the last couple of years.

With all the text analysis tools, pattern recognition and recurrent neural networks, you can dig deep into the thick of a text, understand its structure and how its elements connect with each other, and then do your own thing based on the template, and it will be mostly comprehensible. To the point that the output is completely indistinguishable from human-made text.

Grover shows how easy it is to make this kind of article. It is fascinating how good it is at doing these bland, generic, middle of the road, not really saying anything texts.

I tried the topic "Why Chael Sonnen is so good at talking and so bad at fighting?" and got exactly what I would write if I didn't really care about the topic and just phoned it in with a thermonuclear impact. It's impressive and concerning.

The availability of tools like Grover poses a serious threat to the mass media and its consumers due to the systematic spread of fake news on a huge scale from all possible directions (social media, comment sections, faux news resources, reposts and rewrites, content aggregation). Just think about manipulating public opinion on climate change or abortion with an avalanche of misleading content. But that's political stuff.

There is also another problem - information noise AKA useless stuff.

The name of the perpetrator is SEO. It's no big secret that search engine optimization downright sodomized the very concept of writing, to the point where it was boiled down to wonky guidelines that turned writing into tossing Lego blocks together to fit the requirements. SEO copywriting made writing a boring and generic chore. It is like socialist realist literature with a new coat of paint. That too wholly depended on interchangeable templates and was mind-numbingly unreadable garbage that just occupied space.
But it works. Search engines dig it. It is cheaper than actually paying to promote your content.
This spawned an enormous amount of trash content with little to no value, and also birthed the clickbait mindset driven by trends and not actual necessity. Do we really need that many "All you need to know about Cyberpunk 2077" articles?

Add some AI to the mix and you get search engines flooded with actual spam content that perfectly fits the criteria, is not all that blatantly spammy, and wholly overtakes the narrative, leaving no place for anything else. Yay! Kinda like what we have today, except instead of going ten pages down the search results to find anything worthwhile, you will go twenty-five or fifty.

But I'm used to digging through garbage, so it doesn't really scare me that much. What makes me sad is the waste of talent on this kind of stuff. It seems actual creativity and NLP don't like each other these days.

One of the fun things about natural language generation of the past was its problematic relationship with the concept of "sense".
The majority of old-time text generators just could not pull it off clean. There was always something off about their output. Sometimes it was slight, stupid and kinda cute; other times its propensity towards hallowed nonsense was legitimately impressive. There was always something truly unexpected.

Those Markov chain text generators you can find on the web are capable of running nuclear weapon tests in your head if used "properly". Modern text generators like Grover and their sophisticated algorithms don't do that. They are too smart to have this kind of fun. It is not part of their design: they want to do business, boring things that blend seamlessly into the background. We can do better than that.
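For the curious, here is roughly what one of those old-school generators boils down to. This is a minimal sketch in Python, not any particular tool from the web: the function names `build_chain` and `generate`, the `order` parameter and the sample corpus are all made up for illustration. It records which words follow which, then walks that table at random.

```python
import random
from collections import defaultdict


def build_chain(text, order=1):
    """Map every run of `order` words to the list of words seen right after it."""
    words = text.split()
    chain = defaultdict(list)
    for i in range(len(words) - order):
        state = tuple(words[i:i + order])
        chain[state].append(words[i + order])
    return dict(chain)


def generate(chain, length=20, seed=None):
    """Start from a random state and keep picking a random follower."""
    rng = random.Random(seed)
    state = rng.choice(sorted(chain))  # sorted, so a fixed seed gives a fixed start
    output = list(state)
    for _ in range(length - len(state)):
        followers = chain.get(state)
        if not followers:
            break  # dead end: this state was never followed by anything
        output.append(rng.choice(followers))
        state = tuple(output[-len(state):])
    return " ".join(output)


# Hypothetical toy corpus, only here to make the sketch runnable.
corpus = (
    "the film is a fairy tale and the fairy tale is a film "
    "and the film is a loop and the loop is a thing of its own"
)
print(generate(build_chain(corpus), length=15))
```

Feed it a bigger and stranger corpus and the nonsense gets properly interesting. Nothing in it knows or cares about "sense", which is the whole charm.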

Six new works in Die Leere Mitte

Got some great news! Six of my poems were featured in the newest issue of Die Leere Mitte . But this time it is some big guns. These guys k...