The Tale of the Sanctimonious Scrivener: A Rant

There is an old joke that professors grade essays on their heft. The weightier the paper, the better the grade. Drawing from the idea that the longer the work is, the more time was put into it and the more deserving it is of a higher grade, the concept brings the flaws of human grading into focus.

Which brings us to a recent study evaluating the accuracy of computer programs created to score essays. These programs are by no means new- they have been in use for years, particularly in the world of standardized testing. With so many short essays being churned out by test takers the world over, it seemed a simpler solution to automate the grading process.

Of course, while automated grading of multiple choice tests is simple enough, cost effective, and accurate, can we really say the same for automated essay grading?

According to a study from the University of Akron and a consultancy called The Common Pool, the answer is a resounding yes. They took something like 16,000 essays (with sets that included different lengths, different rubrics, etc.) that had already been scored once by a human, then let a computer (well, several programs, actually) grade them again. The results were almost terrifyingly similar. Want proof? Here’s a chart of the scores on mean estimation… they are all so close that the lines all appear to be one goddamn line:

Of course, charting out other factors yields less impressive-looking graphs, but fuck truth when we have visual impact, right?

Regardless of potential data skew based on the most widely circulated chart from the paper, the study really did find a striking similarity between the human and computer graders. This is the first time a study like this has been done on this scale, and it does a lot to address the many flaws in computerized essay grading. Many programs favor essays with more complex lexical choices, as they are representative of an advanced vocabulary (never mind the fact that one can easily toss around a word without knowing the finer points of its meaning, e.g. thesaurus junkies). Programs also favor length, both in the entire paper and in the sentences themselves. And, of course, they prefer proper grammar.
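To make the grading-by-weight idea concrete, here is a deliberately crude sketch of a scorer that rewards exactly those surface features: word count, sentence length, and word length (a lazy stand-in for "advanced vocabulary"). Everything in it is an assumption for illustration, including the function names, the features, and the made-up weights. It is not how the programs in the study work, and it skips grammar checking entirely.

```python
import re

def surface_features(essay: str) -> dict:
    """Shallow features only: nothing here looks at what the essay says."""
    sentences = [s for s in re.split(r"[.!?]+", essay) if s.strip()]
    words = re.findall(r"[A-Za-z']+", essay)
    return {
        "word_count": len(words),
        "avg_sentence_length": len(words) / len(sentences) if sentences else 0.0,
        "avg_word_length": sum(len(w) for w in words) / len(words) if words else 0.0,
    }

def toy_score(essay: str) -> float:
    """Hypothetical weights, invented for this example."""
    f = surface_features(essay)
    return (0.01 * f["word_count"]
            + 0.1 * f["avg_sentence_length"]
            + 0.5 * f["avg_word_length"])

plain = "The cat sat. The dog ran."
fancy = "The perspicacious feline reposed magnificently upon the ottoman."
print(toy_score(fancy) > toy_score(plain))  # True: the thesaurus junkie wins
```

Neither sentence says anything worth grading, yet the longer words and longer sentence come out ahead.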

However, programs have been ridiculed for favoring these technical aspects at the expense of actual content. Can we honestly dole out high marks to students spouting eloquent garbage? The programs are those theoretical professors grading papers by weight, with no regard for the actual information within. A problem, to be sure.

As artificial intelligence technology advances, though, the programs have become more complicated. They are able to discern some relationships between words and phrases that help them “understand” the meaning of the essays. Last year, the University of Florida did some research on the usage of automatic grading systems using AI technology. The system in place was able to look at something like “the heart pumps blood” and find a relationship between the words “heart” and “blood,” essentially finding the meaning of the sentence by piecing together word relationships built through the rubric created by the teacher.
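A rough sketch of that idea, under heavy assumptions: I am pretending the teacher's rubric boils down to pairs of words that should show up together in a sentence. The function name and the rubric format are hypothetical, and the real Florida system is surely more involved than this.

```python
import re

def relationships_found(essay, rubric_pairs):
    """Return the rubric word pairs that co-occur within a single sentence."""
    found = set()
    for sentence in re.split(r"[.!?]+", essay.lower()):
        words = set(re.findall(r"[a-z']+", sentence))
        for a, b in rubric_pairs:
            if a in words and b in words:
                found.add((a, b))
    return found

# Hypothetical rubric for a biology question
rubric = [("heart", "blood"), ("lungs", "oxygen")]

print(relationships_found("The heart pumps blood. Lungs are nice.", rubric))
# {('heart', 'blood')}
```

A check like this only knows that the words appeared together. It has no idea whether the sentence containing them makes any sense.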

Interesting, to be sure, but it’s still a crude system that can, seemingly, be easily exploited by a moderately clever student. Like a child beating the square peg into the round hole until the corners break, the systems might be able to hammer out a rudimentary “understanding” of the essays, but just as that mangled square peg will never be a perfect fit for the round hole, so too will these programs never understand complex, intricate writing.

Why, then, would we let these systems do our grading for us? There are many purported advantages to removing the human component in grading. It does away with biases (personal, racial, gender-specific), which curbs grade inflation. It alleviates teacher fatigue (from which can stem errors).

There are pros and cons to both methods of grading, to be sure. And this study seems to add another entry in the pro column of computerized grading.

***

My issue with all this isn’t whether or not the Akron study is accurate. They obviously found a strong similarity between human and computer grading of these essays. To me, this is indicative of a far greater problem.

I am mere days away from completing my English degree, and there is a problem that has been gnawing away at me for the majority of my school-going years. A problem I assumed would vanish when I entered the collegiate world. But it didn’t. It continued on, this relentless march toward mediocrity.

It is a problem with the formulaic nature of writing education.

If a computer can grade an essay with nearly the same degree of accuracy as a human, this says less about our marvelous technology (sorry, but I follow AI research and know even the most cutting-edge experimental programs are nowhere near as impressive as any human mind) and more about the shabby state of our student writing. We teach our students the fucking five-paragraph essay, the rote rehashing of theses to form concluding statements. Pick a topic, back it up with two or three points, wrap it up. There is no room for creativity, for real cleverness, for anything that makes writing art and not just a series of rules to be regurgitated from the tip of a pen or onto a computer screen. As Alexander Pope wrote,

True ease in writing comes from art, not chance, as those move easiest who have learned to dance.

Our students are less concerned with writing interesting, engaging pieces exploring novel ways of thinking or delicately bending the rules- they instead hammer out blocky, mechanical essays. They present bland topics with just the right number of supporting facts to net them a decent grade. That’s it.

I have had many professors, and I have never had one that really inspired me to be a more creative, interesting writer. There was one who broke the mold slightly, but even she wasn’t really a powerful force in my academic career. I know that many others have those professors that shaped them, that really touched them, that showed them something about themselves or their course of study or the world that makes the student grateful and better for having known them. I understand that, I respect that, but I never had that. My thirst for knowledge, information, and creativity has always best been sated on my own, outside a traditional classroom.

And while I’m sure there are many English professors [And since when are English professors the only ones expected to foster strong writing in their students? You might have a great idea, oh mighty chemist, but if you can't write a goddamn elucidatory (...fuck you, WordPress, that's a word) paper to share that work with the rest of the scientific community and the world, then you are shit out of luck, now aren't you?] out there who really work to engage their students, given my own experiences and the fact that most students, if they had an “inspirational professor”, only had one or two… statistically, most professors just teach their students that mechanical, boring writing.

I suppose it is time for me to clarify a few points here, particularly for those of you who know me and are pointing at the screen in horror, screaming about my hypocrisy. I am aware that I am known for being an exceedingly technical proofreader. Am I not just perpetuating this system I purport to despise? Well… yes, I am. Because there is technically nothing wrong with writing this way. And, in fact, I am a firm believer in understanding and utilizing technically sound writing, particularly in formal settings. And those five-paragraph essays I was harping on about? Well, they are actually a very useful tool to teach young writers about structure. I do not think they are the devil so much as a despicable crutch: we are not only allowing older, more advanced writers to lean on them, we are actively encouraging this kind of lazy writing. While there is less room for creative flair in formal, academic papers, there should be breathing room for a personal voice to show through the formal technical aspects. It’s a delicate balance, tying the writer’s soul into the formal rules… but it’s certainly possible. But we are not teaching (or even encouraging) this kind of skillful writing. Which, I believe, is a travesty.

More on that in a second.

Just last night, I was teasing a boy for marking a diaeresis, as it’s considered rather archaic in modern English. That being said, I was only poking fun because I am a right and proper bitch (and because the two of us seem to communicate primarily in taunts, mockery, and faux arguments). In all actuality, I found the use of the diacritic strangely charming. I have always enjoyed people who strive to plumb the true depths of the English language. Perhaps that’s an English major thing.

But these finer points of language… they are not taught anymore. Or, at least, not to any real degree. Why did diaeresis diacritics fall out of vogue, anyway? Because the variants, sans markings, became more popular. And our schools teach what is popular. Which is fine, which is useful, but which becomes more and more diluted. Our vocabulary shrinks, the finer points of our language get lost, and then where are we? The loss of the flavorful bits of language, those accent marks and mellifluous phrases and cheeky verbiage, cripples us. We lose more than just words, we lose imagination and creativity. And as those slowly degrade, so too do advances tied to them. Invention, discovery. This destroys us slowly, across all aspects of human knowledge and progression.

And we just allow it. That is what I have such a problem with.

Formula is a base, just as we have basic vocabulary. But as we continue through our education, we need to be advancing. We build on the base. We learn the rules, then we learn how to break them. Instead, we stop at a simple formula. After we’ve mastered this, we are done. The end of the line for our writing education. Oh, there’s a bit picked up here and there. But there’s no longer any real push to expand your skills.

Not even for English students, sadly.

Our writing can be graded by a computer program. That’s how basic it is, how fucking systematic it is.

Congratulations to us.

***

I don’t have a quick fix solution to this perceived problem. Perhaps you don’t even agree with me that this is a problem. So be it. These were just my bitter, scattered thoughts as I read about the Akron study.

Take this with a grain of salt, like you should all my posts, dear galleons.

Digital Assimilation: Human Hive Minds, Reverse Cyborgs, and the Power of Crowd Wisdom in the Information Age

Awaken my child, and embrace the glory that is your birthright. Know that I am the Overmind; the eternal will of the Swarm, and that you have been created to serve me. ~The Overmind, StarCraft

In 1912, Carl Jung published Symbols of Transformation, a work in which he began detailed development on his idea of the collective unconscious, one of his many enduring additions to the field of psychology (and, in my opinion, one of the more ridiculous, as it tends to feel like nothing more than refined mysticism). The collective unconscious as described by Jung is actually a knotty little thing, as he was often rather ambiguous in his various descriptions of it, allowing for a wide range of interpretations and suggestions as to its true nature.

In his The Archetypes and the Collective Unconscious, Jung lays out his idea of the collective unconscious in the first few pages:

A more or less superficial layer of the unconscious is undoubtedly personal. I call it the “personal unconscious”. But this personal layer rests upon a deeper layer, which does not derive from personal experience and is not a personal acquisition but is inborn. This deeper layer I call the “collective unconscious”. I have chosen the term “collective” because this part of the unconscious is not individual but universal; in contrast to the personal psyche, it has contents and modes of behaviour that are more or less the same everywhere and in all individuals.

His collective unconscious was based less on an all-encompassing, eternal world consciousness than on a series of psychic structures underlying all human experience- the archetypes (the Self, Anima/Animus, Shadow, etc.). From these spring archetypal images (like the hero, common across all cultures and times) and events (such as marriage and initiations). For Jung, the collective unconscious and archetypes served as a kind of DNA of the psyche. Much as genetics determine our physical traits through a mere handful of nucleobases and amino acids, Jung believed that the collective unconscious shaped the individual psyche through a small number of archetypes.

***

That’s as far as we’re going to go with that because I find most aspects of the collective unconscious to be nonsensical (I have a love-hate relationship with psychology in general).

I bring it up as a foil for what I’m going to discuss next. For if we have a concept for the collective unconscious, surely we have to have one for a collective conscious as well.

Actually, aspects of the collective conscious appear in psychology as well, particularly in the idea of groupthink. A phenomenon arising within groups of people, it’s a problem solving method wherein group members attempt to reach a consensus without conflict or critical evaluation of alternative ideas/viewpoints. William H. Whyte called groupthink, “a rationalized conformity— an open, articulate philosophy which holds that group values are not only expedient but right and good as well.”

But groupthink doesn’t arise in all groups (…rather obviously). It’s most likely to occur when the group is composed of members of a similar background, when the group is insulated from outside opinions, and when there are no clear rules for decision making.

I think my favorite example of groupthink is not one of the more obvious political ones, but rather the movie 12 Angry Men. I have a personal connection to this movie- story for another day, that. Eleven of the 12 jurors in the case succumb to blind agreement that the defendant is guilty. Their inability to rationally look at the situation and consider alternative viewpoints makes them a strong example of groupthink (and a rather horrific look at the potential for blind judgments in the legal system).

Thankfully, smooth-talking Henry Fonda is there to turn the tables.

***

But the first thing that really pops into your mind when you think about the idea of a collective conscious isn’t some psychological phenomenon you read about in your Intro to Psych course that you took because your upperclassmen friends told you it was a blow-off class, it’s something that belongs in the realm of science fiction:

The hive mind.

The Zerg in StarCraft, the Geth and Rachni in Mass Effect, the Overlords in Childhood’s End, the Toclafane and Vashta Nerada (meep!) in Doctor Who, the Dark Ones in Metro 2033 (I haven’t actually played this game, so I’m taking the internet’s word on this- I’m including it because I just wanted to say that I was actually reading about this game the other day and really want to play it)… the list goes on and on.

Unnecessary, yet awesome, Magic: The Gathering moment. Bask in it, dear galleons.

All of these species exhibit some form of hive or group mind. We are used to portrayals of hive minds wherein the individual members refer to themselves as “we” or “us,” denoting their lack of individuality. They are a collective- one mind in many bodies (or one memory shared between bodies or some variation thereupon), exhibiting a telepathic connection between individual units. Often controlled by a queen-type figure, the hive mind is a devastating creation. Because there are no individuals, there is no dissent. No alternative modes of thinking. No sudden spats of morality. No crippling love or guilt or other emotions.

It’s the Utopia Big Brother and Joseph Stalin both craved.

The thought of being part of a hive mind causes a cold shiver to run down my spine. I am a confusing, bizarre, nerdy, emotionally-retarded, introverted, sexually frustrated, abrasive, half-assed intellectual with a predilection for immature jokes, frequent cussing, rampant giggling, and making absurd associations. But whatever strange compound of personality flaws I am, the fact remains that it is me. An individual. And I wouldn’t trade that sense of self, that unique sensation of I, for anything.

I assume, galleons, that the same can be said of most of you.

So it isn’t surprising that my instinctive reaction when I first read about a “human hive mind” was one of horror. But if there’s one thing that has remained steady throughout my life, it’s my insatiable, morbid curiosity. Thus, I kept reading.

In the end, the article wasn’t really about a hive mind in the sense of the images we have from our science fiction favorites. Rather, it was about the power of crowdsourcing (a portmanteau of “crowd” and “outsourcing” that is basically summed up by its parts- outsourcing to a crowd of people) in increasing the power of AIs.

Which made me breathe a quiet sigh of relief, naturally.

The information was interesting, however, and I think the concepts of crowdsourcing and crowd wisdom are worth discussing, so that’s what we’re gonna do.

***

What exactly is the wisdom of the crowd?

Crowd wisdom is the process of taking in the collective opinion of a group of individuals rather than a single expert’s. Which sounds suspiciously similar to group minds and groupthink, doesn’t it?

The concepts are related in that we are looking at the power of the whole over the power of the one. The phrase “right reduces to might” has been popping into my mind at the oddest moments in the last two weeks, and I find this to be one of the situations where it actually fits. The might of the crowd’s opinions becomes what is considered truth.

If ever there was an argument for subjective truth in modern culture (I still feel historiography, the study of the shifting narrative of history, is the best one overall- maybe we’ll talk more about that in the coming weeks, because that’s an old favorite of mine that I don’t think I’ve really expanded upon here), the wisdom of the crowd would be it.

The internet has already started capitalizing on the wisdom of the crowd, as many of you have probably noticed. Crowd wisdom powers search engines like Google, which aggregates searches from across the globe. Have you ever wondered how Google’s search results are organized? Maybe you already realized that they are organized, in part, based on popularity- the more times users click a certain link in reference to a specific search term, the higher up the ratings that site climbs for that search term. Sort of. There’s a much more complicated algorithm at work, an algorithm they are constantly tweaking to prevent spammers from manipulating the system to land in the top results.
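Strictly the "sort of" version of that idea, sketched in Python: count clicks per search term and sort by the count. This is emphatically not Google's algorithm (nobody outside Google knows what that is). The class name, method names, and example URLs are all made up for illustration.

```python
from collections import Counter

class ClickRanker:
    """Toy popularity ranking: more clicks for a query means a higher spot."""

    def __init__(self):
        self.clicks = {}  # normalized query -> Counter of clicked URLs

    def record_click(self, query, url):
        key = query.strip().lower()
        self.clicks.setdefault(key, Counter())[url] += 1

    def rank(self, query):
        counts = self.clicks.get(query.strip().lower(), Counter())
        return [url for url, _ in counts.most_common()]

ranker = ClickRanker()
ranker.record_click("pigeons", "a.example")
ranker.record_click("pigeons", "b.example")
ranker.record_click("Pigeons", "b.example")
print(ranker.rank("pigeons"))  # ['b.example', 'a.example']
```

It also shows why the constant tweaking is needed: anyone who can fake clicks can climb this ranking.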

Then again, maybe they just use pigeons. Who knows.

What we do know is that the internet is changing. And it’s not a change we all immediately recognize, as most of us have been here through its gradual evolution. It’s only when you take a step back and really look at it that you start to see the incredible shift we’ve made from the simple organization-and-consumption-of-information model the internet has been running on. Now, we are looking at the age of user-generated content (created and shared by users) and social media, a strange new beast with a new set of rules.

Just what is so important about the overwhelming flood of social networking happening on places like the Facebook and Twitter? The strong socialization of the internet is turning traditional search and information gathering on its head. In the past, web socialization has been focused primarily in places like chat rooms (yes, those archaic institutions) and discussion boards. What we have now, however, is the ability for each user to carve out their own little microsite, an internet area and identity that is unique and centralized.

Within our individual internet realms, we have other denizens, our “friends,” those individuals in our social network that we know or respect. Just as we flock to real-life friends with similar interests, so too do we flock to internet-folk with similar interests. I don’t follow hockey players on Twitter- I follow geeks, scientists, sexual deviants, and people with wicked senses of humor. We create our online networks the same way we do our IRL ones. And web developers are looking at harnessing that information to further refine and personalize the internet experience.

Have you ever heard of Delver? Originally launched back in 2008, it began as a search engine that used your social network to generate search results. When you first got to the site, you’d type in your own name. Delver would then dig information out of your social networking sites, building its own network of associated ideas, institutions, and individuals around your personal internet community. Results were then generated with ratings based on sites related to, produced by, or tagged by members of a person’s social network. As Liad Agmon, a former CEO of Delver (I have no idea if he’s still CEO, and I really don’t care enough to look it up), once said, “you are searching the Web through the prism of your social graph.”
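Here is one guess at what "searching through the prism of your social graph" could look like in code: rank each result by how many people in your network tagged it. The function, the data layout, and the example names are assumptions of mine. I have no idea how Delver actually scored anything.

```python
def social_rank(candidates, tagged_by, my_network):
    """Order candidate URLs by how many members of my_network tagged them.

    candidates: list of URLs returned for a search
    tagged_by:  dict mapping URL -> set of people who tagged or produced it
    my_network: set of people in the searcher's social network
    """
    def score(url):
        return len(tagged_by.get(url, set()) & my_network)

    # sorted() is stable, so ties keep their original order
    return sorted(candidates, key=score, reverse=True)

candidates = ["hockey.example", "geek.example", "science.example"]
tagged_by = {
    "hockey.example": {"zed"},
    "geek.example": {"ann", "bob"},
    "science.example": {"ann"},
}
print(social_rank(candidates, tagged_by, my_network={"ann", "bob"}))
# ['geek.example', 'science.example', 'hockey.example']
```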

Delver no longer operates in this capacity- it has now switched to a social commerce site that works in a similar fashion, targeted at finding products for consumers based on their social networks.

And you thought those targeted Facebook ads were creepy. Here’s an entire site dedicated to ripping through your public profiles and spoon-feeding you things you should buy.

But don’t think Delver is unique. Remember dear old Google? While their algorithms use the power of the many to deliver strong search results, they couple this with individual search tweaking based on your personal searches. Imagine if Google harnessed the power of your social networks in the same way Delver tried to. What we’re looking at is targeted wisdom of the crowd, taking the opinions of your circles (yep, I used that word on purpose- anyone who’s been following Google+ might chuckle a bit there, mostly because the latest foray of Google into the world of social networking might just accomplish this search and social network merger we’re talking about here) and generating content that will be more relevant to you and your interests.

After all, your friends should know you better than an algorithm… right? As Udi Manber, Google’s vice president of engineering in charge of search quality, said, “The art of ranking is one of taking lots of signals and putting them together. Signals from your friends are better, stronger signals.”

***

This is a form of crowdsourcing, galleons. By essentially outsourcing the task of finding content relevant to you to your friends, search engines could get back the most relevant and fresh results.

And now we can use the power of our group intelligence on the internet to help refine and aid AIs.

Here’s a very basic example. I’m sure most of you have, at one point or another, used an online translation site to attempt to decipher something in a foreign language. And how often has it spit back almost incoherent strings of words and symbols? Better yet, have you ever translated the same sentence back and forth a few times between English and a second language? The result is usually something with little or no relation to the original sentence.

Obviously, online translators are flawed. But how do we fix them? The problem with language and AIs lies in the fact that our communication is flooded with metaphors, puns, and clever wordplay. This is difficult to translate to algorithms for an AI to recognize (though not, necessarily, impossible- remember the TWSS program?). Which makes it hard to get online translators to generate high-quality translations.

And that’s where an AI could tap into the power of people to help it:

Take the counter-intuitive idea of doing translation without bilingual workers. The idea, known as MonoTrans, is the work of Philip Resnik and Ben Bederson at the University of Maryland in College Park. Imagine a Russian and a Spanish speaker, neither of whom speaks the other’s language. MonoTrans software translates the sentence back and forth between the two languages, inevitably imperfectly. But after each translation, the Russian or Spanish speaker edits the text to make it clearer, and it is translated back again. Three round trips are usually enough for the translations to reach high quality, say Resnik and Bederson. A pair of workers should eventually be able to translate 1000 words a day, they add.

Having a crowd doing this, back and forth, would inevitably yield very strong translations. Distilling truth from the masses, the AI would become stronger and better at its job. Like reverse cyborgs, we now have machines tapping into the power of humans to augment their systems.
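The back-and-forth is easy to sketch as a loop. This is only my reading of the description quoted above, not the MonoTrans code: the function name and parameters are hypothetical, `translate` stands in for whatever machine translation is used, and `edit_a` and `edit_b` stand in for the two monolingual humans cleaning up the text in their own language.

```python
def monotrans_round_trips(sentence, translate, edit_a, edit_b,
                          lang_a, lang_b, round_trips=3):
    """Bounce a sentence between two languages, with a human edit after each hop.

    translate(text, src, dst) -> machine translation (imperfect by assumption)
    edit_a(text), edit_b(text) -> a monolingual speaker's cleaned-up version
    Returns the final text in each language as (text_a, text_b).
    """
    text_a, text_b = sentence, None
    for _ in range(round_trips):
        text_b = edit_b(translate(text_a, lang_a, lang_b))  # A -> B, B speaker edits
        text_a = edit_a(translate(text_b, lang_b, lang_a))  # B -> A, A speaker edits
    return text_a, text_b
```

The default of three round trips comes straight from the quote. Everything else about the loop is guesswork on my part.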

Amazon long ago realized the potential of using groups of humans to supplement their existing programs. They launched Mechanical Turk in 2005, a site that gives anyone access to an enormous group of online workers. Anyone can work for Mechanical Turk- and thousands do. Meaning that the speed of response can be astounding. For example, the average response time for an image query (applications created to identify images usually use some form of program to determine what the image might be- if the program fails, the image can be sent to Turkers for a response, which serves both to please the customer and teach the program AI) is somewhere around 25 seconds.
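That image example is basically a fallback pattern: let the program guess, and if it isn't confident, hand the picture to a human and keep the answer as a training example. A minimal sketch, assuming a classifier that returns a label with a confidence score. The names, the 0.8 threshold, and the callables are all invented here, and none of this touches the real Mechanical Turk API.

```python
def identify_image(image, classifier, ask_humans, training_set, threshold=0.8):
    """Answer with the program's label if it is confident, otherwise ask people.

    classifier(image) -> (label, confidence between 0 and 1)
    ask_humans(image) -> label supplied by a human worker
    training_set      -> list that collects (image, human_label) pairs
    """
    label, confidence = classifier(image)
    if confidence >= threshold:
        return label
    human_label = ask_humans(image)
    training_set.append((image, human_label))  # saved so the program can learn from it
    return human_label
```

The customer gets an answer either way, and every human answer becomes something the program can be trained on later.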

Much of this is thanks to the proliferation of smartphones. With the ability to connect from anywhere, at any time, the number of humans available to help AI is staggeringly high at all times. And growing.

***

Remember The Matrix? Of course you do (because, frankly, if you are too young to get that reference, you need to get the hell off this blog). In the movie, super-intelligent machines had people trapped in pods (and, consequently, mentally existing in the digital world known as The Matrix), harvesting their bio-electrical energy and body heat to power themselves.

This is kind of like that, only less creepy.

Having a constant, expansive human “workforce” available does allow us to teach and train AIs to a startling degree of precision and, dare I say, humanity.

Here’s an entertaining AI training situation that might amuse you if you are ever bored late one night, dear galleons. Created by Rollo Carpenter, Cleverbot is an AI program learning to mimic human conversation. What makes it unique among the other chatbots littering the web is that Cleverbot uses algorithms to select previously entered phrases from its database of prior conversations when responding to you. Which can be either disturbingly accurate or hilariously off-topic.

However, each conversation Cleverbot has expands its database, giving it more and more to draw from. And the more it learns, the more human-like its conversations should get.
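For the curious, here is about the dumbest possible version of a bot that answers by digging through old conversations: remember prompt-and-reply pairs, then respond with the reply whose prompt shares the most words with what you just typed. I am assuming word overlap as the matching rule purely for illustration. The class is hypothetical and is not a description of Rollo Carpenter's actual algorithms.

```python
import re

class ParrotBot:
    """Toy retrieval chatbot: it can only repeat replies it has seen before."""

    def __init__(self):
        self.memory = []  # list of (set of prompt words, reply)

    @staticmethod
    def _words(text):
        return set(re.findall(r"[a-z']+", text.lower()))

    def learn(self, prompt, reply):
        """Store one exchange from a past conversation."""
        self.memory.append((self._words(prompt), reply))

    def respond(self, message):
        if not self.memory:
            return "..."
        words = self._words(message)
        # Pick the stored exchange whose prompt overlaps most with the message
        _, reply = max(self.memory, key=lambda pair: len(words & pair[0]))
        return reply

bot = ParrotBot()
bot.learn("Do you like music", "I love Slayer.")
bot.learn("What is your name", "Cleverbot.")
print(bot.respond("What music do you like?"))  # I love Slayer.
```

Every call to `learn` gives the bot more to draw from, which is the whole trick. It is also why an unrelated question can get an answer lifted from somebody else's conversation.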

I don’t know. The two times I tried it, it kept trying to get me to talk about love, called me a vampire, and answered one of my pretentiously philosophical questions with “Tom Araya” (a member of the band Slayer)… Amusing, but hardly a believable human conversation partner.

Unless that partner was on drugs. Maybe that’s all Cleverbot can hope for- passing as a stoner.

Still, it’s entertaining for a short while. And hopefully, in the future, the power of the internet’s group intelligence will manage to train Cleverbot to the point where you will forget you are interacting with a computer (right now, there’s no way this sucker could pass the Turing test, in my opinion).

Though, frankly, if the group intelligence of the internet is the one teaching it, all it will probably do is insult you in misspelled, grammatically incorrect, bigoted nonsense. Just like any set of comments anywhere on the internet.

Maybe we shouldn’t be so quick to use crowd wisdom to teach our AIs. Because the internet collective is fucking idiotic.