Thursday, August 16, 2007

Commercials Annoy Me

I don't want to reveal too much of my personal life, of course, but I have to admit that from watching two-hour blocks of Daily Show/Colbert Report/Scrubs re-runs each weekday, I've seen this commercial for Astrive student loans somewhere on the order of twenty times. (Somewhat less than the Ditech commercial that admonishes me with "people are smart", but somewhat more than the Best Buy commercial where the dad hides his daughter's backpack to prevent her from going to college.)

One non-linguistic thing that bothers me about the commercial first: one of its claims is that an Astrive loan is better than borrowing from a "high-interest credit card". Nothing like informing us that your offer is better than the worst possible solution. Might as well say "better than paying for college by running small jobs for the Mob". Or "eating our hamburgers is more nutritious than subsisting on Crisco."

Returning to the linguistic point I wanted to make originally, the friendly narrator who keeps on talking down to me says at one point that college costs "major dollars... GRANDE dollars." This seems weird in a few ways:

1. It's highly nonstandard to use major to modify a plural noun.
2. It's highly nonstandard to use grande to modify a plural noun.
3. There is a standardized Spanish borrowing into English with the same meaning as grande dollars: mucho dinero.

So it's sort of a neologism,

Tuesday, August 14, 2007

A Cyclical Progression

As I was walking around yesterday, randomly taking pictures of things in the background with other things out-of-focus in the foreground, I started thinking about whether I am approaching linguistics correctly.

Early linguists did descriptive linguistics, and the whole field up to the Chomskyan revolution was, by and large, a bunch of people pointing out different neat language anomalies to each other and saying "Well, isn't that neat?", without any major theoretical framework emerging. Kind of, in my opinion, a waste.

Then along came Chomsky to introduce some rigor to the field, and it worked. Suddenly people were combining grammar, logic, math, computer science, set theory, (a teensy bit of) psychology and cognitive science, and a bunch of other jazz together and actually getting a pretty nice little theoretical framework out. A lot of the success of this revolution came from abstracting away from language and reducing all of the beautiful neat idiosyncrasies of language to categories, rules, and various cleanly-defined abstract concepts. It worked.

But not perfectly. The problem is that language isn't quite the same as logic. Our linguistic theories work really well on these abstractions, but the abstractions don't translate back into real, observed language so well. Take, for instance, the abstract category verb. There are tons of things that are sort of verbs, like passive participles, gerunds, nominalizations, etc., that vary in how verb-like they are from language to language. Likewise, as my current attempt to label corpus subjects as singular/plural/mass nouns is showing me, there're some grey areas even in abstractions that aren't all that abstract (it's usually pretty clear whether there is one or more of something, but for abstract and mass nouns, it can be unclear whether something is countable). This is the sort of thing that has been shunted off for years with the old refrain "We'll let pragmatics take care of that."

But pragmatics has not taken care of these problems, which is why a lot of linguists are switching over to what is, in some ways, a less abstract approach to linguistics. I am in this camp, but the question that bugged me as I was walking yesterday was whether this is justified. Basically, we're turning back toward descriptive linguistics. We're not going all the way back there, but at the same time (and perhaps with a twinge of guilt in my math-major heart), I worry that we shouldn't go back toward descriptivism at all.

I think the loss of abstraction is justified, for two reasons: 1) there has been little progress in connecting real language usage, the sort that humans use so effortlessly, to abstractions that are becoming increasingly tenuous and complex, and 2) we now have the computational tools to make something of consequence out of a more descriptivist, less abstract approach. We can say with confidence that animate subjects prefer certain constructions, or that longer subjects favor others. I think that even if we ended up back at truly descriptive linguistics, we'd still be way ahead of the game by being able to state statistically significant tendencies and such. At worst, we'd pave a better road for a new Chomskyan revolution.

I feel much better now.

Friday, August 3, 2007

Speech v. Writing

Kate's post last week got me thinking about a lot of stuff, and coupled with part of a book I'm reading on self-organizing systems, I think there're some other relevant divisions in the goals of linguistics that need to be addressed. One that's gnawing at me is the distinction between spoken and written language. I don't think that there's a qualitative distinction in the underlying theory of how people construct sentences in the two modalities. But something's going on.

For instance, it's commonly agreed upon that spoken English is not always grammatical. People seeing transcripts often report that they surely did not say what was transcribed. And as any corpus linguist will tell you, spoken corpora are full of ungrammatical sentences. But what's interesting is that the spoken stuff seems to be locally coherent.

So here's my thought. Written stuff, thanks to the ability to see clearly what preceded the current point in the sentence, is based on global information. Spoken stuff, on the other hand, is based on what you can recall in a complicated setting where you're trying to formulate a novel thought in a stimulating environment with a reactive audience. In such situations, you should expect to have imperfect recall even of what specific words were at the start of your sentence. Rather, you could just remember the gist of what was said before and the last few spoken words, and assume that this is what your listener is doing as well. In that case, you can build the rest of your sentence based on local coherence with the recent words and the general sentence gist.

If that's how speaking and writing work, then it looks like we need different models for the grammars of the two modalities - one with rules/constraints that depend on pure global information, and the other with rules/constraints that depend almost solely on local information. This doesn't imply separate grammar types for written and spoken language, but rather a different set of constraints (or perhaps a different ranking of the same constraints, if you're particularly enamoured of OT). Alternatively, it may be that written English is subject to grammaticality judgments, and spoken English is subject to acceptability judgments, and that we're really honestly using different measures.
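
To make the local/global contrast a bit more concrete, here's a toy Python sketch using subject-verb agreement. Everything in it is an assumption made for illustration: the tiny lexicon, the function names, the three-word window, and the shortcut of treating the first noun in the sentence as the subject head. It's not a claim about how either modality actually works, just one way the two kinds of constraint could come apart.

```python
# Toy lexicon: every entry is invented for illustration only.
NOUNS = {"key": "sg", "keys": "pl", "cabinet": "sg", "cabinets": "pl"}
VERBS = {"is": "sg", "are": "pl"}


def find_verb(words):
    """Return the index of the first known verb, or None if there is none."""
    for i, word in enumerate(words):
        if word in VERBS:
            return i
    return None


def global_agreement(words):
    """Global constraint: the verb must agree with the subject head.

    Simplifying assumption: the first noun in the sentence is the head.
    """
    v = find_verb(words)
    if v is None:
        return False
    nouns = [w for w in words[:v] if w in NOUNS]
    if not nouns:
        return False
    return NOUNS[nouns[0]] == VERBS[words[v]]


def local_agreement(words, window=3):
    """Local constraint: the verb must agree with the nearest noun
    among the `window` words immediately before it."""
    v = find_verb(words)
    if v is None:
        return False
    recent = [w for w in words[max(0, v - window):v] if w in NOUNS]
    if not recent:
        return False
    return NOUNS[recent[-1]] == VERBS[words[v]]


sentence = "the key to the cabinets are on the table".split()
print(global_agreement(sentence), local_agreement(sentence))  # False True
```

On "the key to the cabinets are on the table", the global check fails (the head noun "key" is singular) while the local check passes ("cabinets are" is fine within the window), which is the locally-coherent-but-ungrammatical pattern described above.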

I don't know if this is totally the right direction, but I think the time will come (if it's not already here) when we need to address the differences in grammaticality judgments in written and spoken language.

Thursday, August 2, 2007

if you're ever asked for the difference between "DRT" and "File Change Semantics"...

A quote from David Beaver I found in my class notes from the LSA:

"Hmm, representing discourse. Well, Gilles Fauconnier's theory of discourse representation, called "Mental Spaces", uses circles with lines connecting them, like this. [draws circles]

Hans Kamp's theory of discourse representation, called "Discourse Representation Theory" (DRT), uses rectangles inside of other rectangles. [draws rectangles]

Irene Heim's theory of discourse representation, called "File Change Semantics," uses skinnier rectangles than in DRT. [draws skinnier rectangles]

Those are basically the differences, except Mental Spaces doesn't have a model-theoretic interpretation, so forget that. Ok, back to the class material..."