CTL-C



A Possible Future for Education

meller : March 29, 2012 10:58 am : CTL-C, Uncategorized

The other day, as I was reading about the Stanford professor who ran a huge online version (100,000 students) of his AI class, I had a glimpse of a future for education that I have not heard discussed previously. As we worry here at the UW about the future funding for public education, I see a new deep pocket on the horizon.

The argument has always been made that at least part of the “public good” of education is that it helps churn out workers, training them in the skills that businesses and industry need. One goes to school to learn aeronautical engineering with a possible eye towards employment at Boeing, or one reads Chaucer so that one can work at Microsoft (someone needs to write those manuals, you know – English majors can get jobs!). Yes, the connections are not always direct, but a requirement like “Bachelor’s degree in Computer Science, Math, Physics or related field” in a job description makes it clear that at least some aspects of education are required or desired by industry.

So industry tells the government, “We need trained employees,” and presumably the government agrees and funds public education. But there is another way coming up on the horizon.

A large company like Microsoft or Google could easily decide that it is in its interest to take education into its own hands. Why shouldn’t either or both of these companies decide that it is in their interest to train computer scientists? We will write our own course work, put it up online, create our own tests, grade them (by computer, of course) and make job offers to the best of the best. Why take a BS in CS from Podunk U as an indication that someone knows how to program? Why not write our own tests? Why not train them in OUR OWN internal software toolkits and see how they do?

The basic idea is this: why not offer online learning, collect 100,000 students at a pop, let them watch the YouTube videos that the company created, let them take the tests that the company created, and let the company take a look at the results and decide if it wants to follow up with an in-person interview? The cost is that the company would need to create courseware (good thing that there are plenty of underpaid professors out there who might be interested in some extra contract work, or maybe even interested in jumping ship to create our courseware) and then what, a few servers in the cloud, some people in HR working the telephone lines and email?

In fact, at least in the computer industry which I know fairly well, you could even blend the later levels of course work with apprenticeship/intern style projects. You hack out the code to do X, we inspect it and possibly include it in a product that we are shipping. You as an intern get to see how we work, we possibly get some useful code and more importantly get to see how you fit in before we give you any long term commitment.

Suppose more and more of this happens in the future. You could take Computer Science from Google College or from Microsoft University. You probably don’t want to go to Google School if you can’t get your grades transferred over to get a job at another company. So there will be pressure put on companies to share the grades of their students. Google will probably not get an exclusive shot at their top grads, BUT they will get a first shot, and they will get to train their students on a slightly different toolset that makes the fit better for Google.

The important thing that I see is this: Your typical university is interested in selling the high-priced product. A Harvard education is expensive but it (in theory) gets you into the best, highest-paying jobs. Google and Microsoft, on the other hand, don’t want their schooling to be any more expensive than necessary. In fact they should be willing to subsidize it. Why? Suppose that the UW educated too many aerospace engineers: their grads can’t all get jobs, or they are in so much competition with one another that the price that Boeing must pay is less. This is bad for the university. Universities, like guilds, want to limit the supply of skilled workers so that they can keep wages high. On the other hand, Boeing should see no problem in producing as many potential aerospace engineers as possible. They want to increase the supply of one of their consumables. Boeing has a reason to be interested in free education for the masses; the University of Washington does not.

Businesses already do assessment before they hire. Some of them already do some online assessment. The path that I see into the future is that they will start to expand that assessment from giving tests only to prospective employees who have submitted resumes to opening the tests up to anyone and making them available worldwide. The next step will be when they realize that it is actually in their interests to help people cram for the tests so that they can pass, thus practice tests will be made available. Hmm, cramming for tests is just a way to encourage folks to try to spoof a higher grade on the test with just a little investment in time. There must be a better way. Finally they will realize that they should actually do the entire education online. That way they can monitor the entire educational process and detect that talented freshman a full 3 years before they have had enough schooling to pass the tests. Knowledge is power!

When education REQUIRED that you have 30 students in a single geographical room with a teacher up front – it was not cost effective for a random company to be in the education business. What are you gonna do? – pay one professor to cast a net over 30 people to see if we get a potential employee? – screw that, let the government educate them and we will sift through the finished product. BUT on the other hand, if you can cast your net and sift through 100,000 or eventually a million users, why not create and distribute and grade your own coursework? Just like advertising, it is all about REACH: if you can only reach 30 people, advertising is not worth the cost, but if you can reach millions with your ad, it is suddenly cost effective to pay Madison Ave for writers, graphic artists, camera crews, actors and special effects. Educators that reach 30 people will be quaint. You can probably pay extra for that traditional old-time feel at a country club like Harvard.

I do not expect the nature of education to change overnight. That is not the way things ever work, BUT I see nothing that prevents big industry from deciding that they could and should start educating their own future workers and stop relying on public education to do it for them. I expect that you will see this progress in the way that their job listings read. Today the job requirement for a software engineer at Microsoft is “BS in CS required”. It will change to “BS in CS required AND a grade of XXX in our own online course,” and then someday that BS requirement will be removed since it is, after all, just BS.


Does English have vowels, and can we find them?

meller : March 15, 2012 12:28 pm : CTL-C

Some years ago I thought of the following toy comp ling (Computational Linguistics) problem. I thought of it before I had ever even been aware that there was such a thing as Computational Linguistics, and I trust the folks that actually do comp ling will forgive me if I make it appear that their problems are as trivial as the one I will now pose.

I wondered if there was some way to ‘discover’ the vowels. I mean, if we were Martians looking at a bunch of English texts, wondering what those Earthlings could have meant by those symbols, would we ever discover that special set of characters that are vowels? I know that this is a nearly impossibly vague question and that is part of the reason that I did nothing with it for years.

My vague notion was that the vowels, because of their special place in the language representing voiced components, must somehow leave some kind of statistical footprint which could be noticed by our Martian linguist.

One of our characterizations of the vowels is that ‘every word needs a vowel’. The vowels are an inescapable set of characters. It occurred to me that our comp ling Martian might well ask the question, “Is there a small inescapable set of letters, where every word must have one from that set?” (Yes, I am assuming that our Martians have already decided that space is a separator and that the chunks that they are separating are called words, and while we are at it, let’s assume that they’ve also figured out punctuation and upper and lower case, and that the dot over the j character is actually part of the same character and not a period. I mean, after all, this is just a toy problem. Let’s not worry about painful details that actually confront our Martian computational linguist.)

Now this is a problem that one could write some code to solve. That is of course the real reason that an old coder like me poses problems like this, just another excuse to think about how to write some more code.

Suppose I start with a block of text which I reduce to a list of words. By hypothesis, each word has a vowel, so I could choose one letter at random out of the first word, guess that it is a vowel and throw it into the inescapable set that I am building. For each new word, I look and see if it is already explained by the inescapable set so far. Each time I find a word that has no characters from the inescapable set, I know that my set is incomplete, so I add another letter selected at random from that word.

By the time I have worked through the word list, I will have constructed an inescapable set. Every single word in my text will have been explained by that set. Unfortunately, there is absolutely nothing that prevented me from making mistakes and including characters that did not need to be in the set. The inescapable set that I have just created is not necessarily minimal. (And of course, as our Martian well knows, there is no guarantee that any single unique minimal inescapable set even exists!)
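The original code is not shown in this post. A minimal Python sketch of the single pass described above, assuming the text has already been reduced to a list of non-empty lowercase words, might look like this. The names `one_pass` and `is_inescapable` are made up for illustration.

```python
import random

def one_pass(words, rng=random):
    """Build one inescapable set with a single pass over the word list."""
    inescapable = set()
    for word in words:
        # A word is "explained" if it shares at least one letter with the set.
        if not inescapable.intersection(word):
            # Not explained yet: guess that a random letter of it is a vowel.
            inescapable.add(rng.choice(word))
    return inescapable

def is_inescapable(letters, words):
    """True if every word contains at least one letter from `letters`."""
    return all(set(letters).intersection(word) for word in words)

words = "it was the best of times it was the worst of times".split()
guess = one_pass(words)
print(sorted(guess), is_inescapable(guess, words))
```

Whatever letters the pass happens to pick, the result covers every word by construction. Nothing in the loop keeps the set small, which is the weakness noted above.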

What can we do now? Well, we could just do it again. And again and again, and each time we run it we could keep track of the best inescapable set that we have seen so far. The first pass might have 15 characters, the next only 10, the next 12. Eventually, by sheer luck alone, on some pass we will just happen to choose the right characters and trip over the actual vowels.

I wrote the code to do this. If I made just a couple of passes it usually did not find the vowels. If I ran it a hundred passes it did much better; it nearly always found the vowels. If I ran it a thousand times it always found the vowels.
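As a rough illustration of the repeated passes (again a sketch, not the code the post refers to), the loop below keeps the smallest set seen so far. The function name `best_of`, the fixed `seed`, and the tiny word list are assumptions made for the example.

```python
import random

def one_pass(words, rng):
    """One randomized pass: add a random letter from each unexplained word."""
    chosen = set()
    for word in words:
        if not chosen.intersection(word):
            chosen.add(rng.choice(word))
    return chosen

def best_of(words, passes, seed=0):
    """Run `passes` independent passes and keep the smallest set found."""
    rng = random.Random(seed)  # seeded only so that runs are repeatable
    best = None
    for _ in range(passes):
        candidate = one_pass(words, rng)
        if best is None or len(candidate) < len(best):
            best = candidate
    return best

words = ["a", "i", "be", "he", "me", "we"]
print(sorted(best_of(words, passes=200)))
```

On a real text the number of passes needed depends on how many lucky guesses a pass has to make in a row, which is the question the next paragraphs take up.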

I thought that this was pretty cool. I had code that ‘discovered’ an inescapable set of characters in English words. I thought that I was done. But the best was yet to come.

I started thinking about how I could tell when to stop running passes. Of course, when you are doing a Monte Carlo app with randomization you might never win – but the more you can slant the odds in your favor, the better things will be. I decided that I needed to know the odds that on any given pass I might actually generate a true minimal set. Since I choose a character at random from my word and declare it to be a vowel, my odds of being correct are greatly enhanced if I start with short words. After all, with a one-letter word like ‘I’ or ‘A’ you are assured that you made the correct choice. With a 2-letter word, you have a 50% chance of doing the right thing. If I stack the deck and start with short 2-letter words, I know that making 5 correct choices in a row has only a 1 in 32 (1 over 2 to the fifth) chance of happening. Without stacking the deck, my odds are even lower. This is why a hundred runs were not always enough but 1000 seemed to work.
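The arithmetic above can be checked with a few lines of Python. This is only a sketch under simplifying assumptions: each word that forces a guess has exactly one correct letter, and passes are independent. The helper names are hypothetical.

```python
from fractions import Fraction

def chance_all_correct(word_lengths):
    """Chance that every guess is right when each word has one right letter."""
    p = Fraction(1)
    for n in word_lengths:
        p *= Fraction(1, n)  # one good letter out of n
    return p

def chance_of_a_hit(p, passes):
    """Chance that at least one of `passes` independent passes succeeds."""
    return 1 - (1 - p) ** passes

p = chance_all_correct([2, 2, 2, 2, 2])  # five 2-letter words in a row
print(p)  # 1/32
print(float(chance_of_a_hit(p, 100)))
print(float(chance_of_a_hit(p, 1000)))
```

Even in this best case a hundred passes leave a small chance of never getting all five guesses right, while a thousand passes make a miss vanishingly unlikely.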

So I started to rework my code to first sort my text to put the short words first and then the light bulb went off. I don’t need the computer at all to do this comp ling problem. Look at this:

Consider this list of words: be, fee, he, lee, me, see, tee, we

Every word on the list is explained by the single letter ‘e’. BUT if I DON’T choose ‘e’ for my inescapable set, I am forced to include all of ‘BFHLMSTW’. That is way more than the smallest set that my program produced. This is not proof that ‘e’ is a required character in our inescapable set, but it is strong evidence. Any set that does NOT have ‘e’ will have at least 8 letters.

I quickly produced the following short word lists that provide strong evidence for the other vowels.

A and I are explained by the single letter words. They must be in any inescapable set.

E is explained by the list above, without it we must include ‘BFHLMSTW’

O is explained by odd, of, go, loo, mom, no, or, so, to, zoo – ‘DFGLMNRSTZ’

U is explained by dud, lull, mum, nun, up, us – ‘DLMNPS’

Y is explained by by, cry, fly, my – absence of Y requires at least 4 other letters ‘BM(CR)(FL)’

So if you start with a list of ‘AEIOUY’, ripping out any single one of those letters costs you at least 4 other letters in its place. This tells us that if ‘AEIOUY’ actually covers all the words (no proof of that, you must run through all the words once and see) then it is actually minimal, and in fact both unique and minimal by quite a bit, because it costs so much to miss one of those letters.
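For anyone who would rather let the machine check the short word lists after all, here is a small brute-force sketch. `removal_cost` is a hypothetical helper that finds the fewest other letters needed to cover a list once one letter is forbidden; it tries every combination, so it is only sensible for short lists like the ones above.

```python
from itertools import combinations

def removal_cost(words, letter):
    """Fewest other letters that cover `words` when `letter` is not allowed."""
    pools = [set(w) - {letter} for w in words]  # what each word has left
    alphabet = sorted(set().union(*pools))
    for size in range(1, len(alphabet) + 1):
        for combo in combinations(alphabet, size):
            if all(pool.intersection(combo) for pool in pools):
                return size
    return None  # some word is made of `letter` alone, so nothing replaces it

def covers(letters, words):
    """True if every word contains at least one of `letters`."""
    return all(set(letters).intersection(w) for w in words)

e_words = ["be", "fee", "he", "lee", "me", "see", "tee", "we"]
y_words = ["by", "cry", "fly", "my"]
print(removal_cost(e_words, "e"))  # 8
print(removal_cost(y_words, "y"))  # 4
print(removal_cost(["a"], "a"))    # None
print(covers("aeiouy", e_words + y_words))
```

The `covers` check is the "run through all the words once and see" step, applied here only to the toy lists rather than to a full text.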

I thought that this was a nifty result. Comp Ling without any Ling and without any actual Comp needed once you think about it in the right way. I found it especially amusing that the role that the computer played in solving this problem was that it forced me to think about how to create an efficient algorithm and, in fact, the solution ended up being so efficient that there was no need to use the computer at all.

So there you go. English does have vowels and they can be found! You can rest easy tonight.

Enjoy!

Marlin


Neuron In The Net

meller : March 6, 2012 6:04 pm : CTL-C

It’s getting loud. The party is really beginning to happen. Someone brought some chemicals and things are getting a little twisted. I look out and count 4 or 5 billion… What? 6 billion? Already? Wow, that was fast work! Looks like we’re gonna need some more chips n’ dip if we wanna keep this one from being a bust.

Can you imagine the noise and confusion at a party this size? It is ungodly huge. We got people killin’ one another over in that corner and we’ve got yer heavy philosophy over there by the fridge and yer entertainment up stairs. Go on, check it out!

Me? I want to follow the ideas. There are so many folks out there thinking things up, and telling others what to think, and having the ideas extended, modified and mutated. The ideas are amazing. So many of them. What a confusion! I want to head out to the fringe and bring back some of the exotica to show my friends.

The net is almost ripe for surfing. Storms in Mexico and all that. Most days I can be found out bobbing on the board. Looking for the really big one, OUTSIDE.

I’m ready to take my role as a neuron. I’ll read a thousand inputs from a thousand random places. Sure, I’ll have my biases. I mean, if you shove an axon into the optic bundle of course you’re gonna get a lot of graphics. If you’re hooked into comics or Vic’s Vapo-Rub you gotta expect things to be a little different. I’ll take my thousand inputs and fuck with ‘em. Mash ‘em together, get excited, get ‘em confused. Get ‘em wrong. Make them mine and pass them on to anyone that will listen.

Look, I’ve got my favorite authors, my favorite inputs. Of course I pay more attention if it is Gauss or Bach or Sedgewick. What’s that? How can I put Sedgewick in the same sentence with Gauss? Hey, I like Sedgewick. Yeah, I know that he didn’t invent all those algorithms, that he just wrote them up. He did it nicely. Besides, are you sure that Gauss wasn’t just a compiler himself? I mean, you don’t know what inputs he was hooked into. Maybe if you were in his shoes you would have been him.

You see, it is a fundamental tenet that all neurons are created equal. They all talk to anybody. They deal as best they can with whatever gets blown at them. They pass it on. Some of the people downstream will just think you’re crazy. “What’s he ranting about this time?” Others will think you’re brilliant, “Oh wow, like I never thought about it that way. That’s fucking amazing!” Those ones will crank up the volume. I mean your fans. They will amplify you and broadcast you and mimic you. Others will call you merely derivative. These are the guys that share some of your inputs. They know all your sources. They know you didn’t invent it all. You copied most of it anyway. But hey… if they say you’re derivative, they’re just jealous.

So I’m a neuron in a great neural net. Ho hum… another stupid recursive fractal analogy, move up to a colossal scale and see the same thing. Single-cell organisms group together to form a single multi-cell organism. Single neurons group to make a single brain, which is a neuron in a larger single multi-organism organism.

Yeah… and an atom looks like the solar system. Just another standard SF plot. Brin probably copied it from Asimov who caught it from Plato or something. So you’re a neuron. Big deal.

Well, I’m gonna start acting like a neuron that knows he is one and knows how to act. I didn’t choose my inputs. I don’t choose my outputs. I just get to change my scalar values.

Some will think this is novel, some will think it is insane, but some of you, my dearest friends, my cohort will just say, “boring…. Pointless”
