Talking to machines and other psychopathological indicators

Ok, nail in the coffin on the whole ‘capacity to post on time’ thing. Moving on.

So, what technologies are needed for an artificial assistant? Short answer: no-one cares. Alternately: this, this and this.

If you’re looking to do something similar, though, don’t get too bogged down just yet. In a few days (yeah, probably a week), once I’ve built a functional first version (currently it just tells me the time and the weather), I’ll be uploading an instructional along with the source of my program, so that you can play around with the versions as I release them or develop an offshoot yourself.

Anyway, the point of this post was just to let you know what the first version will do. I’ve finally settled on an initial batch of commands, and they are as follows:

  • “Geoffrey” (tells the program to start receiving commands so it doesn’t try to process every stray sound. Named for this. I’ll think of a better name eventually.)
  • “That’s all” (stop handling commands)
  • “What’s the time”
  • “What’s the date today”
  • “What’s the weather like” (temperature and brief summary, currently not very location-portable; if anyone knows of a good, simple site which gives forecasts for most cities worldwide, let me know in the comments, though honestly I haven’t looked much.)
  • “Set an alarm for (time)” ((time) being an actual time)
  • “What alarms do I have set”
  • “Delete the alarm for (time)”
  • “Book (appttype) for (time) (date) for (lengthoftime)” ((appttype) is currently one of “breakfast”, “lunch”, “dinner”, “a meeting”, “an appointment”, principally because I’m unfailingly dull. More to be added in time. Limitations of the software make it difficult to allow for an arbitrary description, but it will be trivial for you to add your own types)
  • “What am I doing (date)” ((date) being “today”, “tomorrow” or “on the x of y”, x and y being a day and month respectively)
  • “Delete my (time) appointment (date)” (e.g. delete my five o’clock appointment on the twelfth of June)
  • “Do I have any new emails”
  • “Read them to me” (in relation to emails)
  • “Read me the headlines” (read headlines of new RSS feed items)
  • “Go on” (re RSS; I figure reading all new headlines will take forever, so it’ll probably only read ten or so at a time, then wait for a “Go on”)
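For the curious, here is roughly how the “Geoffrey” / “That’s all” gating could work. This is only a sketch, not the program’s actual source: the `Assistant` class, the `hear` method and the canned replies are made up for illustration, and it assumes the speech recogniser has already turned the audio into a text phrase.

```python
from datetime import datetime

WAKE_WORD = "geoffrey"        # start handling commands
SLEEP_PHRASE = "that's all"   # stop handling commands


class Assistant:
    """Toy command gate: ignores everything until it hears the wake word."""

    def __init__(self):
        self.listening = False
        # Fixed phrases mapped to handlers (hypothetical; only two shown).
        self.commands = {
            "what's the time": lambda: datetime.now().strftime("It's %H:%M"),
            "what's the date today": lambda: datetime.now().strftime("It's %A %d %B"),
        }

    def hear(self, phrase):
        """Take one recognised phrase; return a reply string, or None to stay silent."""
        # Normalise case, whitespace and curly apostrophes before matching.
        phrase = phrase.strip().lower().replace("’", "'")
        if not self.listening:
            if phrase == WAKE_WORD:
                self.listening = True
                return "Yes?"
            return None  # stray sound while asleep: ignore it
        if phrase == SLEEP_PHRASE:
            self.listening = False
            return "Goodbye."
        handler = self.commands.get(phrase)
        if handler is None:
            return "Sorry, I don't know that one."
        return handler()
```

Adding a command in this sketch is just another entry in the `commands` dictionary, which is the appeal of a fixed phrase list over anything cleverer.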

Well, that’s it for now. It doesn’t cover all of the goals I laid out in my initial post, but it should be a solid start.

And what’s life without a little shameless self-promotion? Subscribe to the feed or my twitter for progress updates, and so I can let you know when the first version comes out.

And yes, I’m using twitter, but only really for post alerts.

Sony unveils prototype eye-tracking glasses

Sorry I’ve been a bit useless about the next update. I’ll push it out today, and explain myself at the same time.

Just wanted to mention a post I read earlier that related to my planned cellphone glasses.

It seems Sony are building a set of frames with a built-in eye-tracker (my planned form of input) and an outward-facing camera (useful for augmented-reality applications). Since all open-source blueprints for iris trackers call for them to be attached to the end of a protruding rod, this would be a significant boon to the project.

I’m not entirely sure how they’re going to move the iris tracker to the frames without losing too much accuracy, as this is precisely why most implementations don’t do so, but they do mention that the frames are only designed to be good enough for the purpose at hand, and are not a ‘full-scale eye tracker’, so I suppose it’s just a case of near-enough-is-good-enough.

These decidedly don’t look as though they’ll be ready in time for the project, but they should improve it markedly when they do arrive. Can’t wait.

What I look for in an artificial assistant

(This is a bit long, but bear with me. If it helps, it ends with laser-shooting autonomous robots. Also, most future posts will likely be a lot shorter.)

A couple of weeks ago I saw Iron Man 2 with a friend of mine, the first of many viewings (all research, I swear!). As you do after being bombarded for two hours by ridiculous technology, we got to talking about which of the myriad ludicrous contraptions could reasonably be created using available technology and a student’s budget.

After he shot down my dream of augmented-reality iris-tracking cellphone glasses (which I still believe is feasible, more on that in a few months), we got to talking about Jarvis, Iron Man’s artificially intelligent assistant.

While we both discounted (albeit grudgingly) the possibility of near-term development of artificial intelligence per se, neither one of us was convinced that a reasonable emulation of such an assistant was an impractical goal. Though Richard (that’d be my friend. I’m Wyatt, by the way. Hi!) thought that natural language processing, whereby a computer can interpret the intent behind an arbitrarily phrased command, was necessary for a useful assistant, I thought that a set of a few dozen fixed commands could be almost as useful to a user willing to deal with the learning curve.

This led to something of an impasse in our plans as, though natural language processing technology exists, it’s well beyond both our financial means to buy and our cognitive means to build. After a rather drawn-out discussion on the matter, I ended up on the losing (that is, onus-bearing) side of a wager.

And so we come (at last; if you have the patience to stick around you’ll find I never shut up) to the point of this post, this blog, and, for at least the next few weeks or months, depending on how far I decide to take the project, my life: To document my building of a functional and genuinely useful artificial assistant. ‘Genuinely useful’ is a bit of a cop-out description, but as a rough guide, I want it to:

  • accept commands by voice with a low failure rate (maybe 5%; this is largely beyond my control, so I’m just hoping that the available technology is good enough). Essentially I want it to fail infrequently enough that the system doesn’t frustrate me.
  • be at least as efficient as a keyboard-and-mouse based system. Obviously, a spoken-command system is going to be slower, but since I can speak to it while typing something unrelated, if the failure rate on commands is low enough I’ll call it a wash.
  • handle obvious padding functions like reporting the time, date, weather etc
  • book and alert me of appointments, and recount any given day’s appointments on request
  • get my emails and possibly send preset emails in return (if I can think of a reasonable use case). Sending dictated emails is probably impractical for the time being, since the last time I used voice recognition software it required constant monitoring to ensure it didn’t misunderstand me. Kinda defeats the point.
  • read RSS feeds and alert me of news both as it happens and on request (let’s see how pissed off I get being interrupted by a computer). As an aside, I’ll probably use face or motion recognition to ensure it isn’t speaking to an empty room, which most experts agree is one of the three signs of insanity in computers.
  • maintain an address book. Not much use presently, but I want to eventually integrate it with the appointment calendar and mail client along with, if possible, some phone or Skype integration. For example, ‘tell everyone attending my three o’clock appointment that I’m running fifteen minutes late’, and have the program send them all SMSs to that effect. I am a little unsure of how I’m going to add contacts or email addresses, since the voice-recognition software I’m using requires me to use only pre-specified words. I’m thinking I might spell them out using the NATO alphabet or something, but we’ll see.
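The NATO-alphabet idea from the address book point could look something like the sketch below. It assumes the recogniser hands back a string of pre-specified words separated by spaces; the `spell` function and the extra words for punctuation and digits (“at”, “dot”, “dash” and so on) are my own inventions for illustration, not part of any existing library.

```python
# NATO spelling alphabet, with a few common alternative spellings.
NATO = {
    "alfa": "a", "alpha": "a", "bravo": "b", "charlie": "c", "delta": "d",
    "echo": "e", "foxtrot": "f", "golf": "g", "hotel": "h", "india": "i",
    "juliett": "j", "juliet": "j", "kilo": "k", "lima": "l", "mike": "m",
    "november": "n", "oscar": "o", "papa": "p", "quebec": "q", "romeo": "r",
    "sierra": "s", "tango": "t", "uniform": "u", "victor": "v",
    "whiskey": "w", "xray": "x", "x-ray": "x", "yankee": "y", "zulu": "z",
}

# Assumed extra vocabulary for the symbols and digits an address needs.
EXTRAS = {
    "at": "@", "dot": ".", "dash": "-", "underscore": "_",
    "zero": "0", "one": "1", "two": "2", "three": "3", "four": "4",
    "five": "5", "six": "6", "seven": "7", "eight": "8", "nine": "9",
}

VOCABULARY = {**NATO, **EXTRAS}


def spell(spoken):
    """Turn a run of spoken code words into the string they spell out."""
    chars = []
    for word in spoken.lower().split():
        if word not in VOCABULARY:
            # Better to fail loudly than to save a mangled address.
            raise ValueError(f"unrecognised spelling word: {word!r}")
        chars.append(VOCABULARY[word])
    return "".join(chars)


if __name__ == "__main__":
    print(spell("whiskey yankee alfa tango tango at echo x-ray alfa mike "
                "papa lima echo dot charlie oscar mike"))
```

Every word in the table would need to be in the recogniser’s pre-specified word list, which is about forty extra words, so it stays within the fixed-vocabulary limitation.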

I’ll probably add all too many more functions as time goes by, but if I could coax it into handling all of the above points, I’d be satisfied that the project was a success. Or rather, if all of the above works out, I think it’ll feel useful, which is my true measure of success.

I’m vaguely inclined to build in Facebook and Twitter integration, they being the features du jour, not to mention a cakewalk to implement, but so long as hell remains nice and toasty I’m unlikely to post on either, so they wouldn’t serve much of a purpose. I don’t know exactly who I’m saying this to, since it’s doubtful I’ll get any readers, let alone comments, for my first post, but if you have any ideas for other features I’d love to hear them.

Anyway, next time, probably some time tomorrow, I’ll go through in more detail how I plan to build it, the technologies I’m using and what I’ve managed so far.

Oh yeah, and when I’m done I’ll stick it on wheels and attach lasers to make good on my opening promise. I know, a little weak, but it’ll run down the hall shooting lasers at me while reading out the day’s headlines. Is there anything currently in existence even remotely as awesome as that sounds?