Archive for March 2006

Developing Speech Applications - Part I

I haven't blogged, or conducted a session, about Speech Technologies (my second love) and the Microsoft Speech Application SDK (SASDK) for quite some time. However, I have received a lot of requests for the SASDK tutorial on the FlightEnquiry.NET Speech Demo that I showed at the Microsoft ISV Community Days in March 2005, the Pakistan Developer Conference 2005 in June 2005, and various INETA Pakistan user group events last year. I have figured that it is better to blog about something in a series of posts than to type one HUGE post and be absent from the blogging world for days and weeks. So, if time and old age permit, I will try to blog about the SASDK tutorial in 3 posts. (I will try to keep the "Atlas at last" posts coming during this time too.)

NOTE: Most of the following text was written about 8 months ago.

BACKGROUND
When ASP.NET first appeared on the horizon, most of us heaved a sigh of relief: no more having to work with ASP's two separate languages, VBScript on the server and JavaScript on the client; no spaghetti code; and, most importantly, no need to worry about posting data back on to the form when it reloaded. With a number of cool ASP.NET server controls, life had been made simpler for the Web Developer. In a sense, for a brief moment in time, ASP Developers (who had shifted to ASP.NET) forgot all about JavaScript, since all ASP.NET controls had client-side logic built into them. The burden of writing JavaScript (or JScript) code had shifted from the Web Application Developer to the ASP.NET Server Control developer.

INTRODUCTION
Enter the Microsoft Speech Application SDK (SASDK). Suddenly, client-side scripting became relevant again. Speech Applications built with the SASDK were browser-based ASP.NET applications with much of the handling done at the client, i.e. the web browser. In making their Speech platform compatible with ASP.NET, Microsoft had 2 objectives: (1) to allow companies to use their existing ASP.NET code infrastructure in developing Speech apps; and (2) to make use of SALT (Speech Application Language Tags), a standard for providing speech input and output in the web browser using tags. In various talks and speaker sessions, I have repeatedly been asked if the Microsoft SASDK would speech-enable websites or allow voice-activated web browsing. My answer: yes and no, depending on the type of platform you are developing the speech application for.

A Speech Application developed using the Microsoft SASDK falls into one of two categories: (1) Voice-only or (2) Multi-Modal. Both application types leverage existing ASP.NET code but differ in implementation. Voice-only applications have a VUI (Voice User Interface) and no GUI, i.e. all communication between the app and the user is by means of voice or DTMF (Dual-Tone Multi-Frequency) tones that may be transferred over a telephone line or any other voice carrier medium. Multi-Modal applications, on the other hand, have a VUI as well as a GUI. This allows the user to interact with the application using voice as well as the mouse and keyboard. The use of Voice-only telephony applications is understandable, but why use Multi-Modal? The fact of the matter is that the bulk of Multi-Modal applications developed within the next few years won't be for the desktop PC, but for Mobile and Smart Clients. PC users typically have all the hardware they need, namely the keyboard and mouse, to interact with their applications. Mobile/Smart Client applications, however, are difficult to use on the road because of the small keypad or stylus. Microsoft's release of the Mobile Internet Toolkit did not bring about the increase in Mobile application development that many had expected, because the small size of the devices makes such applications extremely difficult to use. Speech-enabled Mobile or Smart Client applications require minimal use of the keypad or stylus and rely more on speech for input (and even output), thus providing increased ease of use.

[NOTE: Multi-modal applications failed to take off in the way that Microsoft had expected. The Speech API (SAPI) 5.3 that comes built into Windows Vista allows development of desktop speech applications and DOES NOT make use of SALT.
You can download a video of the Multi-modal Speech Application (Sublime Demo) running on a hand-held device from Jahanzeb Sherwani's web page.]

The execution model followed by an application developed using the SASDK is much the same as that of any ASP.NET application, consisting mainly of a certain number of requests and responses. However, the client-side processing done for a speech application is different. In a typical web form (non-speech app) scenario, information is submitted to the server through form controls (textboxes, checkboxes, radio buttons etc.) which form the GUI. In speech applications, input is received by a speech control that is not visually rendered but is hosted inside a web page using a tag from the SALT specification. When a voice input is received, it is processed on the client by JScript; meaningful information is extracted and sent to the server by means of a variable known in the SASDK domain as a "semantic item". The importance of JScript that I highlighted at the start of this article stems from this client-side processing of the speech input. This scenario is valid for both Voice-only and Multi-modal Speech applications. For more information on SALT, check out the SALT Forum web site.

Please note that in order for Internet Explorer to interpret SALT tags, you need to have the Microsoft Enterprise Instrumentation Framework (EIF) installed on your PC. EIF is a prerequisite and comes with the SASDK. Also, SASDK v1.0 and v1.1 work only with .NET Framework 1.1 and Visual Studio .NET 2003.

In the second part of this three-part series of posts, I will briefly discuss the sequence of functions performed during the execution of a typical Voice-only application.
Monday, March 27, 2006
Posted by Adnan Farooq Hashmi

Atlas at last! - Part I

The great thing about being a Microsoft MVP is that you are always out to learn new tools, technologies, tips, and tricks, in anticipation of being asked a technical question on any forum regardless of what your competencies actually are. Not only does gaining knowledge benefit MVPs themselves; it also lets them share that knowledge, expertise, and experience with the developer community at large. I simply love doing it, and I am glad that I too am finally blogging about AJAX and Atlas after so many of my fellow MVPs and bloggers around the world have already done so. I hope to blog about Atlas, as I have understood it, in a series of posts.

You might have been hearing and reading a lot about Web 2.0 lately. If you haven't heard about it (and apparently live in a cave, as the saying goes), check out an in-depth article on it here.

AJAX is all about JavaScript, as it involves using the XmlHTTP object built into your web browser, which has been a part of the browser since 1998. I don't think a lot of web developers would want to write AJAX apps using the XmlHTTP object directly; most would instead rely on pre-built libraries that leverage XmlHTTP. If you happen to be an ASP.NET developer (and you probably are), you need to get hold of Atlas, Microsoft's client- and server-side libraries and controls for building AJAX applications. Since Atlas has not yet had its final release, you might have to do a lot of JavaScript coding yourself. The thing about JavaScript is that when you are getting down and dirty writing code in a language that you do not usually program in, and reluctantly learning many details of client-side scripting that you are not accustomed to (or at least did not care about previously), you are bound to get an error once in a while. Many times, the line of code that is marked as the culprit is not actually the one that has something wrong with it. Since you might not get a lot of IntelliSense and debugging support in the VS IDE for client-side scripts (although Microsoft is working to change that in the future), there are a few things that I recommend you do when you encounter a JavaScript error.


  1. Pray that you are able to solve the problem within the next hour. [Optional]

  2. Go and have a glass of water. [Mandatory]

  3. Take a deep breath and relax. [Mandatory]

  4. Empty your web-browser's cache. [Optional]

  5. Go over your script's code, one statement at a time, to ensure that you did not misspell anything. Check variable and function names. Do not assume that your typing skills are excellent. You could be great, but that changes when you are typing JavaScript code. DO IT! [Mandatory]

  6. REMEMBER, JavaScript is case-sensitive, so textbox is not the same as textBox. [Mandatory]

  7. Use the JavaScript 'alert' function judiciously to display values in a browser dialog box, to ensure variable values are what you think they are at a particular point in your code. [Mandatory]

  8. There is NOTHING wrong with the Atlas client libraries (*.js files inserted into the project when an Atlas website is created); so if you get an error at, let's say, line no. 1554, check to see if you have closed all Atlas tags correctly. [Mandatory]

  9. IE can only do so much to point you to the line of code that has the error. It is highly likely that the line marked as having the error is perfectly all right. Check to see if that line is followed by a function-call statement. If it is, go to point no. 5 and start debugging that function. [Mandatory]

  10. REMEMBER, AJAX/Atlas involves calling web services asynchronously. Every web service call must also specify a call-back function, which is executed when the service finishes executing and returns data to the browser.

  11. When you get an error on the JavaScript statement calling a web service, be sure to debug the call-back function (the function that executes once the web service returns the results of its execution). [Mandatory]
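
Points 6, 7, 10 and 11 above can be sketched in a few lines of plain JavaScript. This is only an illustration, not Atlas code: callService, traced, onGreeting, the createXhr parameter and the URL in the last comment are all hypothetical names made up for this example. It assumes a browser that exposes XMLHttpRequest as a native object (older versions of IE expose it through ActiveX instead). The call-back contains a deliberate case-sensitivity bug, and the wrapper reports which call-back failed instead of leaving you guessing at the calling line.

```javascript
// Sketch only: callService, traced and createXhr are hypothetical names,
// not part of Atlas or any other library.

// Issue an asynchronous GET and hand the response text to `callback`.
// `createXhr` is an optional hook so a stub can stand in for the browser object.
function callService(url, callback, createXhr) {
  var xhr = createXhr ? createXhr() : new XMLHttpRequest();
  xhr.open("GET", url, true); // third argument true = asynchronous
  xhr.onreadystatechange = function () {
    // readyState 4 means the request has finished
    if (xhr.readyState === 4 && xhr.status === 200) {
      callback(xhr.responseText);
    }
  };
  xhr.send(null);
}

// Wrap a call-back so that an exception thrown inside it is reported
// together with the call-back's name.
function traced(name, callback, report) {
  return function (result) {
    try {
      return callback(result);
    } catch (e) {
      report("Error in call-back '" + name + "': " + e.message);
    }
  };
}

// Deliberate bug: `Result` and `result` are different identifiers,
// because JavaScript is case-sensitive.
function onGreeting(result) {
  return Result.toUpperCase();
}

// In a browser page, `alert` could serve as the reporter (hypothetical URL):
// callService("Greeting.asmx/Hello", traced("onGreeting", onGreeting, alert));
```

With the wrapper in place, the message names the call-back that threw, which is where point 11 tells you to look anyway.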



There are more in-depth tutorials and technical details that I will cover in my future "Atlas at last" posts.
Thursday, March 23, 2006
Posted by Adnan Farooq Hashmi
