33 Replies Latest reply on Nov 22, 2011 6:50 PM by RJKSolutions

    24 hour program-a-thon

    MichaelvdV@Atos

      Hello fellow vCampus members,

       

      I've posted a blog post about a 24-hour program-a-thon I'm doing with the Kinect sensor, to further research the concept of a Natural User Interface for the PI System and PI AF. I've committed the next 24 hours to a single Kinect programming challenge.

       

      I'm going for option #3 from the blog post:

       

      Create a speech recognition application that uses the Kinect microphone array. This application should be able to answer questions about the PI System (get and display data, create trends, access AF Elements).

       

      I've already gathered some ideas on how to approach this. The first (and most important) thing to do in a new project is to choose a name. I'm calling this project 'Bob'. Not sure why, but it just sounds great to be able to address this application as 'Bob'.

       

      My goal is to create an application that recognizes speech and responds by talking to the user and displaying the requested information. My goal is not to create something 'intelligent', but rather a challenge-response (question-answer) system that is genuinely useful through speech. This can be further enhanced in the future by using the tracking capabilities of the Kinect sensor.

       

      My first goal is to lay the groundwork for requesting and displaying basic information from a PI System, and to go from there.

       

      I will keep you updated, and please... if you have any suggestions, let me know!

        • Re: 24 hour program-a-thon
          MichaelvdV@Atos

          First update (time flies when you are having fun...)

           

          I've managed to get part of the speech recognition working and have 'hooked' it up to the PI System. The recommended speech recognition library for the Kinect microphone array is the Microsoft.Speech library, the 'server side' of Microsoft's speech recognition platform. I ran into some issues with it a few days ago (I will go into detail later...), so I switched to the System.Speech library (the 'client' library for speech recognition). The drawback of this library is that you have to train it (using the options in the Windows Control Panel).

           

          I've managed to create a little challenge-response framework where I can easily add new 'hooks'. A hook consists of a grammar and an action to run when the user produces that grammar.
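The hook idea above can be sketched in a few lines of Python (Bob itself is a .NET application, and every name below is hypothetical): a registry maps each grammar, here just a set of accepted phrases, to a callback that runs when the recognizer produces one of those phrases.

```python
# Minimal sketch of a challenge-response "hook" registry (names hypothetical).
# A hook pairs a set of recognizable phrases (the grammar) with an action
# to run when the recognizer produces one of those phrases.

class HookRegistry:
    def __init__(self):
        self._hooks = []  # list of (phrases, action) pairs

    def add_hook(self, phrases, action):
        """Register a grammar (a set of phrases) and its callback."""
        self._hooks.append((set(p.lower() for p in phrases), action))

    def dispatch(self, recognized_text):
        """Run the action whose grammar matches the recognized phrase."""
        text = recognized_text.lower()
        for phrases, action in self._hooks:
            if text in phrases:
                return action(text)
        return None  # no hook matched

registry = HookRegistry()
registry.add_hook(["what is the snapshot of sinusoid"],
                  lambda t: "sinusoid = 42.0")
print(registry.dispatch("What is the snapshot of sinusoid"))
```

New behaviour then only requires registering another (phrases, action) pair, which matches the "easily add new hooks" goal described above.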

           

          Here is a little update on the progress so far. Bob is able to recognize questions regarding snapshot data and point attributes:

           

          http://www.youtube.com/watch?v=S4aIHV8CCJE

            • Re: 24 hour program-a-thon

              Wow, that is impressive!

                • Re: 24 hour program-a-thon
                  MichaelvdV@Atos

                  Thanks Luis!

                   

                  Status update:

                   

                  I've implemented a hook to show historic PI System data in a table, and I'm working on getting it into a trend. I'm also working on hooking Bob up to the Wolfram Alpha API, to display generic calculations, weather information, chemistry and physics information, etc. It works, but not smoothly. The issue here is that I cannot predict the grammar.

                   

                  The voice recognition works best if I can predict the grammar. For instance, when requesting information about snapshot or historical PI System data, I can load the tag names upfront (and include them in the grammar), so the voice recognizer can expect certain words (or terms). When working with 'free text', the voice recognition API doesn't seem to recognize the right words. I will have to work on that, because hooking it up to a web service like the Wolfram Alpha API would really expand the functionality of the application.
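As a rough illustration of why a predictable grammar helps (a Python sketch with an invented tag list, not Bob's actual code): expanding known tag names into a fixed set of question phrases gives the recognizer a small, closed vocabulary to match against, instead of open dictation.

```python
# Sketch: building a constrained grammar from known tag names (hypothetical
# tag list). Restricting the recognizer to a fixed phrase list is what makes
# recognition reliable compared to free-text dictation.

TAGS = ["cdt158", "sinusoid", "cdm158"]  # loaded upfront from the PI server

def build_grammar(tags):
    """Expand each tag into the full question phrases the recognizer accepts."""
    templates = ["what is the snapshot of {tag}",
                 "show me the attributes of {tag}"]
    return {t.format(tag=tag): tag for t in templates for tag in tags}

grammar = build_grammar(TAGS)
# The recognizer only has to choose among these exact phrases:
print(grammar["what is the snapshot of cdt158"])
```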

                   

                  I'm also thinking about creating some sort of workflow mechanism where you can 'dive deeper' with voice commands. For instance, once you give the voice command to display a trend, you want to be able to change the time range. Additional grammar has to be loaded once a trend (with historic data) is displayed. This poses a bit of a challenge.
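One way to sketch that 'dive deeper' workflow (hypothetical names, not the real implementation) is a stack of grammar layers: the global commands stay at the bottom, and view-specific commands such as trend manipulation are pushed when the view opens and popped when it closes.

```python
# Sketch of context-dependent grammars (names hypothetical): once a trend is
# displayed, trend-specific commands like "zoom out" become valid; they are
# removed again when the trend closes.

class GrammarContext:
    def __init__(self, global_commands):
        self._stack = [set(global_commands)]  # global grammar is always active

    def push(self, commands):
        """Load additional grammar for the current view (e.g. an open trend)."""
        self._stack.append(set(commands))

    def pop(self):
        """Unload the top grammar layer, but never drop the global grammar."""
        if len(self._stack) > 1:
            self._stack.pop()

    def is_valid(self, command):
        return any(command in layer for layer in self._stack)

ctx = GrammarContext(["show trend", "stop listening"])
ctx.push(["zoom out", "change time range"])   # a trend is now on screen
print(ctx.is_valid("zoom out"))               # valid while the trend is open
ctx.pop()                                     # trend closed
print(ctx.is_valid("zoom out"))               # no longer valid
```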

                   

                  More to come, stay tuned...

                    • Re: 24 hour program-a-thon
                      Lonnie Bowling

                      I love BOB!  He seems to have a personality when reading off all of those values, excellent. This is a lot of fun to watch, can't wait for the next update!

                        • Re: 24 hour program-a-thon
                          MichaelvdV@Atos

                          Still struggling with the 'free text'. The speech recognition is really error-prone when dealing with non-defined grammars. I will leave that part as it is for now, because it is taking up a lot of time.

                           

                          I've restructured the base to handle 'global' user settings. These can be used for time ranges, PI Server names, etc. It already works when I call up the history of a specific tag and then increase or decrease the time range. It's not pretty, but I'm getting there.
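A minimal sketch of such global settings (values and names invented for illustration): voice commands like "increase the time range" just mutate a shared settings table that later history requests read from.

```python
# Sketch of 'global' user settings with voice-adjustable time ranges
# (defaults hypothetical). "Increase the time range" doubles the window
# used by subsequent history requests.

settings = {"server": "MyPIServer", "timerange_hours": 8}

def adjust_timerange(direction):
    """Double or halve the history window in response to a voice command."""
    if direction == "increase":
        settings["timerange_hours"] *= 2
    elif direction == "decrease":
        settings["timerange_hours"] = max(1, settings["timerange_hours"] // 2)
    return settings["timerange_hours"]

print(adjust_timerange("increase"))  # 16
print(adjust_timerange("decrease"))  # 8
```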

                           

                          After 8 hours of almost non-stop programming I'm getting tired, and I'm afraid I will start making bad design decisions. Going to get some sleep and return in the morning! I will post a new video update as soon as possible.

                            • Re: 24 hour program-a-thon
                              Ahmad Fattahi

                              Go Michael! I was thinking about what new doors this whole new way of interaction could open up. Like I said before, accessibility would be a huge deal. Also, imagine environments where someone has to have their eyes and hands somewhere else, like while driving a car. It would allow you to use the PI System in those environments.

                                • Re: 24 hour program-a-thon
                                  Lonnie Bowling

                                  This really has got me thinking too. One of the big things about using smartphones is that the UI has to be intuitive and easy to use. This is something I need to start thinking about for the mobile work I'm doing (now putting it on the big goals list for 2012). Michael, good going!!! It is funny, because last weekend I did some XP (extreme programming) for my mobile/cloud stuff, but only for 10 hours; I'm such a wimp! Too bad we live so far apart; it would be fun to do a session together with a few of us!

                                    • Re: 24 hour program-a-thon
                                      MichaelvdV@Atos

                                      Thanks for all the positive comments!

                                       

                                      Starting again after getting some sleep. It's amazing what a refreshed look can do for your project.

                                       

                                      Something I forgot was the concept of 'events'. You want to be able to subscribe to events (for instance, Notifications or EventPipes). This is something that should be present in such a product.
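In plain Python, an EventPipe-style subscription could be sketched like this (the real PI SDK API differs; everything here is a stand-in): callbacks are registered per tag and invoked whenever a new event arrives, so Bob can announce it.

```python
# Sketch of an event subscription (EventPipe-style) in plain Python; the
# PI-specific API is replaced by a hypothetical callback registry. Bob would
# announce each new event as it arrives.

class EventPipe:
    def __init__(self):
        self._subscribers = {}   # tag -> list of callbacks

    def subscribe(self, tag, callback):
        self._subscribers.setdefault(tag, []).append(callback)

    def publish(self, tag, value):
        """Called when a new snapshot event arrives for a tag."""
        for cb in self._subscribers.get(tag, []):
            cb(tag, value)

pipe = EventPipe()
pipe.subscribe("cdt158", lambda tag, v: print(f"New event for {tag}: {v}"))
pipe.publish("cdt158", 42.0)   # prints: New event for cdt158: 42.0
```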

                                       

                                      Working on that now, I hope I can show something more within the next couple of hours.

                                        • Re: 24 hour program-a-thon
                                          Asle Frantzen

                                          Go for it, Michael!

                                           

                                          Will be watching for updates throughout the day

                                            • Re: 24 hour program-a-thon
                                              MichaelvdV@Atos

                                              Another update:

                                               

                                              Getting some sleep really helped me to get some work done :)

                                               

                                              The current status: I can request snapshots, point attributes and historic values (and display them in a trend and a table), and request information from the Wolfram Alpha web service. I'm also able to set 'environment variables', such as time ranges. It is now possible to request updates when new events for a PI tag are available (EventPipes).

                                               

                                              I've created a short demo during one of my debug sessions. It's quite lengthy, but it shows the progress so far. As you will notice, the recognizer sometimes assumes the wrong words (CDM158 instead of CDT158). This is partly a matter of speaking clearly; I've noticed that due to my Dutch accent some words are pronounced incorrectly, so the recognizer picks the wrong semantics.

                                               

                                              Another issue is background noise, plus the fact that the speech recognizer picks up the output of the speech synthesis. When Bob is talking, it sometimes interprets the (computer) speech as human speech. At this point I'm pausing the speech recognition while speech is being synthesized (when Bob is speaking). This works well in most cases, but there are some minor issues there.
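The pause-while-speaking workaround can be sketched as a simple mute flag (all names hypothetical; the real implementation pauses the recognizer engine itself): recognition results are dropped while synthesis is in progress and accepted again afterwards.

```python
# Sketch of the pause-while-speaking workaround (names hypothetical): the
# recognizer is muted for the duration of the synthesized speech so Bob
# does not try to recognize his own voice.

class Bob:
    def __init__(self):
        self.listening = True

    def speak(self, text):
        self.listening = False       # mute the recognizer
        try:
            pass                     # speech synthesis would happen here
        finally:
            self.listening = True    # resume recognition afterwards

    def on_audio(self, text):
        """Recognizer callback; results are ignored while Bob is talking."""
        if not self.listening:
            return None
        return text

bob = Bob()
bob.speak("The snapshot of CDT158 is 42")
print(bob.on_audio("show me a trend"))  # show me a trend
```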

                                               

                                              The hook into the Wolfram Alpha API is especially impressive. It's easy to get loads of information just by stating your intentions. The really big issue here is that the current 'queries' are predefined rather than 'free'. I still have to work on improving the grammar and on recognizing 'free' text.

                                               

                                              Here is the update on YouTube: http://www.youtube.com/watch?v=WrS7o9C91I8 (720p, sound on)

                                               

                                              Next stop: a major overhaul of the grammar logic, to allow better and more complicated grammars and to extract the right information once the grammar is recognized.

                                                • Re: 24 hour program-a-thon
                                                  Lonnie Bowling

                                                  Excellent job and progress!!  If you just keep going at this pace we can do a product release at vCampus Live :)

                                                   

                                                  So what are you using for your chart control?  That looks nice.

                                                    • Re: 24 hour program-a-thon
                                                      MichaelvdV@Atos

                                                      Thanks!

                                                       

                                                      Not so sure about the product release :) But maybe we can demo it somewhere, and have a look at the code or something.

                                                       

                                                      The chart control is from the WPF Toolkit. It's free and nicely stylable.

                                                       

                                                      Care to give some input on where to go next?

                                                        • Re: 24 hour program-a-thon
                                                          Asle Frantzen

                                                          Michael @ OSIsoft

                                                          Care to give some input on where to go next?

                                                           

                                                           

                                                          I don't know, maybe some augmented reality overlay for the video of yourself?

                                                           

                                                           

                                                           

                                                          Use a gesture to create a square / box, and then display the charts in this overlayed box? Could be fun

                                                            • Re: 24 hour program-a-thon
                                                              MichaelvdV@Atos

                                                              Well... my 24-hour program-a-thon has come to an end.

                                                               

                                                              I think I've programmed about 16-18 of those 24 hours, and I've learned a lot!

                                                              • Having some sort of speech recognition engine seems like a real possibility for creating an easy-to-use interface
                                                              • Our line of work (and the PI System in general) seems to integrate pretty well with speech recognition. I think this is because we deal with lots of small pieces of information (values, attributes, names), not lengthy texts. Asking for values and updates and creating trends all seem to work pretty well.
                                                              • The System.Speech.Recognition namespace offers a lot of possibilities, and is really well set up.
                                                              • The 90-90 rule also applies here (I think this is because I planned this project to be small, so the last few hours were the last 10% of the 'project')
                                                              • This has a lot of potential, and I see a lot of ways this could integrate well with the motion sensing capabilities of Kinect
                                                              • 1L of Redbull gives you a headache
                                                              • When you are really tired, it's better to sleep for a few hours and come back; otherwise you will start to make bad design decisions and you will miss obvious stuff.
                                                              • The bad thing about writing speech recognition software is that you can hardly listen to music while coding. Once you run your application, it will go berserk trying to recognize the music as speech. Maybe this also has to do with my taste in music...

                                                              For me this is certainly something to do more often. At the beginning I knew nothing about the speech recognition library, and now I know stuff :)

                                                               

                                                              I think I will certainly continue to develop and research more on this subject! I'm excited by the enthusiasm of everyone, and I think this means I will have to continue with this....

                                                               

                                                              Thank you everyone! Stay tuned for further updates (although not every few hours...)

                                                                • Re: 24 hour program-a-thon
                                                                  Ahmad Fattahi

                                                                  Your energy and great work are awesome, Michael! I also learned quite a bit through your posts. Maybe another use of this would be "verbal" (as opposed to "manual") data entry into the PI System!

                                                                    • Re: 24 hour program-a-thon
                                                                      Lonnie Bowling

                                                                      I think one of the big challenges of speech is finding good use cases.  So far I have had a pretty poor experience using speech recognition in any meaningful way (as a consumer); it is usually pretty underwhelming.  I think it has to do with the requirements for it to work well, like a noise-free background and clear speech.  That being said, data access could be a bit easier, as you have pointed out.  I think maybe hands-free access to data over a phone while driving would be nice.  I'm wondering if there is a situation where speech and data requests are routine enough for regular use.  Maybe a pilot or an air traffic controller, or maybe a stock trader: someone who needs hands and eyes free, but has a headset.

                                                                       

                                                                      Also, I think that people would not want to say tag names, but rather area/equipment/parameter, like "Bob, show me Dallas plant #2, generator #1 and #4 power output for the last 48 hours."  Bob would then pull up a chart with all power tags for that equipment.  Maybe that would be the strength of this system: people could intuitively query data.  Anyway, just thinking out loud.
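The area/equipment/parameter idea above could be sketched as a lookup from spoken asset paths to tags (the asset tree and tag names below are invented purely for illustration):

```python
# Sketch: resolve a spoken area/equipment/parameter phrase to a tag name,
# instead of making the user say raw tag names (asset tree hypothetical).

ASSETS = {
    ("dallas plant 2", "generator 1", "power output"): "DAL2.GEN1.PWR",
    ("dallas plant 2", "generator 4", "power output"): "DAL2.GEN4.PWR",
}

def resolve(area, equipment, parameter):
    """Map a spoken asset path to the underlying tag name, if any."""
    return ASSETS.get((area.lower(), equipment.lower(), parameter.lower()))

print(resolve("Dallas Plant 2", "Generator 1", "power output"))  # DAL2.GEN1.PWR
```

In a real system the lookup table would come from the AF asset hierarchy rather than a hard-coded dict, which also keeps the spoken grammar predictable.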

                                                                       

                                                                      Great going Michael, you have really excited me with your effort and results!

                                                                        • Re: 24 hour program-a-thon

                                                                          Bugger, busy for a few days and I missed your program-a-thon.

                                                                           

                                                                          I jumped knee-deep into speech recognition with the Kinect too. The problem I hit was not programming the speech handling (I ditched the Microsoft.Speech library too; I wasted too much time trying to get it to work nicely). I think time needs to be put into a model/matrix of the potential voice commands you would want to use.  Some commands would persist throughout the interface, and others only become valid depending on the context of where you are within the interface.  At this point I shut my laptop and got out some pen and paper to map out a commands matrix (still in progress).
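Such a commands matrix might look like this in Python (contexts and commands are hypothetical, sketched from the description above): a set of persistent commands valid everywhere, plus per-context sets that only apply in certain parts of the interface.

```python
# Sketch of a commands matrix (contexts and commands hypothetical): each
# interface context lists the voice commands that are valid in it, plus a
# set of commands that persist throughout the interface.

PERSISTENT = {"help", "stop listening"}
MATRIX = {
    "home":  {"show trend", "show snapshot"},
    "trend": {"zoom in", "zoom out", "change time range"},
}

def valid_commands(context):
    """All commands accepted in the given context."""
    return PERSISTENT | MATRIX.get(context, set())

print("zoom in" in valid_commands("trend"))  # True
print("zoom in" in valid_commands("home"))   # False
```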

                                                                           

                                                                          The next key step is to combine gestures and voice commands. For example, a gesture of "placing your hand on your mouth and blowing a kiss by moving your hand forward" is a trigger to enable voice commands; similarly, "leaving your hand on your mouth and waggling your finger" is the trigger to ignore voice commands.  I have mapped these out on paper too, all at a basic level for now.  If you get this part right, the programming becomes simple.

                                                                           

                                                                          I've seen a few Kinect SDK gesture libraries starting to be released recently, basic gestures only though.

                                                                           

                                                                          The fun part will be deciding how to navigate both the PI system infrastructure and the data... keeping that up my sleeve for now.  Maybe there should be a (global) virtual 24-hour program-a-thon to develop a community project for vCampus?  (Not on Kinect, but on another PI topic, e.g. a PI Web Services AF Data Reference.)

                                                                           

                                                                          By the way, I think the Kinect SDK licence still only permits research & development...

                                                                            • Re: 24 hour program-a-thon

                                                                              Just to add... I agree with Lonnie above.  The focus shouldn't be on tags; instead it should be on Elements & Attributes.

                                                                              • Re: 24 hour program-a-thon

                                                                                Rhys @ Wipro

                                                                                Bugger, busy for a few days and I missed your program-a-thon.
                                                                                Same here! It's too bad you chose my week of vacation to hold this first program-a-thon! Looking forward to the next one (maybe for a Windows 8 Metro style app??)

                                                                                  • Re: 24 hour program-a-thon
                                                                                    MichaelvdV@Atos

                                                                                    Yeah, I know... I should have announced it far earlier. This was really a spontaneous action, because I had the time and the motivation.

                                                                                     

                                                                                    I promise that next time we will announce it much earlier, and maybe even set something up so people can collaborate.

                                                                                     

                                                                                    Steve Pilon

                                                                                    Looking forward to the next one (maybe for a Windows 8 Metro style app??)

                                                                                     

                                                                                    There is definitely going to be a next one! For me it was a great success. I have a Windows 8 installation, but no Windows 8 touch device yet. I'm pretty jealous of the BUILD attendees!

                                                                                      • Re: 24 hour program-a-thon

                                                                                        Michael, I figured you would like this article... mind-blowing stuff!  Add location information into the mix and you can merge real-world assets with the virtual world of real-time data...

                                                                                         

                                                                                        techcrunch.com/.../more-mind-blowing-real-world-kinect-interaction-from-microsoft-research

                                                                                          • Re: 24 hour program-a-thon
                                                                                            MichaelvdV@Atos

                                                                                            Wow, thanks for that link Rhys. That is indeed absolutely mindblowing. If this can be achieved using gaming technology now, I cannot begin to imagine the possibilities...

                                                                                              • Re: 24 hour program-a-thon

                                                                                                Wow, thanks for sharing this Rhys! I can't wait to play a game in my living room and fight these buggers in 3D all around me!

                                                                                                  • Re: 24 hour program-a-thon
                                                                                                    Lonnie Bowling

                                                                                                    Yes, those 6 minutes were well worth 'wasting'.  I can imagine a day when we cannot easily separate the real from the virtual!

                                                                                                      • Re: 24 hour program-a-thon

                                                                                                        Apart from a really expensive game of hide and seek, my vision for this sort of stuff would be for someone to walk around a plant and put on their "PI goggles" (similar effect to beer goggles).  Wearing their intrinsically safe goggles, they would see the actual pieces of equipment as normal, but with the associated measurements, symbols & trends from the virtual world overlaid on the equipment... and they could then interact with that data with their bare hands, wherever they are on the plant (like you saw in that video).

                                                                                                         

                                                                                                        Anyway, back down to reality for now...

                                                                                                          • Re: 24 hour program-a-thon
                                                                                                            MichaelvdV@Atos

                                                                                                            That sounds really great and futuristic! I was really blown away by this. Sometimes people say that Star Trek technology is becoming reality, but I don't believe Gene Roddenberry would have dared to envision this for the (near) future!

                                                                                                             

                                                                                                            However, as with a lot of new technology (including my Kinect efforts), I sometimes have trouble seeing the real added value and benefit. What you are envisioning sounds really great, and I would be excited to see something like that... but what is the added value over an engineer walking around with a tablet computer, assisted by something like RFID for positioning? (Maybe there is some; I'm just trying to start a discussion.)

                                                                                                             

                                                                                                            I really believe in the power and possibilities of body/hand tracking and speech recognition, but if you really look at it, they suit specific applications. As a developer, my work is switching between Visual Studio, Outlook and my browser, and I can't even imagine using a touch device for my daily work, let alone something beyond that.

                                                                                                              • Re: 24 hour program-a-thon
                                                                                                                Lonnie Bowling

                                                                                                                Michael, you bring up a great point; I ask myself that question all the time with new technology.  We are our own worst enemies, as our world view is formed from all our past experiences.  It takes a lot of effort to leave that behind and envision something new that truly replaces an old way of doing things, and it usually takes time.  I think when I first saw the iPhone with pinch/zoom/swipe I thought it was OK and interesting, but I could not have imagined how it would influence my everyday life 3 years later.

                                                                                                                 

                                                                                                                I think augmented reality will be a big part of the future.  It's not obvious at this point how it will get adopted, but that is just one killer app away.  Gaming usually seems to lead the way, but who knows.  I think driving a car has a lot of possible applications.  It is fun to think about, that is for sure!

                                                                                  • Re: 24 hour program-a-thon
                                                                                    cbold

                                                                                    Michael,

                                                                                     

                                                                                    I must say this is very good stuff you are doing here! Don't stop now.