Sushil J Louis, Anil Shankar
Evolutionary Computing Systems Laboratory (ECSL)
Department of Computer Science and Engineering
Introduction
Computers today use an internal clock, keyboard, and mouse to provide input or context for their applications' information processing. Operating systems support these devices, and applications access the context they provide through simple Application Programming Interfaces (APIs). Current application software uses this meager context to build user models and try to enhance personal productivity. There is no personalization through learning and no long-term memory, and advances in vision, speech, text analysis, and the availability of cheap computing power have not been fully utilized. That is, many current computer applications lack context-awareness.
Dey [5] gives the following definition for context.
Any information that can be used to characterize the situation of entities (i.e., whether a person, place or object) that are considered relevant to the interaction between a user and an application.
In our work, we view a computer as a stationary robot with simple sensors, such as sensors for motion and speech [11]. Even without knowing who is there or what is being said, such simple sensors can be used to improve user interaction. For example, if you were Jane's user interface, you could learn answers to the following questions.
We propose to use simple sensors to continuously gather data on the computer system's internal and external environment, store this data in a data warehouse, and mine it for useful user-behavior patterns in order to better predict user preferences (behavior) and improve user interaction. Applications can then use this learned model of user preferences to interact better with the user. In this paper, we use a simple calendaring application program, Sycophant, which stores appointments and reminds the user using different types of reminders, as a test-bed to investigate these issues. Specifically, we investigate whether Sycophant can learn a mapping from context features to reminder type. More generally, we are interested in whether applying machine learning techniques to data gathered from simple context sensors will lead to improved human-computer interfaces.
Our system continuously gathers binary activity data from the keyboard,
mouse, a motion detector, and a speech sensor. We also monitor the
activity of five processes on the computer. Whenever Sycophant
generates a reminder, it expects the user to indicate whether
Sycophant used the correct reminder type. A reminder can be visual (a
pop-up window), speech using a text-to-speech system, both, or
neither. Periodically, we run a machine learning algorithm on the
gathered data merged with this user feedback to learn to predict which
of the above four types of reminders to generate for an appointment.
Preliminary results using Sycophant with external (motion, speech) and
internal (keyboard, mouse activity) sensors and a decision tree
machine learning algorithm lead to about
accuracy in
predicting whether or not to generate a reminder. Correctly predicting
which of the four different types of reminders to generate is less
accurate at about
.
Related Work
Much work has been done in the area of context-aware applications and environments. Reba is a reactive system that creates context-aware room reactions by using information from cameras, microphones and other sensors [11]. This work showed that systems need to be context-aware in order to anticipate user actions and simplify user interaction.
Bailey and Adamczyk have shown that computer-generated interruptions that require user input or feedback have a disruptive effect on the user's emotional state as well as on the user's task performance [2]. Their study showed that, at the point of interruption, the degree of disruption depends upon the user's mental load. Their work implied that the user's attention must be carefully managed among competing applications, and that this management is necessary to mitigate the disruptive effects of necessarily interrupting a user.
Hudson, Fogarty, Atkeson et al.'s work comes closest to our own in
exploring how to construct robust sensor-based predictions of
interruptibility by conducting a Wizard of Oz
study [10]. They also considered which sensors
might be useful and how they could be constructed. In their study
they used experience sampling to collect self-reports of
interruptibility. Next, they built statistical models predicting human
interruptibility and achieved an overall accuracy of
using
several models. The self-reports from their initial Wizard of Oz
study, where a subject was asked to distinguish between different
levels of interruptibility showed that it is possible for humans to be
accurate to an extent of
and statistical models could achieve
as much as
accuracy. In their next study, they used real
sensors to construct models of human interruptibility for three
different groups of people: interns, managers and
researchers [6]. This study also tried to
determine how much data should be collected to provide statistically
reliable estimates of interruptibility.
Horvitz and Apacible built models for predicting the cost of interrupting users [9]. For this purpose, they used machine learning techniques for generating statistical models to infer the state of interruptibility of users.
Our work is complementary to the above approaches. Sycophant learns whether or not to interrupt the user as well as how to interrupt the user. Like Fogarty et al., we use real sensors, but in addition to learning whether to interrupt the user, Sycophant uses machine learning algorithms to learn which one of four different types of reminders to use in interrupting the user. In this work, we compare the performance of different algorithms as well as the effect of different sensors on learning the mapping from sensors to reminder type.
Sycophant can generate four different types of reminders: a simple pop-up window containing the appointment text; a voice reminder, where the appointment text is spoken using the Festival Speech Synthesis System [3]; both of the previous types; and neither. In the last case, no reminder is generated immediately; it is instead buffered for later output. This can be desirable behavior, for example, when there is no one in the room at the time the reminder is generated. On the other hand, it can also be quite annoying if your calendar ``learns'' not to remind you under certain conditions.
The appointments for Sycophant were set up to mimic the user's regular work-day. These included reminders for drinking coffee, attending talks, conferences, classes, some personal appointments, and reminders outside regular office hours. For example, a reminder for watching a soccer match on cable TV at two a.m. would fall outside of regular office hours. Sycophant in this case learned a rule which said that if the appointment time was before nine a.m., then no reminder was to be generated. This context learning was performed with respect to the user under study, whose regular office hours start at nine a.m. Figure 1 shows a screen-shot of the application.
Figure 2 depicts Sycophant's architecture. The calendaring
application runs as a separate process and five sensors collect data on
the computer and immediate vicinity. These sensors are binary: for
example, when the motion sensor detects motion it reports a value of
1, and 0 otherwise. Our sensors are:
For this preliminary feasibility study, we
collected data from a single user over a period of six weeks.
Every fifteen seconds, we checked all sensors for activity and stored these values to a file. Next, we extracted the following six features from the raw
data [10]:
Any5, if the sensor is active during any of the fifteen-second intervals during the last five minutes.
All5, if the sensor is active during all of the fifteen-second intervals during the last five minutes.
Any1, if the sensor is active during any of the fifteen-second intervals during the last minute.
All1, if the sensor is active during all of the fifteen-second intervals during the last minute.
Immed, if the sensor is active during the last fifteen-second interval.
Count, the number of intervals during which the sensor is active during the last five minutes.
Therefore every sensor provided six features. We considered
each of the five user processes as a separate ``sensor,'' so the number of sensors grew to nine and we ended up with a total of
54 features (nine sensors, six features each).
Finally, we also included a user identifier and the next appointment time.
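As an illustration, the six per-sensor features can be computed from a window of fifteen-second samples. The following is a minimal sketch, not Sycophant's actual code: the function name `extract_features`, the oldest-first sample list, and the handling of windows shorter than five minutes are assumptions made for this example.

```python
from typing import Dict, Sequence

SAMPLES_PER_MINUTE = 4      # one sample every fifteen seconds
SAMPLES_PER_FIVE_MIN = 20   # five minutes of fifteen-second samples


def extract_features(samples: Sequence[int]) -> Dict[str, int]:
    """Derive the six features for one binary sensor.

    `samples` holds readings (1 = active, 0 = inactive), oldest first,
    one per fifteen-second interval. Only the last five minutes are used.
    Assumption: an incomplete window never counts as "all active".
    """
    last5 = list(samples[-SAMPLES_PER_FIVE_MIN:])
    last1 = last5[-SAMPLES_PER_MINUTE:]
    return {
        "Count": sum(last5),
        "All5": int(len(last5) == SAMPLES_PER_FIVE_MIN and all(last5)),
        "Any5": int(any(last5)),
        "All1": int(len(last1) == SAMPLES_PER_MINUTE and all(last1)),
        "Any1": int(any(last1)),
        "Immed": int(bool(last5) and last5[-1] == 1),
    }


# A sensor that was active for seven intervals early in the window
# and quiet for the most recent thirteen intervals.
talk = extract_features([1] * 7 + [0] * 13)
```

With nine sensors, calling such a function once per sensor yields the 54 sensor features; the user identifier and appointment time are added separately.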
Sycophant can be instructed to remind a user
minutes before a
scheduled appointment. When it is time to remind a user (say
minutes before the appointment time), Sycophant initially checks for
the existence of a learned user model (a mapping of context features
to reminder type). If a model exists, Sycophant uses the reminder type dictated by the
model to remind the user. Initially, when no model has been learned,
we use a hand-coded rule set. This static hand-coded rule set is used
until we get the minimum of ten exemplars needed by Weka
(10-fold cross-validation requirement). Once the reminder is
generated, the user can give feedback to Sycophant, either agreeing with the
reminder type generated or providing their preference. It is this
user feedback that is used to create the training data set for our
machine learning algorithms.
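The choice between the learned model and the hand-coded rule set can be sketched as follows. This is an illustrative sketch only: the names `choose_reminder` and `example_rules`, the feature key `Motion-Any5`, and the example rule itself are hypothetical, since the paper does not list the hand-coded rules. Only the meaning of type 0 (no reminder) and the ten-exemplar threshold come from the text.

```python
from typing import Callable, Mapping, Optional

MIN_EXEMPLARS = 10  # ten exemplars are needed before a model is learned

Features = Mapping[str, float]
Model = Callable[[Features], int]  # maps context features to a reminder type


def choose_reminder(
    features: Features,
    model: Optional[Model],
    num_exemplars: int,
    fallback_rules: Model,
) -> int:
    """Use the learned model when one is available, else the static rules."""
    if model is not None and num_exemplars >= MIN_EXEMPLARS:
        return model(features)
    return fallback_rules(features)


def example_rules(features: Features) -> int:
    """Hypothetical stand-in for the hand-coded rule set (not from the paper).

    Returns 0 (no reminder) when no motion was seen in the last five
    minutes, and an arbitrary non-zero reminder type otherwise.
    """
    return 1 if features.get("Motion-Any5", 0) else 0


# Before any feedback has been collected, the static rules decide.
first_choice = choose_reminder({"Motion-Any5": 1}, None, 0, example_rules)
```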
Here is an exemplar from our data set:
User1, 05.00, 0, 0, 0, 0, 0, 0, 7, 0, 1, 0, 0, 0, 20, 1, 1, 1, 1, 1, 20, 1, 1, 1, 1, 1, 20, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 20, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
where the first two features correspond to User-Id and Time of Appointment. The remaining features, in groups of six, represent Sensor-Count, Sensor-All5, Sensor-Any5, Sensor-All1, Sensor-Any1 and Sensor-Immed. The sensors are ordered as follows: Motion, Talk, Process, Keyboard and Mouse. Each of these six features is derived for each sensor mode. The last value in the above data row corresponds to the type of reminder actually preferred by the user, which is obtained through user feedback.
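To make the row layout concrete, the exemplar can be unpacked into its user identifier, appointment time, per-sensor feature groups, and class label. This is a minimal sketch; the function name `parse_exemplar` and the dictionary representation are assumptions for illustration, and the row is split across string literals only for readability.

```python
from typing import Dict, List, Tuple

# Order of the six features inside each sensor group, as stated in the text.
FEATURE_NAMES = ["Count", "All5", "Any5", "All1", "Any1", "Immed"]

EXEMPLAR = (
    "User1, 05.00, "
    "0, 0, 0, 0, 0, 0, "
    "7, 0, 1, 0, 0, 0, "
    "20, 1, 1, 1, 1, 1, "
    "20, 1, 1, 1, 1, 1, "
    "20, 1, 1, 1, 1, 1, "
    "0, 0, 0, 0, 0, 0, "
    "20, 1, 1, 1, 1, 1, "
    "0, 0, 0, 0, 0, 0, "
    "0, 0, 0, 0, 0, 0, "
    "0"
)


def parse_exemplar(row: str) -> Tuple[str, str, List[Dict[str, int]], int]:
    """Split one comma-separated exemplar into its parts."""
    fields = [field.strip() for field in row.split(",")]
    user_id, appt_time = fields[0], fields[1]
    label = int(fields[-1])                       # preferred reminder type
    values = [int(v) for v in fields[2:-1]]       # sensor features only
    width = len(FEATURE_NAMES)
    if len(values) % width != 0:
        raise ValueError("sensor features must come in groups of six")
    groups = [
        dict(zip(FEATURE_NAMES, values[i:i + width]))
        for i in range(0, len(values), width)
    ]
    return user_id, appt_time, groups, label


user_id, appt_time, groups, label = parse_exemplar(EXEMPLAR)
```

Here `groups` contains nine dictionaries, one per sensor, and `label` is 0, meaning the user preferred no reminder for this appointment.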
Once the context-sensitive data set is available, it is used to build a user model to learn the type of reminder to generate. We use and compare multiple machine learning algorithms from the Weka machine learning tool-kit for this purpose.
As an aside, in the current version of the program, the reminders
that the model votes to be of type 0 (do not
interrupt/remind) get buffered. When it is time to use a reminder of
a type other than 0, the buffered reminders are also output along with the
reminder for that particular instant of time. Thus no appointments are
ever ``forgotten.''
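The buffering behavior can be sketched as a small class. This is an assumption-laden illustration rather than Sycophant's implementation: the class name `ReminderBuffer`, the `deliver` callback, and the choice to output held reminders before the current one (using the current reminder's type) are not specified in the paper.

```python
from typing import Callable, List

NO_REMINDER = 0  # type 0: do not interrupt/remind


class ReminderBuffer:
    """Hold type-0 reminders until a reminder of another type is issued."""

    def __init__(self, deliver: Callable[[str, int], None]) -> None:
        self._deliver = deliver          # hypothetical output callback
        self._pending: List[str] = []

    @property
    def pending(self) -> List[str]:
        """Copy of the reminders currently held back."""
        return list(self._pending)

    def handle(self, text: str, reminder_type: int) -> None:
        if reminder_type == NO_REMINDER:
            self._pending.append(text)   # buffer instead of interrupting
            return
        # Assumption: held reminders go out first, with the current type.
        for held in self._pending:
            self._deliver(held, reminder_type)
        self._pending.clear()
        self._deliver(text, reminder_type)


delivered = []
buffer = ReminderBuffer(lambda text, kind: delivered.append((text, kind)))
buffer.handle("coffee", NO_REMINDER)   # held back, nothing delivered yet
buffer.handle("seminar", 2)            # flushes "coffee", then "seminar"
```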
Results
For our study, we chose the following machine learning algorithms from
the Weka tool-kit: Zero-R, One-R, J48, Bagging,
LogitBoost and NaiveBayes. Zero-R simply predicts
the majority class in categorical data or average class if the class
is numeric. One-R generates a one level decision tree which
tests only one particular attribute and forms a set of rules based
only on that attribute. J48 builds a C4.5 decision
tree [12]. Bagging creates
artificial data
sets from the original data set and applies a decision tree inducer on
each of them. The
generated classifiers then vote for the class to
be predicted. LogitBoost uses a learning algorithm for
numeric prediction and a combined model is formed which is then used
for classification [7]. NaiveBayes
selects the most likely classification based on a set of attribute
values using prior probabilities and conditional densities of the
individual features.
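To make the two simplest baselines concrete, here is a minimal sketch of Zero-R and One-R for nominal attributes. It is written from the descriptions above and is not Weka's code; in particular, it does not discretize numeric attributes as a full One-R implementation would, and the function names and toy data are assumptions.

```python
from collections import Counter, defaultdict
from typing import Dict, Hashable, List, Sequence, Tuple


def zero_r(labels: Sequence[Hashable]) -> Hashable:
    """Predict the majority class, ignoring all attributes."""
    return Counter(labels).most_common(1)[0][0]


def one_r(
    rows: Sequence[Sequence[Hashable]], labels: Sequence[Hashable]
) -> Tuple[int, Dict[Hashable, Hashable], int]:
    """Pick the single attribute whose value-to-class rule errs least.

    Returns (attribute index, rule mapping value -> class, training errors).
    """
    best = None
    for attr in range(len(rows[0])):
        # Count class frequencies for every value of this attribute.
        counts: Dict[Hashable, Counter] = defaultdict(Counter)
        for row, label in zip(rows, labels):
            counts[row[attr]][label] += 1
        rule = {value: c.most_common(1)[0][0] for value, c in counts.items()}
        errors = sum(sum(c.values()) - c[rule[v]] for v, c in counts.items())
        if best is None or errors < best[2]:
            best = (attr, rule, errors)
    return best


# Toy data (made up): attribute 0 separates the classes, attribute 1 does not.
toy_rows: List[Tuple[int, int]] = [(1, 0), (1, 1), (0, 0), (0, 1)]
toy_labels = [1, 1, 0, 0]
attr, rule, errors = one_r(toy_rows, toy_labels)
```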
We also constructed three other data sets with reduced numbers of
features after ranking the individual features based on the
information gain ratio [12] from decision tree
induction. The top
features are considered in one set (Set 1),
the top
in the next set (Set 2), and the top
in the last set
(Set 3). Next, we compared the performance of J48 on these
data sets against the complete data set (Set 0) with all the
features. We provide the top 24 features below in order of information gain:
Keybd-Count5, Keybd-Any5, Mouse-Count5, Mouse-Any5, Keybd-Any1, Mouse-Any1, Keybd-Immed, Mouse-Immed, Motion-Count5, Motion-Any5, Motion-Any1, ApptTime, Mouse-All1, Keybd-All1, Motion-Immed, P3-Count5, P1-Count5, P2-Count5, P1-All5, P2-All5, P4-All5, P4-Count5, Talk-Count5, Motion-All1.
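Ranking features by information gain ratio can be sketched as follows. This is a simplified illustration that treats every feature value as nominal; it is not the procedure Weka or C4.5 applies to numeric features such as the counts, and the toy columns below are made up rather than taken from our data.

```python
import math
from collections import Counter
from typing import Dict, Hashable, List, Sequence


def entropy(values: Sequence[Hashable]) -> float:
    """Shannon entropy (in bits) of a sequence of nominal values."""
    total = len(values)
    return -sum(
        (n / total) * math.log2(n / total) for n in Counter(values).values()
    )


def gain_ratio(feature: Sequence[Hashable], labels: Sequence[Hashable]) -> float:
    """Information gain of `feature` about `labels`, divided by split info."""
    total = len(labels)
    subsets: Dict[Hashable, List[Hashable]] = {}
    for value, label in zip(feature, labels):
        subsets.setdefault(value, []).append(label)
    remainder = sum(len(s) / total * entropy(s) for s in subsets.values())
    info_gain = entropy(labels) - remainder
    split_info = entropy(feature)
    return info_gain / split_info if split_info > 0 else 0.0


def rank_features(
    columns: Dict[str, Sequence[Hashable]], labels: Sequence[Hashable]
) -> List[str]:
    """Feature names sorted from highest to lowest gain ratio."""
    return sorted(
        columns, key=lambda name: gain_ratio(columns[name], labels), reverse=True
    )


# Toy columns (hypothetical values): the first predicts the label, the second does not.
toy_columns = {"Keybd-Any5": [0, 0, 1, 1], "Talk-Any5": [0, 1, 0, 1]}
ranking = rank_features(toy_columns, [0, 0, 1, 1])
```

Reduced data sets such as Set 1 to Set 3 would then keep only the first few names of such a ranking.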
Figure 3 shows the performance of J48 on
different data sets. J48 correctly classified
of the
instances and generated
rules for the complete data set (with
features). The algorithm also correctly classified
of
the instances on the reduced data set with
features with
rules being generated. The figure shows the relative performance of
the decision tree inducer on our four data sets and there seems to be
little performance degradation even with only
features. We chose
to use Set 2 with
features for further study.
The confusion matrices for the two class problems are given in
Table 3 and Table 4.
J48 improved its performance to
in case of the data
set having all the features and to
on the reduced feature
data set. Clearly Sycophant can more accurately predict whether or
not to generate a reminder. Predicting which reminder type to generate
seems harder and this remains an area of active research in our group.
Finally, removing motion and speech features from Set 0 resulted in a
statistically significant decrease in prediction accuracy on the four
class problem. This clearly points to the importance of paying
attention to the computer system's environment (external context) in
improving user interaction.
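The relationship between the four class and two class problems can be illustrated with a short sketch that collapses reminder types into "no reminder" versus "some reminder" and tabulates a confusion matrix. Only type 0 (no reminder) is fixed by the text; the numbering of the other three types and the toy predictions below are assumptions made for this example, not results from our study.

```python
from collections import Counter
from typing import List, Sequence


def to_two_class(reminder_type: int) -> int:
    """Collapse the four reminder types into 0 (none) or 1 (any reminder)."""
    return 0 if reminder_type == 0 else 1


def confusion_matrix(
    actual: Sequence[int], predicted: Sequence[int], classes: Sequence[int]
) -> List[List[int]]:
    """Rows are actual classes, columns are predicted classes."""
    counts = Counter(zip(actual, predicted))
    return [[counts[(a, p)] for p in classes] for a in classes]


def accuracy(matrix: Sequence[Sequence[int]]) -> float:
    """Fraction of instances on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0


# Made-up labels: types 1..3 stand for the three non-empty reminder types.
actual = [0, 1, 2, 3, 0]
predicted = [0, 2, 2, 1, 3]

four_class = confusion_matrix(actual, predicted, [0, 1, 2, 3])
two_class = confusion_matrix(
    [to_two_class(a) for a in actual],
    [to_two_class(p) for p in predicted],
    [0, 1],
)
```

A prediction of the wrong non-empty reminder type counts as an error in the four class matrix but as correct in the two class matrix, which is one reason the two class accuracy can only be equal or higher for the same predictions.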
Although the decision trees generated for the user model could not be included in the limited space available, we would like to note the following. Keyboard, Mouse, Motion and Talk give the most useful information, as evidenced by the tree generated for data Set 0 as well as by the ranking of individual features based on the information gain ratio criterion. On the complete data set with four classes of reminders, the decision tree constructed an interesting rule with a keyboard feature (Keyboard-Any5) chosen as the root node. The start of the user's working hours, at approximately 9 a.m., is the next most useful feature. Next is talk count. For example, if the talk count was greater than 2, there was no motion in the last minute, and the appointment time was later than 12.20 (lunch time), then both types of reminders were generated, because the user is usually in a comatose state after lunch, did not care which type of reminder she got, and often chose both. On the reduced feature data set, for the four classes of reminders, J48 again constructed an interesting rule with Keyboard-Any5 at the root node: if the talk count in the last five minutes was greater than 2 and there was keyboard activity in the last minute, then generate a voice reminder. This seemed to make sense to the user, in that she had just started to make heavy use of the keyboard and therefore preferred a soothing voice reminder over a more distracting pop-up window.
Conclusions and Future Work
In this paper, we investigated an approach to building a
context-learning user interface application using information from
simple sensors that detected internal (keyboard, mouse, and process
activity) and external (motion, speech) context. Our calendaring
application, Sycophant, used machine learning techniques to learn,
based on this context information, a mapping from sensor values to
reminder types. We obtained
accuracy in learning to choose
between four reminder types (four class problem); more impressively,
we were able to obtain
accuracy for the task of learning
whether or not to generate a reminder (two class problem). We found
that simple sensor information, such as the existence of motion and speech
in the user's vicinity along with keyboard and mouse activity, is
useful for learning the mapping from sensor values to reminder types.
We are now gathering more data from different groups of users and are considering the suitability of other applications for our context-learning approach. We are also investigating additional sensors and variations of the current sensors to increase the performance of the system. Finally, we would like to investigate adaptive user interfaces that combine expert-generated rules with machine-learned rules in genetics-based machine learning systems [8].
Acknowledgments
This work was supported in part by contract number N00014-03-1-0104 from the Office of Naval Research.
The translation was initiated by Sushil Louis on 2005-01-06