Graduate Researcher and PhD student at the University of Arizona. Specializing in Data Visualization for HPC processes.
Working with Dr. Katherine Isaacs on multiple projects related to the study and development of more usable and more efficient Gantt chart visualizations for showing massively parallel execution traces.
Developed multiple research projects in collaboration with interdisciplinary teams of Computer Science, Earth Science and Cyberinfrastructure researchers. Assisted in the writing of several grants and actively disseminated research and development through conference papers and journal papers. Attended multiple computer science and information science conferences as a representative of our laboratory and presented on the research being done at the NRDC.
Aided in the support and management of research and communication associated with two working groups ("Clusters"): Usability and Envirosensing. Provided technical support to enable online meetings and bi-annual in-person meetings. Presented on personal and collaborative research at online meetings and in-person conferences. Contributed to collaborative research projects and successfully applied for funding for a personal project.
Maintain and develop software for two Windows-based server clusters composed of over 10 physical machines and 21 virtual machines. These dual clusters provide data ingestion, storage and management services for a large interdisciplinary research team funded by an NSF-EPSCoR grant (2013-2018). Maintaining these clusters requires constant communication with many stakeholders, advanced networking skills and a thorough understanding of Windows server software.
In my capacity as a co-instructor alongside Dr. Sergiu Dascalu, I presented multiple full-period lectures to a class of 17 Master's and PhD students on the subject of Human-Computer Interaction, with an emphasis on empirical research. I also assisted with grading, advising and preparing class materials for presentations.
Lead a team of four graders in grading homework, tests and documentation for a year-long, project-oriented undergraduate capstone course composed of two semester-long courses: CS 425, Software Engineering and CS 426, Senior Projects in Computer Science. Over the last two years as a TA for these courses, I lectured several times on course content, presented on my research lab and pitched two projects for students to take on. In 2017/2018 there were 125 students in these classes; in 2018 there are currently 105 students.
Full instructor for two classes on math fundamentals with an emphasis on GED and Accuplacer test preparation. In this position, I prepared lesson plans and class materials, graded homework and exams, and personally tutored and advised students. Each class had about 40 students.
Role: Software architect, advisor and manager to a team of four undergraduate students on a year-long research and development project oriented around utilizing OWL/RDF ontologies for dynamic interface creation.
Role: Advised and managed a team of four undergraduate students on a year-long project with the goal of developing an API that connected the NRDC Database to an EarthCube-funded software package: Cloud Hosted Real Time Data Services for the Geosciences (CHORDS).
Role: Advised and managed a team of two undergraduate students on a project dedicated to developing a mobile game in Unity based around the mathematical concept of "Disk Covering": Overlay.
A data quality control application that will be deployed on the CUAHSI website as a web application to simplify the data quality control process. Developed primarily for an audience of small to medium-sized research teams who disseminate their data products through CUAHSI’s data discovery platform, this service will provide tools for configuring QC tests to run on their data with customizable parameters. Additionally, this platform will provide supervised machine learning functionality to classify the quality of new data from past QC flags.
Developed an ANSI C to MIPS compiler using Python and Python LEX/YACC. Although not fully compliant with ANSI C standards, it is capable of executing recursive function calls, control flow statements (including all varieties of loops), standard assignments, and n-dimensional array accesses/storage.
Formalized an ontology describing the system-level metadata of the NRDC with the goal of utilizing this ontology to generate relational database tables and microservices. Using Python's RDFLib and Python's strong tools for string formatting, relevant information can be stripped out of an ontology and injected into SQL query and Flask templates to create a cyberinfrastructure-in-a-box.
Grant-funded, exploratory, collaborative project between multiple institutions (LTER, NCAR, University of Michigan and the NRDC) with the objective of integrating real-time Quality Control services into the CHORDS platform.
Developed a GPU implementation of a "Robust and Sparse Fuzzy K Means Algorithm" in CUDA and Python. This software was developed to quickly and accurately impute data into quality-controlled time series data. By leveraging GPUs over traditional computation methods, this imputation algorithm saw up to a 185 times speedup while maintaining the high accuracy of Fuzzy K Means imputation. In addition to the GPU implementation of this algorithm, a sequential version was implemented in Python and C to provide a runtime baseline.
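For readers unfamiliar with the technique, the following is a minimal sequential sketch of plain fuzzy k-means imputation in Python. It is not the robust and sparse variant or the GPU code described above; the function names (`memberships`, `impute`), the fuzzifier default `m=2.0`, and the assumption that centroids are already known are all illustrative choices.

```python
import math


def memberships(record, centroids, m=2.0):
    """Fuzzy memberships of one record to each centroid.

    Distances use only the observed (non-None) features of the record.
    """
    observed = [i for i, v in enumerate(record) if v is not None]
    dists = [
        math.sqrt(sum((record[i] - c[i]) ** 2 for i in observed))
        for c in centroids
    ]
    # A record lying exactly on one or more centroids belongs fully to them.
    if any(d == 0.0 for d in dists):
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        total = sum(hits)
        return [h / total for h in hits]
    exponent = 2.0 / (m - 1.0)
    # Standard fuzzy c-means membership: u_j = 1 / sum_k (d_j / d_k)^(2/(m-1))
    return [1.0 / sum((dj / dk) ** exponent for dk in dists) for dj in dists]


def impute(record, centroids, m=2.0):
    """Fill each missing value with the membership-weighted centroid value."""
    u = memberships(record, centroids, m)
    return [
        v if v is not None else sum(uj * c[i] for uj, c in zip(u, centroids))
        for i, v in enumerate(record)
    ]


if __name__ == "__main__":
    # Hypothetical two-cluster example with one missing value.
    print(impute([2.0, None], [[0.0, 0.0], [10.0, 10.0]]))
```

Because the record sits much closer to the first centroid, the imputed value is pulled strongly toward that centroid's value rather than the midpoint.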
Developed a web service and web-based UI that autonomously tests time-series environmental measurements flowing in from multiple remote sites across rural Nevada. The tests performed determine the quality of incoming data without need for constant human examination and intervention. These tests additionally flag problem data with relevant metadata that informs scientists of when, where, and why a data point was flagged. Stakeholders are an interdisciplinary group of cyberinfrastructure researchers from the earth and computer sciences.
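As an illustration of the kind of automated test described above, here is a minimal Python sketch of two common time-series checks, a range test and a repeated-value ("flat line") test, that attach when/where/why metadata to each flag. The `Flag` class, the `run_qc` function, the thresholds, and the site name are hypothetical and are not taken from the deployed service.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Flag:
    timestamp: datetime  # when the problem value was recorded
    site: str            # where it was recorded
    reason: str          # why it was flagged


def run_qc(site, readings, low, high, max_repeats):
    """Flag out-of-range values and values repeated more than max_repeats times.

    readings is an iterable of (timestamp, value) pairs in time order.
    """
    flags = []
    run_length = 0
    previous = None
    for timestamp, value in readings:
        if not (low <= value <= high):
            flags.append(Flag(timestamp, site, f"out of range [{low}, {high}]"))
        # Count consecutive identical values to detect a stuck sensor.
        run_length = run_length + 1 if value == previous else 1
        if run_length > max_repeats:
            flags.append(
                Flag(timestamp, site, f"repeated more than {max_repeats} times")
            )
        previous = value
    return flags


if __name__ == "__main__":
    start = datetime(2018, 1, 1)
    values = [10.0, 10.5, 99.0, 12.0, 12.0, 12.0, 12.0]
    readings = [(start + timedelta(minutes=10 * i), v) for i, v in enumerate(values)]
    for flag in run_qc("example-site", readings, low=-40.0, high=50.0, max_repeats=3):
        print(flag.timestamp, flag.site, flag.reason)
```

Keeping each flag as a small record, rather than silently dropping the value, leaves the decision about repair or exclusion to the scientist reviewing the data.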
Citation: Rui Wu, Connor Scully-Allison, Moinul Hossain Rifat, Jose Thomas Painumkal, Sergiu Dascalu, and Frederick C Harris, Jr., “Virtual Watershed System: A Web-Service-Based Software Package for Environmental Modeling.” Accepted to: Advances in Science, Technology and Engineering Systems Journal (ASTESJ)
Abstract: The physically-based environmental model is a crucial tool used in many scientific inquiries. With physical modeling, different models are used to simulate real world phenomena and most environmental scientists use their own devices to execute the models. A complex simulation can be time-consuming with limited computing power. Also, sharing a scientific model with other researchers can be difficult, which means the same model is rebuilt multiple times for similar problems. A web-service-based framework to expose models as services is proposed in this paper to address these problems. The main functions of the framework include model executions in cloud environments, NetCDF file format transmission, and model resource management with various web services. As proof of concept, a prototype is introduced, implemented and compared against existing similar software packages. Through a feature comparison with equivalent software, we demonstrate that the Virtual Watershed System (VWS) provides superior customization through its APIs. We also indicate that the VWS uniquely provides friendly, usable UIs enabling researchers to execute models.
Citation: Connor Scully-Allison, 2019. Keystone: A Streaming Data Management Model for the Environmental Sciences (Master's thesis).
Abstract: In this thesis we explore a problem facing contemporary data management in the earth and environmental sciences: effective production of uniform and quality data products which keeps pace with the volume and velocity of continuous data collection. The process of creating a quality data product is non-trivial and this thesis explores in detail what knowledge is required to automate this process for emerging mid-scale efforts and what prior attempts have been made towards this goal. Furthermore, we propose a model which can be used to develop a mid-scale data product pipeline in the earth and environmental sciences: Keystone. Specifically, by automating Quality Assurance, Quality Control and Data Repair processes, data products can be created at a rate that keeps pace with the production of data itself. To prove the effectiveness of this model, three software applications that fulfilled each of the key roles suggested by the Keystone model were conceived, implemented and validated individually. These three applications are the NRDC Quality Assurance Application, the Near Real Time Autonomous Quality Control Application (NRAQC), and the Improved Robust and Sparse Fuzzy K Means (iRSFKM) imputation algorithm. Respectively, they provide the functionalities of metadata management and binding through a multi-platform mobile application; automated data quality control with the help of a dynamic web application; and rapid data imputation for data repair. The latter leverages multi-GPU processing to add significant speed to a high accuracy algorithm. The NRDC Quality Assurance application was validated with the aid of a directed …
Citation: Connor Scully-Allison, Vinh Le, Eric Fritzinger, Scotty Strachan, Frederick C. Harris, Jr., and Sergiu M. Dascalu “Near Real-time Autonomous Quality Control for Streaming Environmental Sensor Data,” Procedia Computer Science, Vol 126, pg 1656-1665, 2018.
Abstract: In this paper, we present a novel and accessible approach to time-series data validation: the Near-Real Time Autonomous Quality Control (NRAQC) system. The design, implementation, and impacts of this software are explored in detail within this paper. This software system, created in close conference with environmental scientists, leverages microservice design patterns employed for high volume web applications to develop a contemporary solution to the problem of data quality control with streaming sensor data. Through a comparative analysis between NRAQC and the GCE Toolbox, we argue that the web-based deployment of QC software enhances accessibility to crucial tools required to make a robust and useful data product from raw measurements. Additionally, a key innovation of the NRAQC platform is its positive impact on modern data management practices and quality data dissemination.
Citation: Connor Scully-Allison, Hannah Munoz, Vinh Le, Scotty Strachan, Eric Fritzinger, Frederick C. Harris, Jr., and Sergiu Dascalu “Advancing Quality Assurance Through Metadata Management: Design and Development of a Mobile Application for the NRDC,” International Journal of Computers and Their Applications, Vol 25, No 1, pg 20-29, March 2018.
Abstract: In this paper we present the design, implementation, and impacts of a cross-platform mobile application that facilitates the collection of metadata for in-situ sensor networks and provides tools assisting Quality Assurance processes on remote deployment sites. Created in close conjunction with scientists and data managers working on environmental sensor networks, this paper details the software requirements, specifications, and implementation details enabling the recreation of such an application. In a discussion on how this software improves on existing techniques of logging contextual metadata and quality assurance information, we show that this application represents a significant improvement over existing methods. Specifically, the proposed application allows for the near-real time update and centralized storage of contextual metadata. Compared to prior methods of logging, often physical notebooks with pen and paper or program comments on embedded field sensors, the method proposed in this paper allows for contextual information to be more tightly bound to existing data sets, ensuring use of collected data past the lifetime of a specific research project.
Citation: Connor Scully-Allison, Sergiu M. Dascalu, Rui Wu, Lee Barford, Frederick C Harris, Jr. “Data Imputation With an Improved Robust and Sparse Fuzzy K-Means Algorithm.” Submitted to: Proceedings of the 16th International Conference on Information Technology: New Generations (ITNG 2019) April 1-3, 2019, Las Vegas, NV.
Abstract: Missing data may be one of the biggest problems hindering modern research science. It occurs frequently, for various reasons, and slows down crucial data analytics required to answer important questions related to global issues like climate change and water management. The modern answer to this problem of missing data is data imputation. Specifically, data imputation with advanced machine learning techniques. Unfortunately, an approach with demonstrable success for accurate imputation, Fuzzy K-Means Clustering, is famously slow compared to other algorithms. This paper aims to remedy this foible of such a promising method by proposing a Robust and Sparse Fuzzy K-Means algorithm that operates on multiple GPUs. We demonstrate the effectiveness of our implementation with multiple experiments, clustering real environmental sensor data. These experiments show that our improved multi-GPU implementation is significantly faster than sequential implementations with 185 times speedup over 8 GPUs. Experiments also indicated greater than 300x increase in throughput with 8 GPUs and 95% efficiency with two GPUs compared to one.
Citation: Vineeth Rajamohan, Connor Scully-Allison, Sergiu Dascalu, and David Feil-Seifer. "Factors Influencing The Human Preferred Interaction Distance." In IEEE International Conference on Robot and Human Interactive Communication (RO-MAN), New Delhi, India, Oct 2019.
Abstract: Nonverbal interactions are a key component of human communication. Since robots have become significant by trying to get close to human beings, it is important that they follow social rules governing the use of space. Prior research has conceptualized personal space as physical zones which are based on static distances. This work examined how preferred interaction distance can change given different interaction scenarios. We conducted a user study using three different robot heights. We also examined the difference in preferred interaction distance when a robot approaches a human and, conversely, when a human approaches a robot. Factors included in quantitative analysis are the participants' gender, robot's height, and method of approach. Subjective measures included human comfort and perceived safety. The results obtained through this study show that robot height, participant gender and method of approach were significant factors influencing measured proxemic zones and accordingly participant comfort. Subjective data showed that experiment respondents regarded robots in a more favorable light following their participation in this study. Furthermore, the NAO was perceived most positively by respondents according to various metrics and the PR2 Tall, most negatively.
Citation: Ryan Devaney, Sanya Gupta, Vinh Le, Connor Scully-Allison, Frederick C. Harris, Jr., and Sergiu Dascalu “Overlay: an Educational Disc Covering Puzzle Game” in Proceedings of the ISCA 27th International Conference on Software Engineering and Data Engineering (SEDE 2018) October 8-10, New Orleans, LA, pp. 30-36.
Abstract: In the last decade, video games have quickly become a major contender against present media standards, like movies and music. While video games often serve as a form of entertainment, they have had a long history of being used as an engaging tool for education. However, video games that are more tailored towards education tend to fail in compelling their audience to play as strongly as their non-educational cousins. To address this problem, this paper presents Overlay: an educationally oriented video game that is designed to teach basic problem-solving skills while maintaining high levels of engaging entertainment. Overlay's core gameplay mechanic derives from Richard Kershner's Disk Covering problem, where users use basic geometry to solve a positioning problem and progress through the game. As users progress through the game, the levels become increasingly challenging. Progress is tracked through the game and the top-ranking scores are stored online through a web service handling communication to the main data source. These gameplay aspects allow Overlay to seamlessly blend STEM education with viscerally enjoyable entertainment.
Citation: Pattaphol Jirasessakul, Zachary Waller, Paul Marquis, Vinh Le, Connor Scully-Allison, Scotty Strachan, Frederick C. Harris, Jr., Sergiu M. Dascalu “Generalized Software Interface for CHORDS,” in Proceedings of the ISCA 27th International Conference on Software Engineering and Data Engineering (SEDE 2018), October 8-10, New Orleans, LA.
Abstract: In the physical sciences, the observation and analysis of environmental readings, such as wind speed, sap flow, atmospheric pressure, temperature, and precipitation, benefit greatly from real-time visualization as it allows environmental scientists to create faster actionable intelligence. However, the scarcity of easily accessible and customizable real-time visualization software often creates logistical problems for researchers focused in environmental sciences. The goal of this paper is to present an alternative approach, based on usability and open source software, for the Nevada Research Data Center (NRDC) to visualize environmental data in near-real time and confirm its viability for usage with other research projects of similar size. This approach involves creating an open-source near-real time interface to act as middle-ware between the NRDC’s data repository and CHORDS, a cloud-hosted data visualization package. While this interface is primarily being built for the NRDC, there is an emphasis on tooling it to be as generalized and generic as possible.
Citation: Hannah Munoz, Connor Scully-Allison, Vinh D. Le, Scotty Strachan, Frederick C. Harris, Jr., and Sergiu M. Dascalu “A Mobile Quality Assurance Application for the NRDC” in Proceedings of the ISCA 26th International Conference on Software Engineering and Data Engineering (SEDE 2017) October 2-4, San Diego, CA, pp. 30-36.
Abstract: In this paper we present the design, implementation, and impacts of a cross-platform mobile application that facilitates the collection of metadata for grassroots-level sensor networks and provides tools for Quality Assurance processes on remote deployment sites. Created in close conjunction with scientists working on environmental sensor networks and data management experts, this paper details the software requirements, specifications, and implementation details required to construct such an application. In a discussion on how this software improves on existing techniques of logging contextual metadata and quality assurance information, it is shown that this application represents a significant improvement over existing methods. Specifically, the proposed application allows for the near-real time update and centralized storage of contextual metadata. Compared to prior methods of logging, often physical notebooks with pen and paper or program comments on embedded field sensors, the method proposed in this paper allows for contextual information to be more tightly bound to existing data sets, ensuring use of collected data past the lifetime of a specific research project.
Citation: Connor F. Scully-Allison, Hirav Parekh, Frederick C Harris, Jr., and Sergiu M. Dascalu “Analysis of User Experience and Performance at Initial Exposure to Novel Keyboard Input Methods” in Proceedings of the 2017 International Conference on Computers and Their Applications (CATA 2017) March 20-22, 2017, Waikiki, HI, pp. 72-78.
Abstract: We evaluated user performance and user experience with two novel input methods for mobile devices: Minuum and MessagEase. Subjects used a Qwerty keyboard to give a performance baseline. We compared input speeds, error rates, and keystroke counts among all three inputs to understand what factors discourage continued use or widespread adoption of new keyboard formats. It was found that MessagEase performed poorly upon initial exposure in terms of speed and error rate at 1482 mSec per character and 35.75% errors per line. Being 82.8% slower and 81.1% more error prone than Qwerty, there was a strong correlation between negative opinions of MessagEase and user performance. For Minuum, the performance gap was less significant at 32.1% slower speeds and 50.9% greater error rate. Accordingly, Minuum was correlated with a better overall user experience compared to MessagEase.