Exploratory Research for Machine Learning
How to proceed with ML research for new and fun stuff
When we research something, we look to gain knowledge and answers to open questions. The idea is to follow a process (the scientific method) that leads to an outcome, but that process can be shaped by the desired results and by the nature of the problem itself.
While there are plenty of books and articles about research methods for well-characterized problems, they usually target research in medicine, psychology, economics, and the natural and political sciences (among others). During my doctoral studies, I found that the books I had to review as part of the program had a strong focus on psychology. That was challenging because I was trying to do research on information systems, particularly machine learning, which is a quantitative discipline. Although the books in the program covered practically every methodology, they still left me unsure about which one to use if I wanted to research how to build a self-driving car or train a convolutional neural network to identify Covid in X-rays.
This post reviews how exploratory research has been used in machine learning to understand how problems are framed and solved. At the end of the post, I will also briefly introduce the method I used for my dissertation on sign language recognition.
Exploratory research investigates questions that have not been studied in depth before or that differ from other problems in the literature. For example, a research title such as "Generative adversarial nets" (Goodfellow et al., 2014) sounds like a candidate for exploratory research. This ground-breaking paper shows how to train generative models through an adversarial process in which two neural networks are trained simultaneously. If we look at the 2014 paper, it goes straight to the action, with virtually no discussion of methodology, research techniques, or anything in this space. That does not mean there was no research question or method. The way it was written simply shows the novelty, not the failed experiments or the alternative ideas that were considered for GANs, apart from what is covered in the literature review.
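As a compact reminder of what "adversarial process" means here, the objective from Goodfellow et al. (2014) is a two-player minimax game between a generator G and a discriminator D. The notation follows the paper; it is included only as a reference point, not as a derivation.

```latex
\min_G \max_D V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}(x)}\left[\log D(x)\right]
  + \mathbb{E}_{z \sim p_z(z)}\left[\log\left(1 - D(G(z))\right)\right]
```

D is trained to tell real samples x from generated samples G(z), while G is trained to make that distinction as hard as possible.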
Tom Dietterich (2019) from Oregon State University recommends the following process to drive the exploratory research successfully.
Exploratory Research -> Initial Solutions -> Refinement & Evaluation -> Competing Solutions & Comparative Evaluation -> Mapping the Solution Space -> Engineering & Technology Transfer.
This process starts from an "initial solution" to a problem that can be refined, reduced, and evaluated so that it can be compared to other solutions. This helps the researcher determine whether the outcome is important enough for publication, needs additional refinement, or should be shelved.
Initial Solutions
The idea here is to provide a kick-start solution to the problem. It does not have to be pretty, but it will serve as the base model for improvement. This stage is essential because it sets a precedent: a base solution to a complex problem. A paper might be written at this stage even if the solution does not generalize well. For example, the early work on Bayesian networks (Pearl, 1985) described simple message passing for tree-structured networks.
Nothing stimulates good research like a bad paper about an interesting problem - Dietterich
Refinement and Evaluation
This is the stage where the initial solution is evaluated for improvement. Refinement can change the initial solution by proposing new metrics, algorithms, hyperparameters, optimization techniques, or more data. The idea is to see whether the problem can be solved better, even if the new solution takes a 180-degree turn.
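A refinement cycle can be sketched in a few lines. The example below is a toy, not a real model: the "initial solution" is a hypothetical one-parameter threshold classifier, the data is synthetic, and the candidate values are made up for illustration. What it shows is the pattern of scoring every candidate against the same data as the baseline.

```python
import random

def accuracy(threshold, data):
    """Fraction of (x, label) pairs classified correctly by the rule x >= threshold."""
    correct = sum((x >= threshold) == label for x, label in data)
    return correct / len(data)

def make_data(n, seed):
    """Synthetic toy data: the label is True when x >= 0.3, with no noise."""
    rng = random.Random(seed)
    return [(x, x >= 0.3) for x in (rng.random() for _ in range(n))]

validation = make_data(200, seed=0)

# The "initial solution": a first guess at the hyperparameter.
baseline_threshold = 0.5
baseline_score = accuracy(baseline_threshold, validation)

# Refinement: score alternative values on the same validation data.
candidates = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
scores = {t: accuracy(t, validation) for t in candidates}
best_threshold = max(scores, key=scores.get)

# Keep the refinement only if it actually beats the baseline.
improved = scores[best_threshold] > baseline_score
```

In real work, the winning configuration should be confirmed on a separate held-out test set, since picking the best candidate on the validation data gives an optimistic score.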
Competing Solutions & Comparative Evaluation
In the previous step, the model or algorithm is compared against itself. In this phase, we want to understand how it compares against other methods or techniques available for similar purposes.
It is not always easy to perform the comparative evaluation when the research proposes something completely new, as in the case of adversarial networks. But the analysis can also be put in perspective against other research efforts that try to solve the same problem in a different way.
After Goodfellow's paper on GANs, this is now easier: many researchers are working on improvements, and their results can be benchmarked against each other.
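One simple way to make a comparison fair is to score the competing methods on exactly the same test examples and look at where they disagree. The sketch below assumes two hypothetical methods, A and B, whose predictions are hard-coded toy values; `paired_comparison` is an illustrative helper, not part of any library.

```python
def paired_comparison(y_true, pred_a, pred_b):
    """Compare two methods on the same test examples."""
    n = len(y_true)
    acc_a = sum(p == y for p, y in zip(pred_a, y_true)) / n
    acc_b = sum(p == y for p, y in zip(pred_b, y_true)) / n
    # Discordant cases: examples where exactly one method is right.
    only_a = sum(a == y and b != y for a, b, y in zip(pred_a, pred_b, y_true))
    only_b = sum(b == y and a != y for a, b, y in zip(pred_a, pred_b, y_true))
    return {
        "acc_a": acc_a,
        "acc_b": acc_b,
        "only_a_correct": only_a,
        "only_b_correct": only_b,
    }

# Toy labels and predictions, made up for illustration.
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
pred_a = [1, 0, 0, 1, 0, 1, 1, 0]
pred_b = [1, 0, 1, 1, 0, 0, 0, 0]

result = paired_comparison(y_true, pred_a, pred_b)
```

The discordant counts say more than the two accuracies alone: they show whether one method fixes the other's mistakes or just makes different ones. With a realistic test set, these counts could feed a paired significance test.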
Mapping the Solution Space
This stage looks to identify the design space for a particular problem. What are the bounds, what are the critical design decisions, and how does the algorithm compare to others? For example, among foundational machine learning algorithms, the notion of learning and the cost profile change drastically between KNN and logistic regression. KNN does almost no work at training time because it only stores the data, but its prediction time and memory footprint grow with the dataset size, since each query is compared against the stored examples. Logistic regression pays its cost during training and then predicts with a fixed set of weights. Why does this happen? How does it affect the use of the algorithm for particular problems? How can the issue be mitigated? These are the questions to revisit in this stage, as time and space complexity are under evaluation.
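The KNN cost can be made concrete with a brute-force sketch. The class below is a hypothetical 1-nearest-neighbor classifier written for illustration (real libraries often use index structures to speed this up); it counts how many distance computations a single prediction needs as the stored dataset grows.

```python
import math

class CountingKNN:
    """Brute-force 1-nearest-neighbor classifier that counts distance computations."""

    def __init__(self):
        self.points = []
        self.distance_calls = 0

    def fit(self, X, y):
        # "Training" only stores the data; no parameters are learned.
        self.points = list(zip(X, y))
        return self

    def predict_one(self, query):
        best_label, best_dist = None, math.inf
        for x, label in self.points:
            self.distance_calls += 1
            d = math.dist(x, query)
            if d < best_dist:
                best_label, best_dist = label, d
        return best_label

def distance_calls_per_query(n):
    """Fit on n synthetic points and count the work done by one prediction."""
    X = [(float(i), 0.0) for i in range(n)]
    y = [i % 2 for i in range(n)]
    model = CountingKNN().fit(X, y)
    model.predict_one((0.2, 0.0))
    return model.distance_calls

costs = {n: distance_calls_per_query(n) for n in (10, 100, 1000)}
```

The work per query grows linearly with the number of stored examples. A trained logistic regression, by contrast, predicts with one dot product whose cost depends on the number of features, not on the size of the training set.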
Engineering & Technology Transfer
At least in machine learning research, applied research is recommended, and a proof-by-construction will help demonstrate that the investigation is sound. Many research papers test on well-known datasets such as ImageNet, Fake News Detection Dataset, Boston Housing, Atari RL, or MNIST. Study replicability is essential because we often read articles to solve other problems and need the code, or parts of the approach, used to solve the original research problem. Writing a paper with publicly accessible data and a code repository is one of the best things we can do for the scientific community that uses ML in its research efforts.
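Two small habits go a long way toward replicability: seeding every source of randomness and publishing a record of the exact configuration and data used. The sketch below is one possible shape for that record. The helper names (`run_manifest`, `sample_indices`), the configuration values, and the inline toy dataset are all hypothetical.

```python
import hashlib
import json
import random

def run_manifest(config, data_bytes):
    """Summarize what a reader needs to rerun an experiment."""
    return {
        "config": config,
        # A checksum lets others confirm they have the same data.
        "data_sha256": hashlib.sha256(data_bytes).hexdigest(),
    }

def sample_indices(seed, n, k):
    """Draw k indices from range(n) using a dedicated, seeded generator."""
    rng = random.Random(seed)
    return rng.sample(range(n), k)

# Hypothetical configuration values, for illustration only.
config = {"seed": 42, "learning_rate": 0.01, "epochs": 5}

manifest = run_manifest(config, b"toy,dataset\n1,2\n")
manifest_json = json.dumps(manifest, sort_keys=True, indent=2)

# The same seed yields the same split on every run.
split_a = sample_indices(config["seed"], n=100, k=5)
split_b = sample_indices(config["seed"], n=100, k=5)
```

Committing a file like `manifest_json` next to the code gives readers a concrete starting point when they try to reproduce a result.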
Design science research focuses on the development and performance of (designed) artifacts with the explicit intention of improving the functional performance of the artifact. Design science research is typically applied to categories of artifacts, including algorithms, human/computer interfaces, design methodologies (including process models), and languages. Its application is most notable in the Engineering and Computer Science disciplines, though it is not restricted to these and can be found in many disciplines and fields (Vaishnavi et al., 2019).
Design science is a research methodology that focuses on an artifact and looks to improve it iteratively. A machine learning model can therefore be seen as an artifact, subject to the seven guidelines for design-science research (Hevner et al., 2004):
- Design as an artifact: Design-science research must produce a viable artifact in the form of a construct, a model, a method, or an instantiation.
- Problem relevance: The objective of design-science research is to develop technology-based solutions to important and relevant business problems.
- Design evaluation: The utility, quality, and efficacy of a design artifact must be rigorously demonstrated via well-executed evaluation methods.
- Research contributions: Effective design-science research must provide clear and verifiable contributions in the areas of the design artifact, design foundations, and/or design methodologies.
- Research rigor: Design-science research relies upon the application of rigorous methods in both the construction and evaluation of the design artifact.
- Design as a search process: The search for an effective artifact requires utilizing available means to reach desired ends while satisfying laws in the problem environment.
- Communication of research: Design-science research must be presented effectively both to technology-oriented as well as management-oriented audiences.
Dietterich's exploratory research process can be merged with the design science methodology (they are highly compatible but not the same). Design science treats mathematical rigor and artifact validation as the keys to improving the solution (the research results), the methods, the constructs, and other design theories in order to build new knowledge. The idea is that the exploratory analysis is performed in iterations, so that on each cycle we can evaluate the effect of a particular change in the model or the methods used.
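The iterative cycle can be captured with a very small experiment log. This is a minimal sketch under stated assumptions: the class names, the recorded changes, and the metric values are all invented for illustration, and the metric is assumed to be "higher is better".

```python
from dataclasses import dataclass, field

@dataclass
class Iteration:
    change: str    # what was modified in this cycle
    metric: float  # evaluation result; higher is better in this sketch

@dataclass
class ExperimentLog:
    iterations: list = field(default_factory=list)

    def record(self, change, metric):
        """Store the cycle and report whether it beat the best result so far."""
        previous_best = max((it.metric for it in self.iterations), default=None)
        self.iterations.append(Iteration(change, metric))
        return previous_best is None or metric > previous_best

    def best(self):
        """Return the iteration with the highest metric."""
        return max(self.iterations, key=lambda it: it.metric)

log = ExperimentLog()
# Illustrative values only; they do not come from any real experiment.
outcomes = [
    log.record("baseline", 0.61),
    log.record("added data augmentation", 0.67),
    log.record("larger learning rate", 0.64),
]
```

Changing one thing per cycle keeps the log readable: each entry answers "what changed, and did it help?", including the changes that did not.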
Design Science for Sign Language Recognition
I used design science for my dissertation, "Video-Based Costa Rican Sign Language Recognition for Emergency Services," which proposes the process required to transform a video into text when a person communicates in Costa Rican Sign Language (LESCO).
The entire project was envisioned as a collection of artifacts: from the raw video data, the algorithms to transform the data, to the machine learning models used to classify each sign into its textual meaning (label).
Everything is measurable, therefore it can be improved.
The valuable thing about design science is that we can plan the experiment and collect information on each iteration. After every cycle, we observe what changed and rerun the experiment to see if there are any improvements (pretty much trial and error, which in essence is at the core of the experimental research paradigm). This puts us on a path of improvement and lets us tell the story of how we reached the eureka moment.
Dietterich, T. (2019). Research Methods in Machine Learning. 46.
Hevner, A. R., March, S. T., Park, J., & Ram, S. (2004). Design Science in Information Systems Research. MIS Quarterly, 28, 75-106. URL: citeseerx.ist.psu.edu/viewdoc/download?doi=..
Goodfellow, I. J., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., & Bengio, Y. (2014). Generative adversarial nets. In Proceedings of the 27th International Conference on Neural Information Processing Systems - Volume 2 (NIPS'14). MIT Press, Cambridge, MA, USA, 2672–2680.
Pearl, J. (1985). A model of self-activated memory for evidential reasoning. In Proceedings of the 7th Conference of the Cognitive Science Society, University of California, Irvine, CA, pp. 329–334.
Vaishnavi, V., Kuechler, W., and Petter, S. (2004/19). "Design Science Research in Information Systems" January 20, 2004; last updated June 30, 2019. URL: desrist.org/design-research-in-information-..