To improve my knowledge of all things robots and human-computer interaction, I followed some of Robot Ignite Academy's courses to supplement the robotics side of things.
My recent interest in chaos and fractals led me to cross paths with the Lorenz system; I thought it was cool and wanted to learn more and plot one of my own. The system was published in 1963 by chaos theory pioneer Edward Lorenz in a paper titled “Deterministic Nonperiodic Flow” - it is a simplified mathematical model for atmospheric convection. Lorenz was driven to create such a model because he wished to explain the complex and unpredictable behaviour exhibited by the weather, and thus the Lorenz equations were “accidentally” created. The equations represent the convective motion of a fluid cell which is warmed from below and cooled from above; a visual illustration of this is shown below.
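The model boils down to three coupled ordinary differential equations. As a rough sketch of how one might trace a trajectory (my own illustrative code, using Lorenz's classic parameters sigma = 10, rho = 28, beta = 8/3, and a simple Euler integrator):

```python
# Sketch: integrate the Lorenz system with naive Euler steps.
# dx/dt = sigma*(y - x), dy/dt = x*(rho - z) - y, dz/dt = x*y - beta*z
def lorenz_step(x, y, z, sigma=10.0, rho=28.0, beta=8.0 / 3.0, dt=0.01):
    dx = sigma * (y - x)
    dy = x * (rho - z) - y
    dz = x * y - beta * z
    return x + dx * dt, y + dy * dt, z + dz * dt

# Trace a trajectory starting near the origin; plotting the points
# (e.g. with matplotlib's 3D axes) reveals the butterfly-shaped attractor.
x, y, z = 1.0, 1.0, 1.0
trajectory = []
for _ in range(10000):
    x, y, z = lorenz_step(x, y, z)
    trajectory.append((x, y, z))
```

A fixed-step Euler scheme is the crudest choice here; a proper plot would more likely use an adaptive solver such as SciPy's `solve_ivp`, but the sketch shows the shape of the computation.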
This micro-project is a first step into exploring the world of fractals, by visualising the well-known Mandelbrot set using the Python programming language.
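As a taste of what the visualisation involves, here is a minimal NumPy sketch (an illustrative implementation, not the exact project code) that iterates z → z² + c over a grid of complex points and records how long each one stays bounded:

```python
import numpy as np

def mandelbrot(width=400, height=300, max_iter=50):
    # Sample the region of the complex plane containing the set.
    xs = np.linspace(-2.0, 1.0, width)
    ys = np.linspace(-1.5, 1.5, height)
    c = xs[np.newaxis, :] + 1j * ys[:, np.newaxis]
    z = np.zeros_like(c)
    counts = np.zeros(c.shape, dtype=int)
    for i in range(max_iter):
        mask = np.abs(z) <= 2            # points that have not yet escaped
        z[mask] = z[mask] ** 2 + c[mask]
        counts[mask] = i + 1             # last iteration the point was bounded
    return counts

counts = mandelbrot()
# Points inside the set never escape, so they hit max_iter; colouring the
# array by escape count (e.g. with matplotlib's imshow) draws the fractal.
```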
Chaos and life?
Chaos theory is an incredibly intriguing branch of mathematics which states that complex dynamical systems such as the weather, the human brain, and the stock market are not exactly as we once thought them to be. In theory the behaviour of these complex systems can be calculated according to physical laws, or in our case how life events unravel, and yet their future outcomes remain in principle “unpredictable”. Such systems are incredibly sensitive to initial conditions, which makes prediction very difficult, since a very slight change in starting conditions can have a huge effect on the outcome. The outcomes can be calculated using mathematical models, but there is an awful lot to take into consideration, so we tend to approximate.
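Sensitive dependence on initial conditions is easy to demonstrate with a toy system. The sketch below uses the logistic map in its chaotic regime (an illustrative example of my own choosing, not one discussed above): two starting points that differ by one part in a billion end up on completely different orbits within a few dozen iterations:

```python
# Logistic map x -> r*x*(1-x); r = 4 puts it in the chaotic regime.
def logistic_orbit(x0, r=4.0, steps=50):
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs

a = logistic_orbit(0.2)
b = logistic_orbit(0.2 + 1e-9)   # perturb the start by one part in a billion
# The tiny initial gap roughly doubles each step, so after ~30 iterations
# the two orbits bear no resemblance to each other.
```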
This report explores the application of a layered control system and Reinforcement Learning (Sutton & Barto, 1999) to improve the intelligence of a LEGO Ev3 Robot (LEGO, n.d.) tasked with circumnavigating a circuit. A layered control system, also known as “The Subsumption Architecture” (SA), is an inherently parallel system with layers of behaviour modules that operate asynchronously (Fitz-Gibbon, 2004), allowing one to incrementally build upon a model to produce more complex intelligent systems (Brooks, 1991). SA and similar approaches have proven successful when adopted (Brooks, 1990; Rosenschein & Kaelbling, 1986; Arkin, 1987), in particular (Maes & Brooks, 1990), which employs a learning algorithm that provides feedback for a given behaviour, similarly to Reinforcement Learning (RL). RL is a learning paradigm for agents interacting with unfamiliar or complex environments (Littman, 2001) which focuses on value function and policy gradient methods (Heidrich-Meisner, Lauer, Igel, & Riedmiller, 2007). The agent is not told which actions to take, but instead discovers an optimal behavioural strategy based upon which actions yield the most reward while attempting them (Heidrich-Meisner et al., 2007; Sutton & Barto, 1999). Temporal difference (TD) algorithms such as Q-learning (C. Watkins, 1989; C. J. C. H. Watkins & Dayan, 1992), defined by: