Co-ordinated Planning Under Uncertainty with Air and Ground Vehicles

Abraham Bachrach†, Ruijie He†, Sam Prentice†, Michael Achtelik‡, Daniel Gurdan‡, Jan Stumpf‡ and Nicholas Roy†

† Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, 32 Vassar St., Cambridge, MA 02139.
‡ Ascending Technologies GmbH, Graspergerstr. 8, 82131 Stockdorf, Germany.

Abstract

We describe a co-ordinated system of an autonomous unmanned air vehicle (UAV) and unmanned ground vehicle (UGV), with the goal of planning a trajectory for the UGV in the presence of a mobile adversary. There are three technical challenges to this problem. Firstly, we describe the design and operation of our hex-rotor helicopter, a micro air vehicle designed to be an autonomous airborne sensor. Secondly, we describe the problem of reliably estimating the positions of mobile ground adversaries from a sequence of UAV images, and we describe an algorithm that learns to recognize and detect mobile ground units. Thirdly, we describe how an autonomous UGV can use adaptive planning to generate trajectories with respect to the uncertainty in the distribution of mobile ground adversaries, in order to minimize the likelihood of detection.

1 Introduction

We describe a co-ordinated system of an autonomous unmanned air vehicle (UAV) and unmanned ground vehicle (UGV), with the goal of planning a trajectory for the UGV in the presence of a mobile adversary. The position of the adversary is initially unknown, so we use a camera sensor on the UAV to detect and track the adversary, or “guard”, relaying the position of the guard to the UGV. Our UAV platform is a micro six-rotor helicopter sensing platform (shown in Figure 1), equipped with on-board attitude control via an IMU, position control via GPS navigation and a video camera system. The vehicle has a wireless communication link to the base station that provides low-bandwidth digital IMU/GPS information and a high-bandwidth analog link that transmits camera images. We assume both the UAV and UGV have knowledge of their own position via GPS.

There are three technical challenges to this problem. The first such challenge is the design and operation of our hex-rotor helicopter, a micro air vehicle operated as an autonomous airborne sensor. We divided the problem into the air-vehicle and onboard-electronics design problem, using commercial components whenever possible, and the offboard command-and-control problem. This division of labour allowed the two teams to focus on the relevant technical issues, and a well-defined vehicle interface allowed each team to abstract away the performance of the system on the other side of the interface. As an example, the software team was able to begin testing on larger vehicles that supported the same interface, well before the final configuration of the MAV was assembled.

The second technical challenge is to reliably geolocate ground objects by extracting each object from an image and recovering its position in the world frame, such as determining the guard position from a sequence of images from the UAV. Specific to guard tracking, we use an adaptive algorithm to learn features that correspond to the moving guard object, and then use Bayesian filtering to track the object over time. Additionally, we are able to assemble a sequence of images into a globally consistent map of the environment.

The third technical challenge is to plan with respect to the uncertainty in the probability distribution over possible guard locations and trajectories provided by the tracking filter. As a result, we must be able to plan trajectories for the UGV that minimize the likelihood of detection given the uncertainty in future guard positions.

Figure 1: Our six-rotor helicopter with bird’s-eye video camera. The helicopter is 28cm in diameter and weighs 142g without the navigation electronics, camera or communication hardware.

This planning problem requires us to plan with respect to the distribution of guard positions, that is, the information space of the guard, a planning problem that is typically computationally intractable. In order to find plans efficiently, we use a dynamic action selection strategy to guide the search for useful plans.

2 Air Vehicle

Our vehicle design consists of a custom-designed carbon-fiber airframe. The propulsion system consists of 6 custom-designed brushless motors, optimized for maximum thrust-to-weight ratio, and custom-designed electronic motor controllers that are also optimized for weight. The batteries are 2000 mAh Thunderpower 3-cell lithium polymer batteries. The vehicle is 29 cm rotor-tip to rotor-tip and weighs 142 grams without the navigation electronics, camera or communication hardware. The complete vehicle is shown in figure 1.

3 Payload

The payload consists of two main components: the navigation unit and the camera sensor.

Navigation system Our navigation system consists of a 60 MHz Philips ARM microprocessor, u-blox GPS receiver, compass, IMU and pressure sensor. The ARM microprocessor integrates the IMU and GPS measurements to provide a consistent state estimate at 1000 Hz. The on-board software accepts waypoints in the GPS (world) co-ordinate frame and uses PID control to achieve the desired position. The height estimate is relative to the position of the vehicle on take-off. The waypoint controller first attempts to achieve the desired position with 15 m accuracy, and then takes an additional 30 seconds to achieve the position with 2.5 m accuracy. If the waypoint is not achieved to within 2.5 m in the 30 seconds, the control software assumes that external factors (e.g., wind) are interfering and ends the attempt. In this way, we are guaranteed some baseline level of performance (15 m), and the vehicle will attempt to achieve a higher level of accuracy without excessive time delays.

The vehicle additionally carries a DigiKey 900 MHz XTend RF module operating at 100 mW. We communicate with the MAV via a USB-serial converter to the XTend base station; the bandwidth is such that we can typically get 40 Hz updates. The vehicle is configured to use the digital data link as the primary communication mechanism.


If the digital data link is lost, the vehicle throttles back to 30% and attempts to land safely. This can be over-ridden with an auxiliary RC link operating at 72 MHz. If a safety pilot observes the vehicle behaving incorrectly, the RC transmitter can be used to assume control over the vehicle and return it to base or land it safely.

Camera system Our camera sensor is a Black Widow KX141 480-line CCD camera with a 90° field-of-view. Additionally, we use a Black Widow TD240500TX 2.4 GHz 500 mW transmitter, and a YellowJacket YJS24 2.4 GHz diversity receiver at the ground station. This camera and transmitter provide excellent video capability at long ranges, and the 2.4 GHz frequency does not interfere with our 900 MHz data link. Additionally, the camera is mounted on a small servo that provides 90° of motion along one degree of freedom, allowing the camera to tilt from directly forward to straight down. The servo is controlled from the ARM navigation computer, which in turn receives servo instructions from the base station. The camera lens extends below the frame of the vehicle when pointing straight down, so the ground station planning software automatically returns the camera to the forward view when the vehicle is below 5 m.
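As a rough illustration of the two-stage waypoint acceptance logic described above (15 m baseline accuracy, then up to 30 seconds to tighten to 2.5 m), the following Python sketch polls a hypothetical distance_to_waypoint() callable; the structure and names are ours, not the on-board controller code.

```python
# Hypothetical sketch of the two-stage waypoint acceptance behaviour: require
# 15 m accuracy first, then allow 30 s to tighten to 2.5 m before accepting
# the baseline result. Illustrative only; not the on-board implementation.
import time

COARSE_RADIUS_M = 15.0   # baseline accuracy guaranteed by the controller
FINE_RADIUS_M = 2.5      # desired accuracy
FINE_TIMEOUT_S = 30.0    # time allowed to tighten the position

def wait_for_waypoint(distance_to_waypoint, poll_period_s=0.1):
    """distance_to_waypoint: callable returning current distance in metres."""
    # Phase 1: wait until the vehicle is within the coarse radius.
    while distance_to_waypoint() > COARSE_RADIUS_M:
        time.sleep(poll_period_s)
    # Phase 2: try to reach the fine radius within the timeout; if external
    # factors (e.g. wind) prevent it, accept the coarse result and move on.
    deadline = time.time() + FINE_TIMEOUT_S
    while time.time() < deadline:
        if distance_to_waypoint() <= FINE_RADIUS_M:
            return "fine"
        time.sleep(poll_period_s)
    return "coarse"
```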

4 Mission Operations

The overall control architecture is organized as an approximate three-tier architecture [3] consisting of different modules communicating at the different levels. The bottom two layers consist of the navigation and control software; the top-most layer is the human interface. The base layer governs hardware interaction and control by providing an abstract set of base and sensor interfaces. The navigation layer implements intermediate navigation primitives including localization, dynamic object tracking, and motion planning. Each major capability is built as a separate software module, and the modules communicate with each other over a communication protocol called IPC, developed by Reid Simmons at Carnegie Mellon University [9]. The IPC communication system has been used successfully in a number of mission-critical applications, including the NASA space probe DS-1 and the winning entry in the DARPA Grand Challenge in 2005. While modularity introduces some overhead, it has several important advantages over monolithic systems:

• Flexibility - If any components of the hardware design change, the software modifications are localized to a single module, which typically facilitates maintenance and upgrades.
• Network support - All of the modules need not run on the same processor.
• Reliability - If a single module fails, the remaining modules will continue to operate.
• Extensibility - It has been much easier to modify components, or develop new ones, as each component is self-contained.

The majority of communication occurs as asynchronous, anonymous publish-and-subscribe, so that no module is required to know the source of any data. The message passing mechanism transparently provides access to all data as it is generated, regardless of module locations. All functions generating outbound communication traffic from a module are placed in a separate source file. All modules include a separate interface library that abstracts the subscription process to the modules’ messages.

In order to ensure robustness and ease of use, we provide a central repository and interface for handling parameters. Since our software is distributed across multiple processors, a common failure point of such distributed systems is a conflict between parameter values that are loaded from different local sources. By requiring modules to retrieve their parameters from a single source (loaded from a single file), our system ensures that parameter values are consistent across all modules. Storing parameters in a single file and distributing them programmatically has the additional benefit that parameters can be updated at run-time, eliminating the need to restart modules.

We also require that the central repository serve maps. Instead of loading environment descriptions from local files, all modules request maps from a central server. Distributing maps over the communication network does incur a bandwidth cost. However, this cost is only incurred at startup, and when amortized over the operation of the vehicle, is negligible compared to the value of ensuring map consistency.
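The publish/subscribe and central-parameter pattern described above can be illustrated with a small in-process sketch; this is not the IPC library itself, and the MessageRouter class, message names and get_param interface are hypothetical stand-ins.

```python
# A minimal sketch of anonymous publish/subscribe plus a single parameter
# repository. Names and structure are illustrative, not the IPC library API.
from collections import defaultdict

class MessageRouter:
    """In-process stand-in for the message router and parameter server."""
    def __init__(self):
        self._subscribers = defaultdict(list)
        self._params = {}          # single source of truth for parameters

    def subscribe(self, msg_name, callback):
        self._subscribers[msg_name].append(callback)

    def publish(self, msg_name, payload):
        # Publishers do not know who consumes the data (anonymous pub/sub).
        for callback in self._subscribers[msg_name]:
            callback(payload)

    def load_params(self, param_file_contents):
        # All modules read parameters from this one repository, so values
        # stay consistent across processes and can be updated at run-time.
        self._params.update(param_file_contents)

    def get_param(self, key):
        return self._params[key]

# Example: a module consumes state estimates without knowing the producer.
router = MessageRouter()
router.load_params({"default_height_m": 30.0})
router.subscribe("STATE_ESTIMATE", lambda msg: print("guard estimator got", msg))
router.publish("STATE_ESTIMATE", {"lat": 42.36, "lon": -71.09,
                                  "height_m": router.get_param("default_height_m")})
```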

[Figure 2 diagram: (a) Control Flow; (b) Software Architecture. Modules shown include the GUI, Mission Planner, State Estimator, Guard Estimator, Tracker, Camera Server and Hex Controller, linked by the image, state-estimate and waypoint-command data flows.]

Figure 2: The main software modules and data flow. The basic flow of control is on the left: images of ground targets are acquired, the world co-ordinates are recovered according to the GPS and camera parameters, the plan is updated and new GPS waypoints are communicated to the MAV. Not shown are pieces of the infrastructure such as the mapper, the message router, parameter and map servers, or the off-line configuration tools.

4.1 Hardware Interfaces

The core hardware interface modules are a basic communications module for the MAV and a camera receiver. The data link is the XTend RF module described in the payload section, connected to the computer via a USB-serial converter. The communication of data from the vehicle consists of a polling mechanism for low-frequency data such as the vehicle internal state, battery voltage, GPS accuracy, etc. Additionally, a small high-frequency data packet is transmitted without polling at a fixed 30 Hz rate that provides the GPS and IMU estimates. By splitting the communication of data from the vehicle into a segment of always-available, low-latency data and slower polled data, we are able to maximize the available bandwidth while providing accurate state estimates at any point in time. The communication of data to the vehicle consists of basic waypoint instructions in GPS co-ordinates. The second hardware interface module is an image server that receives images from our Sensoray S2255 frame grabber. The Sensoray publishes planar 640×480 4:2:2 YUV images over USB, which are converted by the camera server to RGB images before being published for use by the tracker and display.
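As a sketch of the conversion step performed by the camera server, the following assumes a planar 4:2:2 layout with half-width U and V planes and BT.601 coefficients; the exact plane layout and coefficients used by the frame grabber may differ.

```python
# Rough sketch of converting a planar 4:2:2 YUV frame to RGB. Plane layout
# and BT.601 coefficients are assumptions for illustration.
import numpy as np

def yuv422_planar_to_rgb(y, u, v):
    """y: (H, W) uint8; u, v: (H, W//2) uint8 chroma planes."""
    # Upsample chroma horizontally so every pixel has a U and V sample.
    u_full = np.repeat(u, 2, axis=1).astype(np.float32) - 128.0
    v_full = np.repeat(v, 2, axis=1).astype(np.float32) - 128.0
    y_f = y.astype(np.float32)
    r = y_f + 1.402 * v_full
    g = y_f - 0.344 * u_full - 0.714 * v_full
    b = y_f + 1.772 * u_full
    return np.clip(np.stack([r, g, b], axis=-1), 0, 255).astype(np.uint8)

# Example with a synthetic 480x640 grey frame.
rgb = yuv422_planar_to_rgb(np.full((480, 640), 128, np.uint8),
                           np.full((480, 320), 128, np.uint8),
                           np.full((480, 320), 128, np.uint8))
```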

4.2 State Estimator

The state estimator module allows us to integrate multiple sources of information when estimating the state of the vehicle, including camera images, optical flow or ground ranging. For the purposes of the MAV competition, the on-board GPS/IMU state estimation proved sufficiently accurate that we do not in practice integrate additional information. When flying indoors, however, the state estimator module is essential for tracking the vehicle.

4.3 Visual Tracking

The objective of the visual tracker is to detect the presence of the guard in a cluttered environment without prior knowledge of the guard’s appearance. We assume that the position of the UAV is known from GPS, and that the guard is moving at least part of the time and is the only moving object in the scene. Following [7], we first segment the optical flow field in order to detect the moving entity. We use the Lucas-Kanade optical flow algorithm to extract the optical flow field, as shown in figure 3. The dominant component of the optical flow is the motion of the camera itself, with an additional localized component that is the motion of the guard vehicle. While optical flow is frequently useful for detecting the presence of the moving guard in the image, it is a very noisy feature for the purposes of tracking, and requires sufficient apparent motion between the guard and the background. We wish to be able to continue tracking objects when they temporarily come to a halt.

[Figure 3 panels: (a) Original Image; (b) Optical Flow; (c) Detected Features; (d) Ensemble Filter Response]

Figure 3: (a) An example optical flow field containing a moving car. (b) The remaining flow vectors after discarding the vectors due to the vehicle motion. (c) The detected features. (d) The response of the learned features across the sub-image of the detected car.

Figure 4: Two examples of tracking a moving vehicle from a UAV in very different environments: (a) a snowy field in Boston; (b) a parking lot in Florida.

We must therefore use additional features for tracking; having localized the object initially based on motion, we can use image features from the initial image to train a classifier to recognize the object in subsequent frames. Using the approach developed in [1], once the object has been detected based on optical flow, we extract color features and a histogram of oriented gradient features across the image. We use an online version of AdaBoost [4] to learn a classifier for the object, using features localized on the moving object as positive training instances and all other features as negative training instances. Boosting is a machine learning algorithm that combines a set of “weak learners” to generalize to more complex concepts. In the case of image tracking, we use boosted logistic regressors to identify features (colour threshold, oriented gradient feature, etc.) that separate the foreground pixels (the object to be tracked) from the background pixels. A single logistic regressor unit cannot separate all foreground pixels from background pixels, so after each regressor is trained, incorrectly labeled pixels are given additional weight. At each iteration of the boosting algorithm, the most discriminative feature is selected for regression on the weighted pixels. The power of AdaBoost lies in how the weights on the training examples are updated; examples that are classified correctly receive lower weight on the next iteration, while examples that are classified incorrectly receive higher weight. The weights bias the next weak learner towards the examples which are close to the margin.

Note that the object appearance will vary over time (for instance, the orientation of edge features will change as objects rotate in the image), so we continually learn new classifiers. As each new image arrives, we first use the existing classifier to label the location of the object being tracked, and we then re-train the classifier based on the new image appearance. In this way, our classifier is robust to changes in illumination and appearance as the target moves. Figure 4 shows the tracking task operating in both an unstructured open field (left) and a structured parking lot (right). The parking lot is a challenging task because the car was the same color as the background, and the regular structure of the lines tended to align with the shape of the car.

In order to provide some smoothness to the tracking, we use the resulting ensemble of learned classifiers as the sensor model in a probabilistic filter in order to estimate the object position and trajectory. We use a particle filter to implement the probabilistic estimate $p(x_t \mid z_{0:t})$, where $x_t$ is the location of the target in the image at time $t$ and $p(x_t \mid z_{0:t})$ is the probability of the target being at that location after having received measurements $z_{0:t}$. We can compute this distribution


Figure 5: An example image during the camera calibration process. The extracted corners are highlighted in colour and represent the labels y in the least-squares minimization.

recursively as
$$p(x_t \mid z_{0:t}) = \alpha\, p(z_t \mid x_t, z_{0:t-1})\, p(x_t \mid z_{0:t-1}), \qquad (1)$$
where $\alpha$ is a normalization constant. Using the law of total probability and an assumption that the target motion is first-order Markov, we can rewrite this as
$$p(x_t \mid z_{0:t}) = \alpha\, p(z_t \mid x_t) \int_{X_{t-1}} p(x_t \mid x_{t-1})\, p(x_{t-1} \mid z_{0:t-1})\, dx_{t-1}. \qquad (2)$$
Note that $p(x_{t-1} \mid z_{0:t-1})$ is just the target distribution at the previous timestep, and $p(z_t \mid x_t)$ is our sensor model (the likelihood of detecting the target at position $z_t$ given the target is at $x_t$). $p(x_t \mid x_{t-1})$ is simply a model of how the target moves, which we assume to be Gaussian motion with some fixed variance. By writing the probabilistic filtering equations in this way, we can use a form of non-parametric density estimation known as importance sampling [10] to track the distribution. In contrast to more conventional filtering techniques such as the Kalman filter [5], we have found that non-parametric density estimation is useful for modelling the non-linear sensor and motion models and the non-Gaussian noise distributions. The motion of the vehicle is particularly non-linear, and large swings of the MAV generally cause very large displacements of the target in the image. As a result, we generally assume that the motion variance is large, with the consequence that the target estimate becomes lost if the target is not observed for more than two or three frames in a row. Given the emphasis on the measurement model, and the relatively low influence of the motion model, it is tempting to discard the filtering in the image plane entirely. Our experience, however, is that the filtering in the image plane provides a measure of smoothing of the estimates and essentially acts as a form of outlier rejection.
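A minimal particle-filter update implementing equations (1) and (2) might look like the following sketch, where classifier_likelihood stands in for the learned ensemble response and the Gaussian random-walk motion model and its variance are assumptions chosen for illustration.

```python
# Minimal importance-sampling (particle filter) step for the image-plane
# target estimate. classifier_likelihood is a placeholder for the ensemble
# classifier response; all names and parameters are illustrative.
import numpy as np

def particle_filter_step(particles, weights, classifier_likelihood,
                         motion_sigma_px=15.0):
    """particles: (N, 2) array of image positions; weights: (N,) summing to 1."""
    n = len(particles)
    # 1. Resample according to the current weights (importance sampling).
    idx = np.random.choice(n, size=n, p=weights)
    particles = particles[idx]
    # 2. Propagate through the motion model p(x_t | x_{t-1}): a large-variance
    #    Gaussian random walk, reflecting the apparent motion induced by the MAV.
    particles = particles + np.random.normal(0.0, motion_sigma_px, particles.shape)
    # 3. Re-weight with the sensor model p(z_t | x_t), here the ensemble
    #    classifier response evaluated at each particle location.
    weights = np.array([classifier_likelihood(p) for p in particles]) + 1e-12
    weights = weights / weights.sum()
    # The weighted mean is a convenient point estimate of the target position.
    return particles, weights, np.average(particles, axis=0, weights=weights)
```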

4.4 Camera Calibration

Given the tracked position of an object in the image, we can recover the position of the object in world co-ordinates from knowledge of the camera properties and of the vehicle GPS position and attitude. The camera properties include intrinsic parameters such as the camera focal length, centre of projection and a dewarping matrix that allows the camera tracking software to eliminate the radial and tangential distortions of the wide field-of-view lens. Additionally, a set of extrinsic properties is required that represents the rigid transformation from the camera image plane to the centre of body of the vehicle. We recover these parameters in a calibration process in which the vehicle is put into a known position (from GPS) with respect to a known target on the ground (shown in figure 5). A series of images is captured at different positions and attitudes, and the corner features of the checkerboard pattern are automatically extracted. Given the known positions of the vehicle x and the configuration of the checkerboard (and hence the relative positions of the corner features y), the camera parameters can be recovered as the terms of a linear system using a least-squares minimization.
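One simple way to pose such a calibration as a linear least-squares problem is the direct linear transform, sketched below: a 3x4 projection matrix is estimated from known world points (checkerboard corners placed using the known vehicle pose) and their detected image locations. The actual calibration additionally separates intrinsics, extrinsics and lens distortion; this sketch only illustrates the least-squares structure.

```python
# Direct linear transform (DLT) sketch: estimate a 3x4 projection matrix from
# 3-D world points and 2-D image points by linear least squares. Illustrative
# only; not the calibration procedure used in the system.
import numpy as np

def estimate_projection_matrix(world_pts, image_pts):
    """world_pts: (N, 3) metres; image_pts: (N, 2) pixels; N >= 6."""
    rows = []
    for (X, Y, Z), (u, v) in zip(world_pts, image_pts):
        # Each correspondence contributes two linear constraints on P.
        rows.append([X, Y, Z, 1, 0, 0, 0, 0, -u*X, -u*Y, -u*Z, -u])
        rows.append([0, 0, 0, 0, X, Y, Z, 1, -v*X, -v*Y, -v*Z, -v])
    A = np.asarray(rows)
    # The least-squares solution of A p = 0 (with ||p|| = 1) is the right
    # singular vector associated with the smallest singular value.
    _, _, vt = np.linalg.svd(A)
    return vt[-1].reshape(3, 4)
```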

4.5 Visual Mapping

In many circumstances, missions flown by the MAV may not have a prior map of the environment, or the existing maps may no longer be accurate. A single image taken from a sufficient height usually provides a partial view of the environment, but combining multiple such images into a single consistent map provides a substantial improvement in situational awareness.

[Figure 6 panels: (a) SURF Feature Extraction; (b) Global Map, with annotations marking the first and second buildings, rows of parked cars and a mapping error.]

Figure 6: (a) An example image used in constructing a map. Notice the relative sparsity of SURF features (yellow dots). (b) The final map from XXX images. Note that while the map contains some errors, high-level features are clearly visible, including the rows of parked cars and the buildings. The image is distorted due to the radial distortion of the wide-angle lens.

Constructing such maps is challenging because building accurate maps requires solving a chicken-and-egg problem. If we know the position, attitude and internal parameters of the camera, then the mapping problem is relatively trivial because the images can be projected into the world co-ordinate frame and simply mosaiced. Unfortunately, the GPS and IMU estimates from the MAV are rarely accurate or precise enough to allow simple projection and mosaicing. On the other hand, if we have an accurate map of the environment, we can easily recover the position, attitude and camera parameters; this is exactly the camera calibration problem. In the absence of either an exact estimate of the camera or a prior map of the environment, constructing a single map from multiple images requires solving jointly for the camera position and map estimate. This problem is known in the mobile robotics community as the Simultaneous Localization and Mapping, or SLAM, problem. In order to provide visual SLAM capability to our MAV system, we use a visual SLAM package due to Steder et al. [11]. The algorithm proceeds by extracting features from each image and performing joint inference over the three-dimensional locations of the features and the camera position at each point in time. The specific features used are the “Speeded Up Robust Features” (SURF) features [2]. Figure 6(a) shows a set of SURF features (the yellow dots) extracted from a single image taken from the MAV over Soldier’s Field at Harvard University. A sequence of these images is assembled into the map shown in figure 6(b). In order to test the limits of the inference algorithm, we performed the inference without any prior information from the GPS or IMU. While the map contains numerous small registration errors, it is sufficiently consistent at a global scale to identify the rows of parked cars and the locations of the buildings.
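The pairwise image registration at the heart of mosaicing can be sketched as feature matching plus a RANSAC homography. ORB features are used below only because SURF requires OpenCV's non-free contrib module; the SLAM system of Steder et al. goes further and jointly estimates camera poses and 3-D feature locations.

```python
# Sketch of registering two overlapping aerial images via feature matching
# and a RANSAC homography. ORB is substituted for SURF; illustrative only.
import cv2
import numpy as np

def register_pair(img_a, img_b):
    orb = cv2.ORB_create(nfeatures=2000)
    kp_a, des_a = orb.detectAndCompute(img_a, None)
    kp_b, des_b = orb.detectAndCompute(img_b, None)
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = sorted(matcher.match(des_a, des_b), key=lambda m: m.distance)
    # Estimate a homography with RANSAC to reject bad correspondences.
    src = np.float32([kp_a[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    dst = np.float32([kp_b[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
    H, inliers = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
    return H  # maps pixels of img_a into the frame of img_b for mosaicing
```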

4.6 Guard Estimator

In order to ensure that we have a consistent guard tracker that is robust to missing multiple guard measurements (for example, as a result of occlusions from the buildings), we also perform guard estimation in world co-ordinates, by taking the current estimate of the tracked guard in the image plane and projecting the image co-ordinates into world co-ordinates based on the current position of the MAV and a calibrated camera model. Our model of estimation in world co-ordinates is that the guard dynamics are much better known, but the GPS position of the vehicle is noisier and the projection operation is also somewhat noisy, leading to much noisier measurements in the world co-ordinate frame. We therefore run a second particle filter using a much lower-variance motion model and a much higher-variance sensor model. This form of the particle filter allows the tracked estimate of the guard position to persist considerably longer without observations, but requires more consecutive measurements to substantially alter the current estimate. Figure 7 shows a series of images taken from our UAV successfully performing the tracking task.
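Projecting an image-plane detection into world co-ordinates can be sketched as intersecting the pixel's viewing ray with a locally flat ground plane, given the calibrated intrinsics and the MAV pose; the flat-ground assumption and all names below are illustrative.

```python
# Sketch of projecting a tracked image position to world co-ordinates by
# ray/ground-plane intersection. Assumes locally flat ground; illustrative.
import numpy as np

def pixel_to_ground(pixel_uv, K, R_world_cam, cam_pos_world, ground_z=0.0):
    """K: 3x3 intrinsics; R_world_cam: camera-to-world rotation;
    cam_pos_world: camera position (x, y, z) in the world frame."""
    u, v = pixel_uv
    # Back-project the pixel into a ray direction in the camera frame,
    # then rotate it into the world frame.
    ray_cam = np.linalg.inv(K) @ np.array([u, v, 1.0])
    ray_world = R_world_cam @ ray_cam
    # Intersect with the horizontal plane z = ground_z.
    t = (ground_z - cam_pos_world[2]) / ray_world[2]
    return cam_pos_world + t * ray_world   # (x, y, ground_z) of the guard
```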


Figure 7: A series of images showing a car being tracked from the UAV.

4.7 Mission Planner

The mission planner takes a high-level mission description, expressed in terms of high-level tasks, and converts each task into specific waypoint commands. The mission planner has the following responsibilities:

• Ensuring valid execution of the mission
• Computing the vehicle trajectory from a high-level mission description
• Validating a high-level mission description
• Monitoring the vehicle to ensure safety

Valid Execution of Missions At the highest level, the mission planner is described by the finite state machine shown in figure 8. The vehicle can begin operation either by receiving a specific task, or by receiving a high-level mission description. Once the vehicle has received a task or mission, it transitions to the ready state, at which point it begins executing the current task (either the received task, or the first task in the mission). Each task is equivalent to a named waypoint (e.g., “Bank”), a sequence of waypoint commands (e.g., “Coverage pattern over the bank”), or a configuration command (e.g., “Change to the following absolute height”). The complete task list is given in Table 1.

[Figure 8 diagram: states Not Ready, Ready, Running and Trouble, with transitions for receiving a valid mission or task, receiving Start, task completion (with or without further tasks in the mission), receiving Pause, and trouble conditions such as low battery.]

Figure 8: The finite state machine of the mission planner.

Table 1 provides a list of the possible tasks that can be used to define a mission. Mission descriptions are specified in a human-readable format, with a separate line per task. If the operator wishes to specify additional parameters (e.g., hold time), these can be specified after the task name on the same line. A typical mission description will therefore have a form similar to:

Takeoff
Bank
Coverage Bank
Guard Search Bank
Track
Minesweep
Home

Table 1: The set of tasks that can be used to define a mission.

1. Takeoff
2. Land
3. Placename - Fly to the GPS co-ordinates of the named position. The map is assumed to be pre-labelled with specific locations.
4. Coverage - Fly a coverage pattern (typically a rectangular box) over a named location.
5. Guard search - Fly a coverage pattern over a named location until the guard is detected.
6. Track - Hold a position over the current estimated position of the guard, adjusting this position as the estimated guard position changes. Typically immediately follows a Guard search task.
7. Minesweep - Fly to the current position of the commandos, descend to 10 metres and fly a path corresponding to the expected commando path.
8. Hold - Maintain the current position for a fixed number of seconds.
9. Pause - Change to the Ready state, as if the task had been paused. Once a task is paused, the human operator must resume the task from the GUI.
10. Velocity - Re-configure the default flight velocity.
11. Relative Height - Re-configure the default flight height relative to the current height.
12. Absolute Height - Re-configure the default flight height in absolute terms.

Validating Mission Descriptions In order to minimize errors and reduce risk, the mission description must be valid. Each task has pre-conditions that must be satisfied, otherwise the mission will be rejected by the mission planner with an error. There are a number of pre-conditions on each task; a few examples are:

• If a take-off task is present, it must be the first task.
• If a land task is present, it must be the last task.
• Named places must already exist in the map at the time of mission definition.
• Velocities and heights must lie within pre-defined limits.
• Coverage and guard search tasks use a pair of named places to define the bounding area of the search. Both points must exist, and a main direction of search must be specified.
• If a minesweep task is defined, a commando plan must already exist at the time of mission definition.

Some of these pre-conditions do not appear to have serious consequences if violated, such as a pause occurring as the last task in a mission: that task would simply have no effect, since the MAV automatically pauses once the mission ends. The pre-conditions are therefore designed not only to prevent dangerous behaviour, but also to ensure consistency and avoid unexpected behaviours. Under the time-critical conditions of executing the mission, our goal is to minimize the number of errors the human operator can make in specifying a mission.
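The precondition checks above could be implemented along the following lines; the rule set, task representation and helper arguments are hypothetical and only illustrate the validation structure.

```python
# Sketch of mission-description validation. Task names follow Table 1; the
# data structures and rules are illustrative, not the planner's actual code.
def validate_mission(tasks, named_places, commando_plan_exists):
    """tasks: list of (name, args_dict) tuples in mission order."""
    errors = []
    names = [name for name, _ in tasks]
    if "Takeoff" in names and names[0] != "Takeoff":
        errors.append("Takeoff must be the first task")
    if "Land" in names and names[-1] != "Land":
        errors.append("Land must be the last task")
    for name, args in tasks:
        if name == "Placename" and args.get("place") not in named_places:
            errors.append("unknown place: %s" % args.get("place"))
        if name in ("Coverage", "Guard search"):
            if not {"corner_a", "corner_b", "direction"} <= set(args):
                errors.append("%s needs two named corners and a direction" % name)
        if name == "Minesweep" and not commando_plan_exists:
            errors.append("Minesweep requires an existing commando plan")
    return errors   # an empty list means the mission is accepted
```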


Figure 9: (a) An example specification of the coverage place and (b) the waypoints of the corresponding coverage pattern.

Computing Trajectories from Mission Descriptions A number of high-level tasks require the mission planner to compute the appropriate waypoint trajectory based on the current world configuration and the specified task. Given the human-readable format of the mission descriptions, each task must be turned into a waypoint command for the helicopter. For each task, if it contains a named place, then the first action is to determine the corresponding GPS location from the map. Coverage and search patterns are converted into an ordered sequence of waypoints that may also depend on a specified direction (e.g., latitudinal, longitudinal) of the search. Figure 9(a) shows an example of a coverage specification in terms of two named places, and figure 9(b) shows the corresponding flight path in terms of waypoints. A simple sketch of this conversion is given below.

Monitoring the Vehicle to Ensure Safety Finally, the mission planner is also responsible for ensuring the vehicle safety, such as monitoring the battery voltage and automatically returning the vehicle to home if the battery voltage falls below a certain level. Additional safety monitors that we have considered are height monitoring and proximity monitoring; however, the specific hostage detection task requires manoeuvres that would violate these safety margins, so we do not fly with these monitors in place and instead rely on human operators to ensure the safety of the vehicle.
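A coverage task can be expanded into waypoints with a simple lawnmower sweep over the rectangle defined by the two named places, as in the sketch below; the spacing and local co-ordinates are assumptions, since the real planner works directly in GPS co-ordinates.

```python
# Sketch of expanding a Coverage task into a lawnmower waypoint pattern over
# the box defined by two corners. Spacing and co-ordinates are illustrative.
import numpy as np

def coverage_waypoints(corner_a, corner_b, spacing_m=20.0, longitudinal=True):
    (x0, y0), (x1, y1) = corner_a, corner_b
    xmin, xmax, ymin, ymax = min(x0, x1), max(x0, x1), min(y0, y1), max(y0, y1)
    waypoints = []
    if longitudinal:
        # Sweep back and forth along y while stepping across x.
        for i, x in enumerate(np.arange(xmin, xmax + spacing_m, spacing_m)):
            ys = (ymin, ymax) if i % 2 == 0 else (ymax, ymin)
            waypoints += [(x, ys[0]), (x, ys[1])]
    else:
        # Sweep along x while stepping across y.
        for i, y in enumerate(np.arange(ymin, ymax + spacing_m, spacing_m)):
            xs = (xmin, xmax) if i % 2 == 0 else (xmax, xmin)
            waypoints += [(xs[0], y), (xs[1], y)]
    return waypoints
```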

4.8 Multi-Vehicle Deconfliction

The nature of the problem specification has allowed us to separate the mission into distinct phases, each assigned to a different MAV operating in a different part of the airspace. We therefore use one MAV at the bank building in order to monitor the guard vehicle, while the second MAV is used to monitor commando progress. We run two entirely separate instances of our high-level mission control software to process the sensor measurements and control each MAV separately. Information is shared between the guard tracking module attached to the primary MAV and the secondary MAV, and the states of both vehicles are available on both human interaction displays. We rely on physical and temporal separation in the mission descriptions to minimize the probability of the vehicles interacting in flight.

4.9 Commando Planner

Given the ingress point of the commandos and the destination point of the bank, we must be able to identify a path for the commandos that minimizes the likelihood of exposure to the guard vehicle. We therefore use an automated planning system based on the probabilistic roadmap [6] in order to identify a trajectory that minimizes the expected likelihood of being exposed to the guard. The assumption of our planner is that the commandos and UGV are capable of arbitrary paths throughout the environment. If the motion of the commandos and UGV is restricted to specific routes, the algorithm can be trivially modified to consider only those routes and small deviations from the routes. We assume that the commandos are holonomic and that they have full control authority, allowing us to ignore their dynamics and treat the problem as a kinematic motion planning problem.


Note that we must plan in both the space of commando and guard positions; each commando motion takes some time $\Delta t$, during which the guard moves. We do assume that we can predict the position of the guard at time $t + \Delta t$ given the current guard position at time $t$. Each commando motion therefore changes not only the commando position but also the guard position. $C$ denotes the configuration space [8], the space of all commando-guard poses; $C_{free}$ is the set of all configurations (based on the map $M$ of obstacle positions) in which the commando and the guards are not inside obstacles and the guard cannot observe the commando; and $C_{exp}$ is the set of invalid poses (either collisions with obstacles, or poses resulting in exposure to the guard), so that $C \equiv C_{free} \cup C_{exp}$. Given the initial position of the commandos and guard, $s_0$, and a map of the environment, the planning problem is to find a sequence of actions to move the commandos from state $s_0$ to the goal without colliding with obstacles or being detected by the guard vehicle. Our system has 4 degrees of freedom ($x, y$ of the commandos, and $x, y$ of the guard), so $C = \mathbb{R}^4$, which is of moderately high dimension. Note that we do not have a unique goal state, but a goal set; the goal set consists of all configurations in which the commandos are at the bank.

The Probabilistic Roadmap (PRM) [6] is a common algorithm for planning in high-dimensional problems, in which a discrete graph is used to approximate the connectivity of $C_{free}$. The PRM builds the graph by sampling a set of states randomly from $C$ (adding the start state $s_0$), and then evaluating each state for membership in $C_{free}$; the assumption is that it is much cheaper to evaluate randomly sampled poses in higher dimensions than it is to build an explicit representation of $C_{free}$. Samples that lie within $C_{free}$ constitute the nodes of the PRM graph, and edges are placed between nodes where a straight-line path between them also lies entirely within $C_{free}$. Given this graph, a feasible, collision- and exposure-free path can be found using a standard graph search algorithm from the start node to the goal node. The path can be executed by using a simple controller to follow each edge to the goal. Note that the sampling algorithm dictates the variety of paths that can be used for the commando and UGV motion. If we wish to restrict their motion to specific routes, then the algorithm no longer places samples randomly throughout the environment, but deterministically places samples at the endpoints of each route segment.

We make two small variations to the standard PRM algorithm in how we sample from $C_{free}$. Firstly, we do not sample directly from the joint commando and guard positions, but instead first sample commando positions, rejecting commando positions that create collisions and creating edges between positions that can be connected without collision. Secondly, we propagate the minimum-time trajectory to each position and determine the arrival time $t$ at each node; we then compute the expected guard position at each node's arrival time, and discard the commando positions and edges that would be visible to the guard at that time. The graph can then be searched for the shortest path that will not lead to collisions with obstacles or be seen by the guard, using standard search techniques such as A*.
The random sampling strategy can lead to valid but somewhat counter-intuitive plans, so as a final processing step we “smooth” the plans by introducing additional positions along the graph whenever such an additional position reduces the path length. The plan then consists of a series of GPS waypoints and expected arrival times that can be used to dispatch the commandos.
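The modified roadmap construction and time-indexed exposure check described above can be sketched as follows; the collision, visibility and guard-prediction functions are placeholders supplied by the caller, the workspace bounds are arbitrary, and Dijkstra's algorithm is used in place of A* for brevity.

```python
# Compact sketch of the modified PRM: sample commando positions, connect
# collision-free neighbours, estimate arrival times, and reject nodes the
# predicted guard could observe at those times. Illustrative only.
import math
import heapq
import random

def plan_prm(start, goals, n_samples, in_collision, edge_clear,
             guard_at, visible_to, speed=1.0, radius=30.0):
    # Node 0 is the start; goal configurations are included as nodes as well.
    nodes = [start] + list(goals)
    while len(nodes) < n_samples:
        p = (random.uniform(0, 500), random.uniform(0, 500))  # assumed bounds
        if not in_collision(p):
            nodes.append(p)

    def dist(a, b):
        return math.hypot(a[0] - b[0], a[1] - b[1])

    # Connect nearby samples whose straight-line segment stays collision free.
    edges = {i: [] for i in range(len(nodes))}
    for i in range(len(nodes)):
        for j in range(i + 1, len(nodes)):
            if dist(nodes[i], nodes[j]) < radius and edge_clear(nodes[i], nodes[j]):
                edges[i].append(j)
                edges[j].append(i)

    # Dijkstra over travel time; a node is rejected if the predicted guard
    # position at the expected arrival time could observe it.
    arrival, parent = {0: 0.0}, {}
    queue = [(0.0, 0)]
    goal_set = set(goals)
    while queue:
        t, i = heapq.heappop(queue)
        if t > arrival.get(i, math.inf):
            continue
        if nodes[i] in goal_set:
            path = [i]
            while path[-1] in parent:
                path.append(parent[path[-1]])
            return [nodes[k] for k in reversed(path)]
        for j in edges[i]:
            t_j = t + dist(nodes[i], nodes[j]) / speed
            if visible_to(guard_at(t_j), nodes[j]):
                continue  # exposed to the guard at the expected arrival time
            if t_j < arrival.get(j, math.inf):
                arrival[j], parent[j] = t_j, i
                heapq.heappush(queue, (t_j, j))
    return None  # no collision- and exposure-free path was found
```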

4.10 Human Interface

A major component of our concept of operations is the human operator interface shown in figure 10. The display provides situational awareness by tracking the vehicle in the prior map and by monitoring height, battery and GPS quality. The height, battery and GPS quality displays are configured to provide warnings if the vehicle falls below an acceptable height or battery voltage, or if the GPS lock is lost. The display allows the map to be edited quickly by adding additional obstacles. Additionally, objects can be initialized in either the map display (left panel), or directly in the live video feed, using the camera calibration parameters and the current state estimate of the vehicle. In addition to the current mission definition list (left panel, under the map), a set of controls is provided for emergent task control, for example, if the initial mission description is no longer valid, or if the human operator wishes to deviate temporarily from the current mission.

By virtue of the message protocol, the human interface is separated programmatically from the implementation of the underlying modules, and is updated via message passing. This allows us to run multiple instances of the display for different human operators, and to run multiple configurations of each display. During mission execution, each operator has well-defined tasks such as guard tracking, mine detection or vehicle monitoring, and therefore configures their display appropriately to provide different controls. The display in figure 10 is specific to the vehicle monitoring task, and therefore does not show the additional controls for panning the camera, identifying mines or controlling the guard tracking process.


Figure 10: One of the configurations of our user interface. The large central panel contains the map, the current estimate of the vehicle position and orientation, and displays named places, coverage patterns, obstacles, mines and the current guard position estimate. The box below the map contains the mission description. The right panel streams the video from the on-board camera and shows the vehicle status in terms of GPS position, height, intended goal and battery status. The bottom bar also shows the GPS estimate quality. Between the image display and the status panel are the buttons for executing out-of-sequence tasks, such as pause, go home, track, etc.

5 Risk Reduction

Our risk reduction consists of the following components:

• Vehicle status

1. Shock/vibration isolation. Vibration isolation is most important for vehicle stability and for accurate IMU measurements. As a result, we have focused on ensuring the propulsion system and airframe are mechanically balanced. The IMU and compass components are placed on mechanically isolated and damped stand-offs in order to minimize the impact of vibrations on the state estimation.

2. EMI/RFI solutions. There are three primary points of RF interference: the digital data link at 900 MHz, the video transmission link at 2.4 GHz and the pseudo-ranges from the GPS system. We have minimized potential interference by placing the different signals at different parts of the spectrum, and by using the airframe to isolate the GPS receiver (placed on top of the frame) from the other transmitters and receivers. Voltage regulators are also used in our custom electronics package to isolate the electronics from the propulsion system. Finally, we use a diversity receiver to ensure reliability of the video signal.

• Vehicle safety

1. Autonomous monitoring. The mission planner has the ability to autonomously return the vehicle to the base in the event of low battery, or land the vehicle in the event of GPS outages.

2. Autonomous logging. In order to validate our performance and models during run-time, we require that the telemetry and sensor data from the vehicle be logged during every mission. A new telemetry/sensor log is created automatically with every take-off command. Additionally, changes to the map are always stored automatically. In this manner, no state can be lost.

3. Digital communication verification. All commands to the vehicle contain embedded verification in terms of checksums, and also a hand-shaking protocol using acknowledgement packets (a minimal sketch of such a scheme is given after this list).

4. Manual over-ride from the ground station. If an unmodelled event occurs, for example, the guard behaves in an unexpected manner, or autonomous detection activities fail, the GUI can be used to reset both the state estimate and the object detection estimate. Additionally, if the plan execution fails in some unexpected way, the ground station can take control of the vehicle via a remote joystick mediated by the digital data link.

5. Manual over-ride from a safety pilot. If the communication with the ground station is operational but in some failed mode, the safety pilot can use an RC transmitter to over-ride the commands from the ground station.

• Modelling, simulation and testing

1. Mission verification. In order for a mission description to be accepted by the mission planner, the mission must be self-consistent and each task must satisfy the task preconditions.

2. Execution verification. Before running any mission, we execute the mission in simulation. Our software base contains a simulator with an interface identical to the helicopter communication module, which allows us to replace the hardware interface module with the simulator without changing or restarting the other processes. Given a specific mission description, we can run the mission first in the simulator, verify the execution and then re-run the identical mission on the helicopter by terminating the hardware simulation and executing the hardware interface module.
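The checksum-and-acknowledgement scheme referred to in item 3 of the vehicle safety list might look like the following sketch; the packet layout, the CRC-16 checksum and the retry policy are our assumptions, not the actual link protocol.

```python
# Sketch of a checksummed command packet with acknowledgement and retries.
# Packet layout, checksum choice and the link object are illustrative.
import binascii
import struct

def make_packet(seq, payload):
    body = struct.pack("<H", seq) + payload
    checksum = binascii.crc_hqx(body, 0xFFFF)       # CRC-16 over the body
    return body + struct.pack("<H", checksum)

def parse_packet(packet):
    body, (checksum,) = packet[:-2], struct.unpack("<H", packet[-2:])
    if binascii.crc_hqx(body, 0xFFFF) != checksum:
        return None                      # corrupted: the command is ignored
    (seq,) = struct.unpack("<H", body[:2])
    return seq, body[2:]                 # the vehicle acknowledges seq

def send_reliably(link, seq, payload, retries=3):
    packet = make_packet(seq, payload)
    for _ in range(retries):
        link.write(packet)
        if link.wait_for_ack(seq, timeout_s=0.5):   # hypothetical link object
            return True
    return False
```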

6 Ground Vehicle

Our ground vehicle operation is implemented in exactly the same manner as the commando planning, with the exception that the goal locations are specified as mine locations along the commando route. We use the Sim UGV facility instead of providing our own ground vehicles. As previously described, we first discretize the area of operation by sampling positions around obstacles. We then create a graph by adding edges between nodes that are not exposed to the guard vehicle. We use A* search to find a path from the current position of the Sim UGV to each waypoint, and use radio communications to give GPS destination commands.

7 Conclusion

By dividing the problem into a hardware design problem and a software design problem, we have been able to make progress on multiple fronts simultaneously. We have demonstrated in tests the ability to find and track moving ground targets, to fly long distances and to identify a series of ground objects. Our vehicle is available in a variety of configurations (and is based on an original, larger quad-rotor design), and our software is equally adaptable for a range of missions including MAV ’08.

8 Acknowledgements

• Abraham Bachrach is supported by Aurora Flight Sciences and the Air Force Office of Scientific Research under contract # F9550-06-C-0088.
• Ruijie He was supported by the Republic of Singapore Armed Forces.
• Sam Prentice and Nicholas Roy are supported by the Air Force Research Laboratory under “Learning Locomotion” contract # FA8650-05-C-726.

• This project was supported by the Office of the Dean, School of Engineering and the MIT Air Vehicle Research Center (MAVRC), and their support is gratefully acknowledged.
• Col. Peter Young, Jonathan How, Spencer Ahrens and Brett Bethke provided additional support in the development of the vehicle and their support is gratefully acknowledged.
• Bastian Steder, Giorgio Grisetti and Cyrill Stachniss provided the visual SLAM algorithm and their support is gratefully acknowledged.
• The development of the vehicle vision system was supported by the Air Force Office of Scientific Research as part of the Defense University Research Instrumentation Program, “Cyber Flight Cage” contract # FA9550-071-0321.
• Electronic design of the vehicle was supported by Klaus-Michael Doth and his support is gratefully acknowledged.

References

[1] Shai Avidan. Ensemble tracking. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2005.

[2] H. Bay, T. Tuytelaars, and L. Van Gool. SURF: Speeded up robust features. In Proceedings of the 9th European Conference on Computer Vision, volume 1, pages 404–417, 2006.

[3] R. P. Bonasso, R. J. Firby, E. Gat, David Kortenkamp, D. Miller, and M. Slack. Experiences with an architecture for intelligent, reactive agents. Journal of Experimental and Theoretical Artificial Intelligence, 9(1), 1997.

[4] Y. Freund and R. E. Schapire. A decision-theoretic generalization of on-line learning and an application to boosting. In Computational Learning Theory: Second European Conference (EuroCOLT ’95), Barcelona, Spain, 1995.

[5] Rudolph Emil Kalman. A new approach to linear filtering and prediction problems. Transactions of the ASME–Journal of Basic Engineering, 82(Series D):35–45, 1960.

[6] L. E. Kavraki, P. Svestka, J.-C. Latombe, and M. Overmars. Probabilistic roadmaps for path planning in high dimensional configuration spaces. IEEE Transactions on Robotics and Automation, 12(4):566–580, 1996.

[7] A. Lookingbill, D. Lieb, D. Stavens, and S. Thrun. Learning activity-based ground models from a moving helicopter platform. In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), Barcelona, Spain, 2005.

[8] T. Lozano-Perez. Spatial planning: A configuration space approach. IEEE Transactions on Computers, C-32(2), 1983.

[9] Reid Simmons. The inter-process communication (IPC) system. http://www-2.cs.cmu.edu/afs/cs/project/TCA/www/ipc/ipc.html.

[10] A. F. M. Smith and A. E. Gelfand. Bayesian statistics without tears: A sampling-resampling perspective. The American Statistician, 46(2):84–88, 1992.

[11] Bastian Steder, Giorgio Grisetti, Slawomir Grzonka, Cyrill Stachniss, Axel Rottmann, and Wolfram Burgard. Learning maps in 3D using attitude and noisy vision sensors. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), San Diego, CA, USA, 2007.
