AI Senior Specialist
Indra Sistemas, Madrid
After obtaining my BSc in Telecommunications Engineering in 2017, I enrolled in the MSc in Telecommunications Engineering. During the second year I met Dr. Miguel Ángel Sotelo Vázquez, who introduced me to autonomous vehicle technologies and research.
In 2018 I started working for the INVETT (INtelligent VEhicles and Traffic Technologies) Research Group as an intern while finishing my MSc thesis, which was awarded 1st Finalist MSc Thesis in the IEEE Intelligent Transportation Systems Society (IEEE-ITSS) Spanish Chapter Awards 2020. In June 2019 I finished my MSc and started a four-year PhD grant (also within the INVETT Research Group) funded by the Spanish Ministry of Economy and Competitiveness. I then began my PhD in Information and Communications Technologies under the supervision of Dr. David Fernández Llorca and Dr. Ignacio Parra.
Since 2019 I have served as a reviewer for international conferences (IEEE Conference on Intelligent Transportation Systems, IEEE Intelligent Vehicles Symposium) as well as for international journals indexed in the JCR (IEEE Transactions on Intelligent Transportation Systems, IEEE Transactions on Intelligent Vehicles, IEEE Intelligent Transportation Systems Magazine, IET Intelligent Transport Systems).
During my PhD I have been working with undergraduate students in the laboratory/practical sessions of several engineering subjects, and co-advising students on the research needed to successfully complete their BSc theses. My research interests include autonomous driving, intelligent transportation systems, driver behaviour modelling, trajectory prediction, and transformer networks, among others.
I am currently working on my PhD thesis.
Indra Sistemas, Madrid
INVETT Research Group (Intelligent Vehicles and Traffic Technologies), Universidad de Alcalá
INVETT Research Group (Intelligent Vehicles and Traffic Technologies), Universidad de Alcalá
PhD in Information and Communications Technologies
Universidad de Alcalá, Spain
3-month internship at MRT-KIT (Institut für Mess- und Regelungstechnik, KIT)
Karlsruher Institut für Technologie, Karlsruhe, Germany
Master of Science in Telecommunications Engineering
Universidad de Alcalá, Spain
Bachelor of Science in Telecommunications Engineering
Universidad de Alcalá, Spain
During my PhD I have been teaching the practical/laboratory part of several subjects in engineering degrees.
The objective of the course is the in-depth study of structured programming using the C programming language. The syllabus covers: a review of basic pointer concepts, advanced use of pointers, advanced handling of functions, creation and manipulation of files, and dynamic data structures and algorithms.
Disclaimer: This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. In most cases, these works may not be reposted without the explicit permission of the copyright holder.
IEEE material: Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.
The accurate prediction of road user behaviour is of paramount importance for the design and implementation of effective trajectory prediction systems. Advances in this domain have recently centred on incorporating the social interactions between agents in a scene through the use of RNNs. Transformers have become a very useful alternative for this problem, as they make use of positional information in a straightforward fashion. The proposed model leverages positional information together with underlying scenario information, through goals in the digital map, in addition to the velocity and heading of the agent, to predict vehicle trajectories over a prediction horizon of up to 5 s. This approach allows the model to generate multimodal trajectories, considering different possible actions for each agent. It has been tested on a variety of urban scenarios, including intersections and roundabouts, achieving state-of-the-art performance in terms of generalization capability and providing an alternative to more complex models.
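As a rough illustration of this kind of goal-conditioned, Transformer-based predictor, the sketch below encodes past agent states (position, velocity, heading) together with a candidate goal from the map and outputs several candidate trajectories with confidences. It is a minimal PyTorch sketch under assumed dimensions and layer choices, not the actual model from the paper.

```python
# Minimal sketch (illustrative, not the paper's architecture): a Transformer encoder over
# past agent states plus a map-goal token, producing K candidate futures with confidences.
import torch
import torch.nn as nn

class GoalConditionedPredictor(nn.Module):
    def __init__(self, d_model=64, num_modes=6, horizon=50, state_dim=5, max_len=64):
        super().__init__()
        self.embed_state = nn.Linear(state_dim, d_model)   # past (x, y, vx, vy, heading) states
        self.embed_goal = nn.Linear(2, d_model)            # candidate goal point from the map
        self.pos = nn.Parameter(0.02 * torch.randn(1, max_len, d_model))  # learned positions
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.traj_head = nn.Linear(d_model, num_modes * horizon * 2)  # K (x, y) trajectories
        self.conf_head = nn.Linear(d_model, num_modes)                # one confidence per mode
        self.num_modes, self.horizon = num_modes, horizon

    def forward(self, past_states, goal):
        # past_states: (B, T_obs, state_dim), goal: (B, 2)
        tokens = torch.cat([self.embed_state(past_states),
                            self.embed_goal(goal).unsqueeze(1)], dim=1)
        tokens = tokens + self.pos[:, :tokens.size(1)]
        summary = self.encoder(tokens).mean(dim=1)         # pooled per-agent summary
        trajs = self.traj_head(summary).view(-1, self.num_modes, self.horizon, 2)
        confs = self.conf_head(summary).softmax(dim=-1)
        return trajs, confs

model = GoalConditionedPredictor()
# 8 agents, 2 s of history at 10 Hz; horizon of 50 steps corresponds to 5 s at 10 Hz
trajs, confs = model(torch.randn(8, 20, 5), torch.randn(8, 2))
print(trajs.shape, confs.shape)   # torch.Size([8, 6, 50, 2]) torch.Size([8, 6])
```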
This work presents a novel method for predicting vehicle trajectories in highway scenarios using efficient bird's eye view representations and convolutional neural networks. Vehicle positions, motion histories, road configuration, and vehicle interactions are easily included in the prediction model using basic visual representations. The U-net model has been selected as the prediction kernel to generate future visual representations of the scene using an image-to-image regression approach. A method has been implemented to extract vehicle positions from the generated graphical representations with subpixel resolution. The method has been trained and evaluated using the PREVENTION dataset, an on-board sensor dataset. Different network configurations and scene representations have been evaluated. This study found that a U-net with 6 depth levels, a linear terminal layer, and a Gaussian representation of the vehicles is the best-performing configuration. The use of lane markings was found to produce no improvement in prediction performance. The average prediction error is 0.47 and 0.38 meters and the final prediction error is 0.76 and 0.53 meters for longitudinal and lateral coordinates, respectively, for a predicted trajectory length of 2.0 seconds. The prediction error is up to 50% lower compared to the baseline method.
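The subpixel extraction step can be illustrated with a small sketch: given a predicted bird's-eye-view heatmap in which each vehicle appears as a Gaussian blob, take the arg-max cell and refine it with an intensity-weighted centroid over a small window. This is an illustrative reconstruction with assumed window size and resolution, not the exact procedure used in the paper.

```python
# Illustrative subpixel peak extraction from a Gaussian-blob heatmap (assumed 7x7 window).
import numpy as np

def subpixel_peak(heatmap: np.ndarray, window: int = 7):
    """Return the (row, col) of the strongest blob with subpixel resolution."""
    r, c = np.unravel_index(np.argmax(heatmap), heatmap.shape)
    half = window // 2
    r0, r1 = max(r - half, 0), min(r + half + 1, heatmap.shape[0])
    c0, c1 = max(c - half, 0), min(c + half + 1, heatmap.shape[1])
    patch = heatmap[r0:r1, c0:c1]
    rows, cols = np.mgrid[r0:r1, c0:c1]
    w = patch / patch.sum()                      # intensity-weighted centroid around the peak
    return float((rows * w).sum()), float((cols * w).sum())

# Example: a Gaussian centred between cells is recovered with subpixel accuracy.
yy, xx = np.mgrid[0:64, 0:64]
blob = np.exp(-((yy - 20.4) ** 2 + (xx - 31.7) ** 2) / (2 * 1.5 ** 2))
print(subpixel_peak(blob))                       # values close to (20.4, 31.7)
```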
This paper introduces a novel method for lane-change and lane-keeping detection and prediction of surrounding vehicles based on a Convolutional Neural Network (CNN) classification approach. Context, interaction, vehicle trajectories, and scene appearance are efficiently combined into a single RGB image that is fed as input to the classification model. Several state-of-the-art classification CNN models of varying complexity are evaluated to find the most suitable one in terms of anticipation and prediction. The model has been trained and evaluated using the PREVENTION dataset, a dataset specifically oriented to vehicle maneuver and trajectory prediction. The proposed model can be trained and used to detect lane changes as soon as they are observed, and to predict them before the lane-change maneuver is initiated. Concurrently, a study on human performance in predicting lane-change maneuvers using visual inputs has been conducted, so as to establish a solid benchmark for comparison. The empirical study reveals that humans are able to detect 83.9% of lane changes, on average 1.66 seconds in advance. The proposed automated maneuver detection model increases anticipation by 0.43 seconds and accuracy by 2.5% compared to human results, while the maneuver prediction model increases anticipation by 1.03 seconds with an accuracy decrease of only 0.5%.
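One illustrative way to fold appearance, context, and motion into a single RGB input, in the spirit of the description above, is to draw the recent trajectories of the target and surrounding vehicles onto the current camera frame before feeding it to the classifier. The colours, line widths, and coordinates below are arbitrary placeholder choices, not those of the paper.

```python
# Illustrative composition of a single RGB input encoding appearance plus past motion.
import numpy as np
import cv2

frame = np.zeros((480, 640, 3), dtype=np.uint8)   # current camera frame (placeholder)
target_track = np.array([[300, 300], [310, 298], [322, 295], [335, 290]], dtype=np.int32)
other_track = np.array([[100, 320], [110, 321], [121, 322], [133, 323]], dtype=np.int32)

# Draw past bounding-box centres as polylines: target vehicle in red, surrounding in green.
cv2.polylines(frame, [target_track.reshape(-1, 1, 2)], False, (0, 0, 255), 2)
cv2.polylines(frame, [other_track.reshape(-1, 1, 2)], False, (0, 255, 0), 2)

# `frame` is now a single RGB image combining scene appearance and motion cues,
# ready to be passed to a standard classification CNN.
print(frame.shape)   # (480, 640, 3)
```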
Understanding the behavior of road users is of vital importance for the development of trajectory prediction systems. In this context, the latest advances have focused on recurrent structures that establish the social interaction between the agents involved in the scene. More recently, simpler structures based on Transformer Networks and using positional information have also been introduced for predicting pedestrian trajectories. They allow each agent's trajectory to be modelled individually, without any complex interaction terms. Our model exploits these simple structures by adding augmented data (position and heading) and adapting their use to the problem of vehicle trajectory prediction in urban scenarios, for prediction horizons of up to 5 seconds. In addition, a cross-performance analysis is performed between different types of scenarios, including highways, intersections, and roundabouts, using recent datasets (inD, rounD, highD and INTERACTION). Our model achieves state-of-the-art results and proves to be flexible and adaptable to different types of urban contexts.
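A small sketch of the kind of input augmentation mentioned above: starting from raw (x, y) positions, heading is derived from consecutive displacements and stacked with the positions as per-timestep features for a per-agent sequence model. The sampling rate and feature layout are assumptions for illustration.

```python
# Illustrative augmentation of a raw track with heading information.
import numpy as np

def augment_track(xy: np.ndarray) -> np.ndarray:
    """(T, 2) positions -> (T, 4) per-timestep features [x, y, cos(heading), sin(heading)]."""
    d = np.diff(xy, axis=0)                             # displacements between samples
    heading = np.arctan2(d[:, 1], d[:, 0])              # heading angle from displacement
    heading = np.concatenate([heading[:1], heading])    # repeat first value to keep length T
    return np.column_stack([xy, np.cos(heading), np.sin(heading)])

track = np.cumsum(np.tile([0.5, 0.1], (20, 1)), axis=0)   # synthetic 20-step track
print(augment_track(track).shape)                          # (20, 4)
```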
While driving on highways, every driver tries to be aware of the behavior of surrounding vehicles, including possible emergency braking, evasive maneuvers to avoid obstacles, unexpected lane changes, or other emergencies that could lead to an accident. In this paper, humans' ability to predict lane changes in highway scenarios is analyzed through the use of video sequences extracted from the PREVENTION dataset, a database focused on research on vehicle intention and trajectory prediction. Users had to indicate the moment at which they considered that a lane-change maneuver was taking place in a target vehicle, subsequently indicating its direction: left or right. The results have been carefully analyzed and compared to ground-truth labels, evaluating statistical models to understand whether humans can actually predict. The study has revealed that most participants are unable to anticipate lane-change maneuvers, detecting them after they have started. These results might serve as a baseline for evaluating the prediction ability of AI systems, grading whether such systems can outperform human skills by analyzing hidden cues that go unnoticed by humans, improving detection time, and even anticipating maneuvers in some cases.
This paper describes a novel approach to performing vehicle trajectory prediction using graphic representations. The vehicles are represented as Gaussian distributions in a bird's-eye view. The U-net model is then used to perform sequence-to-sequence predictions. This deep learning-based methodology has been trained using the highD dataset, which contains vehicle detections in highway scenarios from aerial imagery. The problem is posed as an image-to-image regression problem, training the network to learn the underlying relations between the traffic participants. This approach generates an estimation of the future appearance of the input scene, not trajectories or numeric positions. An extra step is conducted to extract the positions from the predicted representation with subpixel resolution. Different network configurations have been tested, and the prediction error up to three seconds ahead is of the order of the representation resolution. The model has been tested in highway scenarios with more than 30 vehicles simultaneously in two opposite traffic flow streams, showing good qualitative and quantitative results.
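The graphic input representation can be sketched as follows: each vehicle is rendered as a 2-D Gaussian into a bird's-eye-view grid, and a stack of such images over time forms the input to the image-to-image prediction network. The grid size, resolution, and Gaussian width below are illustrative assumptions.

```python
# Illustrative rendering of vehicles as Gaussian blobs in a bird's-eye-view grid.
import numpy as np

def render_bev(positions_m, grid=256, res=0.5, sigma_px=2.0):
    """positions_m: iterable of (x, y) in metres -> (grid, grid) image with Gaussian blobs."""
    yy, xx = np.mgrid[0:grid, 0:grid]
    img = np.zeros((grid, grid), dtype=np.float32)
    for x, y in positions_m:
        cx, cy = x / res, y / res                    # metres -> pixels (assumed 0.5 m/px)
        img += np.exp(-((xx - cx) ** 2 + (yy - cy) ** 2) / (2 * sigma_px ** 2))
    return np.clip(img, 0.0, 1.0)

frame = render_bev([(30.0, 60.0), (45.5, 62.0), (70.2, 58.5)])   # three vehicles
print(frame.shape, float(frame.max()))                            # (256, 256) 1.0
```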
This paper describes preliminary results of two different methodologies used to predict lane changes of surrounding vehicles. These methodologies are deep-learning based, and the training procedure can be easily deployed using the labelling and data provided by the PREVENTION dataset. In this case, only visual information (data collected from the cameras) is used for both methodologies. On the one hand, visual information is processed using a new multi-channel representation of the temporal information, which is provided to a CNN model. On the other hand, a CNN-LSTM ensemble is also used to integrate temporal features. In both cases, the idea is to encode local and global context features as well as temporal information as the input of a CNN-based approach to perform lane-change intention prediction. Preliminary results showed that the dataset is highly versatile for dealing with different vehicle intention prediction approaches.
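The multi-channel temporal representation can be illustrated by stacking several past frames (or masks) along the channel axis, so that a standard CNN sees motion in a single input tensor. The number of frames, image size, and the toy network below are assumptions for illustration only.

```python
# Illustrative multi-channel temporal encoding fed to a small CNN.
import torch
import torch.nn as nn

num_frames, H, W = 5, 224, 224
frames = torch.rand(num_frames, H, W)            # past frames, oldest to newest
x = frames.unsqueeze(0)                          # (1, num_frames, H, W): frames as channels

# A CNN whose first convolution accepts `num_frames` channels instead of 3.
cnn = nn.Sequential(
    nn.Conv2d(num_frames, 16, kernel_size=3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(16, 3),                            # e.g. lane keep / left / right (assumed labels)
)
print(cnn(x).shape)                              # torch.Size([1, 3])
```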
Recent advances in autonomous driving have shown the importance of endowing self-driving cars with the ability to predict the intentions and future trajectories of other traffic participants. In this paper, we introduce the PREVENTION dataset, which provides a large number of accurate and detailed annotations of vehicle trajectories, categories, lanes, and events, including cut-in, cut-out, left/right lane changes, and hazardous maneuvers. Data is collected from 6 sensors of different nature (LiDAR, radar, and cameras), providing both redundancy and complementarity, using an instrumented vehicle driven under naturalistic conditions. The dataset contains 356 minutes, corresponding to 540 km of distance traveled, including more than 4M detections and more than 3K trajectories. Each vehicle is unequivocally identified with a unique id and the corresponding image, LiDAR, and radar coordinates. No other public dataset provides such a rich amount of data on different road scenarios and critical situations, with such long-range coverage around the ego-vehicle (up to 80 m), a redundant sensor set-up, and enhanced lane-change annotations of surrounding vehicles. The dataset is ready for developing learning and inference algorithms that predict vehicle intentions and future trajectories, including inter-vehicle interactions.
This paper describes an end-to-end training methodology for CNN-based fine-grained vehicle model classification. The method relies exclusively on images, without using complicated architectures. No extra annotations, pose normalization, or part localization are needed. Different CNN-based models are trained and validated using the CompCars [31] dataset, covering a total of 431 different car models. We obtained a top-1 validation accuracy of 97.62%, which substantially outperforms previous works.
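A hedged sketch of this kind of end-to-end set-up: a standard torchvision backbone with its final layer replaced by a 431-way output, trained directly on whole images with no part localization or pose normalization. The backbone choice and hyper-parameters below are illustrative, not those of the paper.

```python
# Illustrative end-to-end fine-grained classification set-up (431 car model classes).
import torch
import torch.nn as nn
from torchvision import models

num_models = 431
backbone = models.resnet50(weights=None)          # in practice, ImageNet-pretrained weights
backbone.fc = nn.Linear(backbone.fc.in_features, num_models)   # replace the output layer

optimizer = torch.optim.SGD(backbone.parameters(), lr=0.01, momentum=0.9)
criterion = nn.CrossEntropyLoss()

images = torch.randn(2, 3, 224, 224)              # whole images, no extra annotations
labels = torch.randint(0, num_models, (2,))
loss = criterion(backbone(images), labels)        # one end-to-end training step
loss.backward()
optimizer.step()
```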
The PREVENTION dataset collects data acquired by the INVETT Research Group's self-driving car (a Citroën C4) in a wide variety of scenarios and conditions. It contains more than six hours of driving from raw data sources such as colour cameras, LiDAR, and both long- and wide-range radars. The dataset is focused on the development of prediction algorithms and applications that can anticipate the intention and trajectory of surrounding vehicles.
If you use our data for research purposes, we would be grateful if you cite us.
@INPROCEEDINGS{prevention_dataset,
Room E-202.
Dpto. de Automática
Escuela Politécnica. Campus Universitario.
Ctra. Madrid-Barcelona, Km. 33,600.
28805 Alcalá de Henares (Madrid), Spain