Saturday, April 30, 2022

#VLAIauto AI Automotive Summit

From my invited talk at #VLAIauto AI Automotive Summit (of the 7th Automotive Sensors & Electronics), April 26-27, 2022 | Munich, Germany

#VLAIauto #ai #autonomous_driving

Thursday, August 6, 2020

The 2020 World ADAS & Autonomous Driving Conference

Tomorrow, I'm honored to be the Chairperson of the 2020 World ADAS & Autonomous Driving Conference and to moderate a very interesting panel discussion on the future of Connected, Autonomous, Shared, and Electric vehicles in a post-COVID world with Shyam Sundar, Sanjay Puri, and Vienna Harvey. I'm also excited to be presenting recent advances in Deep Learning approaches to Autonomous Driving.

I'm sharing this to encourage delegates from Egyptian universities to join. The conference is being held virtually this year, and delegates from universities and OEMs can register for free: (terms:

The World ADAS & AD Conference is a platform for experts from the automotive industry, academia, and government institutions to discuss the innovations, challenges, and future of automotive technologies.

Thursday, April 23, 2020

Online Teaching - Coronavirus

Because of the Coronavirus pandemic, I had to move my teaching online. Switching the lecture-based classes was much easier than conducting labs online. Specifically, it was challenging to deliver the Digital Design Lab I'm teaching at AUC online without breaking some of the Intended Learning Outcomes (ILOs). I feel really proud of how we've managed to cope with this challenge so far!

We enabled each student to connect remotely from home to a PC in the lab to access the lab's licensed software and tools. The hardware needed for each lab experiment is physically connected to all the lab machines. We also connected a camera to each PC that monitors the connected hardware, so students can see it remotely while programming it. Interestingly, we enabled students to mimic pressing hardware push buttons or switches remotely through their home PC keyboard!

In these images, for example, each student, from home, remotely uses Xilinx Vivado to design and develop a digital circuit, programs the physical FPGA in the lab, and sees the outcome of the circuit through a camera, while the TA and I are connected with the student via Zoom. Students can use keyboard key presses to press a switch or a button on the physical hardware board. In short, students can remotely program, see, and control/interact with the hardware and use the lab software and tools. One of the pictures is actually old; it shows the lab before switching to online teaching :)

This is a team effort, and I have to thank Dr. Mohamed Shalan and Eng. Amr ElShorbagy for their great contribution to it. This would not have been possible without the full support of the university and my department (CSE), which provided the cameras and the remote access tools. And of course I would like to thank my amazing students, who showed great agility and cooperation. It's also important to mention that any instructor who implements a similar approach should account for the extra overhead it places on students compared to in-person labs.

A camera shows the hardware while students work remotely from their homes on the machines in our empty lab

Students program, see, and control/interact with the hardware remotely

An old picture of the lab before switching to online teaching

Students program, see, and control/interact with the hardware remotely

Comments like this motivate :)

Tuesday, April 21, 2020

Distracted Driver Dataset

Hesham M. Eraqi 1,3,*, Yehya Abouelnaga 2,*, Mohamed H. Saad 3, Mohamed N. Moustafa 1

1 The American University in Cairo
2 Technical University of Munich
3 Valeo Egypt
* Both authors equally contributed to this work.


Our work is being used by researchers across academia and research labs:




Distracted Driver V1
Key Contributions:
  • First publicly available dataset for distracted driving
  • Training and testing datasets are split randomly
Dataset Information: 31 drivers
License Agreement: License Agreement V1

Distracted Driver V2
Key Contributions:
  • Collected more data with more drivers
  • More precise labeling and better sampling per class
  • Training and testing datasets are split based on drivers
Dataset Information: 44 drivers
License Agreement: License Agreement V2

Download Link: If you agree with the terms and conditions, please fill out the license agreement and send it to Yehya Abouelnaga or Hesham M. Eraqi. Upon receiving a filled and signed license agreement, we will send you the dataset and our training/testing splits.
  • H. Eraqi, Y. Abouelnaga, M. Saad, M. Moustafa, "Driver Distraction Identification with an Ensemble of Convolutional Neural Networks", Journal of Advanced Transportation, Machine Learning in Transportation (MLT) Issue, 2019.
  • Y. Abouelnaga, H. Eraqi, and M. Moustafa. "Real-time Distracted Driver Posture Classification". Neural Information Processing Systems (NIPS 2018), Workshop on Machine Learning for Intelligent Transportation Systems, Dec. 2018.

Terms & Conditions:

  • The dataset is the sole property of the Machine Intelligence group at the American University in Cairo (MI-AUC) and is protected by copyright. The dataset shall remain the exclusive property of the MI-AUC.
  • The End User acquires no ownership, rights or title of any kind in all or any parts with regard to the dataset.
  • Any commercial use of the dataset is strictly prohibited. Commercial use includes, but is not limited to: testing commercial systems; using screenshots of subjects from the dataset in advertisements; selling data or making any commercial use of the dataset; and broadcasting data from the dataset.
  • The End User shall not, without prior authorization of the MI-AUC group, transfer in any way, permanently or temporarily, distribute or broadcast all or part of the dataset to third parties.
  • The End User shall send all requests for the distribution of the dataset to the MI-AUC group.
  • All publications that report on research that uses the dataset should cite our publications.


This is the first publicly available dataset for distracted driver detection. We had 44 participants from 7 different countries: Egypt (37), Germany (2), USA (1), Canada (1), Uganda (1), Palestine (1), and Morocco (1). Of all participants, 29 were male and 15 were female. Some drivers participated in more than one recording session, at different times of day, under different driving conditions, and wearing different clothes. Videos were shot in 5 different cars: Proton Gen2, Mitsubishi Lancer, Nissan Sunny, KIA Carens, and a prototyping car. We extracted 14,478 frames distributed over the following classes: Safe Driving (2,986), Phone Right (1,256), Phone Left (1,320), Text Right (1,718), Text Left (1,124), Adjusting Radio (1,123), Drinking (1,076), Hair or Makeup (1,044), Reaching Behind (1,034), and Talking to Passenger (1,797). The sampling was done manually by inspecting the video files by eye and assigning a distraction label to each frame. The transitional actions between consecutive distraction types were manually removed. The figure below shows samples for the ten classes in our dataset.
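Because some drivers appear in several recording sessions, a random frame-level split can place very similar frames of the same driver in both the training and testing sets. The following is a minimal sketch of a driver-based split, not the procedure used to build the released splits. It assumes each frame record carries a hypothetical `driver_id` field, and the function name, test fraction, and toy metadata are illustrative only.

```python
import random

def split_by_driver(frames, test_fraction=0.25, seed=0):
    """Split frames so that no driver appears in both train and test sets."""
    drivers = sorted({f["driver_id"] for f in frames})
    rng = random.Random(seed)
    rng.shuffle(drivers)
    # Hold out a fraction of the drivers (at least one) for testing.
    n_test = max(1, round(len(drivers) * test_fraction))
    held_out = set(drivers[:n_test])
    train = [f for f in frames if f["driver_id"] not in held_out]
    test = [f for f in frames if f["driver_id"] in held_out]
    return train, test

# Hypothetical toy metadata: 8 drivers with 5 frames each.
frames = [
    {"driver_id": d, "frame": i, "label": "Safe Driving"}
    for d in range(8)
    for i in range(5)
]
train, test = split_by_driver(frames)
train_drivers = {f["driver_id"] for f in train}
test_drivers = {f["driver_id"] for f in test}
print(len(train), len(test), train_drivers & test_drivers)
```

With this kind of split, test accuracy measures how well a model generalizes to drivers it has never seen.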


All publications that report on research that uses the dataset should cite our work(s):
Hesham M. Eraqi, Yehya Abouelnaga, Mohamed H. Saad, Mohamed N. Moustafa, “Driver Distraction Identification with an Ensemble of Convolutional Neural Networks”, Journal of Advanced Transportation, Machine Learning in Transportation (MLT) Issue, 2019.
Yehya Abouelnaga, Hesham M. Eraqi, and Mohamed N. Moustafa, “Real-time Distracted Driver Posture Classification”, Machine Learning for Intelligent Transportation Systems Workshop in the 32nd Conference on Neural Information Processing Systems (NeurIPS), Montréal, Canada, 2018.

Friday, April 17, 2020

Clustering Lectures [English]

Lecture 1: Clustering 1 (K-means)

Lecture 2: Clustering 2 (DBSCAN - Hierarchical - GMM - Validation)

Arabot Robot

Arabot is my graduation project (BSc thesis) from the Electrical Communications Engineering department at Cairo University in 2010. The team was composed of 5 colleagues and friends, including Hany Ahmed, Mahmoud Sami, Mostafa Khattab, and Mahmoud Serag. I wanted to share some pics and videos with you here :)

We developed a robot waiter that serves the customers of restaurants and hotels. It understands continuous Arabic speech and interacts with the customers and the kitchen staff by listening to their orders. It moves around the place, chats with customers, takes orders, delivers them, and sends text orders to the chef’s PC wirelessly in a fully automated way. It also has a user interface that enables customization of its behavior, look, and technical parameters (even the color of the LEDs on Arabot's chest). The robot was fully designed and built by us.

Arabot was awarded:
- The best project of the Egyptian Engineering Day 2010
- SAMSUNG Real Dreams Award 2011
- Young Innovator Award 2010

Sunday, May 5, 2019

IndabaX: My Deep Learning Object Detection Workshop

Object Detection with Deep Learning Workshop

A 3-hour Google Colab-based workshop that I was invited to conduct at the IndabaX Egypt 2019 conference. It uses TensorFlow & PyTorch to demonstrate the progress of Deep Learning-based algorithms for object detection in images.

Sunday, September 30, 2018

GANs: Generative Adversarial Networks

The people in the image below are not real. They were generated by a Generative Adversarial Network (GAN) trained on 30,000 celebrity photos.

GANs are one of the Deep Learning-based generative methods that can produce novel samples from high-dimensional data distributions, like images or LiDAR data. In the figure above, the model learned to generate entirely new images that mimic the appearance of real photos, so it generated photos of new people. Machine Learning models can be divided into two types: Generative Models and Discriminative Models. To understand the difference, let us assume you have input data X and you want to classify the data into labels Y. A generative model learns the joint probability distribution P(X,Y), while a discriminative model learns the conditional probability distribution P(Y|X), which is read as "the probability of Y given X". As a simple example, if you have data in the form of (X,Y), here is what each model tries to estimate:
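To make the two distributions concrete, here is a small sketch that estimates both from counts. The four (X,Y) pairs are a made-up toy sample chosen only for illustration; they are not from any dataset mentioned in this post.

```python
from collections import Counter
from fractions import Fraction

# Hypothetical toy sample of (X, Y) pairs.
data = [(1, 0), (1, 0), (2, 0), (2, 1)]

pair_counts = Counter(data)
x_counts = Counter(x for x, _ in data)
n = len(data)

# Generative view: the joint distribution P(X, Y).
joint = {pair: Fraction(c, n) for pair, c in pair_counts.items()}

# Discriminative view: the conditional distribution P(Y | X).
conditional = {
    (x, y): Fraction(c, x_counts[x]) for (x, y), c in pair_counts.items()
}

for (x, y), p in sorted(joint.items()):
    print(f"P(X={x}, Y={y}) = {p}")
for (x, y), p in sorted(conditional.items()):
    print(f"P(Y={y} | X={x}) = {p}")
# P(X=1, Y=0) = 1/2
# P(X=2, Y=0) = 1/4
# P(X=2, Y=1) = 1/4
# P(Y=0 | X=1) = 1
# P(Y=0 | X=2) = 1/2
# P(Y=1 | X=2) = 1/2
```

The joint values sum to 1 over all pairs, while the conditional values sum to 1 separately for each value of X.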

Generative models provide a way to learn data representations without extensively annotated training data. They can learn in supervised, unsupervised, or semi-supervised modes. The distribution P(Y|X) is more natural for classifying a given example X into a class Y. However, the generative model distribution P(X,Y) can be transformed into P(Y|X) by applying Bayes' rule and then also be used for classification. For example, suppose we have two classes of animals, elephant (Y=1) and dog (Y=0), and X is the animal picture. Given a training dataset, a discriminative model tries to find a multidimensional decision boundary that separates the elephant and dog images in space. Then, to classify a new animal image as either an elephant or a dog, it checks on which side of the decision boundary the image falls and makes its prediction accordingly. A generative model, in contrast, looks at the elephant images and builds a model of what elephants look like. Then it looks at the dog images and builds a separate model of what dogs look like. Finally, to classify a new animal image, we match the image against the elephant and the dog models to see whether it looks more like the elephants or more like the dogs seen in the training dataset. The advantage of a generative model is that it can also use P(X,Y) to generate likely (X,Y) pairs, because it has learned the training data distribution. However, discriminative models generally outperform generative models in classification tasks.
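The elephant/dog example can be sketched with a very small generative classifier. This is an illustration under strong assumptions: instead of a picture, X is a single hypothetical feature (body weight in kg), each class is modeled with a 1-D Gaussian, and the training numbers are made up. The posterior P(Y|X) is obtained from the joint P(X,Y) = P(X|Y)P(Y) via Bayes' rule, and the same model can also generate new (X,Y) pairs.

```python
import math
import random
from statistics import mean, pstdev

def gaussian_pdf(x, mu, sigma):
    """Density of a 1-D Gaussian with mean mu and standard deviation sigma."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def fit(samples_by_class):
    """Build one model per class: (mean, std, prior P(Y))."""
    total = sum(len(v) for v in samples_by_class.values())
    return {y: (mean(v), pstdev(v), len(v) / total) for y, v in samples_by_class.items()}

def posterior(x, model):
    """Bayes' rule: P(Y|X) = P(X,Y) / sum over y of P(X,y)."""
    joint = {y: gaussian_pdf(x, mu, s) * prior for y, (mu, s, prior) in model.items()}
    evidence = sum(joint.values())
    return {y: p / evidence for y, p in joint.items()}

def sample(model, rng):
    """Generate a likely (X, Y) pair from the learned joint distribution."""
    y = rng.choices(list(model), weights=[prior for _, _, prior in model.values()])[0]
    mu, sigma, _ = model[y]
    return rng.gauss(mu, sigma), y

# Hypothetical training data: weights in kg.
training = {
    1: [4800.0, 5200.0, 5500.0, 6000.0],  # elephants (Y=1)
    0: [8.0, 20.0, 30.0, 45.0],           # dogs (Y=0)
}
model = fit(training)
post = posterior(25.0, model)        # a 25 kg animal
print(post)                          # almost all probability on Y=0 (dog)
print(sample(model, random.Random(0)))
```

A discriminative model would skip the per-class densities and fit the boundary between the two classes directly, which is why it cannot generate new samples.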

Generative models can learn data of any type: speech, LiDAR point clouds, text, videos, etc. The data does not have to be images. Generative models are mainly classified into two types:
1) Density Estimation models:
These learn a Probability Density Function (PDF) of the training dataset that can be used to generate data similar to what was seen in that training dataset. They try to estimate a PDF that is very close to that of the training dataset.

2) Sample Generation models:
These learn a model that can directly generate data that is very close to the training dataset. There is no need to explicitly learn a PDF of the training data.

Examples of Density Estimation models are Deep Belief Networks (DBNs) and Restricted Boltzmann Machines (RBMs), while Generative Adversarial Networks (GANs) fall into the second category of Sample Generation models. The name “adversarial” refers to the two networks being trained against each other: one network is trained on generated (not real) examples so that it learns to differentiate them from real data examples, while the other learns to generate examples that are harder to tell apart. Typically, a GAN consists of two Neural Networks: a generator (G) and a discriminator or critic (D).

Real data samples X (which could be images) are sampled from the training dataset and given as input to D, and D is trained to make D(X) close to 1 (D is differentiable). On the other side, an input noise vector N is fed to G so that it generates a fake image G(N). That image is then given as input to D, and D is trained to make D( G(N) ) close to 0. Hence, with time the discriminator learns to differentiate real from fake data. To make the generator better at generating real-like samples, D is stacked after G, and G is trained to make D( G(N) ) close to 1 (G is differentiable and D is kept constant). Hence, with time the generator becomes smarter and generates more real-like samples. The following figure describes the learning process. Typically, the generator is of main interest; the discriminator is an adaptive loss function that gets discarded once the generator has been trained.
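The alternating updates described above can be sketched with a deliberately tiny 1-D GAN in plain Python. Everything here is an assumption made for illustration: the generator is a linear map G(n) = a*n + b, the discriminator is a logistic unit D(x) = sigmoid(w*x + c), the "real" data is drawn from a Gaussian with mean 4, the gradients are derived by hand for the cross-entropy losses, and the class name, learning rate, and step count are arbitrary. Real GANs use deep networks and an autodiff framework.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

class ToyGAN:
    def __init__(self):
        self.a, self.b = 1.0, 0.0  # generator parameters: G(n) = a*n + b
        self.w, self.c = 0.1, 0.0  # discriminator parameters: D(x) = sigmoid(w*x + c)

    def G(self, n):
        return self.a * n + self.b

    def D(self, x):
        return sigmoid(self.w * x + self.c)

    def discriminator_step(self, x_real, n, lr):
        """Push D(x_real) toward 1 and D(G(n)) toward 0; G is held fixed."""
        x_fake = self.G(n)
        s_real, s_fake = self.D(x_real), self.D(x_fake)
        # Gradients of -log D(x_real) - log(1 - D(x_fake)) w.r.t. w and c.
        grad_w = (s_real - 1.0) * x_real + s_fake * x_fake
        grad_c = (s_real - 1.0) + s_fake
        self.w -= lr * grad_w
        self.c -= lr * grad_c

    def generator_step(self, n, lr):
        """Push D(G(n)) toward 1; D is held fixed."""
        s_fake = self.D(self.G(n))
        # Gradient of -log D(G(n)) w.r.t. the generated sample G(n).
        grad_x = (s_fake - 1.0) * self.w
        self.a -= lr * grad_x * n
        self.b -= lr * grad_x

rng = random.Random(0)
gan = ToyGAN()
for _ in range(2000):
    x_real = rng.gauss(4.0, 1.0)  # a "real" sample
    gan.discriminator_step(x_real, rng.gauss(0.0, 1.0), lr=0.01)
    gan.generator_step(rng.gauss(0.0, 1.0), lr=0.01)

print("generator:", gan.a, gan.b)
print("discriminator:", gan.w, gan.c)
```

Note that each step only touches one network's parameters, which is what "D is kept constant" means during the generator update.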

If D makes the right prediction, G updates its parameters in order to generate better fake samples to fool D. If D’s prediction is incorrect, it tries to learn from its mistake to avoid similar mistakes in the future. This process continues until an equilibrium is established. Let’s discuss this equilibrium through an example. Assume G is a money forger and D is the police. The police want to allow people with real money to spend it safely without being punished, and to catch forged money. Simultaneously, the money forger wants to fool the police and successfully use the forged money. So with time, both players get better and better at their jobs. If both players have unlimited capabilities, the “Nash equilibrium” (from game theory) corresponds to G generating perfect samples that come from the same distribution as the training data. Hence, for any image received (real or generated), D will say it’s 50% real and 50% fake. The goal of training a GAN is to establish an equilibrium between the errors of the Generator and of the Discriminator. In other words, the learning phase ends when the Generator is “smart” enough to fool the Discriminator in 50% of cases.
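The 50% figure can be checked numerically. For a fixed generator, the best possible discriminator is known to be D*(x) = p_data(x) / (p_data(x) + p_g(x)), where p_data is the real data distribution and p_g is the generator's distribution. The sketch below evaluates that formula on two made-up discrete distributions over three hypothetical outcomes; the numbers are for illustration only.

```python
def optimal_discriminator(p_data, p_gen):
    """D*(x) = p_data(x) / (p_data(x) + p_gen(x)) for each outcome x."""
    return {x: p_data[x] / (p_data[x] + p_gen[x]) for x in p_data}

# Hypothetical distributions over three possible samples.
p_data = {"a": 0.5, "b": 0.3, "c": 0.2}
imperfect_gen = {"a": 0.2, "b": 0.3, "c": 0.5}
perfect_gen = dict(p_data)  # generator matches the data distribution exactly

d_imperfect = optimal_discriminator(p_data, imperfect_gen)
d_equilibrium = optimal_discriminator(p_data, perfect_gen)

print(d_imperfect)    # above 0.5 where real data is more likely, below 0.5 otherwise
print(d_equilibrium)  # {'a': 0.5, 'b': 0.5, 'c': 0.5}
```

When the generator is imperfect, D can still do better than chance on some samples; once the two distributions match, no discriminator can beat a 50/50 guess.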

The generator's random noise vector specifies which image is generated; each generated image corresponds to one random input vector. One nice and interesting observation about GANs is that arithmetic operations on that input noise vector take on a meaningful role. For example, if you subtract the vector that generates a “man” image from the vector that generates a “man with glasses” image, and then add the vector that generates a “woman” image, the resulting vector causes the GAN to generate a “woman with glasses” image. It is like an arithmetic operation:
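Written out, the operation is "man with glasses" - "man" + "woman" = "woman with glasses". The toy sketch below shows only the vector arithmetic. The 3-D vectors are hypothetical hand-picked values in which the second coordinate happens to encode "glasses"; in a trained GAN the noise vectors have many more dimensions and the resulting vector would be passed through the generator to obtain the image.

```python
def add(u, v):
    return tuple(a + b for a, b in zip(u, v))

def sub(u, v):
    return tuple(a - b for a, b in zip(u, v))

# Hypothetical noise vectors (made-up values for illustration).
z_man_glasses = (1.0, 1.0, 0.2)
z_man = (1.0, 0.0, 0.2)
z_woman = (-1.0, 0.0, 0.3)

# "man with glasses" - "man" isolates the glasses direction,
# which is then added to "woman".
z_woman_glasses = add(sub(z_man_glasses, z_man), z_woman)
print(z_woman_glasses)  # (-1.0, 1.0, 0.3)
```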

Generative Adversarial Networks are an interesting development, giving us a new way to do unsupervised learning and allowing machines to generate data and art, like the following painting in the style of Van Gogh, which was generated by Convolutional Neural Networks. The figure also shows bedrooms that were generated with GANs.

One key advantage of GANs is that they help bridge the gap between simulation and reality. The task of autonomous driving requires collecting a huge amount of real data to achieve acceptable generalization. With GANs, translating simulated (synthetic) data into realistic data becomes feasible. In the same context, in CDA/CDV we are currently investigating the ability to generate ScaLa LiDAR data that corresponds to camera images, and the other way around.

One big open problem with GANs is that it is hard to evaluate their performance. In the image domain, it is quite easy to at least look at the generated samples to judge the model's quality, although this is obviously not a satisfying solution: it is not automated, and the human factor makes it a qualitative evaluation. Evaluation becomes a bigger problem for non-image data, for example a GAN that generates ScaLa-like LiDAR data.