Big Data for Mobility

Corralling Data to Solve Mobility Problems

Big Data for Mobility combines Internet of Things sensor data with sophisticated computational frameworks and data science to analyze and improve the cost, energy, and environmental impact of our mobility systems.

Researchers use real-world, near real-time data to optimize mobility, energy efficiency, and productivity.  

Lawrence Berkeley National Laboratory (Berkeley Lab) researchers treat our ever-present mobile devices as sensors that continuously transmit useful information. This opens up research that includes collecting and analyzing data to predict traffic flow, prevent traffic jams, and improve safety.

Our researchers develop the data science and computational frameworks needed to build next-generation transportation/mobility system models and operational analytics.  Leveraging high-performance computing, sensor data, and big data analytics, they model urban-scale transportation networks and energy systems. They use real-world, near real-time data to optimize mobility, energy efficiency, and productivity.  

Research also illuminates interdependencies between infrastructure, transportation, and the environment to inform public policy, including improving air quality and integrating electric vehicles.

Researchers apply diverse computational techniques to large-scale mobility issues, including machine learning algorithms, diffusion convolutional neural networks, game theory, convex optimization, network optimization, deep reinforcement learning, partial differential equations, and numerical analysis.

Research Groups

Smart Cities Research Center

Dr. Jane Macfarlane is the Director of the Smart Cities Research Center, a collaboration between the University of California, Berkeley's (UC Berkeley's) Institute of Transportation Studies and Berkeley Lab. The center works to improve energy-efficient mobility systems by studying mathematical models and data analytics with approaches ranging from urban-scale simulation to control theory. Its researchers work with industry and public agencies to collect and model data to develop more efficient transportation networks. The research focuses on novel approaches to modeling interdependent energy and transportation systems employing machine learning, high-performance computing, and various optimization algorithms and infrastructure control methods.

The center takes advantage of data collected by mobile devices to dramatically enhance the understanding of urban mobility. The work leverages rich geospatial data analytics and develops novel approaches to studying urban dynamics in the nexus of cyber, physical, and social systems. Its researchers produce transportation development scenarios and recommendations for practitioners and decision-makers. Research areas are grounded in the disciplines covered by master's and doctoral programs in civil engineering, systems engineering, urban planning, and transportation engineering.


Mobiliti: Scalable Transportation Simulation Using High-Performance Computing

Mobiliti is a software tool that accurately simulates the San Francisco Bay Area population's movement through its road networks and estimates associated congestion, energy usage, and productivity loss. (See article: Mobiliti: A Game Changer for Analyzing Traffic Congestion.) Mobiliti is a proof-of-concept, scalable transportation system simulator that implements parallel discrete event simulation on high-performance computers. It uses diffusion convolutional neural networks with data from infrastructure sensors (inductive loops in the highway) to predict traffic dynamics. More than 11,000 sensors are being used to simultaneously predict speeds and flows one hour into the future. The researchers plan to use the same machine learning model to determine whether mobile device data could be used as the input to reduce reliance on embedded sensors. They are using Berkeley Lab's Cori supercomputer to simulate metropolitan-scale areas in reasonable compute times (minutes versus days).
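To make the idea of discrete event simulation concrete, the following is a minimal, sequential sketch in Python. It is not Mobiliti's implementation: the three-link network, the linear congestion rule, and the names `LINKS` and `simulate` are all assumptions made for illustration, and a parallel simulator would additionally partition the event queue across processors.

```python
import heapq

# Hypothetical toy network: link name -> (free-flow minutes, capacity in vehicles).
LINKS = {"a": (5.0, 2), "b": (3.0, 1), "c": (4.0, 2)}


def simulate(trips):
    """Run a sequential discrete event simulation.

    trips: list of (departure_time, [link, link, ...]).
    Returns a dict mapping trip index to arrival time.
    """
    occupancy = {name: 0 for name in LINKS}
    arrivals = {}
    events = []  # heap of (time, sequence number, trip index, next link index)
    seq = 0      # tie-breaker so simultaneous events keep insertion order
    for trip_id, (depart, _route) in enumerate(trips):
        heapq.heappush(events, (depart, seq, trip_id, 0))
        seq += 1

    while events:
        now, _, trip_id, idx = heapq.heappop(events)  # always the earliest event
        route = trips[trip_id][1]
        if idx > 0:
            occupancy[route[idx - 1]] -= 1  # vehicle leaves the previous link
        if idx == len(route):
            arrivals[trip_id] = now         # trip finished
            continue
        link = route[idx]
        free_flow, capacity = LINKS[link]
        # Assumed congestion rule: delay grows linearly with vehicles already on the link.
        travel = free_flow * (1.0 + occupancy[link] / capacity)
        occupancy[link] += 1
        heapq.heappush(events, (now + travel, seq, trip_id, idx + 1))
        seq += 1
    return arrivals


trips = [(0.0, ["a", "b"]), (0.0, ["a", "c"]), (1.0, ["b", "c"])]
arrivals = simulate(trips)
print(arrivals)
```

The simulation clock jumps from one event to the next instead of advancing in fixed time steps, which is what lets this style of simulator skip over periods in which nothing changes.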

They have also introduced optimization algorithms to provide dynamic traffic assignment at scale. The goal is to use this capability to create synthetic data (large-scale speed and congestion profiles across the entire road network) for a variety of conditions and then use machine learning to represent the fundamental constraints and dynamics introduced by the characteristics of the road network. They hope to engage the transportation community in the design of next-generation active congestion control strategies.
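As a rough illustration of what a traffic assignment computes, the sketch below balances a fixed demand across two parallel routes until neither is faster than the other. It is a static, two-route simplification and not the dynamic assignment algorithm used by the researchers. It assumes the standard Bureau of Public Roads (BPR) delay function and the method of successive averages; the demand, free-flow times, and capacities are made-up values.

```python
def bpr_time(flow, free_flow, capacity, alpha=0.15, beta=4):
    """BPR delay function: travel time rises with the flow-to-capacity ratio."""
    return free_flow * (1.0 + alpha * (flow / capacity) ** beta)


def msa_two_routes(demand, route_a, route_b, iterations=2000):
    """Method of successive averages on two parallel routes.

    route_a, route_b: (free_flow_minutes, capacity_vehicles_per_hour).
    Returns (flow_a, flow_b) approximating a user equilibrium.
    """
    flow_a = demand  # arbitrary starting point: everyone on route A
    for n in range(1, iterations + 1):
        time_a = bpr_time(flow_a, *route_a)
        time_b = bpr_time(demand - flow_a, *route_b)
        # All-or-nothing target: send all demand to the currently faster route.
        target_a = demand if time_a <= time_b else 0.0
        # Move a shrinking fraction of the way toward that target.
        flow_a += (target_a - flow_a) / (n + 1)
    return flow_a, demand - flow_a


# Hypothetical inputs: 1,000 vehicles per hour split over a short, narrow
# route and a longer, wider one.
route_a, route_b = (10.0, 500.0), (15.0, 800.0)
flow_a, flow_b = msa_two_routes(1000.0, route_a, route_b)
print(round(flow_a), round(flow_b))
print(bpr_time(flow_a, *route_a), bpr_time(flow_b, *route_b))
```

At the resulting split, the two travel times are nearly equal, which is the defining property of a user equilibrium: no driver can save time by switching routes.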

The simulation capability includes a data-driven energy estimation that uses machine learning to estimate energy consumed on the network from mobile device data. With this, they can compare active congestion control strategies in the context of the energy impact.

HPC4Mobility: Quasi-Dynamic Traffic Assignment for Large Metropolitan Areas

Connectivity enabled by telecommunications systems has introduced the opportunity to implement active control of vehicle routing across connected fleets.

The U.S. Department of Energy (DOE) sponsors Mobiliti through its Vehicle Technologies Office under the Big Data Solutions for Mobility Program, an initiative of the Energy Efficient Mobility Systems (EEMS) Program.

HumNet Lab: Human Mobility on Networks

Complex systems and network sciences to improve urban societies

HumNet Lab is led by Berkeley Lab researcher Marta Gonzalez, an associate professor in UC Berkeley's City & Regional Planning Department and a faculty council member of the Berkeley Institute for Data Science. HumNet Lab develops numerical models and computational tools to better characterize and understand human interactions in the built and natural environments. (See news story: Machine Learning to Help Optimize Traffic and Reduce Pollution. See also the video presentation in which Dr. Marta Gonzalez presents a study for optimizing plug-in electric vehicles' start and end charging times and the DeepAir project.)


DeepAir: Machine Learning for Improved Air Quality

DeepAir (Deep Learning and Satellite Imagery to Estimate Air Quality Impacts at Scale) is the first-ever application of data fusion of infrastructure imagery (urban form and transportation networks) with environmental sensors. The goal is to enable science-informed policy by understanding interdependencies between infrastructure, transportation, and the environment.

The lab adopts state-of-the-art techniques in computer vision and urban traffic data from mobile phones to quantitatively link transportation policy interventions with air quality improvement. Such tools allow for the design of improved static and mobile air pollution sensing networks that are rapidly evolving with emerging sensor technologies (personal monitors, satellite monitoring, and even non-air quality measurements). The goal is to make these methodologies fully scalable from local to regional scales and extendable to other domains involving human and environmental interactions.

Bikes Planning

For this project, researchers introduced a data science framework to identify streets that are candidates for adding new bike infrastructure, taking into account potential bike flow and preserving global connectivity. They identify potential bicycle trips by coupling mobile phone data and GPS traces from bikers' smartphone applications. 
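One way to picture the trade-off between bike flow and connectivity is a greedy, Kruskal-style selection with a union-find structure: consider candidate streets from highest to lowest estimated flow, and keep only those that join parts of the bike network that were previously disconnected. This is a sketch of that general idea under invented data, not the researchers' framework; the function name `pick_bike_streets`, the intersections, and the flow values are hypothetical.

```python
def pick_bike_streets(nodes, existing, candidates):
    """Greedy connectivity-driven selection of new bike links.

    nodes: iterable of intersection labels.
    existing: list of (u, v) links that already have bike infrastructure.
    candidates: list of (estimated_daily_bike_trips, u, v).
    Returns the chosen candidates as (u, v, trips), highest flow first.
    """
    parent = {n: n for n in nodes}

    def find(n):
        # Follow parent pointers to the component root, compressing the path.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    def union(a, b):
        # Merge two components; return False if they were already connected.
        root_a, root_b = find(a), find(b)
        if root_a == root_b:
            return False
        parent[root_a] = root_b
        return True

    for u, v in existing:
        union(u, v)

    chosen = []
    for trips, u, v in sorted(candidates, reverse=True):
        if union(u, v):  # keep only streets that link separate components
            chosen.append((u, v, trips))
    return chosen


# Hypothetical example: two disconnected bike corridors (A-B and C-D) plus
# an unserved intersection E.
nodes = ["A", "B", "C", "D", "E"]
existing = [("A", "B"), ("C", "D")]
candidates = [(300, "A", "B"), (200, "A", "C"), (120, "B", "C"),
              (80, "D", "E"), (50, "B", "D")]
print(pick_bike_streets(nodes, existing, candidates))
```

A real planning tool would also weigh high-flow streets inside an already connected area; this sketch deliberately covers only the connectivity criterion.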

Urban Traffic

This project uses novel methods to model traffic in urban road networks, accounting for properties such as maximum vehicle throughput and the network's structure.

Planning Electric Vehicles 

HumNet has also used cell phone data to study how people move around cities and to recommend electric vehicle charging schemes to save energy and costs.
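The effect of shifting charging times can be sketched with a simple greedy scheduler: each vehicle is assigned the start hour, within the window it is parked, that adds least to the existing load. This is an illustrative heuristic under invented inputs, not HumNet's method; the function names and the four-vehicle example are assumptions.

```python
def immediate_load(vehicles, horizon=24):
    """Hourly load if every vehicle starts charging as soon as it arrives."""
    load = [0] * horizon
    for arrive, _depart, need in vehicles:
        for hour in range(arrive, arrive + need):
            load[hour] += 1
    return load


def schedule_charging(vehicles, horizon=24):
    """Greedy peak-shaving schedule.

    vehicles: list of (arrive_hour, depart_hour, hours_needed), all within
    one horizon, with depart_hour - arrive_hour >= hours_needed.
    Returns (start_hours, load), where load[h] counts vehicles charging in hour h.
    """
    load = [0] * horizon
    starts = []
    for arrive, depart, need in vehicles:
        if depart - arrive < need:
            raise ValueError("parking window is shorter than the charging time")
        best_start, best_key = None, None
        for start in range(arrive, depart - need + 1):
            window = load[start:start + need]
            # Prefer the lowest peak in the window, then the least total overlap.
            key = (max(window), sum(window))
            if best_key is None or key < best_key:
                best_start, best_key = start, key
        starts.append(best_start)
        for hour in range(best_start, best_start + need):
            load[hour] += 1
    return starts, load


# Hypothetical fleet: four vehicles parked for ten hours, each needing three
# hours of charging.
fleet = [(0, 10, 3)] * 4
starts, load = schedule_charging(fleet)
print(starts, max(load), max(immediate_load(fleet)))
```

In this example, staggering the start times halves the peak number of vehicles charging at once compared with charging on arrival, while every vehicle still finishes before it departs.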

Berkeley Mobile Sensing Lab

The Berkeley Mobile Sensing Lab is led by Dr. Alexandre Bayen, faculty scientist at Berkeley Lab, the Liao-Cho professor of engineering at UC Berkeley, and the director of the Institute of Transportation Studies. The Mobile Sensing Lab's research lies at the intersection of control, optimization, and machine learning with applications in mobile robotics, transportation, and engineering. The problems involve the application of machine learning algorithms to large-scale mobility problems. Computing techniques include game theory, convex optimization, network optimization, deep reinforcement learning, partial differential equations, and numerical analysis.



Flow is a deep reinforcement learning framework implemented on Amazon Web Services (AWS) Elastic Compute Cloud (EC2) and used for learning and optimization over microsimulation tools for traffic flow. Its main application is mixed-autonomy traffic: studying the impact of a small proportion of self-driving vehicles on the rest of traffic flow. Flow is also a traffic control benchmarking framework. It provides a suite of traffic control scenarios (benchmarks), tools for designing custom traffic scenarios, and integration with deep reinforcement learning and traffic microsimulation libraries.

Network Optimization and Analysis of the Impact of Information on Traffic Flow

This project focuses on analyzing the impact of routing apps such as Google, Waze, Apple traffic, and INRIX. Our approach develops new network traffic flow models that incorporate the impact of routing information on traffic flow and route choice. We provide a theoretical analysis of the resulting mathematical framework, along with numerical simulations of the impact of such apps on congestion in practical cases.
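A textbook example helps show why routing information can matter for congestion. In Pigou's two-link network, one link always takes 1 unit of time, while the other takes time equal to the share of drivers using it. If every driver picks the route that is fastest for them, the outcome is worse on average than a coordinated split. The sketch below works through that example; it is a standard illustration from routing theory and not the project's model.

```python
def average_cost(share_on_variable_link):
    """Average travel time in Pigou's network.

    A share x of drivers uses the link whose time equals x; the remaining
    1 - x use the link with constant time 1.
    """
    x = share_on_variable_link
    return x * x + (1.0 - x) * 1.0


# Selfish equilibrium: the variable link is never slower than the constant
# one, so every driver chooses it (x = 1).
equilibrium_cost = average_cost(1.0)

# Coordinated optimum: search a grid of splits for the lowest average time.
best_share = min((i / 1000 for i in range(1001)), key=average_cost)
optimal_cost = average_cost(best_share)

print(equilibrium_cost, best_share, optimal_cost)
```

The ratio between the two averages is 4/3, which is why models of app-guided routing distinguish between what is best for each driver and what is best for the network as a whole.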

Collaborations and Partnerships

Berkeley Institute for Data Science

Berkeley Institute for Data Science (BIDS) is a central hub of data-intensive research, open-source software, and data science training programs at UC Berkeley.

BIDS programs and initiatives are designed to facilitate collaboration across an increasingly diverse and active data science community of domain experts from the life, social, and physical sciences, as well as methodological experts from computer science, statistics, and applied mathematics. Since its launch in 2013, BIDS has cultivated an environment of open inquiry and discovery for data-intensive research. As an integral part of UC Berkeley's Division of Computing, Data Science, and Society (CDSS), launched in 2019, we continue to seek new and creative ways to cross traditional academic boundaries and engage a diverse community of researchers representing a wide array of disciplines.

Berkeley Institute of Transportation Studies (ITS)

ITS addresses challenges in our transportation systems, including safety, energy consumption, an aging infrastructure, and a lack of reliability, resilience, and sustainability. Spanning nine departments and four colleges within UC Berkeley and two divisions at Lawrence Berkeley National Laboratory, ITS is a unique environment where the entire pipeline from science and technology inception to deployment can be brought to bear on these challenges, working directly with transportation practitioners and the worlds of policy and governance in which they must function.

ITS researchers work in a wide range of fields, including robotics and machine learning, behavioral economics, policy, and urban planning. To effectively harness that expertise, our plan for the future focuses on four growth areas that will allow us to advance the knowledge base in key fields such as self-driving cars, airspace governance for the coming drone revolution, and a clean-energy infrastructure. With our mission of service to the State of California, and with our San Francisco Bay Area location (ground zero for the extraordinary data-rich, technologically advanced era in which we live), ITS aims to be the inventor of the smart cities of tomorrow, contributing to an ever more efficient and sustainable transportation system.

Data, Tools, and Facilities

Energy Sciences Network (ESnet)

ESnet is the U.S. Department of Energy's (DOE's) dedicated science network, helping researchers meet their goals from experiment to discovery.

ESnet provides the high-bandwidth, reliable connections that link scientists at national laboratories, universities, and other research institutions, enabling them to collaborate on some of the world's most important scientific challenges including energy, climate science, and the origins of the universe. Funded by the DOE Office of Science, ESnet is managed and operated by the Scientific Networking Division at Lawrence Berkeley National Laboratory. As a nationwide infrastructure and DOE User Facility, ESnet provides scientists with access to unique DOE research facilities and computing resources.

National Energy Research Scientific Computing Center (NERSC)

NERSC is a DOE Office of Science User Facility that serves as the primary high-performance computing center for scientific research sponsored by the Office of Science. Located at Berkeley Lab, NERSC serves more than 7,000 scientists at national laboratories and universities researching a wide range of problems in combustion, climate modeling, fusion energy, materials science, physics, chemistry, computational biology, and other disciplines.

NERSC is known as one of the best-run scientific computing facilities in the world. It provides some of the largest computing and storage systems available anywhere, but what distinguishes the center is its success in creating an environment that makes these resources effective for scientific research. NERSC systems are reliable and secure, and provide a state-of-the-art scientific development environment with the tools needed by the diverse community of NERSC users. NERSC offers scientists intellectual services that empower them to be more effective researchers. For example, many of our consultants are themselves domain scientists in areas such as materials science, physics, chemistry, and astronomy, and are well-equipped to help researchers apply computational resources to specialized science problems.
