Creating vivid crowd animations is core to immersive virtual environments in digital games. This work tackles the challenges of the crowd behavior generation problem. Existing approaches are labor-intensive, relying on practitioners to manually craft complex behavior systems. We propose a machine learning approach that synthesizes diversified dynamic crowd animation scenarios for a given environment from a text description. We first train two conditional diffusion models that generate text-guided agent distribution fields and velocity fields. Assisted by local navigation algorithms, the fields are then used to control multiple groups of agents. We further employ a Large Language Model (LLM) to canonicalize the general script into a structured sentence for more stable training and better scalability. To train our diffusion models, we devise a constructive method to generate random environments and crowd animations. We show that our trained diffusion models can generate crowd animations for both unseen environments and novel scenario descriptions. Our method paves the way towards automatic generation of crowd behaviors for virtual environments.
We introduce the first pipeline that targets language-guided generation of environment-compatible scenarios involving a large number of agents navigating in real time.
We assume agents are divided into multiple groups, where each group is controlled by a common velocity field. Our method takes as input a map of the environment and a general script describing the behavior of these groups.
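To make the group-wise control concrete, the following is a minimal sketch of how one velocity field per group could drive agents. It is an illustration under assumptions, not the paper's implementation: the function names `sample_velocity` and `step_agents`, the grid layout `(groups, height, width, 2)`, the nearest-cell lookup, and the Euler step are all hypothetical choices, and the local navigation algorithms used for collision avoidance are omitted.

```python
import numpy as np

def sample_velocity(field, positions, cell_size=1.0):
    """Nearest-cell lookup of an (H, W, 2) velocity field at (N, 2) positions given as (x, y)."""
    h, w, _ = field.shape
    cols = np.clip((positions[:, 0] / cell_size).astype(int), 0, w - 1)
    rows = np.clip((positions[:, 1] / cell_size).astype(int), 0, h - 1)
    return field[rows, cols]

def step_agents(positions, group_ids, fields, dt=0.1, cell_size=1.0):
    """Advance every agent by one Euler step using the field of its own group."""
    new_positions = positions.copy()
    for g in range(fields.shape[0]):
        mask = group_ids == g
        velocity = sample_velocity(fields[g], positions[mask], cell_size)
        new_positions[mask] += dt * velocity
    return new_positions

# Hypothetical setup: two groups on an 8x8 grid.
# Group 0 moves in +x everywhere, group 1 moves in +y everywhere.
fields = np.zeros((2, 8, 8, 2))
fields[0, ..., 0] = 1.0
fields[1, ..., 1] = 1.0

positions = np.array([[1.0, 1.0], [2.0, 3.0], [5.0, 5.0]])
group_ids = np.array([0, 0, 1])
stepped = step_agents(positions, group_ids, fields, dt=0.5)
```

Because all agents in a group read from the same field, the per-agent state stays small, which is one reason a shared field scales to large crowds; a real system would replace the plain Euler update with a local navigation step.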
As there are no existing datasets that provide the complete set of environment map, group-wise description, and behavior, we propose a constructive method for data generation.
We randomly sample the environment with multiple entities.
With sampled agent group paths in the environment, we generate the ground-truth canonical sentences.
Guided by the sampled path, we construct the velocity field and propose a simulation-assisted velocity adjustment procedure.
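As a rough illustration of turning a sampled path into a velocity field, the sketch below points every grid cell toward the waypoint that follows its nearest waypoint. This is an assumed, simplified construction rather than the procedure from the paper: the function name `path_to_velocity_field`, the unit cell size, and the nearest-waypoint rule are hypothetical, obstacles are ignored, and the simulation-assisted velocity adjustment is not reproduced.

```python
import numpy as np

def path_to_velocity_field(path, height, width, speed=1.0):
    """Build an (H, W, 2) field that steers each cell along a polyline of (x, y) waypoints."""
    path = np.asarray(path, dtype=float)                 # (K, 2)
    ys, xs = np.mgrid[0:height, 0:width]
    centers = np.stack([xs + 0.5, ys + 0.5], axis=-1)    # (H, W, 2) cell centers

    # Distance from every cell center to every waypoint: shape (H, W, K).
    dists = np.linalg.norm(centers[:, :, None, :] - path[None, None], axis=-1)
    nearest = dists.argmin(axis=-1)

    # Aim at the waypoint after the nearest one; the last waypoint targets itself.
    target_idx = np.minimum(nearest + 1, len(path) - 1)
    direction = path[target_idx] - centers
    norm = np.linalg.norm(direction, axis=-1, keepdims=True)

    # Cells sitting exactly on their target get zero velocity instead of dividing by zero.
    return speed * direction / np.maximum(norm, 1e-8)

# Hypothetical straight path along the bottom row of an 8x8 grid.
field = path_to_velocity_field([(0.5, 0.5), (7.5, 0.5)], height=8, width=8)
```

With this rule the field pulls off-path cells back toward the path while pushing on-path cells forward, and the velocity vanishes at the final waypoint so agents come to rest at the goal.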
We show the generated diversified dynamic crowd animation scenarios. Please refer to the main paper and supplementary material for detailed text descriptions and velocity fields.