Datasets

Generating Data

Introduction

Running the LEAN engine locally with the CLI requires your own local data, but real market data can be expensive and is often governed by restrictive redistribution licenses. As an alternative, you can generate realistic fake data with LEAN's random data generator. The generator uses a Brownian motion model to produce realistic market data and supports most of LEAN's security types and resolutions, which makes it a good way to design and test algorithms without buying real financial data.
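To give a feel for what a Brownian motion price model looks like, here is a minimal Python sketch of geometric Brownian motion. It is an illustration only: this page does not say which variant LEAN uses, so the geometric form, the function name simulate_gbm, and every parameter value below are assumptions rather than LEAN's actual implementation.

```python
import math
import random


def simulate_gbm(start_price, drift, volatility, steps, dt=1.0 / 252, seed=None):
    """Return steps + 1 prices that follow geometric Brownian motion.

    Hypothetical helper for illustration; not part of LEAN.
    """
    rng = random.Random(seed)
    prices = [start_price]
    for _ in range(steps):
        shock = rng.gauss(0.0, 1.0)  # standard normal draw
        # Exact discretization of dS = drift * S * dt + volatility * S * dW
        growth = math.exp(
            (drift - 0.5 * volatility ** 2) * dt
            + volatility * math.sqrt(dt) * shock
        )
        prices.append(prices[-1] * growth)
    return prices


# One simulated year of daily prices starting at 100.
prices = simulate_gbm(start_price=100.0, drift=0.05, volatility=0.2, steps=252, seed=42)
print(len(prices), "prices, all positive:", min(prices) > 0)
```

Because each step multiplies the previous price by a positive factor, the simulated prices can never reach zero or go negative, which is one reason this family of models is a common choice for synthetic market data.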

To use the CLI, you must be a member of an organization on a paid tier.

Supported Security Types

The random data generator supports the following security types and resolutions:

                 Supported Resolutions
Security Type    Tick    Second    Minute    Hour    Daily
Equity           Yes     Yes       Yes       Yes     Yes
Forex            Yes     Yes       Yes       Yes     Yes
CFD              Yes     Yes       Yes       Yes     Yes
Future           Yes     Yes       Yes       Yes     Yes
Crypto           Yes     Yes       Yes       Yes     Yes
Option           -       -         Yes       -       -

Supported Densities

The random data generator supports the following densities:

Density       Description
Dense         At least one data point per resolution step.
Sparse        At least one data point per 5 resolution steps.
VerySparse    At least one data point per 50 resolution steps.
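One way to read these densities is as a guarantee about how many resolution steps may pass between data points. The following Python sketch illustrates that reading by picking one random step inside every consecutive block of N steps. The block-based interpretation, the STEPS_PER_POINT mapping, and the pick_steps helper are assumptions made for illustration; the generator's real sampling logic may differ.

```python
import random

# Steps covered by each guaranteed data point, per the density descriptions.
STEPS_PER_POINT = {"Dense": 1, "Sparse": 5, "VerySparse": 50}


def pick_steps(total_steps, density, seed=None):
    """Pick one step index inside every consecutive block of N steps.

    Hypothetical helper for illustration; not part of LEAN.
    """
    rng = random.Random(seed)
    block = STEPS_PER_POINT[density]
    chosen = []
    for block_start in range(0, total_steps, block):
        block_end = min(block_start + block, total_steps)
        chosen.append(rng.randrange(block_start, block_end))
    return chosen


# Number of data points emitted over 100 resolution steps for each density.
for density in STEPS_PER_POINT:
    print(density, len(pick_steps(100, density, seed=7)))
```

Over 100 minute bars, this sketch emits 100 points for Dense, 20 for Sparse, and 2 for VerySparse, so sparser settings produce far smaller files at the cost of gaps in the series.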

Run the Generator

Follow these steps to generate random data:

  1. Open a terminal in one of your organization workspaces.
  2. Run lean data generate --start 20150101 --symbol-count 10 to generate dense minute Equity data for 10 random symbols, starting on January 1, 2015.
    $ lean data generate --start 20150101 --symbol-count 10
    Begin data generation of 10 randomly generated Equity assets...

    You can also specify an end date using --end <yyyyMMdd>, generate data for a different security type using --security-type <type>, for a different resolution using --resolution <resolution>, or with a different density using --data-density <density>.

    For a full list of options, run lean data generate --help or see Options.
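If you generate several datasets, it can help to assemble the command from a script. The Python sketch below builds the argument list from the options described in the steps above. The build_generate_command helper is hypothetical, and the example values Forex, Hour, and Sparse are assumed to be accepted in the spelling used in the tables on this page; check lean data generate --help for the exact accepted values.

```python
import shlex


def build_generate_command(start, symbol_count, end=None, security_type=None,
                           resolution=None, data_density=None):
    """Build the argument list for `lean data generate`.

    Hypothetical helper for illustration; only flags named on this page are used.
    """
    args = ["lean", "data", "generate",
            "--start", start,
            "--symbol-count", str(symbol_count)]
    optional = [
        ("--end", end),
        ("--security-type", security_type),
        ("--resolution", resolution),
        ("--data-density", data_density),
    ]
    for flag, value in optional:
        if value is not None:  # omit flags the caller did not set
            args += [flag, value]
    return args


command = build_generate_command(
    "20150101", 10,
    end="20200101",
    security_type="Forex",   # assumed spelling, taken from the table of security types
    resolution="Hour",       # assumed spelling, taken from the table of resolutions
    data_density="Sparse",   # assumed spelling, taken from the table of densities
)
print(shlex.join(command))
```

To actually run the command, you could pass the list to subprocess.run(command, check=True) from a terminal opened in one of your organization workspaces.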

The following image shows an example time series of simulated data:

[Image: time series of simulated data]
