ML.DATASETS.MAKE_MOONS

Generates a synthetic dataset of two interleaving half-moons — a classic non-linearly-separable shape.

Syntax

ML.DATASETS.MAKE_MOONS(n_samples, noise, random_state)

Arguments

Name | Type | Default | Description
--- | --- | --- | ---
n_samples | int | 100 | Total number of points generated, split evenly between the two moons.
noise | float | None | Standard deviation of Gaussian noise added to each point. Use 0 for clean moons.
random_state | int | None | Random seed for reproducible output.

Returns

Dataset

A Dataset (DataFrame) with columns [x_0, x_1, target], where target is 0 for the upper moon and 1 for the lower.
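
The layout of the returned columns can be pictured with a small standard-library Python sketch. It follows the construction scikit-learn's make_moons is known to use (a unit half-circle for the upper moon, and a shifted, flipped half-circle for the lower one). The helper name make_moons_sketch is hypothetical, it is not the implementation behind ML.DATASETS.MAKE_MOONS, and it skips the row shuffling that scikit-learn applies by default.

```python
import math
import random


def make_moons_sketch(n_samples=100, noise=None, random_state=None):
    """Return rows [x_0, x_1, target] for two interleaving half-moons.

    Hypothetical illustration only; not the spreadsheet function's code.
    """
    rng = random.Random(random_state)
    n_upper = n_samples // 2        # points on the upper moon (target 0)
    n_lower = n_samples - n_upper   # points on the lower moon (target 1)

    rows = []
    for i in range(n_upper):
        t = math.pi * i / max(n_upper - 1, 1)   # angle from 0 to pi
        rows.append([math.cos(t), math.sin(t), 0])
    for i in range(n_lower):
        t = math.pi * i / max(n_lower - 1, 1)
        # Same half-circle, shifted right by 1 and flipped around y = 0.25.
        rows.append([1.0 - math.cos(t), 0.5 - math.sin(t), 1])

    if noise:  # None or 0 leaves the moons clean
        for row in rows:
            row[0] += rng.gauss(0.0, noise)
            row[1] += rng.gauss(0.0, noise)
    return rows


rows = make_moons_sketch(200, 0.1, 0)
for x_0, x_1, target in rows[:3]:
    print(f"{x_0:.3f}  {x_1:.3f}  {target}")
```

With noise set to 0, every upper-moon point has x_1 >= 0 and every lower-moon point has x_1 <= 0.5, which is why the two arcs interleave rather than stack.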

When to use

Use ML.DATASETS.MAKE_MOONS when you need a small synthetic 2-D dataset that a straight line cannot separate. Together with MAKE_CIRCLES it is one of the two classic non-linearly-separable shapes, and it is useful for demonstrating kernel methods, RBF SVMs, and any classifier that can learn a non-linear decision boundary.

A common use is to stress-test a classifier on a deliberately tricky 2-D problem. No straight line separates the two moons, so linear classifiers leave some points misclassified, and linear PCA only rotates the data without untangling it. Non-linear models (RBF SVM, KernelPCA + linear classifier, neural networks, decision trees) typically can separate them.
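
The gap between linear and non-linear models can be seen in a short standard-library Python sketch. It builds noise-free moons by hand, then compares a logistic-regression classifier trained by plain gradient descent with a 1-nearest-neighbour classifier. Both models are stand-ins chosen to keep the example dependency-free: 1-nearest-neighbour is not one of the models listed above, and all function names, the 150/50 split, and the learning-rate settings are assumptions made for illustration.

```python
import math
import random


def clean_moons(n_per_moon):
    """Noise-free moons as (x_0, x_1, target) tuples; target 0 = upper moon."""
    pts = []
    for i in range(n_per_moon):
        t = math.pi * i / (n_per_moon - 1)
        pts.append((math.cos(t), math.sin(t), 0))
        pts.append((1.0 - math.cos(t), 0.5 - math.sin(t), 1))
    return pts


def train_logistic(data, epochs=500, lr=0.5):
    """Batch gradient descent for a linear (logistic) classifier."""
    w0 = w1 = b = 0.0
    n = len(data)
    for _ in range(epochs):
        g0 = g1 = gb = 0.0
        for x0, x1, y in data:
            p = 1.0 / (1.0 + math.exp(-(w0 * x0 + w1 * x1 + b)))
            g0 += (p - y) * x0
            g1 += (p - y) * x1
            gb += p - y
        w0 -= lr * g0 / n
        w1 -= lr * g1 / n
        b -= lr * gb / n
    return w0, w1, b


def predict_linear(model, x0, x1):
    w0, w1, b = model
    return 1 if w0 * x0 + w1 * x1 + b > 0 else 0


def predict_1nn(train, x0, x1):
    """Label of the closest training point (squared Euclidean distance)."""
    nearest = min(train, key=lambda r: (r[0] - x0) ** 2 + (r[1] - x1) ** 2)
    return nearest[2]


def accuracy(predict, test):
    hits = sum(predict(x0, x1) == y for x0, x1, y in test)
    return hits / len(test)


data = clean_moons(100)              # 200 points in total
random.Random(0).shuffle(data)       # fixed seed so the split is repeatable
train, test = data[:150], data[150:]

model = train_logistic(train)
linear_acc = accuracy(lambda u, v: predict_linear(model, u, v), test)
knn_acc = accuracy(lambda u, v: predict_1nn(train, u, v), test)
print(f"linear: {linear_acc:.2f}   1-NN: {knn_acc:.2f}")
```

The linear model is limited to a single straight boundary, so it is expected to trail the nearest-neighbour model, which follows the curve of each arc.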

Examples

Generate 200 mildly noisy moons with a fixed seed:

=ML.DATASETS.MAKE_MOONS(200, 0.1, 0)

If the formula is entered in cell B4, that cell now holds a Dataset reference. Preview the first few rows with ML.DATA.SAMPLE(B4, 5). The columns are x_0, x_1, and target (0 or 1).

Remarks

  • n_samples is the total number of points; scikit-learn splits it evenly between the two moons, and with an odd count one moon receives the extra point.
  • noise is the standard deviation of Gaussian noise added to each point. Pass 0 (or leave blank) for clean, noise-free moons.
  • random_state controls reproducibility. Pin it to an integer if you need the same dataset every run.
  • See also ML.DATASETS.MAKE_CIRCLES for concentric circles, the other standard non-linearly-separable demo dataset.
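
The remarks on n_samples and random_state can be checked with a few lines of standard-library Python. The split arithmetic mirrors what scikit-learn's make_moons does with an odd count; whether the spreadsheet function passes odd counts through unchanged is an assumption. Both helper names are hypothetical, and Python's random module stands in for the actual random generator, so the drawn values will differ from the real function's output.

```python
import random


def split_counts(n_samples):
    """Points per moon as (upper, lower), following scikit-learn's arithmetic."""
    n_upper = n_samples // 2
    n_lower = n_samples - n_upper
    return n_upper, n_lower


def noise_draws(noise, random_state, k=3):
    """Draw k Gaussian offsets with standard deviation `noise`."""
    rng = random.Random(random_state)  # random_state=None seeds from the OS
    return [rng.gauss(0.0, noise) for _ in range(k)]


print(split_counts(200))                            # (100, 100)
print(split_counts(201))                            # (100, 101)
print(noise_draws(0.1, 0) == noise_draws(0.1, 0))   # True: same seed, same noise
```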

See also